Correlated Variational Auto-Encoders


  1. Correlated Variational Auto-Encoders. Da Tang (1), Dawen Liang (2), Tony Jebara (1,2), Nicholas Ruozzi (3). (1) Columbia University, (2) Netflix Inc., (3) The University of Texas at Dallas. June 11, 2019.

  2. Variational Auto-Encoders (VAEs) ◮ Learn stochastic low-dimensional latent representations for high-dimensional data: Data x → q_λ(z | x) → Latent representation z → p_θ(x | z) → Reconstruction x̂. ◮ Model the likelihood and the inference distribution independently across data points in the objective (the ELBO):

  \[ \mathcal{L}(\lambda, \theta) = \sum_{i=1}^{n} \Big( \mathbb{E}_{q_\lambda(z_i \mid x_i)}\big[\log p_\theta(x_i \mid z_i)\big] - \mathrm{KL}\big(q_\lambda(z_i \mid x_i) \,\|\, p_0(z_i)\big) \Big). \]
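
A minimal sketch (not the authors' code) of this per-data-point ELBO, assuming a Gaussian encoder, a Bernoulli decoder, and a standard normal prior; the layer sizes and architectures are illustrative assumptions.

```python
# Sketch of the standard VAE ELBO: Gaussian q_lambda(z|x), Bernoulli
# p_theta(x|z), standard normal prior p_0(z). Sizes are illustrative.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, x_dim))

    def elbo(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        logits = self.dec(z)
        # One-sample Monte Carlo estimate of E_q[log p_theta(x | z)]
        rec = -nn.functional.binary_cross_entropy_with_logits(
            logits, x, reduction="none").sum(-1)
        # Closed-form KL(q_lambda(z | x) || N(0, I))
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(-1)
        return (rec - kl).sum()  # sum over the (assumed i.i.d.) data points

x = torch.rand(8, 784).round()  # toy batch of binary data
loss = -VAE().elbo(x)           # maximizing the ELBO = minimizing -ELBO
```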

  3. Motivation ◮ VAEs assume the prior is i.i.d. among data points. ◮ If we have information about correlations between data points (e.g., networked data), we can incorporate it into the generative process of VAEs.

  4. Learning with a Correlation Graph ◮ We are given an undirected correlation graph G = (V, E) for data x_1, …, x_n, where V = {v_1, …, v_n} and E = {(v_i, v_j) : x_i and x_j are correlated}. ◮ Directly applying a correlated prior on z = (z_1, …, z_n) over a general undirected graph is hard.

  5. Correlated Priors ◮ Define the prior of z as a uniform mixture over all maximal acyclic subgraphs of G:

  \[ p_0^{\mathrm{corr}}(z) = \frac{1}{|\mathcal{A}_G|} \sum_{G' = (V, E') \in \mathcal{A}_G} p_0^{G'}(z). \]
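
To make the mixture concrete: for a connected G, the maximal acyclic subgraphs A_G are exactly its spanning trees. A brute-force sketch that enumerates them for tiny graphs follows; it is purely illustrative, since the paper's method never enumerates A_G, which can be exponentially large.

```python
# Brute-force enumeration of the maximal acyclic subgraphs A_G of a small
# undirected graph (for a connected G these are its spanning trees).
# Exponential; only meant to make |A_G| concrete on toy examples.
from itertools import combinations

def maximal_acyclic_subgraphs(n_vertices, edges):
    """Return all acyclic edge subsets of maximum size."""
    def is_acyclic(subset):
        parent = list(range(n_vertices))
        def find(u):
            while parent[u] != u:
                parent[u] = parent[parent[u]]
                u = parent[u]
            return u
        for u, v in subset:  # union-find cycle check
            ru, rv = find(u), find(v)
            if ru == rv:
                return False
            parent[ru] = rv
        return True

    for k in range(len(edges), -1, -1):  # largest acyclic size wins
        best = [s for s in combinations(edges, k) if is_acyclic(s)]
        if best:
            return best
    return []

# A 4-cycle has |A_G| = 4 spanning trees (drop any one of its edges).
print(len(maximal_acyclic_subgraphs(4, [(0, 1), (1, 2), (2, 3), (3, 0)])))
```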

  6. Correlated Priors ◮ We apply a uniform mixture over acyclic subgraphs because acyclic graphs admit closed-form correlated distributions:

  \[ p_0^{G'}(z) = \prod_{i=1}^{n} p_0(z_i) \prod_{(v_i, v_j) \in E'} \frac{p_0(z_i, z_j)}{p_0(z_i)\, p_0(z_j)}. \]
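
A sketch of evaluating this closed-form log density on one acyclic subgraph G'. The specific distributional choices here are assumptions, not taken from the slides: each singleton marginal p_0(z_i) is N(0, I) and each pairwise marginal p_0(z_i, z_j) is Gaussian with per-dimension correlation rho.

```python
# log p_0^{G'}(z) for a tree-structured prior: singleton terms plus one
# pairwise/singleton ratio correction per edge of G'. Assumed (not from the
# slides): N(0, I) singletons, bivariate Gaussian pairs with correlation rho.
import numpy as np

def log_normal(z):
    return -0.5 * (np.log(2 * np.pi) + z ** 2).sum()

def log_bivariate_normal(zi, zj, rho):
    # log density of N(0, [[1, rho], [rho, 1]]), applied per dimension
    det = 1.0 - rho ** 2
    quad = (zi ** 2 - 2 * rho * zi * zj + zj ** 2) / det
    return (-np.log(2 * np.pi) - 0.5 * np.log(det) - 0.5 * quad).sum()

def log_prior_tree(z, tree_edges, rho=0.5):
    """z: (n, d) latent codes; tree_edges: acyclic edge list over range(n)."""
    logp = sum(log_normal(z[i]) for i in range(len(z)))  # singleton terms
    for i, j in tree_edges:  # one ratio correction per edge of G'
        logp += (log_bivariate_normal(z[i], z[j], rho)
                 - log_normal(z[i]) - log_normal(z[j]))
    return logp

z = np.random.randn(4, 2)
print(log_prior_tree(z, [(0, 1), (1, 2), (2, 3)]))
```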

  7. Inference with a Weighted Objective ◮ Define a new ELBO for general graphs:

  \[ \log p_\theta(x) = \log \mathbb{E}_{p_0^{\mathrm{corr}}(z)}\big[p_\theta(x \mid z)\big] \geq \frac{1}{|\mathcal{A}_G|} \sum_{G' \in \mathcal{A}_G} \Big( \mathbb{E}_{q_\lambda^{G'}(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\lambda^{G'}(z \mid x) \,\|\, p_0^{G'}(z)\big) \Big) =: \mathcal{L}(\lambda, \theta), \]

  where q_\lambda^{G'} is defined in the same way as the priors:

  \[ q_\lambda^{G'}(z) = \prod_{i=1}^{n} q_\lambda(z_i \mid x_i) \prod_{(v_i, v_j) \in E'} \frac{q_\lambda(z_i, z_j \mid x_i, x_j)}{q_\lambda(z_i \mid x_i)\, q_\lambda(z_j \mid x_j)}. \]

  8. Inference with a Weighted Objective ◮ The loss function is intractable as written, since the number of acyclic subgraphs can be exponential. ◮ Instead, represent the average loss over acyclic subgraphs as a weighted average loss over edges (see the sketch after this slide). [Figure: example graphs with each edge annotated by its mixture weight, e.g. 1/2, 2/3, 1.] ◮ The weighted loss is tractable: the edge weights can be computed from the pseudo-inverse of the Laplacian matrix of G.
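
The slides do not spell out the weight formula. One standard reading: for a connected G, the fraction of spanning trees containing an edge (v_i, v_j) equals its effective resistance, which is read off the Laplacian pseudo-inverse as L+_ii + L+_jj - 2 L+_ij (a classical matrix-tree identity). A small sketch under that reading; how the paper then folds these weights into the per-edge losses is not shown here.

```python
# Per-edge weights from the pseudo-inverse of the graph Laplacian.
# For connected G, weight(i, j) = L+_ii + L+_jj - 2 L+_ij is the effective
# resistance of edge (i, j), which equals the fraction of spanning trees
# (maximal acyclic subgraphs) that contain the edge.
import numpy as np

def edge_weights(n, edges):
    L = np.zeros((n, n))
    for i, j in edges:  # build the combinatorial Laplacian D - A
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    Lp = np.linalg.pinv(L)  # Moore-Penrose pseudo-inverse
    return {(i, j): Lp[i, i] + Lp[j, j] - 2.0 * Lp[i, j] for i, j in edges}

# 4-cycle: each edge lies in 3 of the 4 spanning trees, so each weight is 3/4.
print(edge_weights(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
```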

  9. Empirical Results

  Table: Link prediction test NCRR
    Method      Test NCRR
    vae         0.0052 ± 0.0007
    GraphSAGE   0.0115 ± 0.0025
    cvae        0.0171 ± 0.0009

  Table: Spectral clustering scores
    Method      NMI score
    vae         0.0031 ± 0.0059
    GraphSAGE   0.0945 ± 0.0607
    cvae        0.2748 ± 0.0462

  Table: User matching test RR
    Method      Test RR
    vae         0.3498 ± 0.0167
    cvae        0.7129 ± 0.0096

  10. Conclusion and Future Work ◮ CVAE accounts for correlations between data points that are known a priori, and can adopt a correlated variational density to achieve a better variational approximation. ◮ Future work includes extending CVAEs to higher-order correlations.

  11. Thanks! Poster #219. Code available at https://github.com/datang1992/Correlated-VAEs.
