

  1. Variational Network Inference: Strong and Stable with Concrete Support
     Amir Dezfouli, Edwin V. Bonilla and Richard Nock

  2. Network Structure Discovery: A Flexible Approach
     - Data: N nodes, T observations, 𝒟 = { y_i, t_i }.
     - Goal: learn the network structure: existence, directionality and strengths of connections.
       [Figure: example five-node network with weighted directed edges W_12, W_15, W_32, W_43, W_54.]
     - Model (a sampling sketch follows below):
         y_i(t) = f_i(t) + ε_it,   ε_it ∼ Normal(0, σ_y²)
         f_i(t) = z_i(t) + Σ_{j=1, j≠i}^N A_ij W_ij [ f_j(t) + ξ_jt ],   ξ_jt ∼ Normal(0, σ_f²)
         z_i(t) ∼ GP(0, κ(t, t′; θ))   (network-independent trend)
     - Network parameters: A_ij ∈ {0,1}, W_ij ∈ ℝ, with independent priors
         p(A, W) = ∏_ij p(A_ij) p(W_ij),   p(A_ij) = Bern(ρ),   p(W_ij) = Normal(0, σ_w²)
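A minimal sketch of forward-sampling from this generative model, assuming an RBF kernel for the GP trend and using the closed-form resolution of the cyclic definition of f from the next slide; all parameter values and the kernel choice are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 50                        # nodes, observations per node
rho, sigma_w = 0.2, 1.0             # Bern(rho) edge prior, Normal(0, sigma_w^2) weight prior
sigma_f, sigma_y = 0.1, 0.1         # connectivity noise and observation noise (std dev)
t = np.linspace(0.0, 1.0, T)

# Network parameters: A_ij in {0, 1}, W_ij in R, no self-connections
A = rng.binomial(1, rho, size=(N, N))
np.fill_diagonal(A, 0)
W = rng.normal(0.0, sigma_w, size=(N, N))

# Network-independent trend z_i(t) ~ GP(0, k(t, t')), here an RBF kernel (assumption)
K_t = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.1) ** 2) + 1e-8 * np.eye(T)
z = rng.multivariate_normal(np.zeros(T), K_t, size=N)        # N x T

# f_i(t) = z_i(t) + sum_{j != i} A_ij W_ij [f_j(t) + xi_jt] resolves to a linear solve per t;
# for Gaussian W the matrix I - A*W is non-singular with probability 1 (cf. Theorem 1 later)
xi = rng.normal(0.0, sigma_f, size=(N, T))
M = np.eye(N) - A * W                                        # I minus the Hadamard product A (.) W
f = np.linalg.solve(M, z + (A * W) @ xi)                     # N x T latent signals

# Observations y_i(t) = f_i(t) + eps_it
y = f + rng.normal(0.0, sigma_y, size=(N, T))
```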

  3. Inference
     - Goal: estimate p(A, W | 𝒟).
     - Complications: f is defined cyclically; GPs are notoriously unscalable, here O(N³T³); the marginal likelihood is complicated (f depends on A, W).
     - Trick 1: derive the “inverse” model
         f(t) = (I − A⊙W)⁻¹ ( z(t) + A⊙W ξ_t )
     - Trick 2: marginalise f analytically:
         p(y | A, W) = Normal(0, Σ_y),   Σ_y = K_f ⊗ K_t + K_σ ⊗ I
     - Trick 3: relate to multi-task learning (MTL) with product covariance (Bonilla et al., 2008; Rakitsch et al., 2013): nodes are “tasks”, Σ_y is a sum of two Kronecker products whose covariances are determined by A, W, and computation drops from O(N³T³) to O(N³ + T³) (see the sketch below).
     - Remaining question: how to deal with the complex dependency on A, W? Answer: modern variational inference.
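A rough sketch of the O(N³ + T³) likelihood evaluation behind Trick 3, following the sum-of-Kronecker-products factorisation of the MTL work cited on the slide; the function name is hypothetical, and K_f, K_sig stand for the N x N node covariances induced by A and W.

```python
import numpy as np

def kron_sum_gaussian_logpdf(Y, K_f, K_t, K_sig, jitter=1e-8):
    """log N(vec(Y); 0, K_f (x) K_t + K_sig (x) I) without forming the NT x NT matrix.

    Y: N x T observations; K_f and K_sig: N x N; K_t: T x T.
    """
    N, T = Y.shape
    # Factor K_sig = L L^T and whiten the node dimension: K_f -> L^{-1} K_f L^{-T}
    L = np.linalg.cholesky(K_sig + jitter * np.eye(N))
    Kf_w = np.linalg.solve(L, np.linalg.solve(L, K_f).T).T
    # Two small eigendecompositions, O(N^3) + O(T^3), instead of one O(N^3 T^3) factorisation
    s_f, U_f = np.linalg.eigh(Kf_w)
    s_t, U_t = np.linalg.eigh(K_t)
    D = np.outer(s_t, s_f) + 1.0                     # T x N grid of eigenvalues of the whitened covariance
    # Quadratic form via Kronecker-structured matrix products
    X = np.linalg.solve(L, Y).T                      # (L^{-1} Y)^T, shape T x N
    X2 = U_t.T @ X @ U_f
    quad = np.sum(X2 ** 2 / D)
    # log|Sigma| = 2 T sum(log diag(L)) + sum(log D)
    logdet = 2.0 * T * np.sum(np.log(np.diag(L))) + np.sum(np.log(D))
    return -0.5 * (quad + logdet + N * T * np.log(2.0 * np.pi))
```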

  4. Modern Variational Inference
     - ELBO: ℒ_elbo = ℒ_kl + ℒ_ell, with
         ℒ_kl = − KL( q(A, W) || p(A, W) )
         ℒ_ell = 𝔼_{q(A, W)} [ log p(y | A, W) ]
       [Figure: the ELBO lower-bounds log p(y) and tightens as the variational posterior moves from q_old(A, W) to q_new(A, W) towards the true posterior.]
     - Expectations are estimated with Monte Carlo and the re-parameterization trick, which cannot be applied directly to discrete random variables.
     - Trick 4: the Concrete distribution, q(A_ij) = Concrete(α_ij, λ_c), with variational parameters α_ij (aka the Gumbel-Softmax trick): we can both sample A_ij and evaluate log q(A_ij), and it helps us get stability for free (see the sketch below).
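A small illustration of Trick 4: a binary Concrete (Gumbel-Softmax) reparameterisation of the edge indicators keeps Monte Carlo ELBO estimates differentiable in the variational parameters α_ij. Variable names and parameter values are assumptions for the sketch, not the paper's implementation.

```python
import numpy as np

def sample_bin_concrete(log_alpha, lam_c, rng):
    """Relaxed Bernoulli draw: sigmoid((log alpha + logistic noise) / lambda_c)."""
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=log_alpha.shape)
    logistic = np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-(log_alpha + logistic) / lam_c))

rng = np.random.default_rng(1)
N, lam_c = 5, 0.1                                   # low temperature -> near-binary samples
log_alpha = rng.normal(size=(N, N))                 # variational parameters of q(A_ij)
mu_w, log_sig_w = rng.normal(size=(N, N)), np.full((N, N), -1.0)   # Gaussian q(W_ij)

# One reparameterised Monte Carlo draw of (A, W); evaluating
# log p(y | A, W) + log p(A, W) - log q(A, W) at such draws gives a
# single-sample estimate of the ELBO whose gradient flows through alpha_ij.
A = sample_bin_concrete(log_alpha, lam_c, rng)
W = mu_w + np.exp(log_sig_w) * rng.normal(size=(N, N))
```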

  5. Theory: Numerical Stability
     - Previous work usually imposes non-singularity of I − A⊙W, sometimes with additional constraints (boundedness of coordinates, eigenvalues).
     - Theorem 1 (“we get stability for free”): for any λ_c ≥ 0 and α_ij ≥ 0 (i ≠ j), I − A⊙W is non-singular with probability 1 (an informal numerical check follows below).
     - Theorem 2: for any λ_c ≥ 0, α_ij ≥ 0 (i ≠ j) and σ_y² ≥ 0, |ℒ_ell| < ∞.
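An informal numerical check, illustrative rather than a proof, of Theorem 1: draw (A, W) from a Concrete × Gaussian variational family and confirm that I − A⊙W comes out non-singular in every draw. All parameter settings are made up for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, lam_c, n_draws = 10, 0.5, 1000
log_alpha = rng.normal(size=(N, N))                 # assumed variational parameters
mu_w, sig_w = rng.normal(size=(N, N)), 0.5

min_abs_det = np.inf
for _ in range(n_draws):
    # Binary Concrete sample of A and Gaussian sample of W
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=(N, N))
    A = 1.0 / (1.0 + np.exp(-(log_alpha + np.log(u) - np.log1p(-u)) / lam_c))
    np.fill_diagonal(A, 0.0)                        # no self-loops (i != j)
    W = rng.normal(mu_w, sig_w)
    min_abs_det = min(min_abs_det, abs(np.linalg.det(np.eye(N) - A * W)))

print(f"smallest |det(I - A*W)| over {n_draws} draws: {min_abs_det:.3e}")
```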

  6. Theory: Model Stability
     - Theorem 3 (statistical “robustness”) bounds the signal’s negative log-likelihood as a function of external parameters: if A_ij ∼ Bern(ρ_ij) and W_ij ∼ Normal(μ_ij, σ_ij²), then under a condition on the network signal it holds with large probability that, for all y,
         − log p(y | W, A) ∈ [ g(λ∘, y), g(λ∙, y) ],
       where g(z, y) = θ( log z + z ‖y‖₂² ), λ∘ = λ↓(K_t)/2 + σ_y², and λ∙ = 2 λ↑(K_t) + σ_f² + σ_y².
     - The condition on the network signal appears in various forms in previous work.
     - This has important practical consequences.

  7. Experiments and Conclusions
     - Experiments: Sydney property prices, brain fMRI, yeast genome.
       [Figure: map of the Sydney property-price network over suburbs including Pittwater, Manly, Mosman, Hunters Hill and Woollahra.]
     - Conclusions: a Bayesian approach for network structure discovery; efficient inference; stability “for free”, robustness and easy estimation.
