Large Graph Limits of Learning Algorithms, Matt Dunlop

  1. Large Graph Limits of Learning Algorithms. Matt Dunlop, Xiyang (Michael) Luo; Computing and Mathematical Sciences, Caltech; Department of Mathematics, UCLA. With: Andrea Bertozzi (UCLA), Xiyang Luo (UCLA), Andrew Stuart (Caltech) and Kostas Zygalakis (Edinburgh), JUQ, to appear. ⋆ Matt Dunlop (Caltech), Dejan Slepčev (CMU), Andrew Stuart (Caltech) and Matt Thorpe (Cambridge), in preparation.

  2. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  3. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  4. Regression. Let D ⊂ R^d be a bounded open set and let D′ ⊂ D. Ill-posed inverse problem: find u : D → R given y(x) = u(x), x ∈ D′. Strong prior information is needed.

  5. Classification. Let D ⊂ R^d be a bounded open set and let D′ ⊂ D. Ill-posed inverse problem: find u : D → R given y(x) = sign(u(x)), x ∈ D′. Even stronger prior information is needed.

  6. y = sign(u). Red = +1, Blue = −1, Yellow: no information.

  7. Reconstruction of the function u on D.

  8. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  9. Graph Laplacian. Similarity graph G with n vertices Z = {1, ..., n}. Weighted adjacency matrix W = {w_{j,k}}, with w_{j,k} = η_ε(x_j − x_k). Diagonal matrix D = diag{d_jj}, d_jj = Σ_{k∈Z} w_{j,k}. L = s_n(D − W) (unnormalized).
Spectral properties: L is positive semi-definite, with ⟨u, Lu⟩_{R^n} ∝ Σ_{j∼k} w_{j,k} |u_j − u_k|². Eigenpairs Lq_j = λ_j q_j; fully connected ⇒ λ_1 > λ_0 = 0. Fiedler vector: q_1.
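
The construction above takes only a few lines in practice. The following is a minimal sketch, assuming a Gaussian choice of the similarity kernel η_ε and dense linear algebra (fine for moderate n); the function name `graph_laplacian`, the parameter values, and the toy data are illustrative, not from the talk.

```python
import numpy as np

def graph_laplacian(X, eps, s_n=1.0):
    """Unnormalized graph Laplacian L = s_n (D - W), with Gaussian weights
    w_{jk} = exp(-|x_j - x_k|^2 / (2 eps^2)) as one common choice of eta_eps."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    W = np.exp(-d2 / (2 * eps ** 2))
    np.fill_diagonal(W, 0.0)                                     # no self-loops
    D = np.diag(W.sum(axis=1))
    return s_n * (D - W)

X = np.random.default_rng(0).uniform(size=(200, 2))   # toy unlabelled data in R^2
L = graph_laplacian(X, eps=0.2)
lam, Q = np.linalg.eigh(L)        # lam[0] is (numerically) zero for a connected graph
fiedler = Q[:, 1]                 # Fiedler vector q_1: eigenvector of the second-smallest eigenvalue
```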

  10. Example: Voting Records. U.S. House of Representatives, 1984, 16 key votes. Each representative has an associated feature vector x_j ∈ R^16, e.g. x_j = (1, −1, 0, ..., 1)^T, where 1 is “yes”, −1 is “no” and 0 is abstain/no-show. Here d = 16 and n = 435. Figure: strong prior information; Fiedler vector and spectrum (normalized).

  11. Example of Underlying Gaussian (Voting Records). Figure: two-point correlation of sign(u) for 3 Democrats.

  12. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  13. Problem Statement (Optimization). Semi-supervised learning. Input: unlabelled data {x_j ∈ R^d, j ∈ Z := {1, ..., n}}; labelled data {y_j ∈ {±1}, j ∈ Z′ ⊂ Z}. Output: labels {y_j ∈ {±1}, j ∈ Z}.
Classification is based on sign(u), with u the minimizer of J(u; y) = (1/2)⟨u, C⁻¹u⟩_{R^n} + Φ(u; y). Here u is an R-valued function on the graph nodes; C = (L + τ²I)^(−α) is built from the unlabelled data (w_{j,k} = η_ε(x_j − x_k)); and Φ(u; y) links the real-valued u to the binary-valued labels y.
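
Forming the covariance C = (L + τ²I)^(−α) and sampling from N(0, C) is the computational core of everything that follows. Below is a minimal sketch via the eigendecomposition of L, reusing `L` from the earlier snippet; `tau`, `alpha` and the helper name `prior_samples` are illustrative choices, not values from the talk.

```python
import numpy as np

def prior_samples(L, tau, alpha, n_samples=1, rng=None):
    """Draw samples u ~ N(0, C), with C = (L + tau^2 I)^(-alpha),
    via the eigendecomposition of L (adequate for moderate n)."""
    rng = np.random.default_rng() if rng is None else rng
    lam, Q = np.linalg.eigh(L)
    sqrt_c = (lam + tau ** 2) ** (-alpha / 2)        # square roots of C's eigenvalues
    xi = rng.standard_normal((L.shape[0], n_samples))
    return Q @ (sqrt_c[:, None] * xi)                # u = C^{1/2} xi

u0 = prior_samples(L, tau=0.1, alpha=2.0)[:, 0]      # one prior draw on the graph nodes
```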

  14. Problem Statement (Bayesian Formulation). Semi-supervised learning. Input: unlabelled data {x_j ∈ R^d, j ∈ Z := {1, ..., n}} (prior); labelled data {y_j ∈ {±1}, j ∈ Z′ ⊆ Z} (likelihood). Output: labels {y_j ∈ {±1}, j ∈ Z} (posterior).
Connection between probability and optimization: with J^(n)(u; y) = (1/2)⟨u, C⁻¹u⟩_{R^n} + Φ^(n)(u; y), we have P(u | y) ∝ exp(−J^(n)(u; y)) ∝ exp(−Φ^(n)(u; y)) × N(0, C) ∝ P(y | u) × P(u).

  15. Probit. Rasmussen and Williams, 2006 (MIT Press); Bertozzi, Luo, Stuart and Zygalakis, 2017 (SIAM-JUQ).
Probit model: J_p^(n)(u; y) = (1/2)⟨u, C⁻¹u⟩_{R^n} + Φ_p^(n)(u; y). Here C = (L + τ²I)^(−α) and Φ_p^(n)(u; y) := −Σ_{j∈Z′} log Ψ(y_j u_j; γ), where Ψ is the smoothed Heaviside function Ψ(v; γ) = ∫_{−∞}^{v} (2πγ²)^{−1/2} exp(−t²/(2γ²)) dt.
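
Since Ψ(·; γ) is just the N(0, γ²) cumulative distribution function, the probit misfit can be evaluated with a standard normal CDF. A small sketch, assuming `scipy` is available; `labelled` is an illustrative index array for Z′.

```python
import numpy as np
from scipy.stats import norm

def probit_potential(u, y, labelled, gamma):
    """Phi_p(u; y) = - sum_{j in Z'} log Psi(y_j u_j; gamma),
    with Psi(v; gamma) equal to the standard normal CDF at v / gamma."""
    return -np.sum(norm.logcdf(y[labelled] * u[labelled] / gamma))
```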

  16. Level Set. Iglesias, Lu and Stuart, 2016 (IFB).
Level set model: J_ls^(n)(u; y) = (1/2)⟨u, C⁻¹u⟩_{R^n} + Φ_ls^(n)(u; y). Here C = (L + τ²I)^(−α), and Φ_ls^(n)(u; y) := (1/(2γ²)) Σ_{j∈Z′} |y_j − sign(u_j)|².
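
The level-set misfit compares the labels with the sign of u directly; a sketch in the same style as the probit potential above (names illustrative).

```python
import numpy as np

def levelset_potential(u, y, labelled, gamma):
    """Phi_ls(u; y) = (1 / (2 gamma^2)) * sum_{j in Z'} |y_j - sign(u_j)|^2."""
    return np.sum((y[labelled] - np.sign(u[labelled])) ** 2) / (2 * gamma ** 2)
```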

  17. Sampling Algorithm. Cotter, Roberts, Stuart and White, 2013 (Statist. Sci.). The preconditioned Crank–Nicolson (pCN) method, with C = (L + τ²I)^(−α):
1: Define α(u, v) = min{1, exp(Φ(u) − Φ(v))}.
2: while k < M do
3:   Propose v^(k) = √(1 − β²) u^(k) + β ξ^(k), where ξ^(k) ∼ N(0, C).
4:   Calculate the acceptance probability α(u^(k), v^(k)).
5:   Accept: u^(k+1) = v^(k) with probability α(u^(k), v^(k)); otherwise
6:   Reject: u^(k+1) = u^(k).
7: end while
Bertozzi, Luo and Stuart, 2018 (in preparation): E(α(u, v)) = O(Z_0²), where Z_0 = μ({S(u(j)) = y(j), j ∈ Z′}).
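
A compact sketch of pCN, reusing the illustrative helpers `prior_samples` and `probit_potential` from above; β, M and the wrapper name `pcn` are arbitrary choices, not from the talk. The key property is that the proposal leaves the Gaussian prior N(0, C) invariant, so the accept/reject step involves only the misfit Φ.

```python
import numpy as np

def pcn(u0, phi, sample_prior, beta=0.2, M=5000, rng=None):
    """pCN MCMC for posteriors proportional to exp(-Phi(u)) N(0, C)."""
    rng = np.random.default_rng() if rng is None else rng
    u, phi_u = u0, phi(u0)
    samples = []
    for _ in range(M):
        v = np.sqrt(1.0 - beta ** 2) * u + beta * sample_prior(rng)  # pCN proposal
        phi_v = phi(v)
        if np.log(rng.uniform()) < phi_u - phi_v:   # accept with prob min{1, exp(Phi(u) - Phi(v))}
            u, phi_u = v, phi_v
        samples.append(u)                           # on rejection the previous state is repeated
    return np.array(samples)

# Example wiring for the probit posterior on the graph (illustrative values):
# phi = lambda u: probit_potential(u, y, labelled, gamma=0.1)
# chain = pcn(u0, phi, lambda rng: prior_samples(L, 0.1, 2.0, rng=rng)[:, 0])
```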

  18. Example of UQ (Hyperspectral). Here d = 129 and N ≈ 3 × 10^5; use the Nyström approximation. Figure: spectral approximation; uncertain classification in red.

  19. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  20. Limit Theorem for the Dirichlet Energy. García Trillos and Slepčev, 2016 (ACHA). Unlabelled data {x_j} sampled i.i.d. from a density ρ supported on bounded D ⊂ R^d. Let 𝓛u = −(1/ρ) ∇·(ρ² ∇u) for x ∈ D, with ∂u/∂n = 0 for x ∈ ∂D.
Theorem 2. Let s_n = 2/(C(η) n ε²). Then, under connectivity conditions on ε = ε(n) in η_ε, the scaled Dirichlet energy Γ-converges in the TL² metric: (1/n)⟨u, Lu⟩_{R^n} → ⟨u, 𝓛u⟩_{L²_ρ} as n → ∞.
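
One can compare the two energies numerically on synthetic data. The sketch below is an illustration under stated assumptions, not part of the talk: d = 2, uniform ρ on the unit square, test function u(x) = sin(πx₁), and the normalized Gaussian kernel η_ε(x) = ε^(−d) exp(−|x|²/(2ε²)), for which C(η) = ∫ η(|z|) z₁² dz = 2π. Constant conventions depend on how η_ε is normalized, and at finite n and ε the value is only approximate (boundary effects bias it low).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 2000, 2, 0.05
X = rng.uniform(size=(n, d))                       # x_j ~ rho, uniform on [0,1]^2
d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = eps ** (-d) * np.exp(-d2 / (2 * eps ** 2))     # w_{jk} = eta_eps(x_j - x_k)
np.fill_diagonal(W, 0.0)
s_n = 2.0 / (2 * np.pi * n * eps ** 2)             # s_n = 2 / (C(eta) n eps^2), C(eta) = 2*pi here

u = np.sin(np.pi * X[:, 0])                        # smooth test function
graph_energy = s_n * (u @ (W.sum(1) * u - W @ u)) / n   # (1/n) <u, L u>
print(graph_energy, np.pi ** 2 / 2)                # continuum energy: int |grad u|^2 rho^2 dx = pi^2 / 2
```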

  21. Limit Theorem for Probit. Dunlop, Slepčev, Stuart and Thorpe, in preparation, 2018. Let D+ and D− be two disjoint bounded subsets of D; define D′ = D+ ∪ D−, with y(x) = +1 for x ∈ D+ and y(x) = −1 for x ∈ D−. Assume that #D_n/n → const. as n → ∞ (the number of labelled points grows proportionally to n). For α > 0, define C = (𝓛 + τ²I)^(−α). Recall 𝓛u = −(1/ρ) ∇·(ρ² ∇u), with no-flux boundary conditions.
Theorem 3. Let s_n = 2/(C(η) n ε²). Then, under connectivity conditions on ε = ε(n), the scaled probit objective function Γ-converges in the TL² metric: (1/n) J_p^(n)(u; y) → J_p(u; y) as n → ∞, where J_p(u; y) = (1/2)⟨u, C⁻¹u⟩_{L²_ρ} + Φ_p(u; y) and Φ_p(u; y) := −∫_{D′} log Ψ(y(x) u(x); γ) ρ(x) dx.

  22. Limit Theorem for Probit. Dunlop, Slepčev, Stuart and Thorpe, in preparation, 2018. Assume now that #D_n is fixed as n → ∞ (a fixed number of labelled points).
Theorem 4. Let s_n = 2/(C(η) n ε²) with ε = ε(n, α). Suppose that either (1) α < d/2, or (2) α > d/2 and ε(n, α) n^{1/(2α)} → ∞. Then, with probability one, sequences of minimizers of J_p^(n) converge to zero in the TL² metric.

  23. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  24. Example (PDE Two Moons – Unlabelled Data). Figure: sampling density ρ of the unlabelled data.

  25. Example (PDE Two Moons – Labelled Data). Figure: labelled data.

  26. Example (PDE Two Moons – Fiedler Vector of L). Figure: Fiedler vector.

  27. Example (PDE Two Moons – Posterior Labelling). Figure: posterior mode of u and sign(u).

  28. Connecting Probit, Level Set and Regression. Dunlop, Slepčev, Stuart and Thorpe, in preparation, 2017. Probit and level set probabilistic models. Prior: Gaussian P(du) = N(0, C). Probit posterior: P^γ(du | y) ∝ exp(−Φ_p(u; y)) P(du). Level set posterior: P^γ(du | y) ∝ exp(−Φ_ls(u; y)) P(du).
Theorem 4. Let α > d/2. Then P^γ(u | y) ⇒ P(u | y) as γ → 0, where P(du | y) ∝ 1_A(u) P(du), P(du) = N(0, C), and A = {u : sign(u(x)) = y(x), x ∈ D′}. Compare with regression (Zhu, Ghahramani and Lafferty, 2003, ICML): A ↦ A₀ = {u : u(x) = y(x), x ∈ D′}.
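
A tiny numerical illustration of the γ → 0 statement, with arbitrarily chosen values of y·u: as γ shrinks, the probit penalty −log Ψ(yu; γ) tends to 0 wherever sign(u) agrees with y and blows up wherever it disagrees, so the posterior concentrates on the constraint set A.

```python
import numpy as np
from scipy.stats import norm

for gamma in (1.0, 0.1, 0.01):
    agree = -norm.logcdf(0.5 / gamma)       # y * u = +0.5: penalty -> 0
    disagree = -norm.logcdf(-0.5 / gamma)   # y * u = -0.5: penalty -> infinity
    print(f"gamma={gamma}: agree {agree:.3g}, disagree {disagree:.3g}")
```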

  29. Example (MNIST: Human-in-the-loop labelling). Figure: 100 most uncertain digits, 200 labels. Mean uncertainty: 14.0%.

  30. Example (MNIST). Figure: 100 most uncertain digits, 300 labels. Mean uncertainty: 10.3%.

  31. Example (MNIST). Figure: 100 most uncertain digits, 400 labels. Mean uncertainty: 8.1%.

  32. Talk Overview: Learning and Inverse Problems; Graph Laplacian; Inverse Problem Formulation; Large Graph Limits; Probability; Conclusions.

  33. Summary: Graph-Based Learning. A single optimization framework for classification algorithms. A single Bayesian framework for classification algorithms. The large graph limit reveals novel inverse problem structure. Links between probit, level set and regression. Gaussian measure conditioned on its sign. UQ for human-in-the-loop learning. Efficient MCMC algorithms.
