Multiresolution Gaussian Processes
Emily Fox, ICERM 2012 (PowerPoint presentation)


SLIDE 1

Multiresolution Gaussian Processes

Joint work with David Dunson (Duke)

Emily Fox, ICERM 2012, Providence, RI

SLIDE 2

[Figure: observations vs. time for a neuronal recording]

Data from Neuronal Recordings

Many time series exhibit:

  • Long-range correlations
  • Non-Markovian dynamics

In a multivariate setting:

  • Time-varying correlations

Goals

Sometimes also…

  • Functional data analysis

    → sharing a common global trend

SLIDE 3

Magnetoencephalography (MEG)

. . .

Helmet with 102 sensors

SLIDE 4

Magnetoencephalography (MEG)

. . .

Helmet with 102 sensors


  • Long-range dependencies
  • Time-varying correlations
SLIDE 5

Trial-to-Trial Variability

  • Data are noisy (low SNR)

§ Multiple trials recorded for each stimulus

  • Each trial records the same process

§ Capture common global trajectory § Allow trial-to-trial variability

  • Functional data analysis setting
SLIDE 6

MEG Noise

SLIDE 10

Build Word-Specific Model

Stimulus: w = HOUSE

y_t ∼ N(µ^(w)(x_t), Σ^(w)(x_t))

Hierarchy captures trial-to-trial variability

SLIDE 11

Build Word-Specific Model

Capturing heteroscedasticity is key

Stimulus: w = HOUSE

y_t ∼ N(µ^(w)(x_t), Σ^(w)(x_t))

[Figure: mean µ(x) for Sensors 1–2 over Time 1–3, with covariances Σ(x₁), Σ(x₂), Σ(x₃)]

SLIDE 12

Build Word-Specific Model

Harness k-dim latent space

Stimulus: w = HOUSE

y_t ∼ N(µ^(w)(x_t), Σ^(w)(x_t)),  y_t ∈ ℝ¹⁰², latent space ℝᵏ

SLIDE 13

Low-Rank Covariance Evolution

Σ(x) = Λ(x)Λ(x)ᵀ + Σ₀

  • Λ(x): p × k matrix of “dictionary elements” λᵢⱼ(·)

§ E.g., Gaussian process elements § p × k, with k << p

Fox and Dunson, “Bayesian Nonparametric Covariance Regression”, under review.
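The low-rank construction Σ(x) = Λ(x)Λ(x)ᵀ + Σ₀ can be sketched numerically; the squared-exponential kernel, the dimensions p, k, n, and the diagonal noise floor here are illustrative assumptions, not the specification from the talk.

```python
import numpy as np

def se_kernel(xs, length_scale=0.2):
    """Squared-exponential covariance matrix over input locations xs."""
    d = xs[:, None] - xs[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

rng = np.random.default_rng(0)
p, k, n = 5, 2, 50                        # p observed dims, k << p, n inputs
xs = np.linspace(0.0, 1.0, n)
K = se_kernel(xs) + 1e-8 * np.eye(n)      # jitter for numerical stability
L = np.linalg.cholesky(K)

# Each of the p*k dictionary elements lambda_ij(.) is an independent GP draw.
Lam = np.einsum('nm,pkm->pkn', L, rng.standard_normal((p, k, n)))

Sigma0 = 0.1 * np.eye(p)

def Sigma(t):
    """Induced covariance at input index t: low-rank term plus diagonal."""
    Lt = Lam[:, :, t]                     # p x k slice Lambda(x_t)
    return Lt @ Lt.T + Sigma0

S = Sigma(10)
assert np.allclose(S, S.T)                # symmetric
assert np.all(np.linalg.eigvalsh(S) > 0)  # positive definite by construction
```

Because Λ(x)Λ(x)ᵀ is rank k and Σ₀ is full rank, Σ(x) is a valid covariance at every x while the smoothly varying part lives in a k-dimensional latent space.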

SLIDE 14

Low-Rank Covariance Evolution

Σ(x) = Λ(x)Λ(x)ᵀ + Σ₀

[Figure: Σ(x) as the low-rank product Λ(x)Λ(x)ᵀ plus Σ₀]

Fox and Dunson, “Bayesian Nonparametric Covariance Regression”, under review.

SLIDE 15

One Step Further…

Σ(x) = Θξ(x)ξ(x)ᵀΘᵀ + Σ₀

[Figure: Λ(·) = Θ ξ(·), with constant weights Θ = (θᵢⱼ) and GP dictionary elements ξᵢⱼ(·)]

Fox and Dunson, “Bayesian Nonparametric Covariance Regression”, under review.

SLIDE 16

Changing Correlations – MEG

102 sensors: Correlations between sensors change with processing of word “kick”

SLIDE 17

Mean Hierarchy

µ^(w)(x) → µ^(w,1)(x), …, µ^(w,J)(x)   (Trial 1 … Trial J)

(Note: defined in a k-dim space and projected up)

Fyshe, Fox, Dunson, and Mitchell, “Hierarchical Latent Dictionary Learning for Word Classification using Brain Activation Patterns”, AISTATS 2012.

SLIDE 18

Data Collection

  • 4 word categories, 5 words per category
  • 20 repetitions per word (400 total)

§ 15 train/word (300 total) § 5 test/word (100 total)

Animals Tools Food Buildings

Fyshe, Fox, Dunson, and Mitchell, “Hierarchical Latent Dictionary Learning for Word Classification using Brain Activation Patterns”, AISTATS 2012.

SLIDE 19

Classification Performance

Fyshe, Fox, Dunson, and Mitchell, “Hierarchical Latent Dictionary Learning for Word Classification using Brain Activation Patterns”, AISTATS 2012.

SLIDE 20

MEG Data – 1 Sensor

[Figure: observations vs. time, 3 trials, 1 sensor]

Yes:

  • Long-range correlations
  • Non-Markovian dynamics

What we missed:

  • Abrupt changes
  • Locally stationary dynamics

Long-range correlations span changepoints

SLIDE 21

MEG Data – 1 Sensor

[Figure: observations vs. time, 3 trials, 1 sensor; sample correlation matrix over time (20 trials)]

Key features:

  • Long-range correlations
  • Abrupt changes
  • Locally smooth

SLIDE 22

GPs on Nested Partition

Parent function:

  • Smooth global trajectory
  • Long-range correlations
  • Non-Markovian dynamics
  • Stationary

f⁰(x) ∼ N(0, K₀)

[Figure: draw of the parent function f⁰ at inputs x₁, x₂, …, xₙ]

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 23

GPs on Nested Partition

[Figure: level-1 partition into sets A¹₁ and A¹₂]

changepoint = break in stationarity

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 24

GPs on Nested Partition

[Figure: level-1 partition into sets A¹₁ and A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 25

GPs on Nested Partition

[Figure: level-1 partition into sets A¹₁ and A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)
f¹(A¹₂) ∼ GP(f⁰(A¹₂), c¹₂)

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 26

GPs on Nested Partition

[Figure: level-1 partition into sets A¹₁ and A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)
f¹(A¹₂) ∼ GP(f⁰(A¹₂), c¹₂)

f¹(x) | f⁰ ∼ N(0, K₁)

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 27

GPs on Nested Partition

[Figure: level-2 refinement A²₁, …, A²₄ of the level-1 sets A¹₁, A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)
f¹(A¹₂) ∼ GP(f⁰(A¹₂), c¹₂)

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 28

GPs on Nested Partition

[Figure: level-2 refinement A²₁, …, A²₄ of the level-1 sets A¹₁, A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)
f¹(A¹₂) ∼ GP(f⁰(A¹₂), c¹₂)

. . .

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 29

GPs on Nested Partition

[Figure: level-2 refinement A²₁, …, A²₄ of the level-1 sets A¹₁, A¹₂]

f¹(A¹₁) ∼ GP(f⁰(A¹₁), c¹₁)
f¹(A¹₂) ∼ GP(f⁰(A¹₂), c¹₂)

fℓ(x) | fℓ⁻¹ ∼ N(0, Kℓ)

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.
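The recursion above can be sketched as follows: a smooth parent f⁰, then at each level a GP drawn independently on each partition set, centered on the previous level. The deterministic midpoint cuts and the halving schedule for variance and length-scale are illustrative assumptions, not the paper's exact hyperparameter choices.

```python
import numpy as np

def se_cov(xs, var, ls):
    """Squared-exponential covariance: marginal variance var, length-scale ls."""
    d = xs[:, None] - xs[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2) + 1e-9 * np.eye(len(xs))

rng = np.random.default_rng(1)
n = 200
xs = np.linspace(0.0, 1.0, n)

# Level 0: smooth, stationary parent function f^0.
f = rng.multivariate_normal(np.zeros(n), se_cov(xs, var=1.0, ls=0.3))

partition = [np.arange(n)]                  # level 0: a single set
for level in range(1, 3):                   # two refinement levels
    children = []
    for idx in partition:
        cut = len(idx) // 2                 # midpoint changepoint, for illustration
        children += [idx[:cut], idx[cut:]]
    g = f.copy()
    for idx in children:                    # f^l drawn independently per set,
        C = se_cov(xs[idx], var=0.5 ** level, ls=0.3 * 0.5 ** level)
        g[idx] = rng.multivariate_normal(f[idx], C)   # centered on f^{l-1}
    partition, f = children, g

assert len(partition) == 4                  # leaf sets A^2_1, ..., A^2_4
assert f.shape == (n,)
```

Each level adds finer, lower-variance detail on top of its parent, so the draw is smooth within each leaf set while allowing abrupt changes at the cuts.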

SLIDE 30

GPs on Nested Partition

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

g = f^L

Fox and Dunson, “Multiresolution Gaussian Processes”, to appear NIPS 2012.

SLIDE 32

Induced Marginal GP

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Conditioned on partition, marginalize GPs

SLIDE 33

Induced Marginal GP

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄; induced covariance matrix over time]

Equivalent to a GP with a partition-dependent (non-stationary) covariance function

SLIDE 34

Correlation Structure

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄, with locations xᵢ, xⱼ and observations yᵢ, yⱼ]

corr(yᵢ, yⱼ | A) sums contributions cA⁰(xᵢ, xⱼ) + cA¹₁(xᵢ, xⱼ) + … from each partition set shared by xᵢ and xⱼ

SLIDE 35

Correlation Structure

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

corr(yᵢ, yⱼ | A) = Σ_{ℓ=0}^{Lᵢⱼ} cℓ rℓ(xᵢ, xⱼ) / √[(σ² + Σ_{ℓ=0}^{L−1} cℓ rℓ(xᵢ, xᵢ))(σ² + Σ_{ℓ=0}^{L−1} cℓ rℓ(xⱼ, xⱼ))]

Lᵢⱼ = lowest tree level at which xᵢ and xⱼ fall in the same partition set

  • Correlation spans changepoints
  • Higher correlation for pairs sharing more partition sets

SLIDE 36

Covariance Function – Length scale

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Length-scale hyperparam:

  • Fractal-like smoothness
  • Locally as smooth as parent fcn
  • Lower levels capture more detail
  • Only one param

SLIDE 37

Covariance Function – Variance

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Variance hyperparam:

  • Decreasing variability from parent
  • Finite var regardless of tree depth
  • Lower levels are less influential

SLIDE 38

Covariance Function – Variance

[Figure: nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Variance hyperparam:

  • Decreasing variability from parent
  • Finite var regardless of tree depth
  • Lower levels are less influential

Resulting function is similar to the higher-level function despite adding changepoints

SLIDE 39

Balanced Binary Trees

[Figure: balanced binary tree over partition sets A¹₁, A¹₂; A²₁, …, A²₄, rooted at A⁰]

SLIDE 40

Related Methods

[Figure: partition sets A²₁, …, A²₄]

Treed GPs: Gramacy and Lee 2008; Kim, Mallick, Holmes 2005
GP changepoint models: Saatci, Turner, Rasmussen 2010
Mixture of GP experts: Meeds and Osindero 2006; Rasmussen and Ghahramani 2002
Phylogenies of GPs: Jones and Moriarty 2011; Henao and Lucas 2012

Function-Valued Observations

SLIDE 41

Related Methods

[Figure: partition sets A²₁, …, A²₄]

Treed GPs: Gramacy and Lee 2008; Kim, Mallick, Holmes 2005
GP changepoint models: Saatci, Turner, Rasmussen 2010
Mixture of GP experts: Meeds and Osindero 2006; Rasmussen and Ghahramani 2002
Multiscale Gaussian models: cf. Willsky 2002

SLIDE 42

Multiple Trials

[Figure: parent f⁰ over nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Multiresolution GP

SLIDE 43

Multiple Trials – Example for MEG

[Figure: parent f⁰ over nested partition A¹₁, A¹₂; A²₁, …, A²₄]

Shared parent function, shared partition

Multiresolution GP

Trial-specific process, j = 1, …, J

SLIDE 44

Multiple Trials – Example for MEG

f⁰ (shared parent function)

[Figure: one nested partition A¹₁, A¹₂; A²₁, …, A²₄ per trial, generating y⁽¹⁾, y⁽²⁾, …, y⁽ᴶ⁾]

SLIDE 45

Multiple Trials – Example for MEG

f⁰ (shared parent function)

[Figure: per-trial nested partitions generating y⁽¹⁾, y⁽²⁾, …, y⁽ᴶ⁾, with observations obtained by integrating over x (∫ dx)]

SLIDE 46

Multiple Trials – Example for MEG

f⁰ (shared parent function)

[Figure: per-trial nested partitions generating y⁽¹⁾, …, y⁽ᴶ⁾, collapsed to a graphical model with shared f⁰ and A]

Shared parent function · Shared partition · Conditionally independent trials
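The trial structure can be sketched generatively: all trials share f⁰ and the partition, and are conditionally independent given them. The kernels, noise level, and single shared changepoint below are illustrative assumptions, not the MEG model's exact settings.

```python
import numpy as np

def se_cov(xs, var, ls):
    """Squared-exponential covariance: marginal variance var, length-scale ls."""
    d = xs[:, None] - xs[None, :]
    return var * np.exp(-0.5 * (d / ls) ** 2) + 1e-9 * np.eye(len(xs))

rng = np.random.default_rng(4)
n, J = 120, 3
xs = np.linspace(0.0, 1.0, n)

f0 = rng.multivariate_normal(np.zeros(n), se_cov(xs, 1.0, 0.3))  # shared parent
cut = n // 2                                  # shared partition: one changepoint
sets = [np.arange(cut), np.arange(cut, n)]

trials = []
for j in range(J):                            # conditionally independent given f0, A
    g = f0.copy()
    for idx in sets:                          # trial-specific deviation per set
        g[idx] = rng.multivariate_normal(f0[idx], se_cov(xs[idx], 0.3, 0.1))
    trials.append(g + 0.05 * rng.standard_normal(n))  # observation noise
trials = np.asarray(trials)

assert trials.shape == (J, n)
# Trials vary individually but track the common global trajectory.
assert np.corrcoef(trials.mean(axis=0), f0)[0, 1] > 0.3
```

Averaging across trials recovers the shared trajectory, which is exactly the trial-to-trial variability structure the hierarchy is meant to capture.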

SLIDE 47

Draw from Prior

[Figure: simulated draw from the prior (observations vs. time, sample correlation matrix from 100 trials) next to MEG data (observations vs. time, sample correlation matrix from 20 trials)]

SLIDE 48

Conditioned on the Partition…

  • Posterior global trajectory

Shared parent function

[Figure: graphical model of per-trial partitions generating y⁽¹⁾, y⁽²⁾, …, y⁽ᴶ⁾]

SLIDE 49

Conditioned on the Partition…

  • Posterior global trajectory
  • Posterior predictive distribution of new trial

Shared parent function

[Figure: graphical model of per-trial partitions generating y⁽¹⁾, y⁽²⁾, …, y⁽ᴶ⁾]

SLIDE 50

Conditioned on the Partition…

  • Posterior global trajectory
  • Posterior predictive distribution of new trial
  • Marginal (conditional) likelihood

Key to inference of nested partition!

SLIDE 51

Independence Chain MCMC

  • Likelihood:
SLIDE 52

Independence Chain MCMC

  • Likelihood:
  • Prior:

§ Define distribution on changepoints (level-independent) § Easy to define uniform distribution on trees and elicit prior info

p(A) = ∏ᵢ F(zᵢ)

Throw down 2^L − 1 changepoints zᵢ according to F; deterministically merge to form partition A
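A draw from this prior can be sketched in a few lines: sample 2^L − 1 changepoints i.i.d. from F, then deterministically merge them into per-level cuts of a balanced binary tree. Uniform F and the middle-point merging rule are illustrative assumptions.

```python
import numpy as np

def nested_partition(changepoints):
    """Deterministically merge sorted changepoints into per-level cut lists
    for a balanced binary tree (the middle point splits each node)."""
    frontier, levels = [sorted(changepoints)], []
    while frontier:
        cuts, nxt = [], []
        for seg in frontier:
            m = len(seg) // 2
            cuts.append(seg[m])             # middle changepoint cuts this node
            if seg[:m]:
                nxt.append(seg[:m])
            if seg[m + 1:]:
                nxt.append(seg[m + 1:])
        levels.append(sorted(cuts))
        frontier = nxt
    return levels

rng = np.random.default_rng(2)
T, L = 300, 3
z = rng.uniform(0, T, size=2 ** L - 1)      # 2^L - 1 changepoints drawn from F
levels = nested_partition(z)

assert len(levels[0]) == 1                  # one cut at the coarsest level
assert sum(len(c) for c in levels) == 2 ** L - 1
```

Because the merging step is deterministic, the prior mass on a partition A is just the product of the F densities at its changepoints, which is what makes p(A) easy to evaluate inside MCMC.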

SLIDE 53

Independence Chain MCMC

  • Likelihood:
  • Prior:

§ Define distribution on changepoints (level-independent) § Easy to define uniform distribution on trees and elicit prior info

  • Proposal: ????

nested partition = balanced binary tree

p(A) = ∏ᵢ F(zᵢ)

SLIDE 54

Inference of Hierarchical Partition

  • Stochastic tree search tends to be inefficient
  • Can harness specific correlation structure
  • Want method to (hierarchically) find drops in correlation

[Figure: sample correlation matrix over time]

SLIDE 55

Inference of Hierarchical Partition

[Figure: sample correlation matrix over time, with cut 1 and cut 2 marked]

  • Stochastic tree search tends to be inefficient
  • Can harness specific correlation structure
  • Want method to (hierarchically) find drops in correlation
  • Think of problem as graph cutting

§ Node = time step § Edge = correlation

SLIDE 56

Normalized Cuts (Shi & Malik 2000)

[Figure: sample correlation matrix over time, with cut 1 and cut 2 marked]

  • Normalized cuts balances:

§ Amount of edge weight cut § Connectivity of component

  • Cost matrix = sample correlation matrix

W = abs(corr(Y))

SLIDE 57

Normalized Cuts

[Figure: sample correlation matrix over time, with a cutpoint marked]

  • Normalized cuts balances:

§ Amount of edge weight cut § Connectivity of component

  • Cost matrix = sample correlation matrix
  • Cost of cut:

W = abs(corr(Y))

ncut(A, B) = cut(A, B) · (1/assoc(A, V) + 1/assoc(B, V))

Encourages cutting small edge weights; penalizes cutting disconnected components
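The proposal can be sketched concretely: treat each time step as a graph node with weights W = |corr(Y)|, score every contiguous cutpoint by the Shi–Malik normalized-cut cost, and turn the inverted costs into a proposal distribution. The two-block simulated data and the inverse-cost normalization are illustrative assumptions.

```python
import numpy as np

def ncut_costs(W):
    """Normalized-cut cost of splitting nodes {0..t-1} vs {t..n-1} at each t."""
    n = W.shape[0]
    deg = W.sum(axis=1)                       # node degrees (assoc with V)
    costs = []
    for t in range(1, n):
        A, B = np.arange(t), np.arange(t, n)
        cut = W[np.ix_(A, B)].sum()           # total edge weight crossing the cut
        costs.append(cut / deg[A].sum() + cut / deg[B].sum())
    return np.array(costs)

rng = np.random.default_rng(3)
# Two locally correlated blocks with an abrupt change at t = 50:
u1, u2 = rng.standard_normal((100, 1)), rng.standard_normal((100, 1))
Y = np.concatenate([u1 + 0.3 * rng.standard_normal((100, 50)),
                    u2 + 0.3 * rng.standard_normal((100, 50))], axis=1)
W = np.abs(np.corrcoef(Y, rowvar=False))      # cost matrix W = abs(corr(Y))

costs = ncut_costs(W)
probs = (1.0 / costs) / (1.0 / costs).sum()   # low-cost cuts proposed more often

assert np.isclose(probs.sum(), 1.0)
assert 45 <= np.argmin(costs) + 1 <= 55       # minimizer near the true changepoint
```

Sampling cutpoints from `probs` rather than always taking the minimizer is what makes this usable as an MCMC proposal while still concentrating on drops in correlation.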

SLIDE 58

Normalized Cuts

[Figure: sample correlation matrix over time]

  • Normalized cuts balances:

§ Amount of edge weight cut § Connectivity of component

  • Cost matrix = sample correlation matrix
  • Cost of cut: ncut(A, B) = cut(A, B) · (1/assoc(A, V) + 1/assoc(B, V))
  • Hierarchically perform cuts

W = abs(corr(Y))

SLIDE 59

Normalized Cuts

[Figure: sample correlation matrix and the resulting normalized-cuts partition (recursive minimization)]

SLIDE 60

Normalized Cuts Proposal

  • Instead of recursive min

[Figure: sample correlation matrix with the ncut(A, B) cost over cutpoints; recursive minimization always chooses the minimizing cutpoint]

SLIDE 61

Normalized Cuts Proposal

  • Instead of recursive min
  • Use ncuts metric as proposal

[Figure: sample correlation matrix with the ncut(A, B) cost over cutpoints]

SLIDE 62

Independence Chain MCMC

  • Likelihood:
  • Prior:

§ Define distribution on changepoints (level-independent) § Easy to define uniform distribution on trees and elicit prior info

  • Proposal:

p(A) = ∏ᵢ F(zᵢ)

Complexity O(n³)
Complexity O(n²(L−1))

SLIDE 63

Independence Chain MCMC

  • Likelihood:
  • Prior:

§ Define distribution on changepoints (level-independent) § Easy to define uniform distribution on trees and elicit prior info

  • Proposal:
  • Can also interleave local node repartition proposals instead of global partition proposals

p(A) = ∏ᵢ F(zᵢ)

SLIDE 64

Node Proposals

[Figure: tree of partition sets rooted at A⁰, with levels A¹₁, A¹₂; A²₁, …, A²₄; A³₁, …, A³₈]

SLIDE 65

Node Proposals

[Figure: tree of partition sets rooted at A⁰, with levels A¹₁, A¹₂; A²₁, …, A²₄; A³₁, …, A³₈]

SLIDE 66

Node Proposals

[Figure: tree of partition sets rooted at A⁰, with levels A¹₁, A¹₂; A²₁, …, A²₄; A³₁, …, A³₈]

SLIDE 67

Node Proposals

[Figure: tree of partition sets rooted at A⁰, with levels A¹₁, A¹₂; A²₁, …, A²₄; A³₁, …, A³₈]

Equivalent to global repartition proposal!

SLIDE 68

Simulated Data

[Figure: simulated observations vs. time; sample correlation matrix (100 trials)]

SLIDE 69

Simulated Data

[Figure: simulated observations vs. time; true partition; sample correlation matrix (100 trials)]

SLIDE 70

Simulated Data

[Figure: simulated observations vs. time; true partition; ncuts partition; sample correlation matrix (100 trials)]

SLIDE 71

Simulated Data

[Figure: simulated observations vs. time; true partition; ncuts partition; MAP partition; sample correlation matrix (100 trials)]

SLIDE 72

MEG Data

  • 10 words
  • 20 repetitions per word

§ 15 training/word § 5 test/word

  • Examine one multiresolution GP per word per sensor

Buildings: Apartment, Barn, Church, Igloo, House
Tools: Chisel, Hammer, Pliers, Saw, Screwdriver

SLIDE 73

MEG Changepoints – Level 1

SLIDE 77

MEG Changepoints – Level 1

[Figure: level-1 changepoints aligned with stimulus onset, the N100 response, and semantic processing]

SLIDE 78

Baselines – Single and Hierarchical GPs

[Figure: three model structures]

Multiresolution GP: shared f⁰ over nested partition A¹₁, A¹₂; A²₁, …, A²₄, with trial-specific g⁽ʲ⁾, j = 1, …, J
Hierarchical GP: shared f⁰ with trial-specific g⁽ʲ⁾, j = 1, …, J
Single GP: f⁰ only

(cf. Fyshe et al., AISTATS 2012)

SLIDE 79

Baselines – Single and Hierarchical GPs

[Figure: three model structures]

Multiresolution GP: shared f⁰ over the nested partition with trial-specific g⁽ʲ⁾, j = 1, …, J
Hierarchical GP: no partition
Single GP: no partition, no trial-to-trial variability

(cf. Fyshe et al., AISTATS 2012)

SLIDE 80

Decrease in MSE

[Figure: % decrease in MSE vs. GP as a function of conditioning point, per lobe (Visual, Frontal, Parietal, Temporal), mGP vs. GP; example conditioned prediction showing the test trial, mGP, and hGP beyond the conditioning point]

SLIDE 81

Decrease in MSE

[Figure: % decrease in MSE per lobe (Visual, Frontal, Parietal, Temporal) vs. conditioning point, for mGP vs. hGP and mGP vs. GP]

SLIDE 82

Entire Heldout Prediction

[Figure: entire heldout prediction, observations vs. time (sec), for MLE, hGP, and mGP]

SLIDE 83

Wavelet-based Functional Mixed Models

Morris & Carroll 2006 (JRSS B)

  • Allows spiky trajectories
  • Models related functions
  • Notes:

§ Assumes regular grid of obs § Can cope with multivariate setting (not used here)

[Figure: heldout log likelihood (×10⁴), wfmm vs. mGP]

  • Examine each word and sensor independently
  • Compute heldout likelihood of 5 entire trials

SLIDE 84

Summary

Key features:

  • Long-range correlations
  • Abrupt changes
  • Locally smooth

Additionally:

  • Functional data analysis

    → sharing a common global trend

  • Irregular grid of observations
  • Tractability and interpretability

[Figure: MEG observations vs. time and sample correlation matrix]

SLIDE 85

Extensions

  • Multivariate settings

§ Input spaces § Output spaces

  • Hierarchical dependence structures

§ Partial sharing of parents in the tree § mGP factor models

  • Incorporate mGP in a functional ANOVA framework
  • Theoretical analysis

§ Posterior consistency

[Figure: Θ ξ(·) factorization with weights θᵢⱼ and GP elements ξᵢⱼ(·)]

  • Prior on multivariate partitions
  • Partition proposals: spectral clustering using graph Laplacian