


  1. 5/22/2016 Scalable Gaussian Processes for Characterizing Multidimensional Change Surfaces
     April 18, 2016. William Herlands. Committee: Daniel Neill, Alex Smola, Wilbert van Panhuis. Chair: Dave Choi
     Outline
     1. Motivation
     2. Gaussian process introduction
     3. Change surface model
     4. Analysis of measles in the United States

  2. Complex Changes
     • In human systems, changes are often complex
       – Policy interventions take time to trickle through government bureaucracy
       – Environmental hazards affect populations differentially
     • Simple changepoint models are not sufficiently expressive
     Why do we care?
     • Understand past changes
       – Explore spatio-temporal heterogeneity
       – Model the rate of changes in different areas
     • Enable more accurate or equitable policies
     • Applications
       – Measles incidence in the U.S.
       – Concerns about lead-tainted water in NYC [Figure: panels dated 08 Jul 2014 and 21 Oct 2015]

  3. Our objectives
     • Model complex changes in real world data
       – Multiple, flexible function regimes → Gaussian processes for flexible functions
       – Non-discrete changes
       – Non-monotonic changes
       – Heterogeneous changes over space, time, etc. → “Change surfaces” for complex changes
     Gaussian Processes (GP)
     • Non-parametric prior over smooth functions: $f(x) \sim \mathcal{GP}(m(x), k(x, x'))$
       – $m(x) = \mathbb{E}[f(x)]$
       – $k(x, x') = \operatorname{cov}(f(x), f(x'))$
     • Covariance function is a kernel. It defines the covariance of function values
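
To make the GP prior above concrete, here is a minimal sketch that draws functions from a zero-mean GP on a 1-D grid. The squared exponential kernel, its hyperparameters, the jitter value, and the names `rbf_kernel` and `sample_gp_prior` are illustrative assumptions and do not come from the slides.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    """Squared exponential kernel k(x, x') for 1-D inputs (assumed kernel choice)."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def sample_gp_prior(x, n_samples=3, jitter=1e-6, seed=0):
    """Draw samples f ~ GP(0, k) evaluated at the points x."""
    rng = np.random.default_rng(seed)
    # Jitter keeps the Cholesky factorization numerically stable.
    K = rbf_kernel(x, x) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    # If z ~ N(0, I), then L z ~ N(0, K).
    return (L @ rng.standard_normal((len(x), n_samples))).T

x = np.linspace(-5.0, 5.0, 50)
samples = sample_gp_prior(x)  # one row per sampled function
```

Shorter lengthscales give wigglier draws, which is one way to see how the kernel encodes assumptions about the function.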

  4. Gaussian Processes (GP)
     • Any finite set of $f(x)$ is Normally distributed: $[f(x_1), \dots, f(x_m)] \sim \mathcal{N}(m(x), K)$
     • Observation model: $y(x) = f(x) + \epsilon$, $\epsilon \sim \mathcal{N}(0, \sigma^2)$
     • Marginal log likelihood optimization: $\log p(y \mid \theta) \propto -\log|K + \sigma^2 I| - y^\top (K + \sigma^2 I)^{-1} y$
     Full Model
     • Our model is a convex combination of $f_i$: $y(x) = s_1(x) f_1(x) + \dots + s_r(x) f_r(x) + \epsilon_n$
       – Switching functions: $s_i(x)$, with $\sum_{i=1}^{r} s_i(x) = 1$
       – Functional regimes: $f_i(x)$
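
The marginal log likelihood above can be evaluated with a Cholesky factorization, which is the naive cubic-cost approach. The sketch below assumes a zero mean function and includes the usual factor of 1/2 and the constant term that the proportional form on the slide omits. The function name and the toy data are hypothetical.

```python
import numpy as np

def gp_log_marginal_likelihood(K, y, noise_var):
    """log p(y | theta) for a zero-mean GP with Gaussian noise (naive O(n^3))."""
    n = y.shape[0]
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    # alpha = (K + sigma^2 I)^{-1} y via two solves against the Cholesky factor.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K + sigma^2 I| = 2 * sum(log(diag(L)))
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (y @ alpha) - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)

# Toy example with an assumed squared exponential kernel.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2**2)
y = rng.standard_normal(20)
lml = gp_log_marginal_likelihood(K, y, noise_var=0.1)
```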

  5. Model part 1: Functional Regimes
     • GP prior for each functional regime: $f_i \sim \mathcal{GP}(0, K_i)$, $i = 1, \dots, r$
       – Use flexible stationary kernels
     Model part 2: Change Surfaces
     • Changepoint: $s_i = \mathbb{I}(t > T)$
     • Non-discrete changepoint: $s_i = \mathrm{softmax}(t - T)$
     • Change surface: $s_i = \mathrm{softmax}(w_i(t))$, written $s_i = \sigma(w_i(t))$
     [Plots: each $s_i$ against $t \in [-10, 10]$, ranging from 0 to 1]
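
The three switching functions above differ only in how sharply they move between regimes. The sketch below contrasts a hard indicator, a logistic transition, and a softmax over per-regime scores. The function names and the `rate` parameter are assumptions for illustration. With two regimes and scores `[w, 0]`, the softmax reduces to the logistic function.

```python
import numpy as np

def hard_changepoint(t, T):
    """Discrete changepoint: 0 before T, 1 after."""
    return (t > T).astype(float)

def soft_changepoint(t, T, rate=1.0):
    """Smooth (non-discrete) changepoint centred at T; `rate` is an assumed knob."""
    return 1.0 / (1.0 + np.exp(-rate * (t - T)))

def softmax_weights(W):
    """Row-wise softmax of scores W with shape (n, r); each row sums to 1."""
    W = W - W.max(axis=1, keepdims=True)  # subtract the row max for stability
    E = np.exp(W)
    return E / E.sum(axis=1, keepdims=True)

t = np.linspace(-10.0, 10.0, 201)
s_hard = hard_changepoint(t, T=0.0)
s_soft = soft_changepoint(t, T=0.0)
S = softmax_weights(np.column_stack([t, np.zeros_like(t)]))  # two regimes
```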

  6. Model part 2: Change Surfaces
     • Random Kitchen Sink features for $w_i(x)$: $w_i(x) = \sum_{j=1}^{q} a_j \cos(\omega_j^\top x + b_j)$
       – Variable rate of change
       – Non-monotonic
       – Heterogeneous over input
     Full Model
     • Gaussian process change surface model: $y(x) = \sum_{i=1}^{r} \sigma(w_i(x)) f_i(x) + \epsilon_n$, with $f_i(x) \sim \mathcal{GP}(0, K_i)$
     • Can depict this as a single Gaussian process with covariance function $k_{\mathrm{all}}(x, x') = \sum_{i=1}^{r} \sigma(w_i(x)) \, k_i(x, x') \, \sigma(w_i(x'))$
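
A minimal sketch of how the combined covariance could be assembled from Random Kitchen Sink scores follows. It assumes that σ is a softmax across regimes, that each regime uses a squared exponential kernel, and that the RKS parameters are drawn at random. In the model these would be learned, and all names here are hypothetical.

```python
import numpy as np

def rks_features(X, omega, b, a):
    """w(x) = sum_j a_j * cos(omega_j^T x + b_j), evaluated for each row of X."""
    return np.cos(X @ omega.T + b) @ a

def softmax(W):
    W = W - W.max(axis=1, keepdims=True)
    E = np.exp(W)
    return E / E.sum(axis=1, keepdims=True)

def rbf(X, lengthscale):
    """Squared exponential kernel matrix (assumed regime kernel)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def change_surface_covariance(K_list, S):
    """K_all = sum_i diag(s_i) K_i diag(s_i), with S[:, i] = sigma(w_i(x))."""
    n = S.shape[0]
    K_all = np.zeros((n, n))
    for i, K_i in enumerate(K_list):
        K_all += np.outer(S[:, i], S[:, i]) * K_i
    return K_all

rng = np.random.default_rng(0)
n, d, q, r = 30, 2, 5, 2  # points, input dims, RKS features, regimes
X = rng.uniform(-1.0, 1.0, size=(n, d))
W = np.column_stack([
    rks_features(X, rng.standard_normal((q, d)),
                 rng.uniform(0.0, 2.0 * np.pi, q), rng.standard_normal(q))
    for _ in range(r)
])
S = softmax(W)
K_all = change_surface_covariance([rbf(X, 0.3), rbf(X, 1.0)], S)
```

Each term is a positive semidefinite kernel scaled on both sides by the same function, so the sum remains a valid covariance matrix.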

  7. Scalable Inference
     • Log likelihood naively $O(n^3)$: $\log p(y \mid \theta) \propto -\log|K + \sigma^2 I| - y^\top (K + \sigma^2 I)^{-1} y$
     • We develop scalable Kronecker inference using the Weyl bound, $O(D n^{\frac{D+1}{D}})$
     Measles in the United States
     • Data
       – Monthly incidence rates 1935 – 2003
       – Continental United States and D.C.
       – $x \in \mathbb{R}^3$: 2D space and 1D time
       – Measles vaccine introduced in 1963

  8. Measles in 3 states
     [Figure: incidence (1000s) and σ(w(x)) by year, 1940 to 2010, for California (CA), Maine (ME), and Michigan (MI)]

  9. Measles in 3 states
     • GP change surface
       – 2 functional regimes
       – $w_i(x)$ as RKS with 5 features
     • Not a causal model!
     [Figure: incidence (1000s) and σ(w(x)) by year, 1940 to 2010, for California (CA), Maine (ME), and Michigan (MI)]

  10. Measles in 3 states
      [Figure: incidence (1000s) and σ(w(x)) by year, 1940 to 2010, for California (CA), Maine (ME), and Michigan (MI)]
      “Change date” per state: σ(w(x)) = 0.5
      “Change slope” from σ(w(x)) = 0.25 to 0.75
      [Figure: Michigan (MI) panel annotated with the change date and change slope]

  11. Change date for measles in U.S.
      For each state, date where σ(w(x)) = 0.5
      [Map: color scale from 1961.5 to 1967.2]
      Change slope for measles in U.S.
      For each state, slope of σ(w(x)) between 0.75 and 0.25
      [Map: color scale from 0.156 to 0.297]
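
As a sketch of how the two summaries above could be read off a fitted change surface for one state, the code below finds where a sampled curve crosses 0.25, 0.5 and 0.75 by linear interpolation. The logistic curve standing in for σ(w(x)), its rate, and the function names are assumptions for illustration and are not results from the measles analysis.

```python
import numpy as np

def crossing_time(t, s, level):
    """First time the sampled curve s(t) crosses `level` (linear interpolation)."""
    above = s >= level
    idx = np.flatnonzero(above[1:] != above[:-1])
    if idx.size == 0:
        return float("nan")  # the curve never crosses this level
    k = idx[0]
    frac = (level - s[k]) / (s[k + 1] - s[k])
    return float(t[k] + frac * (t[k + 1] - t[k]))

def change_date_and_slope(t, s):
    """Change date: s = 0.5. Change slope: average slope between s = 0.25 and 0.75."""
    t25, t50, t75 = (crossing_time(t, s, q) for q in (0.25, 0.5, 0.75))
    return t50, (0.75 - 0.25) / (t75 - t25)

# Hypothetical monthly grid and a stand-in logistic change surface.
t = np.arange(1935.0, 2003.0 + 1e-9, 1.0 / 12.0)
s = 1.0 / (1.0 + np.exp(-0.5 * (t - 1963.0)))
change_date, change_slope = change_date_and_slope(t, s)
```

A non-monotonic surface can cross a level more than once. This sketch reports only the first crossing, which is a simplifying choice.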

  12. Regression Analysis
      • Explore factors that affect the change date
        – Birth and death rates
        – Population numbers per age segment
        – Income information
        – Government hospital and health workers
        – Slope of change surface
        – Average temperature
      Demographic Analysis

  13. Regression Analysis
      • Gini of family income
        – Economically depressed communities
        – Rural regions
      • Slope of change surface
        – Fewer cases nationwide enable more effective immunization later
      [Map: color scale from 1961.5 to 1967.2]
      Conclusions
      • Introduced model for “change surfaces” in real world data
      • Developed scalable inference for additive, non-stationary Gaussian processes
      • Identified heterogeneity in first years of the measles vaccine
      • Used the results of the change surface model for policy relevant conclusions

  14. Acknowledgements
      • Committee
        – Daniel Neill, Alex Smola, Wilbert van Panhuis
      • Chair
        – Dave Choi
      • Collaborators*
        – Andrew Wilson
        – Seth Flaxman
        – Hannes Nickisch
      *Subset of paper accepted to AISTATS 2016
      Questions?
      Fin.

  15. Backup slides

  16. Spectral Mixture Kernels
      Inference
      • Compute log marginal likelihood
      • General Kronecker methods for scalability
        – Assume: [equation not recovered]
        – Assume: multiplicative kernel across D
        – Then we can decompose kernel matrix, [equation not recovered]
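
The general Kronecker idea can be sketched as follows, under the assumption (standard for these methods, but not recoverable from the slide) that the inputs lie on a D-dimensional grid, so that a multiplicative kernel gives K = K_1 ⊗ ... ⊗ K_D. The eigenvalues of K are then products of the factor eigenvalues, so the log determinant needs only the small factor decompositions. The function names and toy grid sizes are hypothetical.

```python
import numpy as np
from functools import reduce

def kron_logdet(K_factors, noise_var):
    """log|K_1 (x) ... (x) K_D + sigma^2 I| from per-dimension eigenvalues."""
    eigs = [np.linalg.eigvalsh(K_d) for K_d in K_factors]
    # Eigenvalues of a Kronecker product are all products of factor eigenvalues.
    lam = reduce(np.multiply.outer, eigs).ravel()
    return float(np.sum(np.log(lam + noise_var)))

def rbf_1d(x, lengthscale):
    """Squared exponential kernel on a 1-D grid (assumed factor kernel)."""
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / lengthscale**2)

# Toy 6 x 5 x 4 grid: 120 points, but only 6x6, 5x5 and 4x4 decompositions.
factors = [rbf_1d(np.linspace(0.0, 1.0, 6), 0.3),
           rbf_1d(np.linspace(0.0, 1.0, 5), 0.5),
           rbf_1d(np.linspace(0.0, 1.0, 4), 0.2)]
fast = kron_logdet(factors, noise_var=0.1)

# Dense reference computation, feasible only because the example is tiny.
K_full = reduce(np.kron, factors)
slow = np.linalg.slogdet(K_full + 0.1 * np.eye(K_full.shape[0]))[1]
```

This shortcut applies to a single Kronecker product. A sum of such products does not share one eigenbasis, which is the difficulty the additive-kernel slides address.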

  17. Inference
      • For additive kernels
      • $K^{-1}$ can be computed efficiently using LCG*
      • But how can we compute $\log|K|$?
      *See Flaxman et al. (2015)

  18. Inference
      • Choosing indices i, j
        Method                                                   Complexity
        Minimization for best pair                               $O(n^2)$
        “Middle” heuristic (i = j or i = j + 1)                  $O(n)$
        Greedy search of s pairs below and above previous pair   $O(2sn)$
      Inference
      • Scaling functions, σ(w(x))
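
The slides do not preserve the bound itself, so the following is a sketch based on the standard Weyl inequality for symmetric matrices: with eigenvalues sorted in descending order, λ_{i+j-1}(A + B) ≤ λ_i(A) + λ_j(B). Summing the logs of these bounds gives an upper bound on log|A + B| for positive definite A + B. Only the exhaustive and "middle" index choices from the table are shown. The greedy variant is omitted, and the function name is hypothetical.

```python
import numpy as np

def weyl_logdet_bound(eig_a, eig_b, method="middle"):
    """Upper bound on log|A + B| from the eigenvalues of A and B."""
    a = np.sort(eig_a)[::-1]  # descending order
    b = np.sort(eig_b)[::-1]
    n = a.shape[0]
    total = 0.0
    for k in range(1, n + 1):  # bound the k-th largest eigenvalue of A + B
        if method == "exact":
            # Best pair (i, j) with i + j - 1 = k: O(k) work, O(n^2) overall.
            i = np.arange(1, k + 1)
            bound = np.min(a[i - 1] + b[k - i])
        elif method == "middle":
            # i = j for odd k and i = j + 1 for even k: O(1) work per k.
            i = (k + 2) // 2
            bound = a[i - 1] + b[k - i]
        else:
            raise ValueError("method must be 'exact' or 'middle'")
        total += np.log(bound)
    return float(total)

# Toy positive definite matrices to compare the bounds with the true value.
rng = np.random.default_rng(0)
n = 40
X1, X2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A = X1 @ X1.T / n + 0.05 * np.eye(n)
B = X2 @ X2.T / n + 0.05 * np.eye(n)
eig_a, eig_b = np.linalg.eigvalsh(A), np.linalg.eigvalsh(B)
true_logdet = np.linalg.slogdet(A + B)[1]
exact_bound = weyl_logdet_bound(eig_a, eig_b, "exact")
middle_bound = weyl_logdet_bound(eig_a, eig_b, "middle")
```

By construction the exhaustive choice is never looser than the middle heuristic, at the cost of quadratic rather than linear work.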

  19. Inference
      [Figure: two panels titled “3 Kernels”, showing time (sec) and log determinant approximation ratio against number of observations, for Weyl exact, Weyl middle, Weyl greedy, Cheb-Hutch, and true log det]
      Inference – so what?!
      • Linear complexity for additive kernels
        – $O(D n^{\frac{D+1}{D}})$
      • Scalable inference for non-separable kernels in space and time
      • Scalable inference for non-stationary kernels

  20. Numerical Experiments
      • 2500 points of synthetic data
      • 2 functional regimes defined by squared exponential kernels
      • Change surface defined by [equation not recovered]
      Results - Numerical

  21. Demographic Analysis
