Modelling influenza-like illness using online search



  1. Modelling influenza-like illness using online search. Vasileios Lampos, Computer Science, UCL. @lampos, lampos.net

  2. Mapping online search to flu estimates
     [Figure: ILI percentage (0 to 12) by year, 2004 to 2008]

  3. Why estimate flu rates from online search?
     • Complement traditional syndromic surveillance
       ‣ timeliness
       ‣ broader demographic coverage, larger cohort
       ‣ broader geographic coverage
       ‣ not affected by closure days
       ‣ lower cost
     • Applicable to locations that lack an established healthcare system

  4. Google Flu Trends (discontinued): popularising an established idea. Ginsberg et al. (2009); Eysenbach (2006); Polgreen et al. (2008)

  5. Google Flu Trends: why did it fail?
     [Figure: ILI rates (0 to 0.09) from CDC and Google Flu Trends, Jan. '09 to Jan. '13]
     • Model: ILI rate = β0 + β1 × Q, where Q is the average query frequency
     • Queries listed with percentages: rsv (25%), flu symptoms (18%), benzonatate (6%), symptoms of pneumonia (6%), upper respiratory infection (4%)
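
Reading the slide's formula as ILI rate = β0 + β1 × Q, the model is a one-variable linear regression on the average query frequency. The sketch below fits it by ordinary least squares; the function name and the weekly numbers are hypothetical and only illustrate the shape of the computation, not the original Google Flu Trends implementation.

```python
def fit_univariate(q, ili):
    """Ordinary least squares for ili ~ beta0 + beta1 * q (one predictor)."""
    n = len(q)
    mean_q = sum(q) / n
    mean_y = sum(ili) / n
    sxx = sum((x - mean_q) ** 2 for x in q)
    sxy = sum((x - mean_q) * (y - mean_y) for x, y in zip(q, ili))
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_q
    return beta0, beta1


# Hypothetical weekly values: average query frequency Q and reported ILI rate.
avg_query_freq = [0.10, 0.20, 0.30, 0.40, 0.50]
ili_rate = [0.012, 0.019, 0.031, 0.042, 0.049]

beta0, beta1 = fit_univariate(avg_query_freq, ili_rate)
estimate = beta0 + beta1 * 0.35  # ILI estimate for a new week with Q = 0.35
```

Because every query is collapsed into the single average Q, a handful of heavily weighted queries can dominate the estimate, which is the "model simplicity" issue the next slide names.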

  6. Google Flu Trends: why did it fail?
     [Figure: ILI rates (0 to 0.09) from CDC and Google Flu Trends, Jan. '09 to Jan. '13]
     • non-ideal query selection, model simplicity
     • inappropriate evaluation (less than 1 flu season!)

  7. Multivariate, nonlinear, generative models
     • Treat single search queries as distinct variables
     • Model nonlinearities
     [Figure: ILI rate against the frequency of the query "how long is flu contagious" (x-axis 0 to 6, scaled by 10^-7), showing raw data and a linear fit]

  8. Multivariate, nonlinear, generative models
     • Treat single search queries as distinct variables
     • Model nonlinearities
     • Model groups of queries that share common temporal patterns
     Gaussian Processes (GPs)
       ‣ distribution over functions that can explain the data
       ‣ allow some room for model interpretability
       ‣ can model uncertainty
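
To make the Gaussian Process idea concrete, here is a minimal GP regression sketch in NumPy. It assumes a zero-mean GP with a single squared-exponential kernel and fixed hyperparameters, which is a simplification chosen for illustration: the slides do not state the kernel, and they describe modelling groups of queries, which this sketch does not do. All function names are hypothetical.

```python
import numpy as np


def se_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-2, **kern):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    k_tt = se_kernel(x_train, x_train, **kern) + noise_var * np.eye(len(x_train))
    k_st = se_kernel(x_test, x_train, **kern)
    chol = np.linalg.cholesky(k_tt)
    # alpha = K^{-1} y, computed through the Cholesky factor for stability.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = np.diag(se_kernel(x_test, x_test, **kern)) - np.sum(v**2, axis=0)
    return mean, var


# Synthetic nonlinear relation between one query frequency and the ILI rate.
x_train = np.linspace(0.0, 5.0, 20)[:, None]
y_train = np.sin(x_train[:, 0])
x_test = np.array([[2.5], [50.0]])  # one point inside the data, one far outside
mean, var = gp_predict(x_train, y_train, x_test, noise_var=1e-4)
```

The predictive variance is what "can model uncertainty" refers to: it stays small near observed inputs and returns to the prior variance far from them.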

  9. Correcting the deficiencies of Google Flu Trends
     [Figure: ILI rates (0 to 0.09) from CDC and the Gaussian Process model, Jan. '09 to Jan. '13]
     • 42% mean absolute error reduction compared to Google Flu Trends
     • .95 Pearson correlation (previously .89) with CDC
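
The two metrics quoted on this slide can be computed as follows. This is a generic sketch with hypothetical function names; the relative reduction is taken against the baseline's error, which is the usual reading of "X% mean absolute error reduction" but is an assumption here.

```python
import math


def mean_absolute_error(truth, estimate):
    """Average absolute difference between reported and estimated rates."""
    return sum(abs(t - e) for t, e in zip(truth, estimate)) / len(truth)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def relative_mae_reduction(mae_baseline, mae_new):
    """Fraction by which the new model lowers the baseline's MAE."""
    return (mae_baseline - mae_new) / mae_baseline
```

For example, a baseline MAE of 1.0 and a new MAE of 0.58 give a relative reduction of 0.42, that is, 42%.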

  10. Modelling uncertainty
     [Figure: ILI rates (0 to 0.09) from CDC and the Gaussian Process model, Jan. '09 to Jan. '13]

  11. Combining GPs with autoregression (AR)
     [Figure: ILI rates (0 to 0.09) from CDC and the Gaussian Process with AR, Jan. '10 to Jan. '13]
     • 1 week delay in incorporating historical CDC estimates
     • 27% mean absolute error reduction over GFT with AR
     • 52% mean absolute error reduction over GP without AR
     • .99 Pearson correlation with CDC

  12. Query selection based on meaning
     • Select search queries based on their semantic similarity to the topic of flu
     • Make this possible by using word embeddings, i.e. word representations in a common vector space
       ‣ learn them using a corpus of 215 million tweets

  13. Query selection based on meaning
     • Select search queries based on their semantic similarity to the topic of flu
     • Make this possible by using word embeddings, i.e. word representations in a common vector space
       ‣ learn them using a corpus of 215 million tweets
     Analogy: A (is to) → B what X (is to) → ?
       Rome → Italy, London → [UK, Denmark, Sweden]
       go → went, do → [did, doing, happened]
       Messi → football, Lebron → [basketball, bball, NBA]
       Elvis → Presley, Aretha → [Franklin, Ruffin, Vandross]
       UK → Brexit, Greece → [Grexit, Syriza, Tsipras]
       UK → Farage, USA → [Trump, Farrage, Putin]
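
One common way to answer analogies like these with word embeddings is the vector-offset method: compute vec(B) − vec(A) + vec(X) and rank the remaining words by cosine similarity to the result. The slides do not say which method produced the lists above, so treat this as an illustrative assumption; the four-dimensional toy vectors are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def analogy(emb, a, b, x, top_k=3):
    """Rank words by similarity to vec(b) - vec(a) + vec(x), excluding the inputs."""
    target = [vb - va + vx for va, vb, vx in zip(emb[a], emb[b], emb[x])]
    candidates = [w for w in emb if w not in (a, b, x)]
    return sorted(candidates, key=lambda w: cosine(emb[w], target), reverse=True)[:top_k]


# Invented toy embeddings; real ones would be learned from a large corpus.
toy = {
    "rome": [1, 0, 1, 0],
    "italy": [0, 1, 1, 0],
    "london": [1, 0, 0, 1],
    "uk": [0, 1, 0, 1],
    "denmark": [0, 1, 0, 0],
    "paris": [1, 0, 0, 0],
}
ranked = analogy(toy, "rome", "italy", "london")
```

With these toy vectors the offset lands exactly on "uk", so it ranks first, followed by "denmark" and then "paris".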

  14. Query selection based on meaning
     • Select search queries based on their semantic similarity to the topic of flu
     • Make this possible by using word embeddings, i.e. word representations in a common vector space
       ‣ learn them using a corpus of 215 million tweets
     • Combine temporal correlation with semantic similarity (hybrid similarity) for optimal feature selection
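
The slide does not spell out how the two similarities are combined, so the sketch below makes one assumption explicit: first discard queries whose embedding is not close enough to a flu topic vector, then rank the survivors by their correlation with the ILI series. The threshold, the function names, and the toy data are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cov / math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )


def select_queries(queries, topic_vec, ili, min_semantic=0.5, top_k=2):
    """Semantic filter first, then rank by temporal correlation (assumed scheme)."""
    kept = {
        name: freq
        for name, (emb, freq) in queries.items()
        if cosine(emb, topic_vec) >= min_semantic
    }
    return sorted(kept, key=lambda name: pearson_r(kept[name], ili), reverse=True)[:top_k]


# Toy data: (embedding, weekly frequency) per query; "heating oil" is seasonal but off-topic.
ili = [1.0, 2.0, 4.0, 3.0]
queries = {
    "flu symptoms": ([1.0, 0.1], [2.0, 4.0, 8.0, 6.0]),
    "heating oil": ([0.0, 1.0], [1.0, 2.0, 4.0, 3.0]),
    "cough remedy": ([0.8, 0.3], [1.0, 1.0, 3.0, 3.0]),
}
selected = select_queries(queries, topic_vec=[1.0, 0.0], ili=ili)
```

In this toy case "heating oil" correlates perfectly with the ILI series yet is dropped by the semantic filter, which is the kind of spurious query the hybrid approach is meant to remove.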

  15. Query selection based on meaning: results
     [Figure: RCGP (England) ILI rates (0 to 35) and estimates from correlation-based feature selection, 2013 to 2015]
     Examples of spurious selected queries: prof. surname (70%), name surname recipes (21%), name surname (27%), blood game (12.3%), heating oil (21%), swine flu vaccine side effects (7.2%)

  16. Query selection based on meaning: results
     [Figure: RCGP (England) ILI rates (0 to 35) and estimates from hybrid feature selection, 2013 to 2015]
     • 12.3% performance improvement
     • .913 Pearson correlation with RCGP ILI rates

  17. i-sense flu (Flu Detector): fludetector.cs.ucl.ac.uk, @isenseflu

  18. i-sense flu (Flu Detector): fludetector.cs.ucl.ac.uk, @isenseflu
     • daily flu estimates for England, publicly accessible
     • transferred to Public Health England (PHE)
     • its estimates have been included in the two most recent annual flu reports of PHE (gov.uk/government/statistics/annual-flu-reports)
     • open source, github.com/UCL/fludetector-flask
     • credit to David Guzman for constantly refining it

  19. Forecasting flu rates: ongoing work
     [Figure: RCGP (England) ILI rates (per 100,000, 0 to 50) and 3-weeks-ahead forecasts (preliminary model), Jan. '15 to Jan. '18]
     • mean absolute error = 2.56 (cases per 100,000)
     • r = .901
     • led by Simon Moura

  20. Forecasting flu rates (US): ongoing work
     [Figure: CDC (US) ILI rates (%, 0 to 8) and 3-weeks-ahead forecasts (preliminary model), Jan. '15 to Jan. '18]
     • mean absolute error = 0.33%
     • r = .927
     • led by Simon Moura

  21. Multi-task learning for flu
     Multi-task learning (MTL) vs. single-task learning (STL)
     • learns models jointly instead of independently
     • performs better than STL solutions on related tasks
     • provides good performance with fewer training samples
     Flu models with MTL
     • limit performance loss under sporadic training data
     • improve accuracy
       ‣ of regional models within a country
       ‣ across different countries

  22. Modelling flu across US regions with MTL
     [Figure: map of US states grouped into Region 1 to Region 10]
     Train 10 US regional models for flu jointly

  23. MTL across US and US regions
     Performance for US, 1 year of training data
     [Figure: bar chart comparing single-task and multi-task learning on Pearson correlation and mean absolute error; bar labels as extracted: 0.88, 0.85, 0.51, 0.44]

  24. MTL across US and US regions
     Performance for US regions, 1 year of training data
     [Figure: bar chart comparing single-task and multi-task learning on Pearson correlation and mean absolute error; bar labels as extracted: 0.87, 0.84, 0.54, 0.47]

  25. MTL across US and US regions
     Performance for US regions, 1 year of training data, 50% of the data lost
     [Figure: bar chart comparing single-task and multi-task learning on Pearson correlation and mean absolute error; bar labels as extracted: 0.82, 0.77, 0.59, 0.48]

  26. MTL across US and England
     Performance for England, 1 year of training data
     [Figure: bar chart comparing single-task and multi-task learning on Pearson correlation and mean absolute error; bar labels as extracted: 0.98, 0.88, 0.85, 0.59]

  27. Why estimate flu rates from online search?
     • Complement traditional syndromic surveillance
       ‣ timeliness
       ‣ broader demographic coverage, larger cohort
       ‣ broader geographic coverage
       ‣ not affected by closure days
       ‣ lower cost
     • Applicable to locations that lack an established healthcare system
       ‣ oxymoron: healthcare data is required for training the models!

  28. Transfer learning for flu modelling
     Main task
     • train a model for a source location where historical syndromic surveillance data is available
     • transfer it to a target location where syndromic surveillance data is not available or, in our experiments, ignored
     Transfer learning steps
     1. Learn a regression model for a source location
     2. Map search queries from the source to the target domain
     3. Transfer the source regression weights to the target domain

  29. Mapping source to target queries
     • Direct translation does not work
     • Two similarity components
       ‣ Semantic similarity (meaning) using cross-lingual word embedding representations (Θs)
       ‣ Temporal similarity based on their frequency time series (Θc)
     • Joint similarity: Θ = γΘs + (1 − γ)Θc, γ ∈ [0, 1]
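
The joint similarity and the three transfer steps can be sketched as below. The formula Θ = γΘs + (1 − γ)Θc is from the slide; everything else is an assumption made for illustration: each source query is mapped to the single target query with the highest joint similarity, the source weights are reused unchanged, and the function names and numbers are hypothetical.

```python
def joint_similarity(theta_s, theta_c, gamma):
    """Theta = gamma * Theta_s + (1 - gamma) * Theta_c (rows: source, columns: target)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return [
        [gamma * s + (1.0 - gamma) * c for s, c in zip(row_s, row_c)]
        for row_s, row_c in zip(theta_s, theta_c)
    ]


def map_queries(theta):
    """For each source query, pick the index of the most similar target query."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


def transfer_estimate(weights, intercept, mapping, target_freqs):
    """Apply source regression weights to the mapped target query frequencies."""
    return intercept + sum(w * target_freqs[j] for w, j in zip(weights, mapping))


# Toy example: 2 source queries, 2 target queries.
theta_s = [[0.9, 0.2], [0.1, 0.6]]  # semantic similarity
theta_c = [[0.5, 0.7], [0.3, 0.8]]  # temporal similarity
theta = joint_similarity(theta_s, theta_c, gamma=0.5)
mapping = map_queries(theta)  # source query i -> target query mapping[i]
estimate = transfer_estimate([2.0, 1.0], 0.5, mapping, target_freqs=[0.1, 0.3])
```

Setting γ = 1 uses only semantic similarity and γ = 0 only temporal similarity; in the toy example the first source query would map to a different target query under γ = 0, which shows why the mix matters.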

  30. Source: US, Target: France
     How similar are their flu rates?
     [Figure: z-scored ILI rates (-1 to 5) for the US and France (FR), 2008 to 2016]

  31. Source: US, Target: France
     [Figure: three result panels: MAE = 61.5, r = .835; MAE = 46.8, r = .956; MAE = 34.1, r = .959]

  32. Source: US, Target: Australia
     How similar are their flu rates?
     [Figure: z-scored ILI rates (-1 to 5) for the US and Australia (AU), 2008 to 2016]

  33. Source: US, Target: Australia
     [Figure: three result panels: MAE = 42.6, r = .7; MAE = 30.3, r = .915; MAE = 22, r = .921]
