Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction
Hayate ISO, Shoko WAKAMIYA, Eiji ARAMAKI

Twitter for Public Health
• Many users tweet when they catch a disease
• The number of flu-related tweets is in proportion to the number of flu patients
[Figure: counts over time of flu-related tweets and flu patients]

Noise included in tweets
• Website: Influencer (@not_influenza): "For more information about bird flu link"
• By patients: High Fever (@flu_patient): "I got a flu… I couldn't do anymore…" (only this type of tweet is counted)
• By healthy people: Healthy person (@organic): "I've never caught a flu"; Injection lover (@prevention): "I got a flu shot yesterday"

Our lab runs a flu surveillance system
Aramaki, Eiji, Sachiko Maskawa, and Mizuki Morita. "Twitter catches the flu: detecting influenza epidemics using Twitter." In Proc. of EMNLP 2011.
http://mednlp.jp/influ_map/

Similarity between tweets and patients
• Tweets about flu appear slightly earlier than reports of flu patients

Each word has a specific time lag
• The word "fever": 16-day time lag
• The word "injection": 55-day time lag
[Figure: counts over time of flu-related tweets and flu patients, with panels for the words "fever" and "injection" showing the raw and time-shifted word counts against the number of flu patients]

What are Forecasting Words?
• Twitter tends to be an early indicator of the actual condition
• We observed that each word has a specific time lag relative to the actual condition
• Our objective: more flexible modeling
  - Estimate the time difference
  - Extend to a future forecasting model

Outline
• Data
• Time shift: Nowcasting
• Time shift: Forecasting

Training data: Twitter corpus
• Query: the word "flu" in Japanese (INFLU /I-N-FU-RU/)
• Period: Aug 2012 ~ Jan 2016 (3 years 5 months)
• Size of corpus: 7.7 million tweets

Gold standard: IDSC reports
• The Infectious Disease Surveillance Center (IDSC) reports the number of flu patients once a week
• It gathers the number of flu patients during the epidemic period
• We split the IDSC reports into three seasons:
  - Season 1: Dec 1, 2012 ~ May 31, 2013
  - Season 2: Dec 1, 2013 ~ May 31, 2014
  - Season 3: Dec 1, 2014 ~ May 24, 2015

Time lag measure: cross correlation
• Cross correlation is used to search for the most suitable time-shift width for each word: it is computed between the word's tweet count τ days earlier and the number of actual patients
※ The cross correlation is exactly the same as Pearson's correlation when τ = 0.
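
As a rough illustration of this measure, the following Python sketch computes a lagged Pearson correlation. It is one plausible reading of the definition, not the authors' code: the function name `cross_correlation` and the toy series are assumptions made for this example. Word counts from τ steps earlier are paired with the patient counts at the later time, so τ = 0 reduces to the ordinary Pearson correlation.

```python
from math import sqrt


def cross_correlation(x, y, tau):
    """Pearson correlation between x[t - tau] and y[t] (x and y equal length)."""
    if tau < 0 or tau >= len(x):
        raise ValueError("tau must satisfy 0 <= tau < len(x)")
    xs = x[:len(x) - tau]  # word counts tau steps earlier
    ys = y[tau:]           # patient counts at the later time
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / sqrt(vx * vy)


# Hypothetical toy data: the patient curve repeats the word curve 2 steps later.
toy_words = [0, 1, 4, 9, 4, 1, 0, 0, 0]
toy_patients = [0, 0, 0, 1, 4, 9, 4, 1, 0]
print(cross_correlation(toy_words, toy_patients, 0))
print(cross_correlation(toy_words, toy_patients, 2))
```

In the toy data the correlation is low at τ = 0 and reaches its maximum at τ = 2, the delay built into the series.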

Motivating examples
• Cross correlation r between the tweet counts of the word "fever" and the IDSC reports:
  - When τ = 0, r is 0.75
  - As τ increases, the word-count curve moves to the right
  - When τ = 16, r is 0.95
[Figure: counts of the word "fever" and number of flu patients]

Estimate optimal time lag
• We define the optimal time lag τ̂ as the value of τ that maximizes the cross correlation
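
The maximization can be sketched as a plain search over candidate lags. This is a minimal illustration under assumptions: the helper names `pearson` and `best_lag` are hypothetical, the default upper bound of 60 follows the search range quoted later in the slides, and the requirement of at least three overlapping points is a choice made here to avoid degenerate correlations.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def best_lag(word_counts, patients, tau_max=60):
    """Return (tau_hat, r) maximizing the lagged correlation over 0..tau_max."""
    n = len(word_counts)
    best_tau, best_r = 0, float("-inf")
    # Keep at least 3 overlapping points so the correlation is meaningful.
    limit = min(tau_max, n - 3)
    for tau in range(limit + 1):
        r = pearson(word_counts[:n - tau], patients[tau:])
        if r > best_r:  # strict comparison keeps the smallest lag on ties
            best_tau, best_r = tau, r
    return best_tau, best_r


# Hypothetical toy data: patients follow the word counts with a 3-step delay.
words = [0] * 30
words[5:10] = [1, 3, 6, 3, 1]
patients = [0] * 3 + words[:-3]
print(best_lag(words, patients, tau_max=10))
```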

Heatmap representation of the matrix
[Figure: heatmaps of the word-count matrix X and the patient counts y, for raw word counts and after applying the time shift]
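
Applying the time shift to the matrix can be pictured as sliding each word's column by its own lag before pairing it with y. The sketch below assumes a dictionary of per-word count lists and a dictionary of already estimated lags; the function name `shift_features` and the data layout are illustrative choices, not taken from the slides.

```python
def shift_features(counts_by_word, patients, lags):
    """Align each word's counts with the patient counts using its own lag.

    Row t of X holds counts_by_word[w][t - lags[w]] for every word w, so only
    time steps t >= max(lags) produce a complete row.
    """
    words = sorted(counts_by_word)
    start = max(lags[w] for w in words)
    X = [[counts_by_word[w][t - lags[w]] for w in words]
         for t in range(start, len(patients))]
    y = patients[start:]
    return words, X, y


# Hypothetical toy data with two words and hand-picked lags.
counts = {
    "fever": [1, 2, 3, 4, 5, 6],
    "injection": [10, 20, 30, 40, 50, 60],
}
patients = [7, 8, 9, 10, 11, 12]
words, X, y = shift_features(counts, patients, {"fever": 1, "injection": 3})
print(words)
print(X)
print(y)
```

The first rows are dropped because the word with the largest lag has no earlier observation to contribute there.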

Effectiveness of time shift
• Regression for nowcasting, with and without applying the time shift:
  - Lasso (Tibshirani, 1994)
  - Elastic-Net (Zou and Hastie, 2005)
• The search range of the time shift τ is [0, …, 60]

Test season:          Season 1            Season 2            Season 3
Train season:         Season 2  Season 3  Season 1  Season 3  Season 1  Season 2  Avg.
With time shift
  Lasso+              0.952     0.907     0.951     0.888     0.955     0.963     0.936
  ENet+               0.944     0.898     0.960     0.878     0.967     0.959     0.934
Without time shift
  Lasso               0.854     0.916     0.768     0.894     0.770     0.753     0.825
  ENet                0.900     0.927     0.809     0.914     0.792     0.805     0.858
※ Higher is better
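
To make the regression step concrete, here is a small pure-Python coordinate-descent sketch of the Lasso objective. It is not the authors' implementation: the names `soft_threshold` and `lasso_fit`, the penalty value `alpha`, and the toy data are assumptions for illustration, and a library implementation would normally be used in practice. Elastic-Net adds an L2 penalty to the same objective and is not shown.

```python
def soft_threshold(z, g):
    """Shrink z toward zero by g; values within [-g, g] become exactly 0."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Minimize (1/2n)*||y - b - Xw||^2 + alpha*||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    # Center columns and target so the intercept can be recovered afterwards.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlate feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(w[k] * Xc[i][k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - partial)
            z = sum(Xc[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
    b = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, b


# Hypothetical toy data: the target depends only on the first column.
X = [[1, 1], [2, -1], [3, 1], [4, -1]]
y = [2, 4, 6, 8]
w, b = lasso_fit(X, y, alpha=0.1)
print(w, b)
```

With the toy data, the second column carries no extra information about the target and its coefficient is shrunk to exactly zero. This selection behaviour is what makes an L1 penalty attractive when every word contributes one time-shifted feature.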

Limitation
• To estimate the epidemic on a specific day through Twitter, we need to gather tweets from that same day
• How can we predict future disease outbreaks?
[Figure: counts over time of flu-related tweets and flu patients, split into past and future, with the future marked as unknown]

Restrict time shift estimation
• To forecast epidemics Δt days into the future, we restrict the search interval of the time shift to at least Δt days
[Figure: searching interval]
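
The restriction can be sketched as a lag search whose lower bound is the forecast horizon: a word count shifted by τ ≥ Δt days has already been observed by the time a prediction for Δt days ahead is made. The names `pearson` and `restricted_lag`, the default `tau_max=60`, and the toy series are assumptions for this example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def restricted_lag(word_counts, patients, delta_t, tau_max=60):
    """Best lag in [delta_t, tau_max] for forecasting delta_t steps ahead."""
    if not 0 <= delta_t <= tau_max:
        raise ValueError("need 0 <= delta_t <= tau_max")
    n = len(word_counts)
    best_tau, best_r = None, float("-inf")
    for tau in range(delta_t, tau_max + 1):
        if n - tau < 3:  # too little overlap left to correlate
            break
        r = pearson(word_counts[:n - tau], patients[tau:])
        if r > best_r:
            best_tau, best_r = tau, r
    return best_tau, best_r


# Hypothetical toy data: patients follow the word counts with a 3-step delay.
words = [0] * 40
words[5:10] = [1, 3, 6, 3, 1]
patients = [0] * 3 + words[:-3]
print(restricted_lag(words, patients, delta_t=0, tau_max=10))  # nowcasting
print(restricted_lag(words, patients, delta_t=5, tau_max=10))  # 5 steps ahead
```

In the toy data the unrestricted search finds the true 3-step delay, while the 5-step forecast is forced onto a longer lag with a lower correlation, which is the price paid for looking further ahead.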

Motivating example
• Nowcasting case: τ ∈ [0, τ_max]
  [Figure: counts of the word "fever" and number of flu patients]
• Forecasting case (10 days ahead): τ ∈ [10, τ_max]
  [Figure: counts of the word "fever", shifted by 10 days and by 16 days, against the number of flu patients]
• Forecasting case (30 days ahead): τ ∈ [30, τ_max]
  [Figure: counts of the word "fever", shifted by 30 days, against the number of flu patients]
  [Figure: counts of the word "injection", shifted by 30 days and by 55 days, against the number of flu patients; r̂ = 0.87]

Forecasting model
• For each Δt, we search for the optimal time shift of every word
• We estimate the model with Lasso and ENet using these time-shifted features
[Figure: searching interval]

Our model outperforms the baseline
• Baseline:
※ Higher is better

Summary
• We discovered a time difference between Twitter and the actual phenomena.
• We proposed handling this difference to improve nowcasting performance and to extend the approach to a forecasting model.
• Our method is widely applicable to other time series data that have a time lag between response and predictors.
Code and data available at http://sociocom.jp/~iso/forecastword
