Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction



slide-1
SLIDE 1

Forecasting Word Model: Twitter-based Influenza Surveillance and Prediction

Hayate ISO, Shoko WAKAMIYA, Eiji ARAMAKI

slide-2
SLIDE 2

Twitter for Public health

  • Many users tweet when they catch a disease
  • The # of flu related tweets is in proportion to the # of flu patients

[Figure: counts over time; legend: ■ # of flu related tweets, ■ # of flu patients]

slide-3
SLIDE 3

Noise included in tweets

By patients:
  • High Fever @flu_patient: “I got a flu… I couldn’t do anymore…”

By healthy people:
  • Healthy person @organic: “I’ve never caught a flu”
  • Injection lover @prevention: “I got a flu shot yesterday”

Website:
  • Influencer @not_influenza: “For more information about bird flu link”

slide-4
SLIDE 4

Noise included in tweets

By patients (only this type of tweet is counted):
  • High Fever @flu_patient: “I got a flu… I couldn’t do anymore…”

By healthy people:
  • Healthy person @organic: “I’ve never caught a flu”
  • Injection lover @prevention: “I got a flu shot yesterday”

Website:
  • Influencer @not_influenza: “For more information about bird flu link”

slide-5
SLIDE 5

Our lab runs a flu surveillance system

Aramaki, Eiji, Sachiko Maskawa, and Mizuki Morita. "Twitter catches the flu: detecting influenza epidemics using Twitter." In Proc. of EMNLP 2011. http://mednlp.jp/influ_map/

slide-6
SLIDE 6

Similarity between Tweets and Patients

Tweets about flu appear slightly earlier than reports of flu in patients

slide-7
SLIDE 7

Each word has a specific time-lag

  • The word “Fever”: 16 days time lag
  • The word “Injection”: 55 days time lag

[Figure: two panels of counts over time. “Fever” panel legend: ■ # of the word “fever”, ■ time shifted, ■ # of flu patients. “Injection” panel legend: ■ # of the word “Injection”, ■ time shifted, ■ # of flu patients]

slide-8
SLIDE 8

What is Forecasting Words?

  • Twitter tends to be an early indicator of the actual condition
  • We observed that each word has a specific time lag with the actual condition
  • Our objective: more flexible modeling
      • Estimate the time difference
      • Extend to a future forecasting model

slide-9
SLIDE 9

Outline

  • Data
  • Time shift: Nowcasting
  • Time shift: Forecasting

slide-10
SLIDE 10

Outline

  • Data
  • Time shift: Nowcasting
  • Time shift: Forecasting

slide-11
SLIDE 11

Training data: Twitter Corpus

  • Query: the word “flu” in Japanese (INFLU / I-N-FU-RU)
  • Period: Aug 2012 ~ Jan 2016 (3 years 5 months)
  • Size of corpus: 7.7 million tweets

slide-12
SLIDE 12

Gold standard: IDSC reports

  • The Infectious Disease Surveillance Center (IDSC) reports the # of flu patients once a week
  • They gather the number of flu patients during the period of the epidemic
  • We split the IDSC reports into three seasons as follows:
      • Season 1: Dec 1, 2012 ~ May 31, 2013
      • Season 2: Dec 1, 2013 ~ May 31, 2014
      • Season 3: Dec 1, 2014 ~ May 24, 2015

slide-13
SLIDE 13

Outline

  • Data
  • Time shift: Nowcasting
  • Time shift: Forecasting

slide-14
SLIDE 14

Time lag measure: Cross Correlation

  • Cross correlation is used to search for the most suitable time-shift width for each word frequency: it is the correlation between the # of tweets τ days before and the # of actual patients

※ The cross correlation is exactly the same as Pearson’s correlation when τ = 0.
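The equation shown on this slide did not survive extraction. One common way to write a lagged cross correlation that fits the description (tweets from τ days earlier against current patients) and reduces to Pearson’s correlation at τ = 0 is sketched below. The notation is assumed here, not taken from the slide: x_t is a word's count on day t, y_t is the number of patients on day t, T is the series length, and the means are taken over the overlapping window.

```latex
r_{xy}(\tau) =
  \frac{\sum_{t=\tau+1}^{T} (x_{t-\tau} - \bar{x})\,(y_{t} - \bar{y})}
       {\sqrt{\sum_{t=\tau+1}^{T} (x_{t-\tau} - \bar{x})^{2}}\;
        \sqrt{\sum_{t=\tau+1}^{T} (y_{t} - \bar{y})^{2}}},
\qquad
\bar{x} = \frac{1}{T-\tau}\sum_{t=\tau+1}^{T} x_{t-\tau},
\quad
\bar{y} = \frac{1}{T-\tau}\sum_{t=\tau+1}^{T} y_{t}
```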

slide-15
SLIDE 15

Motivating examples

  • Cross correlation r: when τ = 0, r is 0.75 between the tweet counts and the IDSC reports

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of flu patients]

slide-16
SLIDE 16

Motivating examples

  • Cross correlation r: as τ increases, the word-count curve moves to the right

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of flu patients]

slide-17
SLIDE 17

Motivating examples

  • Cross correlation r: when τ = 16, r is 0.95 between the tweet counts and the IDSC reports

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of flu patients]

slide-18
SLIDE 18

Estimate optimal time-lag

  • We define the optimal time lag τ̂ as the τ that maximizes the cross correlation: τ̂ = argmax_τ r(τ)
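The lag search described on these slides can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes word counts and patient counts are already on the same daily grid, and the function names and toy series are made up for the example.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # undefined for a constant series; treated as 0 here
    return cov / sqrt(var_a * var_b)


def cross_correlation(word_counts, patients, tau):
    """Correlate word counts from tau days earlier with patient counts."""
    if tau == 0:
        return pearson(word_counts, patients)
    return pearson(word_counts[:-tau], patients[tau:])


def optimal_lag(word_counts, patients, tau_min=0, tau_max=60):
    """Return (best_tau, best_r) over the lags tau_min..tau_max."""
    best_tau = max(
        range(tau_min, tau_max + 1),
        key=lambda tau: cross_correlation(word_counts, patients, tau),
    )
    return best_tau, cross_correlation(word_counts, patients, best_tau)


# Toy data (hypothetical): patients follow the word counts by 3 days.
toy_words = [0, 1, 3, 7, 12, 7, 3, 1, 0, 0, 0, 0]
toy_patients = [0, 0, 0, 0, 1, 3, 7, 12, 7, 3, 1, 0]
print(optimal_lag(toy_words, toy_patients, tau_min=0, tau_max=5))
```

On the toy series the search recovers the 3-day lag built into the data. The series must be longer than tau_max for every window to be non-empty.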

slide-19
SLIDE 19

Apply time-shift

[Figure: heatmap representation of the word-count matrix X next to the # of patients y; labels: “Raw word counts”, “# of patients”]
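Applying the time shift amounts to pairing each day's patient count with every word's count from that word's own lag earlier. The sketch below shows one way to build such an aligned matrix; the function name, the dictionary layout, and the toy numbers are assumptions for illustration only.

```python
def apply_time_shift(word_series, patients, lags):
    """Align each word's counts, delayed by its own lag, with patient counts.

    word_series: dict mapping word -> list of daily counts
    patients:    list of daily patient counts (same length as each series)
    lags:        dict mapping word -> time shift in days
    Returns (words, X, y): row i of X holds, for target day t, each word's
    count from day t - lags[word]; y holds the patient count on day t.
    """
    words = sorted(word_series)
    max_lag = max(lags[w] for w in words)
    X, y = [], []
    # Days before max_lag are dropped because some word has no value yet.
    for t in range(max_lag, len(patients)):
        X.append([word_series[w][t - lags[w]] for w in words])
        y.append(patients[t])
    return words, X, y


# Toy data (hypothetical counts and lags).
toy_series = {
    "fever": [1, 2, 3, 4, 5, 6],
    "injection": [10, 20, 30, 40, 50, 60],
}
toy_patients = [0, 0, 7, 8, 9, 10]
toy_lags = {"fever": 1, "injection": 2}

words, X, y = apply_time_shift(toy_series, toy_patients, toy_lags)
print(words)
print(X)
print(y)
```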

slide-20
SLIDE 20

Effectiveness of time shift

  • Regression for nowcasting, with and without the time shift:
      • Lasso (Tibshirani, 1994)
      • Elastic-Net (Zou and Hastie, 2005)
  • The search range of the time shift τ is [0, …, 60]

Test      Season 1            Season 2            Season 3
Train     Season 2  Season 3  Season 1  Season 3  Season 1  Season 2  Avg.

With time-shift
Lasso+    0.952     0.907     0.951     0.888     0.955     0.963     0.936
ENet+     0.944     0.898     0.960     0.878     0.967     0.959     0.934

Without time-shift
Lasso     0.854     0.916     0.768     0.894     0.770     0.753     0.825
ENet      0.900     0.927     0.809     0.914     0.792     0.805     0.858

※ Higher is better
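The slides name Lasso and Elastic-Net but do not show how the models are fitted. As a rough illustration only, here is a minimal coordinate-descent sketch for one common form of the Lasso objective, (1/(2n))·||y − Xw − b||² + α·||w||₁. The function names, the toy data, and the value of α are assumptions, and Elastic-Net is not covered.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma (the Lasso proximal step)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_fit(X, y, alpha, n_iter=200):
    """Fit Lasso by cyclic coordinate descent; returns (weights, intercept)."""
    n, p = len(X), len(X[0])
    # Center features and target so the intercept can be recovered afterwards.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    residual = [v - y_mean for v in y]  # equals yc - Xc @ w with w = 0
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                w[j] = new_w
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept


# Toy data (hypothetical): y depends only on the first feature.
toy_X = [[1, 1], [2, -1], [3, 0], [4, 0], [5, -1], [6, 1]]
toy_y = [3, 5, 7, 9, 11, 13]
print(lasso_fit(toy_X, toy_y, alpha=0.1))
```

On the toy data the irrelevant second feature receives a weight of exactly zero, which is the sparsity property that makes Lasso a natural fit when there is one feature per word.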

slide-21
SLIDE 21

Outline

  • Data
  • Time shift: Nowcasting
  • Time shift: Forecasting

slide-22
SLIDE 22

Limitation

  • To estimate the epidemic on a specific day through Twitter, we need to gather that same day’s tweets
  • How can we predict future disease outbreaks?

[Figure: counts over time, divided into Past and Future, with the future marked “?”; legend: ■ # of flu related tweets, ■ # of flu patients]

slide-23
SLIDE 23

Restrict time shift estimation

  • In order to forecast the epidemic Δt days into the future, we restrict the search interval of the time shift to at least Δt days

[Figure: searching interval]
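The restriction only changes the lower bound of the lag search. The sketch below illustrates the effect with a made-up correlation curve that peaks at a 16-day lag; `best_lag` and `toy_r` are hypothetical names, and the curve is not data from the study.

```python
def best_lag(r_of_tau, delta_t, tau_max):
    """Return the lag in [delta_t, tau_max] that maximizes r_of_tau."""
    if delta_t > tau_max:
        raise ValueError("delta_t must not exceed tau_max")
    return max(range(delta_t, tau_max + 1), key=r_of_tau)


def toy_r(tau):
    # Made-up correlation curve peaking at a 16-day lag (illustration only).
    return 1.0 - abs(tau - 16) / 60


for delta_t in (0, 10, 30):
    print(delta_t, best_lag(toy_r, delta_t, tau_max=60))
# prints:
# 0 16
# 10 16
# 30 30
```

When the unrestricted optimum already lies at or beyond Δt, the restriction has no effect. Otherwise the search settles on the best lag that is still usable for that forecast horizon.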

slide-24
SLIDE 24

Motivating example

  • Nowcasting case: τ ∈ [0, τmax]

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of flu patients]

slide-25
SLIDE 25

Motivating example

  • Forecasting case (10 days future): τ ∈ [10, τmax]

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of the word “fever” (10 days shifted), ■ # of the word “fever” (16 days shifted), ■ # of flu patients]

slide-26
SLIDE 26

Motivating example

  • Forecasting case (30 days future): τ ∈ [30, τmax]

[Figure: counts over time; legend: ■ # of the word “fever”, ■ # of the word “fever” (30 days shifted), ■ # of flu patients]

slide-27
SLIDE 27

Motivating example

  • Forecasting case (30 days future): τ ∈ [30, τmax]

[Figure: counts over time; legend: ■ # of the word “Injection”, ■ # of the word “Injection” (30 days shifted), ■ # of the word “Injection” (55 days shifted), ■ # of flu patients; annotation: r = 0.87]

slide-28
SLIDE 28

Forecasting Modeling

  • For each Δt, we search for the optimal time shift of every word.
  • We then estimate the model by Lasso and ENet using these time-shifted features.

[Figure: searching interval]
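The two steps above can be sketched end to end for a single forecast horizon. This is a minimal illustration under stated assumptions, not the authors' pipeline: daily series on a shared grid, hypothetical function names, and toy data in which “fever” leads patients by 2 days and “injection” by 5. The regression step itself is left out.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def lagged_r(x, y, tau):
    """Correlation between x from tau days earlier and y."""
    return pearson(x[:len(x) - tau], y[tau:])


def forecasting_lags(word_series, patients, delta_t, tau_max):
    """For one horizon delta_t, pick each word's best lag in [delta_t, tau_max]."""
    return {
        word: max(
            range(delta_t, tau_max + 1),
            key=lambda tau: lagged_r(counts, patients, tau),
        )
        for word, counts in word_series.items()
    }


def forecast_features(word_series, lags, last_day, delta_t):
    """Features for predicting day last_day + delta_t from data up to last_day."""
    target = last_day + delta_t
    # Every lag is >= delta_t, so target - lag never exceeds last_day.
    return {w: word_series[w][target - lag] for w, lag in lags.items()}


# Toy data (hypothetical).
patients = [0, 0, 0, 0, 0, 0, 1, 4, 9, 4, 1, 0, 0, 0, 0, 0]
series = {
    "fever": patients[2:] + [0, 0],
    "injection": patients[5:] + [0, 0, 0, 0, 0],
}

for delta_t in (0, 3):
    print(delta_t, forecasting_lags(series, patients, delta_t, tau_max=6))

lags = forecasting_lags(series, patients, delta_t=3, tau_max=6)
print(forecast_features(series, lags, last_day=5, delta_t=3))
```

In the toy run, “injection” keeps its 5-day lag at both horizons, while “fever” is pushed from 2 days to 3 days once Δt = 3, mirroring the “fever” and “Injection” examples on the preceding slides.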

slide-29
SLIDE 29

Our model outperforms the baseline

  • Baseline:

※ Higher is better

slide-30
SLIDE 30

Summary

  • We discovered the time difference between Twitter and actual phenomena.
  • We proposed a method for handling this difference to improve nowcasting performance and extended it to a forecasting model.
  • Our method is widely applicable to other time series data that have a time lag between response and predictors.

Code and Data available at http://sociocom.jp/~iso/forecastword