
Members: Raghuram Krishnamachari Manish Maheshwari Maryam El - PowerPoint PPT Presentation



  1. Members: Raghuram Krishnamachari, Manish Maheshwari, Maryam El Kherba
     Guided by: Prof. Alan Mislove

  2. Flu Prediction / Activity
     • CDC Flu Activity: reports influenza-like illness (ILI) for each region
     • Google Flu Trends: aggregates search data to estimate flu activity
     • Our experiment (Twitter): analyzes Twitter data (tweets) to estimate flu activity

  3. Google Flu Trends
     • CDC's ILI data vs. Google Flu Trends

  4. Google Flu Trends vs. Twitter
     [Figure: two charts. First chart: y-axis 0 to 12000; series are HHS Regions 1 to 10 and the United States. Second chart: y-axis 0 to 0.009; series are Region 1 to Region 10.]
     HHS regions in the legend:
     • Region 1: CT, ME, MA, NH, RI, VT
     • Region 2: NJ, NY
     • Region 3: DE, DC, MD, PA, VA, WV
     • Region 4: AL, FL, GA, KY, MS, NC, SC, TN
     • Region 5: IL, IN, MI, MN, OH, WI
     • Region 6: AR, LA, NM, OK, TX
     • Region 7: IA, KS, MO, NE
     • Region 8: CO, MT, ND, SD, UT, WY
     • Region 9: AZ, CA, HI, NV
     • Region 10: AK, ID, OR, WA

  5. Google Flu Trends vs. Twitter
     [Figure: two charts. First chart: series G-R3 and T-R3, y-axis 0 to 7000. Second chart: series G-R9 and T-R9, y-axis 0 to 8000.]

  6. Tweets, Phrases
     Each entry lists the search terms followed by the number shown next to them on the slide.
     • "having a cold": 4
     • "have a cold": 7
     • "feel feverish" "flu": 5
     • "headache" "flu": 8
     • "sick" "flu": 9
     • "flu" "fever": 5
     • "came down with the flu": 7
     • "chills" "flu": 7
     • "catching the flu": 6
     • "cough" "flu": 6
     • "fatigue" "flu": 8
     • "weakness" "flu": 6
     • "flu like symptoms": 4
     • "runny nose" "flu": 5
     • "sore throat" "flu": 7
     • "stomach ache" "flu": 6
     • "stuffy nose" "flu": 6
     • "tiredness" "flu": 4
     • "vomiting" "flu": 4
     • "watery eyes" "flu": 6
     • "body hurts" "flu": 7
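Several entries in the phrase list pair a symptom with "flu", which suggests a tweet should contain both terms. The slides do not say how the pairs were combined, so the following is only a minimal sketch under that assumption. The function name `match_all` is hypothetical; it keeps the lines from standard input that contain every term passed as an argument, ignoring case.

```shell
# Hypothetical helper, not from the slides: print the input lines that
# contain ALL of the given terms (case-insensitive substring match).
match_all() {
  awk -v n="$#" '
    BEGIN {
      # Copy the terms out of ARGV, then set ARGC to 1 so awk reads stdin
      # instead of treating the terms as file names.
      for (i = 1; i <= n; i++) terms[i] = tolower(ARGV[i])
      ARGC = 1
    }
    {
      line = tolower($0)
      for (i = 1; i <= n; i++)
        if (index(line, terms[i]) == 0) next   # a term is missing: skip the line
      print
    }
  ' "$@"
}

# Example: only the first line contains both "sore throat" and "flu".
printf '%s\n' \
  'I have a sore throat, must be the flu' \
  'Sore throat today' \
  'Flu shot done' | match_all "sore throat" "flu"
```

Because this is a plain substring match, "flu" would also match words such as "fluent", so a real filter would probably need word boundaries or manual review of the phrase list.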

  7. Process
     • Filter
       • Filter flu tweets from Twitter data
       • Store data for each state (FIPS)
     • Count
       • Count flu tweets (weekly)
       • Count total tweets (weekly)
     • Plot
       • Ratio of flu-related to total tweets
       • Compare against Google/CDC
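The ratio step of the process can be sketched as follows. This is an illustration rather than the authors' script: the function name `weekly_ratio`, the two-column `WEEK COUNT` file layout, and the week labels are all assumptions. The first file holds weekly flu-tweet counts, the second holds weekly total-tweet counts, and the output is one ratio per week listed in the second file.

```shell
# Hypothetical helper, not from the slides.
# $1: file with lines "WEEK FLU_COUNT"   (assumed layout)
# $2: file with lines "WEEK TOTAL_COUNT" (assumed layout)
weekly_ratio() {
  awk '
    # First file: remember the flu-tweet count for each week.
    FILENAME == ARGV[1] { flu[$1] = $2; next }
    # Second file: divide by the total; weeks with no flu tweets give 0.
    $2 > 0 { printf "%s %.6f\n", $1, flu[$1] / $2 }
  ' "$1" "$2"
}
```

For example, a week with 3 flu tweets out of 1000 total tweets would be printed with the ratio `0.003000`. Weeks whose total is zero are skipped to avoid dividing by zero.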

  8. Implementation: Linux bash shell script
     • Filtering
         find fips -name "*.gz" -exec zcat {} \; | grep "$1"
     • Counting
         find … -exec zcat {} \; | awk '{ print $3 }' | awk '{ print $3 " " $2 " " $6 }' | sort -k 3n -k 2M -k 1n | uniq -c
     • Plotting
         pr -mft -s, dates.txt NJ.tot NY.tot > RE2.tot
         Microsoft Excel
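The `sort | uniq -c` part of the counting command can be tried on its own. The sketch below assumes the earlier stages have already reduced each tweet to a `DAY MONTH YEAR` line (for example `24 Sep 2012`); that input format and the function name `count_per_day` are assumptions for illustration. It sorts by year, then month name, then day, and counts identical dates.

```shell
# Hypothetical helper, not from the slides.
# Input:  one "DAY MONTH YEAR" line per tweet, e.g. "24 Sep 2012" (assumed).
# Output: "DAY MONTH YEAR COUNT", in chronological order.
count_per_day() {
  # LC_ALL=C makes -M recognize the English month abbreviations.
  LC_ALL=C sort -k 3n -k 2M -k 1n |
    uniq -c |
    awk '{ print $2, $3, $4, $1 }'   # move the count to the last column
}

printf '%s\n' '24 Sep 2012' '3 Jan 2012' '24 Sep 2012' '15 Dec 2011' | count_per_day
```

The final `awk` stage only normalizes the padded output of `uniq -c`, which makes the result easier to load into a spreadsheet.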

  9. Challenges
     • Filtering
       • Phrases that express flu symptoms
       • Processing time
       • Segregation based on location
     • Counting
       • Processing time
       • Storage format
     • Plotting
       • Lack of consistent CDC data
       • Handling of large numeric data

  10. Future
     • Better prediction algorithm
     • Live tweet monitoring
     • Flu propagation
     • Facebook application
