Inferring Mood Instability on Social Media by Leveraging Ecological Momentary Assessments




  1. Inferring Mood Instability on Social Media by Leveraging Ecological Momentary Assessments. Koustuv Saha, Larry Chan, Kaya de Barbaro, Gregory D. Abowd, Munmun De Choudhury. Saha, K., Chan, L., De Barbaro, K., Abowd, G. D., & De Choudhury, M. (2017). Inferring mood instability on social media by leveraging ecological momentary assessments. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(3), 95. https://dl.acm.org/citation.cfm?id=3130960

  2. Background. Quantifying attributes of mental well-being. Survey Instruments • Self-Report Questionnaires. Active Sensing • Ecological Momentary Assessments (EMAs). Passive Sensing • Smartphones and Wearables • Social Media. Social media as a passive sensor! Challenge: ground-truth data.

  3. Mood Instability

  4. Goals & Contributions. Broad tasks: • Combination of active and passive sensing • A machine learning framework identifying mood instability for a larger population • Psycholinguistic cues and a Mood Instability Lexicon. (Charts: distribution of the number of EMA responses over time of day; distribution of the number of users over values 0 to 1.)

  5. Objective: Inferring Mood Instability. Dataset 1 (participants): actively sensed EMAs and passively sensed social media; small-scale; actively sensed data serves as ground truth. Dataset 2 (public): passively sensed social media; large-scale; unlabeled.

  6. Study and Data: CampusLife, Georgia Tech

  7. Data Recruitment ■ 51 participants ■ Mean age: 22 years ■ Incentives: $40-$120 ■ 5 weeks (Spring 2016). Data collected: survey questionnaires (entry and exit); active sensing via daily EMAs; smartphone sensors (barometer, calls, accelerometer, app usage, ...); social media (Facebook, Twitter). (Charts: Undergrad vs. Grad and Male vs. Female proportions; values shown: 40%, 46%, 54%, 60%.)

  8. Privacy & Ethics ■ IRB approval ■ Data sharing consent ■ Secure servers ■ De-identification

  9. EMA Data. Photographic Affect Meter (PAM) (Pollak et al., 2011). 1,606 EMA responses (mean responses per participant: 32). Pollak, J. P., Adams, P., & Gay, G. (2011). PAM: A photographic affect meter for frequent, in situ measurement of affect. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 725-734). ACM.

  10. Social Media Data-I (CampusLife population, one-time collection) • 23 participants: 13k+ status updates • 10 participants: 1.5k tweets

  11. Social Media Data-II: unlabeled Twitter data with self-disclosure (Coppersmith et al., 2014). • Bipolar: self-disclosure of bipolar disorder, e.g., "I have been diagnosed with Bipolar Disorder" • Borderline: self-disclosure of borderline personality disorder, e.g., "I suffer from bpd" • Control: random Twitter stream, excluding the Bipolar and Borderline users. 37m+ tweets, 19k+ unique users. Coppersmith, G., Dredze, M., & Harman, C. (2014). Quantifying mental health signals in Twitter. In Proceedings of the Workshop on Computational Linguistics and Clinical Psychology (CLPsych): From Linguistic Signal to Clinical Reality.
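
To make the collection step concrete, here is a hypothetical self-disclosure filter in the spirit of the approach described above; the actual patterns used by Coppersmith et al. (2014) are not given on the slide, so the regular expressions below are illustrative assumptions only (a sketch in Python).

```python
# Hypothetical self-disclosure filter; the patterns are illustrative, not the paper's.
import re

BIPOLAR = re.compile(r"\bi\s+(was|have been|am)\s+diagnosed\s+with\s+bipolar\b", re.I)
BORDERLINE = re.compile(
    r"\bi\s+(suffer|was diagnosed)\s+(from|with)\s+(bpd|borderline personality disorder)\b",
    re.I,
)

def disclosure_group(tweet_text):
    """Return 'bipolar', 'borderline', or None for a single tweet."""
    if BIPOLAR.search(tweet_text):
        return "bipolar"
    if BORDERLINE.search(tweet_text):
        return "borderline"
    return None
```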

  12. Linguistic Equivalence: cross-platform & cross-population. Pair-wise comparison of word vectors (cosine similarities) (Baldwin et al., 2013). • Cross-platform linguistic equivalence (Facebook and Twitter): 0.90 • Cross-population linguistic equivalence (college and general population): 0.95. Baldwin, T., Cook, P., Lui, M., MacKinlay, A., & Wang, L. (2013). How noisy social media text, how diffrnt social media sources? In Proceedings of IJCNLP (pp. 356-364).
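
As a rough illustration of the comparison behind these numbers, one can represent each corpus as a word-frequency vector over a shared vocabulary and take the cosine similarity; whether the paper used raw counts, n-grams, or another representation is not stated on the slide, so counts here are an assumption.

```python
# Sketch: cosine similarity between two corpora represented as word-count vectors.
from collections import Counter
import numpy as np

def corpus_similarity(corpus_a, corpus_b):
    """corpus_a, corpus_b: iterables of already-tokenized posts (lists of words)."""
    ca = Counter(w for post in corpus_a for w in post)
    cb = Counter(w for post in corpus_b for w in post)
    vocab = sorted(set(ca) | set(cb))
    va = np.array([ca[w] for w in vocab], dtype=float)
    vb = np.array([cb[w] for w in vocab], dtype=float)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))
```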

  13. Data: Recap. CampusLife (Dataset 1): actively sensed EMA data and passively sensed Facebook data; small-scale; actively sensed data as ground truth. Public (Dataset 2): passively sensed Twitter data; large-scale; unlabeled.

  14. Methods and Results

  15. Methods: Overview. CampusLife (Dataset 1): actively sensed EMA data provides mood instability labels, passively sensed Facebook data provides psycholinguistic features, and together they train the Seed Classifier. Public (Dataset 2): psycholinguistic features from Twitter data feed the Final Classifier; the pipeline also produces a Mood Instability Lexicon.

  16. Quantifying Mood Instability. EMA responses arrive at non-uniform time intervals, so successive differences are time-adjusted as Adjusted Successive Differences (ASDs) (Jahng et al., 2008): ASD_{i+1} = (x_{i+1} − x_i) / [ ((t_{i+1} − t_i) / Mdn(t_{i+1} − t_i))^λ ]. (Figure: ASD time series for valence and arousal, Apr 19 to Apr 29.) Jahng, S., Wood, P. K., & Trull, T. J. (2008). Analysis of affective instability in ecological momentary assessment: Indices using successive difference and group comparison via multilevel modeling. Psychological Methods.
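
A minimal numerical sketch of the ASD computation above, assuming EMA responses arrive as time-ordered (timestamp, score) pairs; the exponent λ defaults to 0.5 here purely for illustration, as the deck does not state the value used.

```python
import numpy as np

def adjusted_successive_differences(values, times_sec, lam=0.5):
    """Adjusted Successive Differences (ASD), after Jahng et al. (2008).

    values:    time-ordered EMA scores (e.g., PAM valence or arousal)
    times_sec: response timestamps in seconds (e.g., Unix time)
    lam:       time-adjustment exponent; 0.5 is an illustrative default only
    """
    x = np.asarray(values, dtype=float)
    t = np.asarray(times_sec, dtype=float)
    dx = np.diff(x)                         # x_{i+1} - x_i
    dt = np.diff(t)                         # t_{i+1} - t_i
    return dx / (dt / np.median(dt)) ** lam
```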

  17. Labeling Mood Instability. For each participant, compute ASDs from the valence and arousal EMA series, take MI = MAD(ASD), and assign High MI if MI > Mdn(MI) and Low MI if MI ≤ Mdn(MI). ■ ASD: Adjusted Successive Differences ■ MAD: Mean Absolute Deviation ■ MI: Mood Instability
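
The labeling step can be sketched as follows, assuming one MI value per participant; how the valence and arousal MI values are combined into a single label is not spelled out on the slide, so that step is left to the caller.

```python
import numpy as np

def mood_instability(asd):
    """MI for one participant: mean absolute deviation (MAD) of their ASD series."""
    asd = np.asarray(asd, dtype=float)
    return float(np.mean(np.abs(asd - asd.mean())))

def label_participants(mi_by_participant):
    """Median split: High MI if MI > Mdn(MI), Low MI if MI <= Mdn(MI)."""
    threshold = np.median(list(mi_by_participant.values()))
    return {
        participant: ("High MI" if mi > threshold else "Low MI")
        for participant, mi in mi_by_participant.items()
    }
```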

  18. Machine Learning Classifier (Seed Classifier). ■ Psycholinguistic lexicon: Linguistic Inquiry and Word Count (LIWC) (Pennebaker et al., 2003) provides the psycholinguistic features (X); the mood instability label is the target (Y). ■ Supervised machine learning classifiers: Naïve Bayes, Logistic Regression, Random Forest, Support Vector Machine. ■ 23 CampusLife participants. ■ k-fold cross-validation (k=5) for parameter tuning. Pennebaker, J. W., Mehl, M. R., & Niederhoffer, K. G. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology.
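
A hedged sketch of the seed-classifier comparison: LIWC category proportions as the feature matrix X, median-split MI labels as y, and 5-fold cross-validation over the classifier families named above. Building X from LIWC output is assumed and not shown.

```python
# Compare the candidate seed classifiers with k-fold cross-validation.
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def compare_seed_classifiers(X, y, k=5):
    models = {
        "Naive Bayes": GaussianNB(),
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=100),
        "SVM (linear)": SVC(kernel="linear"),
        "SVM (RBF)": SVC(kernel="rbf"),
        "SVM (poly)": SVC(kernel="poly"),
    }
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=k)   # k-fold CV accuracy
        results[name] = (scores.mean(), scores.std(), scores.max())
    return results
```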

  19. Seed Classifier: Accuracy Metrics (accuracy across folds).
      Classifier            mean   stdev.  max.
      Naïve Bayes           0.58   0.54    0.83
      Logistic Regression   0.51   0.35    0.80
      Random Forest         0.48   0.64    0.83
      SVM (kernel=poly.)    0.56   0.24    0.80
      SVM (kernel=RBF)      0.51   0.35    0.80
      SVM (kernel=linear)   0.68   0.29    0.83
      Challenge: unstable classification?

  20. Semi-Supervised Classifier: self-training (Dara et al., 2002). ■ K-means clustering (K=2) on the unlabeled data. ■ Classification of the cluster centroids using the seed classifier. (Figure: two clusters in a 2-D feature projection.) Dara, R., Kremer, S. C., & Stacey, D. A. (2002). Clustering unlabeled data with SOMs improves classification of labeled real-world data. In Proceedings of the 2002 IJCNN. IEEE.
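
A minimal sketch of the centroid-labeling step on this slide: cluster the unlabeled psycholinguistic features with K-means (K=2), label each centroid with the seed classifier, and propagate that label to the cluster's members; retraining a final classifier on these pseudo-labels (the self-training loop) is assumed but not shown.

```python
from sklearn.cluster import KMeans

def pseudo_label_with_centroids(X_unlabeled, seed_classifier, k=2):
    """Label K-means clusters by classifying their centroids with the seed classifier."""
    km = KMeans(n_clusters=k, n_init=10).fit(X_unlabeled)
    centroid_labels = seed_classifier.predict(km.cluster_centers_)  # one label per centroid
    return centroid_labels[km.labels_]  # pseudo-label for every unlabeled sample
```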

  21. Semi-Supervised Classifier: Stability. k-fold CV accuracies of the SS classifier (% High MI):
      Data        Fold 1   2       3       4       5       mean    stdev.
      Bipolar     62.87    63.64   62.66   63.18   63.38   63.15   0.39
      Borderline  61.06    61.81   62.44   62.84   62.31   62.09   0.68
      Control     36.70    36.54   36.56   36.47   37.26   36.71   0.32
      Addressed: stable classification.

  22. Results. ■ High accuracy machine learning classification. ■ Higher occurrence of High MI in the Bipolar and Borderline datasets as compared to Control. (Chart: counts of High MI vs. Low MI users in the Bipolar, Borderline, and Control datasets.)

  23. Analyzing the Psycholinguistic Language. Comparison of psycholinguistic feature groups, High MI vs. Low MI:
      Psycholinguistic Group           H. MI vs. L. MI
      Affective Attributes             83%
      Cognitive Attributes             521%
      Interpersonal Focus              124%
      Lexical Density and Awareness    195%
      Social/Personal Concerns         90%
      These psycholinguistic features also inform a Mood Instability Lexicon.
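
Assuming the table reports percentage differences in mean normalized usage of each psycholinguistic group between High-MI and Low-MI posts (an interpretation; the slide only labels the column "H. MI vs. L. MI"), the comparison can be sketched as:

```python
# Sketch of a group-level comparison; the exact normalization is not given on the slide.
import numpy as np

def percent_difference(high_mi_usage, low_mi_usage):
    """Each argument: dict mapping category -> array of per-post normalized usage."""
    out = {}
    for cat in high_mi_usage:
        hi = np.mean(high_mi_usage[cat])
        lo = np.mean(low_mi_usage[cat])
        out[cat] = 100.0 * (hi - lo) / lo   # relative difference of High MI over Low MI
    return out
```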

  24. Discussion

  25. Implications ■ Social media as a passive sensor ■ Ability to detect mood instability ■ Tackles the challenge of scarce labeled data ■ Applications in other health sensing problems ■ Integration of multiple sensors

  26. Limitations & Future Work ■ Clinical relevance ■ Causal claims ■ Self-reported and social media data ■ Multimodal data

  27. Acknowledgements ■ CampusLife Consortium ■ StudentLife Project ■ Human-Facing Privacy Thrust of the IISP Institute at Georgia Tech. Thank you! koustuv.saha@gatech.edu | koustuv.com. Saha, K., Chan, L., De Barbaro, K., Abowd, G. D., & De Choudhury, M. (2017). Inferring mood instability on social media by leveraging ecological momentary assessments. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(3), 95. https://dl.acm.org/citation.cfm?id=3130960

  28.

  29. Seed Classifier: Accuracy Metrics.
      Classifier            mean   stdev.  max.
      Naïve Bayes           0.58   0.54    0.83
      Logistic Regression   0.51   0.35    0.80
      Random Forest         0.48   0.64    0.83
      SVM (kernel=poly.)    0.56   0.24    0.80
      SVM (kernel=RBF)      0.51   0.35    0.80
      SVM (kernel=linear)   0.68   0.29    0.83
      Challenge: unstable classification?
      k-fold CV accuracies of the Seed Classifier (% High MI):
      Data        Fold 1   2       3       4       5       mean    stdev.
      Bipolar     66.81    69.86   64.64   43.76   62.82   51.38   10.30
      Borderline  61.37    63.81   54.41   34.04   56.13   45.06   11.76
      Control     42.04    46.05   37.35   24.79   37.94   31.40   7.99
