Inferring Mood Instability on Social Media by Leveraging Ecological Momentary Assessments



slide-1
SLIDE 1

INFERRING MOOD INSTABILITY ON SOCIAL MEDIA BY LEVERAGING ECOLOGICAL MOMENTARY ASSESSMENTS

Koustuv Saha, Larry Chan, Kaya de Barbaro, Gregory D. Abowd, Munmun De Choudhury

Saha, K., Chan, L., De Barbaro, K., Abowd, G. D., & De Choudhury, M. (2017). Inferring mood instability on social media by leveraging ecological momentary assessments. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 1(3), 95, https://dl.acm.org/citation.cfm?id=3130960

slide-2
SLIDE 2

Background

Quantifying attributes of mental well-being

Survey Instruments

  • Self-Report Questionnaires

Passive Sensing

  • Smartphones and Wearables
  • Social Media

Active Sensing

  • Ecological Momentary Assessments (EMAs)

Ground-truth Data Challenge

SOCIAL MEDIA AS PASSIVE SENSOR!

slide-3
SLIDE 3

Mood Instability


slide-4
SLIDE 4

Goals & Contributions

Broad tasks

  • Combination of Active and Passive Sensing
  • A machine learning framework identifying mood instability for a larger population
  • Psycholinguistic cues and Mood Instability Lexicon

[Figure: Number of Responses]
[Figure: Number of Users by Value]
slide-5
SLIDE 5

Objective: Inferring Mood Instability

Participants (Dataset 1): EMAs (Actively Sensed) and Social Media (Passively Sensed)

  • Small-scale
  • Actively sensed data as Ground-truth

Public (Dataset 2): Social Media (Passively Sensed)

  • Large-scale
  • Unlabeled

slide-6
SLIDE 6

Study and Data

CampusLife, Georgia Tech


slide-7
SLIDE 7

Recruitment

■ 51 participants
■ Mean age: 22 Years
■ Incentives: $40-$120
■ 5 weeks (Spring 2016)
■ 60% / 40% (Male / Female)
■ 46% / 54% (Undergrad / Grad)

Data

  • Survey Questionnaire (Entry and Exit)
  • Active Sensing: EMAs (Daily)
  • Smartphone sensors (Barometer, Call, Accelerometer, App usage, ...)
  • Social Media (Facebook, Twitter)

slide-8
SLIDE 8

Privacy & Ethics

■ IRB approval
■ Data sharing consent
■ Secure servers
■ De-identification

slide-9
SLIDE 9

EMA Data

Photographic Affect Meter (PAM) (Pollak et al., 2011)

1,606 EMA Responses (Mean responses/participant: 32)

Pollak, J. P., Adams, P., & Gay, G. (2011, May). PAM: a photographic affect meter for frequent, in situ measurement of affect. In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 725-734). ACM.

slide-10
SLIDE 10

Social Media Data-I

CampusLife Population

  • Facebook: 23 participants, 13k+ status updates
  • Twitter: 10 participants, 1.5k tweets

One-time collection

slide-11
SLIDE 11

Social Media Data-II

Unlabeled Twitter data with self-disclosure (Coppersmith et al., 2014)

Bipolar

  • Self-Disclosure of Bipolar Disorder
  • E.g.: "I have been diagnosed with Bipolar Disorder"

Borderline

  • Self-Disclosure of Borderline Personality Disorder
  • E.g.: "I suffer from bpd"

Control

  • Random Twitter Stream
  • Excludes Bipolar and Borderline

37m+ tweets, 19k+ unique users
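A minimal sketch of how self-disclosure statements like the two examples on this slide could be matched with regular expressions. The patterns and the function name `disclosure_group` are hypothetical and written only to cover the quoted examples; they are not the patterns used to build the dataset. The sketch also works per tweet, whereas the slide describes the Control group as a random stream that excludes the other two groups.

```python
import re

# Hypothetical patterns modeled on the slide's two example statements.
# They are illustrative only, not the patterns behind the actual dataset.
DISCLOSURE_PATTERNS = {
    "Bipolar": re.compile(
        r"\bi\s+(?:have\s+been|was|am|got)\s+diagnosed\s+with\s+bipolar\b",
        re.IGNORECASE,
    ),
    "Borderline": re.compile(
        r"\bi\s+(?:suffer\s+from|(?:have\s+been|was|am|got)\s+diagnosed\s+with)\s+"
        r"(?:bpd|borderline(?:\s+personality\s+disorder)?)\b",
        re.IGNORECASE,
    ),
}


def disclosure_group(tweet: str) -> str:
    """Return 'Bipolar', 'Borderline', or 'Control' for a single tweet."""
    for group, pattern in DISCLOSURE_PATTERNS.items():
        if pattern.search(tweet):
            return group
    return "Control"
```

A real pipeline would also need to handle negations, quotes, and jokes, which simple patterns like these cannot distinguish.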

Coppersmith, G., Dredze, M., & Harman, C. (2014). Quantifying mental health signals in Twitter. In Proceedings of the Workshop on CLPsych: From Linguistic Signal to Clinical Reality.

slide-12
SLIDE 12

Linguistic Equivalence

Cross-platform & Cross-population (Baldwin et al., 2013)

Pair-wise comparison of word-vectors (cosine similarities)

  • Cross-platform Linguistic Equivalence (Facebook and Twitter): 0.90
  • Cross-population Linguistic Equivalence (College and General population): 0.95
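The cosine comparison can be sketched as follows. This assumes the word-vectors are plain bag-of-words counts over whitespace tokens, which the slide does not specify; the toy posts are invented and stand in for the two platforms.

```python
import math
from collections import Counter


def word_vector(posts):
    """Bag-of-words count vector over whitespace-separated, lowercased tokens."""
    counts = Counter()
    for post in posts:
        counts.update(post.lower().split())
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpora standing in for the two platforms (invented text)
facebook_posts = ["feeling great today", "exams next week"]
twitter_posts = ["feeling tired today", "exams are over"]
similarity = cosine_similarity(word_vector(facebook_posts), word_vector(twitter_posts))
```

Values near 1 mean the two corpora use words in similar proportions, which is how the 0.90 and 0.95 figures above would be read.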

Baldwin, T., Cook, P., Lui, M., MacKinlay, A., & Wang, L. (2013, October). How noisy social media text, how diffrnt social media sources? In IJCNLP (pp. 356-364).

slide-13
SLIDE 13

Data: Recap

CampusLife (Dataset 1): EMA Data (Actively Sensed) and Facebook Data (Passively Sensed)

  • Small-scale
  • Actively sensed data as Ground-truth

Public (Dataset 2): Twitter Data (Passively Sensed)

  • Large-scale
  • Unlabeled

slide-14
SLIDE 14

Methods and Results


slide-15
SLIDE 15

Methods: Overview

[Pipeline diagram]

  • CampusLife (Dataset 1): EMA Data (Actively Sensed), Facebook Data (Passively Sensed)
  • Public (Dataset 2): Twitter Data (Passively Sensed)
  • Stages: Mood Instability, Psycholinguistic Features, Seed Classifier, Psycholinguistic Features, Final Classifier, Lexicon

slide-16
SLIDE 16

Quantifying Mood Instability

Non-uniform time differences in EMA responses

Adjusted Successive Differences (ASDs) (Jahng et al., 2008)

$$\mathrm{ASD}_{i+1} = \frac{x_{i+1} - x_i}{\left[(t_{i+1} - t_i)/\mathrm{Mdn}(t_{i+1} - t_i)\right]^{\lambda}}$$

[Figure: ASD of Valence and Arousal over TimeStamp]
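The ASD formula can be sketched in a few lines. The function name, the time unit (hours), and the default `lam=1.0` are assumptions of this sketch; the slide does not state the value of λ that was used.

```python
from statistics import median


def adjusted_successive_differences(times, values, lam=1.0):
    """ASD_{i+1} = (x_{i+1} - x_i) / ((t_{i+1} - t_i) / Mdn(dt)) ** lam.

    `times` must be strictly increasing numbers (e.g. hours since the first
    response); `values` are the matching EMA scores.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need two or more paired observations")
    gaps = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    if any(g <= 0 for g in gaps):
        raise ValueError("times must be strictly increasing")
    mdn = median(gaps)
    return [
        (x1 - x0) / ((gap / mdn) ** lam)
        for x0, x1, gap in zip(values, values[1:], gaps)
    ]


# Toy valence series: hours since first response and reported score (invented)
hours = [0, 24, 48, 96]
valence = [1, 3, 2, 4]
asd = adjusted_successive_differences(hours, valence, lam=1.0)
```

In the toy series the last gap is twice the median gap, so with λ = 1 the final difference of 2 is scaled down to 1; with λ = 0 the adjustment is switched off and plain successive differences are returned.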

Jahng, S., Wood, P. K., & Trull, T. J. (2008). Analysis of affective instability in ecological momentary assessment: Indices using successive difference and group comparison via multilevel modeling. Psychological Methods.
slide-17
SLIDE 17

Labeling Mood Instability

■ ASD: Adjusted Successive Differences
■ MAD: Mean Absolute Deviation
■ MI: Mood Instability

  • EMA Valence, EMA Arousal → Calculate ASD → ASD Valence, ASD Arousal
  • ASD Valence, ASD Arousal → Calculate MAD(ASD) → MI Valence, MI Arousal → MI
  • MI > Mdn(MI): High MI
  • MI ≤ Mdn(MI): Low MI
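A minimal sketch of the labeling steps, assuming the per-participant ASD series are already computed. The slide does not say how MI Valence and MI Arousal are combined into a single MI; averaging them here is an assumption, as are the function names and the toy data.

```python
from statistics import mean, median


def mean_absolute_deviation(xs):
    """Mean absolute deviation of xs around their mean."""
    centre = mean(xs)
    return mean(abs(x - centre) for x in xs)


def label_mood_instability(asd_by_participant):
    """Map participant -> 'High MI' / 'Low MI' using a median split.

    `asd_by_participant` maps an id to {'valence': [...], 'arousal': [...]}.
    Averaging the two per-dimension MI values is an assumption of this sketch.
    """
    mi = {}
    for pid, asd in asd_by_participant.items():
        mi_valence = mean_absolute_deviation(asd["valence"])
        mi_arousal = mean_absolute_deviation(asd["arousal"])
        mi[pid] = (mi_valence + mi_arousal) / 2
    cutoff = median(mi.values())
    return {pid: "High MI" if v > cutoff else "Low MI" for pid, v in mi.items()}


# Invented ASD series for three hypothetical participants
toy = {
    "p1": {"valence": [0, 0, 0, 0], "arousal": [1, -1, 1, -1]},
    "p2": {"valence": [2, -2, 2, -2], "arousal": [3, -3, 3, -3]},
    "p3": {"valence": [1, -1, 1, -1], "arousal": [1, -1, 1, -1]},
}
labels = label_mood_instability(toy)
```

A participant sitting exactly on the median falls into Low MI, matching the MI ≤ Mdn(MI) rule.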

slide-18
SLIDE 18

Machine Learning Classifier

Seed Classifier

■ Psycholinguistic Lexicon: Linguistic Inquiry and Word Count (LIWC) (Pennebaker et al., 2003)
■ Supervised machine learning classifier
  – 23 CampusLife participants
  – k-fold cross-validation (k=5) for parameter tuning
  – Naïve Bayes, Logistic Regression, Random Forest, Support Vector Machine

Psycholinguistic Features (X) → Mood Instability Label (Y)
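To illustrate the evaluation loop, here is a self-contained sketch of 5-fold cross-validation that reports the mean, standard deviation, and maximum of the fold accuracies. It uses a hand-rolled Gaussian naive Bayes as a stand-in for one of the listed classifier families, runs on invented two-feature data rather than LIWC output, and skips parameter tuning; all function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def fit_gaussian_nb(X, y):
    """Per-class prior plus per-feature mean and variance (variance floored)."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for col in zip(*rows):
            mu = sum(col) / len(col)
            var = sum((v - mu) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mu, var))
        model[label] = (len(rows) / len(y), stats)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(label):
        prior, stats = model[label]
        ll = math.log(prior)
        for v, (mu, var) in zip(x, stats):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        return ll
    return max(model, key=log_posterior)


def cross_validate(X, y, k=5, seed=0):
    """Shuffle once, split into k folds, and return the k held-out accuracies."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict_gaussian_nb(model, X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return scores


# Invented two-feature data, 20 rows per class, well separated on purpose
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(20)]
y = ["Low MI"] * 20 + ["High MI"] * 20
scores = cross_validate(X, y, k=5)
summary = {"mean": mean(scores), "stdev": stdev(scores), "max": max(scores)}
```

With only 23 labeled participants, each fold holds out four or five people, so a single misclassification moves a fold's accuracy by 20 points or more; that is one plausible reason fold accuracies can vary widely on such a small sample.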

Pennebaker, J. W., Mehl, M. R., & Niederhoffer, K. G. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology.

slide-19
SLIDE 19

Seed Classifier: Accuracy Metrics

Metric                   mean   stdev.  max.
Naïve Bayes              0.58   0.54    0.83
Logistic Regression      0.51   0.35    0.80
Random Forest            0.48   0.64    0.83
SVM (Kernel = Poly.)     0.56   0.24    0.80
SVM (Kernel = RBF)       0.51   0.35    0.80
SVM (Kernel = Linear)    0.68   0.29    0.83

Challenge: Unstable Classification?

slide-20
SLIDE 20

Semi-Supervised Classifier

Self-Training (Dara et al., 2002)

■ K-Means clustering (K=2)
■ Classification of centroids using seed classifier

[Figure: clusters plotted over Dimension 1 and Dimension 2]

Dara, R., Kremer, S. C., & Stacey, D. A. (2002). Clustering unlabeled data with SOMs improves classification of labeled real-world data. Proc. of the 2002 IJCNN. IEEE.
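The two steps on this slide can be sketched as below. This assumes each cluster member inherits the label the seed classifier assigns to its centroid, which the slide implies but does not state. `toy_seed_classifier` is a hypothetical stand-in for the trained seed classifier, and the points are invented.

```python
import math


def kmeans(points, k=2, iterations=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, assignment


def label_by_centroid(points, seed_classifier, k=2):
    """Cluster the unlabeled points, classify each centroid, and propagate labels."""
    centroids, assignment = kmeans(points, k)
    centroid_labels = [seed_classifier(c) for c in centroids]
    return [centroid_labels[a] for a in assignment]


def toy_seed_classifier(features):
    """Hypothetical stand-in for the trained seed classifier."""
    return "High MI" if sum(features) > 5 else "Low MI"


unlabeled = [[0.0, 0.5], [9.0, 9.5], [0.5, 0.0], [9.5, 9.0], [1.0, 1.0], [10.0, 10.0]]
labels = label_by_centroid(unlabeled, toy_seed_classifier)
```

Because the seed classifier is applied only to two centroids rather than to every user, small fluctuations in its per-user predictions have less influence on the final labels.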

slide-21
SLIDE 21

Semi-supervised Classifier: Stability

k-fold CV accuracies of SS Classifier (%High MI)

Data         Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  mean    stdev.
Bipolar      62.87   63.64   62.66   63.18   63.38   63.15   0.39
Borderline   61.06   61.81   62.44   62.84   62.31   62.09   0.68
Control      36.70   36.54   36.56   36.47   37.26   36.71   0.32

Challenge addressed: Stable Classification
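As a worked check of the stability table, the mean and stdev. columns can be recomputed from the five fold values. The sketch assumes the stdev. column is the sample standard deviation (n − 1 denominator); the fold values are copied from the table.

```python
from statistics import mean, stdev

# Per-fold %High MI values copied from the stability table
fold_values = {
    "Bipolar": [62.87, 63.64, 62.66, 63.18, 63.38],
    "Borderline": [61.06, 61.81, 62.44, 62.84, 62.31],
    "Control": [36.70, 36.54, 36.56, 36.47, 37.26],
}

# statistics.stdev is the sample standard deviation (n - 1 denominator)
summary = {
    name: (round(mean(values), 2), round(stdev(values), 2))
    for name, values in fold_values.items()
}

for name, (avg, spread) in summary.items():
    print(f"{name:<11}{avg:>7.2f}{spread:>6.2f}")
```

Under that assumption the recomputed values agree with the table to two decimals, and every spread is below one percentage point.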

slide-22
SLIDE 22

Results

Machine Learning Classification

■ High Accuracy
■ Higher Occurrence of High MI in Bipolar and Borderline datasets as compared to Control

[Figure: High MI vs. Low MI counts for the Bipolar, Borderline, and Control datasets]

slide-23
SLIDE 23

Analyzing the Language

  • Psycholinguistic Features
  • Mood Instability Lexicon

Psycholinguistic Features

Psycholinguistic Group           High MI vs. Low MI
Affective Attributes             83%
Cognitive Attributes             521%
Interpersonal Focus              124%
Lexical Density and Awareness    195%
Social/Personal Concerns         90%

slide-24
SLIDE 24

Discussion


slide-25
SLIDE 25

Implications

■ Social media as a passive sensor
■ Ability to detect Mood Instability
■ Tackle the challenges of lack of labeled data
■ Application in other health sensing problems
■ Integrate multiple sensors

slide-26
SLIDE 26

Limitations & Future Work

■ Clinical Relevance
■ Causal Claims
■ Self-Reported and Social Media Data
■ Multimodal Data

slide-27
SLIDE 27

Acknowledgements

■ CampusLife Consortium
■ StudentLife Project
■ Human-Facing Privacy Thrust of the IISP Institute at Georgia Tech

Thank You

koustuv.saha@gatech.edu
koustuv.com



slide-29
SLIDE 29

Seed Classifier: Accuracy Metrics

Metric                   mean   stdev.  max.
Naïve Bayes              0.58   0.54    0.83
Logistic Regression      0.51   0.35    0.80
Random Forest            0.48   0.64    0.83
SVM (Kernel = Poly.)     0.56   0.24    0.80
SVM (Kernel = RBF)       0.51   0.35    0.80
SVM (Kernel = Linear)    0.68   0.29    0.83

Challenge: Unstable Classification?

k-fold CV accuracies of Seed Classifier (%High MI)

Data         Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  mean    stdev.
Bipolar      66.81   69.86   64.64   43.76   62.82   51.38   10.30
Borderline   61.37   63.81   54.41   34.04   56.13   45.06   11.76
Control      42.04   46.05   37.35   24.79   37.94   31.40   7.99