
SLIDE 1

Predicting the Future with Deep Learning and Signals from Social Media

SVITLANA VOLKOVA, PHD

Senior Research Scientist

Data Sciences and Analytics Group, National Security Directorate, Pacific Northwest National Laboratory

ACL Workshop on Natural Language Processing and Computational Social Science

August 10, 2017

SLIDE 2

Social Media Analytics

Predictive and forecasting analytics tasks:
- Predict deceptive news
- Identify suspicious accounts
- Predict native language
- Detect real-world events
- Forecast perspective dynamics (Brussels bombings, March 2016)
- Forecast influenza and weather
- Forecast language change (Russia-Ukraine conflict, 2014 - 2015)
- Forecast future events and instability

[Diagrams: LSTM layers over weekly ILI and social media predictors joined by a merge layer to predict weekly ILI proportions; an LSTM over event-type and entity distributions predicting the most likely event type (e.g., Conflict); a pre-trained LSTM (100 units) with dense layers (100/128 units) and a softmax layer over input embeddings (dimension 100, e.g., "russian tanks spotted in crimea today"); an LSTM/convolutional model with embedding (200 units), dense (100 units), and sigmoid/softmax activation layers combining input word sequences with network/linguistic cues via tensor concatenation; bidirectional GRUs (20 units) with an embedding layer (30 units) over byte input for binary native-language prediction (ES, IN, JA, FR, DE vs. English).]
SLIDE 3

Outline

[Diagram: the tweet-classification architecture (embedding, LSTM/convolutional, and dense layers) and the connotation-frame schema relating writer, reader, agent, and theme via P(w → agent), P(w → theme), and P(agent → theme): the predicate does not directly imply what the writer thinks of the theme; the writer portrays the agent as being unfairly opportunistic; the agent is unfairly taking advantage of the theme.]

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 4

Outline

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 5

Motivation and Background

- 62% of U.S. adults get news on social media (Pew Research, Oct 2016)
- 64% of U.S. adults said that “made-up news” has caused “a great deal of confusion” about the facts of current events (Pew Research, Dec 2016)
- Previous work on deception detection: deceptive Amazon reviews (Choi, Mihalcea), satirical news (Rubin et al., 2015), rumors (Qazvinian et al., 2011; Liu et al., 2015)

Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. S. Volkova, K. Shaffer, J. Yea Jang and N. Hodas. ACL 2017.
SLIDE 6

Deceptive News

Google Fact Checking: https://www.blog.google/topics/journalism-news/expanding-fact-checking-google/
Facebook 3rd Party Verification: http://newsroom.fb.com/news/2016/12/news-feed-fyi-addressing-hoaxes-and-fake-news/

SLIDE 7

Deceptive News Types

- Propaganda: deliberately spreads misinformation in order to appeal to certain groups
- Hoax: seeks to mislead, rather than entertain, readers for financial or political gain
- Clickbait: takes bits of true stories but insinuates and makes up other details to sow fear
- Satire: makes fun of the news with a satirical bent, or parodies the news

[Diagram: the four types arranged along an intent-to-deceive vs. no-intent-to-deceive axis: hoax, propaganda, satire, clickbait.]

SLIDE 8

Twitter News Data

[Diagram: two taxonomies of suspicious news split by intent to deceive: (1) hoax, propaganda, satire, clickbait; (2) propaganda, disinformation, clickbait, hoax, conspiracy.]

Dataset sizes: 130K tweets total, 65K suspicious; 2M suspicious tweets.

SLIDE 9

News Categorization

http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

SLIDE 10

Alternative News Categorization

http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

SLIDE 11

Annotations

Brussels bombing dataset: March 15 – March 29, 2016 (one week before and one week after March 22nd, 2016)

Account-level vs. tweet-level annotations:
- Fake news annotations: http://www.fakenewswatch.com/
- PropOrNot: http://www.propornot.com/p/the-list.html (manually verified)

Signs of propaganda:
- Tries to persuade
- Influences emotions, attitudes, opinions, and actions
- Targets audiences for political, ideological, and religious purposes
- Contains examples of selectively omitted and one-sided messages

SLIDE 12

Task Definition

Build tweet-level neural network models to differentiate between:
- Verified vs. unverified news posts (130K)
- Types of unverified news posts: propaganda, hoax, clickbait, satire (65K); disinformation, propaganda, conspiracy, clickbait, hoaxes (2M)

SLIDE 13

Model

Baselines: logistic regression with TF-IDF and Doc2Vec representations.
Our models: neural networks (RNN/CNN) with social network interactions and linguistic cues: hedging, assertive, factive, and implicative verbs.
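As a concrete illustration, the TF-IDF representation used by the logistic-regression baseline can be sketched in plain Python. The toy corpus below is hypothetical; a real pipeline would use scikit-learn's TfidfVectorizer.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute term-frequency x inverse-document-frequency weights
    for a list of tokenized documents."""
    n = len(docs)
    # document frequency: in how many docs each term appears
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

# Hypothetical two-tweet corpus
docs = [["fake", "news", "alert"], ["verified", "news", "report"]]
w = tfidf(docs)
# "news" appears in every document, so its IDF (and weight) is 0
```

Terms shared by every tweet get zero weight, which is exactly why TF-IDF highlights class-discriminative vocabulary for the classifier.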

[Diagram: input word sequences feed an embedding layer (200 units) into an LSTM/convolutional layer (100 units) and a dense layer (100 units); network/linguistic cues pass through dense layers (100 units) and are joined by tensor concatenation; a probability activation layer (sigmoid/softmax) produces the final output probabilities.]

Keras: https://keras.io/, scikit-learn: http://scikit-learn.org/stable/, Doc2Vec: https://pypi.python.org/pypi/gensim

SLIDE 14

Linguistic Analysis

- Moral Foundation Theory (Haidt and Graham, 2007; Graham et al., 2009): Harm, Care, Loyalty, Betrayal, Authority
- Biased Language (Recasens et al., 2013): assertive, factive, hedging, implicative, and report verbs
- Subjective Language (Volkova et al., 2013; Liu et al., 2005; Riloff et al., 2003)

[Per-class cue patterns: Betrayal↑, Care↑, Loyalty↓, Hedging↓, Implicative↓; Loyalty↑, Hedges↑, Subjectivity↑, Betrayal↓; Care↓, Subjective↓, Factive↓, Bias↓]
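A minimal sketch of how lexicon-based cues like these become model features: count hits against each word list. The tiny lexicons here are illustrative stand-ins, not the published hedging/factive lists.

```python
# Illustrative stand-in lexicons (not the published word lists)
HEDGES = {"may", "might", "appear", "suggest", "possibly"}
FACTIVES = {"know", "realize", "regret"}

def cue_counts(tokens):
    """Count cue-lexicon hits in a tokenized tweet."""
    toks = [t.lower() for t in tokens]
    return {
        "hedges": sum(t in HEDGES for t in toks),
        "factives": sum(t in FACTIVES for t in toks),
    }

feats = cue_counts("Officials suggest the report may possibly be fake".split())
# -> {"hedges": 3, "factives": 0}
```

These counts are the "linguistic cues" branch that the neural models concatenate with the text representation.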

SLIDE 15

Verified vs. Suspicious Prediction Results

Binary prediction with linguistic and social graph features (130K tweets, 10-fold cross-validation).

[Chart: accuracy (0.6–1.0 axis) for LR Doc2Vec, LR TF-IDF, RNN, and CNN across feature sets (text, + graph, + linguistic cues, all); reported accuracies include 0.76, 0.81, 0.93, and 0.95.]

SLIDE 16

Suspicious News Prediction Results (1)

Multi-class prediction: satire, hoaxes, clickbait, propaganda (65K).

[Chart: macro F1 (0.2–0.8 axis) for RNN, CNN, LR TF-IDF, and LR Doc2Vec across feature sets (text, + network, + linguistic markers, all); reported values include 0.63, 0.71, 0.63, and 0.66.]

SLIDE 17

Suspicious News Prediction Results (2)

Multi-class prediction: disinformation, propaganda, conspiracy, clickbait, hoaxes (2M).

[Chart: macro F1 (0.2–1.0 axis) for 4-way (no disinformation) and 5-way classification with words, + network, and + DeepWalk features; reported values include 0.67, 0.84, 0.78 and 0.65, 0.85, 0.76, along with 0.64, 0.61, 0.71, 0.92, 0.98.]

SLIDE 18

Key Findings

Neural network models jointly learn from:
- Tweet content
- Linguistic signals of bias and subjectivity
- Social network interactions

Predicting suspicious vs. verified news: linguistic signals help most. Inferring specific types of suspicious news: social interactions help most.

Future work:
- Multilingual predictions
- Multimodal inference: text + images
- Information propagation


Truth of Varying Shades: On Political Fact-Checking and Fake News. H. Rashkin, E. Choi, J. Yea Jang, S. Volkova and Y. Choi. EMNLP 2017.
SLIDE 19

Outline

[Diagram: the connotation-frame schema relating writer, reader, agent, and theme, as on Slide 3.]

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 20

Motivation and Background

People's reflections and opinions on real-world events:
- Multilingual opinions
- Spatiotemporal analysis
- Targeted sentiment analysis

Data: large amounts of multilingual Twitter news; connotations towards salient entities (people, countries); multiple dimensions: time, language, and country.

Models: build models to forecast language-specific connotation dynamics; track perspective change over time towards entities and events.

Multilingual Connotation Frames: A Case Study on Social Media for Targeted Sentiment Analysis and Forecast. H. Rashkin, E. Bell, Y. Choi and S. Volkova. ACL 2017.
SLIDE 21

Connotation Frames

English verb: survive. Other languages: survivre, sobrevivir, überleben…

Example tweets:
- FR: “L'incroyable miraculé des explosions à Brussels: ce Mormon avait déjà survécu aux attentats de Boston et de Paris” (“The incredible miracle survivor of the Brussels explosions: this Mormon had already survived the Boston and Paris attacks”)
- ES: “Este joven ha sobrevivido a los atentados de Boston, de París y de Bruselas” (“This young man has survived the Boston, Paris, and Brussels attacks”)
- EN: “US teenager … also survived Boston Marathon bombing”
- DE: “19-jähriger Missionar überlebt drei Terroranschläge” (“19-year-old missionary survives three terror attacks”)

Connotation frame for surviving verbs:

[Diagram: writer, reader, agent, and theme connected by perspectives P(w → agent), P(w → theme), P(agent → theme); the writer portrays the agent as sympathetic (+) and implies that the theme was something brutal (−).]

Analyze subjective roles and relationships implied by a given predicate (Rashkin et al., 2016). Extend to 10 European languages, including Polish, Finnish, and Russian.

Connotation Frames: A Data-Driven Investigation. H. Rashkin, S. Singh, Y. Choi. ACL 2016.

Perspective among the event participants Perspective of the writer towards entities
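A connotation frame can be stored as a simple mapping from perspective pairs to polarities. This sketch encodes the "survive" example from this slide: the writer is sympathetic to the agent (+) and negative about the theme (−); the agent → theme polarity is an assumption added here for illustration.

```python
# Connotation frame for "survive": the perspectives a predicate implies
# among writer (w), agent, and theme. The agent -> theme polarity is an
# assumed value for illustration; the w -> * polarities follow the slide.
survive_frame = {
    "P(w -> agent)": "+",
    "P(w -> theme)": "-",
    "P(agent -> theme)": "-",
}

def writer_perspective(frame, role):
    """Look up the writer's implied perspective toward a role."""
    return frame[f"P(w -> {role})"]
```

So a headline like "teenager survives bombing" implicitly casts the teenager positively and the bombing negatively, without any explicit sentiment words.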

SLIDE 22

Twitter Data: 1.2M Agent-Verb-Theme Tuples

Brussels bombing dataset: March 15 – March 29, 2016 (one week before and one week after March 22nd); verified news accounts.

Parsing with SyntaxNet and Universal Dependencies.* Context-based projections: the OPUS corpus (Tiedemann, 2012), using Multi-UN parallel data (Eisele and Chen, 2010) for Russian and EuroParl parallel data (Koehn, 2005).

*SyntaxNet: https://www.tensorflow.org/versions/r0.11/tutorials/syntaxnet/

Tweets per language: EN 643,004; ES 305,310; FR 85,286; PT 76,849; RU 28,511; DE 23,197; NL 14,091; IT 13,586; FI 2,859; SV 2,229; PL 2,226

Projection rule: DG(x′) = DG(argmax_{x ∈ V} Q(x | x′)), i.e., a target-language word x′ inherits the connotation label of the source word x that maximizes the translation probability Q(x | x′).

SLIDE 23

Multidimensional Perspectives in English

- Brussels attacks + aftermath
- News story about Clinton “killing” coal mines
- Arrest of terror suspect: Abdeslam accepts arrest and agrees to extradition; sentiment is most negative towards the terror suspect
- Stories in English tweets from Russia about how Apple had “assaulted” the FBI by refusing to help in their investigations
- Similar to the English tweets from Russia, the Russian tweets about Apple are also less positive than those towards most of the other entities

Polarity score: F(q) = (+1)·q₊ + 0·q₀ + (−1)·q₋ = q₊ − q₋
SLIDE 24

Model

Track public sentiment dynamics in response to a major terrorist event. Baselines: predict the mean; windowed SVM.

[Diagram: an LSTM with a fully connected softmax layer takes input sentiment distribution vectors for Mar 21–24 (UK → Brussels) and predicts the distribution for Mar 25 (UK → Brussels).]
SLIDE 25

Forecasting Results

[Charts: KL divergence of 1-day and 4-day forecasts per language (NL, PT, IT, FI, ES, EN, DE, FR, PL, RU, SV), and an LSTM vs. SVM comparison: LSTM achieves lower KL divergence than SVM (2.00 vs. 3.41 at 1 day; 1.69 vs. 3.26 at 4 days).]
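Forecasts here are scored with KL divergence between the observed and predicted sentiment distributions; lower is better. A minimal sketch (the example distributions are made up):

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two discrete distributions.
    eps smooths zero entries to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))

true_dist = [0.3, 0.4, 0.3]   # hypothetical observed distribution
pred_dist = [0.4, 0.4, 0.2]   # hypothetical forecast
score = kl_divergence(true_dist, pred_dist)  # 0 means identical
```

KL divergence is asymmetric (KL(p||q) ≠ KL(q||p) in general), so the convention of which argument is the observed distribution must be fixed when comparing models.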

SLIDE 26

Error Analysis

[Chart: error analysis comparing true positive (TP), predicted positive (PP), true negative (TN), and predicted negative (PN) proportions.]

SLIDE 27

Agenda

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 28

Motivation and Background

Prior work on lexical change:
- Long-term shift in Google Books: Gulordava and Baroni 2011; Kim et al. 2014
- Short-term shift on Twitter: Kulkarni et al. 2015; Hamilton et al. 2016a, 2016b
- Semantic narrowing: Sagi, Kaufmann, and Clark 2009
- Socially-situated nature of language: Eisenstein et al. 2014

Meaning dynamics in VK:
- 600K posts from 5K users
- 25 weeks between 09/2014 and 03/2015
- Posts are longer than tweets: 167 words
- Locations: Russia, Ukraine

***The data was collected while the author was affiliated with JHU

SLIDE 29

Approach

Overcome limitations of dynamic topic models and word clusters. Goal: measure, predict, and visualize language dynamics in social media; relate representation shift and concept drift; forecast short-term representation shift.

Differencing statistics:
- Word usage dynamics: Δυ_freq(x), Δυ_tfidf(x), computed by subtraction
- Word meaning dynamics: Δυ_embed(x) (E = 30, x = 5), consistent dimensions, cosine similarity
- For each statistic s: Δυ_s(x) = [Δt_{t1,t2}(x), …, Δt_{T−1,T}(x)], the differences between consecutive time steps
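The cosine-based meaning-shift statistic can be sketched as follows, using toy 3-dimensional embeddings in place of the E = 30 vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def representation_shift(embeddings):
    """Per-step meaning shift of one word: cosine distance
    (1 - similarity) between embeddings at consecutive time steps."""
    return [1 - cosine(embeddings[t], embeddings[t + 1])
            for t in range(len(embeddings) - 1)]

# Hypothetical weekly embeddings for one word (dimension 3 instead of E = 30)
weekly = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0]]
shifts = representation_shift(weekly)  # small shift, then a large one
```

This presupposes the embedding spaces are aligned ("consistent dimensions") across weeks; without alignment, cosine distances between time steps are not meaningful.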

SLIDE 30

Experimental Setup

Predict meaning shift from two signals, where ϱ is the learned forecasting model:
- Representation shift: predict Δυ_embed(x) from past Δυ_embed(x), i.e., Δυ̂_embed(x) = ϱ(Δυ_embed(x))
- Concept drift: predict Δυ_embed(x) from past Δυ_tfidf(x), i.e., Δυ̂_embed(x) = ϱ(Δυ_tfidf(x))

Baselines: AdaBoost with Random Forest; deep learning: Long Short-Term Memory (LSTM).

Evaluation metrics:
- Pearson correlation: r = Σ_{i=1}^{N} (z_i − z̄)(ẑ_i − ẑ̄) / √( Σ_{i=1}^{N} (z_i − z̄)² · Σ_{i=1}^{N} (ẑ_i − ẑ̄)² )
- Root mean squared error: RMSE = √( (1/N) Σ_{i=1}^{N} (z_i − ẑ_i)² ), reported × 10⁻²
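Both evaluation metrics are straightforward to implement; a sketch with hypothetical true and predicted shift values:

```python
import math

def pearson(z, z_hat):
    """Pearson correlation between true and predicted values."""
    n = len(z)
    mz, mh = sum(z) / n, sum(z_hat) / n
    num = sum((a - mz) * (b - mh) for a, b in zip(z, z_hat))
    den = math.sqrt(sum((a - mz) ** 2 for a in z) *
                    sum((b - mh) ** 2 for b in z_hat))
    return num / den

def rmse(z, z_hat):
    """Root mean squared error between true and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_hat)) / len(z))

# Hypothetical true vs. predicted meaning-shift values
z_true = [0.1, 0.2, 0.4, 0.3]
z_pred = [0.1, 0.25, 0.35, 0.3]
```

Pearson correlation rewards getting the relative ordering of shifts right, while RMSE penalizes absolute error; reporting both, as the slide does, covers both failure modes.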
SLIDE 31

Forecasting Results

[Charts: Pearson correlation for 1-week and 2-week forecasts, comparing Baseline, AdaBoost, and LSTM: meaning shift (0.44 at 1 week, 0.21 at 2 weeks) and usage shift (0.73 at 1 week, 0.50 at 2 weeks).]

SLIDE 32

Visualizing Meaning Shift: Dill

Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network. I. Stewart, D. Arendt, E. Bell and S. Volkova. ICWSM 2017.

SLIDE 33

Visualizing Newly Emerged Terms

Titushky; Donetsk People’s Republic

SLIDE 34

Demo: Spatiotemporal Text Representations on Twitter
https://esteem.labworks.org/

SLIDE 35

Social Media Analytics

Forecasting analytics tasks:
- Forecast perspective dynamics (Brussels bombings, March 2016)
- Forecast influenza and weather
- Forecast language change (Russia-Ukraine conflict, 2014 - 2015)
- Forecast future events and instability

[Diagrams: LSTM forecasting architectures, as on Slide 2.]

Open questions:
- How effectively can we predict future behavior from the past?
- Can we explain performance variations across tasks?
- How should we evaluate forecasting models?
- How can we inform models with insights from social theories?

SLIDE 36

Acknowledgements

- Kyle Shaffer, MS, Data Scientist
- Yejin Choi, PhD (co-PI), CS Department, UW
- Josh Mendoza, Data Engineer
- Hannah Rashkin, PhD Student, UW
- Jin Yea Jang, Post-Master Student
- Dustin Arendt, PhD, Scientist
- Eric Bell, MS, Data Scientist
- Nathan Hodas, PhD, Scientist

SLIDE 37

Svitlana Volkova, PhD

Senior Research Scientist
Data Sciences and Analytics Group, Computational Analytics Division, National Security Directorate
svitlana.volkova@pnnl.gov
http://www.cs.jhu.edu/~svitlana/