
SLIDE 1

Predicting the Future with Deep Learning and Signals from Social Media

SVITLANA VOLKOVA, PHD

Senior Research Scientist

Data Sciences and Analytics Group, National Security Directorate, Pacific Northwest National Laboratory

ACL Workshop on Natural Language Processing and Computational Social Science

August 10, 2017

SLIDE 2

Social Media Analytics

Predictive and forecasting analytics tasks:
- Predict deceptive news
- Identify suspicious accounts
- Predict native language
- Detect real-world events
- Forecast perspective dynamics (Brussels bombings, March 2016)
- Forecast influenza and weather
- Forecast language change (Russia-Ukraine conflict, 2014 - 2015)
- Forecast future events and instability

[Diagrams: LSTM layers over weekly ILI and social media predictors joined by a merge layer to predict weekly ILI proportions; an LSTM over event-type and entity distributions predicting the most likely event type (e.g., Conflict); a pre-trained LSTM (100 units) with dense layers (100/128 units) and a softmax layer over input embeddings (dimension 100, e.g., "russian tanks spotted in crimea today"); an LSTM/convolutional model with embedding (200 units), dense (100 units), and sigmoid/softmax activation layers combining input word sequences with network/linguistic cues via tensor concatenation; bidirectional GRUs (20 units) with an embedding layer (30 units) over byte input for binary native-language prediction (ES, IN, JA, FR, DE vs. English).]
SLIDE 3

Outline

[Diagram: the tweet-classification architecture (embedding, LSTM/convolutional, and dense layers) and the connotation-frame schema relating writer, reader, agent, and theme via P(w → agent), P(w → theme), and P(agent → theme): the predicate does not directly imply what the writer thinks of the theme; the writer portrays the agent as being unfairly opportunistic; the agent is unfairly taking advantage of the theme.]

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 4

Outline

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 5

Motivation and Background

- 62% of U.S. adults get news on social media (Pew Research, Oct 2016)
- 64% of U.S. adults said that “made-up news” has caused “a great deal of confusion” about the facts of current events (Pew Research, Dec 2016)
- Previous work on deception detection: deceptive Amazon reviews (Choi, Mihalcea), satirical news (Rubin et al., 2015), rumors (Qazvinian et al., 2011; Liu et al., 2015)

Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. S. Volkova, K. Shaffer, J. Yea Jang and N. Hodas. ACL 2017.
SLIDE 6

Deceptive News

Google Fact Checking: https://www.blog.google/topics/journalism-news/expanding-fact-checking-google/
Facebook 3rd Party Verification: http://newsroom.fb.com/news/2016/12/news-feed-fyi-addressing-hoaxes-and-fake-news/

SLIDE 7

Deceptive News Types

- Propaganda: deliberately spreads misinformation in order to appeal to certain groups
- Hoax: seeks to mislead, rather than entertain, readers for financial or political gain
- Clickbait: takes bits of true stories but insinuates and makes up other details to sow fear
- Satire: makes fun of the news with a satirical bent, or parodies the news

[Diagram: the four types arranged along an intent-to-deceive vs. no-intent-to-deceive axis: hoax, propaganda, satire, clickbait.]

SLIDE 8

Twitter News Data

[Diagram: two taxonomies of suspicious news split by intent to deceive: (1) hoax, propaganda, satire, clickbait; (2) propaganda, disinformation, clickbait, hoax, conspiracy.]

Dataset sizes: 130K tweets total, 65K suspicious; 2M suspicious tweets.

SLIDE 9

News Categorization

http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

SLIDE 10

Alternative News Categorization

http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

SLIDE 11

Annotations

Brussels bombing dataset: March 15 – March 29, 2016 (one week before and one week after March 22nd, 2016)

Account-level vs. tweet-level annotations:
- Fake news annotations: http://www.fakenewswatch.com/
- PropOrNot: http://www.propornot.com/p/the-list.html (manually verified)

Signs of propaganda:
- Tries to persuade
- Influences emotions, attitudes, opinions, and actions
- Targets audiences for political, ideological, and religious purposes
- Contains examples of selectively omitted and one-sided messages

SLIDE 12

Task Definition

Build tweet-level neural network models to differentiate between:
- Verified vs. unverified news posts (130K)
- Types of unverified news posts: propaganda, hoax, clickbait, satire (65K); disinformation, propaganda, conspiracy, clickbait, hoaxes (2M)

SLIDE 13

Model

Baselines: logistic regression with TF-IDF and Doc2Vec representations.
Our models: neural networks (RNN/CNN) with social network interactions and linguistic cues: hedging, assertive, factive, and implicative verbs.
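As a concrete illustration, the TF-IDF representation used by the logistic-regression baseline can be sketched in plain Python. The toy corpus below is hypothetical; a real pipeline would use scikit-learn's TfidfVectorizer.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute term-frequency x inverse-document-frequency weights
    for a list of tokenized documents."""
    n = len(docs)
    # document frequency: in how many docs each term appears
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

# Hypothetical two-tweet corpus
docs = [["fake", "news", "alert"], ["verified", "news", "report"]]
w = tfidf(docs)
# "news" appears in every document, so its IDF (and weight) is 0
```

Terms shared by every tweet get zero weight, which is exactly why TF-IDF highlights class-discriminative vocabulary for the classifier.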

[Diagram: input word sequences feed an embedding layer (200 units) into an LSTM/convolutional layer (100 units) and a dense layer (100 units); network/linguistic cues pass through dense layers (100 units) and are joined by tensor concatenation; a probability activation layer (sigmoid/softmax) produces the final output probabilities.]

Keras: https://keras.io/, scikit-learn: http://scikit-learn.org/stable/, Doc2Vec: https://pypi.python.org/pypi/gensim

SLIDE 14

Linguistic Analysis

- Moral Foundation Theory (Haidt and Graham, 2007; Graham et al., 2009): Harm, Care, Loyalty, Betrayal, Authority
- Biased Language (Recasens et al., 2013): assertive, factive, hedging, implicative, and report verbs
- Subjective Language (Volkova et al., 2013; Liu et al., 2005; Riloff et al., 2003)

[Per-class cue patterns: Betrayal↑, Care↑, Loyalty↓, Hedging↓, Implicative↓; Loyalty↑, Hedges↑, Subjectivity↑, Betrayal↓; Care↓, Subjective↓, Factive↓, Bias↓]
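A minimal sketch of how lexicon-based cues like these become model features: count hits against each word list. The tiny lexicons here are illustrative stand-ins, not the published hedging/factive lists.

```python
# Illustrative stand-in lexicons (not the published word lists)
HEDGES = {"may", "might", "appear", "suggest", "possibly"}
FACTIVES = {"know", "realize", "regret"}

def cue_counts(tokens):
    """Count cue-lexicon hits in a tokenized tweet."""
    toks = [t.lower() for t in tokens]
    return {
        "hedges": sum(t in HEDGES for t in toks),
        "factives": sum(t in FACTIVES for t in toks),
    }

feats = cue_counts("Officials suggest the report may possibly be fake".split())
# -> {"hedges": 3, "factives": 0}
```

These counts are the "linguistic cues" branch that the neural models concatenate with the text representation.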

SLIDE 15

Verified vs. Suspicious Prediction Results

Binary prediction with linguistic and social graph features (130K tweets, 10-fold cross-validation).

[Chart: accuracy (0.6–1.0 axis) for LR Doc2Vec, LR TF-IDF, RNN, and CNN across feature sets (text, + graph, + linguistic cues, all); reported accuracies include 0.76, 0.81, 0.93, and 0.95.]

SLIDE 16

Suspicious News Prediction Results (1)

Multi-class prediction: satire, hoaxes, clickbait, propaganda (65K).

[Chart: macro F1 (0.2–0.8 axis) for RNN, CNN, LR TF-IDF, and LR Doc2Vec across feature sets (text, + network, + linguistic markers, all); reported values include 0.63, 0.71, 0.63, and 0.66.]

SLIDE 17

Suspicious News Prediction Results (2)

Multi-class prediction: disinformation, propaganda, conspiracy, clickbait, hoaxes (2M).

[Chart: macro F1 (0.2–1.0 axis) for 4-way (no disinformation) and 5-way classification with words, + network, and + DeepWalk features; reported values include 0.67, 0.84, 0.78 and 0.65, 0.85, 0.76, along with 0.64, 0.61, 0.71, 0.92, 0.98.]

SLIDE 18

Key Findings

Neural network models jointly learn from:
- Tweet content
- Linguistic signals of bias and subjectivity
- Social network interactions

Predicting suspicious vs. verified news: linguistic signals help most. Inferring specific types of suspicious news: social interactions help most.

Future work:
- Multilingual predictions
- Multimodal inference: text + images
- Information propagation


Truth of Varying Shades: On Political Fact-Checking and Fake News. H. Rashkin, E. Choi, J. Yea Jang, S. Volkova and Y. Choi. EMNLP 2017.
SLIDE 19

Outline

[Diagram: the connotation-frame schema relating writer, reader, agent, and theme, as on Slide 3.]

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 20

Motivation and Background

People's reflections and opinions on real-world events:
- Multilingual opinions
- Spatiotemporal analysis
- Targeted sentiment analysis

Data: large amounts of multilingual Twitter news; connotations towards salient entities (people, countries); multiple dimensions: time, language, and country.

Models: build models to forecast language-specific connotation dynamics; track perspective change over time towards entities and events.

Multilingual Connotation Frames: A Case Study on Social Media for Targeted Sentiment Analysis and Forecast. H. Rashkin, E. Bell, Y. Choi and S. Volkova. ACL 2017.
SLIDE 21

Connotation Frames

English verb: survive. Other languages: survivre, sobrevivir, überleben…

Example tweets:
- FR: “L'incroyable miraculé des explosions à Brussels: ce Mormon avait déjà survécu aux attentats de Boston et de Paris” (“The incredible miracle survivor of the Brussels explosions: this Mormon had already survived the Boston and Paris attacks”)
- ES: “Este joven ha sobrevivido a los atentados de Boston, de París y de Bruselas” (“This young man has survived the Boston, Paris, and Brussels attacks”)
- EN: “US teenager … also survived Boston Marathon bombing”
- DE: “19-jähriger Missionar überlebt drei Terroranschläge” (“19-year-old missionary survives three terror attacks”)

Connotation frame for surviving verbs:

[Diagram: writer, reader, agent, and theme connected by perspectives P(w → agent), P(w → theme), P(agent → theme); the writer portrays the agent as sympathetic (+) and implies that the theme was something brutal (−).]

Analyze subjective roles and relationships implied by a given predicate (Rashkin et al., 2016). Extend to 10 European languages, including Polish, Finnish, and Russian.

Connotation Frames: A Data-Driven Investigation. H. Rashkin, S. Singh, Y. Choi. ACL 2016.

Perspective among the event participants Perspective of the writer towards entities
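A connotation frame can be stored as a simple mapping from perspective pairs to polarities. This sketch encodes the "survive" example from this slide: the writer is sympathetic to the agent (+) and negative about the theme (−); the agent → theme polarity is an assumption added here for illustration.

```python
# Connotation frame for "survive": the perspectives a predicate implies
# among writer (w), agent, and theme. The agent -> theme polarity is an
# assumed value for illustration; the w -> * polarities follow the slide.
survive_frame = {
    "P(w -> agent)": "+",
    "P(w -> theme)": "-",
    "P(agent -> theme)": "-",
}

def writer_perspective(frame, role):
    """Look up the writer's implied perspective toward a role."""
    return frame[f"P(w -> {role})"]
```

So a headline like "teenager survives bombing" implicitly casts the teenager positively and the bombing negatively, without any explicit sentiment words.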

SLIDE 22

Twitter Data: 1.2M Agent-Verb-Theme Tuples

Brussels bombing dataset: March 15 – March 29, 2016 (one week before and one week after March 22nd); verified news accounts.

Parsing with SyntaxNet and Universal Dependencies.* Context-based projections: the OPUS corpus (Tiedemann, 2012), using Multi-UN parallel data (Eisele and Chen, 2010) for Russian and EuroParl parallel data (Koehn, 2005).

*SyntaxNet: https://www.tensorflow.org/versions/r0.11/tutorials/syntaxnet/

Tweets per language: EN 643,004; ES 305,310; FR 85,286; PT 76,849; RU 28,511; DE 23,197; NL 14,091; IT 13,586; FI 2,859; SV 2,229; PL 2,226

Projection rule: DG(x′) = DG(argmax_{x ∈ V} Q(x | x′)), i.e., a target-language word x′ inherits the connotation label of the source word x that maximizes the translation probability Q(x | x′).

SLIDE 23

Multidimensional Perspectives in English

- Brussels attacks + aftermath
- News story about Clinton “killing” coal mines
- Arrest of terror suspect: Abdeslam accepts arrest and agrees to extradition; sentiment is most negative towards the terror suspect
- Stories in English tweets from Russia about how Apple had “assaulted” the FBI by refusing to help in their investigations
- Similar to the English tweets from Russia, the Russian tweets about Apple are also less positive than those towards most of the other entities

Polarity score: F(q) = (+1)·q₊ + 0·q₀ + (−1)·q₋ = q₊ − q₋
SLIDE 24

Model

Track public sentiment dynamics in response to a major terrorist event. Baselines: predict the mean; windowed SVM.

[Diagram: an LSTM with a fully connected softmax layer takes input sentiment distribution vectors for Mar 21–24 (UK → Brussels) and predicts the distribution for Mar 25 (UK → Brussels).]
SLIDE 25

Forecasting Results

[Charts: KL divergence of 1-day and 4-day forecasts per language (NL, PT, IT, FI, ES, EN, DE, FR, PL, RU, SV), and an LSTM vs. SVM comparison: LSTM achieves lower KL divergence than SVM (2.00 vs. 3.41 at 1 day; 1.69 vs. 3.26 at 4 days).]
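Forecasts here are scored with KL divergence between the observed and predicted sentiment distributions; lower is better. A minimal sketch (the example distributions are made up):

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two discrete distributions.
    eps smooths zero entries to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))

true_dist = [0.3, 0.4, 0.3]   # hypothetical observed distribution
pred_dist = [0.4, 0.4, 0.2]   # hypothetical forecast
score = kl_divergence(true_dist, pred_dist)  # 0 means identical
```

KL divergence is asymmetric (KL(p||q) ≠ KL(q||p) in general), so the convention of which argument is the observed distribution must be fixed when comparing models.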

SLIDE 26

Error Analysis

[Chart: error analysis comparing true positive (TP), predicted positive (PP), true negative (TN), and predicted negative (PN) proportions.]

SLIDE 27

Agenda

Predicting Suspicious and Trusted News on Twitter

(joint work with K. Shaffer, J. Yang, and N. Hodas)

Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi) Forecasting Short-Term Change in Text Representations during Crisis Events from VK

(joint work with I. Stewart, D. Arendt, and E. Bell)

SLIDE 28

Motivation and Background

Prior work on lexical change:
- Long-term shift in Google Books: Gulordava and Baroni 2011; Kim et al. 2014
- Short-term shift on Twitter: Kulkarni et al. 2015; Hamilton et al. 2016a, 2016b
- Semantic narrowing: Sagi, Kaufmann, and Clark 2009
- Socially-situated nature of language: Eisenstein et al. 2014

Meaning dynamics in VK:
- 600K posts from 5K users
- 25 weeks between 09/2014 and 03/2015
- Posts are longer than tweets: 167 words
- Locations: Russia, Ukraine

***The data was collected while the author was affiliated with JHU

SLIDE 29

Approach

Overcome limitations of dynamic topic models and word clusters. Goal: measure, predict, and visualize language dynamics in social media; relate representation shift and concept drift; forecast short-term representation shift.

Differencing statistics:
- Word usage dynamics: Δυ_freq(x), Δυ_tfidf(x), computed by subtraction
- Word meaning dynamics: Δυ_embed(x) (E = 30, x = 5), consistent dimensions, cosine similarity
- For each statistic s: Δυ_s(x) = [Δt_{t1,t2}(x), …, Δt_{T−1,T}(x)], the differences between consecutive time steps
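The cosine-based meaning-shift statistic can be sketched as follows, using toy 3-dimensional embeddings in place of the E = 30 vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def representation_shift(embeddings):
    """Per-step meaning shift of one word: cosine distance
    (1 - similarity) between embeddings at consecutive time steps."""
    return [1 - cosine(embeddings[t], embeddings[t + 1])
            for t in range(len(embeddings) - 1)]

# Hypothetical weekly embeddings for one word (dimension 3 instead of E = 30)
weekly = [[1.0, 0.0, 0.0], [1.0, 0.1, 0.0], [0.0, 1.0, 0.0]]
shifts = representation_shift(weekly)  # small shift, then a large one
```

This presupposes the embedding spaces are aligned ("consistent dimensions") across weeks; without alignment, cosine distances between time steps are not meaningful.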

SLIDE 30

Experimental Setup

Predict meaning shift from two signals, where ϱ is the learned forecasting model:
- Representation shift: predict Δυ_embed(x) from past Δυ_embed(x), i.e., Δυ̂_embed(x) = ϱ(Δυ_embed(x))
- Concept drift: predict Δυ_embed(x) from past Δυ_tfidf(x), i.e., Δυ̂_embed(x) = ϱ(Δυ_tfidf(x))

Baselines: AdaBoost with Random Forest; deep learning: Long Short-Term Memory (LSTM).

Evaluation metrics:
- Pearson correlation: r = Σ_{i=1}^{N} (z_i − z̄)(ẑ_i − ẑ̄) / √( Σ_{i=1}^{N} (z_i − z̄)² · Σ_{i=1}^{N} (ẑ_i − ẑ̄)² )
- Root mean squared error: RMSE = √( (1/N) Σ_{i=1}^{N} (z_i − ẑ_i)² ), reported × 10⁻²
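Both evaluation metrics are straightforward to implement; a sketch with hypothetical true and predicted shift values:

```python
import math

def pearson(z, z_hat):
    """Pearson correlation between true and predicted values."""
    n = len(z)
    mz, mh = sum(z) / n, sum(z_hat) / n
    num = sum((a - mz) * (b - mh) for a, b in zip(z, z_hat))
    den = math.sqrt(sum((a - mz) ** 2 for a in z) *
                    sum((b - mh) ** 2 for b in z_hat))
    return num / den

def rmse(z, z_hat):
    """Root mean squared error between true and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_hat)) / len(z))

# Hypothetical true vs. predicted meaning-shift values
z_true = [0.1, 0.2, 0.4, 0.3]
z_pred = [0.1, 0.25, 0.35, 0.3]
```

Pearson correlation rewards getting the relative ordering of shifts right, while RMSE penalizes absolute error; reporting both, as the slide does, covers both failure modes.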
SLIDE 31

Forecasting Results

[Charts: Pearson correlation for 1-week and 2-week forecasts, comparing Baseline, AdaBoost, and LSTM: meaning shift (0.44 at 1 week, 0.21 at 2 weeks) and usage shift (0.73 at 1 week, 0.50 at 2 weeks).]

SLIDE 32

Visualizing Meaning Shift: Dill

Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network. I. Stewart, D. Arendt, E. Bell and S. Volkova. ICWSM 2017.

SLIDE 33

Visualizing Newly Emerged Terms

Titushky; Donetsk People’s Republic

SLIDE 34

Demo: Spatiotemporal Text Representations on Twitter
https://esteem.labworks.org/

SLIDE 35

Social Media Analytics

Forecasting analytics tasks:
- Forecast perspective dynamics (Brussels bombings, March 2016)
- Forecast influenza and weather
- Forecast language change (Russia-Ukraine conflict, 2014 - 2015)
- Forecast future events and instability

[Diagrams: LSTM forecasting architectures, as on Slide 2.]

Open questions:
- How effectively can we predict future behavior from the past?
- Can we explain performance variations across tasks?
- How should we evaluate forecasting models?
- How can we inform models with insights from social theories?

SLIDE 36

Acknowledgements

- Kyle Shaffer, MS, Data Scientist
- Yejin Choi, PhD (co-PI), CS Department, UW
- Josh Mendoza, Data Engineer
- Hannah Rashkin, PhD Student, UW
- Jin Yea Jang, Post-Master Student
- Dustin Arendt, PhD, Scientist
- Eric Bell, MS, Data Scientist
- Nathan Hodas, PhD, Scientist

SLIDE 37

Svitlana Volkova, PhD

Senior Research Scientist
Data Sciences and Analytics Group, Computational Analytics Division, National Security Directorate
svitlana.volkova@pnnl.gov
http://www.cs.jhu.edu/~svitlana/