“DeClarE” Debunking Fake News and False Claims using Evidence-Aware Deep Learning
EMNLP-2018 Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum
Motivation
“Rapid spread of misinformation online” – one of the top 10 challenges as per the World Economic Forum
Many truth-checking websites manually verify/falsify claims
1 https://www.aljazeera.com/news/2018/10/bolsonaro-continues-lead-polls-fake-news-scandal-181019220347524.html
2 https://www.nytimes.com/2018/10/17/opinion/deep-fake-technology-democracy.html
Truth Finding
Conflict resolution amongst multi-source structured data
Joint inference of source reliability and truth
Communities & Social Media
Probabilistic graphical models
Social network analysis
Natural Language Claims
Supervised approaches
Limited to structured data
Focused only on communities
Community specific features
No external evidence
Substantial feature modeling
No linguistic cues
Assess the credibility (true/false) of textual claims
Presents interpretable evidence supporting the assessment
[Figure: Textual Claim → DeClarE* → True / False, with Evidence from the World Wide Web]
*DeClarE: Debunking Claims with Interpretable Evidence
Tucker Carlson: “Far more children died last year drowning in their bathtubs than were killed accidentally by guns.”
conservativeflashnews.com: “President Obama ordered a life-sized bronze statue of himself to be permanently installed at the White House”
Coca-Cola’s original diet cola drink, TaB, took its name from an acronym for “totally artificial beverage”.
conservativeflashnews.com: “President Obama ordered a life-sized bronze statue of himself to be permanently installed at the White House”
ABC News: An article making the rounds on Facebook falsely says that a bronze statue of former President Barack Obama will soon be in the entryway of the White House. But you won't be seeing it any time soon -- or any time at all. The story is false.
The Florida Times: The emails have made their way across the internet. But reports that Obama ordered a $200,000 life-size bronze statue of himself to be “permanently installed in the White House” are totally false.
Motivation
Problem Statement
Key Contributors
Network Architecture & Approach
Experiments & Results
Conclusion
Evidence – Search Engine
Language style and semantics of evidence – biLSTM
Interaction between claim and external evidence – Attention Mechanism
Trustworthiness of underlying sources – Claim and Evidence Source Embeddings
Claim and article: sequences of word embeddings
Claim source and article source: source embeddings
$c_l \in \mathbb{R}^{d_w}$, $[a_k] \in \mathbb{R}^{d_w}$, $c_s \in \mathbb{R}^{d_s}$, $a_s \in \mathbb{R}^{d_s}$
Language style aware article representation
A biLSTM – hidden state output $[h_k]$ for each word $[a_k]$ in the evidence
Importance of each word in the article text w.r.t. the claim
Overall claim representation: $\bar{c} = \frac{1}{l} \sum_{l} c_l$
Append $\bar{c}$ to each article term: $\hat{a}_k = a_k \oplus \bar{c}$
Claim specific attention weights: $[\alpha_k]$
Attention focused article representation
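The attention steps on the preceding slides (average the claim word embeddings, append that average to every article term, derive claim-specific weights, and form an attention-focused article representation) can be sketched in plain Python. This is only an illustration under stated assumptions: the slides do not give the scoring function, so the tanh-of-a-linear-map scorer, the names `attend`, `score_weights` and `score_bias`, and the plain weighted sum over the biLSTM hidden states are all hypothetical choices, and the toy vectors are made up.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def softmax(scores):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(claim_words, article_words, hidden_states, score_weights, score_bias=0.0):
    """Return (attention weights, attention-focused article vector)."""
    # Overall claim representation: mean of the claim word embeddings.
    claim_mean = mean_vector(claim_words)
    # Append the claim representation to every article word embedding.
    joined = [list(word) + claim_mean for word in article_words]
    # ASSUMPTION: one scalar score per word from tanh(w . x + b).
    scores = [
        math.tanh(sum(w * x for w, x in zip(score_weights, vec)) + score_bias)
        for vec in joined
    ]
    weights = softmax(scores)
    # ASSUMPTION: combine hidden states as a weighted sum, one value per dimension.
    focused = [
        sum(wt * h for wt, h in zip(weights, column))
        for column in zip(*hidden_states)
    ]
    return weights, focused

# Toy inputs: 2 claim words and 3 article words (dimension 2), 3 hidden states (dimension 3).
claim_words = [[1.0, 0.0], [0.0, 1.0]]
article_words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hidden_states = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
score_weights = [0.5, -0.5, 0.25, 0.25]  # length = word dimension + claim dimension

weights, focused = attend(claim_words, article_words, hidden_states, score_weights)
```

Because the weights come from a softmax, each article word receives a positive share of the attention and the shares sum to one, which is what makes them readable as per-word importance.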
Per-article credibility score
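The slide names only the per-article score. As a rough sketch of how such scores might be turned into one decision for a claim, the snippet below averages them and applies a threshold. Both the averaging and the 0.5 threshold are assumptions made for illustration, and `claim_credibility` is a hypothetical helper, not a function from the authors' code.

```python
def claim_credibility(article_scores, threshold=0.5):
    """Combine per-article credibility scores into one claim-level verdict.

    ASSUMPTION: scores lie in [0, 1] and are combined by a simple mean.
    """
    if not article_scores:
        raise ValueError("need at least one per-article score")
    mean_score = sum(article_scores) / len(article_scores)
    label = "true" if mean_score >= threshold else "false"
    return mean_score, label

# Toy scores for three evidence articles about one claim.
scores = [0.25, 0.5, 0.0]
mean_score, label = claim_credibility(scores)
```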
Case Studies
Snopes (SN) – classification (~4300 claims)
PolitiFact (PF) – classification (~3500 claims)
NewsTrust (NT) – regression (~5344 news headlines)
SemEval-2017 Task (SE) – classification (~250 tweets)
Analysis
Source embeddings
Attention weights
Evaluation:
10% of the data for parameter tuning
10-fold cross-validation on 90% of the data
Keras with TensorFlow backend
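The evaluation protocol above (hold out 10% for parameter tuning, then run 10-fold cross-validation on the remaining 90%) can be sketched with index lists alone. The function name `tuning_and_folds`, the fixed seed, and the plain random (unstratified) shuffle are assumptions. The slides do not say how the split was drawn.

```python
import random

def tuning_and_folds(n_items, n_folds=10, tuning_fraction=0.1, seed=0):
    """Split item indices into a tuning set and cross-validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # ASSUMPTION: simple random split
    n_tuning = round(n_items * tuning_fraction)
    tuning = indices[:n_tuning]
    rest = indices[n_tuning:]
    # Deal the remaining indices round-robin into n_folds folds.
    folds = [rest[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return tuning, splits

# Roughly the size of the Snopes dataset mentioned in the slides.
tuning, splits = tuning_and_folds(4300)
```

Each of the ten `(train, test)` pairs uses one fold for testing and the other nine for training, and the tuning indices never appear in either.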
Snopes (~4300 claims)
Verifies Internet rumors, hoaxes, and other claims
PolitiFact (~3500 claims)
Verifies political claims made by politicians in the USA
Extracted ~30 top search results as evidence
“The use of solar panels drains the sun”
“Entering your PIN in reverse at any ATM will automatically summon the police”
Hillary Clinton: “The gun epidemic is the leading cause of death of young African-American men, more than the next nine causes put together.”
Baselines
LSTM-Text (Rashkin et al., 2017) – no usage of evidence
CNN-Text (Wang, 2017) – no usage of evidence
DistantSup (Popat et al., 2017)
DeClarE – Our Approach
Performance measures
per-class accuracies, macro F1, AUC
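Two of the measures listed above, per-class accuracy and macro F1, are simple enough to compute by hand. The sketch below is a minimal pure-Python version for illustration. The helper names and the toy labels are made up, and AUC is left out because it needs ranked scores rather than hard labels.

```python
def per_class_accuracy(y_true, y_pred, label):
    """Fraction of items whose gold label is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        raise ValueError("no items with gold label %r" % (label,))
    return sum(p == label for p in relevant) / len(relevant)

def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores over the gold labels."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)

# Toy gold labels and predictions for five claims.
y_true = ["true", "true", "true", "false", "false"]
y_pred = ["true", "true", "false", "false", "true"]
score = macro_f1(y_true, y_pred)
```

Macro averaging gives the true and false classes equal weight regardless of how many claims fall in each, which matters when the classes are imbalanced.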
Dataset     Configuration  Macro-F1  AUC
Snopes      LSTM-Text      0.66      0.70
Snopes      CNN-Text       0.66      0.72
Snopes      DistantSup     0.82      0.88
Snopes      DeClarE        0.79      0.86
PolitiFact  LSTM-Text      0.63      0.66
PolitiFact  CNN-Text       0.64      0.67
PolitiFact  DistantSup     0.62      0.68
PolitiFact  DeClarE        0.68      0.75
News review community – members review news articles
Each story: article, article source, user reviews and ratings (scale 1 to 5)
Title of the article – claim
Article source – claim source
User reviews – evidence
User IDs – evidence sources
Regression task – predict the credibility score
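The mapping from a community story to the claim/evidence setup described above can be written down as a small data structure. This is a sketch only: the class names `Story` and `ClaimInstance`, their field names, and the sample values are hypothetical and are not taken from the NewsTrust data format.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Story:
    title: str
    source: str
    reviews: List[Tuple[str, str]]  # (user id, review text) pairs
    rating: float                   # credibility rating on a 1 to 5 scale

@dataclass
class ClaimInstance:
    claim: str
    claim_source: str
    evidence: List[str]
    evidence_sources: List[str]
    target: float

def to_claim_instance(story: Story) -> ClaimInstance:
    """Map a story onto the claim/evidence roles listed on the slide."""
    return ClaimInstance(
        claim=story.title,                                 # title of the article -> claim
        claim_source=story.source,                         # article source -> claim source
        evidence=[text for _, text in story.reviews],      # user reviews -> evidence
        evidence_sources=[uid for uid, _ in story.reviews],  # user IDs -> evidence sources
        target=story.rating,                               # score to predict (regression)
    )

# Made-up example story.
story = Story(
    title="Example headline",
    source="example-news.org",
    reviews=[("user_1", "Well sourced."), ("user_2", "Lacks context.")],
    rating=3.5,
)
instance = to_claim_instance(story)
```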
Additional baseline:
CCRF+SVR (Mukherjee and Weikum, 2015)
Performance measure – Mean Squared Error (MSE)
Configuration  MSE
CNN-Text       0.53
CCRF+SVR       0.36
LSTM-Text      0.35
DistantSup     0.35
DeClarE        0.29
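For reference, the MSE measure used for this regression task is the average of the squared differences between predicted and actual scores, so lower is better. A minimal sketch follows, with made-up ratings and predictions; the function name is illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between gold and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy credibility ratings and model predictions.
ratings = [4.0, 2.5, 3.0]
predictions = [3.5, 2.5, 4.0]
mse = mean_squared_error(ratings, predictions)
```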
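Analyzing Article Source Embeddings
[Figure: article source embeddings, with groups labeled Authentic Sources and Fake Sources]
Analyzing Claim Source Embeddings
[Figure: claim source embeddings, with groups labeled Republicans and Democrats]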
Proposed an end-to-end neural network model
No feature modeling
Provides interpretable evidence
Experiments on real-world claims demonstrate the effectiveness of our approach
Considering external evidence helps!
Datasets: https://www.mpi-inf.mpg.de/dl-cred-analysis/