An Interpretable Joint Graphical Model for Fact-Checking from Crowds

slide-1
SLIDE 1

An Interpretable Joint Graphical Model for Fact-Checking from Crowds

An T. Nguyen¹, Aditya Kharosekar¹, Matthew Lease¹, Byron C. Wallace²

¹ University of Texas at Austin, ² Northeastern University

slide-8
SLIDE 8

Problems

Given a claim:
Facebook Shut Down an AI Experiment Because Chatbots Developed Their Own Language.

and relevant article headlines:
No, Facebook Did Not Panic and Shut Down an AI Program That Was Getting Dangerously Smart. (source: gizmodo.com)

Predict headline stance: For / Against / Observing
Predict claim veracity: False / True / Unknown

Our motivation:

◮ Make sense of general claims, incl. scientific, historical, ...
◮ Not just “fake news”.
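One way to picture the task's inputs and outputs is as a pair of small record types. This is only a sketch: the names `Stance`, `Veracity`, `Headline`, and `Claim` are hypothetical and do not come from the authors' code. The stance and veracity fields start empty because they are what the system has to predict.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Stance(Enum):
    """Stance of a headline toward a claim (the three options on the slide)."""
    FOR = "for"
    AGAINST = "against"
    OBSERVING = "observing"


class Veracity(Enum):
    """Veracity of a claim (the three options on the slide)."""
    FALSE = "false"
    TRUE = "true"
    UNKNOWN = "unknown"


@dataclass
class Headline:
    text: str
    source: str                          # e.g. the publishing domain
    stance: Optional[Stance] = None      # unknown until predicted


@dataclass
class Claim:
    text: str
    headlines: List[Headline] = field(default_factory=list)
    veracity: Optional[Veracity] = None  # unknown until predicted


# The running example from the slide, with both outputs still to be predicted.
example = Claim(
    text="Facebook Shut Down an AI Experiment Because Chatbots "
         "Developed Their Own Language.",
    headlines=[
        Headline(
            text="No, Facebook Did Not Panic and Shut Down an AI Program "
                 "That Was Getting Dangerously Smart.",
            source="gizmodo.com",
        )
    ],
)
```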

slide-15
SLIDE 15

Solutions

Previous work:

◮ Predict stance from text features (Ferreira & Vlachos 2016).
◮ Predict veracity from stance + source features (Popat et al. 2017).

We propose:

◮ Crowdsource stance labels.
  ◮ Hybrid human-AI
  ◮ Available in near real time
◮ Joint graphical model of stance, veracity, and annotators.
  ◮ Interaction between variables
  ◮ Interpretable

slide-19
SLIDE 19

Model

1. Predict stance S
  ◮ Text features T
2. Predict veracity V
  ◮ Stance S
  ◮ Reputation R
3. Stance label L
  ◮ True stance S
  ◮ Annotator competence A

[Figure: plate diagram of the graphical model with nodes V, S, T, W, U, L, A, B, R and plates over n claims, m sources, c labelers.]
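The three dependencies listed above can be illustrated with a toy forward-sampling sketch. Everything concrete here is an assumption made for illustration: stance is reduced to two values (+1 for, -1 against), the logistic forms are guesses, and the function names are hypothetical. The slides do not state the roles of W, U, and B, so only a single weight vector is used. This is not the paper's parameterization.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_stance(text_features, weights, rng):
    """Step 1 (assumed logistic form): stance S from text features T.

    Returns +1 (for) or -1 (against).
    """
    score = sum(w * t for w, t in zip(weights, text_features))
    return 1 if rng.random() < sigmoid(score) else -1


def sample_veracity(stances, reputations, rng):
    """Step 2 (assumed form): veracity V from stances S weighted by reputation R.

    Returns True if the claim is sampled as true.
    """
    score = sum(r * s for r, s in zip(reputations, stances))
    return rng.random() < sigmoid(score)


def sample_label(stance, competence, rng):
    """Step 3 (assumed form): crowd label L given true stance S.

    The annotator reports the true stance with probability `competence` (A),
    and the opposite stance otherwise.
    """
    return stance if rng.random() < competence else -stance


# Toy usage: two articles on one claim, each labeled by one annotator.
rng = random.Random(0)
stances = [sample_stance([1.0, 0.5], [0.8, -0.2], rng) for _ in range(2)]
veracity = sample_veracity(stances, reputations=[1.5, 0.5], rng=rng)
labels = [sample_label(s, competence=0.9, rng=rng) for s in stances]
```

Because each step is a small conditional distribution, the learned reputation and competence values can be read off directly, which is one plausible sense in which such a model is interpretable.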

slide-23
SLIDE 23

Inference & Learning

Inference:

◮ Gibbs sampling: accurate but slow.
◮ Variational inference: fast but biased.

Learning: Expectation Maximization.

Details in the paper.
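The slides defer the details to the paper, so the following is only a rough illustration of Expectation Maximization on a much simpler crowd model than the one presented. It assumes binary stance labels and a "one-coin" annotator model, where each annotator reports the true label with a single competence probability. The function name `em_one_coin`, the data layout, the uniform prior, and the starting value of 0.7 are all assumptions.

```python
def em_one_coin(labels, n_annotators, n_iters=50):
    """EM for a one-coin crowd model with binary labels (illustrative only).

    labels: dict mapping item id -> list of (annotator index, label in {0, 1}).
    Returns (posterior, competence), where posterior[item] is the estimated
    probability that the item's true label is 1.
    """
    competence = [0.7] * n_annotators  # assumed initial guess
    posterior = {}
    for _ in range(n_iters):
        # E-step: posterior over each true label given current competences.
        for item, votes in labels.items():
            p1 = p0 = 0.5  # assumed uniform prior over the true label
            for annotator, label in votes:
                c = competence[annotator]
                p1 *= c if label == 1 else 1.0 - c
                p0 *= c if label == 0 else 1.0 - c
            posterior[item] = p1 / (p1 + p0)

        # M-step: competence = expected fraction of labels matching the truth.
        agree = [0.0] * n_annotators
        total = [0] * n_annotators
        for item, votes in labels.items():
            q = posterior[item]
            for annotator, label in votes:
                agree[annotator] += q if label == 1 else 1.0 - q
                total[annotator] += 1
        for a in range(n_annotators):
            if total[a] > 0:
                # Clip away from 0 and 1 to keep the E-step well defined.
                competence[a] = min(max(agree[a] / total[a], 1e-3), 1.0 - 1e-3)
    return posterior, competence


# Toy data: annotators 0 and 1 always agree; annotator 2 agrees half the time.
toy_labels = {
    "h1": [(0, 1), (1, 1), (2, 0)],
    "h2": [(0, 0), (1, 0), (2, 1)],
    "h3": [(0, 1), (1, 1), (2, 1)],
    "h4": [(0, 0), (1, 0), (2, 0)],
}
toy_posterior, toy_competence = em_one_coin(toy_labels, n_annotators=3)
```

On this toy data the two consistent annotators end up with a high estimated competence, and the third, who agrees with them only half the time, ends up near chance.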

slide-27
SLIDE 27

Evaluation

Data: Emergent (Ferreira and Vlachos 2016)

◮ 300 claims.
◮ 2595 articles with stance labels.
◮ We collected crowd stance labels via Mechanical Turk.

Baseline: separate models for stance, veracity, and crowd labels.

Metric: Brier score, which measures accuracy and probability calibration.
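For reference, a minimal sketch of a multi-class Brier score follows. The slides do not say which variant was used, so this assumes the common definition: the squared difference between the predicted probability vector and the one-hot true label, summed over classes and averaged over instances. Lower is better, with 0 for a confident correct prediction and 2 for a confident wrong one. The function name and argument layout are hypothetical.

```python
def brier_score(probs, true_labels):
    """Mean multi-class Brier score.

    probs: list of per-instance probability lists, one entry per class.
    true_labels: list of true class indices, aligned with probs.
    """
    total = 0.0
    for p, y in zip(probs, true_labels):
        # Squared error against the one-hot encoding of the true class.
        total += sum(
            (pk - (1.0 if k == y else 0.0)) ** 2 for k, pk in enumerate(p)
        )
    return total / len(probs)


# Three classes, e.g. indices 0/1/2 for False/True/Unknown (an assumed order).
confident_right = brier_score([[1.0, 0.0, 0.0]], [0])
confident_wrong = brier_score([[0.0, 1.0, 0.0]], [0])
uniform_guess = brier_score([[1 / 3, 1 / 3, 1 / 3]], [0])
```

A uniform guess over three classes scores 2/3, which shows how the metric rewards calibrated uncertainty over confident errors.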

slide-28
SLIDE 28

Results

slide-33
SLIDE 33

User study

Interface: users enter claims, see predictions.

A/B testing:

◮ A: see only veracity predictions
◮ B: also see explanation (reputation, stances)

slide-34
SLIDE 34

User study: results

slide-41
SLIDE 41

Conclusion

Takeaway:

◮ Stance/veracity prediction is hard.
◮ We contribute: crowdsourcing + joint modeling.

Paper: experiments on Snopes.
Demo: fcweb.pythonanywhere.com
We share code + data.
Acknowledgments: crowd annotators, reviewers, NSF.

Questions?