An Interpretable Joint Graphical Model for Fact-Checking from Crowds


  1. An Interpretable Joint Graphical Model for Fact-Checking from Crowds
     An T. Nguyen (1), Aditya Kharosekar (1), Matthew Lease (1), Byron C. Wallace (2)
     (1) University of Texas at Austin; (2) Northeastern University

  2. Problems
     Given a claim:
       Facebook Shut Down an AI Experiment Because Chatbots Developed Their Own Language.
     and relevant article headlines:
       No, Facebook Did Not Panic and Shut Down an AI Program That Was Getting Dangerously Smart. (source: gizmodo.com)
     Predict headline stance: For, Against, or Observing.
     Predict claim veracity: False, True, or Unknown.
     Our motivation:
     ◮ Make sense of general claims, incl. scientific, historical, ...
     ◮ Not just “fake news”.

  3. Solutions
     Previous work:
     ◮ Predict stance from text features (Ferreira & Vlachos 2016).
     ◮ Predict veracity from stance + source features (Popat et al. 2017).
     We proposed:
     ◮ Crowdsource stance labels.
       ◮ Hybrid human-AI
       ◮ Available near real-time
     ◮ Joint graphical model of stance, veracity, annotators.
       ◮ Interaction between variables
       ◮ Interpretable

  4. Model
     [Figure: plate diagram of the graphical model with nodes B, U, L, A, V, S, W, T, R and plates over c labelers, n claims, and m sources.]
     1. Predict Stance S
        ◮ Text features T
     2. Predict Veracity V
        ◮ Stance S
        ◮ Reputation R
     3. Stance Label L
        ◮ True stance S
        ◮ Annotator competence A
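
The three dependencies listed on the Model slide (stance S from text features T, veracity V from stance S and reputation R, crowd label L from true stance S and annotator competence A) can be illustrated with a small forward-sampling sketch. This is not the authors' parameterization: the softmax forms, the stance weights `W`, the sign encoding in `STANCE_SIGN`, and the "correct with probability A, otherwise a random wrong stance" label rule are all assumptions made here for illustration, and the roles of the diagram nodes B, U, and W are not reproduced.

```python
import numpy as np

STANCES = ("for", "against", "observing")
VERACITY = ("false", "true", "unknown")
# Assumed effect of each stance on the claim: for = +1, against = -1, observing = 0.
STANCE_SIGN = np.array([1.0, -1.0, 0.0])


def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()


def sample_claim(T, W, R, A, rng):
    """Forward-sample one claim under the illustrative assumptions above.

    T: (m, d) text features, one row per article
    W: (d, 3) hypothetical stance weights
    R: (m,)  reputation of each article's source
    A: (c,)  annotator competences in [0, 1]
    """
    m = T.shape[0]
    # 1. Stance S depends on text features T.
    S = np.array([rng.choice(3, p=softmax(T[j] @ W)) for j in range(m)])
    # 2. Veracity V depends on stance S weighted by reputation R.
    score = float(np.sum(R * STANCE_SIGN[S]))
    V = int(rng.choice(3, p=softmax([-score, score, 0.0])))
    # 3. Crowd label L depends on true stance S and annotator competence A.
    L = np.empty((m, len(A)), dtype=int)
    for j in range(m):
        for a, competence in enumerate(A):
            if rng.random() < competence:
                L[j, a] = S[j]  # competent answer: the true stance
            else:
                wrong = [s for s in range(3) if s != S[j]]
                L[j, a] = rng.choice(wrong)  # otherwise a random wrong stance
    return S, V, L
```

Sampling with `A = np.array([1.0, 1.0])` makes every crowd label equal the true stance, which is a quick sanity check on step 3.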

  5. Inference & Learning
     Inference:
     ◮ Gibbs sampling: accurate but slow.
     ◮ Variational inference: fast but biased.
     Learning: Expectation Maximization.
     Details in the paper.
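
The slide defers the details to the paper, so the following is only a sketch of how Expectation Maximization alternates between inferring hidden stances and re-estimating annotator competence. It swaps in a much simpler model than the joint one: a "one-coin" crowd model with a uniform prior over stances, where each annotator is correct with a single probability. The function name `em_one_coin` and all of its parameters are hypothetical.

```python
import numpy as np


def em_one_coin(L, n_classes=3, n_iter=50):
    """EM for a one-coin crowd model (an assumption, not the paper's model).

    L: (n_items, n_annotators) integer labels in [0, n_classes).
    Returns (post, comp): per-item posteriors over the true class and
    per-annotator competence estimates.
    """
    n_items, n_ann = L.shape
    comp = np.full(n_ann, 0.7)  # initial competence guess
    for _ in range(n_iter):
        # E-step: posterior over each item's true class, uniform prior.
        log_post = np.zeros((n_items, n_classes))
        for k in range(n_classes):
            match = L == k
            log_post[:, k] = np.sum(
                np.where(match, np.log(comp), np.log((1 - comp) / (n_classes - 1))),
                axis=1,
            )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: competence = expected fraction of an annotator's labels
        # that agree with the (soft) true class.
        agree = post[np.arange(n_items)[:, None], L]
        comp = np.clip(agree.mean(axis=0), 1e-3, 1 - 1e-3)
    return post, comp
```

With three annotators who always agree and a fourth who always disagrees, the estimated competence of the first three should rise toward 1 and the fourth should fall toward 0.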

  6. Evaluation
     Data: Emergent (Ferreira and Vlachos 2016)
     ◮ 300 claims.
     ◮ 2595 articles with stance labels.
     ◮ We collected: crowd stance labels from Mechanical Turk.
     Baseline: Separate models for stance, veracity & crowd labels.
     Metric: Brier score, which measures both accuracy and probability calibration.
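
As a reminder of what the metric computes, here is a minimal sketch of the multi-class Brier score: the mean, over examples, of the squared distance between the predicted probability vector and the one-hot true label. The slides do not say which variant was used (some definitions divide by the number of classes), so the unnormalized form below is an assumption, and the name `brier_score` is illustrative.

```python
import numpy as np


def brier_score(probs, labels):
    """Mean over examples of sum_k (p_k - y_k)^2; lower is better.

    probs:  (n, k) predicted class probabilities
    labels: (n,)   integer true classes in [0, k)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))
```

For three classes, a confident correct prediction scores 0, a uniform prediction scores 2/3, and a confident wrong prediction scores 2, so overconfident errors cost more than hedged ones. That is how the score rewards calibration as well as accuracy.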

  7. Results

  8. User study
     Interface: users enter claims, see predictions.
     A/B testing
     ◮ A: see only veracity predictions
     ◮ B: also see explanation (reputation, stances)

  9. User study: results

  10. Conclusion
      Takeaway:
      ◮ Stance/veracity predictions are hard.
      ◮ We contribute: crowdsourcing + joint modeling.
      Paper: experiments on Snopes.
      Demo: fcweb.pythonanywhere.com
      We share code + data.
      Acknowledgments: crowd annotators, reviewers, NSF.
      Questions?
