
Fools’ Gold: Understanding the Linguistic Features of Deception and Humour Through April Fools’ Hoaxes



  1. Fools’ Gold: Understanding the Linguistic Features of Deception and Humour Through April Fools’ Hoaxes Ed Dearden e.dearden@lancaster.ac.uk

  2. Hell Planet

  3. Why do we care about April Fools’?

  4. False Information

  5. But where does April Fools’ day fit into this?

  6. April Fools’ Day

  7. What’s the Difference?

  8. What’s the Difference?

  9. Deceptive Intent: Is the author trying to deceive me? Not Deceive? Deceive?

  10. Research Questions

  11. What are the linguistic features of an April Fools’ article compared to regular news?

  12. How similar are the features of April Fools’ to those of “Fake News”?

  13. I need some background

  14. Deception • Exaggeration. • Vagueness. • Details.

  15. Humour • Contextual Imbalance. • Emotional Language. • Ambiguity.

  16. Irony • Part humour, part deception. • Negative Emotional Language. • Polarity Contrast.

  17. How about the data?

  18. Catching Fools’! • 519 April Fools’ articles. • 371 websites. • 213,776 words. • 2004-2018.

  19. Matching Fools’! • 519 regular news articles. • 240 websites. • 344,927 words. • 2004-2018.

  20. Fake News! • Flagged as fake by BuzzFeed. • 2016 Election. • Horne and Adali, 2017.

  21. But what are you going to do with it?

  22. Building a feature set: Vagueness • Details • Imagination • Deception • Humour • Formality • Complexity

  23. Vagueness: CLAWS Ambiguity • USAS Ambiguity • WordNet Ambiguity • Vague Degree • Superlatives • Degree Adverbs • Comparative Adverbs • Exaggeration

  24. Details: Time Related Terms • Sense Terms • Motion Terms • Proper Nouns • Spatial Terms • Numbers • Dates

  25. Imaginative Informative Verbs Imaginative Verbs Conjunctions Prepositions Imagination Articles Imaginative Adjectives Determiners

  26. Deception: First Person Pronouns • Negative Emotional Terms • Negations

  27. Humour: Head Contextual Imbalance • Body Contextual Imbalance • Positive Emotion • Relationships • Alliteration • Profanity

  28. Formality: Associated Press Number Guidelines • Associated Press Date Guidelines • Associated Press Title Guidelines • Spelling Errors

  29. Complexity: Average Sentence Length • Body Punctuation • Head Punctuation • Readability • Lexical Diversity • Function Words • Lexical Density
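
As a rough illustration of how two of the complexity features named above might be computed, here is a minimal sketch. The slides do not give definitions, so the sentence splitter, the tokeniser, and the use of a type-token ratio for lexical diversity are all assumptions, and `complexity_features` is a hypothetical helper name.

```python
import re


def complexity_features(text):
    """Return two simple complexity measures for one article body.

    Assumed definitions (not taken from the slides):
    - average sentence length = words per sentence
    - lexical diversity = type-token ratio (unique words / total words)
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Lowercased word tokens; apostrophes are kept inside words.
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        return {"avg_sentence_length": 0.0, "lexical_diversity": 0.0}
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "lexical_diversity": len(set(words)) / len(words),
    }


sample = "The cat sat. The cat ran away!"
features = complexity_features(sample)
print(features)
```

A real pipeline would probably use a proper tokeniser and a length-normalised diversity measure, since a plain type-token ratio falls as articles get longer.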

  30. Corpus → Create feature matrix (Feature 1 … Feature N, Class: 0.111 … 0.552 AF; … … … …; 0.444 … 0.654 NAF) → Feature Selection (Which features are most informative?) → Classification (Can we learn to automatically differentiate?) → Analysis (What do the results mean?)

  31. Feature Selection: Chi-squared test • ANOVA • Mutual Information • Recursive Feature Elimination • Logistic Regression Coefficients
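
To make one of the listed methods concrete, the sketch below ranks features by a one-way ANOVA F statistic using only the standard library. It is an illustration under stated assumptions, not the talk's implementation: the feature names and values are made-up toy data, and `anova_f` is a hypothetical helper.

```python
from statistics import mean


def anova_f(values_by_class):
    """One-way ANOVA F statistic for a single feature.

    values_by_class: list of lists, one list of feature values per class.
    """
    all_values = [v for group in values_by_class for v in group]
    grand_mean = mean(all_values)
    k = len(values_by_class)  # number of classes
    n = len(all_values)       # number of articles
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in values_by_class)
    # Variation of individual values around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in values_by_class for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy values, invented for illustration: {feature: [AF values, NAF values]}.
toy = {
    "feature_1": [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]],  # classes well separated
    "feature_2": [[1.0, 5.0, 9.0], [2.0, 5.0, 8.0]],  # identical class means
}
scores = {name: anova_f(groups) for name, groups in toy.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A higher F means the class means differ more relative to the spread within each class, so such features are kept first.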

  32. Feature Selection • Formality: Associated Press Date • Associated Press Number • Complexity: Avg Sentence Length • Body Punctuation • Readability • Lexical Diversity • Details: Time Related Terms • Sense Terms • Proper Nouns • Deception: First Person Pronouns • Imagination: Preposition • Adjectives • Conjunctions • Vagueness: Degree Adverbs

  33. Classification
      Feature matrix:
        Feature 1   …   Feature N   Class
        0.111       …   0.552       AF
        …           …   …           …
        0.444       …   0.654       NAF
      Predictions:
        Article   Prediction   Truth
        1         AF           AF
        2         NAF          AF
        …         …            …
        n-1       NAF          NAF
        n         AF           NAF
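
Accuracy figures are typically derived from a prediction/truth table like the one on this slide. The sketch below shows that calculation using only the four rows visible on the slide (articles 1, 2, n-1 and n). The elided rows are left out, so the number it prints is an illustration and not a result from the talk.

```python
def accuracy(predictions, truths):
    """Fraction of articles whose predicted class matches the true class."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    if not truths:
        raise ValueError("need at least one article")
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


# The four visible rows of the slide's table, in order: 1, 2, n-1, n.
predicted = ["AF", "NAF", "NAF", "AF"]
actual = ["AF", "AF", "NAF", "NAF"]
score = accuracy(predicted, actual)
print(f"{score:.0%}")  # prints 50%
```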

  34. Classification Accuracies for all Feature Sets • Hoax Set: 74% • Bag-of-Words: 80% • Complexity + Detail: 71%

  35. What are we seeing so far?

  36. Our feature set can differentiate between hoax and genuine articles.

  37. Most individual feature groups don’t do so well.

  38. Complexity and Detail are Important.

  39. How does this compare to Fake News?

  40. Classifying Fakes • 1. One classifier trained on Fake News. • 2. A second classifier trained on April Fools’ and tested on Fake News.
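
The two setups on this slide can be sketched as follows. The slides do not name the classifier, so a tiny nearest-centroid classifier stands in here purely for illustration. All feature values are made-up toy data, the shared labels "hoax" and "genuine" are an assumption, and the helper names are hypothetical.

```python
from statistics import mean


def fit_centroids(rows, labels):
    """Compute the mean feature vector for each class label."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def evaluate(centroids, rows, labels):
    """Accuracy of the centroid model on a labelled test set."""
    hits = sum(predict(centroids, r) == l for r, l in zip(rows, labels))
    return hits / len(labels)


# Toy data, invented for illustration (two features per article).
af_rows = [[0.2, 0.1], [0.3, 0.2], [0.8, 0.9], [0.7, 0.8]]
af_labels = ["hoax", "hoax", "genuine", "genuine"]
fake_train_rows = [[0.1, 0.3], [0.9, 0.7]]
fake_train_labels = ["hoax", "genuine"]
fake_test_rows = [[0.4, 0.2], [0.6, 0.9]]
fake_test_labels = ["hoax", "genuine"]

# Setup 1: train on Fake News, test on held-out Fake News.
within_domain = evaluate(
    fit_centroids(fake_train_rows, fake_train_labels), fake_test_rows, fake_test_labels
)
# Setup 2: train on April Fools', test on Fake News.
cross_domain = evaluate(
    fit_centroids(af_rows, af_labels), fake_test_rows, fake_test_labels
)
print(within_domain, cross_domain)
```

Comparing the two accuracies on the same test articles indicates how well patterns learned from April Fools’ hoaxes carry over to Fake News.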

  41. Classification Accuracies for Fake News • Hoax Set: 76.9% • Bag-of-Words: 77.7% • Complexity + Detail: 78.1%

  42. Classification Accuracies for Fake News • Hoax Set: 64.5% • Bag-of-Words: 49.4% • Complexity + Detail: 65.7% • Complexity: 75.7%

  43. What does this suggest?

  44. Our feature set distinguishes Fake News about as well as it distinguishes April Fools’ hoaxes.

  45. Some feature groups perform much worse.

  46. Complexity and Detail remain the most important feature groups.

  47. Our classifier trained on AF seems to work (to some extent) on Fake News.

  48. But what does the data say?

  49. Readability (Complexity)

  50. Lexical Diversity (Complexity)

  51. Time Related Vocabulary (Detail)

  52. Proper Nouns (Detail)

  53. Dates (Detail)

  54. First Person Pronouns (Deception)

  55. Can you sum it all up?

  56. Conclusions – Part 1 • Created a new corpus of April Fools’ hoaxes. • Used features from deception, humour, and irony detection to classify hoaxes with moderate success. • Showed that features relating to complexity and detail seem to be the most important.

  57. Conclusions – Part 1 • Created a new corpus of April Fools’ hoaxes. • Used features from deception, humour, and irony detection to classify hoaxes with moderate success. • Showed that features relating to complexity and detail seem to be the most important.

  58. Conclusions – Part 1 • Created a new corpus of April Fools’ hoaxes. • Used features from deception, humour, and irony detection to classify hoaxes with moderate success. • Showed that features relating to complexity and detail seem to be the most important.

  59. Conclusions – Part 2 • Found that similar features are useful in identifying April Fools’ and Fake News. • Some of these features manifest themselves similarly for both AF Hoaxes and Fake News.

  60. Conclusions – Part 2 • Found that similar features are useful in identifying April Fools’ and Fake News. • Some of these features manifest themselves similarly for both AF Hoaxes and Fake News.

  61. Future Work

  62. Questions? Thanks for listening! e.dearden@lancaster.ac.uk
