Deception Detection in Transcribed Speech and Written Text Rebecca - PowerPoint PPT Presentation

Deception Detection in Transcribed Speech and Written Text
Rebecca Pottenger

Background
• Detecting deception is a difficult problem
• Human detection accuracy is low
  • For text: on average, correctly classify 47% of lies and 61% of truths [1]
  • For transcribed speech: on average, correctly classify 58.2% [2]
• Not many automated methods; most existing automated methods have low accuracy
  • Logistic regression; Linguistic Inquiry and Word Count (LIWC) features; 5-fold CV – 59% of lies and 62% of truths [3]
  • Ripper rule induction; acoustic, LIWC and speaker-dependent features; 10-fold CV – 66.4% [4]
  • Naïve Bayes and SVM; bag-of-words; 10-fold CV – 70% [5]
  • Best method in existing literature: SVM; LIWC and bigram features; 5-fold CV – 89.8% [6]

Questions To Explore
• Is there an underlying distribution to deceptive language? What is it?
• Is this distribution different depending on whether the person was speaking (i.e. transcribed text) or writing?
• Can we improve the accuracy of automated deception detection with better features? What should those features be?

Dataset
• 400 truthful (TripAdvisor) and 400 gold-standard deceptive (Amazon Mechanical Turk) hotel reviews [6]
• Michigan State University cheating game [7]
  • 7 lied about cheating, 9 confessed to cheating, 44 did not cheat
• Possibly: testimony from convicted perjurers and other cases

Experimental Methods
1) Re-create existing best method on dataset #2
  • Use a variety of supervised algorithms in addition to SVM (Naïve Bayes, Artificial Neural Networks, etc.)
2) Distribution building
  • Identify the space of possible features to work with (entire word set, bigram, LIWC, etc.)
  • Build probability distributions from deceptive and truthful data for both datasets
3) Use new sets of features to learn the model on dataset #1 and #2
  • Use a variety of features as well as a variety of supervised algorithms
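Step 2 (distribution building) could be sketched roughly as below. This is only an illustration under stated assumptions, not the method used in the presentation: it assumes whitespace tokenization and word n-gram features. Unsmoothed relative frequencies would be the maximum likelihood estimate of a multinomial distribution; add-one smoothing is swapped in here so that unseen n-grams do not get zero probability. The function names (`ngrams`, `build_distribution`, `log_likelihood`, `classify`) and the toy reviews are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n=1):
    """Lowercase, split on whitespace, and return the list of n-gram tuples."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_distribution(docs, n=1):
    """Count n-grams over all documents of one class (deceptive or truthful)."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(doc, n))
    return counts


def log_likelihood(text, counts, vocab_size, n=1):
    """Log-probability of a text under add-one smoothed n-gram frequencies."""
    total = sum(counts.values())
    return sum(
        math.log((counts[g] + 1) / (total + vocab_size))
        for g in ngrams(text, n)
    )


# Toy, made-up examples standing in for the real datasets (assumption).
deceptive = ["my husband and i loved this amazing luxury hotel",
             "i loved my amazing stay at this hotel"]
truthful = ["the room was small and the bathroom floor was dirty",
            "the location was near the station and the room was small"]

dec_counts = build_distribution(deceptive)
tru_counts = build_distribution(truthful)
vocab_size = len(set(dec_counts) | set(tru_counts))


def classify(text):
    """Label a text by whichever class distribution gives it higher likelihood."""
    dec = log_likelihood(text, dec_counts, vocab_size)
    tru = log_likelihood(text, tru_counts, vocab_size)
    return "deceptive" if dec > tru else "truthful"
```

Passing `n=2` to `build_distribution` and `log_likelihood` would give the bigram variant of the same distributions; LIWC features would need the LIWC dictionary and are not covered by this sketch.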

Methods of Analysis
• Maximum Likelihood Estimate to find best fitting distribution
• 10-fold cross validation
• Accuracy, Precision, Recall, F-score
• Feature weights
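The cross-validation and scoring steps listed above might look like the following minimal sketch. The helper names (`k_fold_indices`, `evaluate`) and the toy labels are assumptions made for illustration. The fold assignment here is a plain interleaved split with no shuffling or stratification, which a real experiment would probably want, especially on the small, imbalanced cheating-game data.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; every item is in exactly one test fold."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def evaluate(gold, predicted, positive="deceptive"):
    """Return accuracy, precision, recall and F-score for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    correct = sum(g == p for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f_score": f_score}


# Toy labels, invented for illustration only.
gold = ["deceptive", "deceptive", "truthful", "truthful", "deceptive"]
pred = ["deceptive", "truthful", "truthful", "deceptive", "deceptive"]
scores = evaluate(gold, pred)
```

In a full run, each `(train, test)` pair from `k_fold_indices` would be used to fit a model on the training indices and score it on the held-out indices, with the per-fold scores averaged at the end.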

Sources
1) C.F. Bond and B.M. DePaulo. 2006. Accuracy of deception judgments. Personality and Social Psychology Review, 10(3):214.
2) F. Enos, S. Benus, R.L. Cautin, M. Graciarena, J. Hirschberg, and E. Shriberg. Personality Factors in Human Deception Detection: Comparing Human to Machine Performance. In Proceedings of INTERSPEECH-2006, Pittsburgh, Pennsylvania, USA.
3) M.L. Newman, J.W. Pennebaker, D.S. Berry, and J.M. Richards. 2003. Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29(5):665.
4) J. Hirschberg, S. Benus, J. Brenier, F. Enos, S. Friedman, S. Gilman, C. Girand, M. Graciarena, A. Kathol, L. Michaelis, B. Pellom, E. Shriberg, and A. Stolcke. 2005. Distinguishing deceptive from non-deceptive speech. In Proceedings of INTERSPEECH-2005, Lisbon, Portugal.
5) R. Mihalcea and C. Strapparava. 2009. The lie detector: Explorations in the automatic recognition of deceptive language. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 309–312. Association for Computational Linguistics.
6) M. Ott, Y. Choi, C. Cardie, and J.T. Hancock. 2011. Finding Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 309–319. Association for Computational Linguistics.
7) M. Ali and T. Levine. 2008. The Language of Truthful and Deceptive Denials and Confessions. Communication Reports, 21(2):82–91.
