Deception Detection in Transcribed Speech and Written Text


  1. Deception Detection in Transcribed Speech and Written Text Rebecca Pottenger

  2. Background
     • Detecting deception is a difficult problem
     • Human detection accuracy is low
       • For text: on average, correctly classify 47% of lies and 61% of truths [1]
       • For transcribed speech: on average, correctly classify 58.2% [2]
     • Not many automated methods; most existing automated methods have low accuracy
       • Logistic regression; Linguistic Inquiry and Word Count (LIWC) features; 5-fold CV: 59% of lies and 62% of truths [3]
       • Ripper rule induction; acoustic, LIWC and speaker-dependent features; 10-fold CV: 66.4% [4]
       • Naïve Bayes and SVM; bag-of-words; 10-fold CV: 70% [5]
       • Best method in existing literature: SVM; LIWC and bigram features; 5-fold CV: 89.8% [6]
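The background figures mix per-class rates (percentage of lies and of truths caught) with single overall accuracies. A minimal sketch of how the two relate, assuming an evenly balanced test set; the `lie_share` parameter and the function name are illustrative and do not come from the slides or the cited papers:

```python
def overall_accuracy(lie_rate, truth_rate, lie_share=0.5):
    """Combine per-class detection rates into one accuracy figure.

    lie_share is the fraction of items that are lies. The 0.5 default
    is an assumption (balanced classes), not a value from the slides.
    """
    return lie_share * lie_rate + (1.0 - lie_share) * truth_rate


# Human judges on written text: 47% of lies, 61% of truths [1]
human_text = overall_accuracy(0.47, 0.61)

# Logistic regression with LIWC features: 59% of lies, 62% of truths [3]
logistic_liwc = overall_accuracy(0.59, 0.62)

print(f"human, text: {human_text:.3f}")
print(f"logistic regression + LIWC: {logistic_liwc:.3f}")
```

Under the balanced-class assumption these come to roughly 54% and 60.5%, which puts them on the same scale as the single-number results reported for the other methods.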

  3. Questions To Explore
     • Is there an underlying distribution to deceptive language? What is it?
     • Is this distribution different depending on whether the person was speaking (i.e. transcribed text) or writing?
     • Can we improve the accuracy of automated deception detection with better features? What should those features be?

  4. Dataset
     • 400 truthful (TripAdvisor) and 400 gold-standard deceptive (Amazon Mechanical Turk) hotel reviews [6]
     • Michigan State University cheating game [7]
       • 7 lied about cheating, 9 confessed to cheating, 44 did not cheat
     • Possibly: testimony from convicted perjurers and other cases

  5. Experimental Methods
     1) Re-create existing best method on dataset #2
        • Use a variety of supervised algorithms in addition to SVM (Naïve Bayes, artificial neural networks, etc.)
     2) Distribution building
        • Identify the space of possible features to work with (entire word set, bigrams, LIWC, etc.)
        • Build probability distributions from deceptive and truthful data for both datasets
     3) Use new sets of features to learn the model on datasets #1 and #2
        • Use a variety of features as well as a variety of supervised algorithms
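Step 2 (distribution building) can be sketched with the standard library alone. The example below counts unigram and bigram features per class and labels a new text by whichever class distribution gives it the higher log-likelihood. The add-one smoothing, the whitespace tokenizer, the function names, and the toy sentences are all assumptions made for illustration; they are not the presenter's method or data.

```python
import math
from collections import Counter


def features(text):
    """Lowercased unigrams plus adjacent-word bigrams."""
    words = text.lower().split()
    bigrams = [a + "_" + b for a, b in zip(words, words[1:])]
    return words + bigrams


def build_distribution(texts):
    """Count every feature over all texts of one class."""
    counts = Counter()
    for text in texts:
        counts.update(features(text))
    return counts


def log_likelihood(text, counts, vocab_size):
    """Log-probability of a text under add-one-smoothed feature frequencies."""
    total = sum(counts.values())
    return sum(
        math.log((counts[f] + 1) / (total + vocab_size)) for f in features(text)
    )


def classify(text, deceptive_counts, truthful_counts):
    """Pick the class whose distribution makes the text more likely."""
    vocab_size = len(set(deceptive_counts) | set(truthful_counts))
    d = log_likelihood(text, deceptive_counts, vocab_size)
    t = log_likelihood(text, truthful_counts, vocab_size)
    return "deceptive" if d > t else "truthful"


# Toy, made-up sentences standing in for the real review data.
deceptive = build_distribution([
    "my husband and i loved this luxury hotel",
    "i loved the amazing luxury experience",
])
truthful = build_distribution([
    "the room was small and the bathroom was clean",
    "the location was near the train station",
])

print(classify("i loved the luxury hotel", deceptive, truthful))
print(classify("the room was clean", deceptive, truthful))
```

Comparing the two count tables feature by feature is also one simple way to look at how the deceptive and truthful distributions differ, which is the first question the deck sets out to explore.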

  6. Methods of Analysis
     • Maximum likelihood estimate to find the best-fitting distribution
     • 10-fold cross-validation
     • Accuracy, precision, recall, F-score
     • Feature weights
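The evaluation measures listed above can be written out in a few lines. This is a minimal sketch: the fold assignment (every k-th item) assumes the data has already been shuffled, "deceptive" is assumed to be the positive class, and the function names are illustrative.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; fold i holds out every k-th item from i."""
    indices = list(range(n_items))
    for i in range(k):
        test = indices[i::k]
        held_out = set(test)
        train = [j for j in indices if j not in held_out]
        yield train, test


def accuracy(gold, predicted):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def precision_recall_f(gold, predicted, positive="deceptive"):
    """Precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical labels for five items, only to exercise the functions.
gold = ["deceptive", "deceptive", "deceptive", "truthful", "truthful"]
pred = ["deceptive", "deceptive", "truthful", "deceptive", "truthful"]
print(accuracy(gold, pred))
print(precision_recall_f(gold, pred))

# 800 items, as in the hotel review dataset: each fold holds out 80.
folds = list(k_fold_indices(800, k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

With a balanced set such as the 400/400 hotel reviews, a stratified split that keeps both classes equally represented in every fold would be a reasonable refinement of this plain split.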

  7. Sources
     1) C.F. Bond and B.M. DePaulo. 2006. Accuracy of deception judgments. Personality and Social Psychology Review, 10(3):214.
     2) F. Enos, S. Benus, R.L. Cautin, M. Graciarena, J. Hirschberg, and E. Shriberg. 2006. Personality factors in human deception detection: Comparing human to machine performance. In Proceedings of INTERSPEECH-2006, Pittsburgh, Pennsylvania, USA.
     3) M.L. Newman, J.W. Pennebaker, D.S. Berry, and J.M. Richards. 2003. Lying words: Predicting deception from linguistic styles. Personality and Social Psychology Bulletin, 29(5):665.
     4) J. Hirschberg, S. Benus, J. Brenier, F. Enos, S. Friedman, S. Gilman, C. Girand, M. Graciarena, A. Kathol, L. Michaelis, B. Pellom, E. Shriberg, and A. Stolcke. 2005. Distinguishing deceptive from non-deceptive speech. In Proceedings of INTERSPEECH-2005, Lisbon, Portugal.
     5) R. Mihalcea and C. Strapparava. 2009. The lie detector: Explorations in the automatic recognition of deceptive language. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 309–312. Association for Computational Linguistics.
     6) M. Ott, Y. Choi, C. Cardie, and J.T. Hancock. 2011. Finding deceptive opinion spam by any stretch of the imagination. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, pages 309–319. Association for Computational Linguistics.
     7) M. Ali and T. Levine. 2008. The language of truthful and deceptive denials and confessions. Communication Reports, 21(2):82–91.
