
A Human Factors Approach to Spam Filtering

  1. A Human Factors Approach to Spam Filtering (short title: Spam & HCI)
     Robert Beverly, MIT CSAIL, rbeverly@csail.mit.edu
     July 27, 2009, Conference on Email and Anti-Spam (CEAS) 2009
     Outline: The Problem; A Human Factors Approach; SpamGUI; Parting Thoughts; Summary

  2. The Problem
     - No spam classifier is perfect
     - Okay in other ML fields, e.g. handwriting recognition, search engines, music recommendation, etc.
     - But with spam:
       - Adaptable, adversarial inputs
       - Complexion of dataset severely unbalanced
       - High cost of false positives
       - Getting from 99.9% to 99.999%
     - Fighting a losing battle?

  5. The Problem
     [Figure: cumulative fraction of emails (0 to 1) versus SpamAssassin score (-15 to 45); curves: Spam, Ham]
     - TREC 2007 dataset (∼75k messages)
     - Classified with SpamAssassin
     - How close are mails to the threshold (5)?

  6. The Problem
     [Figure: cumulative fraction of emails (0 to 1) versus SpamAssassin score (-15 to 45); curves: Spam, Ham]
     - How close are mails to the threshold (5)?
     - 99.72% of ham below threshold... good?

  7. The Problem
     [Figure: complementary cumulative fraction of emails (log scale, 1 down to 1e-05) versus SpamAssassin score (-15 to 60); curves: Spam, Ham]
     - No threshold gives zero FP/FN (well-known compromise)
     - Deluge of spam implies this compromise is flawed
     - 0.28% of ham above threshold → 71 false positives
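
The threshold compromise on this slide can be illustrated with a minimal sketch. The scores below are invented for illustration and are not taken from TREC 2007; `error_counts` is a hypothetical helper, and treating a score at or above the threshold as spam is an assumption. Only the threshold of 5 and the 0.28% / 71 figures come from the slides.

```python
from typing import List, Tuple


def error_counts(ham: List[float], spam: List[float],
                 threshold: float) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) for a score threshold.

    Assumption: a message is flagged as spam when score >= threshold.
    """
    false_positives = sum(1 for s in ham if s >= threshold)
    false_negatives = sum(1 for s in spam if s < threshold)
    return false_positives, false_negatives


# Hypothetical scores, invented for illustration only.
ham_scores = [-3.1, -1.0, 0.4, 2.2, 4.9, 6.3]
spam_scores = [3.8, 5.1, 7.5, 12.0, 18.4, 30.2]

# Moving the threshold trades one kind of error for the other.
for t in (3.0, 5.0, 7.0):
    fp, fn = error_counts(ham_scores, spam_scores, t)
    print(f"threshold={t}: FP={fp}, FN={fn}")
# threshold=3.0: FP=2, FN=0
# threshold=5.0: FP=1, FN=1
# threshold=7.0: FP=0, FN=2

# Ham count implied by the slide's own figures (0.28% -> 71 messages).
# This is an inference; the rounded percentage makes it approximate.
implied_ham = round(71 / 0.0028)
print(implied_ham)  # 25357
```

With overlapping score distributions, lowering the threshold removes false negatives only by adding false positives, which is the compromise the slide calls flawed.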

  8. A Human Factors Approach
     Approaching from a different direction...
     The User Agent:
     - Users interact with their email via a Mail User Agent (MUA), e.g. Outlook, Hotmail, etc.
     - Note that besides going graphical, MUAs have changed little over past ∼30 years
     - Better incorporate human factors into a MUA

  9. A Human Factors Approach
     Human Factors Approach – Potential:
     (1) Make email more useful to the user
         - How are emails presented?
     (2) Humans ultimate arbiter of any mail’s importance
         - How to better include, scale their decision process?
     (3) Remove burden of perfect classification from classifier
         - “good enough” filtering
     (4) Eliminate false positives
         - Innovate in the user agent

  11. SpamGUI
      Position: separate classification from filtering
      The inbox:
      - Rethink the inbox: use a single mail folder, don’t attempt to filter into spam, ham “folders”
      - Use color, size, shade, order, and other human factors to present the inbox
      - Presentation of email a function of importance
      Proof-of-concept: SpamGUI Thunderbird extension...
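
The idea of presentation as a function of importance can be sketched roughly as follows. This is not the SpamGUI extension's code: `Message`, `shade_for`, and `render_inbox` are hypothetical names, the score range of -5 to 15 and the gray-level mapping are arbitrary assumptions, and listing the most spam-like messages first is only one possible ordering.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    subject: str
    score: float  # classifier score; higher means more spam-like


def shade_for(score: float, lo: float = -5.0, hi: float = 15.0) -> str:
    """Map a score to a hex gray: dark for likely ham, light for likely spam.

    The range [lo, hi] is an assumed display range, not a SpamAssassin constant.
    """
    clamped = min(max(score, lo), hi)
    frac = (clamped - lo) / (hi - lo)
    level = round(frac * 200)  # cap at 200 so text stays legible on white
    return f"#{level:02x}{level:02x}{level:02x}"


def render_inbox(messages: List[Message]) -> List[str]:
    """Keep every message in one list; only order and shade vary with score."""
    ordered = sorted(messages, key=lambda m: m.score, reverse=True)
    return [f"{shade_for(m.score)} {m.score:6.1f} {m.subject}" for m in ordered]


inbox = [
    Message("Project meeting notes", -2.0),
    Message("Cheap watches", 22.0),
    Message("Monthly newsletter", 5.0),
]
for line in render_inbox(inbox):
    print(line)
```

Nothing is moved to a separate folder here: the classifier only supplies a score, and the display decides how prominent each message is.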

  13. SpamGUI
      [Image-only slide; no text was extracted]

  14. SpamGUI
      A Few Observations:
      - A demarcation “line” naturally emerges to the eye, above which user (or UI) can ignore messages
      - User part of filtering process, but only burdened by making spam decisions on a small number of emails around line
      - Easy to scan for formerly false positive emails on the threshold border
      Lots of work remains:
      - No user studies performed yet
      - Experimenting with several approaches
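
One way to picture the "small number of emails around line" is to select only the messages whose scores fall near the threshold. The sketch below assumes this reading; `borderline` is a hypothetical helper, the margin of 2.0 is an arbitrary choice, and the sample inbox is invented. Only the threshold of 5 comes from the slides.

```python
from typing import List, Tuple


def borderline(scored: List[Tuple[str, float]],
               threshold: float = 5.0,
               margin: float = 2.0) -> List[Tuple[str, float]]:
    """Return messages within `margin` of the threshold, closest first."""
    near = [(subj, s) for subj, s in scored if abs(s - threshold) <= margin]
    return sorted(near, key=lambda item: abs(item[1] - threshold))


# Hypothetical inbox of (subject, score) pairs, invented for illustration.
inbox = [
    ("Meeting notes", -2.0),
    ("Invoice attached", 4.6),
    ("Cheap meds", 22.0),
    ("Newsletter", 5.9),
    ("You won", 7.5),
]

for subject, score in borderline(inbox):
    print(f"{score:5.1f}  {subject}")
```

In this example only two of the five messages need a human decision; the rest sit clearly on one side of the line.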

  15. Parting Thoughts
      More generally:
      - Users inundated with information, how can UI help?
      - Spam is just one class of very unimportant information
      - Lots of unused input “features;” systems designers should use them
      - Learn best way to present email to user
      - Recognize that innovation is possible in the user agent

  16. Summary
      - We’re fighting a losing battle trying to make spam classifiers perfect
      - Separate act of classification from filtering
      - As a community, think more about how HCI / human factors methods can help
      Thanks! http://www.rbeverly.net/spamgui/
      Questions?
