Reverse Engineering CAPTCHAs - Abram Hindle, Michael W. Godfrey, Richard C. Holt

  1. Reverse Engineering CAPTCHAs (WCRE 2008)
     Abram Hindle, Michael W. Godfrey, Richard C. Holt
     Software Architecture Group, David R. Cheriton School of Computer Science
     University of Waterloo, Canada
     http://swag.uwaterloo.ca/
     {ahindle,migod,holt}@cs.uwaterloo.ca

  7. Motivation
     • How can we solve that CAPTCHA?
     • How was a CAPTCHA made?

  8. Why Reverse Engineer?
     • If we can reverse engineer a CAPTCHA
       – leverage weaknesses
       – re-implement a CAPTCHA
         ∗ The more we understand the easier it is to defeat
         ∗ We can solve by cloning

  10. CAPTCHA Properties

  12. Common Properties
      • Readable: the captcha must be easily read and decoded by humans.
      • Unguessable: the captcha message cannot be guessed at random with any real confidence.
      • Order-able: characters are read left to right, top to bottom (exceptions could include Hebrew or Arabic captchas). If a captcha is readable, its character ordering should be apparent.

  13. Bitmap fonts and placement

  14. Backgrounds

  15. Noise

  16. Linear Transformations

  17. Non-Linear Transformations

  18. Dripping and Fuzzy Text

  19. CAPTCHA Breaking

  21. Layering

  31. Text Pixel Identification and Image Cleanup

  32. Erosion and Dilation

  33. Thresholding
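
The thresholding step named on this slide can be illustrated with a minimal sketch. It assumes a grayscale image stored as rows of 0-255 integers with dark text on a light background; the function name `threshold` and the cutoff of 128 are illustrative assumptions, not values taken from the talk.

```python
def threshold(image, cutoff=128):
    """Return a binary image: 1 where a pixel is darker than `cutoff`
    (assumed to be text), 0 otherwise (assumed to be background)."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in image]


# Toy 3x4 grayscale image: a dark stroke on a light background.
gray = [
    [250, 240, 30, 245],
    [245, 20, 25, 250],
    [255, 250, 35, 240],
]
binary = threshold(gray)
```

A fixed cutoff only works when text and background are well separated in brightness; a captcha whose text is similar to its background would need a different way of identifying text pixels.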

  34. Edge Detection

  35. Segmentation

  36. Weight Segmentation

  37. Box Segmenter

  38. Shrinking and K-Means segmentation

  39. Fill Flood
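
One way to read the "fill flood" segmenter is as connected-component labeling on a binary image: start a fill at an unvisited text pixel, collect everything reachable from it, and treat the result as one character candidate. The sketch below is an illustration under that assumption, not the authors' implementation; the function name `flood_fill_segments`, the use of 4-connectivity, and the bounding-box output are all choices made here.

```python
from collections import deque


def flood_fill_segments(binary):
    """Group text pixels (value 1) into 4-connected components.

    Returns one bounding box (left, top, right, bottom) per component,
    sorted by left edge so the boxes follow left-to-right reading order.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a new fill from this unvisited text pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            left = right = x
            top = bottom = y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


# Two separate blobs of text pixels, divided by an empty column.
binary = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0],
]
segments = flood_fill_segments(binary)
```

This kind of segmenter merges characters that touch or overlap and splits characters drawn with gaps, which is consistent with the later recommendation to use non-fill-flood-able, non-continuous and overlapping characters.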

  40. Normalization and Character Matching

  41. PCA of A

  42. PCA of F

  43. Skeletonization

  44. CAPTCHA Solving
      • Character Database
      • Normalization of Characters
        – PCA etc.
      • Matching
        – Nearest Neighbor
        – Shape Matching
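
The nearest-neighbor matching listed above can be sketched as a lookup against a labeled character database. This is a minimal illustration, assuming glyphs have already been normalized to the same size and binarized; the Hamming distance, the function names, and the tiny 3x3 database are assumptions made here, since the slide names only the matching strategy.

```python
def hamming(a, b):
    """Count the pixels at which two equal-sized binary glyphs differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def nearest_neighbor(glyph, database):
    """Return the label of the database glyph closest to `glyph`."""
    best_label, _ = min(
        ((label, hamming(glyph, template)) for label, template in database),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical character database: (label, normalized 3x3 glyph) pairs.
database = [
    ("I", [[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    ("L", [[1, 0, 0], [1, 0, 0], [1, 1, 1]]),
    ("T", [[1, 1, 1], [0, 1, 0], [0, 1, 0]]),
]

# A "T" with one stroke pixel missing, as segmentation noise might produce.
noisy_t = [[1, 1, 1], [0, 1, 0], [0, 0, 0]]
label = nearest_neighbor(noisy_t, database)
```

In practice the database would hold many samples per character, and the distance would be computed on normalized features rather than raw pixels.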

  45. Piratebay Database

  46. Digg Database

  47. Reverse Engineering
      • Layering
      • Background
      • Noise
      • Text
      • Transforms

  49. Captcha Solving Summary
      • Image Clean Up
      • Text Pixel Identification
      • Segmentation
      • Character Matching
        – Normalization

  50. Solving by Cloning
      • Reverse Engineer captcha
      • Preprocess the captcha
      • Parameterize
      • Generate candidates
        – Search through the captchas
        – Find best match
        – Repeat

  51. Watercap demo
      • Provided with a captcha of “WCREWCRE” and the code to generate such captchas
      • Algorithm
        – For each column, we iterate through each character,
          ∗ generating a captcha for each prefix and character,
            · keeping the best match.
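
The greedy search described on this slide can be sketched as follows. The real Watercap generator is not reproduced in the slides, so `render` below is a hypothetical stand-in that draws one pixel column per character; `score`, `solve_by_cloning`, the alphabet, and the pixel-agreement metric are likewise assumptions made for illustration.

```python
import string

ALPHABET = string.ascii_uppercase
HEIGHT = 5


def render(text):
    """Hypothetical stand-in for the captcha generator.

    Draws one 5-pixel column per character. The pattern depends on both
    the character and its position, loosely mimicking a distortion that
    varies across the image.
    """
    columns = []
    for position, char in enumerate(text):
        code = (ord(char) - ord("A") + 7 * position) % 32  # 5-bit pattern
        columns.append([(code >> bit) & 1 for bit in range(HEIGHT)])
    return columns


def score(candidate, target):
    """Count matching pixels over the columns the candidate has rendered."""
    return sum(
        pc == pt
        for col_c, col_t in zip(candidate, target)
        for pc, pt in zip(col_c, col_t)
    )


def solve_by_cloning(target, length):
    """Extend the guess one character at a time, keeping the best match."""
    prefix = ""
    for _ in range(length):
        # Generate a captcha for the current prefix plus each character
        # and keep whichever rendering agrees best with the target.
        prefix = max(
            (prefix + char for char in ALPHABET),
            key=lambda guess: score(render(guess), target),
        )
    return prefix


target = render("WCREWCRE")
solution = solve_by_cloning(target, 8)
```

Because each step only compares renderings of the prefix plus one candidate character, the search cost grows with message length times alphabet size rather than with the number of possible messages.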

  52. CAPTCHA Example   Accuracy
      Digg              30%
      PHPBB             99%
      Piratebay         61%
      Watercap          27% / 93%
      Rogers            95%

      Minimum accuracy of our captcha breakers

  53. How to improve captcha implementations
      • Non-linear transformations
      • Non-fill-flood-able letters
      • Use more characters
      • Limit captcha access
      • Text similar to the background
      • Non-continuous and overlapping characters

  54. Ethics
      • Spammers
      • Visually Impaired
      • Poor security
      • Options:
        – Telephone Confirmation
        – Credit Cards
        – Web of trust

  55. Reverse Engineering Lessons
      • RE can be interpretative
      • Some outputs have properties that allow us to reverse engineer the software that created them
        – In this case, 2D image generation has many common patterns
      • Absence of code still allows RE

  56. Future Work
      • Better Breakers
      • Layer recognition
      • Audio captchas

  57. Conclusion
      • Reverse engineering captchas highlights techniques that have weaknesses.
      • Captcha generation follows certain patterns which are recoverable and leverageable.
      • Captchas have been defeated
        – Even “good” captchas from Microsoft, Yahoo and Google have been defeated.
