SLIDE 1
The art of breaking and designing captchas
Elie Bursztein
Session ID: HT02-402
SLIDE 6 Elie Bursztein (@elie) https://elie.net
SLIDE 12
World's Most Popular Captchas
[Baidu] [Captcha.net] [NIH] [Wikipedia] [Digg] [Blizzard] [Google] [Skyrock] [Recaptcha] [Authorize] [CNN] [Megaupload] [Reddit] [Slashdot]
SLIDE 16
Captcha Design Goal
[Diagram: axes "Hard for computer" and "Hard for human"; labels "AI ?", "Human", and "sweet spot"]
SLIDE 17
Focus of this talk
How to break and design CAPTCHAs
SLIDE 18
Based on breaking 21 of the most popular schemes and designing the new Wikipedia captcha
SLIDE 23
Outline
How to break text captchas
How to make captchas easier for humans
How to break audio captchas
How to break video captchas
SLIDE 26
Evaluation metrics
Accuracy
Solving time
Learnability
SLIDE 27
How to Break Text-Captchas
SLIDE 28
Think Lego
SLIDE 45
How to break a captcha: example
Example captcha: 3 7 1 3
Pre-processing: background removal
Pre-processing: captcha binarization
Pre-processing: line detection
Pre-processing: line removal
Segmentation: clustering algorithm
Segmentation: cluster separation
Post-segmentation: inverting rotation
Recognition: 3 1 7 3
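Illustrative sketch of the binarization and segmentation steps. The slides do not say which clustering algorithm is used, so this assumes a plain 4-connected flood fill as a stand-in. The names `binarize`, `segment`, `GRAY`, and the threshold of 128 are hypothetical.

```python
def binarize(gray, threshold=128):
    """Map a grayscale matrix (0-255) to 1 for dark (ink) pixels, 0 otherwise."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def segment(binary):
    """Group ink pixels into 4-connected clusters, ordered left to right."""
    height, width = len(binary), len(binary[0])
    seen = set()
    clusters = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] and (y, x) not in seen:
                # Flood fill from this unvisited ink pixel.
                stack, cluster = [(y, x)], []
                seen.add((y, x))
                while stack:
                    cy, cx = stack.pop()
                    cluster.append((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and binary[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                clusters.append(cluster)
    # Order clusters by their leftmost column so they read like text.
    clusters.sort(key=lambda c: min(x for _, x in c))
    return clusters


# Toy 3x6 image (assumed data): two dark shapes on a white background.
GRAY = [
    [255, 10, 255, 255, 20, 255],
    [255, 10, 255, 255, 20, 20],
    [255, 10, 255, 255, 255, 255],
]
segments = segment(binarize(GRAY))
```

Each cluster would then be handed to the post-segmentation and recognition stages. Characters that touch each other end up in one cluster, which is why a cluster separation step follows.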
SLIDE 56
Breaker 5-Stage Pipeline
Slashdot captcha
Preprocessing
Segmentation
Post-segmentation
Recognition: f a e t e s t
Post-recognition: f a s t e s t
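Illustrative sketch of a post-recognition step that turns "faetest" into "fastest". The slides do not describe how the correction is done, so this assumes a dictionary lookup by edit distance. The names `edit_distance`, `post_recognition`, and the word list `WORDS` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute ca by cb
            ))
        previous = current
    return previous[-1]


def post_recognition(raw, dictionary):
    """Return the dictionary word closest to the raw recognizer output."""
    return min(dictionary, key=lambda word: edit_distance(raw, word))


# Assumed toy dictionary; a real one would hold the scheme's word list.
WORDS = ["fastest", "faster", "tested", "fasten"]
corrected = post_recognition("faetest", WORDS)
```

This only helps for schemes whose captchas are dictionary words; it does nothing for random character strings.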
SLIDE 61
From the image to the matrix representation
SLIDE 67
From the matrix representation to the vector representation
[Diagram: lines L1 to L6 of the matrix laid out as one vector]
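Illustrative sketch of the matrix-to-vector step, assuming that L1 to L6 in the diagram are the rows of the matrix and that they are simply concatenated. `matrix_to_vector` and `SEGMENT` are hypothetical names.

```python
def matrix_to_vector(matrix):
    """Concatenate the rows of a matrix into one flat list."""
    return [value for row in matrix for value in row]


# Assumed toy segment: a 2x3 binary matrix.
SEGMENT = [
    [0, 1, 0],
    [1, 1, 1],
]
vector = matrix_to_vector(SEGMENT)
```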
SLIDE 76
From the vector representation to the segment value (classification)
[Diagram: the segment's vector is compared by distance to known vectors labelled B, B, A, A, C, C; distances shown: 42, 40, 32, 70, 12, 18; the known vector C at distance 12 is singled out]
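Illustrative sketch of classifying a segment by its nearest known vector. The slides do not name the distance, so this assumes Hamming distance on binary vectors. `distance`, `classify`, and the labelled vectors in `KNOWN` are hypothetical.

```python
def distance(a, b):
    """Number of positions where two equal-length vectors differ (Hamming distance)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def classify(vector, known):
    """Return the label of the known vector closest to the input vector."""
    label, _ = min(known, key=lambda item: distance(vector, item[1]))
    return label


# Assumed toy training set: one labelled vector per character.
KNOWN = [
    ("A", [1, 1, 1, 0, 0, 0]),
    ("B", [0, 0, 0, 1, 1, 1]),
    ("C", [1, 0, 1, 0, 1, 0]),
]
guess = classify([1, 0, 1, 0, 1, 1], KNOWN)
```

Taking the single closest vector is the k = 1 case of the KNN classifier whose learning rate is reported later in the deck.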
SLIDE 77
Breaker efficiency
Solver accuracy = Coverage * Precision^length
Coverage: segmentation rate
Precision: recognition rate
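Worked example of the formula above. The input numbers (90% coverage, 95% per-character precision, 6 characters) are assumptions chosen for illustration, not results from the talk.

```python
def solver_accuracy(coverage, precision, length):
    """Solver accuracy = coverage * precision ** length."""
    return coverage * precision ** length


# Hypothetical inputs: 90% segmentation rate, 95% recognition rate, 6 characters.
estimate = solver_accuracy(coverage=0.90, precision=0.95, length=6)
print(f"{estimate:.1%}")
```

Because precision is raised to the captcha length, small per-character losses compound quickly on longer captchas.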
SLIDE 84 Elie Bursztein (@elie) http://elie.im
Anti-recognition techniques
Blurring
Distortion
Rotation
Fonts
Charsets (0123456789)
SLIDE 85
SVM learning rate
[Chart: % success (0% to 100%) vs. training set size (10, 20, 50, 100, 200, 500); series: 09, AZ09, azAZ09, Distortion, 3 fonts, 5 fonts, Angles]
SLIDE 86
KNN learning rate
[Chart: % success (0% to 100%) vs. training set size (10, 20, 50, 100, 200, 500); series: 09, AZ09, azAZ09, Distortion, 3 fonts, 5 fonts, Angles]
SLIDE 99
Anti-recognition taxonomy
Background confusion
Lines
Collapsing
SLIDE 104
Breaking World of Warcraft
SLIDE 109
Breaking Captcha.net
SLIDE 114
Breaking Wikipedia
SLIDE 119
Breaking Digg
SLIDE 124
Breaking Slashdot
SLIDE 129
Breaking eBay
SLIDE 134
Failing to break eBay
SLIDE 140
Breaking Baidu
SLIDE 141
Overall results
Scheme        Segmentation rate   Solving rate
Authorize     84%                 66%
Baidu         98%                 5%
Blizzard      75%                 70%
Captcha.net   96%                 73%
CNN           50%                 16%
Digg          86%                 20%
eBay          95%                 43%
Google        0%                  0%
MegaUpload    n/a                 93%
NIH           87%                 72%
Recaptcha     0%                  0%
Reddit        71%                 42%
Skyrock       30%                 2%
Slashdot      52%                 35%
Wikipedia     57%                 25%
SLIDE 142
Learning rate for real schemes
[Chart: % success (0% to 90%) vs. training set size (10, 20, 50, 100, 200, 500); series: Authorize, Baidu, Blizzard, Captcha.net, CNN, Digg, eBay, Megaupload, NIH, Reddit, Skyrock, Slashdot, Wikipedia]
SLIDE 143
Decaptcha main interface
SLIDE 144
Apply design principles
Core design principles
  Randomize length
  Randomize character size
  Wave the captcha
Use anti-recognition as a means of strengthening captcha security
Don’t use a complex charset
  Bad for humans (see our research on this)
  Useless for security
Use collapsing or lines
SLIDE 145
Designing Better Captchas
SLIDE 146
Think Lego again
Decompose into features
Analyze
  features in isolation
  feature interactions
SLIDE 147
Evaluation system
[Diagram components: Tasks generator, Captcha image generator, Tasks, Web frontend, Amazon Mechanical Turk, Test results, Monitoring system, Feedback system, Payment validation]
SLIDE 148
Experiment details
Round   Task                     N possible    N sampled   N tests per sample   Total tests
1       Baseline (“Control”)     1             1           1000                 1000
2       Real world captchas      8             8           1000                 8000
3       Features in isolation    496           496         200                  99200
4       2 feature interactions   60950         60950       5                    304750
5       3 feature interactions   1 303 224     25000       10                   250000
6       4 feature interactions   113 951 684   25000       10                   250000
Total                                                                           912150
SLIDE 149
Some of the features tested
Blurring, Text color, Collapsing, Font, Distortion, Waving, Tilting, Background color, Noise
Line: line shape, number of lines, line size, line coverage, line position, line angle
SLIDE 150
Angle of rotation
[Chart: solving time (s) and accuracy vs. rotation angle (°)]
SLIDE 151
Collapsing
[Chart: solving time (s) and accuracy vs. character gap width]
SLIDE 152
Character size
[Chart: solving time (s) and accuracy vs. character size]
SLIDE 153
Resolution invariant
[Chart: accuracy (50 to 100) vs. captcha length (number of characters); series: <= 1024, > 1024, all captchas]
SLIDE 154
2D interactions
SLIDE 155
Length vs Angle interaction
SLIDE 156
Perception Does Not Match Number
[Chart: %fast, %easy, and %like (5 to 35) for az, 09, AZ, az09, AZ09, azAZ, azAZ09, pretty (HF+), cutest (LF+), guilty (HF-), molest (LF-)]
SLIDE 157
The New Wikipedia
Use digits
Wave the captcha
Use random length (5-7)
Use random size (34-50)
Rotate letters (-25/25)
Add a line for a super secure version
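Illustrative sketch of sampling the randomized parameters listed above before rendering. It assumes the rotation range is in degrees and that each character gets its own size and angle; the slide states neither. `sample_captcha_spec` and the dictionary keys are hypothetical, and rendering, waving, and the optional line are left out.

```python
import random
import string


def sample_captcha_spec(rng):
    """Draw one captcha description: 5-7 digits, each with its own size and angle."""
    length = rng.randint(5, 7)  # random length (5-7)
    return [
        {
            "char": rng.choice(string.digits),  # digits only
            "size": rng.randint(34, 50),        # random size (34-50)
            "angle": rng.uniform(-25, 25),      # rotation, assumed to be degrees
        }
        for _ in range(length)
    ]


# random.SystemRandom draws from the OS entropy source, which suits a security control.
spec = sample_captcha_spec(random.SystemRandom())
answer = "".join(item["char"] for item in spec)
```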
SLIDE 167
End result
Accuracy        Solving time
84.8%           7.8s
89.2%, 82.6%    4.9s, 5.3s
97%, 92.2%      4.9s, 5.2s
Label on slide: confusing
SLIDE 168
How to Break Audio-Captcha
SLIDE 170
Audio Captchas
SLIDE 174
Creating Audio Captcha
[Diagram: Captcha Maker, Voices, Noises, Super secure captcha]
SLIDE 175
Noise intensity (RMS/SNR)
2 9 Micros J Dig A K Authori K J 5 H
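Illustrative sketch of the two noise-intensity measures named on the slide, using the standard definitions of root mean square and signal-to-noise ratio in decibels. It assumes the voice and the noise are available as separate sample lists, which an attacker usually does not have. `rms`, `snr_db`, and the toy samples are hypothetical.

```python
import math


def rms(samples):
    """Root mean square amplitude of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from the RMS of each track."""
    return 20 * math.log10(rms(signal) / rms(noise))


# Assumed toy tracks: a voice ten times louder than the noise.
VOICE = [1.0, -1.0] * 4
NOISE = [0.1, -0.1] * 4
ratio = snr_db(VOICE, NOISE)
```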
SLIDE 176
Sound representation
WAV DFT Cep TFR TCR TDC
SLIDE 183
Solving an audio captcha
C T T A R A F R S 2
SLIDE 184
Dealing with random noise
Statistical learning
Supervised learning
RLS (regularized least squares) classifier
Authorize eBay Recaptcha Authorize Digg
5: J:
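Illustrative sketch of a regularized least squares classifier on a two-class toy problem. It is a generic textbook formulation, solving (X^T X + lambda I) w = X^T y, and not the solver from the talk. The feature vectors, labels, and the names `solve`, `rls_fit`, and `rls_predict` are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def rls_fit(xs, ys, lam):
    """Weights minimizing ||X w - y||^2 + lam * ||w||^2."""
    d = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(gram, rhs)


def rls_predict(w, x):
    """Class +1 or -1 from the sign of the linear score."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Assumed toy features: [bias, feature 1, feature 2]; labels +1 and -1.
XS = [[1.0, 2.0, 0.1], [1.0, 1.8, 0.3], [1.0, 0.2, 1.9], [1.0, 0.1, 2.2]]
YS = [1, 1, -1, -1]
weights = rls_fit(XS, YS, lam=0.1)
```

For a multi-character alphabet, one such classifier per character (one-vs-rest) would be a common extension.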
SLIDE 185
Semantic noise
SLIDE 186
Results
Scheme      Length   Coverage (%)   Digit (%)   Captcha
Authorize   5        100            97          89.2%
Digg        5        100            76          41.4%
eBay        6        85.6           92.5        82.9%
Microsoft   10       80.6           89.6        48.9%
Recaptcha   8        99.9           40.5        1.5%
Yahoo       7        99.1           74.7        45.4%
SLIDE 188
Recaptcha semantic noise
[Figure: audio timeline, time in seconds; labelled segments: 3 7 N 2 1 4 9 N 5 D B]
SLIDE 189
How many captchas do you need?
[Chart: per-captcha precision (%) vs. corpus size (in digits, 10^2 to 10^4); series: Authorize, Digg, Ebay, MSLive, Recaptcha, Yahoo]
SLIDE 190
Video captcha
Interesting direction -> more design space
Good for humans
Good for computers :(
Working on it
See blog post for more information: http://elie.im/blog
SLIDE 191
Apply
Within 3 months
  Make sure you have a strong captcha scheme (use mine if you want)
  Ensure that your site is accessible
Within 6 months
  Log your captcha failure rate and monitor it
  Have a backup captcha scheme in case your scheme is broken
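Illustrative sketch of logging and monitoring the captcha failure rate. The slide gives no alerting rule, so this assumes a sliding window compared against a baseline rate with a tolerance. `FailureRateMonitor` and all its parameters are hypothetical.

```python
from collections import deque


class FailureRateMonitor:
    """Track the failure rate over the most recent captcha attempts."""

    def __init__(self, window, baseline, tolerance):
        self.outcomes = deque(maxlen=window)  # True = solved, False = failed
        self.baseline = baseline              # expected failure rate
        self.tolerance = tolerance            # allowed deviation before alerting

    def record(self, solved):
        self.outcomes.append(bool(solved))

    def failure_rate(self):
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def is_anomalous(self):
        """True when the observed rate drifts away from the baseline."""
        return abs(self.failure_rate() - self.baseline) > self.tolerance


# Assumed scenario: 20% of attempts normally fail, then failures suddenly vanish.
monitor = FailureRateMonitor(window=10, baseline=0.2, tolerance=0.1)
for solved in [True] * 8 + [False] * 2:
    monitor.record(solved)
normal = monitor.is_anomalous()
for solved in [True] * 10:
    monitor.record(solved)
suspicious = monitor.is_anomalous()
```

A sudden shift in either direction is worth a look: a drop may mean a solver is passing the scheme, a spike may mean humans are struggling.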
SLIDE 192
Thank you!
Questions?
Follow me!
Twitter: @elie
Captcha research: http://elie.im/tag/captcha