Authorship Obfuscation
Using Heuristic Search
Master’s Thesis Defence by Janek Bevendorff on 20 June 2018
Supervisors: Prof. Dr. Benno Stein, PD Dr. Andreas Jakoby
Unmasking for short texts
Obfuscation against unmasking
Obfuscation
Koppel and Schler, Authorship verification as a one-class problem, 2004
Same author Different authors
the, Treasure, Island, Dr, Livesey, gentlemen, rest, having, will, begin, the, story, adventures, my, certain, morning
[Plot: accuracy (0.5 to 1.0) over rounds (3 to 21); curves for different authors and same author]
Training Test
Confidence Level  Threshold  Precision  % Classified
Very High         0.9        1.00         6.2
                  0.8        1.00        12.5
                  0.7        1.00        13.8
High              0.6        1.00        18.8
                  0.5        1.00        30.0
Moderate          0.4        0.93        43.8
                  0.3        0.83        55.0
                  0.2        0.68        70.0
Low               0.1        0.82        87.5
                  0.0        0.76       100.0
[Plot: accuracy (0.5 to 1.0) over rounds (3 to 21); curves for different authors and same author]
\[ \mathrm{KLD}(P \parallel Q) = \sum_i P[i] \log_2 \frac{P[i]}{Q[i]} \]
\[ \mathrm{JSD}(P \parallel Q) = \frac{\mathrm{KLD}(P \parallel M) + \mathrm{KLD}(Q \parallel M)}{2}, \qquad M = \frac{P + Q}{2} \]
→ maximize
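The two divergences can be computed directly from their definitions. The following is a minimal sketch, assuming both distributions are given as equal-length lists of probabilities over the same n-gram vocabulary; the function names and the toy values are illustrative and not taken from the thesis.

```python
import math

def kld(p, q):
    """Kullback-Leibler divergence in bits.

    Terms with p[i] == 0 contribute nothing. A zero in q at a position
    where p is positive raises ZeroDivisionError (the divergence is
    unbounded there).
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: mean KLD of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m[i] is positive wherever p[i] or q[i] is, so kld() never divides by zero.
    return (kld(p, m) + kld(q, m)) / 2

# Hypothetical toy distributions over three n-grams.
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(jsd(p, q))  # 0.5
```

Unlike the KLD, the JSD is symmetric and stays finite when an n-gram occurs in only one of the two texts, which is why it is the more convenient quantity to maximize.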
\[ \frac{\partial}{\partial Q[i]} \, P[i] \log_2 \frac{P[i]}{Q[i]} = -\frac{P[i]}{Q[i] \ln 2} \]
\[ R_{\mathrm{KL}}(i) = \frac{P[i]}{Q[i]} \]
→ maximize
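The ranking that follows from this derivative can be sketched in a few lines. The sketch assumes the numerator is the n-gram's relative frequency in the other text and the denominator is its relative frequency in the text being changed; the function name and the frequencies are made up for illustration.

```python
def rank_by_kl_ratio(p, q):
    """Return n-grams sorted by p[g] / q[g], largest ratio first.

    p, q: dicts mapping n-gram -> relative frequency.
    N-grams that do not occur in q are skipped, because the ratio is
    undefined for them and there is nothing to remove from that text.
    """
    ratios = {g: p.get(g, 0.0) / q[g] for g in q if q[g] > 0}
    return sorted(ratios, key=ratios.get, reverse=True)

# Hypothetical frequencies, not data from the thesis.
p = {"the": 0.4, "ly_": 0.4, "gre": 0.2}
q = {"the": 0.4, "ly_": 0.1, "gre": 0.5}
print(rank_by_kl_ratio(p, q))  # ['ly_', 'the', 'gre']
```

N-grams at the front of the list are those where a small frequency change in the second text moves the KLD term the most.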
[Figure: n-gram frequencies of Text 1 and Text 2 (to be obfuscated); n-grams ranked left to right: gre, ly_, par, bor, y_h, hel, eme, ny_, dis, gro]
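An n-gram frequency profile of this kind could be built as follows. This is a sketch under two assumptions read off the labels above: the n-grams are overlapping character trigrams, and the underscore stands for a space. The function name is hypothetical.

```python
from collections import Counter

def char_ngram_freqs(text, n=3):
    """Relative frequencies of overlapping character n-grams.

    Spaces are written as '_' so that n-grams spanning a word boundary
    stay readable. Texts shorter than n yield an empty dict.
    """
    text = text.replace(" ", "_")
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

freqs = char_ngram_freqs("greatly helped")
# Most frequent n-grams first; ties keep their order of first occurrence.
for gram, freq in sorted(freqs.items(), key=lambda item: item[1], reverse=True)[:5]:
    print(gram, round(freq, 3))
```

Computing one such profile per text gives the two distributions that the divergence measures compare.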
[Plot: JS distance (JSΔ, 0.4 to 1.4) against text length in characters (2^7 to 2^14); curves for different authors and same author, with levels ε_0 and ε_0.5 marked]
\[ \mathrm{JS}\Delta = \sqrt{2 \cdot \mathrm{JSD}(P \parallel Q)} \]
Confidence Level  Threshold  Precision  % Classified
Very High         0.9        0.00         2.5
                  0.8        0.00         5.0
                  0.7        0.00         8.7
High              0.6        0.00        17.5
                  0.5        0.00        27.5
Moderate          0.4        0.00        42.5
                  0.3        0.67        66.7
                  0.2        0.50        70.0
Low               0.1        0.42        85.0
                  0.0        0.53       100.0

Confidence Level  Threshold  Precision  % Classified
Very High         0.9        1.00         6.2
                  0.8        1.00        12.5
                  0.7        1.00        13.8
High              0.6        1.00        18.8
                  0.5        1.00        30.0
Moderate          0.4        0.93        43.8
                  0.3        0.83        55.0
                  0.2        0.68        70.0
Low               0.1        0.82        87.5
                  0.0        0.76       100.0
[Figure: search graph rooted at start node s, showing the CLOSED and OPEN node sets and node costs f(n)]
\[ f(n) = g(n) + h(n) \]
\[ h(n) \le h^*(n) \]
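The cost function and the admissibility condition are those of A* search. A minimal, generic A* sketch on a toy graph is given below; the graph, the heuristic values, and the function name are invented for illustration and have nothing to do with the obfuscation search space itself.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """Generic A* search.

    neighbors(n) returns (successor, step_cost) pairs; h(n) estimates the
    remaining cost from n to the goal. Returns (cost, path) or None.
    Because closed nodes are never re-opened, the result is guaranteed
    optimal only if h is consistent, which is stronger than admissible.
    """
    open_heap = [(h(start), 0, start, [start])]  # entries: (f, g, node, path)
    closed = set()
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return g, path
        if node in closed:
            continue
        closed.add(node)
        for succ, cost in neighbors(node):
            if succ not in closed:
                g_succ = g + cost
                heapq.heappush(
                    open_heap, (g_succ + h(succ), g_succ, succ, path + [succ])
                )
    return None

# Hypothetical toy graph; the heuristic never overestimates the true
# remaining cost and is also consistent on this graph.
graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 1), ("t", 5)], "b": [("t", 1)], "t": []}
estimate = {"s": 2, "a": 2, "b": 1, "t": 0}
print(a_star("s", "t", graph.__getitem__, estimate.__getitem__))
# (3, ['s', 'a', 'b', 't'])
```

The heap plays the role of the OPEN list and the set that of the CLOSED list; the node with the smallest estimated total cost is always expanded next.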
\[ h_{\mathrm{prior}}(n) = \varepsilon - \mathrm{JS}\Delta_n \]
\[ h_{\mathrm{norm}}(n) = \frac{g(n)}{\mathrm{JS}\Delta_n - \mathrm{JS}\Delta_0} \]
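The two heuristic terms can be evaluated with a few lines of arithmetic. In the sketch below, the first function is the JS distance still missing to reach the target level and the second is the path cost spent so far per unit of JS distance gained. How the two are combined is not stated here, so the product in `estimated_remaining_cost` is an assumption made purely for illustration (it extrapolates the cost rate observed so far over the remaining distance); all names and numbers are hypothetical.

```python
def h_prior(js_n, eps):
    """JS distance still missing to reach the target level eps."""
    return eps - js_n

def h_norm(g_n, js_n, js_0):
    """Path cost spent so far per unit of JS distance gained since the start."""
    return g_n / (js_n - js_0)

def estimated_remaining_cost(g_n, js_n, js_0, eps):
    """ASSUMPTION: the product of both terms, used here for illustration only.

    Returns 0.0 while no distance has been gained yet, because there is
    no cost rate to extrapolate from (also an illustrative choice).
    """
    if js_n <= js_0:
        return 0.0
    return h_norm(g_n, js_n, js_0) * h_prior(js_n, eps)

# Hypothetical values: start at 0.6, currently at 0.8, target level 1.0,
# with a path cost of 10 spent so far.
print(estimated_remaining_cost(g_n=10, js_n=0.8, js_0=0.6, eps=1.0))
```

With these numbers the estimate is approximately 10: half of the required distance has been covered at a cost of 10, so the same cost is extrapolated for the second half.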
Linear Gain
[Plot: h(n), g(n), and JSΔ, with ε marked]
Sublinear Gain
[Plot: h(n), g(n), and JSΔ, with ε marked]
[Plot: h(n), g(n), JSΔ, and stepwise JSΔ over operations (100 to 400), with ε marked]
n-gram removal
character flip
character map (e.g. "The End." → "The End!")
synonym
Netspeak (e.g. "author" → "author of")
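As an illustration of one operator, a character flip can be sketched as a swap of two adjacent characters. That reading is an assumption, inferred from fragments such as "accomplishedb y" and "aehc" in the obfuscated sample text; the function name and signature are hypothetical.

```python
def character_flip(text, i):
    """Swap the characters at positions i and i + 1."""
    if not 0 <= i < len(text) - 1:
        raise IndexError("i must address a character that has a right neighbour")
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]

print(character_flip("accomplished by", 12))            # accomplishedb y
print(character_flip(character_flip("each", 0), 2))     # aehc
```

Each call changes exactly the n-grams that overlap the two swapped positions, so a single flip shifts several n-gram frequencies at once while leaving the word recognisable.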
[Plot: h(n), g(n), JSΔ, and stepwise JSΔ over operations (20 to 120), with ε marked]
With a furtive glance around him, he clapped the other half of the clay sphere over the filled hemisphere and then stood up. The patients lined up at the door, waiting for the walk back across the green hills to the main hospital. The attendants made a quick count and then unlocked the door. The group shuffled out into the warm, afternoon sunlight and the door closed behind
room and picked up her chart book of patient progress. Moving slowly down the line of benches, she made short, precise notes on the day’s work accomplished by each
A Filbert Is a Nut by Rick Raphael
With a furtive glance around him, he clapped the other half of the clay sphere over the filled hemisphere and then stood up. The patients lined up at the door, waiting for the walk back across the site hills to the main hospital. The attendants made a quick investigation and then unlocked the door. The group shuffled out into the warm, daylight sunlight and the door closed behind them. Miss Abercrombie gazed around the cluttered room and picked up her chart forward
bens, she made parcel, precise notes on the day’s work accomplishedb y aehc patient. [...]’
A Filbert Is a Nut by Rick Raphael
Thank you for your attention
Image Credits: Min An, Pexels.com