[PPT] - Side-Channel Attacks and Human Secrets Yossi Oren, BGU PowerPoint Presentation

SLIDE 1

Side-Channel Attacks and Human Secrets

Yossi Oren, BGU
https://iss.oy.ne.ro
@yossioren
CROSSING Conference, TU Darmstadt, Germany, September 2019
Joint work with Anatoly Shusterman, Lachlan Kang, Yosef Meltser, Yarden Haskal, Prateek Mittal and Yuval Yarom

SLIDE 2

https://orenlab.sise.bgu.ac.il

SLIDE 3

Implementation Attacks

Diagram labels: Secure Device, Secret Input, Output, Bad Input, Errors, Radiation
Leakage channels: μ-Arch, EM, Heat, Power, Vibration, Timing

SLIDE 4

Types of Secrets

Crypto Secrets: Short-Term Session Keys, Long-Term Signing Keys, Long-Term Decryption Keys
State Secrets: Addresses of Sensitive Instructions, Inventory of Installed Vulnerable Software, Random Number Generator State
Human Secrets: Identity, Passwords, Browsing History, Images on Screen, Health Sensors

What if the secret is compromised?
How do we protect the secret from attack?

SLIDE 5

SLIDE 6

Target PC Target Adversary Target Browser Sensitive Website

SLIDE 7

Target PC Target Adversary Target Browser Sensitive Website Tor Network

SLIDE 8

Website Fingerprinting

Collect Labeled Network Traces
Extract Features
Train Classifier (classical/deep)
Classify Unknown Network Traces
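The four steps above can be sketched on toy data. This is a minimal illustration, not the method of any paper cited in this talk: the synthetic traces, the two hand-picked features, and the nearest-centroid classifier (`synth_trace`, `extract_features`, `train`, `classify`) are all assumptions made for the example.

```python
import random
from statistics import mean

def synth_trace(site_id, rng, length=50):
    """Hypothetical trace: signed packet sizes (+ outgoing, - incoming)."""
    # Each toy "site" gets its own incoming-packet ratio and typical size.
    p_in = 0.5 + 0.1 * site_id
    size = 400 + 300 * site_id
    return [(-1 if rng.random() < p_in else 1) * rng.gauss(size, 50)
            for _ in range(length)]

def extract_features(trace):
    """Step 2: two manual features, incoming fraction and mean size (KB)."""
    incoming = sum(1 for p in trace if p < 0) / len(trace)
    return (incoming, mean(abs(p) for p in trace) / 1000.0)

def train(labeled_traces):
    """Step 3: nearest-centroid model, one mean feature vector per label."""
    by_label = {}
    for label, trace in labeled_traces:
        by_label.setdefault(label, []).append(extract_features(trace))
    return {label: tuple(mean(col) for col in zip(*feats))
            for label, feats in by_label.items()}

def classify(model, trace):
    """Step 4: pick the label whose centroid is closest to the trace."""
    f = extract_features(trace)
    return min(model,
               key=lambda lbl: sum((a - b) ** 2 for a, b in zip(f, model[lbl])))

rng = random.Random(0)
# Step 1: collect labeled traces (here: 3 toy sites, 20 traces each).
training = [(s, synth_trace(s, rng)) for s in range(3) for _ in range(20)]
model = train(training)
# Classify fresh traces and measure closed-world accuracy.
unknown = [(s, synth_trace(s, rng)) for s in range(3) for _ in range(20)]
accuracy = mean(1.0 if classify(model, t) == s else 0.0 for s, t in unknown)
print(f"closed-world accuracy on toy data: {accuracy:.2f}")
```

A deep classifier would replace `extract_features` and `train` with a model that learns its own features from the raw trace.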

Automated Website Fingerprinting through Deep Learning

Vera Rimmer∗, Davy Preuveneers∗, Marc Juarez§, Tom Van Goethem∗ and Wouter Joosen∗

∗imec-DistriNet, KU Leuven
Email: {firstname.lastname}@cs.kuleuven.be

§imec-COSIC, ESAT, KU Leuven
Email: marc.juarez@esat.kuleuven.be

Abstract: Several studies have shown that the network traffic that is generated by a visit to a website over Tor reveals information specific to the website through the timing and sizes of network packets. By capturing traffic traces between users and their Tor entry guard, a network eavesdropper can leverage this meta-data to reveal which website Tor users are visiting. The success of such attacks heavily depends on the particular set of traffic features that are used to construct the fingerprint. Typically, these features are manually engineered and, as such, any change introduced to the Tor network can render these carefully constructed features ineffective. In this paper, we show that an adversary can automate the feature engineering process, and thus automatically deanonymize Tor traffic by applying our novel method based on deep learning. We collect a dataset comprised of more than three million network traces, which is the largest dataset of web traffic ever used for website fingerprinting, and find that the performance achieved by our deep learning approaches is comparable to known methods which include various research efforts spanning over multiple years. The obtained success rate exceeds 96% for a closed world of 100 websites and 94% for our biggest closed world of 900 classes. In our open world evaluation, the most performant deep learning model is 2% more accurate than the state-of-the-art attack. Furthermore, we show that the implicit features automatically learned by our approach are far more resilient to dynamic changes of web content over time. We conclude that the ability to automatically construct the most relevant traffic features and perform accurate traffic recognition makes our deep learning based approach an efficient, flexible and robust technique for website fingerprinting.

I. INTRODUCTION

(First page of the paper as shown on the slide; the text is cut off in places, marked [...].)

Onion Router (Tor) is a communication tool that pr[...] Internet users. It is an actively developed [...] ensures the privacy of its users' [...] encrypts the contents [...] never the origin and destination of a communication at the same time. Tor's architecture thus prevents ISPs and local network observers from identifying the websites users visit. As a result of previous research on Tor privacy, a serious side-channel of Tor network traffic was revealed that allowed a local adversary to infer which websites were visited by a particular user [14]. The identifying information leaks from the communication's meta-data, more precisely, from the directions and sizes of encrypted network packets. As this side-channel information is often unique for a specific website, it can be leveraged to form a unique fingerprint, thus allowing network eavesdroppers to reveal which website was visited based on the traffic that it generated.

The feasibility of Website Fingerprinting (WF) attacks on Tor was assessed in a series of studies [25], [31], [19], [24], [32]. In the related works, the attack is treated as a classification problem. This problem is solved by, first, manually engineering features of traffic traces and then classifying these features with state-of-practice machine learning algorithms. Proposed approaches have been shown to achieve a classification accuracy of 91-96% correctly recognized websites [3], [24], [13] in a set of 1[...] websites with 1[...] traces per website. Their works show that finding distinctive features is essential for accurate recognition of websites. Moreover, this task can be costly for the adversary as he has to keep up with changes introduced in the network protocol [4], [2], [9]. The WF research community thus far has not investigated the success of an attacker who automates the feature extraction step for classification. This is the key problem that we address in this work.

An essential step of traditional machine learning is feature engineering. Feature engineering is a manual process, based on [...] and expert knowledge, to find a representation of raw [...] characteristics that are most relevant to the [...] engineering [...] proved to be even more [...] machine learning [...]

arXiv:1708.06376v2 [cs.CR] 5 Dec 2017

SLIDE 9

How is WF Evaluated?

Main metric is accuracy
Closed World vs Open World
Base rate is important!
Network-based WF has >90% accuracy
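Why the base rate matters can be shown with a small calculation. This is a sketch under stated assumptions: the trace counts below (100 sites with 100 traces each, plus 5,000 traces of a single "other" class in the open world) are hypothetical numbers chosen for illustration, and the helper names are not taken from any library.

```python
def majority_guess_base_rate(trace_counts):
    """Accuracy of a classifier that always outputs the most common class."""
    return max(trace_counts.values()) / sum(trace_counts.values())

def lift(accuracy, base_rate):
    """How many times better than blind guessing a classifier is."""
    return accuracy / base_rate

# Hypothetical closed world: 100 monitored sites, 100 traces each.
closed_world = {f"site{i}": 100 for i in range(100)}
# Hypothetical open world: the same sites plus one large "other" class.
open_world = dict(closed_world, other=5000)

cw_rate = majority_guess_base_rate(closed_world)
ow_rate = majority_guess_base_rate(open_world)
print(f"closed-world base rate: {cw_rate:.0%}")
print(f"open-world base rate:   {ow_rate:.0%}")

# The same headline accuracy means different things in each setting.
for name, rate in (("closed", cw_rate), ("open", ow_rate)):
    print(f"90% accuracy in the {name} world is {lift(0.90, rate):.1f}x the base rate")
```

With these counts, always answering "other" is already right a third of the time in the open world, so an accuracy figure is only meaningful next to its base rate.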

SLIDE 10

Target PC Adversary Sensitive Website Tor Network

Traffic Moulding Defenses against WF

+

Source: lakeland.co.uk

SLIDE 11

SLIDE 12

Target Target Browser Sensitive Website Tor Network

Architectural Boundary

Adversary

SLIDE 13

SLIDE 14

Memorygrams

Wikipedia Github Oracle

SLIDE 15

Cache-Based WF

Collect Labeled Memorygrams
Extract Features
Train Classifier (classical/deep)
Classify Unknown Memorygrams

>90% accuracy

SLIDE 16

Cache-based vs Net-based WF

Cache beats Net: Resists net countermeasures, Robust to response caching, Works across NICs
Net beats Cache: Can be detected by victim, Depends on hardware config, Lighter attack model

SLIDE 17

Countermeasures

Hiding
Lowering the SNR
Hiding in Time
Hiding in Amplitude
Masking
Secret Invariance
Separation in Time
Separation in Space

SLIDE 18

Hiding in amplitude

Idea: run a dummy prime and probe in the background

What is the effect on WF accuracy?
What is the effect on performance?
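The intuition behind hiding in amplitude can be illustrated with a toy signal model. This is only a sketch, not the countermeasure evaluated in the talk: the square-wave "site activity", the Gaussian measurement noise, and the uniformly random dummy activity are all assumed values chosen for illustration.

```python
import random
from statistics import pvariance

def snr(signal, observed):
    """Signal-to-noise ratio: variance of the signal over variance of the rest."""
    noise = [o - s for s, o in zip(signal, observed)]
    return pvariance(signal) / pvariance(noise)

rng = random.Random(1)

# Assumed secret-dependent activity: a square wave with period 40 samples.
signal = [10.0 * ((t // 20) % 2) for t in range(400)]

# What the attacker measures without a countermeasure: signal plus small noise.
baseline = [s + rng.gauss(0, 1) for s in signal]

# With the countermeasure: random dummy activity is added on top.
masked = [b + rng.uniform(0, 20) for b in baseline]

snr_plain = snr(signal, baseline)
snr_masked = snr(signal, masked)
print(f"SNR without dummy activity: {snr_plain:.2f}")
print(f"SNR with dummy activity:    {snr_masked:.2f}")
```

The dummy activity does not remove the leak; it only makes the secret-dependent part a smaller share of what the attacker observes, which is why both accuracy and performance cost need to be measured.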

SLIDE 19

Effect on Accuracy

ML with Cache Activity Masking

[Bar chart. Accuracy axis, ticks 10 to 100. Groups: Firefox 59 CW, Firefox 59 OW, Tor CW, Tor OW. Series: Without Countermeasure, With Countermeasure, Baseline.]

SLIDE 20

Effect on Performance

[Bar chart. Slowdown axis, ticks 0% to 20%. Bars: perlbench, bzip2, gcc, mcf, gobmk, hmmer, sjeng, libquantum, h264ref, omnetpp, astar, xalancbmk, INT, bwaves, gamess, milc, zeusmp, gromacs, cactusADM, leslie3d, namd, dealII, soplex, povray, calculix, GemsFDTD, tonto, lbm, wrf, sphinx3, FP.]

SLIDE 21

Sustainability

Source: Bilge and Dumitras, CCS 2012

SLIDE 22

Humans

Non-upgradable, difficult to patch
Fight security with all their might
Semi-rational

SLIDE 23

Thank you!

Dataset freely available under CC-BY 4.0 license

Contains:
Thousands of memorygrams in multiple settings
Associated network traces
Deep learning classifiers in Python

https://orenlab.sise.bgu.ac.il/p/RobustFingerprinting

SLIDE 24

JavaScript Attack Results

Closed World – Base Rate 1%

[Chart. Series: Accuracy CNN, Accuracy LSTM, Timer Resolution. Groups: Linux - Chrome64, Win - Chrome64, MacOS - Safari1.1, Linux - Firefox59, Win - Firefox59. Axes: Accuracy (%), ticks 20 to 100; Timer Resolution (msec), ticks 0.1 to 100.]

SLIDE 25

JavaScript Attack Results

Closed World – Base Rate 1%

[Chart. Series: Accuracy CNN, Accuracy LSTM, Timer Resolution. Groups: TorBrowser 7.5, TorBrowser 7.5 (top5). Axes: Accuracy (%), ticks 20 to 100; Timer Resolution (msec), ticks 1 to 100.]

SLIDE 26

JavaScript Attack Results

Open World – Base Rate 33%

[Chart. Series: Accuracy CNN, Accuracy LSTM, Timer Resolution. Groups: Linux - Chrome64, Win - Chrome64, MacOS - Safari1.1, Linux - Firefox59, Win - Firefox59. Axes: Accuracy (%), ticks 20 to 100; Timer Resolution (msec), ticks 0.1 to 100.]

SLIDE 27

JavaScript Attack Results

Open World – Base Rate 33%

[Chart. Series: Accuracy CNN, Accuracy LSTM, Timer Resolution. Groups: TorBrowser 7.5, TorBrowser 7.5 (top5). Axes: Accuracy (%), ticks 20 to 100; Timer Resolution (msec), ticks 1 to 100.]