Synesthesia The problem Many colleagues appear blandly disengaged - - PowerPoint PPT Presentation



SLIDE 1

Synesthesia

SLIDE 2

The problem

  • Many colleagues appear blandly disengaged during crucial video-conference calls

SLIDE 3

The challenge

  • Telling what they are actually doing…

VS.

SLIDE 4

Idea: “hear” the screen

[Diagram labels]
Attacker (you)
Victim (evil colleague appearing aloof and disengaged)
Voice over IP
?

SLIDE 5

Acoustic noise?

SLIDE 6

Acoustic leakage from screens is dangerous

  • Microphones are ubiquitous
  • Audio is commonly shared and stored
  • …conveying on-screen content?

[Figure label: WWW]

Acoustic leakage is highly available compared to electromagnetic leakage [Eck’85][Kuh’04]

SLIDE 7

Detecting leakage: “see a Zebra”

Pixel color transitions (Zebra)

66 stripes × 60 refreshes per second ≈ 4k black/white transitions per second → 4 kHz

[Spectrogram axes: Frequency, Time; the 4 kHz component is marked “!!”]
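The arithmetic on this slide can be checked with a short sketch. The function names are hypothetical and the 60 Hz default simply mirrors the slide; this is an illustration of the stripe-count calculation, not the authors' tooling.

```python
def zebra_tone_hz(stripes: int, refresh_hz: float = 60.0) -> float:
    """Transitions per second when `stripes` stripes are redrawn `refresh_hz` times a second."""
    return stripes * refresh_hz


def stripes_for_tone(target_hz: float, refresh_hz: float = 60.0) -> int:
    """Nearest whole number of stripes whose transition rate approximates `target_hz`."""
    return round(target_hz / refresh_hz)


print(zebra_tone_hz(66))       # 66 * 60 = 3960.0, i.e. roughly 4 kHz
print(stripes_for_tone(4000))  # 4000 / 60 = 66.67, which rounds to 67
```

Changing the stripe count moves the expected tone proportionally, which is what a "changing stripe width" experiment would exercise.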

SLIDE 8

Changing stripe width

[Spectrogram axes: Frequency, Time]

SLIDE 9

Leakage pattern consistent across makes/models

[Figure panels: 920NW, ZR30w, U3011t, 170S4]

SLIDE 10

Leakage pattern consistent across many makes/models

SLIDE 11

Whence acoustic leakage?

SLIDE 12

Whence acoustic leakage?

[Figure labels: power supply, control board, display]

  • vs. acoustic leakage of CPU computation [GST’14]

SLIDE 13

So far: lab conditions

SLIDE 14

[Diagram labels]
Attacker (you)
Victim (evil colleague appearing aloof and disengaged)
Voice over IP
Webcam microphone (close to screen)
Victim’s environment

  • Record using commodity equipment?
  • Codec-encoded audio?

SLIDE 15

VoIP

Codec-encoded VoIP (Google Hangouts)

SLIDE 16

Leakage still detectable in cloud-archived recordings!

Recordings uploaded to the cloud

SLIDE 17

Smartphone

SLIDE 18

Attack at a distance (using a parabolic dish)

SLIDE 19

What can an attacker do?

  • Activity/website distinguishing
  • On-screen keyboard snooping
  • Text extraction

[Figure labels: “g”, “abcdefg”]

SLIDE 20

How?

  • 1. Denoising
  • 2. ML-based attacks
    – Website distinguishing
    – On-screen keyboard snooping
    – Text extraction
SLIDE 21

Observation (1): amplitude modulation

Pixel-line intensity is modulated on a 32 kHz carrier

[Plot axes: time, amplitude]
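A minimal sketch of what amplitude modulation on a carrier means for the attacker, assuming a 192 kHz sampling rate and a simple rectify-and-smooth envelope detector. Both are illustrative choices rather than details from the slides; only the 32 kHz carrier comes from the slide.

```python
import math


def am_modulate(intensity, carrier_hz, sample_rate):
    """Multiply a non-negative intensity sequence by a sinusoidal carrier."""
    return [a * math.sin(2 * math.pi * carrier_hz * n / sample_rate)
            for n, a in enumerate(intensity)]


def envelope(signal, window):
    """Recover the modulating amplitude: rectify, then moving-average over `window` samples."""
    rectified = [abs(x) for x in signal]
    out, acc = [], 0.0
    for i, x in enumerate(rectified):
        acc += x
        if i >= window:
            acc -= rectified[i - window]  # drop the sample that left the window
        out.append(acc / min(i + 1, window))
    return out
```

For example, with `sample_rate=192_000` and `carrier_hz=32_000` (six samples per carrier cycle), a bright stretch of intensity 1.0 followed by a dark stretch of intensity 0.2 yields an envelope that is five times larger over the bright stretch when `window=6`.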

SLIDE 22

Observation (2): signal redundancy

  • Screen refreshes every ~1/60 seconds
    → the signal is extremely redundant!
  • Chop and average?

[Figure: trace chopped at 0, 1/60, 2/60, 3/60, 4/60 sec; average of the chops: high SNR!]
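The "chop and average" idea can be sketched as follows, assuming the refresh period is a known, exact number of samples (the idealized case). Averaging N chops with independent noise reduces the noise standard deviation by roughly a factor of √N. The function name is hypothetical.

```python
def chop_and_average(trace, period):
    """Split `trace` into consecutive chops of `period` samples and average them pointwise.

    A trailing partial chop is ignored.
    """
    n_chops = len(trace) // period
    if n_chops == 0:
        raise ValueError("trace is shorter than one period")
    return [sum(trace[k * period + i] for k in range(n_chops)) / n_chops
            for i in range(period)]


print(chop_and_average([1, 2, 3, 3, 4, 5], period=3))  # [2.0, 3.0, 4.0]
```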

SLIDE 23

Leveraging redundancy: challenges

  • Drift
    [Figure: chop boundaries at 0, 1/60+𝜗, 2/60+2𝜗, 3/60+3𝜗, 4/60+4𝜗 sec]
  • Jitter (+anomalous refresh cycles)
    [Figure: chop boundaries at 0, 1/60+𝜗, ??, ??+1/60+𝜗 sec]
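To see why drift defeats fixed-period chopping, here is a small worked calculation in units of samples. The figures are assumptions for illustration: at a 192 kHz sampling rate a 60 Hz refresh lasts 3200 samples, and the per-refresh drift 𝜗 is taken to be half a sample.

```python
import math


def chop_start(k, nominal_period, drift_per_refresh):
    """Start of the k-th refresh when every period is nominal_period + drift_per_refresh."""
    return k * (nominal_period + drift_per_refresh)


def refreshes_until_misaligned(tolerance, drift_per_refresh):
    """Smallest k for which the accumulated drift k * |drift| exceeds `tolerance`."""
    return math.floor(tolerance / abs(drift_per_refresh)) + 1


print(chop_start(3, 3200, 0.5))             # 9601.5 rather than the nominal 9600
print(refreshes_until_misaligned(10, 0.5))  # 21 refreshes, about a third of a second
```

Even a sub-sample drift therefore smears a naive average within a fraction of a second, which is why the chop boundaries need to be tracked.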

SLIDE 24

Leveraging redundancy: our approach

  • Naïve approaches do not work
  • High-level idea:
    – Choose a “master” chop that correlates well with its consecutive one
    – Extract chops chronologically, starting with the master
    – Automatically account for minor drift on-the-fly using a correlation test
    – If correlation becomes very low (indicating jitter encountered), re-synchronize with the master chop via correlation analysis

[Figure panels: Our approach, Ground truth]
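The high-level idea can be sketched roughly as follows. This is a simplified illustration and not the authors' implementation: the search radius, the correlation threshold, the forward-only walk from the master chop, and all function names are assumptions made for this example.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def best_offset(trace, master, start, radius):
    """Among chops starting within `radius` of `start`, find the best match to `master`."""
    period = len(master)
    best = (-2.0, start)
    for s in range(max(0, start - radius), start + radius + 1):
        if s + period > len(trace):
            break
        c = pearson(trace[s:s + period], master)
        if c > best[0]:
            best = (c, s)
    return best  # (correlation, start index)


def aligned_average(trace, period, drift_radius=2, min_corr=0.5):
    """Average refresh-period chops while tracking drift and recovering from jitter."""
    n = len(trace) // period
    if n < 2:
        raise ValueError("need at least two periods")
    # Master chop: the nominal chop that correlates best with its successor.
    scores = [pearson(trace[k * period:(k + 1) * period],
                      trace[(k + 1) * period:(k + 2) * period])
              for k in range(n - 1)]
    m = max(range(n - 1), key=scores.__getitem__)
    master = trace[m * period:(m + 1) * period]
    chops, start = [master], (m + 1) * period
    while start + period <= len(trace):
        # Minor drift: look only a few samples around the expected boundary.
        corr, s = best_offset(trace, master, start, drift_radius)
        if corr < min_corr:
            # Likely jitter: re-synchronize with the master over a wide window.
            corr, s = best_offset(trace, master, start, period // 2)
        if corr >= min_corr:
            chops.append(trace[s:s + period])
        start = s + period
    return [sum(c[i] for c in chops) / len(chops) for i in range(period)]
```

In this sketch the re-synchronization window is half a period on either side, so it only recovers from anomalous gaps shorter than that; chops preceding the master are simply skipped.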

SLIDE 25

How?

  • 1. Denoising
  • 2. ML-based attacks
    – Website distinguishing
    – On-screen keyboard snooping
    – Text extraction
SLIDE 26

ML-based attacker: website distinguishing

Off-line phase:
attacker’s screen (display different websites, simulate attack) → denoise → training traces (with known websites) → neural network training

Attack time:
victim’s screen → victim’s trace → denoise → inference → victim’s website
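The two phases can be illustrated with a deliberately simple stand-in. The slides train a neural network; the sketch below swaps in a nearest-centroid classifier purely to show the off-line/attack-time split, and it assumes traces are already denoised and of equal length. All names are hypothetical.

```python
def train_centroids(traces_by_site):
    """Off-line phase: average the training traces recorded for each known website."""
    return {site: [sum(col) / len(col) for col in zip(*traces)]
            for site, traces in traces_by_site.items()}


def infer_site(centroids, trace):
    """Attack time: return the website whose centroid is closest to the victim's trace."""
    def sq_dist(centroid):
        return sum((x - y) ** 2 for x, y in zip(centroid, trace))
    return min(centroids, key=lambda site: sq_dist(centroids[site]))


# Toy traces with made-up values, standing in for denoised recordings.
model = train_centroids({
    "site-a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "site-b": [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]],
})
print(infer_site(model, [0.8, 0.2, 0.0]))  # site-a
```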
SLIDE 27

Website distinguishing: results

(The “attacker” column held icons that did not survive extraction.)

accuracy   websites                      traces per website
97%        97                            100 × 5 s
90%        97                            100 × 5 s
91%        97                            100 × 5 s
99.4%      10 sites + Hangouts window    300 × 6 s

Last row: video-chat window vs. surfing the Web

SLIDE 28

How?

  • 1. Denoising
  • 2. ML-based attacks
    – Website distinguishing
    – On-screen keyboard snooping
    – Text extraction
SLIDE 29

On-screen keyboards

Considered “safe” against the audio-recording attacks that work on physical keyboards

[AA’04, BWY’06, VP’09, HS’12, BCV’08, HS’15, ZZT’09, CCLT’17]

Sometimes required for security, e.g., by online banking websites

SLIDE 30

victim’s screen → victim’s trace → denoise → inference → key (in place of “victim’s website”)

SLIDE 31

Results: keyboard snooping 1

attacker screen layout key accuracy key top-3 accuracy 40.8% 71.9% 96.4% 99.6%

Extract whole words with high accuracy?
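For reference, the “key accuracy” and “key top-3 accuracy” columns correspond to the standard top-k metric, which can be computed as in this sketch. The data layout (one score dictionary per key press) is an assumption made for illustration.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring candidates.

    `scores` holds one dict per sample, mapping each candidate key to a classifier score.
    """
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(row, key=row.get, reverse=True)[:k]
        hits += label in ranked
    return hits / len(labels)


# Made-up scores for three key presses.
scores = [
    {"a": 0.6, "b": 0.3, "c": 0.1},
    {"a": 0.5, "b": 0.1, "c": 0.4},
    {"a": 0.2, "b": 0.3, "c": 0.5},
]
labels = ["a", "c", "a"]
print(top_k_accuracy(scores, labels, 1))  # one of three correct
print(top_k_accuracy(scores, labels, 2))  # two of three correct
```

Top-3 accuracy is never lower than plain (top-1) accuracy, since the top-3 set always contains the top-1 guess.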

SLIDE 32

Results: keyboard snooping 2 (grouping horizontally-aligned keys)

attacker screen layout word contained in small “prediction set” 94% 98%
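One way to picture a “prediction set” built from grouped keys: if the classifier only reports which keyboard row each press fell in, the candidate words are those whose letters follow the same row sequence. The sketch assumes a plain QWERTY letter layout and a tiny made-up word list; neither comes from the slides.

```python
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]  # assumed QWERTY letter rows
KEY_TO_ROW = {ch: r for r, row in enumerate(ROWS) for ch in row}


def row_signature(word):
    """Sequence of keyboard-row indices visited when typing `word` (letters only)."""
    return tuple(KEY_TO_ROW[ch] for ch in word.lower())


def prediction_set(observed_rows, dictionary):
    """All dictionary words whose row sequence matches the observed one."""
    target = tuple(observed_rows)
    return [w for w in dictionary if row_signature(w) == target]


words = ["cat", "dog", "vat", "can", "tap"]  # hypothetical dictionary
print(prediction_set((2, 1, 0), words))      # ['cat', 'vat']
```

The attack succeeds in the slide's sense when the typed word is contained in such a set and the set is small.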

SLIDE 33

How?

  • 1. Denoising
  • 2. ML-based attacks
    – Website distinguishing
    – On-screen keyboard snooping
    – Text extraction
SLIDE 34

ML-based attacker: text extraction

victim’s screen → victim’s trace → denoise → inference → ??? (in place of “victim’s website”)

“Open-world” domain: a classifier cannot be applied directly

SLIDE 35

Extracting on-screen text

  • Idea:
    1. Train a separate classifier for each character location
       → Up to 98% per-character accuracy
    2. Error correction exploiting natural-language redundancy
       → Exact word extracted with probability >1/2

Some limitations: large monospace font, known layout…
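A minimal sketch of how the two steps could fit together, assuming each character location yields a probability per candidate character and that error correction is a dictionary lookup by maximum likelihood. The slides do not specify the correction method, so this is only one plausible instance, with made-up probabilities and a hypothetical word list.

```python
import math


def correct_word(char_probs, dictionary, floor=1e-6):
    """Pick the dictionary word that best explains the per-location character probabilities.

    `char_probs` holds one dict per character location, mapping character -> probability.
    Characters a classifier never proposed get the small probability `floor`.
    """
    def log_likelihood(word):
        return sum(math.log(max(p.get(ch, 0.0), floor))
                   for p, ch in zip(char_probs, word))

    candidates = [w for w in dictionary if len(w) == len(char_probs)]
    return max(candidates, key=log_likelihood) if candidates else None


probs = [{"h": 0.9, "b": 0.1}, {"a": 0.4, "e": 0.6}, {"t": 0.8, "l": 0.2}]
# Taking the most likely character at each location gives "het", which is not a word.
print(correct_word(probs, ["hat", "bet", "hen", "heat"]))  # hat
```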

SLIDE 36

Cross-screen train-test

Off-line phase (attacker’s screen):
attacker’s screen (display different websites, simulate attack) → denoise → training traces (with known websites) → neural network training

Attack time (victim’s screen):
victim’s screen → victim’s trace → denoise → inference → victim’s website

Can we train on one screen and attack another screen?

SLIDE 37

Are traces from different screens similar?

[Plot: traces labelled S2, S1, S1; axes: T (sec), amplitude]

SLIDE 38

Learning from multiple screens

  • Challenge: overfitting to training screen
  • Idea: learn from multiple screens

Trend: more training screens → higher accuracy
Up to 94% accuracy

Distinguishing between 25 websites, training on up to 10 screens

SLIDE 39

  • Microphones are ubiquitous
  • Audio is commonly shared and stored
  • It conveys on-screen content

cs.tau.ac.il/~tromer/synesthesia

A thousand words are worth a picture