Hidden Voice Commands



slide-1
SLIDE 1

Hidden Voice Commands

Nicholas Carlini*, Pratyush Mishra*, Tavish Vaidya**, Yuankai Zhang**, Micah Sherr**, Clay Shields**, David Wagner*, Wenchao Zhou** * University of California, Berkeley ** Georgetown University

slide-2
SLIDE 2
slide-3
SLIDE 3

Voice channel opens up new possibilities for attack

slide-4
SLIDE 4

Today: "Okay google, text [premium SMS number]"

slide-5
SLIDE 5

In the future? "Okay google, pay John $100"

slide-6
SLIDE 6
slide-7
SLIDE 7

We make voice commands stealthy.

slide-8
SLIDE 8

We produce audio which is noise to humans, but speech to devices.

slide-9
SLIDE 9

This is an instance of attacks on Machine Learning

slide-10
SLIDE 10

Background

slide-11
SLIDE 11

Background

Machine Learning Algorithm

Text

slide-12
SLIDE 12

Background

ML Algorithm Feature Extraction

Text

slide-13
SLIDE 13

Feature Extraction

slide-14
SLIDE 14

Feature Extraction

slide-15
SLIDE 15

Feature Extraction

slide-16
SLIDE 16

Feature Extraction

MFCC MFCC MFCC

[x0] [x1] [x2]
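
The feature-extraction step on this slide can be illustrated with a minimal sketch: the audio is cut into overlapping frames and each frame is mapped to one MFCC vector. This is a textbook-style MFCC written with NumPy only, not the front end of any particular recognizer; the frame length, hop size, filter count, and coefficient count below are assumed defaults, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II matrix of shape (n_out, n_in)."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return one MFCC vector per frame, shape (n_frames, n_coeffs)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    fb = mel_filterbank(n_filters, n_fft, sample_rate)
    dct = dct_matrix(n_coeffs, n_filters)
    out = np.empty((n_frames, n_coeffs))
    for t in range(n_frames):
        frame = signal[t * hop: t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n_fft)) ** 2   # power spectrum
        mel_energy = fb @ power                          # mel-band energies
        out[t] = dct @ np.log(mel_energy + 1e-10)        # log, then DCT
    return out
```

Calling `mfcc(audio)` on one second of 16 kHz audio yields one row per frame, corresponding to [x0], [x1], [x2], and so on. Because each stage discards information (phase, fine spectral detail, higher DCT coefficients), many different waveforms map to the same feature vectors.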

slide-17
SLIDE 17

ML Algorithm Feature Extraction

Text

slide-18
SLIDE 18
slide-19
SLIDE 19

First Attack: White-Box

Assume complete system knowledge (model, parameters, etc.)

slide-20
SLIDE 20

Recognition

ML Algorithm Feature Extraction

Text

slide-21
SLIDE 21

Attack

ML Algorithm Feature Extraction

Text

slide-22
SLIDE 22

Attack

ML Algorithm Feature Extraction

Text

slide-23
SLIDE 23

Attack

ML Algorithm Feature Extraction

Text

slide-24
SLIDE 24

Inverting Feature Extraction

[x0] [x1] [x2] MFCC⁻¹ MFCC⁻¹ MFCC⁻¹

slide-25
SLIDE 25

Inverting Feature Extraction

[x0] [x1] [x2] MFCC⁻¹ MFCC⁻¹ MFCC⁻¹

slide-26
SLIDE 26

Inverting Feature Extraction

[x0] MFCC⁻¹

slide-27
SLIDE 27

Inverting Feature Extraction

[x0] [x1] MFCC⁻¹ MFCC⁻¹

slide-28
SLIDE 28

Inverting Feature Extraction

[x0] [x1] MFCC⁻¹ MFCC⁻¹

slide-29
SLIDE 29

Inverting Feature Extraction

[x0] [x1] [x2] MFCC⁻¹ MFCC⁻¹ MFCC⁻¹

slide-30
SLIDE 30

Inverting Feature Extraction

[x0] [x1] [x2] MFCC⁻¹ MFCC⁻¹ MFCC⁻¹
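
One way to picture inverting feature extraction for a single frame is to undo each stage with a pseudo-inverse and fill in the lost phase at random. The sketch below does exactly that. It is an illustrative assumption, not the procedure used in the talk; the function names are hypothetical, and the filterbank in the usage lines is a random stand-in rather than a real mel filterbank.

```python
import numpy as np

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II matrix of shape (n_out, n_in)."""
    n = np.arange(n_in)
    k = np.arange(n_out)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def invert_mfcc_frame(coeffs, fb, dct, n_fft, rng):
    """Approximately map one MFCC vector back to a frame of audio samples.

    coeffs: MFCC vector, shape (n_coeffs,)
    fb:     filterbank matrix, shape (n_filters, n_fft // 2 + 1)
    dct:    DCT matrix, shape (n_coeffs, n_filters)
    """
    # Undo the truncated DCT with the least-norm solution.
    log_mel = np.linalg.pinv(dct) @ coeffs
    mel_energy = np.exp(log_mel)
    # Undo the filterbank; clip because power cannot be negative.
    power = np.maximum(np.linalg.pinv(fb) @ mel_energy, 0.0)
    # Phase is not stored in MFCCs, so any phase is equally valid here.
    phase = rng.uniform(0.0, 2.0 * np.pi, size=power.shape)
    spectrum = np.sqrt(power) * np.exp(1j * phase)
    return np.fft.irfft(spectrum, n_fft)

# Toy usage with stand-in matrices (assumed values for illustration only).
rng = np.random.default_rng(0)
n_fft, n_filters, n_coeffs = 64, 8, 5
toy_fb = rng.uniform(0.0, 1.0, size=(n_filters, n_fft // 2 + 1))
toy_dct = dct_matrix(n_coeffs, n_filters)
toy_power = rng.uniform(0.1, 1.0, size=n_fft // 2 + 1)
target = toy_dct @ np.log(toy_fb @ toy_power)
frame = invert_mfcc_frame(target, toy_fb, toy_dct, n_fft, rng)
print(frame.shape)  # (64,)
```

The result is only an approximate inverse: clipping and the arbitrary phase mean the reconstructed frame need not sound like the original speech. A per-frame inverse like this also ignores that consecutive frames overlap and therefore constrain each other.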

slide-31
SLIDE 31

Actually not that easy

slide-32
SLIDE 32

Playing attacks over-the-air

  1. Create a model of the physical channel
  2. Use model to predict effect of over-the-air
  3. Validate model by playing potential obfuscated commands during generation

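
The first two steps above can be sketched under a strong simplifying assumption: that the speaker, room, and microphone together behave like a short linear FIR filter. The slides do not say what form the channel model takes, so the filter form, the tap count, and the function names here are hypothetical.

```python
import numpy as np

def estimate_fir_channel(played, recorded, n_taps):
    """Least-squares fit of an FIR filter h so that recorded ~ played * h."""
    n = len(played)
    # Column j holds the played signal delayed by j samples.
    X = np.zeros((n, n_taps))
    for j in range(n_taps):
        X[j:, j] = played[: n - j]
    h, *_ = np.linalg.lstsq(X, recorded[:n], rcond=None)
    return h

def predict_over_the_air(candidate, h):
    """Predict what the device would hear if `candidate` were played."""
    return np.convolve(candidate, h)[: len(candidate)]

# Toy usage: a made-up channel stands in for a real play-and-record pair.
rng = np.random.default_rng(0)
true_h = np.array([0.8, 0.0, 0.3, -0.1])
played = rng.standard_normal(2000)
recorded = np.convolve(played, true_h)[: len(played)]
h_est = estimate_fir_channel(played, recorded, n_taps=4)
predicted = predict_over_the_air(played, h_est)
```

Step 3 would then compare `predicted` against a real recording of the same candidate; a large mismatch suggests the model is too simple, for example because of noise or speaker nonlinearity that a linear filter cannot capture.
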
slide-33
SLIDE 33

Demo

slide-34
SLIDE 34

Demo

slide-35
SLIDE 35

Okay Google, take a picture

slide-36
SLIDE 36

Demo

slide-37
SLIDE 37

Okay Google, text 12345

slide-38
SLIDE 38

Demo

slide-39
SLIDE 39

Okay Google, browse to evil.com

slide-40
SLIDE 40

Not Over-The-Air Demo

slide-41
SLIDE 41

Okay Google, browse to evil.com

slide-42
SLIDE 42
slide-43
SLIDE 43

Limitations

No background noise, in an echo-free room. Assumes complete knowledge of model.

slide-44
SLIDE 44
slide-45
SLIDE 45

Can we make this attack practical?

Can we remove the white-box assumption?

slide-46
SLIDE 46

Yes.

... but at the expense of attack quality.

slide-47
SLIDE 47
slide-48
SLIDE 48

Audio Obfuscator Speech Recognition

Text

Black-Box Attack

slide-49
SLIDE 49

Speech Recognition

Text

Black-Box Attack

MFCC MFCC⁻¹
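
In the black-box setting the recognizer can only be queried, so the obfuscator (MFCC followed by its inverse) has to be tuned by trial. A minimal sketch of such a search loop follows. The callables `obfuscate` and `recognize`, the `coeff_counts` parameter, and the idea that fewer coefficients means heavier obfuscation are all assumptions made for illustration, not details taken from the slides.

```python
from typing import Callable, Iterable, Optional, Tuple

import numpy as np

def search_obfuscation(
    audio: np.ndarray,
    obfuscate: Callable[[np.ndarray, int], np.ndarray],
    recognize: Callable[[np.ndarray], str],
    target: str,
    coeff_counts: Iterable[int],
) -> Optional[Tuple[int, np.ndarray]]:
    """Return the most aggressive setting the recognizer still accepts.

    Settings are tried from fewest to most coefficients (assumed to mean
    most to least obfuscated). Returns None if no setting is recognized.
    """
    for n_coeffs in sorted(coeff_counts):
        candidate = obfuscate(audio, n_coeffs)
        # Only the recognizer's text output is used: no model internals.
        if recognize(candidate) == target:
            return n_coeffs, candidate
    return None
```

Here `recognize` would wrap a query to the target speech recognition system, and `obfuscate` would wrap the MFCC and inverse-MFCC round trip. Whether a human can still understand the winning candidate has to be checked separately, since the recognizer says nothing about that.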

slide-50
SLIDE 50

Evaluation

slide-51
SLIDE 51

Demo

slide-52
SLIDE 52
slide-53
SLIDE 53
slide-54
SLIDE 54
slide-55
SLIDE 55
slide-56
SLIDE 56

White-Box

  • Attack on open system
  • Commands heavily obfuscated
  • Works when played over-the-air
  • Doesn't tolerate background noise

Black-Box

  • Practical real-world attack
  • Somewhat possible to recognize
  • Works when played over-the-air
  • Background noise and echo okay

slide-57
SLIDE 57
slide-58
SLIDE 58

Defenses?

Notify the user that an action was taken. Challenge the user to perform an action. Detect and prevent the malicious commands.

slide-59
SLIDE 59

Detect and Prevent

Successfully trained a simple machine learning classifier to learn the difference between attack commands and actual commands.
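
As a rough illustration of what a simple classifier for this task could look like, the sketch below trains logistic regression with plain gradient descent. The slide does not name the classifier or its features, so the model choice is an assumption, and the two-dimensional features are synthetic placeholders rather than measurements from real audio.

```python
import numpy as np

def train_logistic_regression(X, y, lr=0.1, n_steps=500):
    """Fit weights w and bias b by gradient descent on the logistic loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(attack)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def predict(X, w, b):
    """Return 1 for 'attack command', 0 for 'actual command'."""
    return (X @ w + b > 0).astype(int)

# Synthetic placeholder data: two well-separated clusters (not real audio).
rng = np.random.default_rng(0)
normal = rng.normal(loc=-2.0, scale=0.5, size=(100, 2))
attack = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
X = np.vstack([normal, attack])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = train_logistic_regression(X, y)
accuracy = np.mean(predict(X, w, b) == y)
```

In practice the feature vectors would be computed from recorded audio, and accuracy would need to be measured on held-out samples rather than on the training set as in this toy example.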

slide-60
SLIDE 60
slide-61
SLIDE 61

Conclusion

Voice: new paradigm for human-device interaction. This brings many new risks. Our hidden voice commands are practical. The impact of these attacks will increase. Future work is needed to construct defenses.

http://hiddenvoicecommands.com/

slide-62
SLIDE 62
slide-63
SLIDE 63
slide-64
SLIDE 64