VentriLock: Exploring voice-based authentication systems


slide-1
SLIDE 1

VentriLock: Exploring voice-based authentication systems

Chaouki KASMI & José LOPES ESTEVES ANSSI, FRANCE

Hack In Paris – 06/2017

slide-2
SLIDE 2

Chaouki Kasmi & José Lopes Esteves

WHO WE ARE

Chaouki Kasmi and José Lopes Esteves

  • ANSSI-FNISA / Wireless Security Lab
  • Electromagnetic threats on information systems
  • RF communications security
  • Embedded systems
  • Signal processing

slide-3
SLIDE 3

AGENDA

  • Context: Voice command interpreters
  • Voice as biometrics
  • From brain to computer’s model
  • Testing voice authentication engines
  • Conclusion and future work

slide-4
SLIDE 4

Definitions and security analysis

Voice Command Interpreters

slide-5
SLIDE 5

VOICE COMMAND INTERPRETERS

[Figure panels: Where? Who? What? APIs]

slide-6
SLIDE 6

THREAT OF UNAUTHORIZED USE

slide-7
SLIDE 7

THREAT OF UNAUTHORIZED USE

  • Silent voice command injection with a radio signal by front-door coupling on headphone cables [5]

[Figure labels: Target, Tx antenna, Headphone cable]

slide-8
SLIDE 8

THREAT OF UNAUTHORIZED USE

  • Silent voice command injection with a radio signal by back-door coupling [6]

slide-9
SLIDE 9

THREAT OF UNAUTHORIZED USE

  • Malicious application playing voice commands through the phone’s speaker [1]
  • Mangled commands understandable by the system but not the user [3]
  • Same technique, embedded in multimedia files [2,4]

slide-10
SLIDE 10

SECURITY IMPACTS

  • Tracking
  • Eavesdropping
  • Cost abuse
  • Reputation / Phishing
  • Malicious app trigger/payload delivery
  • Advanced compromising
  • Unauthorized use of applications / services / smart devices…

slide-11
SLIDE 11

SECURITY MEASURES

  • Personalize keyword
  • Carefully choose available commands (esp. pre-auth)
  • Limit critical commands
  • Provide finer-grain settings to user
  • Enable feedback (sound, vibration…)
  • Voice recognition

Image credit: flickr.com/photos/hikingartist

slide-12
SLIDE 12

Using voice for authentication

Voice as biometrics

slide-13
SLIDE 13

BIOMETRICS

  • "automated recognition of individuals based on their biological and behavioural characteristics"
  • "biological and behavioural characteristic of an individual from which distinguishing, repeatable biometric features can be extracted for the purpose of biometric recognition"

Sources: biometricsinstitute.org; ISO/IEC 2382-37, Information technology - Vocabulary - Part 37: Biometrics

slide-14
SLIDE 14

BIOMETRICS

  • Physical
    ➢ Head: face, iris, ear, etc.
    ➢ Hand: fingerprint, palmprint, hand geometry, vein pattern, etc.
    ➢ Others: DNA, etc.
  • Behavioral
    ➢ Voice
    ➢ Others: writing, typing, gait, etc.

slide-15
SLIDE 15

BIOMETRICS

  • Enrollment: Acquisition → Signal processing → Feature extraction → Template / Model
  • Application: Acquisition → Signal processing → Feature extraction → Comparison / Decision

Image credit: www.silicon.co.uk

slide-16
SLIDE 16

VOICE BIOMETRICS

  • Applications:
    ➢ Speaker verification/authentication
    ➢ Speaker identification…
  • Two main cases:
    ➢ Text independent
    ➢ Text dependent

Image credit: http://www.busim.ee.boun.edu.tr

slide-18
SLIDE 18

VOICE BIOMETRICS

  • Enrollment
    ➢ 3 to 5 repetitions of the keyword
  • Model derivation
    ➢ The more samples, the more reliable
  • Speaker verification
    ➢ A comparison metric and a threshold

Pipeline: Acquisition (microphone) → Signal processing (pre-emphasis, filtering…) → Feature extraction (LPC, MFCC, LPCC, DWT, WPD, PLP…) → Comparison / Decision (GMM, RNN…)
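
The enrollment, model derivation and threshold-based verification steps can be sketched as follows. This is a toy illustration only: the mean-vector model, the cosine similarity metric, the threshold value and the 3-dimensional feature vectors are all assumptions made for the example, not what any of the tested products use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def enroll(samples):
    """Derive a toy model: the element-wise mean of the enrollment vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def verify(model, candidate, threshold=0.95):
    """Accept the candidate if its similarity to the model reaches the threshold."""
    return cosine_similarity(model, candidate) >= threshold

# Hypothetical feature vectors from 3 repetitions of the keyword.
enrollment = [[1.0, 2.0, 3.0], [1.1, 1.9, 3.2], [0.9, 2.1, 2.9]]
model = enroll(enrollment)

print(verify(model, [1.0, 2.0, 3.1]))  # vector close to the enrolled ones
print(verify(model, [3.0, 0.5, 0.2]))  # clearly different vector
```

Raising the threshold rejects more impostors but also more legitimate attempts, which is the accuracy vs. usability trade-off.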

slide-19
SLIDE 19

VOICE BIOMETRICS

  • Pros:
    ➢ Acquisition device (microphone) widespread and low cost
    ➢ Remote operation possible and natively supported
  • Cons:
    ➢ Voice changes over time (accuracy vs. usability)
    ➢ Malicious acquisition very easy
    ➢ Generation and modification tools available
    ➢ Submission of test vectors affordable (speaker)
    ➢ Liveness detection not trivial

slide-20
SLIDE 20

VOICE BIOMETRICS

  • Reliability issues:
    ➢ “At the present time, there is no scientific process that enables one to uniquely characterize a person’s voice” (2003) [10]
  • “Especially when:
    ➢ The speaker does not cooperate
    ➢ There is no control over recording equipment
    ➢ Recording conditions are not known
    ➢ One does not know if the voice was disguised
    ➢ The linguistic content is not controlled”

slide-21
SLIDE 21

VOICE BIOMETRICS

  • Reliability issues:

[Figure: extract from [12]]

slide-22
SLIDE 22

Feature extraction techniques

From brain to computer’s model

slide-23
SLIDE 23

FROM BRAIN TO COMPUTER’S MODEL

  • Voice characteristics
  • What do we hear?

Source: Dan Jurafsky, “Lecture 6: Feature Extraction and Acoustic Modeling”

slide-24
SLIDE 24

FROM BRAIN TO COMPUTER’S MODEL

  • Voice characteristics – Specificities
    ➢ Signal processing of non-stationary signals
    ➢ Characteristics are a function of time

slide-25
SLIDE 25

FROM BRAIN TO COMPUTER’S MODEL

  • Voice characteristics – Specificities
    ➢ Sensitivity of human hearing is not linear
    ➢ Less sensitive at higher frequencies (> 1 kHz)

Source: Dan Jurafsky, “Lecture 6: Feature Extraction and Acoustic Modeling”
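
The non-linear sensitivity is commonly modelled with the Mel scale. The sketch below uses one widespread conversion formula (2595 · log10(1 + f/700)); the slides do not state which variant is meant, so the constants are an assumption.

```python
import math

def hz_to_mel(f_hz):
    """Convert a frequency in Hz to Mel (one common variant of the formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse conversion, from Mel back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# Equal steps in Hz shrink on the Mel scale as the frequency grows.
for f in (250, 500, 1000, 2000, 4000, 8000):
    print(f, "Hz ->", round(hz_to_mel(f)), "mel")
```

With this formula the scale is roughly linear below 1 kHz and logarithmic above: the 1 kHz wide band from 7 kHz to 8 kHz spans far fewer Mel units than the band from 0 to 1 kHz.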

slide-26
SLIDE 26

FROM BRAIN TO COMPUTER’S MODEL

  • Linear prediction cepstral coefficient (LPCC)
    ➢ Energy values of linearly arranged filter banks
    ➢ Mimic the human speech production
  • Discrete Wavelet Transform (DWT)
    ➢ Decomposition separates the lower frequency contents and higher frequency contents
    ➢ Only the low pass signal is further split
  • Wavelet Packet Decomposition (WPD)
    ➢ Low and high pass signals are further split
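
The difference between DWT and WPD can be sketched with the Haar wavelet, chosen here only because it is the simplest filter pair (an assumption; the slides do not name a wavelet). The signal length is assumed to be divisible by 2 at every level.

```python
import math

def haar_step(signal):
    """One Haar analysis step: return (low-pass, high-pass) half-length signals."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return low, high

def dwt(signal, levels):
    """DWT: only the low-pass branch is split again at each level."""
    bands = []
    low = signal
    for _ in range(levels):
        low, high = haar_step(low)
        bands.append(high)
    bands.append(low)
    return bands  # [detail level 1, detail level 2, ..., final approximation]

def wpd(signal, levels):
    """WPD: both the low-pass and the high-pass branches are split at each level."""
    nodes = [signal]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            low, high = haar_step(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes  # 2**levels sub-bands

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
print([len(band) for band in dwt(x, 2)])  # uneven band sizes
print([len(band) for band in wpd(x, 2)])  # uniform band sizes
```

The DWT yields a coarse-to-fine set of bands with more resolution at low frequencies, while the WPD yields a uniform tiling at the cost of more computation.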

slide-27
SLIDE 27

FROM BRAIN TO COMPUTER’S MODEL

  • Mel-frequency cepstral coefficients (MFCC)
    ➢ Frequency bands are placed logarithmically
    ➢ Model the human auditory system closely
    ➢ Easier to implement
    ➢ Used in voice-to-text and voice recognition engines
    ➢ Widely used for feature extraction (many papers published by voice recognition vendors, e.g. Google)

slide-28
SLIDE 28

FROM BRAIN TO COMPUTER’S MODEL

  • Mel-frequency cepstral coefficients (MFCC)
    ➢ Pre-processing before feature extraction;
    ➢ Framing: the signal is split into frames in the time domain, then each frame is windowed;
    ➢ Converting each frame from the time domain (TD) to the frequency domain (FD) with the DFT;
    ➢ A filter bank is created by computing a number of peaks equally spaced on the Mel scale, then transforming them back to the normal frequency scale;
    ➢ Converting the Mel spectrum coefficients back to the time domain with the Discrete Cosine Transform
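
The MFCC steps can be sketched in plain Python as below. This is a minimal, slow illustration and not the pipeline of any tested product: the frame length, hop size, filter count, pre-emphasis factor 0.97, Hamming window and the log compression before the DCT are conventional choices assumed for the example.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive O(n^2) DFT; returns the power of bins 0..n/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose peaks are equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, peak, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, peak):      # rising edge
            filt[k] = (k - left) / (peak - left)
        for k in range(peak, right):     # falling edge
            filt[k] = (right - k) / (right - peak)
        bank.append(filt)
    return bank

def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_coeffs=5):
    # 1. Pre-processing: pre-emphasis boosts the high frequencies.
    emphasized = [signal[0]] + [signal[i] - 0.97 * signal[i - 1] for i in range(1, len(signal))]
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1)) for i in range(frame_len)]
    features = []
    # 2. Framing, then windowing of each frame.
    for start in range(0, len(emphasized) - frame_len + 1, hop):
        frame = [emphasized[start + i] * window[i] for i in range(frame_len)]
        # 3. Time domain to frequency domain.
        spectrum = power_spectrum(frame)
        # 4. Mel filter bank energies, log-compressed (epsilon avoids log(0)).
        energies = [math.log(sum(f * p for f, p in zip(filt, spectrum)) + 1e-10) for filt in bank]
        # 5. DCT of the log energies gives the cepstral coefficients.
        features.append(dct2(energies, n_coeffs))
    return features

# Hypothetical input: a 440 Hz tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(256)]
coeffs = mfcc(tone, sr)
print(len(coeffs), "frames of", len(coeffs[0]), "coefficients")
```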

slide-29
SLIDE 29

Testing existing solutions in a black-box context

Testing voice authentication engines

slide-30
SLIDE 30

TESTING APPROACH

  • We consider the verification system as a black box
  • We use publicly available toolsets
  • We set up test scenarios based on the attack’s prerequisites
    ➢ Knows target language?
    ➢ Knows target’s keyword?
    ➢ Possesses target’s voice samples?

slide-31
SLIDE 31

EXPERIMENTAL SETUP

[Figure labels: Wi-Fi, Target 1 (Siri), Target 2 (S-voice), Target 3 (Google now)]

slide-32
SLIDE 32

TESTS: SPEAKER IMPERSONATION

  • The attacker hears the target saying the keyword
  • He tries to impersonate the target’s voice
  • We are not professional impersonators
  • But we succeeded on all tested targets
    ➢ Within fewer than 15 attempts

slide-33
SLIDE 33

TESTS: REPLAY

  • The attacker has a recording of the target saying the keyword
  • Our demo last year at Hack In Paris [6]

slide-34
SLIDE 34

TESTS: REPLAY

  • The attacker has a recording of the target saying the keyword
  • Our demo last year at Hack In Paris [6]
  • Additional tests
    ➢ Probing the acceptance boundaries with modifications of a legitimate sample (filtering, pitch, time-scale, SNR)
    ➢ Target 1 (Siri): is the model shifting pre-auth.?

slide-35
SLIDE 35

TESTS: MODEL SHIFTING

  • The attacker knows the keyword
  • If the model is updated for each submitted sample
  • It can shift so as to accept any voice sample
  • By submitting the same sample repeatedly until it passes the authentication
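
The shifting effect can be illustrated with a toy simulation. Everything here is hypothetical: the real models are unknown (black box), so the vector model, Euclidean distance, threshold and update rate are assumptions chosen only to show the mechanism.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class AdaptiveVerifier:
    """Toy verifier that adapts its model on every submitted sample."""

    def __init__(self, model, threshold, rate):
        self.model = list(model)
        self.threshold = threshold
        self.rate = rate

    def submit(self, sample):
        accepted = distance(self.model, sample) <= self.threshold
        # Design under study: the model moves toward the sample
        # whether or not the sample was accepted.
        self.model = [m + self.rate * (s - m) for m, s in zip(self.model, sample)]
        return accepted

def attempts_until_accepted(verifier, sample, max_attempts=1000):
    """Submit the same sample repeatedly; return the attempt count, or None."""
    for attempt in range(1, max_attempts + 1):
        if verifier.submit(sample):
            return attempt
    return None

legit = [1.0, 2.0, 3.0]      # hypothetical enrolled features
attacker = [4.0, 0.0, 1.0]   # hypothetical features of another voice
verifier = AdaptiveVerifier(model=legit, threshold=0.5, rate=0.1)

tries = attempts_until_accepted(verifier, attacker)
print("attempts needed:", tries)
print("legit user still accepted:", distance(verifier.model, legit) <= verifier.threshold)
```

In this toy setting the model drifts far enough that the legitimate voice no longer matches, which mirrors the "+ OK, - NOK" observation recorded alongside the attempt counts.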

slide-36
SLIDE 36

TESTS: MODEL SHIFTING

  • Results related to target 1
    ➢ Try 1: 10 uses by the legit user
    ➢ Try 2: 50 uses by the legit user
    ➢ Each cell: number of tries required to trigger target 1
    ➢ Followed by whether the legit user is still able to trigger target 1 (+ OK, - NOK)

         2      3      4      5      6       7      8      9      10
1        1, +   1, +   16, +  25, +  101, -  21, +  34, -  70, +  385, -
1 bis    1, +   4, +   30, +  48, +  98, -   33, -  24, +  54, -  402, -

slide-37
SLIDE 37

TESTS: TD RECONSTRUCTION

  • The attacker knows the keyword
  • The attacker has a recording of the target
  • The recording contains all the phonemes of the keyword
  • He reconstructs the keyword by concatenating the phonemes in the time domain

Video 1
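
Time-domain concatenation can be sketched as below. The recording, the phoneme boundaries and the short linear cross-fade at each joint are all assumptions for illustration; the slides do not describe how the segments were cut or joined.

```python
def crossfade_concat(segments, overlap):
    """Concatenate sample lists, linearly cross-fading `overlap` samples at each joint."""
    out = list(segments[0])
    for seg in segments[1:]:
        n = min(overlap, len(out), len(seg))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight of the incoming segment
            out[base + i] = (1 - w) * out[base + i] + w * seg[i]
        out.extend(seg[n:])
    return out

# Hypothetical recording (a sawtooth stands in for speech samples).
recording = [float(i % 50) / 50.0 for i in range(1000)]

# Hypothetical (start, end) sample indices of the phonemes, in keyword order.
boundaries = [(100, 300), (600, 750), (300, 450)]

segments = [recording[start:end] for start, end in boundaries]
keyword = crossfade_concat(segments, overlap=20)
print(len(keyword), "samples")
```

With `overlap=0` the function degenerates to plain concatenation; a small overlap only smooths the discontinuities at the joints.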

slide-38
SLIDE 38

TESTS: FD RECONSTRUCTION

  • The attacker knows the MFCC features extracted from the target pronouncing the keyword
  • He can modify the MFCC and reconstruct several time-domain samples from the features [3,4]

slide-39
SLIDE 39

TESTS: FD RECONSTRUCTION

  • MFCC and MFCC inverse
    ➢ MFCC inverse of legit user
    ➢ MFCC inverse of a composition of samples
  • Targets 2 and 3: the MFCC seem to contain enough of the information required to authenticate

Video 2

slide-40
SLIDE 40

TESTS: KEYWORD COMPOSITION

  • The attacker knows the keyword
  • He has access to several other voice samples saying the keyword
  • He generates test vectors by superimposing several voice samples
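
Superimposing recordings amounts to mixing them sample by sample. The sketch below assumes all recordings share the same sample rate and are already time-aligned on the keyword; the averaging and the peak normalization are illustrative choices, not the authors' documented procedure.

```python
def superimpose(samples):
    """Mix recordings by averaging them sample by sample.

    Shorter recordings are treated as zero-padded to the longest one.
    """
    length = max(len(s) for s in samples)
    mixed = []
    for i in range(length):
        present = [s[i] for s in samples if i < len(s)]
        mixed.append(sum(present) / len(samples))
    return mixed

def normalize(signal, peak=1.0):
    """Scale the signal so that its largest absolute value equals `peak`."""
    top = max(abs(v) for v in signal)
    if top == 0:
        return list(signal)
    return [v * peak / top for v in signal]

# Hypothetical voice samples of the same keyword from two speakers.
voice_a = [0.0, 0.4, 0.8, 0.4, 0.0]
voice_b = [0.0, -0.2, 0.6, 0.2]
test_vector = normalize(superimpose([voice_a, voice_b]))
print(test_vector)
```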

slide-41
SLIDE 41

TESTS: KEYWORD COMPOSITION

[Figure: grid of enrolled voice (1 to 10) versus superimposed voices (1 to 10); legend: 1 try, 2-100 tries, > 100 tries – NOK]

Video 3

slide-42
SLIDE 42

TESTS: KEYWORD COMPOSITION

[Figure: grid of enrolled voice (1 to 10) versus superimposed voices (1 to 10); legend: 1 try, 2-100 tries, > 100 tries – NOK, Removed]

slide-43
SLIDE 43

TEST SUMMARY

Test                  Target 1 (Siri)   Target 2 (S-voice)   Target 3 (Google now)
Impersonation         ✓                 ✓                    ✓
Replay                ✓                 ✓                    ✓
TD reconstruction     ✓                 ✓                    ✓
FD reconstruction     ✗                 ✓                    ✓
Keyword composition   ✓                 ✓                    ?
Model shifting        ✓                 ?                    ?

(✓: attack succeeded, ✗: attack failed, ?: not determined)

slide-44
SLIDE 44

Conclusion and future work

slide-45
SLIDE 45

CONCLUSION

  • Voice interfaces are becoming widespread
    ➢ Devices without other UI
    ➢ Hands-free convenience
  • Available commands and actions
    ➢ Getting richer and more critical
  • Voice authentication seemed a viable countermeasure
  • But it is still inefficient and immature
  • Gives a false sense of security

slide-46
SLIDE 46

TEST RESULTS LIMITATIONS

  • Results cannot be generalized; they depend on
    ➢ Language
    ➢ Keyword
    ➢ Legitimate user’s voice during enrollment
    ➢ Model and decision metrics
  • And we don’t know the model
    ➢ It can be updated
    ➢ There can be several models/approaches
    ➢ It can shift
  • That’s why we don’t provide result statistics

slide-47
SLIDE 47

COUNTERMEASURES

  • Prevent unlimited successive failed authentication attempts (as Google does)
  • Prove the command originates from the user:
    ➢ Use the phone’s sensors [7, 9]
  • Add entropy and interaction
    ➢ N-staged process, with challenge-response
  • Enhance the user’s voice model
    ➢ Qualcomm patent: continuous voice authentication [8]
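
The first countermeasure, limiting successive failed attempts, can be sketched as a small wrapper around any verification function. The class name, the limit of 3 failures and the unlock method are hypothetical choices for illustration, not a description of how any vendor implements it.

```python
class AttemptLimiter:
    """Refuse voice authentication after too many consecutive failures."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def locked(self):
        return self.failures >= self.max_failures

    def check(self, verify, sample):
        """Run verify(sample) unless locked; track consecutive failures."""
        if self.locked:
            return False              # voice is no longer accepted at all
        if verify(sample):
            self.failures = 0         # a success resets the counter
            return True
        self.failures += 1
        return False

    def unlock_with_other_factor(self):
        """Called after the user authenticates by another means (e.g. a PIN)."""
        self.failures = 0

# Hypothetical verifier: accepts only one exact token.
def toy_verify(sample):
    return sample == "enrolled-voice"

limiter = AttemptLimiter(max_failures=3)
for _ in range(3):
    limiter.check(toy_verify, "attacker-voice")
print("locked:", limiter.locked)
print("genuine sample while locked:", limiter.check(toy_verify, "enrolled-voice"))
```

A cap like this bounds repeated-submission attacks such as model shifting, at the usability cost of occasionally locking out the legitimate user.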

slide-48
SLIDE 48

FUTURE WORK: FEATURES BRUTEFORCE

  • The attacker knows the keyword
  • He has access to several other voice samples saying the keyword
  • He extracts features for all samples and generates test vectors from statistical characteristics of the feature distribution
  • Trying to preserve the keyword recognition
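
One possible reading of "statistical characteristics of the feature distribution" is a per-dimension mean and standard deviation, from which candidate feature vectors are drawn. The sketch below assumes independent Gaussian dimensions, which is a simplification introduced here; the slides list this approach as work in progress and give no details.

```python
import random
import statistics

def fit_feature_distribution(feature_vectors):
    """Per-dimension mean and sample standard deviation of the feature vectors."""
    columns = list(zip(*feature_vectors))
    means = [statistics.mean(column) for column in columns]
    stdevs = [statistics.stdev(column) for column in columns]
    return means, stdevs

def draw_test_vectors(means, stdevs, count, rng):
    """Draw candidate feature vectors, one independent Gaussian per dimension."""
    return [[rng.gauss(m, s) for m, s in zip(means, stdevs)] for _ in range(count)]

# Hypothetical 2-dimensional features extracted from several voices.
features = [[1.0, 10.0], [3.0, 14.0], [2.0, 12.0]]
means, stdevs = fit_feature_distribution(features)

rng = random.Random(0)  # seeded for reproducibility
candidates = draw_test_vectors(means, stdevs, count=5, rng=rng)
print(means, stdevs)
print(len(candidates), "candidate vectors")
```

Each candidate would still have to be turned back into audio and checked for keyword recognition, which the sketch does not attempt.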

slide-49
SLIDE 49

TEST SUMMARY

Test                  Target 1 (Siri)   Target 2 (S-voice)   Target 3 (Google now)
Impersonation         ✓                 ✓                    ✓
Replay                ✓                 ✓                    ✓
TD reconstruction     ✓                 ✓                    ✓
FD reconstruction     ✗                 ✓                    ✓
Keyword composition   ✓                 ✓                    ?
Model shifting        ✓                 ?                    ?
Features bruteforce   WIP               WIP                  WIP

(✓: attack succeeded, ✗: attack failed, ?: not determined, WIP: work in progress)

slide-50
SLIDE 50

OPEN QUESTIONS

  • Is it possible, for a given language and keyword:
    ➢ To generate a "masterkey"?
    ➢ To derive a verified sample by bruteforce? At which complexity?
  • Is it possible, knowing the model and features:
    ➢ To estimate the probability and/or the number of masterkeys?
    ➢ To estimate the robustness of the authentication system against impersonation?
  • Can voice authentication vendors tell us:
    ➢ How easily can it be circumvented according to my language, keyword and voice characteristics?
    ➢ And how confident could we be about the answer?

slide-51
SLIDE 51

TAKE AWAY THOUGHTS

  • Voice command usability vs. security
  • Apple response to our disclosure:
    ➢ "Voice recognition in Siri is not a security feature"

slide-52
SLIDE 52

TAKE AWAY THOUGHTS

  • By using insecure settings, does the user give permission to access the system?

slide-53
SLIDE 53

Thank You

We thank the manufacturers and the software vendors for their interesting feedback

slide-54
SLIDE 54

REFERENCES

[1] W. Diao et al., Your Voice Assistant is Mine: How to Abuse Speakers to Steal Information and Control Your Phone, SPSM, 2014
[2] AVG, How an app could use Google Now to send an email on your behalf, YouTube, 2014
[3] T. Vaidya et al., Cocaine Noodles: Exploiting the Gap between Human and Machine Speech Recognition, Usenix WOOT, 2015
[4] T. Vaidya et al., Hidden Voice Commands, Usenix Security, 2016
[5] C. Kasmi, J. Lopes Esteves, You don’t hear me but your phone’s voice interface does, Hack In Paris 15, 2015
[6] C. Kasmi, J. Lopes Esteves, Whisper in the Wire: Voice Command Injection Reloaded, Hack In Paris 16, 2016
[7] S. Chen et al., You can hear but you cannot steal: Defending against voice impersonation attacks on smartphones, 37th International Conference on Distributed Computing Systems, 2017
[8] Qualcomm, Continuous voice authentication for a mobile device, US patent WO2012135681 A3, 2012
[9] C. Kasmi, J. Lopes Esteves, Automated analysis of the effects induced by radio-frequency pulses on embedded systems for EMC safety, AT-RASC, URSI, 2015
[10] J.-F. Bonastre et al., Person Authentication by Voice: A Need for Caution, EUROSPEECH, ISCA, 2003
[11] S. Prabhakar et al., Biometrics Recognition: Security and Privacy Concerns, IEEE Security & Privacy, 2003

slide-55
SLIDE 55

QUESTIONS?

  • José Lopes Esteves, jose.lopes-esteves@ssi.gouv.fr
  • Chaouki Kasmi, chaouki.kasmi@ssi.gouv.fr