DolphinAttack: Inaudible Voice Commands. Guoming Zhang, Chen Yan, Xiaoyu Ji, Tianchen Zhang, Taimin Zhang, Wenyuan Xu (Zhejiang University). Presenter: Huichen Li. This paper won the CCS 2017 Best Paper award.

Speech Recognition Systems: Apple Siri, Amazon Alexa, Google Now, Huawei HiVoice

Obfuscated Voice Commands Hidden Voice Commands

Threat Model • Inaudible (ultrasound, f > 20 kHz). • No owner interaction. • Whitebox. • No (physical) access to the target device. • Attacker has the required equipment (e.g. speakers for transmitting ultrasound near target devices).

Voice Controllable System Q: Which parts of the VCS are most vulnerable? (No known answer)

Voice Controllable System ambient voices: recorded -> amplified -> filtered -> digitized

Voice Controllable System - remove frequencies that are beyond the audible sound range - discard signal segments that contain sounds too weak to be identified

Voice Controllable System Performed locally e.g. Siri - say pre-defined wake words - press a special key

Voice Controllable System Via a cloud service: signals sent to servers -> extract features (e.g. Mel-frequency cepstral coefficients, MFCC) -> recognize commands (e.g. machine learning)

Voice Controllable System launch the corresponding application or execute an operation

Voice Controllable System Q: Which parts of the VCS are most vulnerable? (No known answer) Take a guess!

Focus of Attack Inaudible!

Doubts on Inaudible Voice Commands • How can inaudible sounds be audible to devices? low-pass filters? low audio sampling rates? • How can inaudible sounds be intelligible to SR systems? SR systems do not recognize signals that do not match human tonal features? • How can inaudible sounds cause unnoticed security breach to VCS? speaker-dependent wake words?

Microphone Pros: - miniature package sizes - low power consumption air pressure change -> capacitive change -> AC signal

Nonlinearity of Microphone in ultrasound bands f > 20kHz m(t): target voice signal LPF Fourier Transformation

s1(t) = cos(2π f1 t) at frequency f1 = 38 kHz; s2(t) = cos(2π f2 t) at frequency f2 = 40 kHz; s_hi(t) = s1(t) + s2(t). (Source: Inaudible Voice Commands: The Long-Range Attack and Defense)
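The two-tone example above can be checked numerically. The sketch below is not taken from the slides: it assumes the microphone behaves like output = A * input + B * input^2, with illustrative gains A and B and an assumed 192 kHz sample rate. Under that assumption, squaring s_hi(t) produces a difference tone at f2 - f1 = 2 kHz, which lies in the audible band even though both inputs are ultrasonic.

```python
import math

FS = 192_000             # sample rate in Hz (assumed; above twice the 80 kHz sum product)
N = 1_920                # 10 ms of samples, so every tone here sits on an exact DFT bin
F1, F2 = 38_000, 40_000  # the two ultrasonic tones from the slide
A, B = 1.0, 0.1          # assumed linear and quadratic gains of the microphone


def tone_amplitude(signal, freq_hz, fs=FS):
    """Amplitude of the sinusoid at freq_hz in signal (single-bin DFT)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / fs) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / fs) for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


# s_hi(t) = s1(t) + s2(t): both components are above 20 kHz.
s_hi = [math.cos(2 * math.pi * F1 * n / FS) + math.cos(2 * math.pi * F2 * n / FS)
        for n in range(N)]

# Assumed microphone model: linear term plus a small quadratic term.
s_out = [A * x + B * x * x for x in s_hi]

diff_before = tone_amplitude(s_hi, F2 - F1)  # 2 kHz content in the transmitted signal
diff_after = tone_amplitude(s_out, F2 - F1)  # 2 kHz content after the nonlinearity

if __name__ == "__main__":
    print(f"2 kHz amplitude before microphone: {diff_before:.4f}")
    print(f"2 kHz amplitude after microphone:  {diff_after:.4f}")
```

In this model the 2 kHz amplitude after the microphone is close to B, because the cross term 2 * s1 * s2 contributes cos(2π (f2 - f1) t). The sum and doubled frequencies (76, 78, 80 kHz) would then be removed by the low-pass filter.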

Modulated Tone Traversing Voice Capture Device Modulation Demodulation

Nonlinearity Evaluation: Questions • Will the demodulation work well in practice? • Will the demodulated voice signal remain similar to the original one?

Nonlinearity Evaluation: Experimental Setup iPhone SE -> vector signal generator -> power amplifier -> ultrasonic speaker baseband signal -> modulated onto a carrier -> amplified -> transmitted

Nonlinearity Evaluation: Single Tone Results [Figure panels: original signal; output signal of MEMS microphone; output signal of ECM microphone] 20 kHz carrier, 2 kHz baseband. Demodulation successful!

Nonlinearity Evaluation: Voices Results
MCD between original and recorded (original: TTS generated voice)
- recorded as the original voice is played: 3.1
- recorded as the modulated voice is played by ultrasonic speaker: 7.6
Mel-Cepstral Distortion (MCD) quantifies distortion between two MFCCs. Two voices are considered to be acceptable to voice recognition systems if their MCD values are smaller than 8. Similar!
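The slides do not give the MCD formula. A commonly used definition is MCD = (10 / ln 10) * sqrt(2 * Σ (c_d - c'_d)^2), averaged over frames, and the sketch below implements that definition. It assumes the two MFCC sequences are already time-aligned and skips coefficient 0 (a frequent convention, not confirmed by the source). The function names and the toy frames are hypothetical.

```python
import math

MCD_THRESHOLD_DB = 8.0  # acceptability threshold quoted on the slide


def mel_cepstral_distortion(mfcc_a, mfcc_b):
    """Mean MCD in dB between two aligned sequences of MFCC frames.

    Each frame is a list of coefficients; coefficient 0 is skipped (assumed convention).
    """
    if not mfcc_a or len(mfcc_a) != len(mfcc_b):
        raise ValueError("need two non-empty, equally long frame sequences")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for frame_a, frame_b in zip(mfcc_a, mfcc_b):
        squared = sum((a - b) ** 2 for a, b in zip(frame_a[1:], frame_b[1:]))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(mfcc_a)


def is_acceptable(mcd_db, threshold_db=MCD_THRESHOLD_DB):
    """True if the distortion is below the threshold from the slide."""
    return mcd_db < threshold_db


if __name__ == "__main__":
    # Toy frames for illustration only, not measured data.
    original = [[0.0, 1.0, 2.0], [0.0, 1.5, 2.5]]
    recorded = [[0.0, 1.2, 2.1], [0.0, 1.4, 2.9]]
    mcd = mel_cepstral_distortion(original, recorded)
    print(f"MCD = {mcd:.2f} dB, acceptable: {is_acceptable(mcd)}")
```

With this threshold, both values reported on the slide (3.1 and 7.6) count as acceptable, which matches the "Similar!" conclusion.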

Attack Design • Generate voice commands • Modulate baseband signals • Launch attack with a portable transmitter

Activation Voice Commands Generation: Brute Force Siri is trained with Google TTS

Activation Voice Commands Generation: Concatenative

Amplitude Modulation (AM): Depth (index) directly related to the utilization of the nonlinearity effect of microphones

Analysis: Modulation Depth Demodulated signals become stronger Signal-to-noise ratio and the attack success rate get higher
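The relationship between modulation depth and demodulated signal strength can be sketched with a single-tone baseband. The example below is illustrative only: the 30 kHz carrier, the 1 kHz baseband tone, the 192 kHz sample rate, and the quadratic microphone model (output = A * input + B * input^2) are all assumptions, not values from the slides.

```python
import math

FS = 192_000        # sample rate in Hz (assumed)
N = 1_920           # 10 ms of samples, so every tone sits on an exact DFT bin
F_CARRIER = 30_000  # ultrasonic carrier in Hz (assumed)
F_BASE = 1_000      # single-tone stand-in for the voice command (assumed)
A, B = 1.0, 0.1     # assumed linear and quadratic gains of the microphone


def tone_amplitude(signal, freq_hz, fs=FS):
    """Amplitude of the sinusoid at freq_hz in signal (single-bin DFT)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * n / fs) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * n / fs) for n, x in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


def am_modulate(depth):
    """AM signal (1 + depth * baseband) * carrier, with depth between 0 and 1."""
    return [(1 + depth * math.cos(2 * math.pi * F_BASE * n / FS))
            * math.cos(2 * math.pi * F_CARRIER * n / FS) for n in range(N)]


def microphone(signal):
    """Assumed nonlinear microphone: linear term plus a small quadratic term."""
    return [A * x + B * x * x for x in signal]


# Baseband amplitude recovered by the nonlinearity, for several modulation depths.
demodulated = {depth: tone_amplitude(microphone(am_modulate(depth)), F_BASE)
               for depth in (0.25, 0.5, 1.0)}

if __name__ == "__main__":
    for depth, amplitude in demodulated.items():
        print(f"depth {depth:.2f} -> demodulated 1 kHz amplitude {amplitude:.4f}")
```

In this model the recovered baseband amplitude is B * depth, so it grows linearly with the modulation depth, consistent with the observation that the demodulated signals become stronger.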

Amplitude Modulation (AM): Carrier Frequency f • Factors for choosing f: • frequency range of ultrasounds (inaudibility: lowest frequency > 20 kHz) • bandwidth of the baseband signal (w: frequency range of the voice command; f - w > 20 kHz) • cut-off frequency of the low pass filter (otherwise the carrier will not be filtered) • frequency response of the microphone on the VCS • frequency response of the attacking speaker

Analysis: Carrier Wave Frequency 400 Hz baseband and higher order harmonics. Amplitude of the harmonics larger than baseband: unacceptable to SR systems!

Amplitude Modulation (AM): Voice Selection f - w > 20 kHz • Various voices map to various baseband frequency ranges. • A voice with a small bandwidth shall be selected to create baseband voice signals
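The constraint f - w > 20 kHz is simple arithmetic, and a small helper makes the trade-off between carrier frequency and voice bandwidth concrete. The function names and the example bandwidths below are hypothetical; only the 20 kHz limit and the inequality come from the slides.

```python
AUDIBLE_LIMIT_HZ = 20_000  # upper edge of the audible range used on the slides


def lowest_sideband_hz(carrier_hz, bandwidth_hz):
    """Lowest frequency occupied by the AM signal: f - w."""
    return carrier_hz - bandwidth_hz


def is_inaudible(carrier_hz, bandwidth_hz, limit_hz=AUDIBLE_LIMIT_HZ):
    """True if the whole modulated signal stays above the audible limit (f - w > limit)."""
    return lowest_sideband_hz(carrier_hz, bandwidth_hz) > limit_hz


def carrier_lower_bound_hz(bandwidth_hz, limit_hz=AUDIBLE_LIMIT_HZ):
    """Carrier frequencies must be strictly greater than this value (limit + w)."""
    return limit_hz + bandwidth_hz


if __name__ == "__main__":
    # Hypothetical bandwidths: a narrower voice leaves more room for the carrier.
    for bandwidth in (3_000, 6_000):
        bound = carrier_lower_bound_hz(bandwidth)
        ok = is_inaudible(25_000, bandwidth)
        print(f"w = {bandwidth} Hz: carrier must exceed {bound} Hz; "
              f"25 kHz carrier inaudible: {ok}")
```

A voice with a smaller bandwidth lowers the bound on the carrier, which is why the slide recommends selecting a narrow-band voice.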

Voice Commands Transmitter Powerful transmitter: driven by a dedicated signal generator Portable transmitter: driven by a smartphone

Experimental Goal • Examining the feasibility of attacks. • Quantifying the parameters in tuning a successful attack. • Measuring the attack performance.

Feasibility Experiments: Device/System & Commands

Impact: Languages

Impact: Background Noise

Impact: Distance

Impact: Sound Pressure Levels

Results Almost all the systems can be attacked!

Defense: Hardware-based • Microphone enhancement: suppress any acoustic signals whose frequencies are in the ultrasound range. • Inaudible voice command cancellation: demodulate the signals to obtain the baseband and subtract it.

Defense: Software-based [Figure panels: original, recorded, recovered] Support vector machine (SVM) -> 10 training samples (5 positive, 5 negative) -> 14 testing samples: 100% true positive and true negative rates. Q: rigorous?

Remote attack?

Related Work - Embed commands into songs -> distribute through the internet - Use multiple speakers to mitigate leakage

Thanks!
