Audio recognition, context-awareness, and its applications - PowerPoint PPT Presentation

Audio recognition, context-awareness, and its applications Yoonchang Han Co-founder & CEO, Cochlear.ai 26 March, 2018

Rule-based Deep learning methods (Source: Softbank Pepper)

See: Computer vision. Understand: Natural language processing. Listen: Speech recognition. (Source: Softbank Pepper)

Taking an umbrella Closing the window

Footstep sound: high heels (Audio source: http://www.freesound.org/people/Damiaan/)

(Source: BBC)

Easy for Humans Hard for Machines

Evolution of data processing technique: Early days, Traditional ML, Deep learning. [Figure: Data → Feature engineering → ML classifier → Prediction. More feature engineering means more human effort; deep learning is more automatic and gives better performance.]

Domain knowledge: to tackle each topic (make some “rules”); to simulate how humans understand sound (and prepare data)

Required domain knowledge: Signal Processing, Cognitive Sciences, Music, Machine Learning, Psychoacoustics, Acoustics

“Modern” audio identification pipeline: Audio → Time-frequency representation → Neural Network → Output. Objects in an image (flower, butterfly) ≈ instruments in a spectrogram (voice, piano, violin)
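The first stage of the pipeline above, turning audio into a time-frequency representation, can be sketched as follows. This is a minimal illustration and not the speaker's implementation: the function name `magnitude_spectrogram`, the Hann window, the frame size, the hop length, and the test tone are all assumptions chosen for the example, and a naive DFT is used so the block needs only the standard library.

```python
import cmath
import math


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Short-time Fourier transform magnitudes (hypothetical helper).

    Returns one list per frame, holding magnitudes for bins 0..frame_size//2.
    """
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of one windowed frame: fine for a sketch, slow for real audio.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Assumed test signal: a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(512)]

spec = magnitude_spectrogram(tone, frame_size=FRAME_SIZE, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames,", len(spec[0]), "bins, peak at", peak_hz, "Hz")
```

The resulting grid of frames by frequency bins is what a neural network would consume as an "image" of the sound. A practical system would more likely use an FFT library and a mel or log-frequency scale.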

“Machine listening” is the use of signal processing and machine learning for making sense of natural / everyday sounds, and recorded music. - Machine listening lab, Queen Mary, Univ. of London

Voice: Age, Language, Gender, Emotion, Health … Music: Genre, Mood, Chord, Pitch, Tempo …

Machine listening: Music, Voice, and “any” sound we hear every day. Acoustic scenes: bus, park, library, city centre, driving, train, home, market, cafe … Acoustic events: glass break, knock, car horn, dog bark, footstep, water boil, gun shot, snoring, bird chirping, crying, sneeze …

Computer vision: Optical Character Recognition (OCR), Facial recognition, Object detection. Machine listening: Voice recognition, Music search, Speaker identification, Acoustic scene/event detection. (Sources: Tensorflow, Facebook, Microsoft, Apple, Shazam)

Scene classification accuracy (IEEE DCASE): 76 % in 2013, 92 % in 2017 (Source: http://www.cs.tut.fi/sgn/arg/dcase2017/, http://c4dm.eecs.qmul.ac.uk/sceneseventschallenge/resultsSC.html)

Deep Learning, Machine Learning, Artificial Intelligence

Perceive Think Act

Five, Zero Cat

From simple identification to closer to human: Know what it is (with input restriction) → Know what it is → Know what/where it is → Know what/where it is + why

Sense (closed alpha release in April). Activity detection: Music, Speech, Others. Music analysis: Genre / Mood / Key / Tempo. Speech analysis: Age / Gender / Emotion. Scene classification: Indoor / Outdoor / Vehicles. Acoustic event: Dog bark / Baby cry / Car horn / Snoring ...

Why do we need… Activity detection Unified model

It is really challenging because… Recording environment, Recording device, Noises, Local characteristics, Overlapped / Polyphonic

Probability or Saliency ?

Example: AI speakers. Simple voice control: “Alexa, turn on the light”, “Alexa, play dance music”, “Alexa, turn on TV”. IoT control-tower with context-awareness (footstep sound, door slam, cough → someone got back home, got a bad cold): turn on light / TV, play suitable music, adjust room temperature warmer, ask to take cold medicine before sleep (not just a pattern, there is a “reason”)
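The step from detected sounds to a “reason” and then to actions can be sketched as a small lookup. This is only an illustration under stated assumptions: the rule table, the names `CONTEXT_RULES`, `ACTIONS`, `infer_contexts`, and `plan_actions`, and the pairing of each action with a context are all hypothetical. The slide lists the events, the inferred situation, and the actions, but does not say how they are linked or what model performs the inference.

```python
# Hypothetical rules: a set of detected sound events implies a context.
CONTEXT_RULES = [
    ({"footstep", "door slam"}, "someone got back home"),
    ({"cough"}, "someone has a bad cold"),
]

# Hypothetical pairing of contexts with the actions named on the slide.
ACTIONS = {
    "someone got back home": ["turn on light / TV", "play suitable music"],
    "someone has a bad cold": [
        "adjust room temperature warmer",
        "ask to take cold medicine before sleep",
    ],
}


def infer_contexts(events):
    """Return every context whose required events were all detected."""
    detected = set(events)
    return [context for required, context in CONTEXT_RULES
            if required <= detected]


def plan_actions(events):
    """Collect the actions for all inferred contexts, in rule order."""
    actions = []
    for context in infer_contexts(events):
        actions.extend(ACTIONS[context])
    return actions


for action in plan_actions(["footstep", "door slam", "cough"]):
    print(action)
```

Here a fixed table stands in for the inference step. In a deployed system the event labels would come from an audio classifier, and the context would more plausibly be inferred by a learned model than by hand-written rules.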

Example: Humanoid robots. See things, understand speech + listen to things other than voice, know who they talk to (Source: Atlas, Boston Dynamics)

Example: Autonomous car. Outside: Car horn (normal, air horn), Siren (fire truck, police, ambulance). Inside: Music mood, snoring, baby, anomaly detection (malfunction warning). (Source: NVIDIA)

ATMO: Generative music for spatial atmosphere (Architect, Musician + AI researcher, Visual artist, Contemporary dancer)

Generative Music with contextual information

Ambient music Background music Generative Music with contextual information

Analysis result: Typing on a rainy day… Contextual information: Typing… Reading a book… Raining outside…

Microphone Speaker

contact@cochlear.ai
