Robot audition and its deployment - Kazuhiro Nakadai

Honda Research Institute JP
Robot audition and its deployment
Kazuhiro Nakadai
Principal Researcher, Honda Research Institute Japan Co., Ltd.
Visiting Professor, Tokyo Institute of Technology
Visiting Professor, Waseda University
2nd Workshop on Alternative Sensing for Robot Perception: Beyond Laser and Vision

Outline
1. Background of Robot Audition
2. Introduction to Robot Audition Research
3. Open Source Software for Robot Audition
4. Deployment of Robot Audition
5. Summary

Background
• Humanoid robot: expected to interact with humans as a partner.
• Robot as our partner: service, interaction, information, entertainment…
• Examples: housekeeping, news provider, welfare, company
• Necessity of auditory processing → robot audition

Robot Audition
• When a robot listens to sound with its ears, …
• Ego-noise: motors, self-voice
• It should deal with the mixture of sounds.

Robot Audition
• Proposed by Prof. Okuno (Kyoto Univ. → Waseda Univ.) and Nakadai at AAAI-2000
  – http://winne.kuis.kyoto-u.ac.jp/SIG/
• A research field bridging robotics, AI, and signal processing
• Continuously expanding
  – Japan: Kyoto Univ., Honda RI, Tokyo Tech., ATR, AIST, Kumamoto Univ., Waseda Univ., etc.
  – Europe: CNRS-LAAS (France), INRIA (France), Univ. of Erlangen-Nuremberg (Germany), Ruhr-Universität Bochum (Germany), ITU (Turkey), Imperial College London (UK), etc.
  – North America: Sherbrooke Univ. (Canada), MERL (USA), Virginia Tech. (USA), Willow Garage (USA), etc.
  – Oceania: UTS (Australia)

Our Activities for Robot Audition
• Special sessions at the IEEE Int’l Conf. on Acoustics, Speech and Signal Processing
  – ICASSP 2009 @ Taipei, Taiwan
  – ICASSP 2015 @ Brisbane, Australia
• Organized sessions at the IEEE/RSJ Int’l Conf. on Intelligent Robots and Systems (IROS 2005-2013)
  – Since 2014, robot audition has been registered as an official keyword in IEEE-RAS.
• HARK Tutorial (OSS)
  – France: 2009, 2012, 2013
  – Korea: 2008
  – Japan: once a year since 2008
• Migration to Texai at Willow Garage, 2010 @ Palo Alto, USA
• International Workshop on Music Robot, 2010 @ Taipei, Taiwan

A robot is surrounded by various noises.
• Target speech
• Ego-noise (such as motion and voice): near field, loud
• Diffuse noise (BGN, omni-directional)
• Reverberation (echo)
• Directional noise
Different characteristics → one-by-one approach
• Sound source separation, mainly for directional noise
• Dereverberation
• Ego-noise suppression

Sound Source Separation
• Input $x(\omega)$ → source separation with separation matrix $W(\omega)$ → output $y(\omega)$
• Separation process: $y(\omega) = W(\omega)\, x(\omega)$
• Incremental SSS: update the separation matrix $W$ to reduce a mixing cost $J(W)$
  $W_{t+1} = W_t - \mu\, J'(W_t)$, where $\mu$ is the step-size parameter
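The update rule on this slide can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it is real-valued and 2x2, it works on a single fixed input correlation matrix instead of frame-by-frame data, and it uses a second-order decorrelation cost J(W) = ||offdiag(W R_xx W^T)||^2 as a stand-in for the mixing cost. That cost, the mixing matrix A, and all function names are hypothetical choices, not the cost or code of the cited work, and decorrelation alone does not guarantee that the sources are separated.

```python
# Toy sketch of y = W x with the update W <- W - mu * J'(W).
# The cost J and every name here are illustrative assumptions, not HARK code.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]

def off_diagonal(m):
    """Copy of a square matrix with its diagonal set to zero."""
    return [[0.0 if i == j else m[i][j] for j in range(len(m))]
            for i in range(len(m))]

def cost_and_gradient(w, r_xx):
    """Return J(W) = ||offdiag(W R_xx W^T)||_F^2 and its gradient 4 E W R_xx."""
    r_yy = matmul(matmul(w, r_xx), transpose(w))  # output correlation
    e = off_diagonal(r_yy)                        # cross-talk between outputs
    cost = sum(v * v for row in e for v in row)
    grad = [[4.0 * v for v in row] for row in matmul(matmul(e, w), r_xx)]
    return cost, grad

def separate(r_xx, mu=0.005, n_updates=500):
    """Run fixed step-size gradient descent starting from W = I."""
    n = len(r_xx)
    w = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(n_updates):
        _, grad = cost_and_gradient(w, r_xx)
        w = [[w[i][j] - mu * grad[i][j] for j in range(n)] for i in range(n)]
    return w

# Assumed mixing matrix; for unit-variance, uncorrelated sources the input
# correlation is R_xx = A A^T.
A = [[1.0, 0.6], [0.5, 1.0]]
R_XX = matmul(A, transpose(A))

W = separate(R_XX)
initial_cost, _ = cost_and_gradient([[1.0, 0.0], [0.0, 1.0]], R_XX)
final_cost, _ = cost_and_gradient(W, R_XX)
print(initial_cost, final_cost)
```

The step size mu is fixed here, so it has to be tuned by hand for the given R_XX: too large and the updates diverge, too small and they converge slowly.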

Sound Source Separation with Adaptive Step-size Control
• Fixed step-size: difficult to adapt to environmental changes such as robot motion and moving sources ⇒ GHDSS-AS [IEEE-TSLP Nakajima 10]
• Fixed μ: manually tuned
• Adaptively controlled μ: adaptive step-size (AS), Newton’s method
[Figure: level [dB] versus number of updates / time (# of frames) for GSS with a fixed μ and with an adaptively controlled μ]

Experiment with Texai [IEEE ICRA 2011]
• Reverberant conference room (RT > 1 s), around 20 m x 10 m.
[Figure: direction (degree) versus time (frame) for the recorded sound, showing Talker1 to Talker4 and garbage]
http://www.youtube.com/watch?v=xpjPun7Owxg

Ego-noise suppression [Neural Computation ‘12, IEEE IROS ‘09-’12]
• Robot’s voice and motion noise
  – Closer to the mics
  – Higher power
• Key idea: the robot knows what it utters and what kind of motions it makes.
• Semi-blind ICA ⇒ barge-in-able robot
• Template-based ego-motion noise suppression
• Demo: interactive dancing robot
[Equation: semi-blind ICA in matrix form, relating the observation Y, the noise signal N, and the known signal (utterance) S at frames f, …, f − M]
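The key idea behind template-based ego-motion noise suppression can be sketched as a lookup followed by a subtraction. This is a minimal sketch under stated assumptions, not the method of the cited papers: the joint-state vectors, the template spectra, the nearest-neighbour lookup, and the spectral floor are all hypothetical illustration choices.

```python
# Sketch of template-based ego-motion noise suppression (illustrative only).
# Each template pairs a joint-state vector with a noise magnitude spectrum
# assumed to have been recorded during that motion. All values are made up.

def nearest_template(joint_state, templates):
    """Return the noise spectrum whose stored joint state is closest."""
    def squared_distance(entry):
        stored_state, _ = entry
        return sum((a - b) ** 2 for a, b in zip(stored_state, joint_state))
    _, noise_spectrum = min(templates, key=squared_distance)
    return noise_spectrum

def suppress(observed, noise, floor=0.05):
    """Subtract the noise estimate per frequency bin, keeping a small floor."""
    return [max(o - n, floor * o) for o, n in zip(observed, noise)]

TEMPLATES = [
    ((0.0, 0.0), [0.1, 0.1, 0.1, 0.1]),  # robot at rest
    ((0.5, 0.2), [0.8, 0.6, 0.2, 0.1]),  # slow arm motion
    ((1.0, 0.9), [1.5, 1.2, 0.7, 0.3]),  # fast arm motion
]

# The robot knows its current joint state, so it can pick a matching template.
observed = [1.0, 1.6, 0.9, 0.1]
noise = nearest_template((0.55, 0.25), TEMPLATES)
cleaned = suppress(observed, noise)
print(cleaned)
```

The floor keeps every bin positive after subtraction, a common safeguard in spectral subtraction against over-subtracting when the template overestimates the noise.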

Missing-Feature-Theory-based Integration [ASRU 07]
• Pipeline: noisy/simultaneous speech → noise suppression → automatic speech recognition (ASR) → text
• Mismatch between the two blocks: noise suppression outputs distorted speech, whereas ASR expects clean speech or speech with known noise.
• Missing Feature Theory (MFT) for better integration

Missing Feature Theory (MFT)
• Normal ASR: missing features caused by separation produce a large error against the acoustic model stored in the ASR.
• MFT-based ASR: a missing feature mask (MFM) is applied, which gives a small error.
• $x(i)$: the features of the corrupted sound at time $t$
• One of the most important issues is automatic MFM generation.
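The effect of a missing feature mask can be sketched with a toy classifier. This is a minimal sketch under assumptions: each acoustic model is a single diagonal Gaussian, the mask simply weights each feature's log-likelihood, and the model names, means, and feature values are invented for illustration. A real ASR system uses far richer acoustic models.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def masked_log_likelihood(features, mask, means, variances):
    """Sum per-feature log-likelihoods, weighting each by its mask value."""
    return sum(m * log_gaussian(x, mu, var)
               for x, m, mu, var in zip(features, mask, means, variances))

def recognize(features, mask, models):
    """Return the name of the model with the highest masked log-likelihood."""
    return max(models,
               key=lambda name: masked_log_likelihood(features, mask, *models[name]))

# Two hypothetical acoustic models: (means, variances) per feature.
MODELS = {
    "a": ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0]),
    "b": ([3.0, 3.0, 3.0], [1.0, 1.0, 1.0]),
}

# Features of a sound that belongs to model "a", but the third feature is
# corrupted (for example by leakage from another source after separation).
features = [0.1, -0.2, 12.0]

normal_result = recognize(features, [1, 1, 1], MODELS)  # all features trusted
mft_result = recognize(features, [1, 1, 0], MODELS)     # corrupted one masked
print(normal_result, mft_result)
```

With every feature trusted, the corrupted third value dominates the score and pulls the decision toward the wrong model. Masking it leaves only the reliable features, which is why generating the mask automatically matters so much.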

An example of an automatically generated MFM
[Figure: captured spectrogram and MFM (1 = reliable, 0 = unreliable) for three simultaneous speakers: left “Arayuru genjitsu wo …”, center “Isshukan bakari …”, right “Terebi gemu ya pasokon de …”; speech passes, leakage is masked]

Open Source Robot Audition Software HARK
• HARK: HRI-JP Audition for Robots with Kyoto University
  – hark = listen (old English)
  – Research: free (commercial: licensing)
  – http://www.hark.jp/
• Components: array, sound source localization, sound source separation, automatic speech recognition, dialog
• Developed in collaboration between Kyoto Univ., HRI-JP, and Tokyo Tech.

History and Tutorials
1. Apr. 2008: First release (0.1.7)
  – 1st Tutorial: Nov. 17th, 2008, Kyoto University, Kyoto, Japan
  – 2nd Tutorial: Dec. 5th, 2008, KIST, Seoul, Korea
2. Nov. 2009: 1.0.0 pre-release
  – 3rd Tutorial: Nov. 20th, 2009, Keio University, Yokohama, Japan
  – 4th Tutorial: Dec. 5th, 2009, Univ. de Pierre et Marie Curie, Paris, France
3. Nov. 2010: Major version-up (1.0.0): performance, rich documents
  – 5th Tutorial: Nov. 20th, 2010, Kyoto University, Kyoto, Japan
4. Feb. 2012: Version-up (1.1): performance, 64-bit processing, ROS
  – 6th Tutorial: Feb. 29th, 2012, Univ. de Pierre et Marie Curie, Paris, France
  – 7th Tutorial: Mar. 9th, 2012, Nagoya University, Nagoya, Japan
5. Mar. 2013: Version-up (1.7): Windows, Kinect, PSEye
  – 8th Tutorial: Mar. 19th, 2013, Kyoto University, Kyoto, Japan
6. Oct. 2013: Major version-up (2.0): HARKDesigner, Microcone
  – 9th Tutorial: Oct. 2nd, 2013, LAAS-CNRS, Toulouse, France
  – 10th Tutorial: Dec. 5th, 2013, Waseda University, Tokyo, Japan
7. Nov. 2014: Version-up (2.1)
  – 11th Tutorial: Nov. 21st, 2014, Waseda University, Tokyo, Japan
8. Nov. 2015: Version-up (2.2) planned

Features in HARK (1)
• GUI programming environment (HARK Designer)
  – Web-based programming environment (jQuery, node.js, HTML5)
  – Chrome/Safari/Firefox on Linux/Windows/Mac
  – Small overhead in module communication (frame-based processing) provided by FlowDesigner [Cote04]
[Figure: an example of a robot audition system with HARK: a) module network, b) property setting]

Features in HARK (2)
• Supports many multi-channel sound input devices
  – ALSA-supported sound cards (e.g. RME)
  – Microcone (7 mics)
  – PlayStation Eye (4 mics)
  – Kinect (4 mics)
• Advanced signal processing technologies
  – Localization: GEVD/GSVD [Nakamura’11], 3D localization
  – Separation: GHDSS [Nakajima ‘09], HRLE [Nakajima ‘10], etc.
• Easy to install
  – Just use the package management tool “apt-get”!
• Rich documentation
  – Manual and cookbook, over 300 pages, in Japanese and English
• Packages: ROS, OpenCV, Python, …

Outline
1. Background of Robot Audition
2. Introduction to Robot Audition Research
3. Open Source Software for Robot Audition, HARK
4. Deployment of Robot Audition
5. Summary

Musical Robot [IEEE IROS 09 workshop on musical robots]
• Human-robot interaction according to musical beats
• Adaptive beat tracking
• HRP2, Nao: thereminist
