Artificial Neural Networks for Multimodal Information Fusion - PowerPoint PPT Presentation

Artificial Neural Networks for Multimodal Information Fusion Friedhelm Schwenker Institute of Neural Information Processing University of Ulm Cairo University April 9, 2010 Schwenker UUlm ANN Informationfusion

Outline Artificial neural networks (ANN) Recognition of bio-acoustic time series Emotion recognition in human computer interatction Schwenker UUlm ANN Informationfusion

Pattern recognition applications at NI Recognition of visual objects from camera images (OCR, faces recognition) Medical diagnosis and bioinformatics Speaker identification Speech recognition/understanding Recognition of human emotions from speech, and facial expressions Bio-acoustic pattern recognition .... Schwenker UUlm ANN Informationfusion

1. Artificial Neural networks Von Neumann Computer Biologial neural net Processor complex simple high speed low speed 1 or a few large number Computing centralized distributed sequential parallel by programs by learning from data Memory localized distributed addressable by keys addressable by content not faulttolerant faulttolerant Schwenker UUlm ANN Informationfusion

Layered Networks Layered neural networks (single or multilayer perceptrons, radial basis function networks) are widely used in pattern recognition and regression applications. Input Weight matrix Nonlinear transfer function Output Schwenker UUlm ANN Informationfusion

Neural Models Linear neuron n Weight vector � y = � x , c � = x i c i c i = 1 Threshold neuron Input � 1 � x , c � ≥ θ x y = 0 sonst Sigmoidal neuron 1 f y = f ( � x , c �− θ ) , f ( s ) = 1 + exp ( − β s ) Output RBF neuron y y = f ( � x − c � 2 ) , f ( r ) = exp ( − r 2 2 σ 2 ) Schwenker UUlm ANN Informationfusion

Learning in artificial neural nets Mapping F C : X → Y , Input connectivity matrix C learnt by C examples x Data x ∈ X or ( x , T ) ∈ X × Y Different types of target function E ( C ) . Optimising E ( C ) leads to Teacher T y Output learning rules for C . Schwenker UUlm ANN Informationfusion

Supervised learning j c ij Output y j , teaching signal T j . c ij adapted, such that y j ≈ T j . Input x Example: Delta-rule i � � ∆ c ij ∼ x i T j − y j Delta-rule minimises T Teacher j E ( c ) = � T − y � 2 Output y j Schwenker UUlm ANN Informationfusion

Unsupervised Learning j c ij c ij adapted without teaching signal Input x Example: Hebbian learning : i ∆ c ij ∼ x i y j Hebbian learning maximises E ( c ) = � y � 2 Output y j Schwenker UUlm ANN Informationfusion

Competitive learning Winner detection j c ij Neurons of neighbourhood of the winner are adapted Input x Example: SOM learning or i k-means : ∆ c ij ∼ ( x i − c ij ) · N j N Winner j + k-means minimises Neigbourhood y Output E ( c ) = � c − x � 2 j Schwenker UUlm ANN Informationfusion

Model complexity and training data Artificial neural networks can solve complex tasks, e.g. high-dimensional input (many input variables), high-dimensional output (multi-class problems). Large networks (with many parameters) are needed to achieve good approximations. Size of the training set grows with the number of free parameters � VCdim log 1 � + 1 � log ( 1 �� M �,δ = O VCdim = O ( W log ( K )) , � δ W ( K ) number of weights (units), � error, 1 − δ confidence. Typically the training data set is to small. Possible approach: Decomposition of the learning task in combination with information/sensor fusion . Schwenker UUlm ANN Informationfusion

Multimodal Informationfusion Sensors Feature extraction Fusion feature extraction 1 vision feature decision extraction 2 audio feature extraction N Schwenker UUlm ANN Informationfusion

Early fusion • Mid-level fusion • Late fusion(MCS) Schwenker UUlm ANN Informationfusion

Multiple Classifier Systems architecture Classifier�Layer Fusion�Layer 1 C (x ) 1 Feature�1 : : : : i C (x�) z i Feature�i F : : : I : C (x�) I Feature�I Classification Schwenker UUlm ANN Informationfusion

Fixed decision fusion mappings Fusion by Averaging : I F ( P ) := 1 � C i ( x i ) (1) I i = 1 Probabilistic Fusion : I 1 + α Pr ( ω = l ) Pr ( ω = l | x i ) Pr ( ω � = l ) � − 1 � � Pr ( ω = l | x 1 ... x I ) = 1 − � Pr ( ω � = l ) Pr ( ω � = l | x i ) Pr ( ω = l i = 1 (2) with Pr ( ω = l | x i ) = C i l ( x i ) + � l ( x i ) Voting, Median-Fusion, .... Schwenker UUlm ANN Informationfusion

Examples of trainable fusion mappings Train the classifier layer by 1 Train the fusion mapping 2 Decision templates Bayes rule Behaviour knowledge space (Linear) Associative memory networks (Hebbian Learning, delta learning rulde, pseudo-inverse solution) Artificial neural networks/Kernel methods Schwenker UUlm ANN Informationfusion

2. Bio-acoustic pattern recognition Schwenker UUlm ANN Informationfusion

Example : Ephippiger 0.6 0.4 0.2 Amplitude 0 −0.2 −0.4 −0.6 1 2 3 4 5 6 Time (s) 0.8 0.6 0.4 0.2 Amplitude 0 −0.2 −0.4 −0.6 −0.8 0.22 0.24 0.26 0.28 0.3 0.32 0.34 0.36 Time (s) 0.6 0.4 0.2 Amplitude 0 −0.2 −0.4 −0.6 278 280 282 284 286 Time (ms) Schwenker UUlm ANN Informationfusion

Extraction of Local Features in Time series window W 1 window W X( ) 1 � X( ) 2 �� X(��) s ( t ) T Signal t = 1 I local features X (  ) = ( x 1 (  ) ... x I (  )) Schwenker UUlm ANN Informationfusion

FCT-Architecture 1 j Feature�1 x�( ) x�( ) j j � � Feature i x ( ) j i Feature I x ( ) j Final I Fusion Classification X( ) j Classification j z 1 z o z z � � � � Temporal�fusion j =�1,�..., R Φ , Φ = � I F usion: X (  ) = ( x 1 (  ) ... x I (  )) ∈ I i = 1 d i C  := C  ( X (  )) C lassification: C o := F ( C 1 ... C J ) T emporal fusion: Schwenker UUlm ANN Informationfusion

CDT-Architecture C (x�(��)) j � 1 x�( ) j Decision�fusion � Feature�1 Classification 1 C x ( ) j i Feature i i C x ( ) j I Feature I C I o C Final Temporal�fusion j =�1,�..., Classification C i : I R d i → ∆ , i = 1 ... I C lassification: C 1 ( x 1 (  )) ... C I ( x I (  )) C  := F ( C 1 ( x 1 (  )) ... C I ( x I (  ))) D ecision fusion: C o := F ( C 1 ... C J ) T emporal fusion: Schwenker UUlm ANN Informationfusion

Results for cricket songs Crossvalidation experiments (mean error rates) of 28 cricket species with 4 to 6 animals per species. Radial-Basis-Function Networks as first level classifiers. Extracted features: pulse length, pulse distance, energy contour, Averaged fusion lead to an error ≥ 0 . 1 Algorithm ρ =0.0 ρ =0.2 ρ =0.4 ρ =0.6 ρ =0.8 ρ =1.0 DT 8.61 7.88 8.03 7.74 7.59 7.59 Multiple DT 8.32 8.03 7.15 6.86 6.86 6.72 Cluster DT 8.61 7.30 7.15 7.15 7.30 7.30 Schwenker UUlm ANN Informationfusion

3. Multimodal pattern recognition of emotions in HCI Human machine interaction (HCI) Emotion theory and emotional data collection Recognition of facial expressions Audio-Visual Laughter detection Schwenker UUlm ANN Informationfusion

Human machine interaction (1) Schwenker UUlm ANN Informationfusion

Human machine interaction (2) In many situations the human machine interaction (HCI) could be improved by having machines naturally adapt to their users. HCI should take into account information of the emotional state of the user, e.g. frustration, confusion, disliking, interest, surprise, anger, ... Schwenker UUlm ANN Informationfusion

Ekman’s 6 basic emotions Based on psychophysical experiments of facial expressions Ekman/Friesen defined 6 basic emotions: Anger Surprise Disgust Sadness Happiness Fear Schwenker UUlm ANN Informationfusion

More complex emotion theories Schwenker UUlm ANN Informationfusion

Frontal views Recognition of emotions in facial expressions based on frontal views seem to be easy ... Schwenker UUlm ANN Informationfusion

Helmut data set Three camera views to the user: frontal, back, total Person is labelled to be interested Schwenker UUlm ANN Informationfusion

Face detection from the frontal view Viola-Jones classifier is implemented to detect the region of the user’s face Sobel edge detector is applied to extract features relevant to classify the user’s emotional state Schwenker UUlm ANN Informationfusion

Multimodal emotions Emotions are expressed through Body movements (head, arms, torso, legs) Hand gestures Gaze Facial expressions Speech Biophysiological measures (e.g. skin conductance, heart rate, blood volume pressure) Schwenker UUlm ANN Informationfusion

Multimodal emotional data Nexus with 24 EEG sensors, 4 EMG sensors, blood pressure and respiration meter 1 camera 1 microphone Schwenker UUlm ANN Informationfusion

3.1 Emotion regnition from facial expressions Cohn-Kanade benchmark data base Basic emotions (anger, disgust, fear, happiness, sadness, surprise) acted by semi-professional actors. 432 sequences (97 individuals) of 30 frames per second; resolution 640 × 480; Schwenker UUlm ANN Informationfusion

Artificial Neural Networks for Multimodal Information Fusion - PowerPoint PPT Presentation

Artificial Neural Networks for Multimodal Information Fusion Friedhelm Schwenker Institute of Neural Information Processing University of Ulm Cairo University April 9, 2010 Schwenker UUlm ANN Informationfusion Outline Artificial neural

Multimodal Machine Learning Louis-Philippe (LP) Morency CMU Multimodal Communication and Machine

Multimodal Machine Learning Louis-Philippe (LP) Morency CMU Multimodal Communication and Machine

Learning Neural Networks Learning Neural Networks Neural Networks can represent complex Neural

Neural Networks and Handwriting Recognition Background Neural Networks Neural Network Steven

Artificial Neural Networks By: Kodi Neumiller Overview What is an artificial neural network

Introduction to Artificial Intelligence Neural Networks - Deep Learning for NLP Janyl Jumadinova

Neural Networks Neural networks arise from attempts to model Neural Networks human/animal

Neural Information Retrieval Wassila Lalouani 1 Plan Neural network architectures Neural

Artificial Neural Networks Roger Barlow CODATA School - Roger Barlow -Artificial Neural Networks

How Neural Networks (NN) Biological Neuron: A . . . Can (Hopefully) Learn Artificial Neural . .

Artificial Neural Networks Oliver Schulte - CMPT 726 Feed-forward Networks Network Training

Networks Luke Schuler Overview What is an Artificial Neural Network? History

Sequential Data with Neural Networks Recurrent Neural Networks Sequential input / output Greg

CS4501: Introduction to Computer Vision Neural Networks (NNs) Artificial Neural Networks (ANNs)

CHAPTER II I CHAPTER I Recurrent Neural Networks Recurrent Neural Networks CHAPTER II : I :

CHAPTER II III I CHAPTER Neural Networks as Neural Networks as Associative Memory

Richard Gibson SIAT Faculty Search Presentation February 28, 2013 One Slide Summary 2009

Artificial intelligence Artificial Intelligence is the science of PHILOSOPHY OF ARTIFICIAL

Intro to Artificial Neural Networks Oscar Maas @oscmansan Outline 1. Perceptrons 2.

The Artificial Jack of All Trades: The Importance of Generality in Approaches to AI Tarek R.

Artificial Intelligence for Games IMGD 4000 Introduction to Artificial Intelligence (AI)

Dynamic Modelling of the Whole Heart Based on a Frequency Formulation and Implementation of

Figure 2.25 from page 92 of Exploring the Heart of Ma2er

Giacomo Fiorin, Fabrizio Marinelli Temple Materials Institute, Philadelphia, PA National Heart,