EMOTION RECOGNITION IN SOUND, ANASTASIYA S. POPOVA, HSE NN



SLIDE 1

EMOTION RECOGNITION IN SOUND

ANASTASIYA S. POPOVA

HSE NN 2017

SLIDE 2

INTRODUCTION

SLIDE 3

THE PROBLEM

y : X → Y

y : R^n → Y

SLIDE 4

THE DATASET (RAVDESS DATABASE)

http://neuron.arts.ryerson.ca/ravdess/?f=3

SLIDE 5

PREPROCESSING

Length equalization
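The slide does not say how the clips are brought to a common length. A minimal sketch, assuming the usual approach of truncating long clips and zero-padding short ones to a fixed number of samples (the function name and the target length are illustrative):

```python
def equalize_length(samples, target_len, pad_value=0.0):
    """Return a copy of `samples` with exactly `target_len` items."""
    if len(samples) >= target_len:
        # Long clip: keep only the first `target_len` samples.
        return list(samples[:target_len])
    # Short clip: append padding (silence) at the end.
    return list(samples) + [pad_value] * (target_len - len(samples))


# Toy clips of different lengths (assumed data, not from the dataset).
clips = [[0.1, 0.2, 0.3], [0.5], [0.4, 0.3, 0.2, 0.1, 0.0]]
equalized = [equalize_length(c, 4) for c in clips]
```

With equal lengths, every clip produces a spectrogram of the same width, which a fixed-input network needs.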

SLIDE 6

PREPROCESSING

Loudness normalization
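One common way to normalize loudness is to rescale each clip to the same root-mean-square (RMS) level. The sketch below assumes that approach; the slide does not specify the method, and the target level of 0.1 is an arbitrary illustrative value.

```python
import math


def rms(samples):
    """Root-mean-square level of a clip."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_rms(samples, target_rms=0.1):
    """Scale a clip so that its RMS level equals `target_rms`."""
    current = rms(samples)
    if current == 0.0:
        # A silent clip cannot be rescaled; return it unchanged.
        return list(samples)
    gain = target_rms / current
    return [s * gain for s in samples]


# Toy clips (assumed data): one quiet, one loud.
quiet = [0.01, -0.02, 0.015, -0.005]
loud = [0.5, -0.8, 0.6, -0.3]
quiet_norm = normalize_rms(quiet)
loud_norm = normalize_rms(loud)
```

After this step both clips have the same RMS level, so the classifier cannot rely on recording volume alone.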

SLIDE 7

PREPROCESSING

High-pass and low-pass filters, voice activity detection (VAD) algorithm
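The slide names the filtering and VAD steps without giving their design. As a rough illustration only, the sketch below uses a one-pole low-pass filter, its complementary high-pass filter, and a simple frame-energy VAD. The filter type, `alpha`, the frame length, and the threshold are all assumptions, not the author's settings.

```python
def low_pass(samples, alpha):
    """One-pole low-pass filter; alpha in (0, 1], smaller means smoother."""
    out, prev = [], 0.0
    for x in samples:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def high_pass(samples, alpha):
    """Complementary high-pass: the input minus its low-passed version."""
    return [x - y for x, y in zip(samples, low_pass(samples, alpha))]


def energy_vad(samples, frame_len, threshold):
    """One boolean per full frame: True where mean energy exceeds threshold."""
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        flags.append(energy > threshold)
    return flags


# Toy signal (assumed data): silence, a burst of activity, silence.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
flags = energy_vad(signal, frame_len=8, threshold=0.01)
```

Frames flagged `False` can be dropped before feature extraction, so that leading and trailing silence does not dominate the input.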

SLIDE 8

SPECTROGRAM → MEL SPECTROGRAM
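A mel spectrogram is obtained by pooling the linear-frequency bins of each spectrogram frame with triangular filters spaced evenly on the mel scale. The sketch below assumes the widely used formula mel = 2595 · log10(1 + f / 700); the number of filters, the number of bins, and the sample rate are illustrative values, not taken from the slides.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_bins, sample_rate):
    """Triangular filters over `n_bins` linear bins spanning 0..sample_rate/2."""
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale.
    edges_hz = [mel_to_hz(max_mel * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [(sample_rate / 2.0) * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def to_mel(power_frame, bank):
    """Apply the filterbank to one frame of a power spectrogram."""
    return [sum(w * p for w, p in zip(row, power_frame)) for row in bank]


bank = mel_filterbank(n_mels=8, n_bins=129, sample_rate=16000)
mel_frame = to_mel([1.0] * 129, bank)  # flat toy spectrum (assumed data)
```

Applying `to_mel` to every frame of a spectrogram yields the mel spectrogram. The filters are narrow at low frequencies and wide at high ones, which mirrors how pitch is perceived.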

SLIDE 9

THE DIFFERENCE BETWEEN CLASSES (HYPOTHESIS)

neutral, calm, sad, angry, fearful, happy, surprised, disgust

SLIDE 10

CONVOLUTION NETWORK

SLIDE 11

VGG-11 → VGG-16

VGG-11: Input RGB image, Conv3-64, Maxpool, Conv3-128, Maxpool, Conv3-256, Conv3-256, Maxpool, Conv3-512, Conv3-512, Maxpool, Conv3-512, Conv3-512, Maxpool, FC-4096, FC-4096, FC-1000, Soft-max

VGG-16: Input RGB image, Conv3-64, Maxpool, Conv3-128, Maxpool, Conv3-256, Conv3-256, Conv3-256, Maxpool, Conv3-512, Conv3-512, Conv3-512, Maxpool, Conv3-512, Conv3-512, Conv3-512, Maxpool, FC-4096, FC-4096, FC-1000, Soft-max
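To compare the two networks, it helps to count their weight layers and parameters. The sketch below uses the standard VGG-11 and VGG-16 layouts. The standard VGG-16 has two Conv3-64 and two Conv3-128 layers, so treating that as the layout the slide depicted is an assumption. The 224×224 RGB input and the 1000-way output are the ImageNet defaults, also assumed here; an 8-class emotion model would use a smaller final layer.

```python
# Numbers are output channels of 3x3 convolutions; "M" is a 2x2 max-pool.
VGG11 = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
FC = [4096, 4096, 1000]  # fully connected layers shared by both variants


def weight_layers(cfg):
    """Convolutional layers plus fully connected layers."""
    return sum(1 for item in cfg if item != "M") + len(FC)


def parameter_count(cfg, in_channels=3, input_size=224):
    """Total weights and biases for the given configuration."""
    total, channels, size = 0, in_channels, input_size
    for item in cfg:
        if item == "M":
            size //= 2  # each max-pool halves the spatial size
        else:
            total += 3 * 3 * channels * item + item  # kernel weights + biases
            channels = item
    features = channels * size * size  # flattened input to the first FC layer
    for width in FC:
        total += features * width + width
        features = width
    return total


vgg11_params = parameter_count(VGG11)
vgg16_params = parameter_count(VGG16)
```

Most parameters sit in the first fully connected layer, so moving from VGG-11 to VGG-16 adds depth at a relatively small cost in model size.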

SLIDE 12

CLASSIFICATION ACCURACY ON 8 CLASSES

VGG-11 + spectrogram; VGG-16 + mel spectrogram

SLIDE 13

CONFUSION MATRIX
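A confusion matrix counts, for each true class, how often each class was predicted; overall accuracy is the share of counts on the diagonal. The sketch below uses the eight class names from the deck. The labels are made up purely for illustration and are not the results shown on the slide.

```python
CLASSES = ["neutral", "calm", "sad", "angry",
           "fearful", "happy", "surprised", "disgust"]


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels, not results from the presentation.
y_true = ["calm", "calm", "sad", "angry", "happy", "neutral"]
y_pred = ["calm", "neutral", "sad", "angry", "happy", "neutral"]
cm = confusion_matrix(y_true, y_pred, CLASSES)
acc = accuracy(cm)
```

Off-diagonal cells show which emotions are confused with each other, for example a "calm" clip predicted as "neutral" in the toy data above.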

SLIDE 14

MEL FREQUENCY CEPSTRAL COEFFICIENTS (MFCC)
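MFCCs are commonly computed by taking the logarithm of the mel-band energies of a frame and applying a discrete cosine transform (DCT-II), keeping the first few coefficients. The sketch below shows that last step with an unnormalized DCT-II; audio libraries often apply an orthonormal scaling, so absolute values may differ. The number of coefficients and the energy floor are illustrative assumptions.

```python
import math


def dct_ii(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_mel(mel_energies, n_coeffs, floor=1e-10):
    """Log-compress mel-band energies and keep the first DCT coefficients."""
    # The floor avoids log(0) for empty bands.
    log_mel = [math.log(max(e, floor)) for e in mel_energies]
    return dct_ii(log_mel)[:n_coeffs]


# Flat toy spectrum (assumed data): all mel bands carry the same energy.
flat = mfcc_from_mel([2.0] * 8, n_coeffs=4)
```

For a flat spectrum only the zeroth coefficient is non-zero, which illustrates that the higher coefficients describe the shape of the spectral envelope rather than its overall level.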

SLIDE 15

stasysp.96@gmail.com