
Speech Signal Representations
Berlin Chen, 2003

References:
1. X. Huang et al., Spoken Language Processing, Chapters 5, 6
2. J. R. Deller et al., Discrete-Time Processing of Speech Signals, Chapters 4-6
3. J. W. Picone, "Signal modeling techniques in speech recognition," Proceedings of the IEEE, September 1993, pp. 1215-1247

Introduction
• Current speech recognition systems are mainly composed of:
  – A front-end feature extractor (feature extraction module)
    • Required to discover salient characteristics suited for classification
    • Based on scientific and/or heuristic knowledge about the patterns to recognize
  – A back-end classifier (classification module)
    • Required to set class boundaries accurately in the feature space
    • Statistically designed according to the fundamental Bayes' decision theory

Background Review: Digital Signal Processing

Analog Signal to Digital Signal
• Discrete-time signal: $x[n] = x_a(nT)$, the analog signal $x_a(t)$ sampled at $t = nT$, where $T$ is the sampling period
• Sampling rate: $F_s = 1/T$
  – e.g. sampling period = 125 μs ⇒ sampling rate = 8 kHz
• Digital signal: a discrete-time signal with discrete amplitude
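The relation $x[n] = x_a(nT)$ and the 125 μs / 8 kHz example can be sketched in a few lines of Python. This is only an illustration: the helper `sample` and the 1 kHz test tone `tone` are hypothetical names and values chosen here, not part of the slides.

```python
import math

def sample(x_a, T, num_samples):
    """Return x[n] = x_a(n*T) for n = 0 .. num_samples-1."""
    return [x_a(n * T) for n in range(num_samples)]

T = 125e-6       # sampling period in seconds (value from the slide)
F_s = 1.0 / T    # sampling rate in Hz, F_s = 1/T

# Hypothetical analog test signal: a 1 kHz cosine (an assumption for the demo).
def tone(t):
    return math.cos(2 * math.pi * 1000.0 * t)

x = sample(tone, T, 8)
print("sampling rate (Hz):", round(F_s))
print("x[n]:", [round(v, 3) for v in x])
```

At 8 kHz a 1 kHz tone spans eight samples per cycle, so the eight values printed cover exactly one period.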

Analog Signal to Digital Signal
• Continuous-time to discrete-time conversion: the continuous-time signal $x_a(t)$ is multiplied (switched) by a periodic impulse train $s(t)$, and the resulting impulse train $x_s(t)$ is converted to the sequence $x[n] = x_a(nT)$
  – Periodic impulse train: $s(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT)$
  – Sampled signal: $x_s(t) = x_a(t)\,s(t) = \sum_{n=-\infty}^{\infty} x_a(nT)\,\delta(t - nT) = \sum_{n=-\infty}^{\infty} x[n]\,\delta(t - nT)$
  – $x_s(t)$ can be uniquely specified by $x[n]$
• Unit impulse: $\delta(t) = 0 \;\; \forall t \neq 0$, and $\int_{-\infty}^{\infty} \delta(t)\,dt = 1$
• Digital signal: a discrete-time signal with discrete amplitude
[Figure: periodic impulse train $s(t)$, unit impulses at $t = -2T, -T, 0, T, 2T, \ldots, 8T$]

Analog Signal to Digital Signal
• A continuous signal sampled at different periods
  $$x_s(t) = x_a(t)\,s(t) = x_a(t) \sum_{n=-\infty}^{\infty} \delta(t - nT) = \sum_{n=-\infty}^{\infty} x_a(nT)\,\delta(t - nT) = \sum_{n=-\infty}^{\infty} x[n]\,\delta(t - nT)$$
[Figure: $x_a(t)$ and its sampled version $x_s(t)$ for different sampling periods $T$]

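Analog Signal to Digital Signal
• Spectra
  – $X_a(j\Omega)$: spectrum of the analog signal, with highest frequency $\Omega_N$
  – Impulse train: $S(j\Omega) = \frac{2\pi}{T} \sum_{k=-\infty}^{\infty} \delta(\Omega - k\Omega_s)$, where $\Omega_s = \frac{2\pi}{T} = 2\pi F_s$ (sampling frequency)
  – Sampled signal: $X_s(j\Omega) = \frac{1}{2\pi} X_a(j\Omega) * S(j\Omega) = \frac{1}{T} \sum_{k=-\infty}^{\infty} X_a(j(\Omega - k\Omega_s))$
  – Low-pass filter: $R(j\Omega) = T$ for $|\Omega| < \Omega_s/2$ and $0$ otherwise, giving $X_p(j\Omega) = R(j\Omega)\,X_s(j\Omega)$
• If $\Omega_s - \Omega_N > \Omega_N \Rightarrow \Omega_s > 2\Omega_N$ (i.e. $\Omega_N < \frac{1}{2}\Omega_s = \frac{\pi}{T}$): the shifted copies of $X_a(j\Omega)$ do not overlap
• If $\Omega_s - \Omega_N < \Omega_N \Rightarrow \Omega_s < 2\Omega_N$ (i.e. $\Omega_N > \frac{1}{2}\Omega_s = \frac{\pi}{T}$): aliasing distortion
  – High-frequency components get superimposed on low-frequency components
  – $X_a(j\Omega)$ can't be recovered from $X_p(j\Omega)$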
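The aliasing case can be made concrete with a small numerical sketch. Assuming the 8 kHz sampling rate used earlier, a 7 kHz cosine (above $F_s/2$) produces the same samples as a 1 kHz cosine. The function name `sampled_cosine` and the two test frequencies are choices made for this illustration only.

```python
import math

F_s = 8000.0  # sampling rate in Hz (assumed, matching the 8 kHz example)

def sampled_cosine(freq_hz, num_samples, fs=F_s):
    """Samples of cos(2*pi*f*t) taken at t = n / fs."""
    return [math.cos(2 * math.pi * freq_hz * n / fs) for n in range(num_samples)]

low = sampled_cosine(1000.0, 16)    # below F_s/2: represented faithfully
high = sampled_cosine(7000.0, 16)   # above F_s/2: folds back onto 1 kHz

# Largest sample-by-sample difference between the two sequences.
max_diff = max(abs(a - b) for a, b in zip(low, high))
print("max |low - high| =", max_diff)
```

The difference is at floating-point noise level, so once sampled the two tones are indistinguishable, which is why the original spectrum cannot be recovered.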

Analog Signal to Digital Signal
• To avoid aliasing (overlapping, fold-over)
  – The sampling frequency should be greater than twice the highest frequency of the signal to be sampled: $\Omega_s > 2\Omega_N$
  – (Nyquist) sampling theorem
• To reconstruct the original continuous signal
  – Filter with an ideal low-pass filter with cutoff $\Omega_s/2$
    • Equivalent to a convolution in the time domain:
      $$x_a(t) = \sum_{n=-\infty}^{\infty} x_a(nT)\,h(t - nT) = \sum_{n=-\infty}^{\infty} x_a(nT)\,\mathrm{sinc}\!\left(\tfrac{\Omega_s}{2}(t - nT)\right), \quad h(t) = \mathrm{sinc}\!\left(\tfrac{\Omega_s}{2}\,t\right), \quad \mathrm{sinc}(x) = \frac{\sin x}{x}$$
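The reconstruction sum can be approximated numerically with a finite number of samples. The sketch below uses the interpolation kernel $\sin(\pi u)/(\pi u)$ with $u = (t - nT)/T$, which corresponds to an ideal low-pass filter with cutoff $\pi/T$. The names `sinc`, `reconstruct`, and `tone`, the 1 kHz test tone, and the choice of 400 samples are assumptions for illustration; truncating the infinite sum leaves a small residual error.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, T, t):
    """Approximate x_a(t) from a finite list of samples x[n] = x_a(n*T)."""
    return sum(x_n * sinc((t - n * T) / T) for n, x_n in enumerate(samples))

T = 125e-6  # sampling period in seconds (8 kHz sampling rate)

# Hypothetical band-limited test signal: a 1 kHz cosine.
def tone(t):
    return math.cos(2 * math.pi * 1000.0 * t)

samples = [tone(n * T) for n in range(400)]

# Evaluate halfway between two samples, far from the ends of the record.
t_mid = 200.5 * T
error = abs(reconstruct(samples, T, t_mid) - tone(t_mid))
print("reconstruction error between samples:", error)
```

At the sample instants themselves the kernel is 1 for the matching sample and (numerically almost) 0 for all others, so the sum returns the stored sample value.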

Two Main Approaches to Digital Signal Processing
• Filtering
  – Signal in $x[n]$ → Filter → signal out $y[n]$
  – Amplify or attenuate some frequency components of $x[n]$
• Parameter Extraction
  – Signal in $x[n]$ → Parameter Extraction → parameters out, a sequence of $L$ parameter vectors with $m$ components each:
    $$\begin{pmatrix} c_{11} \\ c_{12} \\ \vdots \\ c_{1m} \end{pmatrix}, \begin{pmatrix} c_{21} \\ c_{22} \\ \vdots \\ c_{2m} \end{pmatrix}, \ldots, \begin{pmatrix} c_{L1} \\ c_{L2} \\ \vdots \\ c_{Lm} \end{pmatrix}$$
  – e.g.:
    1. Spectrum estimation
    2. Parameters for recognition

Sinusoid Signals
• $x[n] = A\cos(\omega n + \phi)$
  – $A$: amplitude
  – $\omega$: angular frequency, $\omega = 2\pi f = \frac{2\pi}{T}$
  – $\phi$: phase
  – $f$: normalized frequency, $0 \le f \le 1$
  – $T$: period, represented by a number of samples
• Example: $x[n] = A\cos\!\left(\omega n - \frac{\pi}{2}\right)$ with $T = 25$ samples

Sinusoid Signals
• $x[n]$ is periodic with a period of $N$ (samples) if $x[n + N] = x[n]$:
  $$A\cos(\omega(n + N) + \phi) = A\cos(\omega n + \phi) \;\Rightarrow\; \omega N = 2\pi k \text{ for some positive integer } k$$
  – For $k = 1$: $\omega N = 2\pi$, i.e. $\omega = \frac{2\pi}{N}$
• Examples (sinusoid signals)
  – $x_1[n] = \cos(\pi n / 4)$ is periodic with period $N = 8$
  – $x_2[n] = \cos(3\pi n / 8)$ is periodic with period $N = 16$
  – $x_3[n] = \cos(n)$ is not periodic

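Sinusoid Signals
• $x_1[n] = \cos(\pi n / 4)$
  $$\cos\!\left(\tfrac{\pi}{4} n\right) = \cos\!\left(\tfrac{\pi}{4}(n + N_1)\right) = \cos\!\left(\tfrac{\pi}{4} n + \tfrac{\pi}{4} N_1\right)$$
  $$\Rightarrow \tfrac{\pi}{4} N_1 = 2\pi k \Rightarrow N_1 = 8k \quad (N_1 \text{ and } k \text{ are positive integers}) \qquad \therefore N_1 = 8$$
• $x_2[n] = \cos(3\pi n / 8)$
  $$\cos\!\left(\tfrac{3\pi}{8} n\right) = \cos\!\left(\tfrac{3\pi}{8}(n + N_2)\right) = \cos\!\left(\tfrac{3\pi}{8} n + \tfrac{3\pi}{8} N_2\right)$$
  $$\Rightarrow \tfrac{3\pi}{8} N_2 = 2\pi k \Rightarrow N_2 = \tfrac{16}{3} k \quad (N_2 \text{ and } k \text{ are positive integers}) \qquad \therefore N_2 = 16 \;(k = 3)$$
• $x_3[n] = \cos(n)$
  $$\cos(1 \cdot n) = \cos(1 \cdot (n + N_3)) = \cos(n + N_3) \Rightarrow N_3 = 2\pi k$$
  – Because $N_3$ and $k$ must be positive integers and $\pi$ is irrational, $N_3$ doesn't exist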
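The search for the smallest $N$ with $\omega N = 2\pi k$ can be automated when $\omega$ is a rational multiple of $\pi$. The sketch below assumes $\omega = (p/q)\pi$ and uses a hypothetical helper `fundamental_period`; it cannot represent $\omega = 1$ (the $\cos(n)$ case), which mirrors the fact that no integer period exists there.

```python
import math
from fractions import Fraction

def fundamental_period(omega_over_pi):
    """Smallest N > 0 with omega*N = 2*pi*k, for omega = (p/q)*pi in lowest terms."""
    ratio = Fraction(omega_over_pi)
    p, q = abs(ratio.numerator), ratio.denominator
    if p == 0:
        return 1  # omega = 0 gives a constant signal, which repeats every sample
    # (p/q)*N = 2k  =>  N = 2*q*k/p; the smallest integer N is 2q / gcd(p, 2q).
    return 2 * q // math.gcd(p, 2 * q)

N1 = fundamental_period(Fraction(1, 4))   # x1[n] = cos(pi*n/4)
N2 = fundamental_period(Fraction(3, 8))   # x2[n] = cos(3*pi*n/8)
print("N1 =", N1, "N2 =", N2)

# Numerical spot check that x2[n + N2] equals x2[n].
omega2 = 3 * math.pi / 8
max_dev = max(abs(math.cos(omega2 * (n + N2)) - math.cos(omega2 * n))
              for n in range(32))
print("max |x2[n+N2] - x2[n]| =", max_dev)
```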

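Sinusoid Signals
• Complex Exponential Signal
  – Use Euler's relation to express complex numbers:
    $$z = x + jy \;\Rightarrow\; z = Ae^{j\phi} = A(\cos\phi + j\sin\phi) \quad (A \text{ is a real number})$$
  – $x = A\cos\phi$, $y = A\sin\phi$
[Figure: $z$ in the complex plane, with real axis Re and imaginary axis Im]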

Sinusoid Signals
• A Sinusoid Signal
  $$x[n] = A\cos(\omega n + \phi) = \mathrm{Re}\{Ae^{j(\omega n + \phi)}\} = \mathrm{Re}\{Ae^{j\omega n} e^{j\phi}\}$$

Sinusoid Signals
• Sum of two complex exponential signals with the same frequency
  $$A_0 e^{j(\omega n + \phi_0)} + A_1 e^{j(\omega n + \phi_1)} = e^{j\omega n}\left(A_0 e^{j\phi_0} + A_1 e^{j\phi_1}\right) = e^{j\omega n} A e^{j\phi} = A e^{j(\omega n + \phi)}$$
  where $A$, $A_0$ and $A_1$ are real numbers
  – When only the real part is considered:
    $$A_0\cos(\omega n + \phi_0) + A_1\cos(\omega n + \phi_1) = A\cos(\omega n + \phi)$$
  – The sum of $N$ sinusoids of the same frequency is another sinusoid of the same frequency
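The phasor argument can be checked numerically with Python's `cmath` module: add the two complex amplitudes, convert the result back to polar form, and compare the cosine sum with the single resulting cosine. The helper `add_phasors` and the amplitudes, phases, and frequency below are arbitrary illustrative values, not taken from the slides.

```python
import cmath
import math

def add_phasors(a0, phi0, a1, phi1):
    """Return (A, phi) such that A*e^{j*phi} = a0*e^{j*phi0} + a1*e^{j*phi1}."""
    total = cmath.rect(a0, phi0) + cmath.rect(a1, phi1)
    return cmath.polar(total)

omega = math.pi / 4                              # common angular frequency (assumed)
a0, phi0, a1, phi1 = 1.0, 0.0, 0.5, math.pi / 3  # hypothetical amplitudes and phases
A, phi = add_phasors(a0, phi0, a1, phi1)

# Compare the sum of the two cosines with the single cosine A*cos(omega*n + phi).
max_err = max(
    abs(a0 * math.cos(omega * n + phi0) + a1 * math.cos(omega * n + phi1)
        - A * math.cos(omega * n + phi))
    for n in range(32)
)
print("A =", A, "phi =", phi, "max error =", max_err)
```

Repeating the addition pairwise extends the same check to any number of sinusoids sharing one frequency.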

Some Digital Signals
• Any signal sequence $x[n]$ can be represented as a sum of shifted and scaled unit impulse sequences (signals)
  $$x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n - k]$$
  where $x[k]$ is the scale (weight) and $\delta[n - k]$ is the time-shifted unit impulse sequence
• Example:
  $$x[n] = \sum_{k=-\infty}^{\infty} x[k]\,\delta[n - k] = \sum_{k=-2}^{3} x[k]\,\delta[n - k]$$
  $$= x[-2]\,\delta[n+2] + x[-1]\,\delta[n+1] + x[0]\,\delta[n] + x[1]\,\delta[n-1] + x[2]\,\delta[n-2] + x[3]\,\delta[n-3]$$
  $$= (1)\,\delta[n+2] + (-2)\,\delta[n+1] + (2)\,\delta[n] + (1)\,\delta[n-1] + (-1)\,\delta[n-2] + (1)\,\delta[n-3]$$
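The decomposition can be verified directly for the example sequence, taking the sample values 1, -2, 2, 1, -1, 1 at $n = -2, \ldots, 3$ as read from the slide. The dictionary representation and the helper names `unit_impulse` and `rebuild` are choices made for this sketch.

```python
def unit_impulse(n):
    """delta[n]: 1 at n = 0, otherwise 0."""
    return 1 if n == 0 else 0

# Example sequence from the slide, indexed by n (zero everywhere else).
x = {-2: 1, -1: -2, 0: 2, 1: 1, 2: -1, 3: 1}

def rebuild(x, n):
    """Evaluate sum_k x[k] * delta[n - k] at a single index n."""
    return sum(x_k * unit_impulse(n - k) for k, x_k in x.items())

rebuilt = {n: rebuild(x, n) for n in range(-4, 6)}
print(rebuilt)
```

For each $n$ only the term with $k = n$ survives, so the rebuilt values match the original samples and are zero outside $n = -2, \ldots, 3$.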

Digital Systems
• A digital system $T$ is a system that, given an input signal $x[n]$, generates an output signal $y[n]$
  $$y[n] = T\{x[n]\}$$
[Figure: block diagram, $x[n]$ → $T\{\cdot\}$ → $y[n]$]

Properties of Digital Systems
• Linear
  – A linear combination of inputs maps to the same linear combination of outputs
    $$T\{a\,x_1[n] + b\,x_2[n]\} = a\,T\{x_1[n]\} + b\,T\{x_2[n]\}$$
• Time-invariant (time-shift)
  – A time shift of the input by $m$ samples gives a shift of the output by $m$ samples
    $$y[n \pm m] = T\{x[n \pm m]\}, \quad \forall m$$
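Both properties can be probed numerically on test inputs. The three example systems below (a 3-point moving average, a squarer, and a gain that grows with $n$) are hypothetical illustrations, not systems from the slides, and signals are modelled as Python functions of the integer index $n$. Passing such a check on a few inputs does not prove a property; failing it does disprove one.

```python
import math

def moving_average(x):
    """y[n] = (x[n] + x[n-1] + x[n-2]) / 3: linear and time-invariant."""
    return lambda n: (x(n) + x(n - 1) + x(n - 2)) / 3.0

def squarer(x):
    """y[n] = x[n]^2: time-invariant but not linear."""
    return lambda n: x(n) ** 2

def ramp_gain(x):
    """y[n] = n * x[n]: linear but not time-invariant."""
    return lambda n: n * x(n)

def looks_linear(system, x1, x2, a, b, ns):
    """Check T{a*x1 + b*x2} == a*T{x1} + b*T{x2} on the indices ns."""
    combined = system(lambda n: a * x1(n) + b * x2(n))
    y1, y2 = system(x1), system(x2)
    return all(math.isclose(combined(n), a * y1(n) + b * y2(n), abs_tol=1e-9)
               for n in ns)

def looks_time_invariant(system, x, m, ns):
    """Check T{x[n-m]} == y[n-m] on the indices ns."""
    shifted_in = system(lambda n: x(n - m))
    y = system(x)
    return all(math.isclose(shifted_in(n), y(n - m), abs_tol=1e-9) for n in ns)

x1 = lambda n: math.sin(0.3 * n)            # arbitrary test input
x2 = lambda n: 1.0 if n == 0 else 0.0       # unit impulse
ns = range(-5, 6)

for name, system in [("moving_average", moving_average),
                     ("squarer", squarer),
                     ("ramp_gain", ramp_gain)]:
    print(name,
          looks_linear(system, x1, x2, 2.0, -1.0, ns),
          looks_time_invariant(system, x1, 2, ns))
```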

Properties of Digital Systems
• Linear time-invariant (LTI)
  – The system output can be expressed as a convolution of the input $x[n]$ and the impulse response $h[n]$
  – The system can be characterized by the system's impulse response $h[n]$, which is also a signal sequence
    • If the input $x[n]$ is the impulse $\delta[n]$, the output is $h[n]$
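For finite-length sequences, the convolution takes the standard form $y[n] = \sum_k h[k]\,x[n-k]$, which can be sketched directly. The function name `convolve` and the example impulse response are assumptions for this illustration; both sequences are assumed non-empty and to start at $n = 0$.

```python
def convolve(x, h):
    """Return y[n] = sum_k h[k] * x[n - k] for finite sequences starting at n = 0."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k, h_k in enumerate(h):
            if 0 <= n - k < len(x):   # x is zero outside its stored range
                y[n] += h_k * x[n - k]
    return y

h = [0.5, 0.3, 0.2]            # hypothetical impulse response
impulse = [1.0]                # delta[n]

print(convolve(impulse, h))               # impulse in, impulse response out
print(convolve([1.0, 2.0, 3.0], [1.0, 1.0]))
```

Feeding in the unit impulse returns `h` itself, which is the sense in which the impulse response characterizes an LTI system.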
