Machine Learning for Signal Processing
Representing Signals: Images and Sounds
Class 4, 10 Sep 2013. Instructor: Bhiksha Raj (11-755/18-797)

Administrivia: basics of probability will not be covered; several very nice …


1. How many frequencies in all?
• A maximum of L/2 periods is possible.
• If we try to go to (L/2 + X) periods, it ends up being identical to having (L/2 - X) periods, with sign inversion.
• Example for L = 20:
  – Red curve: a sine with 9 cycles in a 20-point sequence, y(n) = sin(2π·9n/20)
  – Green curve: a sine with 11 cycles in 20 points, y(n) = -sin(2π·11n/20)
  – The blue lines show the actual samples obtained. These are the only numbers stored on the computer, and this set is the same for both sinusoids.
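
A quick numpy check of the aliasing claim above (a minimal sketch; L = 20 as in the slide's example):

```python
import numpy as np

L = 20
n = np.arange(L)

y9 = np.sin(2 * np.pi * 9 * n / L)      # sine with 9 cycles in 20 samples
y11 = -np.sin(2 * np.pi * 11 * n / L)   # sine with 11 cycles, sign inverted

print(np.allclose(y9, y11))             # True: the stored samples are identical
```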

2. How to compose the signal from sinusoids
Signal = w1·B1 + w2·B2 + w3·B3, i.e. [B1 B2 B3] [w1; w2; w3] = B W = Signal
W = pinv(B) · Signal = (BᵀB)⁻¹ Bᵀ · Signal   (a projection)
• The sines form the vectors of the projection matrix.
• pinv() will do the trick as usual.

3. How to compose the signal from sinusoids
The basis matrix B has the sampled sines as its columns, B[n, k] = sin(2πkn/L) for n = 0 … L-1 and k = 1 … L/2 (L/2 columns only), so that
B [w1; w2; …; w_{L/2}] = [s[0]; s[1]; …; s[L-1]]   and   W = pinv(B) · Signal
• The sines form the vectors of the projection matrix.
• pinv() will do the trick as usual.
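
The projection-matrix recipe above is easy to sketch in numpy (a toy example: L = 64 and the weights 3 and -1.5 are made up for illustration):

```python
import numpy as np

L = 64
n = np.arange(L)
k = np.arange(1, L // 2 + 1)            # frequencies 1 .. L/2

# Basis matrix with one sampled sine per column: B[n, k] = sin(2*pi*k*n/L)
B = np.sin(2 * np.pi * np.outer(n, k) / L)

# A toy signal built from two of these sines, with weights 3 and -1.5
signal = 3 * B[:, 2] - 1.5 * B[:, 10]

# Least-squares weights via the pseudo-inverse, as on the slide
W = np.linalg.pinv(B) @ signal
print(np.round(W[[2, 10]], 3))          # ~[ 3.  -1.5]
print(np.allclose(B @ W, signal))       # True: the projection reproduces the signal
```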

4. Interpretation
• Each sinusoid's amplitude is adjusted until it gives us the least squared error; the amplitude is the weight of the sinusoid.
• This can be done independently for each sinusoid.


8. Sines by themselves are not enough
• Every sine starts at zero: it can never represent a signal that is non-zero in the first sample!
• Every cosine starts at 1: if the first sample is zero, the signal cannot be represented!

9. The need for phase
• The sines are shifted: they do not start with value 0. Allow the sinusoids to move!
signal = w1 sin(2π k1 n / N + φ1) + w2 sin(2π k2 n / N + φ2) + …
• How much do the sines shift?

10. Determining phase
• Least squares fitting: move the sinusoid left / right, and at each shift try all amplitudes. Find the combination of amplitude and phase that results in the lowest squared error.
• We can still do this separately for each sinusoid: the sinusoids are still orthogonal to one another.
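
One way to read the "try every shift, then fit the amplitude" idea as code (a sketch; the frequency, true amplitude, and true phase are made up, and for each candidate phase the best amplitude is a one-line least-squares fit):

```python
import numpy as np

L = 256
n = np.arange(L)
k = 5                                   # frequency of the sinusoid being fit (assumed known here)
true_amp, true_phase = 2.0, 0.7
x = true_amp * np.sin(2 * np.pi * k * n / L + true_phase)

best = (np.inf, None, None)
for phase in np.linspace(0, 2 * np.pi, 1000, endpoint=False):
    basis = np.sin(2 * np.pi * k * n / L + phase)
    amp = basis @ x / (basis @ basis)   # closed-form least-squares amplitude for this phase
    err = np.sum((x - amp * basis) ** 2)
    if err < best[0]:
        best = (err, amp, phase)

# Recovers roughly (2.0, 0.7); (-2.0, 0.7 + pi) would describe the same sinusoid
print(best[1], best[2])
```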


14. The problem with phase
The basis matrix now has entries sin(2πkn/L + φ_k) for n = 0 … L-1 and k = 1 … L/2 (L/2 columns only), multiplying [w1; …; w_{L/2}] to give [s[0]; …; s[L-1]].
• This can no longer be expressed as a simple linear algebraic equation: the "basis matrix" depends on the unknown phases, i.e. there is a component of the basis itself that must be estimated!
• Linear algebraic notation can only be used if the bases are fully known; we can only (pseudo) invert a known matrix.

15. Complex exponential to the rescue
b[n] = sin(freq·n)
b[n] = exp(j·freq·n) = cos(freq·n) + j sin(freq·n),   j = √(-1)
exp(j(freq·n + φ)) = exp(jφ)·exp(j·freq·n) = exp(jφ)·(cos(freq·n) + j sin(freq·n))
• The cosine is the real part of a complex exponential; the sine is the imaginary part.
• A phase term for the sinusoid becomes a multiplicative term for the complex exponential!
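
A two-line numpy check of the claim that a phase shift becomes a multiplicative factor (the frequency and phase values are arbitrary):

```python
import numpy as np

n = np.arange(32)
freq, phi = 2 * np.pi * 3 / 32, 0.9     # arbitrary frequency and phase

shifted = np.exp(1j * (freq * n + phi))
factored = np.exp(1j * phi) * np.exp(1j * freq * n)
print(np.allclose(shifted, factored))   # True: phase is just a complex scale factor
```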

16. Explaining with complex exponentials
[Figure: the signal explained as weighted combinations A ×, B ×, C × of complex exponential bases.]

17. Complex exponentials are well behaved
• Like sinusoids, a complex exponential of one frequency can never explain one of another: they are orthogonal.
• They represent smooth transitions.
• Bonus: they are complex, so they can even model complex data!
• They can also model real data: exp(jx) + exp(-jx) is real, since cos(x) + j sin(x) + cos(x) - j sin(x) = 2 cos(x).
• More importantly, exp(j2π(L/2 + x)n/L) + exp(j2π(L/2 - x)n/L) is real: the complex exponentials with frequencies equally spaced from L/2 are complex conjugates.

18. Complex exponentials are well behaved
• exp(j2π(L/2 + x)n/L) + exp(j2π(L/2 - x)n/L) is real: the complex exponentials with frequencies equally spaced from L/2 are complex conjugates.
• "Frequency = k" means k periods in L samples.
• a·exp(j2π(L/2 + x)n/L) + conjugate(a)·exp(j2π(L/2 - x)n/L) is also real: if the two exponentials are multiplied by numbers that are conjugates of one another, the result is real.

19. Complex exponential bases
The weight vector is [w0; …; w_{L/2-1}; w_{L/2}; w_{L/2+1}; …; w_{L-1}], with w_{L/2+k} = conjugate(w_{L/2-k}).
• Explain the data using L complex exponential bases b0, b1, …, b_{L-1}.
• The weights given to the (L/2 + k)-th basis and the (L/2 - k)-th basis should be complex conjugates, to make the result real, because we are dealing with real data.
• Fortunately, a least squares fit automatically gives the two bases conjugate weights; there is no need to impose the constraint externally.

20. Complex exponential bases: algebraic formulation
The basis matrix has entries exp(j2πkn/L) for n = 0 … L-1 (rows) and k = 0 … L-1 (columns); it multiplies [S_0; …; S_{L/2}; …; S_{L-1}] to give [s[0]; …; s[L-1]].
• Note that S_{L/2+x} = conjugate(S_{L/2-x}) for real s.
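
The "least squares gives conjugate weights for free" behaviour is easy to verify (a small sketch; the signal is just random real data):

```python
import numpy as np

L = 16
n = np.arange(L)
k = np.arange(L)

# Basis matrix with entries exp(j 2 pi k n / L): rows index time n, columns index frequency k
E = np.exp(2j * np.pi * np.outer(n, k) / L)

s = np.random.default_rng(0).standard_normal(L)     # an arbitrary real signal

# Least-squares weights for the L complex exponential bases
S, *_ = np.linalg.lstsq(E, s.astype(complex), rcond=None)

x = 3
print(np.allclose(S[L // 2 + x], np.conj(S[L // 2 - x])))   # True: conjugate weights, unprompted
print(np.allclose(E @ S, s))                                # and the fit is exact
```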

21. Shorthand notation
W_L^{k,n} = (1/√L) exp(j2πkn/L) = (1/√L) [cos(2πkn/L) + j sin(2πkn/L)]
The synthesis equation becomes [W_L^{k,n}] (rows n = 0 … L-1, columns k = 0 … L-1) times [S_0; …; S_{L/2}; …; S_{L-1}] equals [s[0]; …; s[L-1]].
• Note that S_{L/2+x} = conjugate(S_{L/2-x}).

22. A quick detour
• Real orthonormal matrix: X Xᵀ = Xᵀ X = I (but only if all entries are real); the inverse of X is its own transpose.
• Definition (Hermitian): X^H is the complex conjugate of Xᵀ. The conjugate of a + jb is a - jb; the conjugate of exp(jx) is exp(-jx).
• Complex orthonormal matrix: X X^H = X^H X = I; the inverse of a complex orthonormal matrix is its own Hermitian.

23. W⁻¹ = W^H
W is the L×L matrix with entries W_L^{k,n} = (1/√L) exp(j2πkn/L); W^H has entries conjugate(W_L^{n,k}) = (1/√L) exp(-j2πkn/L).
• The complex exponential basis is orthogonal.
• Its inverse is its own Hermitian: W⁻¹ = W^H.

24. Doing it in matrix form
Synthesis: [W_L^{k,n}] [S_0; …; S_{L/2}; …; S_{L-1}] = [s[0]; …; s[L-1]]
Analysis:  [conjugate(W_L^{k,n})] [s[0]; …; s[L-1]] = [S_0; …; S_{L/2}; …; S_{L-1}]
• Because W⁻¹ = W^H.

25. The Discrete Fourier Transform
[S_0; …; S_{L/2}; …; S_{L-1}] = [conjugate(W_L^{k,n})] [s[0]; …; s[L-1]]
• The matrix on the right is called the "Fourier matrix".
• The weights (S_0, S_1, …) are called the Fourier transform.

26. The Inverse Discrete Fourier Transform
[W_L^{k,n}] [S_0; …; S_{L/2}; …; S_{L-1}] = [s[0]; …; s[L-1]]
• The matrix on the left is the inverse Fourier matrix.
• Multiplying the Fourier transform by this matrix gives us the signal right back from its Fourier transform.
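
A numpy sketch of slides 23-26. The 1/√L scaling is my assumption: some such normalization is needed for W⁻¹ = W^H to hold exactly, and numpy's np.fft.fft uses an unnormalized convention, hence the √L factor in the last comparison:

```python
import numpy as np

L = 8
n = np.arange(L)
k = np.arange(L)

# Complex exponential basis, scaled by 1/sqrt(L) (assumption) so that W is unitary
W = np.exp(2j * np.pi * np.outer(n, k) / L) / np.sqrt(L)

print(np.allclose(W @ W.conj().T, np.eye(L)))        # W^-1 = W^H

s = np.random.default_rng(1).standard_normal(L)
S = W.conj().T @ s           # Fourier matrix times the signal: the DFT
s_back = W @ S               # inverse Fourier matrix times the transform
print(np.allclose(s_back, s))                        # perfect reconstruction

# Relation to numpy's unnormalized FFT: S here equals fft(s) / sqrt(L)
print(np.allclose(S, np.fft.fft(s) / np.sqrt(L)))
```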

27. The Fourier matrix
• Left panel: the real part of the Fourier matrix, for a 32-point signal.
• Right panel: the imaginary part of the Fourier matrix.

28. The FAST Fourier Transform
• The outcome of the transformation with the Fourier matrix is the DISCRETE FOURIER TRANSFORM (DFT).
• The Fast Fourier Transform is an algorithm that takes advantage of the symmetry of the matrix to perform the matrix multiplication really fast.
• The FFT computes the DFT. It is much faster if the length of the signal can be expressed as 2^N.

29. Images
• The complex exponential is two dimensional: it has a separate X frequency and Y frequency. This would be true even for checkerboards!
• The 2-D complex exponential must be unravelled to form one component of the Fourier matrix.
• For a K×L image, we would have K·L bases in the matrix.

30. Typical image bases
• Only the real components of the bases are shown.

31. DFT: properties
• The DFT coefficients are complex: they have both a magnitude and a phase, S_k = |S_k| exp(j∠S_k).
• Simple linear algebra tells us that DFT(A + B) = DFT(A) + DFT(B): the DFT of the sum of two signals is the sum of their DFTs.
• A horribly common approximation in sound processing: Magnitude(DFT(A+B)) ≈ Magnitude(DFT(A)) + Magnitude(DFT(B)). Utterly wrong, absurdly useful.
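
A quick check of both statements, on random signals (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal(64)
B = rng.standard_normal(64)

# Exact: the DFT is linear
print(np.allclose(np.fft.fft(A + B), np.fft.fft(A) + np.fft.fft(B)))   # True

# The "horribly common" magnitude approximation is not exact
lhs = np.abs(np.fft.fft(A + B))
rhs = np.abs(np.fft.fft(A)) + np.abs(np.fft.fft(B))
print(np.allclose(lhs, rhs))              # False
print(float(np.max(np.abs(lhs - rhs))))   # the error is not small
```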

32. Symmetric signals
[Figure: a signal symmetric about L/2; contributions from points equidistant from L/2 combine to cancel out the imaginary terms.]
• If a signal is (conjugate) symmetric around L/2, the Fourier coefficients are real!
  – A(L/2-k)·exp(-jf(L/2-k)) + A(L/2+k)·exp(-jf(L/2+k)) is always real if A(L/2-k) = conjugate(A(L/2+k)).
  – We can pair up samples around the center all the way; the final summation term is always real.
• Overall symmetry properties:
  – If the signal is real, the FT is (conjugate) symmetric.
  – If the signal is (conjugate) symmetric, the FT is real.
  – If the signal is real and symmetric, the FT is real and symmetric.
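
A small check of "real and symmetric signal gives a real and symmetric FT" (the signal is random values mirrored about L/2):

```python
import numpy as np

L = 16
rng = np.random.default_rng(3)
x = np.zeros(L)
x[:L // 2 + 1] = rng.standard_normal(L // 2 + 1)
x[L // 2 + 1:] = x[1:L // 2][::-1]        # mirror about L/2 so that x[L/2 - k] == x[L/2 + k]

X = np.fft.fft(x)
print(np.allclose(X.imag, 0))             # the FT of a real, symmetric signal is real
print(np.allclose(X[1:], X[1:][::-1]))    # and it is symmetric as well
```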

33. The Discrete Cosine Transform
• Compose a symmetric signal or image (images would be symmetric in two dimensions).
• Compute the Fourier transform. Since the FT is symmetric, it is sufficient to store only half the coefficients (a quarter for an image), i.e. as many coefficients as were originally in the signal / image.

34. DCT
The basis matrix has entries cos(2π(k + 0.5)n/(2L)) for n = 0 … L-1 (rows) and k = 0 … L-1 (L columns); it multiplies [w_0; …; w_{L-1}] to give [s[0]; …; s[L-1]].
• It is not necessary to compute a 2L-sized FFT; it is enough to compute an L-sized cosine transform, taking advantage of the symmetry of the problem.
• This is the Discrete Cosine Transform.
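
A sketch of the cosine-basis decomposition, using the same pseudo-inverse recipe as the earlier sine example. The matrix follows the slide's cos(2π(k + 0.5)n/2L) entries; library DCT routines use different orderings and normalizations, so this is not a drop-in replacement for them:

```python
import numpy as np

L = 32
n = np.arange(L)
k = np.arange(L)

# Cosine basis matrix following the slide: C[n, k] = cos(2*pi*(k + 0.5)*n / (2*L))
C = np.cos(2 * np.pi * np.outer(n, k + 0.5) / (2 * L))

s = np.random.default_rng(4).standard_normal(L)   # any real signal

w = np.linalg.pinv(C) @ s      # cosine-transform coefficients, via the usual projection
print(np.allclose(C @ w, s))   # True: L cosines are enough to represent the L-point signal
```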

35. Representing images
[Figure: an image block multiplied by the DCT matrix gives its DCT.]
• The most common coding is the DCT.
• JPEG: each 8×8 element of the picture is converted using a DCT.
• The DCT coefficients are quantized and stored; the degree of quantization determines the degree of compression.
• The DCT is also used to represent textures etc. for pattern recognition and other forms of analysis.
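
A hedged sketch of the JPEG-style idea (8×8 block, 2-D DCT, quantize, invert). The block is random numbers standing in for pixels, the quantization step of 20 is arbitrary, and the orthonormal DCT-II below is one standard convention, not necessarily the exact matrix on the previous slide:

```python
import numpy as np

def dct_matrix(N=8):
    # Orthonormal 1-D DCT-II matrix (the transform family used in JPEG-style block coding)
    f = np.arange(N)
    D = np.cos(np.pi * np.outer(f, f + 0.5) / N) * np.sqrt(2.0 / N)
    D[0, :] /= np.sqrt(2.0)
    return D

rng = np.random.default_rng(5)
block = rng.integers(0, 256, size=(8, 8)).astype(float)   # a stand-in 8x8 image block

D = dct_matrix(8)
coeffs = D @ block @ D.T                   # separable 2-D DCT of the block

step = 20.0                                # made-up quantization step: larger step = more compression
quantized = np.round(coeffs / step)
recovered = D.T @ (quantized * step) @ D   # dequantize and invert the 2-D DCT

print(float(np.max(np.abs(recovered - block))))   # error is on the order of the quantization step
```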

36. Some tricks to computing Fourier transforms
• Direct computation of the Fourier transform can result in poor representations.
• Boundary effects can cause error. Solution: windowing.
• The size of the signal can introduce inefficiency. Solution: zero padding.

37. What does the DFT represent?
The basis matrix with entries exp(j2πkn/L) (rows n = 0 … L-1, columns k = 0 … L-1) multiplies [S_0; …; S_{L/2}; …; S_{L-1}] to give [s[0]; …; s[L-1]], i.e.
s[n] = Σ_{k=0}^{L-1} S_k exp(j2πkn/L)
• The IDFT can be written formulaically as above.
• There is no restriction on computing the formula for n < 0 or n > L-1; it is just a formula.
• But computing these terms before 0 or beyond L-1 tells us what the signal composed by the DFT looks like outside our narrow window.

38. What does the DFT represent?
[Figure: s[n] and its DFT [S_0, S_1, …, S_31]; the formula s[n] = Σ_{k=0}^{L-1} S_k exp(j2πkn/L) is evaluated from n = -32 to 63.]
• If you extend the DFT-based representation beyond 0 (on the left) or L (on the right), it repeats the signal!
• So what does the DFT really mean?

39. What does the DFT represent?
• The DFT represents the properties of the infinitely long repeating signal that you can generate with it, of which the observed signal is ONE period.
• This gives rise to some odd effects.

40. The discrete Fourier transform
• The discrete Fourier transform of the above signal actually computes the properties of the periodic signal shown below, which extends from -infinity to +infinity.
• The period of this signal is 32 samples in this example.

41. Windowing
• The DFT of one period of the sinusoid shown in the figure computes the spectrum of the entire sinusoid from -infinity to +infinity.
• The DFT of a real sinusoid has only one non-zero frequency.
• The second peak in the figure also represents the same frequency, as an effect of aliasing.


43. Windowing
[Figure: magnitude spectrum.]
• The DFT of one period of the sinusoid shown in the figure computes the spectrum of the entire sinusoid from -infinity to +infinity.
• The DFT of a real sinusoid has only one non-zero frequency.
• The second peak in the figure is the "reflection" around L/2 (for real signals).

44. Windowing
• The DFT of any sequence computes the spectrum of an infinite repetition of that sequence.
• The DFT of a partial segment of a sinusoid computes the spectrum of an infinite repetition of that segment, and not of the entire sinusoid.
• This will not give us the DFT of the sinusoid itself!


46. Windowing
[Figure: magnitude spectrum.]
• The DFT of any sequence computes the spectrum of an infinite repetition of that sequence.
• The DFT of a partial segment of a sinusoid computes the spectrum of an infinite repetition of that segment, and not of the entire sinusoid.
• This will not give us the DFT of the sinusoid itself!

47. Windowing
[Figure: magnitude spectrum of the segment vs. magnitude spectrum of the complete sine wave.]

48. Windowing
• The difference occurs for two reasons:
  – The transform cannot know what the signal actually looks like outside the observed window.
  – The implicit repetition of the observed signal introduces large discontinuities at the points of repetition.
• This distorts even our measurement of what happens at the boundaries of what has been reliably observed.

49. Windowing
• The difference occurs for two reasons:
  – The transform cannot know what the signal actually looks like outside the observed window.
  – The implicit repetition of the observed signal introduces large discontinuities at the points of repetition. These are not part of the underlying signal.
• We only want to characterize the underlying signal; the discontinuity is an irrelevant detail.

50. Windowing
• While we can never know what the signal looks like outside the window, we can try to minimize the discontinuities at the boundaries.
• We do this by multiplying the signal with a window function. We call this procedure windowing, and we refer to the resulting signal as a "windowed" signal.
• Windowing attempts to do the following:
  – Keep the windowed signal similar to the original in the central regions.
  – Reduce or eliminate the discontinuities in the implicit periodic signal.
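
A small numpy illustration of why windowing helps. The 702.5 Hz tone, 16 kHz sampling rate, and 400-sample window are made up; the tone deliberately does not complete a whole number of cycles in the window, so its implicit repetition is discontinuous:

```python
import numpy as np

fs, L = 16000, 400
n = np.arange(L)
x = np.sin(2 * np.pi * 702.5 * n / fs)          # tone that does not fit the window exactly

hann = 0.5 - 0.5 * np.cos(2 * np.pi * n / L)    # the "Hanning" window from the next slides

spec_raw = np.abs(np.fft.fft(x))
spec_win = np.abs(np.fft.fft(x * hann))

peak = int(np.argmax(spec_raw[:L // 2]))
# Relative level 20 bins away from the peak: leakage is far lower after windowing
print(spec_raw[peak + 20] / spec_raw[peak], spec_win[peak + 20] / spec_win[peak])
```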


53. Windowing
[Figure: magnitude spectrum.]

54. Windowing
[Figure: magnitude spectrum of the original segment, magnitude spectrum of the windowed signal, and magnitude spectrum of the complete sine wave.]

55. Window functions
• Cosine windows (window length M, index n begins at 0):
  – Hamming: w[n] = 0.54 - 0.46 cos(2πn/M)
  – Hanning: w[n] = 0.5 - 0.5 cos(2πn/M)
  – Blackman: w[n] = 0.42 - 0.5 cos(2πn/M) + 0.08 cos(4πn/M)
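
The window formulas above, written out directly (note: numpy's built-in np.hamming, np.hanning and np.blackman divide by M - 1 rather than M, so they differ very slightly from these):

```python
import numpy as np

def cosine_windows(M):
    """The three cosine windows from the slide, with index n = 0 .. M-1."""
    n = np.arange(M)
    hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / M)
    hanning = 0.50 - 0.50 * np.cos(2 * np.pi * n / M)
    blackman = 0.42 - 0.50 * np.cos(2 * np.pi * n / M) + 0.08 * np.cos(4 * np.pi * n / M)
    return hamming, hanning, blackman

hamming, hanning, blackman = cosine_windows(400)
print(hamming[:3], hanning[:3], blackman[:3])
```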

56. Window functions
• Geometric windows: rectangular (boxcar), triangular (Bartlett), trapezoid.

57. Zero padding
• We can pad zeros to the end of a signal to make it a desired length.
  – Useful if the FFT (or any other algorithm we use) requires signals of a specified length.
  – E.g. radix-2 FFTs require signals of length 2^n, i.e. some power of 2; we must zero pad the signal to increase its length to the appropriate number.
• The consequence of zero padding is to change the periodic signal whose Fourier spectrum is being computed by the DFT.
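
A minimal zero-padding sketch (reusing the made-up tone from the windowing example; the n argument of np.fft.fft pads the signal with zeros):

```python
import numpy as np

fs, L = 16000, 400
n = np.arange(L)
x = np.sin(2 * np.pi * 702.5 * n / fs)

spec_400 = np.abs(np.fft.fft(x))           # 400-point DFT
spec_512 = np.abs(np.fft.fft(x, n=512))    # zero padded to the next power of 2

# The bin spacing shrinks from fs/400 to fs/512: the padded DFT samples the same
# underlying spectrum more densely; it adds no information and loses none.
print(fs / 400, fs / 512)
```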


59. Zero padding
[Figure: magnitude spectrum.]
• The DFT of the zero padded signal is essentially the same as the DFT of the unpadded signal, with additional spectral samples inserted in between.
• It does not contain any additional information over the original DFT; it also does not contain less information.

60. Magnitude spectra
[Figure.]

61. Zero padding
[Figure: windowed signal.]
• The DFT of the zero padded signal is essentially the same as the DFT of the unpadded signal, with additional spectral samples inserted in between.
• It does not contain any additional information over the original DFT; it also does not contain less information.

62. Magnitude spectra
[Figure.]

63. Zero padding a speech signal
[Figure: 128 samples from a speech signal sampled at 16000 Hz (time domain); the first 65 points of a 128-point DFT (log magnitude spectrum, frequency axis up to 8000 Hz); and the first 513 points of a 1024-point DFT (log magnitude spectrum, frequency axis up to 8000 Hz).]
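
A sketch of the comparison in this slide. We do not have the actual speech samples, so a synthetic harmonic signal stands in for them; the 65- and 513-point slices are the non-redundant halves (0 to 8000 Hz) of the 128- and 1024-point DFTs:

```python
import numpy as np

fs = 16000
t = np.arange(128) / fs
# Stand-in for 128 samples of speech: a few harmonics of a 125 Hz "voice"
x = sum(a * np.sin(2 * np.pi * f * t) for a, f in [(1.0, 125), (0.6, 250), (0.3, 375)])

logmag_128 = np.log(np.abs(np.fft.fft(x)[:65]) + 1e-12)            # first 65 of a 128-point DFT
logmag_1024 = np.log(np.abs(np.fft.fft(x, n=1024)[:513]) + 1e-12)  # first 513 of a 1024-point DFT

print(logmag_128.shape, logmag_1024.shape)   # both span 0 .. 8000 Hz, the padded one more densely
```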

64. The Fourier transform and perception: sound
[Figure: FT, summation, inverse FT.]
• The Fourier transform represents the signal analogously to a bank of tuning forks.
• Our ear has a bank of tuning forks.
• The output of the Fourier transform is perceptually very meaningful.

65. The Fourier transform and perception: sound
[Figure: FT, summation, inverse FT.]
• Processing sound: analyze the sound using a bank of tuning forks, and sample the transduced output of the tuning forks at periodic intervals.

66. Sound parameterization
• The signal is processed in segments of 25-64 ms, because the properties of audio signals change quickly; they are "stationary" only very briefly.

67. Sound parameterization
• The signal is processed in segments of 25-64 ms, because the properties of audio signals change quickly; they are "stationary" only very briefly.
• Adjacent segments overlap by 15-48 ms.


73. Sound parameterization
• Each segment is typically 25-64 milliseconds wide, and segments shift every 10-16 milliseconds.
• Audio signals typically do not change significantly within this short time interval.

74. Sound parameterization
[Figure: windowing, then a DFT, giving a complex spectrum as a function of frequency (Hz).]
• Each segment is windowed and a DFT is computed from it.

75. Sound parameterization
[Figure: windowing.]
• Each segment is windowed and a DFT is computed from it.

76. Computing a spectrogram
• Compute the Fourier spectra of segments of the audio and stack them side by side.
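
A short end-to-end sketch that ties the last few slides together: frame the signal into overlapping segments, window each one, take its DFT, and stack the magnitude spectra as columns. The chirp input and the exact 25 ms / 10 ms frame sizes are illustrative choices taken from the ranges quoted earlier:

```python
import numpy as np

def spectrogram(x, fs=16000, seg_ms=25, shift_ms=10):
    """Bare-bones spectrogram: windowed, overlapping segments, DFT magnitude per segment."""
    seg = int(fs * seg_ms / 1000)
    shift = int(fs * shift_ms / 1000)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(seg) / seg)   # Hanning window, as defined above
    cols = []
    for start in range(0, len(x) - seg + 1, shift):
        frame = x[start:start + seg] * window
        cols.append(np.abs(np.fft.fft(frame)[:seg // 2 + 1]))       # keep the non-redundant half
    return np.stack(cols, axis=1)    # rows: frequency, columns: time

# A one-second made-up chirp standing in for an audio signal
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * (300 + 1200 * t) * t)
S = spectrogram(x, fs)
print(S.shape)   # (frequency bins, number of segments)
```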
