Representing Images and Sounds. Class 4, 3 Sep 2009.

11-755 Machine Learning for Signal Processing. Representing Images and Sounds. Class 4, 3 Sep 2009. Instructor: Bhiksha Raj. Representing an Elephant: "It was six men of Indostan, ... The fifth, who chanced to touch the ear, ... To learning much ..."


  1. Interpretation
     - Each sinusoid's amplitude is adjusted until it gives us the least squared error
       - The amplitude is the weight of the sinusoid
     - This can be done independently for each sinusoid
     11-755 MLSP: Bhiksha Raj
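
The independence of the per-sinusoid fits can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical 32-sample test signal built from two sinusoids; the helper names `sine_basis` and `best_weight` are illustrative and do not come from the slides.

```python
import math

def sine_basis(k, L):
    """Sinusoid with k periods in L samples."""
    return [math.sin(2 * math.pi * k * n / L) for n in range(L)]

def best_weight(signal, basis):
    """Amplitude w that minimizes sum((signal - w * basis)^2)."""
    num = sum(s * b for s, b in zip(signal, basis))
    den = sum(b * b for b in basis)
    return num / den

L = 32
# Hypothetical test signal: amplitude 2.0 at frequency 3, amplitude 0.5 at frequency 5.
signal = [2.0 * a + 0.5 * b for a, b in zip(sine_basis(3, L), sine_basis(5, L))]

# Each weight is fitted on its own, ignoring all the other sinusoids.
w3 = best_weight(signal, sine_basis(3, L))
w5 = best_weight(signal, sine_basis(5, L))
w4 = best_weight(signal, sine_basis(4, L))  # a frequency that is absent from the signal
```

Because the sinusoids are orthogonal, fitting one weight at a time should recover the amplitudes used to build the signal, and the absent frequency should get a weight close to zero.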

  5. Sines by themselves are not enough
     - Every sine starts at zero
       - Can never represent a signal that is non-zero in the first sample!
     - Every cosine starts at 1
       - If the first sample is zero, the signal cannot be represented!

  6. The need for phase
     - Allow the sinusoids to move!
       $$\text{signal} = w_1 \sin(2\pi k n / N + \phi_1) + w_2 \sin(2\pi k n / N + \phi_2) + w_3 \sin(2\pi k n / N + \phi_3) + \ldots$$
     - How much do the sines shift?

  7. Determining phase
     - Least squares fitting: move the sinusoid left / right, and at each shift, try all amplitudes
       - Find the combination of amplitude and phase that results in the lowest squared error
     - We can still do this separately for each sinusoid
       - The sinusoids are still orthogonal to one another
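
The shift-and-fit search can be sketched as follows. This is an illustrative plain-Python version, not the slides' implementation: the function name `fit_sinusoid`, the grid of 360 shifts, and the test signal are all assumptions. For each fixed shift the best amplitude has a closed form, so the sketch computes it directly instead of literally trying every amplitude.

```python
import math

def fit_sinusoid(signal, k, num_shifts=360):
    """Grid-search the phase of a frequency-k sinusoid; return (error, amplitude, phase)."""
    L = len(signal)
    best = None
    for i in range(num_shifts):
        phi = 2 * math.pi * i / num_shifts
        basis = [math.sin(2 * math.pi * k * n / L + phi) for n in range(L)]
        # Closed-form least-squares amplitude for this particular shift.
        w = sum(s * b for s, b in zip(signal, basis)) / sum(b * b for b in basis)
        err = sum((s - w * b) ** 2 for s, b in zip(signal, basis))
        if best is None or err < best[0]:
            best = (err, w, phi)
    return best

# Hypothetical target: amplitude 1.5, frequency 2, phase pi/2, 16 samples.
target = [1.5 * math.sin(2 * math.pi * 2 * n / 16 + math.pi / 2) for n in range(16)]
err, w, phi = fit_sinusoid(target, k=2)
```

A shift of `phi + pi` with a negated amplitude describes the same sinusoid, so only the magnitude of the recovered amplitude is unambiguous.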

  11. The problem with phase
      $$\begin{bmatrix} \sin(2\pi \cdot 0 \cdot 0/L + \phi_0) & \sin(2\pi \cdot 1 \cdot 0/L + \phi_1) & \cdots & \sin(2\pi \cdot (L/2) \cdot 0/L + \phi_{L/2}) \\ \sin(2\pi \cdot 0 \cdot 1/L + \phi_0) & \sin(2\pi \cdot 1 \cdot 1/L + \phi_1) & \cdots & \sin(2\pi \cdot (L/2) \cdot 1/L + \phi_{L/2}) \\ \vdots & \vdots & & \vdots \\ \sin(2\pi \cdot 0 \cdot (L-1)/L + \phi_0) & \sin(2\pi \cdot 1 \cdot (L-1)/L + \phi_1) & \cdots & \sin(2\pi \cdot (L/2) \cdot (L-1)/L + \phi_{L/2}) \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_{L/2} \end{bmatrix} = \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      (L/2 columns only)
      - This can no longer be expressed as a simple linear algebraic equation
        - The phase is integral to the bases
      - I.e. there's a component of the basis itself that must be estimated!
      - Linear algebraic notation can only be used if the bases are fully known
        - We can only (pseudo) invert a known matrix

  12. Complex Exponential to the rescue
      $$b[n] = \sin(\mathrm{freq} \cdot n)$$
      $$b[n] = \exp(j \cdot \mathrm{freq} \cdot n) = \cos(\mathrm{freq} \cdot n) + j \sin(\mathrm{freq} \cdot n), \qquad j = \sqrt{-1}$$
      $$\exp(j(\mathrm{freq} \cdot n + \phi)) = \exp(j \cdot \mathrm{freq} \cdot n)\,\exp(j\phi) = \cos(\mathrm{freq} \cdot n + \phi) + j \sin(\mathrm{freq} \cdot n + \phi)$$
      - The cosine is the real part of a complex exponential
        - The sine is the imaginary part
      - A phase term for the sinusoid becomes a multiplicative term for the complex exponential!
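
Since the phase becomes a multiplicative factor, one complex least-squares weight carries both the amplitude and the phase. A minimal sketch, assuming a hypothetical amplitude of 2.0 and phase of 0.7 radians (all variable names are illustrative):

```python
import cmath
import math

L, k = 16, 3
amp, phi = 2.0, 0.7  # hypothetical amplitude and phase to be recovered

# Unshifted complex exponential basis and a shifted, scaled copy of it.
basis = [cmath.exp(2j * math.pi * k * n / L) for n in range(L)]
shifted = [amp * cmath.exp(1j * (2 * math.pi * k * n / L + phi)) for n in range(L)]

# Least-squares weight of the unshifted basis: <basis, shifted> / <basis, basis>.
# <basis, basis> equals L because every sample of the basis has magnitude 1.
weight = sum(b.conjugate() * s for b, s in zip(basis, shifted)) / L

recovered_amp = abs(weight)          # magnitude of the weight is the amplitude
recovered_phi = cmath.phase(weight)  # angle of the weight is the phase shift
```

No search over shifts is needed here: the weight comes out as `amp * exp(j * phi)`.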

  13. Explaining with Complex Exponentials
      [Figure: signals A, B and C explained with complex exponentials]

  14. Complex exponentials are well behaved
      - Like sinusoids, a complex exponential of one frequency can never explain one of another
        - They are orthogonal
      - They represent smooth transitions
      - Bonus: They are complex
        - Can even model complex data!
      - They can also model real data
        - $\exp(jx) + \exp(-jx)$ is real
        - $\cos(x) + j\sin(x) + \cos(x) - j\sin(x) = 2\cos(x)$
      - More importantly
        - $\exp\left(j 2\pi \frac{(L/2 - x)\,n}{L}\right) + \exp\left(j 2\pi \frac{(L/2 + x)\,n}{L}\right)$ is real
        - The complex exponentials with frequencies equally spaced from L/2 are complex conjugates

  15. Complex exponentials are well behaved
      - $\exp\left(j 2\pi \frac{(L/2 - x)\,n}{L}\right) + \exp\left(j 2\pi \frac{(L/2 + x)\,n}{L}\right)$ is real
        - The complex exponentials with frequencies equally spaced from L/2 are complex conjugates
      - "Frequency = k" means k periods in L samples
      - $a \exp\left(j 2\pi \frac{(L/2 - x)\,n}{L}\right) + \mathrm{conjugate}(a) \exp\left(j 2\pi \frac{(L/2 + x)\,n}{L}\right)$ is also real
        - If the two exponentials are multiplied by numbers that are conjugates of one another the result is real
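
The conjugate-pair claim is easy to verify numerically. A small sketch, assuming L = 16, an offset x = 3 from L/2, and an arbitrary hypothetical complex weight `a`:

```python
import cmath
import math

L, x = 16, 3
a = complex(0.8, -1.1)  # hypothetical complex weight

def pair_sum(n):
    """a * exp(j2pi(L/2 - x)n/L) + conjugate(a) * exp(j2pi(L/2 + x)n/L)."""
    low = cmath.exp(2j * math.pi * (L / 2 - x) * n / L)
    high = cmath.exp(2j * math.pi * (L / 2 + x) * n / L)
    return a * low + a.conjugate() * high

# The imaginary part should vanish (up to rounding) at every sample index.
max_imag = max(abs(pair_sum(n).imag) for n in range(L))
```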

  16. Complex Exponential bases
      $$\begin{bmatrix} b_0 & b_1 & \cdots & b_{L/2} & \cdots \end{bmatrix} \begin{bmatrix} w_0 \\ \vdots \\ w_{L/2-1} \\ w_{L/2} \\ w_{L/2+1} \\ \vdots \\ w_{L-1} \end{bmatrix}, \qquad w_{L/2+k} = \mathrm{conjugate}(w_{L/2-k})$$
      - Explain the data using L complex exponential bases
      - The weights given to the (L/2 + k)th basis and the (L/2 - k)th basis should be complex conjugates, to make the result real
        - Because we are dealing with real data
      - Fortunately, a least squares fit will give us conjugate weights for the two bases automatically; there is no need to impose the constraint externally

  17. Complex Exponential Bases: Algebraic Formulation
      $$\begin{bmatrix} \exp(j2\pi \cdot 0 \cdot 0/L) & \cdots & \exp(j2\pi (L/2) \cdot 0/L) & \cdots & \exp(j2\pi (L-1) \cdot 0/L) \\ \exp(j2\pi \cdot 0 \cdot 1/L) & \cdots & \exp(j2\pi (L/2) \cdot 1/L) & \cdots & \exp(j2\pi (L-1) \cdot 1/L) \\ \vdots & & \vdots & & \vdots \\ \exp(j2\pi \cdot 0 \cdot (L-1)/L) & \cdots & \exp(j2\pi (L/2)(L-1)/L) & \cdots & \exp(j2\pi (L-1)(L-1)/L) \end{bmatrix} \begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix} = \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      - Note that $S_{L/2+x} = \mathrm{conjugate}(S_{L/2-x})$

  18. Shorthand Notation
      $$W_L^{k,n} = \frac{1}{\sqrt{L}} \exp(j2\pi kn/L) = \frac{1}{\sqrt{L}} \left( \cos(2\pi kn/L) + j \sin(2\pi kn/L) \right)$$
      $$\begin{bmatrix} W_L^{0,0} & \cdots & W_L^{L/2,0} & \cdots & W_L^{L-1,0} \\ W_L^{0,1} & \cdots & W_L^{L/2,1} & \cdots & W_L^{L-1,1} \\ \vdots & & \vdots & & \vdots \\ W_L^{0,L-1} & \cdots & W_L^{L/2,L-1} & \cdots & W_L^{L-1,L-1} \end{bmatrix} \begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix} = \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      - Note that $S_{L/2+x} = \mathrm{conjugate}(S_{L/2-x})$

  19. A quick detour
      - Real orthonormal matrix:
        - $XX^T = X^T X = I$
        - But only if all entries are real
        - The inverse of X is its own transpose
      - Definition: Hermitian
        - $X^H$ = complex conjugate of $X^T$
        - Conjugate of a number: $a + ib \rightarrow a - ib$
        - Conjugate of $\exp(ix)$ = $\exp(-ix)$
      - Complex orthonormal matrix
        - $XX^H = X^H X = I$
        - The inverse of a complex orthonormal matrix is its own Hermitian

  20. Doing it in matrix form
      $$W_L^{k,n} = \frac{1}{\sqrt{L}} \exp(j2\pi kn/L) = \frac{1}{\sqrt{L}} \left( \cos(2\pi kn/L) + j \sin(2\pi kn/L) \right)$$
      $$W_L^{-k,n} = \mathrm{conjugate}(W_L^{k,n}) = \frac{1}{\sqrt{L}} \exp(-j2\pi kn/L) = \frac{1}{\sqrt{L}} \left( \cos(2\pi kn/L) - j \sin(2\pi kn/L) \right)$$
      $$\begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix} = \begin{bmatrix} W_L^{-0,0} & \cdots & W_L^{-0,L/2} & \cdots & W_L^{-0,L-1} \\ W_L^{-1,0} & \cdots & W_L^{-1,L/2} & \cdots & W_L^{-1,L-1} \\ \vdots & & \vdots & & \vdots \\ W_L^{-(L-1),0} & \cdots & W_L^{-(L-1),L/2} & \cdots & W_L^{-(L-1),L-1} \end{bmatrix} \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      - The complex exponential basis matrix W is an orthonormal matrix
        - Its inverse is its own Hermitian
        - $W^{-1} = W^H$
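
The orthonormality claim can be checked directly on a small case. A minimal sketch using nested Python lists as matrices; the helper names and the choice L = 8 are assumptions made for illustration:

```python
import cmath
import math

def fourier_basis_matrix(L):
    """Rows index time n, columns index frequency k; entries exp(j2pi kn/L)/sqrt(L)."""
    return [[cmath.exp(2j * math.pi * k * n / L) / math.sqrt(L) for k in range(L)]
            for n in range(L)]

def hermitian(X):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[X[r][c].conjugate() for r in range(len(X))] for c in range(len(X[0]))]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

W = fourier_basis_matrix(8)
product = matmul(hermitian(W), W)

# Largest deviation of W^H W from the identity matrix.
max_dev = max(abs(product[i][j] - (1 if i == j else 0))
              for i in range(8) for j in range(8))
```

The 1/sqrt(L) scaling is what makes the product the identity; without it the product would be L times the identity.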

  21. The Discrete Fourier Transform
      $$\begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix} = \begin{bmatrix} W_L^{-0,0} & \cdots & W_L^{-0,L/2} & \cdots & W_L^{-0,L-1} \\ W_L^{-1,0} & \cdots & W_L^{-1,L/2} & \cdots & W_L^{-1,L-1} \\ \vdots & & \vdots & & \vdots \\ W_L^{-(L-1),0} & \cdots & W_L^{-(L-1),L/2} & \cdots & W_L^{-(L-1),L-1} \end{bmatrix} \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      - The matrix multiplying the signal is called the "Fourier Matrix"
      - The weights ($S_0$, $S_1$, etc.) are called the Fourier transform

  22. The Inverse Discrete Fourier Transform
      $$\begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix} = \begin{bmatrix} W_L^{0,0} & \cdots & W_L^{L/2,0} & \cdots & W_L^{L-1,0} \\ W_L^{0,1} & \cdots & W_L^{L/2,1} & \cdots & W_L^{L-1,1} \\ \vdots & & \vdots & & \vdots \\ W_L^{0,L-1} & \cdots & W_L^{L/2,L-1} & \cdots & W_L^{L-1,L-1} \end{bmatrix} \begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix}$$
      - The matrix multiplying the Fourier transform is the inverse Fourier matrix
      - Multiplying the Fourier transform by this matrix gives us the signal right back from its Fourier transform
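
A forward and inverse transform pair can be written straight from the two matrix equations. This is a direct O(L^2) sketch with the 1/sqrt(L) scaling on both sides; the function names `dft` and `idft` and the sample values are illustrative assumptions:

```python
import cmath
import math

def dft(s):
    """S_k = (1/sqrt(L)) * sum_n s[n] * exp(-j2pi kn/L)."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L)) / math.sqrt(L)
            for k in range(L)]

def idft(S):
    """s[n] = (1/sqrt(L)) * sum_k S_k * exp(j2pi kn/L)."""
    L = len(S)
    return [sum(S[k] * cmath.exp(2j * math.pi * k * n / L) for k in range(L)) / math.sqrt(L)
            for n in range(L)]

signal = [0.0, 1.0, 4.0, -2.0, 3.5, 0.5, -1.0, 2.0]  # hypothetical real signal, L = 8
S = dft(signal)
rebuilt = idft(S)

# Round-trip error, and the conjugate relation S[L/2 + 1] = conjugate(S[L/2 - 1]).
round_trip_error = max(abs(r - s) for r, s in zip(rebuilt, signal))
conjugate_error = abs(S[5] - S[3].conjugate())
```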

  23. The Fourier Matrix
      - Left panel: The real part of the Fourier matrix
        - For a 32-point signal
      - Right panel: The imaginary part of the Fourier matrix

  24. The FAST Fourier Transform
      - The outcome of the transformation with the Fourier matrix is the DISCRETE FOURIER TRANSFORM (DFT)
      - The FAST Fourier transform is an algorithm that takes advantage of the symmetry of the matrix to perform the matrix multiplication really fast
      - The FFT computes the DFT
        - Is much faster if the length of the signal can be expressed as $2^N$
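
One well-known way to exploit that structure is the recursive radix-2 (Cooley-Tukey) scheme, sketched below for power-of-2 lengths. The slides do not say which FFT variant they have in mind, so treat this as one illustrative choice. It uses the common unnormalized convention (no 1/sqrt(L) factor), and the test signal is hypothetical.

```python
import cmath
import math

def naive_dft(s):
    """Direct O(L^2) DFT, unnormalized."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

def fft(s):
    """Recursive radix-2 FFT; the length must be a power of 2."""
    L = len(s)
    if L == 1:
        return [complex(s[0])]
    if L % 2:
        raise ValueError("length must be a power of 2")
    even = fft(s[0::2])  # DFT of the even-indexed samples
    odd = fft(s[1::2])   # DFT of the odd-indexed samples
    out = [0j] * L
    for k in range(L // 2):
        t = cmath.exp(-2j * math.pi * k / L) * odd[k]
        out[k] = even[k] + t
        out[k + L // 2] = even[k] - t
    return out

samples = [math.sin(0.3 * n) + 0.1 * n for n in range(16)]
max_diff = max(abs(a - b) for a, b in zip(fft(samples), naive_dft(samples)))
```

Each level of recursion halves the problem, which gives roughly L log2(L) operations instead of L^2.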

  25. Images
      - The complex exponential is two dimensional
        - Has a separate X frequency and Y frequency
      - Would be true even for checker boards!
        - The 2-D complex exponential must be unravelled to form one component of the Fourier matrix
      - For a KxL image, we'd have K*L bases in the matrix

  26. DFT: Properties
      - The DFT coefficients are complex
        - Have both a magnitude and a phase
      - Simple linear algebra tells us that
        - DFT(A + B) = DFT(A) + DFT(B)
        - The DFT of the sum of two signals is the sum of their DFTs
      - A horribly common approximation in sound processing
        - Magnitude(DFT(A+B)) = Magnitude(DFT(A)) + Magnitude(DFT(B))
        - Utterly wrong
        - Absurdly useful
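
Both statements can be illustrated on a tiny example. The sketch below assumes two hypothetical 8-sample signals at the same frequency but a quarter period apart (a cosine and a sine), which is a case where the magnitude approximation fails visibly:

```python
import cmath
import math

def dft(s):
    """Direct unnormalized DFT."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

A = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]  # cos(2*pi*2*n/8)
B = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # sin(2*pi*2*n/8)
SA, SB = dft(A), dft(B)
SAB = dft([a + b for a, b in zip(A, B)])

# Linearity holds exactly (up to rounding).
linear_error = max(abs(SAB[k] - (SA[k] + SB[k])) for k in range(8))

# Magnitudes do not add: compare |DFT(A)| + |DFT(B)| with |DFT(A + B)|.
magnitude_gap = max(abs(abs(SA[k]) + abs(SB[k]) - abs(SAB[k])) for k in range(8))
```

The gap appears because the two coefficients at the shared frequency differ in phase, so they add as vectors rather than as lengths.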

  27. The Fourier Transform and Perception: Sound
      - The Fourier transform represents the signal analogously to a bank of tuning forks
      - Our ear has a bank of tuning forks
      - The output of the Fourier transform is perceptually very meaningful
      [Figure: FT and inverse FT]

  28. Symmetric signals
      [Figure: a signal that is symmetric around L/2. Contributions from points equidistant from L/2 combine to cancel out imaginary terms]
      - If a signal is symmetric around L/2, the Fourier coefficients are real!
        - A(L/2-k) * exp(-j*f*(L/2-k)) + A(L/2+k) * exp(-j*f*(L/2+k)) is always real if A(L/2-k) = A(L/2+k)
        - We can pair up samples around the center all the way; the final summation term is always real
      - Overall symmetry properties
        - If the *signal* is real, the FT is symmetric
        - If the signal is symmetric, the FT is real
        - If the signal is real and symmetric, the FT is real and symmetric
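
These symmetry properties can be observed on small examples. In the sketch below, "symmetric around L/2" is taken to mean s[L/2 - k] = s[L/2 + k] with indices wrapping around modulo L, and "the FT is symmetric" for a real signal is checked in the conjugate sense S[L - k] = conjugate(S[k]); both readings, and the sample values, are assumptions.

```python
import cmath
import math

def dft(s):
    """Direct unnormalized DFT."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

L = 8
# Real and symmetric around L/2 = 4: s[1] = s[7], s[2] = s[6], s[3] = s[5].
symmetric = [5.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]
S = dft(symmetric)
max_imag = max(abs(c.imag) for c in S)                        # FT should be real
max_asym = max(abs(S[k] - S[(L - k) % L]) for k in range(L))  # and symmetric

# Real but not symmetric: the FT is conjugate symmetric.
real_only = [0.3, -1.2, 2.5, 0.0, 1.1, -0.7, 0.9, 4.0]
R = dft(real_only)
max_conj_dev = max(abs(R[k] - R[(L - k) % L].conjugate()) for k in range(L))
```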

  29. The Discrete Cosine Transform
      - Compose a symmetric signal or image
        - Images would be symmetric in two dimensions
      - Compute the Fourier transform
        - Since the FT is symmetric, sufficient to store only half the coefficients (quarter for an image)
      - Or as many coefficients as were originally in the signal / image

  30. DCT
      $$\begin{bmatrix} \cos(2\pi (0.5) \cdot 0/2L) & \cos(2\pi (1+0.5) \cdot 0/2L) & \cdots & \cos(2\pi (L-0.5) \cdot 0/2L) \\ \cos(2\pi (0.5) \cdot 1/2L) & \cos(2\pi (1+0.5) \cdot 1/2L) & \cdots & \cos(2\pi (L-0.5) \cdot 1/2L) \\ \vdots & \vdots & & \vdots \\ \cos(2\pi (0.5)(L-1)/2L) & \cos(2\pi (1+0.5)(L-1)/2L) & \cdots & \cos(2\pi (L-0.5)(L-1)/2L) \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_{L-1} \end{bmatrix} = \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      (L columns)
      - Not necessary to compute a 2xL sized FFT
        - Enough to compute an L-sized cosine transform
        - Taking advantage of the symmetry of the problem
      - This is the Discrete Cosine Transform
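
The link between a mirrored signal's DFT and a cosine transform can be sketched as follows. This example uses the widely used DCT-II indexing, cos(pi * k * (n + 0.5) / L), which places the half-sample offset on the time index and may therefore differ from the exact matrix on the slide; the function names and test values are assumptions.

```python
import cmath
import math

def dft(s):
    """Direct unnormalized DFT."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

def dct_direct(x):
    """L-point cosine transform (DCT-II indexing, scaled by 2)."""
    L = len(x)
    return [2 * sum(x[n] * math.cos(math.pi * k * (n + 0.5) / L) for n in range(L))
            for k in range(L)]

def dct_via_mirrored_dft(x):
    """Same coefficients from the 2L-point DFT of the signal followed by its mirror image."""
    L = len(x)
    mirrored = list(x) + list(reversed(x))
    Y = dft(mirrored)
    # The half-sample phase factor makes each of the first L coefficients real.
    return [Y[k] * cmath.exp(-1j * math.pi * k / (2 * L)) for k in range(L)]

x = [1.0, 2.0, -0.5, 3.0, 0.25, -1.5]  # hypothetical signal
direct = dct_direct(x)
via = dct_via_mirrored_dft(x)
```

The direct version touches only L samples per coefficient, which is the saving the slide refers to.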

  31. Representing images
      [Figure: image block multiplied by the DCT matrix gives the DCT]
      - Most common coding is the DCT
      - JPEG: Each 8x8 element of the picture is converted using a DCT
      - The DCT coefficients are quantized and stored
        - Degree of quantization = degree of compression
      - Also used to represent textures etc. for pattern recognition and other forms of analysis

  32. What does the DFT represent
      $$\begin{bmatrix} \exp(j2\pi \cdot 0 \cdot 0/L) & \cdots & \exp(j2\pi (L/2) \cdot 0/L) & \cdots & \exp(j2\pi (L-1) \cdot 0/L) \\ \exp(j2\pi \cdot 0 \cdot 1/L) & \cdots & \exp(j2\pi (L/2) \cdot 1/L) & \cdots & \exp(j2\pi (L-1) \cdot 1/L) \\ \vdots & & \vdots & & \vdots \\ \exp(j2\pi \cdot 0 \cdot (L-1)/L) & \cdots & \exp(j2\pi (L/2)(L-1)/L) & \cdots & \exp(j2\pi (L-1)(L-1)/L) \end{bmatrix} \begin{bmatrix} S_0 \\ \vdots \\ S_{L/2} \\ \vdots \\ S_{L-1} \end{bmatrix} = \begin{bmatrix} s[0] \\ s[1] \\ \vdots \\ s[L-1] \end{bmatrix}$$
      $$s[n] = \sum_{k=0}^{L-1} S_k \exp(j2\pi kn/L)$$
      - The DFT can be written formulaically as above
      - There is no restriction on computing the formula for n < 0 or n > L-1
        - It's just a formula
        - But computing these terms before 0 or beyond L-1 tells us what the signal composed by the DFT looks like outside our narrow window

  33. What does the DFT represent
      [Figure: s[n] and its DFT $[S_0\ S_1\ \ldots\ S_{31}]$; the reconstruction plotted for n from -32 to 63]
      $$s[n] = \sum_{k=0}^{L-1} S_k \exp(j2\pi kn/L)$$
      - If you extend the DFT-based representation beyond 0 (on the left) or L (on the right) it repeats the signal!
      - So what does the DFT really mean?

  34. What does the DFT represent
      - The DFT represents the properties of the infinitely long repeating signal that you can generate with it
        - Of which the observed signal is ONE period
      - This gives rise to some odd effects
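
The implicit periodicity can be seen by evaluating the synthesis formula outside the range 0..L-1. A minimal sketch, assuming an unnormalized forward DFT with a 1/L factor in the synthesis step (the slide's formula leaves the scaling implicit) and a hypothetical 8-sample signal:

```python
import cmath
import math

def dft(s):
    """Direct unnormalized DFT."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

def synthesize(S, n):
    """Evaluate s[n] = (1/L) * sum_k S_k * exp(j2pi kn/L) for ANY integer n."""
    L = len(S)
    return sum(S[k] * cmath.exp(2j * math.pi * k * n / L) for k in range(L)) / L

s = [3.0, -1.0, 0.5, 2.0, 4.0, 0.0, -2.5, 1.0]
S = dft(s)

beyond_right = synthesize(S, 2 + 8)  # one period past the window: equals s[2]
beyond_left = synthesize(S, -3)      # before the window: equals s[8 - 3] = s[5]
```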

  35. The discrete Fourier transform
      - The discrete Fourier transform of the above signal actually computes the properties of the periodic signal shown below
        - Which extends from -infinity to +infinity
        - The period of this signal is 32 samples in this example

  36. Windowing
      - The DFT of one period of the sinusoid shown in the figure computes the spectrum of the entire sinusoid from -infinity to +infinity
      - The DFT of a real sinusoid has only one non-zero frequency
        - The second peak in the figure also represents the same frequency as an effect of aliasing

  38. Windowing
      [Figure: magnitude spectrum]
      - The DFT of one period of the sinusoid shown in the figure computes the spectrum of the entire sinusoid from -infinity to +infinity
      - The DFT of a real sinusoid has only one non-zero frequency
        - The second peak in the figure is the "reflection" around L/2 (for real signals)

  39. Windowing
      - The DFT of any sequence computes the spectrum for an infinite repetition of that sequence
      - The DFT of a partial segment of a sinusoid computes the spectrum of an infinite repetition of that segment, and not of the entire sinusoid
      - This will not give us the DFT of the sinusoid itself!
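
The effect of analysing a partial segment shows up in the number of DFT bins with non-negligible magnitude. A small sketch, assuming a 32-sample window, a threshold of 1e-6 for "non-zero", and a hypothetical helper `count_active_bins`:

```python
import cmath
import math

def count_active_bins(cycles, L=32, tol=1e-6):
    """Count DFT bins with magnitude above tol for a sinusoid with `cycles` periods in L samples."""
    segment = [math.sin(2 * math.pi * cycles * n / L) for n in range(L)]
    count = 0
    for k in range(L):
        coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
        if abs(coeff) > tol:
            count += 1
    return count

whole = count_active_bins(4.0)    # exactly 4 periods fit in the window
partial = count_active_bins(4.5)  # the window cuts the sinusoid mid-period
```

With a whole number of periods only the frequency bin and its reflection around L/2 are active; with a cut-off period the energy spreads across many bins.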

  41. Windowing
      [Figure: magnitude spectrum]
      - The DFT of any sequence computes the spectrum for an infinite repetition of that sequence
      - The DFT of a partial segment of a sinusoid computes the spectrum of an infinite repetition of that segment, and not of the entire sinusoid
      - This will not give us the DFT of the sinusoid itself!

  42. Windowing
      [Figure: magnitude spectrum of segment; magnitude spectrum of complete sine wave]

  43. Windowing
      - The difference occurs due to two reasons:
      - The transform cannot know what the signal actually looks like outside the observed window
      - The implicit repetition of the observed signal introduces large discontinuities at the points of repetition
        - This distorts even our measurement of what happens at the boundaries of what has been reliably observed

  44. Windowing
      - The difference occurs due to two reasons:
      - The transform cannot know what the signal actually looks like outside the observed window
      - The implicit repetition of the observed signal introduces large discontinuities at the points of repetition
        - These are not part of the underlying signal
      - We only want to characterize the underlying signal
        - The discontinuity is an irrelevant detail

  45. Windowing
      - While we can never know what the signal looks like outside the window, we can try to minimize the discontinuities at the boundaries
      - We do this by multiplying the signal with a window function
        - We call this procedure windowing
        - We refer to the resulting signal as a "windowed" signal
      - Windowing attempts to do the following:
        - Keep the windowed signal similar to the original in the central regions
        - Reduce or eliminate the discontinuities in the implicit periodic signal

  48. Windowing
      [Figure: magnitude spectrum]
      - The DFT of the windowed signal does not have any artefacts introduced by discontinuities in the signal
      - Often it is also a more faithful reproduction of the DFT of the complete signal whose segment we have analyzed

  49. Windowing
      [Figure: magnitude spectrum of original segment; magnitude spectrum of windowed signal; magnitude spectrum of complete sine wave]

  50. Windowing
      - Windowing is not a perfect solution
        - The original (unwindowed) segment is identical to the original (complete) signal within the segment
        - The windowed segment is often not identical to the complete signal anywhere
      - Several windowing functions have been proposed that strike different tradeoffs between the fidelity in the central regions and the smoothing at the boundaries

  51. Windowing
      - Cosine windows:
        - Window length is M
        - Index begins at 0
      - Hamming: $w[n] = 0.54 - 0.46 \cos(2\pi n/M)$
      - Hanning: $w[n] = 0.5 - 0.5 \cos(2\pi n/M)$
      - Blackman: $w[n] = 0.42 - 0.5 \cos(2\pi n/M) + 0.08 \cos(4\pi n/M)$
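
The three cosine windows translate directly into code. The sketch below follows the formulas as given, with window length M and the index starting at 0; the function names are illustrative, and library implementations may use M - 1 in the denominator instead.

```python
import math

def hamming(n, M):
    return 0.54 - 0.46 * math.cos(2 * math.pi * n / M)

def hanning(n, M):
    return 0.5 - 0.5 * math.cos(2 * math.pi * n / M)

def blackman(n, M):
    return (0.42 - 0.5 * math.cos(2 * math.pi * n / M)
            + 0.08 * math.cos(4 * math.pi * n / M))

M = 32
# Full windows as lists; each rises smoothly from its edge value to 1.0 at n = M/2.
hamming_window = [hamming(n, M) for n in range(M)]
hanning_window = [hanning(n, M) for n in range(M)]
blackman_window = [blackman(n, M) for n in range(M)]
```

Note the different edge values: the Hanning and Blackman windows start at zero, while the Hamming window starts at 0.08, which is one of the tradeoffs between boundary smoothing and fidelity.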

  52. Windowing
      - Geometric windows:
        - Rectangular (boxcar)
        - Triangular (Bartlett)
        - Trapezoid

  53. Zero Padding
      - We can pad zeros to the end of a signal to make it a desired length
        - Useful if the FFT (or any other algorithm we use) requires signals of a specified length
        - E.g. Radix 2 FFTs require signals of length $2^n$, i.e., some power of 2. We must zero pad the signal to increase its length to the appropriate number
      - The consequence of zero padding is to change the periodic signal whose Fourier spectrum is being computed by the DFT

  55. Zero Padding
      [Figure: magnitude spectrum]
      - The DFT of the zero padded signal is essentially the same as the DFT of the unpadded signal, with additional spectral samples inserted in between
        - It does not contain any additional information over the original DFT
        - It also does not contain less information
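
The "samples inserted in between" behaviour can be checked by padding a signal to twice its length: every second coefficient of the padded DFT should match the original DFT. A minimal sketch with a hypothetical 8-sample signal and an unnormalized DFT:

```python
import cmath
import math

def dft(s):
    """Direct unnormalized DFT."""
    L = len(s)
    return [sum(s[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L))
            for k in range(L)]

s = [1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0, 0.0]
padded = s + [0.0] * len(s)  # zero pad from 8 to 16 samples

S = dft(s)
P = dft(padded)

# Even-indexed bins of the padded DFT coincide with the original DFT;
# odd-indexed bins are the new in-between spectral samples.
max_diff = max(abs(P[2 * k] - S[k]) for k in range(len(s)))
```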

  56. Magnitude spectra
      [Figure: magnitude spectra]

  57. Zero Padding
      - Zero padding windowed signals results in signals that appear to be less discontinuous at the edges
        - This is only illusory
        - Again, we do not introduce any new information into the signal by merely padding it with zeros

  60. Zero padding a speech signal
      [Figure, top panel: 128 samples from a speech signal sampled at 16000 Hz (time axis)]
      [Figure, middle panel: the first 65 points of a 128 point DFT; plot shows log of the magnitude spectrum (frequency axis up to 8000 Hz)]
      [Figure, bottom panel: the first 513 points of a 1024 point DFT; plot shows log of the magnitude spectrum (frequency axis up to 8000 Hz)]

  61. The process of parameterization
      - The signal is processed in segments of 25-64 ms
        - Because the properties of audio signals change quickly
        - They are "stationary" only very briefly

  62. The process of parameterization
      - The signal is processed in segments of 25-64 ms
        - Because the properties of audio signals change quickly
        - They are "stationary" only very briefly
      - Adjacent segments overlap by 15-48 ms

  68. The process of parameterization
      - Segments shift every 10-16 milliseconds
      - Each segment is typically 25-64 milliseconds wide
      - Audio signals typically do not change significantly within this short time interval

  69. The process of parameterization
      [Figure: windowing, then the complex spectrum; axis: Frequency (Hz)]
      - Each segment is windowed and a DFT is computed from it

  70. The process of parameterization
      [Figure: windowing]
      - Each segment is windowed and a DFT is computed from it

  71. Computing a Spectrogram
      - Compute Fourier spectra of segments of audio and stack them side-by-side
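
The whole parameterization pipeline (segment, window, DFT, stack) can be sketched in a few lines. This toy version works in samples rather than milliseconds and assumes a Hanning window, a frame length of 32 with a shift of 16, and a hypothetical pure-tone test signal; at a 16000 Hz sampling rate, a 32 ms segment shifted every 16 ms would correspond to 512 and 256 samples. The function names are illustrative.

```python
import cmath
import math

def hanning_window(M):
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / M) for n in range(M)]

def magnitude_spectrum(frame):
    """Magnitudes of the first L/2 + 1 DFT bins (the rest mirror these for real input)."""
    L = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / L) for n in range(L)))
            for k in range(L // 2 + 1)]

def spectrogram(signal, frame_len, hop):
    """One magnitude spectrum per overlapping, windowed segment, stacked in time order."""
    window = hanning_window(frame_len)
    columns = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        columns.append(magnitude_spectrum(frame))
    return columns

# Hypothetical test signal: a tone with 4 periods per 32 samples, 128 samples long.
signal = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(signal, frame_len=32, hop=16)
peak_bins = [column.index(max(column)) for column in spec]
```

Each entry of `spec` is one column of the spectrogram; for a steady tone every column peaks at the same frequency bin.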
