
Machine Learning for Signal Processing: Independent Component Analysis (Class 8, 23 Sep 2013)



  1. Machine Learning for Signal Processing (11755/18797): Independent Component Analysis, Class 8. Instructor: Bhiksha Raj. 23 Sep 2013

  2. Correlation vs. Causation • The consumption of burgers has gone up steadily in the past decade • In the same period, the penguin population of Antarctica has gone down

  3. The concept of correlation • Two variables are correlated if knowing the value of one gives you information about the expected value of the other • [Figure: penguin population and burger consumption plotted against time]

  4. The statistical concept of correlatedness • Two variables X and Y are correlated if knowing X gives you an expected value of Y • X and Y are uncorrelated if knowing X tells you nothing about the expected value of Y – Although it could give you other information – How?

  5. A brief review of basic probability • Setup: Each draw produces one instance of X and one instance of Y – i.e., one instance of (X,Y) • Uncorrelated: Two random variables X and Y are uncorrelated iff the average value of the product of the variables equals the product of their individual averages: E[XY] = E[X]E[Y] • The average value of X is the same regardless of the value of Y

  6. Uncorrelatedness • Which of the above represent uncorrelated RVs?

  7. The statistical concept of Independence • Two variables X and Y are dependent if knowing X gives you any information about Y • X and Y are independent if knowing X tells you nothing at all about Y

  8. A brief review of basic probability • Independence: Two random variables X and Y are independent iff their joint probability equals the product of their individual probabilities: P(X,Y) = P(X)P(Y) • Independence implies uncorrelatedness – The average value of X is the same regardless of the value of Y: E[X|Y] = E[X] – But not the other way around
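
A minimal numerical sketch of this asymmetry (not from the slides; the specific distributions are illustrative assumptions): X uniform on {-1, 0, +1} and Y = X^2 are uncorrelated, yet knowing X completely determines Y.

import numpy as np

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 0.0, 1.0], size=1_000_000)
y = x ** 2                                   # fully determined by x, so not independent

# Uncorrelated: E[XY] matches E[X]E[Y] (both ~0)
print(np.mean(x * y), np.mean(x) * np.mean(y))

# Dependent: for f(x) = x^2, g(y) = y, E[f(X)g(Y)] != E[f(X)]E[g(Y)]  (~0.67 vs ~0.44)
print(np.mean((x ** 2) * y), np.mean(x ** 2) * np.mean(y))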

  9. A brief review of basic probability • Independence: Two random variables X and Y are independent iff the average value of any function of X is the same regardless of the value of Y – Or of any function of Y • E[f(X)g(Y)] = E[f(X)] E[g(Y)] for all f(), g()

  10. Independence • Which of the above represent independent RVs? • Which represent uncorrelated RVs?

  11. A brief review of basic probability • The expected value of an odd function of an RV is 0 if – The RV is 0 mean – The PDF of the RV is symmetric around 0 • E[f(X)] = 0 if f(X) is odd symmetric • [Figure: a symmetric PDF p(x) and an odd function y = f(x)]
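
A quick Monte Carlo illustration of this fact (my own sketch; f(x) = x^3 and the two distributions below are assumptions chosen for the example):

import numpy as np

rng = np.random.default_rng(0)
symmetric = rng.standard_normal(1_000_000)       # zero mean, symmetric PDF
skewed = rng.exponential(1.0, 1_000_000) - 1.0   # zero mean, but asymmetric

print(np.mean(symmetric ** 3))   # ~0: the odd moment vanishes for a symmetric, zero-mean RV
print(np.mean(skewed ** 3))      # ~2: nonzero once the symmetry assumption is dropped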

  12. A brief review of basic info. theory • Entropy: The minimum average number of bits to transmit to convey a symbol X: H(X) = Σ_X P(X)[-log P(X)] • Joint entropy: The minimum average number of bits to convey sets (pairs here) of symbols: H(X,Y) = Σ_{X,Y} P(X,Y)[-log P(X,Y)] • (Slide example: X takes values T(all), M(ed), S(hort); Y takes values M, F)

  13. A brief review of basic info. theory • Conditional entropy: The minimum average number of bits to transmit to convey a symbol X, after symbol Y has already been conveyed – Averaged over all values of X and Y: H(X|Y) = Σ_Y P(Y) Σ_X P(X|Y)[-log P(X|Y)] = Σ_{X,Y} P(X,Y)[-log P(X|Y)]

  14. A brief review of basic info. theory • Conditional entropy of X equals H(X) if X is independent of Y: H(X|Y) = Σ_Y P(Y) Σ_X P(X|Y)[-log P(X|Y)] = Σ_X P(X)[-log P(X)] = H(X) • Joint entropy of X and Y is the sum of the entropies of X and Y if they are independent: H(X,Y) = Σ_{X,Y} P(X,Y)[-log P(X,Y)] = Σ_{X,Y} P(X,Y)[-log P(X)P(Y)] = -Σ_{X,Y} P(X,Y) log P(X) - Σ_{X,Y} P(X,Y) log P(Y) = H(X) + H(Y)
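
The additivity of entropy under independence is easy to verify numerically. A small sketch (the distributions below are invented stand-ins for the slide's T/M/S and M/F example):

import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))   # bits

p_x = np.array([0.5, 0.3, 0.2])      # P(X) over, say, {T, M, S}
p_y = np.array([0.6, 0.4])           # P(Y) over, say, {M, F}
p_xy = np.outer(p_x, p_y)            # independence: P(X,Y) = P(X)P(Y)

print(entropy(p_xy.ravel()))         # H(X,Y)
print(entropy(p_x) + entropy(p_y))   # H(X) + H(Y): same value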

  15. Onward..

  16. Projection: multiple notes • [Slide shows the spectrogram M and the matrix of note spectra W] • P = W(W^T W)^{-1} W^T • Projected spectrogram = PM
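
In matrix terms the projection is straightforward. A minimal numpy sketch (the shapes of M and W are assumptions for illustration; in the lecture M is a magnitude spectrogram and W holds one note spectrum per column):

import numpy as np

rng = np.random.default_rng(0)
M = rng.random((1024, 200))            # frequency bins x time frames
W = rng.random((1024, 4))              # frequency bins x notes

P = W @ np.linalg.inv(W.T @ W) @ W.T   # P = W (W^T W)^{-1} W^T
projected = P @ M                      # projected spectrogram PM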

  17. We’re actually computing a score • M ≈ WH, with H = ? • H = pinv(W) M

  18. So what are we doing here? • M ≈ WH is an approximation • Given W, estimate H to minimize the error: H = argmin_H ||M - WH||_F^2 = argmin_H Σ_{i,j} (M_ij - (WH)_ij)^2 • Must ideally find the transcription of the given notes
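
The pseudoinverse of the previous slide gives exactly this least-squares minimizer, which is easy to check. A sketch (shapes again assumed):

import numpy as np

rng = np.random.default_rng(0)
M = rng.random((1024, 200))
W = rng.random((1024, 4))

H = np.linalg.pinv(W) @ M                        # H = pinv(W) M
H_ls, *_ = np.linalg.lstsq(W, M, rcond=None)     # argmin_H ||M - WH||_F^2
print(np.allclose(H, H_ls))                      # True: same minimizer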

  19. How about the other way? • M ≈ WH, with W = ? • W = M pinv(H) • U = WH

  20. Going the other way.. • M ≈ WH is an approximation • Given H, estimate W to minimize the error: W = argmin_W ||M - WH||_F^2 = argmin_W Σ_{i,j} (M_ij - (WH)_ij)^2 • Must ideally find the notes corresponding to the transcription

  21. When both parameters are unknown • W = ?, H = ?, approx(M) = ? • Must estimate both H and W to best approximate M • Ideally, must learn both the notes and their transcription!

  22. A least squares solution • W, H = argmin_{W,H} ||M - WH||_F^2 • Unconstrained – For any W, H that minimizes the error, W' = WA, H' = A^{-1}H also minimizes the error for any invertible A • Too many solutions
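
A short numerical confirmation of this ambiguity (the sizes and the particular invertible A are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
W = rng.random((1024, 4))
H = rng.random((4, 200))
A = rng.random((4, 4)) + 4 * np.eye(4)   # some invertible 4x4 matrix

W2, H2 = W @ A, np.linalg.inv(A) @ H
print(np.allclose(W @ H, W2 @ H2))       # True: identical product, hence identical error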

  23. A constrained least squares solution • W, H = argmin_{W,H} ||M - WH||_F^2 • For our problem, let's consider the “truth”.. • When one note occurs, the other does not – h_i^T h_j = 0 for all i != j • The rows of H are uncorrelated

  24. A least squares solution • Assume: HH^T = I – Normalizing all rows of H to length 1 • pinv(H) = H^T • Projecting M onto H – W = M pinv(H) = MH^T – WH = MH^T H • W, H = argmin_{W,H} ||M - WH||_F^2 becomes H = argmin_H ||M - MH^T H||_F^2 • Constraint: Rank(H) = 4

  25. Finding the notes • Add the constraint: HH^T = I • H = argmin_H ||M - MH^T H||_F^2 subject to HH^T = I • The solution is obtained through Eigen decomposition: H = Eigenvectors(Correlation(M^T)) • Note: we are considering the correlation of M^T
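
A sketch of that eigen route, under the assumption that the "correlation of M^T" is the frame-by-frame matrix M^T M and that we keep the top 4 eigenvectors (matching the rank-4 constraint above):

import numpy as np

rng = np.random.default_rng(0)
M = rng.random((1024, 200))               # frequency bins x time frames
K = 4

evals, evecs = np.linalg.eigh(M.T @ M)    # eigendecomposition of the T x T correlation
H = evecs[:, -K:].T                       # rows of H = top-K eigenvectors
W = M @ H.T                               # W = M pinv(H) = M H^T, since HH^T = I
print(np.allclose(H @ H.T, np.eye(K)))    # orthonormal score rows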

  26. So how does that work? • There are 12 notes in the segment, hence we try to estimate 12 notes..

  27. So how does that work? • The scores of the first three “notes” and their contributions

  28. Finding the notes • Can find W instead of H: W = argmin_W ||M - WW^T M||_F^2 • Assume the columns of W are orthogonal • This results in the more conventional Eigen decomposition: W = Eigenvectors(Correlation(M))

  29. So how does that work? • There are 12 notes in the segment, hence we try to estimate 12 notes.. • Results are not good again

  30. Our notes are not orthogonal • Overlapping frequencies • Notes occur concurrently – Harmonica continues to resonate to the previous note • More generally, simple orthogonality will not give us the desired solution

  31. Eigendecomposition and SVD • M ≈ WH • Matrix M can be decomposed as M = USV^T • When we assume the scores are orthogonal, we get H = V^T, W = US • When we assume the notes are orthogonal, we get W = U, H = SV^T • In either case the results are the same – The notes are orthogonal and so are the scores – Not good in our problem
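
Both orthogonality assumptions therefore come out of one SVD of M. A short sketch showing the two factorizations and that they reconstruct the same approximation (shapes invented):

import numpy as np

rng = np.random.default_rng(0)
M = rng.random((1024, 200))

U, s, Vt = np.linalg.svd(M, full_matrices=False)

W_scores, H_scores = U * s, Vt           # orthogonal scores: W = US, H = V^T
W_notes, H_notes = U, s[:, None] * Vt    # orthogonal notes:  W = U,  H = SV^T
print(np.allclose(W_scores @ H_scores, W_notes @ H_notes))   # same product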

  32. Orthogonality • M ≈ WH • In any least-squared-error decomposition M = WH, if the columns of W are orthogonal, the rows of H will also be orthogonal • Sometimes mere orthogonality is not enough

  33. What else can we look for? • Assume: The “transcription” of one note does not depend on what else is playing – Or, in a multi-instrument piece, instruments are playing independently of one another • Not strictly true, but still..

  34. Formulating it with Independence • W, H = argmin_{W,H} ||M - WH||_F^2 such that the rows of H are independent • Impose statistical independence constraints on the decomposition

  35. Changing problems for a bit • Two people speak simultaneously • Recorded by two microphones • Each recorded signal is a mixture of both signals: m_1(t) = w_11 h_1(t) + w_12 h_2(t), m_2(t) = w_21 h_1(t) + w_22 h_2(t)
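
A toy simulation of this two-source, two-microphone setup (the source signals and mixing weights below are invented purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
h1 = np.sin(2 * np.pi * 440 * t)      # "speaker" 1: a 440 Hz tone
h2 = rng.standard_normal(t.size)      # "speaker" 2: a noise-like signal

W = np.array([[0.8, 0.3],             # mixing weights w_ij
              [0.4, 0.7]])
m1 = W[0, 0] * h1 + W[0, 1] * h2      # microphone 1
m2 = W[1, 0] * h1 + W[1, 1] * h2      # microphone 2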
