Machine Learning for Signal Processing Independent Component Analysis
Class 8. 24 Sep 2015 Instructor: Bhiksha Raj
11755/18797 1
Revisiting the Covariance Matrix
- Assuming centered data: C = X X^T = x1 x1^T + x2 x2^T + ...
- C V = (x1 x1^T + x2 x2^T + ...) V = x1 x1^T V + x2 x2^T V + ...
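A quick numerical check of the sum-of-outer-products form above (toy data; all names and sizes are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 100))        # toy data: 3 dimensions, 100 column vectors
X = X - X.mean(axis=1, keepdims=True)    # center the data

# C = X X^T equals the sum of per-sample outer products x_i x_i^T
C = X @ X.T
C_sum = sum(np.outer(X[:, i], X[:, i]) for i in range(X.shape[1]))
print(np.allclose(C, C_sum))

# Multiplication by V distributes over that sum: C V = x1 x1^T V + x2 x2^T V + ...
V = rng.standard_normal((3, 2))
CV_sum = sum(np.outer(X[:, i], X[:, i]) @ V for i in range(X.shape[1]))
print(np.allclose(C @ V, CV_sum))
```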
Adding (figure sequence across several slides)
– X cannot predict Y
– Y cannot predict X
– X predicts Y
– L1 does not predict L2 and vice versa – In this coordinate system the data are uncorrelated
Correlation, not Causation (unless McDonald's has a top-secret Antarctica division)
(Figure: burger consumption and penguin population plotted over time)
– Uncorrelated: the average value of the product of the variables equals the product of their individual averages, E[XY] = E[X] E[Y]
– i.e., each observation is one instance of (X, Y)
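A minimal numeric illustration of this definition (distributions and sample sizes are my choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                               # many instances of (X, Y)
X = rng.standard_normal(n)
Y = rng.standard_normal(n)                # independent of X: X cannot predict Y

# Uncorrelated: mean of the product matches the product of the means (here both ~0)
print(abs(np.mean(X * Y) - np.mean(X) * np.mean(Y)) < 0.02)

Z = 2 * X + 0.1 * rng.standard_normal(n)  # X predicts Z
print(np.mean(X * Z) - np.mean(X) * np.mean(Z))  # clearly nonzero
```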
(Figure: burger consumption b1, b2 vs. penguin population P1, P2)
(Figure: average income vs. burger consumption b1, b2)
(Figure: burger consumption vs. average income; regression lines for Y as a function of X and X as a function of Y)
(Figure: axes X, Y and rotated axes X’, Y’)
X’, Y’: assuming 0 mean, the correlation matrix
E[ [X’; Y’] [X’ Y’] ] = [ E[X’^2]  E[X’Y’] ; E[X’Y’]  E[Y’^2] ]
is diagonal, i.e., E[X’Y’] = 0
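A small sketch of this decorrelation (rotating by the eigenvectors of the covariance zeroes the off-diagonal term; the data here are synthetic):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
x = rng.standard_normal(n)
y = 0.8 * x + 0.6 * rng.standard_normal(n)   # correlated with x
D = np.vstack([x, y])
D = D - D.mean(axis=1, keepdims=True)        # 0 mean, as the slide assumes

C = D @ D.T / n                              # off-diagonal E[XY] is nonzero
_, V = np.linalg.eigh(C)                     # eigenvectors of the covariance
Dp = V.T @ D                                 # rotated coordinates X', Y'
Cp = Dp @ Dp.T / n
print(np.round(Cp, 8))                       # diagonal: E[X'Y'] = 0
```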
(Figure: X, Y data with directions w1, w2 and errors E1, E2)
y = f(x) p(x)
Example: X takes values T(all), M(ed), S(hort): T, M, S, …; Y takes values M, F: M, F, F, M, …
H(X) = -Σ_X P(X) log P(X)
H(X, Y) = -Σ_{X,Y} P(X, Y) log P(X, Y)
H(X | Y) = -Σ_Y P(Y) Σ_X P(X | Y) log P(X | Y) = -Σ_{X,Y} P(X, Y) log P(X | Y)
If X and Y are independent, P(X | Y) = P(X) and P(X, Y) = P(X) P(Y):
H(X | Y) = -Σ_Y P(Y) Σ_X P(X | Y) log P(X | Y) = -Σ_Y P(Y) Σ_X P(X) log P(X) = H(X)
H(X, Y) = -Σ_{X,Y} P(X, Y) log P(X, Y) = -Σ_{X,Y} P(X, Y) log [P(X) P(Y)]
        = -Σ_{X,Y} P(X, Y) log P(X) - Σ_{X,Y} P(X, Y) log P(Y) = H(X) + H(Y)
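The independence identities can be checked on a small joint distribution (the probabilities below are mine, chosen for illustration):

```python
import numpy as np

def H(p):
    """Entropy (in bits) of a discrete distribution given as an array of probabilities."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

px = np.array([0.5, 0.3, 0.2])    # P(X): X in {T, M, S}
py = np.array([0.6, 0.4])         # P(Y): Y in {M, F}
pxy = np.outer(px, py)            # independent: P(X, Y) = P(X) P(Y)

print(np.isclose(H(pxy.ravel()), H(px) + H(py)))   # H(X,Y) = H(X) + H(Y)
print(np.isclose(H(pxy.ravel()) - H(py), H(px)))   # H(X|Y) = H(X,Y) - H(Y) = H(X)
```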
P = W (W^T W)^{-1} W^T;  Projected Spectrogram = P M
(Figures: spectrogram M and bases W)
M ~ WH;  H = pinv(W) M
Given M and W (figures): H = ?
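A toy sketch connecting the projection and the pinv solution (sizes and random data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.random((100, 4))             # 4 spectral bases over 100 frequency bins (toy)
M = rng.random((100, 50))            # toy spectrogram: 50 frames

P = W @ np.linalg.inv(W.T @ W) @ W.T # projection matrix onto the column space of W
proj = P @ M                         # projected spectrogram

Hw = np.linalg.pinv(W) @ M           # H = pinv(W) M: least-squares fit of M ~ W H
print(np.allclose(W @ Hw, proj))     # W H reconstructs exactly the projection P M
print(np.allclose(P @ P, P))         # P is idempotent, as a projection must be
```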
M ~ WH;  W = M pinv(H);  U = WH
(Figures: M, W, H, U)
W = ?  H = ?  approx(M) = ?
W, H = argmin_{W,H} Σ_{T,F} (M_{T,F} - (WH)_{T,F})^2 = argmin_{W,H} ||M - WH||_F^2
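The Frobenius objective can be minimized by alternating the two closed-form pinv updates from the previous slides; a minimal sketch (the data, rank K, and iteration count are my choices):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.random((30, 40))                 # toy magnitude spectrogram

K = 5                                    # number of bases, chosen for illustration
W = rng.random((30, K))
H = rng.random((K, 40))

err = [np.linalg.norm(M - W @ H, 'fro') ** 2]
for _ in range(100):                     # alternate the two closed-form updates
    H = np.linalg.pinv(W) @ M            # fix W, solve for H
    W = M @ np.linalg.pinv(H)            # fix H, solve for W
    err.append(np.linalg.norm(M - W @ H, 'fro') ** 2)

print(err[0], err[-1])                   # the error drops, then levels off
```

Each step solves one factor exactly given the other, so the error never increases.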
argmin_{W,H} Σ_{T,F} (M_{T,F} - (WH)_{T,F})^2
argmin_{W,H} ||M - WH||_F^2
m1(t) = w11 h1(t) + w12 h2(t)
m2(t) = w21 h1(t) + w22 h2(t)
M = WH, with W = [w11 w12; w21 w22]; rows of M are the signals at mic 1 and mic 2, rows of H the signals from speaker 1 and speaker 2
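The two-mic, two-speaker mixing model can be simulated directly (the source signals and mixing weights below are toy values):

```python
import numpy as np

t = np.arange(8000)
h1 = np.sin(2 * np.pi * 0.01 * t)             # toy "speaker 1" signal
h2 = np.sign(np.sin(2 * np.pi * 0.003 * t))   # toy "speaker 2" signal
H = np.vstack([h1, h2])

W = np.array([[1.0, 0.5],                     # [w11, w12]
              [0.3, 1.0]])                    # [w21, w22]
M = W @ H                                     # rows: signals at mic 1 and mic 2

print(np.allclose(M[0], 1.0 * h1 + 0.5 * h2)) # m1(t) = w11 h1(t) + w12 h2(t)
print(np.allclose(M[1], 0.3 * h1 + 1.0 * h2)) # m2(t) = w21 h1(t) + w22 h2(t)
```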
Remember this form
E[hi^2 hj hk] = E[hi^2] E[hj] E[hk]
E[hi^2 hj^2] = E[hi^2] E[hj^2]
(Figure: H and H’)
Diagonal + rank-1 matrix;  H = AM;  H = BCM;  A = BC
E[xi^2 xj^2] = E[xi^2] E[xj^2]
– B B^T = B^T B = I
– Since the rows of H are uncorrelated
(Figure: H and H’)
Diagonal + rank-1 matrix;  H = AM;  H = BX (where X = CM);  A = BC
– While ensuring that B is unitary
– Good because it incorporates the energy in all rows of H
– Where dij = E[Σk hk^2 hi hj]
– i.e. D = E[(h^T h) h h^T]
D = [dij],  dij = E[Σk hk^2 hi hj]
Estimated from the columns of H:  dij = (1 / cols(H)) Σm (Σk h_km^2) h_im h_jm
(Σk hk^2 is the sum of squares of a column; h_i and h_j are its ith and jth components)
Energy-weighted correlation!!
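A sketch of this energy-weighted moment matrix on synthetic independent rows (the Laplacian sources and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)
# Rows of H: independent, zero-mean sources
H = rng.laplace(size=(3, 200_000))
H = H - H.mean(axis=1, keepdims=True)

# dij = (1/cols(H)) * sum_m (sum_k h_km^2) h_im h_jm, i.e. D = E[(h^T h) h h^T]
energy = np.sum(H ** 2, axis=0)          # sum of squares of each column
D = (H * energy) @ H.T / H.shape[1]
print(np.round(D, 1))                    # off-diagonals near 0, diagonals large
```

For independent centered rows, the off-diagonal entries vanish in expectation, which is exactly why diagonalizing D is useful.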
– For i != j, with centered, independent h's (E[hj] = 0):
  dij = E[Σk hk^2 hi hj] = Σ_{k != i,j} E[hk^2] E[hi] E[hj] + E[hi^3] E[hj] + E[hi] E[hj^3] = 0
– For i = j:
  dii = E[Σk hk^2 hi^2] = Σ_{k != i} E[hk^2] E[hi^2] + E[hi^4]
– Let us diagonalize D
– C is the (transpose of the) matrix of eigenvectors of M M^T
– B is the (transpose of the) matrix of eigenvectors of X diag(X^T X) X^T
– Only a subset of the fourth-order moments is considered
– There are many other ways of constructing fourth-order moment matrices that would ideally be diagonal; the procedure shown is not guaranteed to diagonalize every other fourth-order moment matrix
– An alternative jointly diagonalizes several fourth-order moment matrices; more effective than the procedure shown, but computationally more expensive
» F(AM)
– Normalize variance along all directions
– Eliminate second-order dependence
– In a microphone array setup, only K < M sources are expected
– E[xi xj] = δij for the centered, whitened signal
[ g(h11) g(h12) … ; g(h21) g(h22) … ; … ]   [ f(h11) f(h12) … ; f(h21) f(h22) … ]
P = [ P11 P12 … ; P21 P22 … ; … ]   Q = [ Q11 Q12 … ; Q21 Q22 … ; … ]
(Figure: Input, Mix, Output)
Non-Gaussian data (figure): ICA recovers the independent directions; PCA does not
– Example: take pieces of the data and concatenate them in a big matrix, then do component analysis
– For sounds, ICA returns localized sinusoids, which is a better way to analyze sounds
– For images, ICA returns localized edge filters
ICA-faces Eigenfaces
– Unlike PCA
– So the sources can come in any order: permutation invariance
– Scaling the signal does not affect independence
– In the best case …; in the worst case, the outputs are not the desired signals at all