SI231 Matrix Computations Lecture 0: Overview
Ziping Zhao Fall Term 2020–2021 School of Information Science and Technology ShanghaiTech University, Shanghai, China
Course Information

General Information

– Instructor: Prof. Ziping Zhao
– office: Rm. 1C-503C, SIST Building – e-mail: zhaoziping@shanghaitech.edu.cn – website: http://faculty.sist.shanghaitech.edu.cn/faculty/zhaoziping
– lectures: Tuesday/Thursday 10:15am–11:55am, Rm. 101, Teaching Center
– Lin Zhu, zhulin@shanghaitech.edu.cn (leading TA) – Zhihang Xu, xuzhh@...; Jiayi Chang, changyj@... – Song Mao, maosong@...; Zhicheng Wang, wangzhch1@... – Sihang Xu, xush@...; Xinyue Zhang, zhangxy11@... – Chenguang Zhang, zhangchg@...; Bing Jiang, jiangbing@...
website: http://faculty.sist.shanghaitech.edu.cn/faculty/zhaoziping/si231
Matrix computations are used in many different fields, e.g.,
– machine learning, computer vision and graphics, natural language processing,
– systems and control, signal and image processing, communications, networks,
– optimization, statistics, econometrics, finance, and many more...
– basic matrix concepts, subspace, norms
– linear system of equations, LU decomposition, Cholesky decomposition
– linear least squares
– eigendecomposition, singular value decomposition
– positive semidefinite matrices
– pseudo-inverse, QR decomposition
– (advanced) tensor decomposition, advanced matrix calculus, compressive sensing, structured matrix factorization
– Gene H. Golub and Charles F. van Loan, Matrix Computations (Fourth Edition), The Johns Hopkins University Press, 2013.
– Roger A. Horn and Charles R. Johnson, Matrix Analysis (Second Edition), Cambridge University Press, 2012.
– Jan R. Magnus and Heinz Neudecker, Matrix Differential Calculus with Applications in Statistics and Econometrics (Third Edition), John Wiley and Sons, New York, 2007.
– Gilbert Strang, Linear Algebra and Learning from Data, Wellesley-Cambridge Press, 2019.
– Giuseppe Calafiore and Laurent El Ghaoui, Optimization Models, Cambridge University Press, 2014.
– Assignments: 30%
∗ may contain MATLAB questions
∗ where to submit: ShanghaiTech e-learning platform, i.e., Blackboard (Bb)
∗ no late submissions will be accepted, except in exceptional cases
– Mid-term examination: 40%
– Final project: 30%
– Students are strongly advised to read the ShanghaiTech Policy on Academic Integrity: https://oaa.shanghaitech.edu.cn/2015/0706/c4076a31250/ page.htm
– announcements and materials will be posted on the course website and Blackboard to keep you updated with the course.
– make sure your e-mail account works so that course notices can reach you.
Problem: given A ∈ R^{n×n} and y ∈ R^n, solve the linear system of equations

Ax = y.

– don't tell me answers like x=inv(A)*y or x=A\y in MATLAB!
– this course is about matrix computations
– it's too slow to run the generic trick x=A\y when n is very large
– a better understanding of matrix computations will enable you to exploit problem structures to build efficient solvers
– key to system analysis, or building robust solutions
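As a taste of why structure matters (a minimal Python/NumPy sketch, not from the slides; the course itself uses MATLAB), an upper-triangular system can be solved by back substitution alone:

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
n = 500
# an upper-triangular system; the diagonal shift keeps it well conditioned
A = np.triu(rng.standard_normal((n, n))) + n * np.eye(n)
y = rng.standard_normal(n)

x_generic = np.linalg.solve(A, y)  # generic dense solve: O(n^3)
x_tri = solve_triangular(A, y)     # back substitution only: O(n^2)

print(np.allclose(x_generic, x_tri))
```

Both give the same answer, but the structured solve skips the O(n^3) factorization entirely.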
A circuit example: once part of the circuit quantities, say the resistances R1, ..., R5 and the source voltages E1, E2, E3, is known, we should be able to find the other unknown values of these quantities, here the branch currents I1, I2, I3.

Kirchhoff's laws give a 3×3 linear system Ax = y: the entries of A are built from the resistances (together with a current-balance row (1, 1, −1)), x = (I1, I2, I3)^T stacks the unknown currents, and the entries of y, such as E2 + E3 and −E3 − E1, are built from the source voltages.
Least squares (LS): given y ∈ R^m and A ∈ R^{m×n}, solve

min_{x∈R^n} ‖y − Ax‖_2^2,

where ‖·‖_2 is the Euclidean norm; i.e., ‖x‖_2 = (Σ_{i=1}^n |x_i|^2)^{1/2}.

When A has full column rank, the LS solution is

x_LS = (A^T A)^{-1} A^T y.
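The closed form can be checked numerically (a Python/NumPy sketch, not from the slides); in practice a QR/SVD-based solver is preferable to forming A^T A explicitly:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 5
A = rng.standard_normal((m, n))   # full column rank with probability 1
y = rng.standard_normal(m)

# normal equations: x_LS = (A^T A)^{-1} A^T y
x_ne = np.linalg.solve(A.T @ A, A.T @ y)

# library LS solver (SVD-based), numerically preferable
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(x_ne, x_ls))
```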
The autoregressive (AR) model:

y_t = a_1 y_{t−1} + a_2 y_{t−2} + · · · + a_q y_{t−q} + v_t,  t = 0, 1, 2, . . . ,

for some coefficients {a_i}_{i=1}^q, where v_t is noise or modeling error.

Problem: estimate the coefficients {a_i}_{i=1}^q from the observations {y_t}_{t≥0}; this can be formulated as an LS problem.

Applications: time-series prediction, speech analysis and coding, spectral estimation, ...
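A sketch of the LS formulation for AR fitting in Python/NumPy (illustrative; the helper name fit_ar is my own): stack lagged samples as regressors and solve for the coefficients.

```python
import numpy as np

def fit_ar(y, q):
    """Fit AR(q) coefficients by LS: y[t] ~ a1*y[t-1] + ... + aq*y[t-q]."""
    # row t of the regressor matrix holds (y[t-1], ..., y[t-q])
    A = np.column_stack([y[q - i: len(y) - i] for i in range(1, q + 1)])
    b = y[q:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

# noiseless AR(2) data: LS recovers the true coefficients
a_true = np.array([0.6, 0.3])
y = np.zeros(200)
y[0], y[1] = 1.0, 0.5
for t in range(2, 200):
    y[t] = a_true[0] * y[t - 1] + a_true[1] * y[t - 2]

a_hat = fit_ar(y, 2)
print(np.allclose(a_hat, a_true))
```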
[Figure: Hang Seng Index (vertical axis, around 1.9–2.3 × 10^4) versus day (horizontal axis).
blue: the Hang Seng Index during a certain time period.
red: training phase; the line is ŷ_t = Σ_{i=1}^q a_i y_{t−i}; a is obtained by LS; q = 10.
green: prediction phase; the line is ŷ_t = Σ_{i=1}^q a_i ŷ_{t−i}; the same a as in the training phase.]
Tracking influenza outbreaks by ARGO, a model combining the AR model and GOogle search data.
Eigenvalue problem: given a symmetric A ∈ R^{n×n}, find v ≠ 0 such that

Av = λv, for some λ.

Such an A also admits a decomposition/factorization A = V Λ V^T, where V ∈ R^{n×n} is orthogonal, i.e., V^T V = I, and Λ = Diag(λ_1, . . . , λ_n).
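This decomposition is easy to verify numerically (a Python/NumPy sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                 # symmetrize to get a symmetric A

lam, V = np.linalg.eigh(A)        # eigenvalues and orthonormal eigenvectors

print(np.allclose(V @ np.diag(lam) @ V.T, A))  # A = V Lambda V^T
print(np.allclose(V.T @ V, np.eye(5)))         # V^T V = I
```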
[Figure omitted. Source: Wiki.]
PageRank: the importance scores satisfy

Σ_{j∈L_i} v_j / c_j = v_i,  i = 1, . . . , n,

where c_j is the number of outgoing links from page j; L_i is the set of pages with a link to page i; v_i is the importance score of page i.

In matrix form this is an eigenvalue problem: Av = v, where the link matrix A collects the 1/c_j weights (here a four-page example with entries such as 1, 1/2, and 1/3) and v = (v_1, v_2, v_3, v_4)^T stacks the scores; that is, v is an eigenvector of A associated with eigenvalue 1.
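As an illustration, the score vector can be computed by power iteration; the four-page link matrix below is a hypothetical example (the slide's own matrix may differ). A Python/NumPy sketch:

```python
import numpy as np

# a hypothetical 4-page link matrix: column j spreads page j's score
# equally over its c_j outgoing links, so every column sums to 1
A = np.array([
    [0,     0,   1, 1/2],
    [1/3,   0,   0, 0  ],
    [1/3, 1/2,   0, 1/2],
    [1/3, 1/2,   0, 0  ],
])

# power iteration: repeatedly apply A until the scores stabilize
v = np.ones(4) / 4
for _ in range(200):
    v = A @ v
v /= v.sum()

print(np.allclose(A @ v, v))  # Av = v: an eigenvector with eigenvalue 1
```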
Low-rank matrix factorization: given Y ∈ R^{m×n} and an integer r < min{m, n}, find an (A, B) ∈ R^{m×r} × R^{r×n} such that either Y = AB or Y ≈ AB.
The factorization can be obtained by solving

min_{A∈R^{m×r}, B∈R^{r×n}} ‖Y − AB‖_F^2,

where ‖·‖_F is the Frobenius, or matrix Euclidean, norm.
dimensionality reduction, extracting meaningful features from data, low-rank modeling, . . .
Any Y ∈ R^{m×n} admits a singular value decomposition (SVD)

Y = U Σ V^T,

where U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal, and Σ ∈ R^{m×n} takes a diagonal form.

The truncated SVD, keeping only the r largest singular values, solves the low-rank approximation problem

min_{A∈R^{m×r}, B∈R^{r×n}} ‖Y − AB‖_F^2.
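This optimality is easy to check numerically: the squared Frobenius error of the rank-r truncation equals the energy in the discarded singular values (a Python/NumPy sketch, not from the slides):

```python
import numpy as np

def truncated_svd(Y, r):
    """Rank-r factors (A, B) from the r largest singular values of Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :r] * s[:r]   # m x r, columns scaled by singular values
    B = Vt[:r]             # r x n
    return A, B

rng = np.random.default_rng(0)
Y = rng.standard_normal((8, 6))
A, B = truncated_svd(Y, 3)

s = np.linalg.svd(Y, compute_uv=False)
err = np.linalg.norm(Y - A @ B, "fro") ** 2
print(np.allclose(err, np.sum(s[3:] ** 2)))
```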
[Figure: image compression by truncated SVD with r = 3, 5, 10, 20.]
Principal component analysis (PCA): given a set of data points {y_1, y_2, . . . , y_n} ⊂ R^m and an integer r < min{m, n}, find a low-dimensional representation

y_i = Q c_i + µ + e_i,  i = 1, . . . , n,

where Q ∈ R^{m×r} is a basis; the c_i's are coefficients; µ is a base point; the e_i's are errors.
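A PCA-via-SVD sketch in Python/NumPy (illustrative; the helper pca and the one-data-point-per-column convention are my own): center the data, then take the leading left singular vectors as the basis Q.

```python
import numpy as np

def pca(Y, r):
    """PCA of Y (m x n, one data point per column).
    Returns basis Q (m x r), coefficients C (r x n), and mean mu (m,)."""
    mu = Y.mean(axis=1)
    Yc = Y - mu[:, None]                       # center the data
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    Q = U[:, :r]                               # principal directions
    C = Q.T @ Yc                               # coefficients c_i
    return Q, C, mu

rng = np.random.default_rng(0)
Y = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 50))  # rank-3 data
Q, C, mu = pca(Y, 3)

print(np.allclose(Q @ C + mu[:, None], Y))  # exact for rank-3 data
```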
A face image dataset. Image size = 112 × 92, number of face images = 400. Each yi is the vectorization of one face image, leading to m = 112 × 92 = 10304, n = 400.
[Figures: the mean face; the 1st, 2nd, and 3rd principal left singular vectors; the 400th left singular vector. Energy concentration: singular-value energy versus rank (1–400) on a logarithmic scale.]
Many topics in optimization, computer vision, control, communications, . . . use matrix operations extensively
– sparse recovery or compressed sensing; – matrix completion; structured low-rank matrix approximation; – quadratic system of equations problem or phase retrieval; – deep neural networks; etc.
Problem: given y ∈ Rm, A ∈ Rm×n, m < n, find a sparsest x ∈ Rn such that y = Ax.
(Here y collects the measurements, and x is a sparse vector with few nonzero entries.)
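The slides do not give an algorithm here; one classical approach is to relax to ℓ1 minimization (basis pursuit), which becomes a linear program after splitting x = u − v with u, v ≥ 0. A Python/SciPy sketch (illustrative data):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 20, 40
A = rng.standard_normal((m, n))
x0 = np.zeros(n)
x0[[3, 17, 25]] = [1.5, -2.0, 0.7]   # sparse ground truth
y = A @ x0

# min ||x||_1  s.t.  Ax = y, with x = u - v, u, v >= 0
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=(0, None))
x = res.x[:n] - res.x[n:]

print(np.allclose(A @ x, y, atol=1e-6))  # a feasible, l1-minimal solution
```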
Problem: MRI image reconstruction.
[Figure panels (a), (b): the Fourier coefficients are sampled along 22 approximately radial lines. Source: [Candès-Romberg-Tao2006]]
[Figure panels (c), (d): the image reconstructed by a sparse recovery solution. Source: [Candès-Romberg-Tao2006]]
– in 2009, Netflix awarded $1 million to a team that performed best in recommending new movies to users based on their previous preferences.¹
The data form a users × movies rating matrix Z whose known entries are scores (e.g., 2, 3, 1, 5) and whose unknown entries are marked "?".
– some entries z_ij are missing, since no one watches all movies.
– Z is assumed to be of low rank; research shows that only a few factors affect users' preferences.
¹ www.netflixprize.com
The completion can be formulated as a factorization problem [Koren-Bell-Volinsky2009]:

min_{A∈R^{m×r}, B∈R^{r×n}} Σ_{(i,j)∈Ω} |z_ij − [AB]_{ij}|^2,

where Ω is an index set that indicates the known entries of Z.
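One simple heuristic for this nonconvex problem is alternating least squares: fix B and solve for each row of A, then fix A and solve for each column of B. A Python/NumPy sketch (illustrative; the small ridge term lam is my own addition to keep every subproblem well posed):

```python
import numpy as np

def als_complete(Z, mask, r, iters=50, lam=1e-3):
    """Alternating LS for min over (A, B) of the squared error on known entries."""
    m, n = Z.shape
    rng = np.random.default_rng(0)
    A = rng.standard_normal((m, r))
    B = rng.standard_normal((r, n))
    I = lam * np.eye(r)
    for _ in range(iters):
        for i in range(m):                 # update row i of A, B fixed
            Bi = B[:, mask[i]]
            A[i] = np.linalg.solve(Bi @ Bi.T + I, Bi @ Z[i, mask[i]])
        for j in range(n):                 # update column j of B, A fixed
            Aj = A[mask[:, j]]
            B[:, j] = np.linalg.solve(Aj.T @ Aj + I, Aj.T @ Z[mask[:, j], j])
    return A, B

rng = np.random.default_rng(1)
Z = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))  # rank-3 truth
mask = rng.random(Z.shape) < 0.6                                 # ~60% observed
A, B = als_complete(Z, mask, r=3)

err = np.linalg.norm((Z - A @ B)[mask]) / np.linalg.norm(Z[mask])
print(f"relative error on observed entries: {err:.1e}")
```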
Left: An incomplete image with 40% missing pixels. Right: the low-rank matrix completion result. r = 120.
Nonnegative matrix factorization (NMF):

min_{A∈R^{m×r}, B∈R^{r×n}} ‖Y − AB‖_F^2  s.t. A ≥ 0, B ≥ 0,

where X ≥ 0 means that x_ij ≥ 0 for all i, j.
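The Lee-Seung multiplicative updates (used for the face example later in these slides) can be sketched in a few lines of Python/NumPy (illustrative sizes and iteration count):

```python
import numpy as np

def nmf(Y, r, iters=500):
    """Lee-Seung multiplicative updates for min ||Y - AB||_F^2, A, B >= 0."""
    m, n = Y.shape
    rng = np.random.default_rng(0)
    A = rng.random((m, r)) + 0.1      # strictly positive initialization
    B = rng.random((r, n)) + 0.1
    eps = 1e-12                        # guard against division by zero
    for _ in range(iters):
        B *= (A.T @ Y) / (A.T @ A @ B + eps)   # update keeps B >= 0
        A *= (Y @ B.T) / (A @ B @ B.T + eps)   # update keeps A >= 0
    return A, B

Y = np.abs(np.random.default_rng(1).standard_normal((10, 8)))
A, B = nmf(Y, r=4)

print(bool(np.all(A >= 0) and np.all(B >= 0)))  # factors stay nonnegative
```

The multiplicative form is why nonnegativity is preserved: each update multiplies the current factor entrywise by a nonnegative ratio.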
[Figure: an original face image approximated by NMF as a nonnegative basis × nonnegative weights.]
The basis elements extract facial features such as eyes, nose and lips. Source: [Lee-Seung1999].
– basis elements allow us to recover different topics; – weights allow us to assign each text to its corresponding topics.
A face image dataset. Image size = 101 × 101, number of face images = 13232. Each yi is the vectorization of one face image, leading to m = 101 × 101 = 10201, n = 13232.
NMF settings: r = 49, Lee-Seung multiplicative update with 5000 iterations.
[Figures: the mean face; the 1st, 2nd, and 3rd principal left singular vectors; the last principal left singular vector. Energy concentration: singular-value energy versus rank on a logarithmic scale.]
– how to read how others manipulate matrix operations, and how you can manipulate them yourself (learn to use a tool);
– what applications we can tackle, and how to find new applications of our own (learn to apply a tool);
– deep analysis skills (Why is this tool valid? Can I invent new tools? Key to some topics; something you should go through at least once in your lifetime).
[Yang-Santillana-Kou2015] S. Yang, M. Santillana, and S. C. Kou, "Accurate estimation of influenza epidemics using Google search data via ARGO," Proceedings of the National Academy of Sciences, vol. 112, no. 47, pp. 14473–14478, 2015.
[Bryan-Tanya2006] K. Bryan and T. Leise, "The $25,000,000,000 eigenvector: The linear algebra behind Google," SIAM Review, vol. 48, no. 3, pp. 569–581, 2006.
[Candès-Romberg-Tao2006] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Information Theory, vol. 52, no. 2, pp. 489–509, 2006.
[Koren-Bell-Volinsky2009] Y. Koren, R. Bell, and C. Volinsky, "Matrix factorization techniques for recommender systems," IEEE Computer, vol. 42, no. 8, pp. 30–37, 2009.
[Lee-Seung1999] D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788–791, 1999.