Lecture 15: Exact Tensor Completion
Joint Work with David Steurer
Lecture Outline
- Part I: Matrix Completion Problem
- Part II: Matrix Completion via Nuclear Norm Minimization
- Part III: Generalization to Tensor Completion
- Part IV: SOS-symmetry to the Rescue
- Part V: Finding Dual Certificate for Matrix Completion
- Part VI: Open Problems
Part I: Matrix Completion Problem
Matrix Completion
- Matrix Completion: Let $\Omega$ be a set of entries sampled at random. Given the entries $\{M_{ab} : (a,b) \in \Omega\}$ from a matrix $M$, can we determine the remaining entries of $M$?
- Impossible in general, tractable if $M$ is low rank, i.e. $M = \sum_{i=1}^{r} \lambda_i u_i v_i^T$ where $r$ is not too large.
Netflix Challenge
- Canonical example of matrix completion: Netflix Challenge
- Can we predict users' preferences on other movies from their previous ratings?
[Figure: partially filled ratings matrix; the known entries shown are 10, 9, 8, 9, 6, 7.5, 6, 6, 5, 8, 9, 9 and the unknown entries are marked "?"]
Solving Matrix Completion
- Current best method in practice: Alternating minimization
- Idea: Write $M = \sum_{i=1}^{r} u_i v_i^T$, alternate between optimizing $\{u_i\}$ and $\{v_i\}$
- Best known theoretical guarantees: Nuclear norm minimization
- This lecture: We'll describe nuclear norm minimization and how it generalizes to tensor completion via SOS.
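The alternating idea above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the lecture's or any library's implementation: the function name `alternating_minimization`, the tiny ridge term `reg`, the iteration count, and the test matrix are all assumptions made for the example.

```python
import numpy as np

def alternating_minimization(M_obs, mask, r, iters=50, reg=1e-9, seed=0):
    """Fit M ~ U @ V.T on the observed entries by alternating least squares.

    M_obs: n1 x n2 array (values outside the mask are ignored)
    mask:  n1 x n2 boolean array, True where the entry is observed
    r:     target rank
    """
    rng = np.random.default_rng(seed)
    n1, n2 = M_obs.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    ridge = reg * np.eye(r)  # hypothetical safeguard against singular systems
    for _ in range(iters):
        # Fix V and solve a small least-squares problem for each row of U.
        for a in range(n1):
            Va = V[mask[a]]  # rows of V for the observed columns of row a
            U[a] = np.linalg.solve(Va.T @ Va + ridge, Va.T @ M_obs[a, mask[a]])
        # Fix U and solve a small least-squares problem for each row of V.
        for b in range(n2):
            Ub = U[mask[:, b]]  # rows of U for the observed rows of column b
            V[b] = np.linalg.solve(Ub.T @ Ub + ridge, Ub.T @ M_obs[mask[:, b], b])
    return U @ V.T

rng = np.random.default_rng(1)
n, r = 30, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # random rank-2 matrix
mask = rng.random((n, n)) < 0.5                                 # observe about half
M_hat = alternating_minimization(np.where(mask, M, 0.0), mask, r)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(rel_err)
```

Each inner step is an ordinary least-squares solve, which is part of why this approach is popular in practice even though the guarantees discussed in this lecture are for nuclear norm minimization.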
Part II: Nuclear Norm Minimization
Theorem Statement
- Theorem [Rec11]: If $M = \sum_{i=1}^{r} \lambda_i u_i v_i^T$ is an $n \times n$ matrix then nuclear norm minimization requires $O(n r \mu_0 (\log n)^2)$ random samples to complete $M$ with high probability
- Note: $\mu_0$ is a parameter related to how coherent the $\{u_i\}$ and the $\{v_i\}$ are (see appendix for the definition)
- Example of why this is needed: If $u_i = e_j$ then $u_i v_i^T = e_j v_i^T$ can only be fully detected by sampling all of row $j$, which requires sampling almost everything!
Nuclear Norm
- Recall the singular value decomposition (SVD) of a matrix $M$
- $M = \sum_{i=1}^{r} \sigma_i u_i v_i^T$ where the $\{u_i\}$ are orthonormal, the $\{v_i\}$ are orthonormal, and $\sigma_i \geq 0$ for all $i$.
- The nuclear norm of $M$ is $\|M\|_* = \sum_{i=1}^{r} \sigma_i$
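As a small numerical illustration of this definition, the sketch below builds a matrix from chosen singular values and recovers its nuclear norm with NumPy. The sizes, seed, and singular values are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 2
# Orthonormal u_i and v_i, obtained as the Q factors of random matrices.
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
sigma = np.array([3.0, 1.5])
M = U @ np.diag(sigma) @ V.T          # M = sum_i sigma_i u_i v_i^T

singular_values = np.linalg.svd(M, compute_uv=False)
nuclear_norm = singular_values.sum()  # sum of the singular values
print(nuclear_norm)                   # approximately 3.0 + 1.5 = 4.5
```

NumPy also exposes the same quantity directly as `np.linalg.norm(M, 'nuc')`.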
Nuclear Norm Minimization
- Matrix completion problem: Recover $M$ given randomly sampled entries $\{M_{ab} : (a,b) \in \Omega\}$
- Nuclear norm minimization: Find the matrix $X$ which minimizes $\|X\|_*$ while satisfying $X_{ab} = M_{ab}$ whenever $(a,b) \in \Omega$.
- How do we minimize $\|X\|_*$?
Semidefinite Program
- We can implement nuclear norm minimization with the following semidefinite program:
- Minimize the trace of $\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} \succeq 0$ where $X_{ab} = M_{ab}$ whenever $(a,b) \in \Omega$
- Why does this work? We'll first show that the true solution is a good solution. We'll then describe how to show the true solution is the optimal solution.
True Solution
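- Program: Minimize the trace of $\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} \succeq 0$ where $X_{ab} = M_{ab}$ whenever $(a,b) \in \Omega$
- True solution: $\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} = \sum_i \lambda_i \begin{pmatrix} u_i \\ v_i \end{pmatrix} \begin{pmatrix} u_i^T & v_i^T \end{pmatrix}$ (recall that $M = \sum_i \lambda_i u_i v_i^T$)
- Since for all $i$, $\mathrm{tr}(u_i u_i^T) = \mathrm{tr}(v_i v_i^T) = 1$, $\mathrm{tr}\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} = 2 \sum_i \lambda_i$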
Dual Certificate
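- Program: Minimize the trace of $\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} \succeq 0$ where $X_{ab} = M_{ab}$ whenever $(a,b) \in \Omega$
- Dual Certificate: $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \succeq 0$
- Recall that if $M_1, M_2 \succeq 0$ then $M_1 \bullet M_2 \geq 0$ (where $\bullet$ is the entry-wise dot product)
- $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \bullet \begin{pmatrix} U & X \\ X^T & V \end{pmatrix} \geq 0$
- If $A_{ab} = 0$ whenever $(a,b) \notin \Omega$, this lower bounds the trace.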
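To see why this gives a lower bound, it may help to expand the entry-wise dot product block by block. The sketch below writes $A$ for the off-diagonal block of the dual certificate and uses only the constraints already stated for the program.

```latex
\begin{align*}
\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \bullet
\begin{pmatrix} U & X \\ X^T & V \end{pmatrix}
  &= \mathrm{tr}(U) + \mathrm{tr}(V) - 2 \sum_{a,b} A_{ab} X_{ab} \\
  &= \mathrm{tr}\begin{pmatrix} U & X \\ X^T & V \end{pmatrix}
     - 2 \sum_{(a,b) \in \Omega} A_{ab} M_{ab} \;\geq\; 0 .
\end{align*}
```

The second equality uses $A_{ab} = 0$ for $(a,b) \notin \Omega$ and $X_{ab} = M_{ab}$ for $(a,b) \in \Omega$. So every feasible solution has trace at least $2 \sum_{(a,b) \in \Omega} A_{ab} M_{ab}$, a quantity that depends only on the observed entries.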
True Solution Optimality
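- Dual Certificate: $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \succeq 0$ where $A_{ab} = 0$ whenever $(a,b) \notin \Omega$
- True solution $\begin{pmatrix} U & X \\ X^T & V \end{pmatrix} = \sum_i \lambda_i \begin{pmatrix} u_i \\ v_i \end{pmatrix} \begin{pmatrix} u_i^T & v_i^T \end{pmatrix}$ is optimal if $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \bullet \begin{pmatrix} U & X \\ X^T & V \end{pmatrix} = 0$
- This occurs if $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \begin{pmatrix} u_i \\ v_i \end{pmatrix} = 0$ for all $i$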
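Conditions on $A$
- We want $A$ such that $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \succeq 0$, $A_{ab} = 0$ whenever $(a,b) \notin \Omega$, and $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \begin{pmatrix} u_i \\ v_i \end{pmatrix} = 0$ for all $i$
- Necessary and sufficient conditions on $A$:
1. $\|A\| \leq 1$
2. $A_{ab} = 0$ whenever $(a,b) \notin \Omega$
3. $A v_i = u_i$ for all $i$
4. $A^T u_i = v_i$ for all $i$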
Dual Certificate with all entries
- Necessary and sufficient conditions on $A$:
1. $\|A\| \leq 1$
2. $A_{ab} = 0$ whenever $(a,b) \notin \Omega$
3. $A v_i = u_i$ for all $i$
4. $A^T u_i = v_i$ for all $i$
- If we have all entries (so we can ignore condition 2), we can take $A = \sum_i u_i v_i^T$
- Challenge: Find $A$ when we don't have all entries
- Remark: This explains why the semidefinite program minimizes the nuclear norm.
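The all-entries case can be checked numerically. The following sketch, with arbitrary sizes and values chosen for illustration, builds $A = \sum_i u_i v_i^T$ and checks conditions 1, 3, and 4, that the certificate is positive semidefinite, and that its entry-wise dot product with the true solution vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2
U, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal u_i as columns
V, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal v_i as columns
lam = np.array([2.0, 0.5])
M = U @ np.diag(lam) @ V.T                          # M = sum_i lam_i u_i v_i^T

A = U @ V.T                                         # A = sum_i u_i v_i^T
cert = np.block([[np.eye(n), -A], [-A.T, np.eye(n)]])

# True primal solution: sum_i lam_i (u_i; v_i)(u_i; v_i)^T.
stacked = np.vstack([U, V])                         # column i is (u_i; v_i)
primal = stacked @ np.diag(lam) @ stacked.T

spectral_norm = np.linalg.norm(A, 2)                # condition 1: should be 1
min_eig = np.linalg.eigvalsh(cert).min()            # certificate is PSD: about 0
gap = np.sum(cert * primal)                         # entry-wise dot product: about 0
print(spectral_norm, min_eig, gap)
```

Since the gap is zero, the trace of the true solution, $2 \sum_i \lambda_i$, matches the lower bound from the certificate, which is twice the nuclear norm of $M$.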
Part III: Generalization to Tensor Completion
Tensor Completion
- Tensor Completion: Let $\Omega$ be a set of entries sampled at random. Given the entries $\{T_{abc} : (a,b,c) \in \Omega\}$ from a tensor $T$, can we determine the remaining entries of $T$?
- More difficult problem: tensor rank is much more complicated
Exact Tensor Completion Theorem
- Theorem [PS17]: If $T = \sum_{i=1}^{r} \lambda_i u_i \otimes v_i \otimes w_i$, the $\{u_i\}$ are orthogonal, the $\{v_i\}$ are orthogonal, and the $\{w_i\}$ are orthogonal then with high probability we can recover $T$ with $O(r \mu n^{3/2} \mathrm{polylog}(n))$ random samples
- First algorithm to obtain exact tensor completion
- Remark: The orthogonality condition is very restrictive but this result can likely be extended.
- See appendix for the definition of $\mu$.
Semidefinite Program: First Attempt
- Won't quite work, but we'll fix it later.
- Minimize the trace of $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} \succeq 0$ where $X_{abc} = T_{abc}$ whenever $(a,b,c) \in \Omega$
- Here the top and left blocks are indexed by $a$ and the bottom and right blocks are indexed by $b, c$.
True Solution
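- Program: Minimize trace of $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} \succeq 0$ where $X_{abc} = T_{abc}$ whenever $(a,b,c) \in \Omega$
- True solution: $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} = \sum_i \lambda_i \begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix} \begin{pmatrix} u_i^T & (v_i \otimes w_i)^T \end{pmatrix}$ (recall that $T = \sum_i \lambda_i u_i (v_i \otimes w_i)^T$)
- $\mathrm{tr}\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} = 2 \sum_i \lambda_i$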
Dual Certificate: First Attempt
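- Program: Minimize trace of $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} \succeq 0$ where $X_{abc} = T_{abc}$ whenever $(a,b,c) \in \Omega$
- Dual Certificate: $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \succeq 0$ where $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$
- We want $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix} = 0$ for all $i$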
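Conditions on $A$
- We want $A$ such that $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \succeq 0$, $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$, and $\begin{pmatrix} Id & -A \\ -A^T & Id \end{pmatrix} \begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix} = 0$ for all $i$
- Necessary and sufficient conditions on $A$:
1. $\|A\| \leq 1$
2. $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$
3. $A(v_i \otimes w_i) = u_i$ for all $i$
4. $A^T u_i = v_i \otimes w_i$ for all $i$
TOO STRONG, requires $\Omega(n^2)$ samples!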
Part IV: SOS-symmetry to the Rescue
SOS Program
- Minimize the trace of $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} \succeq 0$ where $X_{abc} = T_{abc}$ whenever $(a,b,c) \in \Omega$ and $VW$ is SOS-symmetric (i.e. $VW_{bcb'c'} = VW_{b'cbc'}$ for all $b, c, b', c'$)
Review: Matrix Polynomial $q(Q)$
- Definition: Given a symmetric matrix $Q$ indexed by monomials, define $q(Q) = \sum_K \left( \sum_{I,J : K = I \cup J \text{ (as multisets)}} Q_{IJ} \right) x_K$
- Idea: $M \bullet Q = \tilde{E}[q(Q)]$
Dual Certificate
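- Program: Minimize trace of $\begin{pmatrix} U & X \\ X^T & VW \end{pmatrix} \succeq 0$ where $X_{abc} = T_{abc}$ whenever $(a,b,c) \in \Omega$ and $VW$ is SOS-symmetric
- Dual Certificate: $\begin{pmatrix} Id & -A \\ -A^T & B \end{pmatrix} \succeq 0$ where $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$ and $q(B) = q(Id)$
- We want $\begin{pmatrix} Id & -A \\ -A^T & B \end{pmatrix} \begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix} = 0$ for all $i$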
Dual Certificate Tightness Condition
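- Write $B = A^T A + Id - Z$
- Dual Certificate: $\begin{pmatrix} Id & -A \\ -A^T & A^T A + Id - Z \end{pmatrix} \succeq 0$ where $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$ and $q(B) = q(Id)$
- This dual certificate is tight for the true solution if $\begin{pmatrix} Id & -A \\ -A^T & A^T A + Id - Z \end{pmatrix} \begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix} = 0$ for all $i$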
Dual Certificate Conditions
- This gives us the following conditions on $A$, $Z$:
1. $A_{abc} = 0$ whenever $(a,b,c) \notin \Omega$
2. $\forall i$, $A(v_i \otimes w_i) = u_i$
3. $\|Z\| \leq 1$
4. $\forall i$, $Z(v_i \otimes w_i) = v_i \otimes w_i$
5. $q(Z) = q(A^T A)$ (so that $q(B) = q(Id) = \sum_{b,c} y_b^2 z_c^2$)
- Remark: These conditions are sufficient even if $T$ is not orthogonal. We only prove the theorem for orthogonal tensors because that's what our current analysis can handle.
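As a sanity check on why these five conditions suffice, here is a short worked derivation. It is a sketch under two assumptions: $Z$ denotes the matrix in the decomposition $B = A^T A + Id - Z$, and $Z$ is symmetric, so that $\|Z\| \leq 1$ gives $Id - Z \succeq 0$.

```latex
\begin{align*}
\begin{pmatrix} Id & -A \\ -A^T & A^T A + Id - Z \end{pmatrix}
  &= \begin{pmatrix} Id \\ -A^T \end{pmatrix} \begin{pmatrix} Id & -A \end{pmatrix}
   + \begin{pmatrix} 0 & 0 \\ 0 & Id - Z \end{pmatrix} \succeq 0
   \quad \text{if } \|Z\| \leq 1, \\
\begin{pmatrix} Id & -A \\ -A^T & A^T A + Id - Z \end{pmatrix}
\begin{pmatrix} u_i \\ v_i \otimes w_i \end{pmatrix}
  &= \begin{pmatrix} u_i - A(v_i \otimes w_i) \\
       -A^T\big(u_i - A(v_i \otimes w_i)\big) + (Id - Z)(v_i \otimes w_i) \end{pmatrix}
   = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \\
q(B) &= q(A^T A) + q(Id) - q(Z) = q(Id) \quad \text{if } q(Z) = q(A^T A).
\end{align*}
```

The first line uses condition 3, the second uses conditions 2 and 4, and the third uses condition 5 together with the linearity of $q$. Condition 1 is the requirement that the certificate only touches observed entries.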
Part V: Finding Dual Certificate for Matrix Completion
Conditions on $A$
- Necessary and sufficient conditions on $A$:
1. $\|A\| \leq 1$
2. $A_{ab} = 0$ whenever $(a,b) \notin \Omega$
3. $A v_i = u_i$ for all $i$
4. $A^T u_i = v_i$ for all $i$
- How can we find such an $A$?
- Idea: Alternate between satisfying condition 2 and conditions 3,4, converging to a final solution.
Definition of $P_U$, $P_V$, $P_T$
- Define $P_U$ to be the projection to $\mathrm{span}\{u_i\}$. The equation for this is $P_U x = \sum_i (x \cdot u_i) u_i$
- Define $P_V$ to be the projection to $\mathrm{span}\{v_i\}$. The equation for this is $P_V y = \sum_i (y \cdot v_i) v_i$
- Define $P_T$ to be the projection (on the space of matrices) to $\mathrm{span}\{x v_i^T, u_i y^T\}$ (for arbitrary $x, y$). The equation for this is $P_T X = P_U X + X P_V - P_U X P_V$
Restatement of Conditions 3,4
- Necessary and sufficient conditions on $A$:
1. $\|A\| \leq 1$
2. $A_{ab} = 0$ whenever $(a,b) \notin \Omega$
3. $A v_i = u_i$ for all $i$
4. $A^T u_i = v_i$ for all $i$
- Without loss of generality, assume $M = \sum_i u_i v_i^T$ (the values of the $\lambda_i$ don't affect the dual certificate)
- Assuming $M = \sum_i u_i v_i^T$, conditions 3,4 are equivalent to $P_T A = M$
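The equivalence can be verified in a few lines. The derivation below is a sketch using only the formula $P_T X = P_U X + X P_V - P_U X P_V$, the orthonormality of the $\{u_i\}$ and the $\{v_i\}$, and the assumption $M = \sum_i u_i v_i^T$.

```latex
\begin{align*}
\text{If } A v_i = u_i \text{ and } A^T u_i = v_i \text{ for all } i: \quad
P_U A &= \sum_i u_i (A^T u_i)^T = \sum_i u_i v_i^T = M, \\
A P_V &= \sum_i (A v_i) v_i^T = \sum_i u_i v_i^T = M, \\
P_T A &= P_U A + A P_V - P_U A P_V = M + M - M P_V = M. \\
\text{Conversely, if } P_T A = M: \quad
P_U A &= P_U (P_T A) = P_U M = M \;\Rightarrow\; A^T u_i = v_i, \\
A P_V &= (P_T A) P_V = M P_V = M \;\Rightarrow\; A v_i = u_i .
\end{align*}
```

In the converse direction, the implications follow by multiplying $P_U A = M$ on the left by $u_j^T$ and $A P_V = M$ on the right by $v_j$.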
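Definition of $R_\Omega$ and $\bar{R}_\Omega$
- Definition: Define $(R_\Omega X)_{abc} = \frac{n_1 n_2 n_3}{m} X_{abc}$ if $(a,b,c) \in \Omega$ and $0$ otherwise, where $n_1 \times n_2 \times n_3$ are the dimensions of the tensor and each entry is sampled independently with probability $\frac{m}{n_1 n_2 n_3}$.
- Define $(\bar{R}_\Omega X)_{abc} = \left(\frac{n_1 n_2 n_3}{m} - 1\right) X_{abc}$ if $(a,b,c) \in \Omega$ and $-X_{abc}$ if $(a,b,c) \notin \Omega$
- $(R_\Omega X)_{abc} = 0$ whenever $(a,b,c) \notin \Omega$
- $E[\bar{R}_\Omega X] = 0$ (over the choice of $\Omega$)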
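These two operators are simple to write down in code. The sketch below is illustrative only: the function names `sample_mask`, `R`, and `R_bar`, the tensor shape, and the sample budget `m` are all assumptions for the example. The last lines estimate the expectation of the centered operator empirically.

```python
import numpy as np

def sample_mask(shape, m, rng):
    """Include each entry independently with probability m / (number of entries)."""
    p = m / np.prod(shape)
    return rng.random(shape) < p

def R(X, mask, m):
    """Rescaled sampling operator: (N/m) * X on sampled entries, 0 elsewhere."""
    N = X.size
    return np.where(mask, (N / m) * X, 0.0)

def R_bar(X, mask, m):
    """Mean-zero part of the sampling operator: R(X) - X."""
    return R(X, mask, m) - X

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4, 5))
m = 30                                   # expected number of samples (out of 60)
mask = sample_mask(X.shape, m, rng)

sampled = R(X, mask, m)
centered = R_bar(X, mask, m)

# Empirical check that E[R_bar(X)] is close to 0 over the choice of Omega.
trials = 2000
mean_centered = sum(R_bar(X, sample_mask(X.shape, m, rng), m) for _ in range(trials)) / trials
print(np.abs(mean_centered).max())
```

Writing the centered operator as `R(X) - X` makes the identity $R_\Omega = 1 + \bar{R}_\Omega$ immediate, and it reproduces the two cases of the definition: $(N/m - 1) X_{abc}$ on sampled entries and $-X_{abc}$ elsewhere.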
First Iteration
- Start with $M$. $P_T M = M$ but $M$ has nonzero entries outside the sampled entries
- $R_\Omega(M)$ is zero outside the sampled entries, but $P_T R_\Omega M \neq M$
- We take $A_1 = R_\Omega(M)$ as the first approximation; we'll need to correct for the difference $P_T R_\Omega M - M = P_T \bar{R}_\Omega M$
Technical Note
- For the analysis, we actually need to resample independently for each iteration, obtaining sets of samples $\Omega_1, \Omega_2, \ldots$. This is the source of the $(\log n)^2$ in the upper bound (the lower bound only has $\log n$ (reference to be added))
Iterative Equation
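- Take $A_k = \sum_{j=0}^{k-1} (-1)^j R_{\Omega_{j+1}} (P_T \bar{R}_{\Omega_j}) \cdots (P_T \bar{R}_{\Omega_1}) M$
- Claim: $P_T A_k = M + (-1)^{k-1} (P_T \bar{R}_{\Omega_k}) \cdots (P_T \bar{R}_{\Omega_1}) M$
- Proof idea: Use the facts that $R_\Omega = 1 + \bar{R}_\Omega$, $P_T^2 = P_T$, and $P_T M = M$.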
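The claim is an exact algebraic identity (the sum telescopes), so it can be checked numerically for any choice of sample sets. The sketch below does this for a small random example; the sizes, the per-round sample budget `m`, and the helper names are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, k = 8, 2, 4
m = 40                                             # expected samples per round
U, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal u_i as columns
V, _ = np.linalg.qr(rng.standard_normal((n, r)))   # orthonormal v_i as columns
M = U @ V.T                                        # all lambda_i set to 1 (WLOG)
PU, PV = U @ U.T, V @ V.T

def P_T(X):
    """P_T X = P_U X + X P_V - P_U X P_V."""
    return PU @ X + X @ PV - PU @ X @ PV

def R(X, mask):
    """Rescaled sampling operator for the sample set given by mask."""
    return np.where(mask, (X.size / m) * X, 0.0)

# masks[j] plays the role of Omega_{j+1}; each round is sampled independently.
masks = [rng.random((n, n)) < m / n**2 for _ in range(k)]

A_k = np.zeros((n, n))
E = M.copy()   # E_j = (P_T Rbar_{Omega_j}) ... (P_T Rbar_{Omega_1}) M, with E_0 = M
for j in range(k):
    A_k += (-1) ** j * R(E, masks[j])              # term j of the sum
    E = P_T(R(E, masks[j]) - E)                    # E_{j+1} = P_T Rbar_{Omega_{j+1}} E_j

lhs = P_T(A_k)
rhs = M + (-1) ** (k - 1) * E
print(np.abs(lhs - rhs).max())
```

The loop mirrors the proof idea: since each $E_j$ lies in the range of $P_T$, applying $P_T R_{\Omega_{j+1}}$ to it gives $E_j + E_{j+1}$, and the alternating signs cancel all but the first and last terms.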
Convergence and Final Step
- Take $A_k = \sum_{j=0}^{k-1} (-1)^j R_{\Omega_{j+1}} (P_T \bar{R}_{\Omega_j}) \cdots (P_T \bar{R}_{\Omega_1}) M$
- Claim: $P_T A_k = M + (-1)^{k-1} (P_T \bar{R}_{\Omega_k}) \cdots (P_T \bar{R}_{\Omega_1}) M$
- To show that $P_T A_k$ converges to $M$ w.h.p., it is sufficient to show that the $P_T \bar{R}_\Omega$ operation makes matrices "smaller" with high probability.
- Once the error is small enough, we then take one final step to satisfy all conditions simultaneously. For details, see [Rec11].
Part VI: Open Problems
Open Problems
- For which tensors $T$ can we show that SOS gives exact tensor completion? We've shown it when $T$ is orthogonal, but this can very likely be extended.
- Important subproblem: When can we find $A$ such that $A(v_i \otimes w_i) = u_i$ for all $i$ and $|A(u, v, w)| \leq 1$ for all unit $u, v, w$?
- Barak and Moitra [BM16] show that SOS solves the approximate tensor completion problem in a somewhat broader setting with a different analysis. Can these analyses assist each other?
References
- [BM16] B. Barak and A. Moitra. Noisy tensor completion via the sum-of-squares hierarchy. COLT, JMLR Workshop and Conference Proceedings, vol. 49, pp. 417-445, 2016.
- [PS17] A. Potechin and D. Steurer. Exact tensor completion with sum-of-squares. COLT, 2017.
- [Rec11] B. Recht. A simpler approach to matrix completion. JMLR, vol. 12, pp. 3413-3430, 2011.
Appendix: $\mu_0$ and $\mu$ Definitions
$\mu_0$ and $\mu$ Definitions
- Definition: $\mu_0 = \frac{n}{r} \cdot \max\{\max_a \|P_U e_a\|^2, \max_b \|P_V e_b\|^2\}$
- Definition: $\mu = n \cdot \max\{\max_{i,a} u_{ia}^2, \max_{i,b} v_{ib}^2, \max_{i,c} w_{ic}^2\}$
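To make the coherence parameter concrete, the sketch below evaluates $\mu_0$ for a spread-out vector and for a standard basis vector, matching the $u_i = e_j$ example from Part II. The function name `mu0` is hypothetical, and the code assumes a square $n \times n$ setting where the bases are given as matrices with orthonormal columns.

```python
import numpy as np

def mu0(U, V):
    """Coherence mu_0 for orthonormal column bases U and V (each n x r)."""
    n, r = U.shape
    # With orthonormal columns, ||P_U e_a||^2 is the squared norm of row a of U.
    row_u = np.sum(U**2, axis=1)
    row_v = np.sum(V**2, axis=1)
    return (n / r) * max(row_u.max(), row_v.max())

n = 16
flat = np.ones((n, 1)) / np.sqrt(n)      # unit vector with all entries equal
spike = np.zeros((n, 1))
spike[0] = 1.0                           # u = e_1

mu_flat = mu0(flat, flat)                # 1: the least coherent case
mu_spike = mu0(spike, flat)              # n = 16: the most coherent case
print(mu_flat, mu_spike)
```

The spiked case reaches the largest possible value $n/r$, which is consistent with the earlier remark that such a matrix cannot be completed without sampling almost everything.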