Compressed Sensing

Find x with a small number of non-zeros using linear measurements.

Fun with ℓ1 and ℓ2

‖x‖₁ ≤ √n ‖x‖₂:

  ‖x‖₁ = x · sgn(x) ≤ ‖x‖₂ ‖sgn(x)‖₂ ≤ √n ‖x‖₂.

More precisely, since sgn(x) has |supp(x)| non-zero entries,

  ‖x‖₁ = x · sgn(x) ≤ ‖x‖₂ ‖sgn(x)‖₂ ≤ √|supp(x)| ‖x‖₂,

where supp(x) is the set of non-zero indices of x.

If the mass is concentrated, ‖x‖₁ = ‖x‖₂: take x = (1,0,0,...,0).
If the mass is spread out, √n ‖x‖₂ ≤ ‖x‖₁: take x = (1,1,1,...,1).
If x is kind of spread out, say x has k 1’s, then ‖x‖₂ ≤ (1/√k) ‖x‖₁.

Fixing ‖v‖₂, sparse vectors have small ℓ1 norm and dense vectors have large ℓ1 norm.
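These inequalities are easy to sanity-check numerically. A minimal NumPy sketch (the array sizes and random trials are our own choices, not from the slides):

```python
import numpy as np

n, k = 100, 10
rng = np.random.default_rng(0)

for _ in range(50):
    x = rng.standard_normal(n)
    l1, l2 = np.abs(x).sum(), np.linalg.norm(x)
    assert l1 <= np.sqrt(n) * l2 + 1e-9             # ||x||_1 <= sqrt(n) ||x||_2
    s = np.count_nonzero(x)                         # |supp(x)|
    assert l1 <= np.sqrt(s) * l2 + 1e-9             # ||x||_1 <= sqrt(|supp(x)|) ||x||_2

e = np.zeros(n); e[0] = 1.0                         # concentrated: x = (1,0,...,0)
assert np.isclose(np.abs(e).sum(), np.linalg.norm(e))          # ||x||_1 = ||x||_2

ones = np.ones(n)                                   # spread out: x = (1,1,...,1)
assert np.isclose(np.abs(ones).sum(), np.sqrt(n) * np.linalg.norm(ones))

xk = np.zeros(n); xk[:k] = 1.0                      # k ones
assert np.isclose(np.linalg.norm(xk), np.abs(xk).sum() / np.sqrt(k))
print("norm inequalities check out")
```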

Compressed Sensing.

Find x with a small number of non-zeros using linear measurements: Ax = b.
Application: MRI.
Goal: find a k-sparse x, i.e., |supp(x)| ≤ k.
That is ℓ0-minimization: extremely “non-convex”.
Instead, find the solution to min ‖w‖₁ subject to Aw = b.
That is a linear program! (Exercise.)
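The reduction to a linear program (the exercise) is the standard variable-splitting trick: minimize Σᵢ tᵢ subject to −tᵢ ≤ wᵢ ≤ tᵢ and Aw = b. A sketch using SciPy's linprog; the helper name l1_min and the test instance are our own:

```python
import numpy as np
from scipy.optimize import linprog

def l1_min(A, b):
    """Solve min ||w||_1 subject to A w = b as an LP in variables (w, t)."""
    d, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])    # objective: sum_i t_i
    A_eq = np.hstack([A, np.zeros((d, n))])          # A w = b (t unconstrained here)
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]),            #  w_i - t_i <= 0
                      np.hstack([-I, -I])])          # -w_i - t_i <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                  bounds=[(None, None)] * n + [(0, None)] * n)
    return res.x[:n]

rng = np.random.default_rng(1)
A = rng.choice([-1.0, 1.0], size=(30, 60))           # random +-1 measurement matrix
x = np.zeros(60); x[[3, 17, 42]] = [2.0, -1.5, 1.0]  # a 3-sparse signal
w = l1_min(A, A @ x)
print("||w||_1 =", np.abs(w).sum(), " ||x||_1 =", np.abs(x).sum())
```

Since x itself is feasible, the LP optimum w always satisfies ‖w‖₁ ≤ ‖x‖₁; whether w = x exactly is what the rest of the lecture establishes.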

Restricted Isometry Property (RIP) matrices.

Definition: A matrix A satisfies RIP with constant δ_k if every k-sparse vector x has
(1 − δ_k)‖x‖₂ ≤ ‖Ax‖₂ ≤ (1 + δ_k)‖x‖₂.

Theorem [Candès–Tao]: For any RIP matrix A with δ_2k + δ_3k < 1, if Ax = b has a k-sparse solution x, then the solution y to min ‖y‖₁ subject to Ay = b has y = x.

Almost Euclidean Nullspace.

Theorem: For a random ±1, d×n matrix A with d = Ω(k log(n/k)) rows, every x in ker(A) has

  ‖x‖₂ < (1/√(16k)) ‖x‖₁. (∗)

Intuition: “The mass in x is spread out over at least k entries.” The nullspace of A is almost Euclidean. Typical vectors are spread out; here, every vector is kind of spread out. For vectors in the nullspace, the ℓ1 ball is closer to a scaling of the ℓ2 ball.

Idea: Consider a random r×n matrix A over GF(2). For a fixed non-zero vector x over GF(2), A·x = 0 with probability (1/2)^r if A has r rows. There are fewer than X = 2(n choose k) non-zero vectors x with fewer than k non-zeros. If r > log X = Θ(k log(n/k)), a union bound gives Ax ≠ 0 for all non-zero vectors that are k-sparse. That is, a random A has no sparse vectors in its nullspace.

Note: A is the parity-check matrix of a linear code!
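The GF(2) counting argument can be carried out exhaustively for tiny parameters. A sketch; the specific n, k, and the slack added to r are our own choices:

```python
import numpy as np
from itertools import combinations
from math import ceil, comb, log2

n, k = 30, 2
# number of non-zero GF(2) vectors with at most k non-zeros
X = sum(comb(n, j) for j in range(1, k + 1))
r = ceil(log2(X)) + 8                 # union bound: X * (1/2)^r < 1, with slack

rng = np.random.default_rng(3)
A = rng.integers(0, 2, size=(r, n))   # random r x n matrix over GF(2)

# exhaustively test every non-zero vector with at most k ones
in_kernel = 0
for weight in range(1, k + 1):
    for support in combinations(range(n), weight):
        x = np.zeros(n, dtype=int); x[list(support)] = 1
        if not np.any((A @ x) % 2):   # A x = 0 over GF(2)
            in_kernel += 1
print("sparse vectors in kernel:", in_kernel)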

Small projection onto small sets of coordinates.

Consider A with the property that every x ∈ ker(A) has ‖x‖₂ < (1/√(16k)) ‖x‖₁.

Lemma: For v ∈ ker(A) and T ⊂ [n] with |T| ≤ k, ‖v_T‖₁ < ‖v‖₁/4.

Proof: ‖v_T‖₁ ≤ √|T| ‖v_T‖₂ ≤ √|T| ‖v‖₂ < √|T| · (1/√(16k)) ‖v‖₁ ≤ (1/4) ‖v‖₁.

Intuition: For any v ∈ ker(A), the amount of mass in any small (size ≤ k) set of coordinates is small: at most ‖v‖₁/4. The mass is spread out over more than k coordinates.
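Once the hypothesis holds, the lemma's chain of inequalities is purely deterministic, which is easy to confirm numerically. A sketch (parameters are our own; for Gaussian vectors the hypothesis ‖v‖₂ < ‖v‖₁/√(16k) typically holds once n is much larger than 16k):

```python
import numpy as np

n, k = 200, 4
rng = np.random.default_rng(4)

checked = 0
for _ in range(200):
    v = rng.standard_normal(n)
    if np.linalg.norm(v) >= np.abs(v).sum() / np.sqrt(16 * k):
        continue                                  # hypothesis fails for this v; skip
    checked += 1
    T = np.argsort(np.abs(v))[-k:]                # worst case: the k largest coordinates
    vT = v[T]
    assert np.abs(vT).sum() <= np.sqrt(k) * np.linalg.norm(vT) + 1e-9  # Cauchy-Schwarz
    assert np.abs(vT).sum() < np.abs(v).sum() / 4                      # the Lemma
assert checked > 0
print("lemma verified on", checked, "vectors")
```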

Optimum is correct!

Want to find: a k-sparse solution x to Ax = b.
Recall: minimize ‖w‖₁ subject to Aw = b.

Lemma: For v ∈ ker(A) and T ⊂ [n] with |T| ≤ k, ‖v_T‖₁ < ‖v‖₁/4.

Idea: any non-zero vector v ∈ ker(A) has small projection onto any k coordinates.

Consider a solution w, so w = x + v where v ∈ ker(A). Will prove: v = 0, i.e., w = x. Contradiction? Hmm. Let T be the non-zero coordinates of x, so |T| ≤ k and x is zero on Tᶜ. Then

  ‖w‖₁ = ‖x + v‖₁
       = ‖x_T + v_T‖₁ + ‖v_Tᶜ‖₁          (x vanishes off T)
       ≥ ‖x_T‖₁ − ‖v_T‖₁ + ‖v_Tᶜ‖₁       (‖a + b‖₁ ≥ ‖a‖₁ − ‖b‖₁)
       = ‖x‖₁ − ‖v_T‖₁ − ‖v_T‖₁ + ‖v‖₁   (‖v_Tᶜ‖₁ = ‖v‖₁ − ‖v_T‖₁)
       = ‖x‖₁ − 2‖v_T‖₁ + ‖v‖₁
       > ‖x‖₁,

if v is non-zero: by the Lemma, 2‖v_T‖₁ < ‖v‖₁/2. But x is itself feasible with smaller ℓ1 norm, contradicting the optimality of w. So v = 0 and w = x.
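The inequality chain can be checked directly on vectors from an actual nullspace. A sketch (dimensions and the sparse x are our choices; scipy.linalg.null_space supplies an orthonormal basis of ker(A)):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)
n, d, k = 60, 40, 3
A = rng.choice([-1.0, 1.0], size=(d, n))
x = np.zeros(n); x[:k] = [2.0, -1.5, 1.0]
T = np.abs(x) > 0                              # support of x

N = null_space(A)                              # orthonormal basis of ker(A)
for _ in range(100):
    v = N @ rng.standard_normal(N.shape[1])    # a random vector in ker(A)
    # ||x + v||_1 >= ||x||_1 - 2 ||v_T||_1 + ||v||_1 holds unconditionally
    assert np.abs(x + v).sum() >= (np.abs(x).sum() - 2 * np.abs(v[T]).sum()
                                   + np.abs(v).sum() - 1e-9)
    # and if the projection lemma holds for this v, x + v is strictly worse than x
    if np.abs(v[T]).sum() < np.abs(v).sum() / 4:
        assert np.abs(x + v).sum() > np.abs(x).sum()
print("chain verified on 100 nullspace vectors")
```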

Imperfect Case.

What if x is only mostly sparse?

  σ_k(x) = min over |supp(z)| ≤ k of ‖x − z‖₁.

“The amount of x outside its best k coordinates.”

Theorem: If every v ∈ ker(A) has ‖v‖₂ ≤ (1/√(16k)) ‖v‖₁, then the solution w to min ‖w‖₁ subject to Aw = b has ‖x − w‖₁ ≤ 4σ_k(x).

Still have:

Lemma: For v ∈ ker(A) and T ⊂ [n] with |T| ≤ k, ‖v_T‖₁ < ‖v‖₁/4.
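σ_k is easy to compute: the minimizing z keeps the k largest-magnitude entries of x, so σ_k(x) is the ℓ1 mass of the rest. A small sketch (the helper name sigma_k is ours):

```python
import numpy as np

def sigma_k(x, k):
    """sigma_k(x) = min over k-sparse z of ||x - z||_1.
    Keep the k largest-magnitude entries; sum the absolute values of the rest."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))
    return a[:-k].sum() if k > 0 else a.sum()

x = np.array([5.0, -3.0, 0.5, 0.25, 0.0])
print(sigma_k(x, 2))      # mass outside the 2 largest-magnitude entries -> 0.75
```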

Proof of ‖w − x‖₁ ≤ 4σ_k(x).

Again: σ_k(x) = min over |supp(z)| ≤ k of ‖x − z‖₁.
Lemma: For v ∈ ker(A) and T ⊂ [n] with |T| ≤ k, ‖v_T‖₁ < ‖v‖₁/4.

Proof of Theorem: Let T be the k largest-in-magnitude coordinates of x, so ‖x_Tᶜ‖₁ = σ_k(x). Note x − w ∈ ker(A). Then

  ‖x − w‖₁ = ‖(x − w)_T‖₁ + ‖(x − w)_Tᶜ‖₁
           ≤ ‖(x − w)_T‖₁ + ‖x_Tᶜ‖₁ + ‖w_Tᶜ‖₁           (triangle inequality on Tᶜ)
           = ‖(x − w)_T‖₁ + ‖x_Tᶜ‖₁ + ‖w‖₁ − ‖w_T‖₁      (‖w_Tᶜ‖₁ = ‖w‖₁ − ‖w_T‖₁)
           ≤ ‖(x − w)_T‖₁ + ‖x_Tᶜ‖₁ + ‖x‖₁ − ‖w_T‖₁      (w is optimal: ‖w‖₁ ≤ ‖x‖₁)
           = ‖(x − w)_T‖₁ + 2‖x_Tᶜ‖₁ + ‖x_T‖₁ − ‖w_T‖₁   (‖x‖₁ = ‖x_T‖₁ + ‖x_Tᶜ‖₁)
           ≤ 2‖(x − w)_T‖₁ + 2‖x_Tᶜ‖₁                    (‖x_T‖₁ − ‖w_T‖₁ ≤ ‖(x − w)_T‖₁)
           ≤ 2 · (1/4)‖x − w‖₁ + 2σ_k(x)                 (Lemma with v = x − w)

  ⇒ ‖x − w‖₁ ≤ 4σ_k(x).


slide-115
SLIDE 115

Almost Euclidean Matrices Proof.

Theorem: For a random ±1, d × n matrix A with d = Ω(k log(n/k)) rows, every x with Ax = 0 satisfies

‖x‖₂ < (1 / √(16k)) ‖x‖₁.    (∗)

Idea in GF(2): for a fixed nonzero vector, a random dot product is 0 with probability 1/2, so all r rows vanish with probability (1/2)^r. A union bound over the C(n, k) sparse supports shows r ≈ log C(n, k) ≈ k log(n/k) rows are enough.

Over the reals there are too many vectors, so the real proof is fancier: it analyzes the distribution of X · v for a fixed vector v and a random ±1 vector X.

Poor man's proof: group the coordinates of v into groups of roughly equal mass, with nᵢ coordinates in group i. The probability that the deviation within a group is at most √(nᵢ)/2 is less than 1/2, and the probability that the groups' contributions cancel is small. With lots of rows, the norm is therefore preserved on average for each group. Only "few" vectors have most of their mass on a small set of coordinates; union bound over those.
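The almost-Euclidean property (∗) can be sanity-checked empirically. The sketch below (dimensions n = 128, d = 64, k = 2 are illustrative, not from the slides) builds a null-space basis of a random sign matrix and samples random kernel vectors; this checks (∗) only on random samples, not the worst case over the whole subspace, so it is evidence rather than a proof:

```python
import numpy as np

rng = np.random.default_rng(1)

n, d, k = 128, 64, 2
A = rng.choice([-1.0, 1.0], size=(d, n))  # random sign matrix

# Orthonormal basis for ker(A) via SVD (A has full row rank d with high probability).
_, _, Vh = np.linalg.svd(A)
N = Vh[d:].T                              # n x (n - d) null-space basis: A @ N ~ 0

# Sample random kernel vectors and compare ||v||_1 / ||v||_2 against sqrt(16 k).
ratios = []
for _ in range(200):
    v = N @ rng.standard_normal(n - d)
    ratios.append(np.linalg.norm(v, 1) / np.linalg.norm(v, 2))

print(min(ratios), np.sqrt(16 * k))       # min sampled ratio vs. the sqrt(16k) threshold
```

A random direction in a random subspace of dimension n − d has ℓ₁/ℓ₂ ratio around √(2n/π) ≈ 9 here, comfortably above √(16k) ≈ 5.66, consistent with (∗).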


slide-117
SLIDE 117

Credits

Moitra, MIT 6.854. Roughgarden, CS168, Stanford. See James Lee, TCS blog, May 2008, for a proof of the almost-Euclidean nature of random subspaces.

slide-123
SLIDE 123

Possible Topics.

TODO:

  • Long-tailed distributions.
  • Interior point algorithms.
  • Matrix concentration / matrix experts / semidefinite programs.
  • Coding theory: low-density parity-check codes or expander codes.
  • Auctions and mechanism design.