How to Hoodwink a Halfspace

A survey done in partial fulfillment of the requirements of the Comprehensive Examination for doctoral candidates

Purushottam Kar Y8111062

January 12, 2010

1 / 45 How to Hoodwink a Halfspace

Halfspaces

Definition (Halfspaces) A halfspace in a $d$-dimensional Euclidean space is a dichotomy characterized by a weight vector $w \in \mathbb{R}^d$ and a threshold $\theta \in \mathbb{R}$. More specifically, $h(x) = \mathrm{sgn}(w \cdot x - \theta)$ for all $x \in \mathbb{R}^d$.

Studied extensively in learning theory, geometry, game theory, complexity theory, ...
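
The definition can be sketched in a few lines of Python. The function names and the tie-breaking convention sgn(0) = +1 are assumptions made for this illustration; the slides do not fix a convention.

```python
from typing import Sequence

def sgn(t: float) -> int:
    """Sign function with the assumed convention sgn(0) = +1."""
    return 1 if t >= 0 else -1

def halfspace(w: Sequence[float], theta: float, x: Sequence[float]) -> int:
    """Evaluate h(x) = sgn(w . x - theta)."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension")
    return sgn(sum(wi * xi for wi, xi in zip(w, x)) - theta)

# Majority of three +/-1 bits is the halfspace with w = (1, 1, 1), theta = 0.
majority_examples = [halfspace([1, 1, 1], 0.0, x) for x in ([1, 1, -1], [-1, -1, 1])]
```

Here the first input has two +1 coordinates and the second has two -1 coordinates, so the halfspace labels them on opposite sides.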

Pre [DGJ+09] ...

The Learning Theory part ...

Halfspaces are weak ... very weak
  • Cannot separate strings based on parity [MP69] ... let alone regular languages
  • Cannot separate most interesting language classes due to high (read infinite) VC dimension
  • Have been shown to be able to adapt to piecewise testable languages using large margin methods (no PAC guarantees though) [CKM07]
  • Whatever they can represent can be learnt pretty fast: in "soft" quadratic time, given the presence of a teacher
  • There is a quadratic lower bound on the learning time [MT94]
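
The parity claim can be illustrated on two bits. The sketch below searches a small grid of weights and thresholds (the grid, the helper names and the convention sgn(0) = +1 are all assumptions of this example). A finite search is only an illustration, not a proof; the impossibility itself follows from adding up the four constraints, which forces both $\theta \le 0$ and $\theta > 0$.

```python
from itertools import product

def sgn(t):
    # Assumed convention: sgn(0) = +1.
    return 1 if t >= 0 else -1

CUBE = list(product([-1, 1], repeat=2))

def halfspace_exists(target, grid):
    """Return some (w1, w2, theta) from the grid that computes `target`
    on {-1, 1}^2, or None if no grid point works."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(sgn(w1 * x1 + w2 * x2 - theta) == target(x1, x2) for x1, x2 in CUBE):
            return (w1, w2, theta)
    return None

GRID = [k / 2 for k in range(-8, 9)]  # -4.0, -3.5, ..., 4.0

def and_gate(x1, x2):
    return 1 if (x1 == 1 and x2 == 1) else -1

def parity(x1, x2):
    return x1 * x2  # +1 exactly when the two bits agree

and_witness = halfspace_exists(and_gate, GRID)     # some halfspace is found
parity_witness = halfspace_exists(parity, GRID)    # the search comes back empty
```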

Pre [DGJ+09] ...

The Learning Theory part ...

Thresholded polynomials are stronger
  • Can be used to represent DNFs of exponentially larger sizes
  • [KS04] show that $s$-term DNFs can be computed by polynomial threshold functions of degree $O(n^{1/3} \log s)$
  • Matches a lower bound of $\Omega(n^{1/3})$ by [MP69]
  • The construction gives a $2^{O(n^{1/3} \log s \log n)}$-time algorithm to learn DNFs by extending halfspace learning algorithms to ones that learn polynomial threshold functions over boolean-valued attributes
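
To make "polynomial threshold function" concrete, here is a minimal sketch that writes a tiny DNF as the sign of a polynomial. It uses the naive construction whose degree equals the term width, not the low-degree construction of [KS04]; the DNF, the encoding (+1 for True) and the names are assumptions of this example.

```python
from itertools import product
from math import prod

# The DNF (x0 AND x1) OR (x2 AND x3); +1 encodes True, -1 encodes False.
TERMS = [(0, 1), (2, 3)]

def dnf(x):
    return 1 if any(all(x[i] == 1 for i in term) for term in TERMS) else -1

def ptf(x):
    # Each product is the 0/1 indicator of one term and has degree len(term),
    # so the polynomial below has degree 2.
    poly = sum(prod((1 + x[i]) / 2 for i in term) for term in TERMS) - 0.5
    return 1 if poly >= 0 else -1

# Compare the two on all 16 inputs.
agree = all(dnf(x) == ptf(x) for x in product([-1, 1], repeat=4))
```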

Pre [DGJ+09] ...

The Complexity Theory part ...

Halfspaces are resilient ... very resilient
  • Cannot be simulated by low-degree polynomials or AC0
  • A separation like NP ⊄ HALFSPACE2 still eludes us
  • Circuits composed of halfspaces can be simulated by circuits of majority gates of almost the same depth
  • Representational Complexity: Integer weights of size $\frac{(n+1)\log(n+1)}{2} - n$ bits suffice and $\frac{n \log n}{2} - n$ are necessary [Hås94]
  • If approximate representations are all we want then $\sqrt{n} \cdot 2^{\tilde{O}(1/\epsilon^2)}$ bits suffice to get a halfplane that begs to differ only on an $\epsilon$ fraction of the inputs [Ser07]

The Art and Mathematics of Deception

Definition (Fooling a Function) A distribution $D$ on strings over $\{-1, 1\}$ of length $n$ is said to $\epsilon$-fool a boolean function $f : \{-1, 1\}^n \to \{-1, 1\}$ if $|E_{x \leftarrow D}[f(x)] - E_{x \leftarrow U}[f(x)]| \le \epsilon$

  • The uniform distribution $U$ fools every function, but it requires too many random bits to implement
  • Can we fool certain functions using distributions that we can "create" ourselves given a smaller amount of randomness?
  • But why would one want to indulge in such a trivial pursuit?
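
The fooling error can be computed by brute force for small $n$. The sketch below assumes the distribution $D$ is given as a list of equally likely strings; the helper names and the example distribution are illustrative choices, not part of the survey.

```python
from itertools import product

def expectation(f, support):
    """Mean of f over the distribution that is uniform on the list `support`."""
    return sum(f(x) for x in support) / len(support)

def fooling_error(f, support, n):
    """|E_{x<-D}[f(x)] - E_{x<-U}[f(x)]| for D uniform on `support`."""
    cube = list(product([-1, 1], repeat=n))
    return abs(expectation(f, support) - expectation(f, cube))

# D: uniform over the four strings of {-1, 1}^3 whose coordinates multiply to +1.
# Sampling from D needs 2 random bits instead of 3.
D = [x for x in product([-1, 1], repeat=3) if x[0] * x[1] * x[2] == 1]

def first_bit(x):
    return x[0]

def parity3(x):
    return x[0] * x[1] * x[2]

err_first_bit = fooling_error(first_bit, D, 3)  # D fools a single coordinate exactly
err_parity = fooling_error(parity3, D, 3)       # but is as far as possible from fooling parity
```

So the same distribution can fool one function perfectly and another not at all, which is why fooling is always stated relative to a class of functions.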

Less than random distributions

Definition (k-wise Independence) A distribution $D$ on $\{-1, +1\}^n$ is said to be $k$-wise independent if the projection of $D$ on any $k$ indices is uniformly distributed over $\{-1, +1\}^k$

[Example matrix] Example taken from http://www.nada.kth.se/~johanh/verktyg/lecture3.pdf

Construction: Optimal constructions of such distributions exist

Randomness Requirement: Given a sequence of $k \log n$ random bits, one can generate a sequence of $n$ random bits that is $k$-wise independent.
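
A small sketch of the definition, using the textbook subset-product construction for the pairwise case ($k = 2$). This is not the optimal general construction the slide refers to, and the function names are assumptions of this example. The checker simply tests every projection by brute force.

```python
from itertools import combinations, product
from math import prod

def xor_family(t):
    """Stretch t uniform +/-1 seed bits into 2**t - 1 output bits.

    There is one output bit per non-empty subset S of seed positions, equal to
    the product of the seed bits in S. Returns all 2**t output strings, one per
    seed, so the distribution is uniform over the returned list.
    """
    subsets = [s for r in range(1, t + 1) for s in combinations(range(t), r)]
    return [tuple(prod(seed[i] for i in s) for s in subsets)
            for seed in product([-1, 1], repeat=t)]

def is_k_wise_independent(support, k):
    """Check that every projection onto k coordinates is uniform on {-1, 1}^k."""
    n = len(support[0])
    for idx in combinations(range(n), k):
        counts = {}
        for x in support:
            key = tuple(x[i] for i in idx)
            counts[key] = counts.get(key, 0) + 1
        # Uniform projection: all 2**k patterns occur, equally often.
        if len(counts) != 2 ** k or len(set(counts.values())) != 1:
            return False
    return True

support = xor_family(3)                         # 3 seed bits -> 7 output bits
pairwise = is_k_wise_independent(support, 2)    # holds for this family
threewise = is_k_wise_independent(support, 3)   # fails: some bit is the product of two others
```

Seven output bits come from only three truly random bits, at the price of independence that holds for pairs but not for triples.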

Pre [DGJ+09] ...

The Complexity Theory part ... contd

  • We know how to fool low-degree polynomials, constant-depth boolean circuits, ...
  • Some of these constructions imply that halfspaces with small weights can be fooled
  • The question of fooling general halfspaces ... [DGJ+09]
  • The question investigated by [DGJ+09] is not directly related to construction of pseudo-random generators for halfspaces
  • The question being asked is that of a property fooling a class of functions rather than a distribution doing so

A Key Result

Theorem ([Baz07]) A boolean function $f : \{-1, 1\}^n \to \{-1, 1\}$ can be $\epsilon$-fooled by the class of $k$-wise independent distributions iff there exist multivariate polynomials $u : \{-1, 1\}^n \to \mathbb{R}$, $l : \{-1, 1\}^n \to \mathbb{R}$, such that

  • $\deg(u), \deg(l) \le k$
  • $u(x) \ge f(x) \ge l(x)$ for all $x \in \{-1, 1\}^n$
  • $E_{x \leftarrow U}[u(x) - f(x)],\ E_{x \leftarrow U}[f(x) - l(x)] \le \epsilon$
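
The "if" direction of the theorem is a three-line calculation, sketched here for any $k$-wise independent distribution $D$. The key observation is that a polynomial of degree at most $k$ on $\{-1, 1\}^n$ is a sum of monomials that each depend on at most $k$ coordinates, so its expectation under $D$ equals its expectation under $U$.

```latex
\begin{align*}
E_{x \leftarrow D}[f(x)]
  &\le E_{x \leftarrow D}[u(x)]
     && \text{since } u \ge f \text{ pointwise} \\
  &=   E_{x \leftarrow U}[u(x)]
     && \text{since } \deg(u) \le k \text{ and } D \text{ is } k\text{-wise independent} \\
  &\le E_{x \leftarrow U}[f(x)] + \epsilon
     && \text{since } E_{x \leftarrow U}[u(x) - f(x)] \le \epsilon
\end{align*}
```

The symmetric argument with $l$ gives $E_{x \leftarrow D}[f(x)] \ge E_{x \leftarrow U}[f(x)] - \epsilon$, and the two bounds together are exactly the $\epsilon$-fooling condition.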

Pre [DGJ+09] ...

The Complexity Theory part ... contd

Has been used very productively to fool
  • DNFs [Baz07] [Raz08]
  • AC0 functions [Bra09]
  • halfspaces [DGJ+09] [GOWZ10] [KNW10]

Note: Servedio's construction in [Ser07] gives us PRGs for halfspaces if $\epsilon = \Omega(1/\sqrt{\log n})$. The [DGJ+09] construction itself stops working if $\epsilon = O(1/\sqrt{n})$

Now [DGJ+09]

Plan of attack

Goal: Find two low-degree polynomials that sandwich our halfspace function while closely approximating it

Plan of attack:
  • Construct a polynomial that gives a nice pointwise approximation to the sgn function
  • Use it to construct a polynomial that upper bounds the sgn function while closely approximating it
  • Use it to construct a polynomial that lower bounds the sgn function while closely approximating it

Wait ... what happened to the halfspace ?? Probably need to restate some of the goals:
  • The upper and lower bounding polynomials should closely approximate the sgn function under the Gaussian distribution
  • Use the fact that values taken by homogeneous 'regular' linear polynomials are distributed normally

Step 1

Step 2

Step 3

Approximating Real-valued Functions - I

Theorem (Jackson) Any bounded continuous function $f : [-1, 1] \to \mathbb{R}$ admits a $6\,\omega_f(1/\ell)$ pointwise approximation by a degree-$\ell$ polynomial in the domain $[-1, 1]$, where $\omega_f$ is the modulus of continuity of $f$.

  • Use Jackson's theorem to $O(1)$-approximate sgn by a degree $O(1/a)$ polynomial, where $a = \epsilon^2/\log(1/\epsilon)$
  • Use an amplifying polynomial of degree $O(\log(1/\epsilon))$ to reduce the error to $\epsilon^2$

Lemma There is a polynomial $p_1(x)$ of degree $2m = O(\log^2(1/\epsilon)/\epsilon^2)$ which gives a pointwise $\epsilon^2$-approximation to the sgn function in the range $[-1, -a] \cup [a, 1]$.
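
The amplification step can be illustrated with one standard amplifier, the "majority of $k$ biased coins" polynomial. This is an assumption of the sketch: the slides do not say which amplifying polynomial is used, and the one in [DGJ+09] may differ. The function `amplify` is a hypothetical name.

```python
from math import comb

def amplify(u, k):
    """Degree-k polynomial in u: 2 * Pr[Binomial(k, (1 + u) / 2) > k / 2] - 1.

    For odd k it is an odd function of u and pushes values with a constant
    bias towards +1 or -1.
    """
    if k % 2 == 0:
        raise ValueError("use odd k so that ties cannot occur")
    p = (1 + u) / 2
    tail = sum(comb(k, j) * p**j * (1 - p) ** (k - j)
               for j in range((k + 1) // 2, k + 1))
    return 2 * tail - 1

crude = 0.5  # a crude, constant-error approximation of sgn at some positive point
refined = [amplify(crude, k) for k in (1, 11, 51)]  # climbs towards +1 as k grows
```

Composing a degree-$O(1/a)$ crude approximation with a degree-$O(\log(1/\epsilon))$ amplifier multiplies the degrees, which is where the $O(\log^2(1/\epsilon)/\epsilon^2)$ bound in the lemma comes from.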

Approximating Real-valued Functions - II

Theorem (Chebyshev) For any bounded continuous function $f : [k, l] \to \mathbb{R}$ and any non-zero continuous function $s : [k, l] \to \mathbb{R}$, for every $m$, there is a unique degree-$m$ polynomial $r(x)$ that minimizes the maximum pointwise error $\max_{x \in [k, l]} |f(x) - s(x) r(x)|$, and it is characterized by the fact that the error $f(x) - s(x) r(x)$ attains this maximum magnitude $m + 2$ times in the interval $[k, l]$ with alternating signs.

  • Use Chebyshev's theorem to get the best degree-$m$ approximation $r(x)$, which minimizes $\max_{x \in [a^2, 1]} |1 - \sqrt{x}\, r(x)|$
  • Let $p(x) = x \cdot r(x^2)$.

Completing Step 1

  • Write $p_1(x)$ in the form $x \cdot r_1(x^2)$
  • Use it to bound the error of $p(x)$ in the interval $[-1, -a] \cup [a, 1]$ by $\epsilon^2$
  • and get some more properties ...

Lemma There is a polynomial $p(x)$ of degree $2m + 1 = O(\log^2(1/\epsilon)/\epsilon^2)$ such that
  • $p(x) \in \mathrm{sgn}(x) \pm \epsilon^2$ for all $|x| \in [a, 1]$
  • $p(x) \in [-(1 + \epsilon^2), 1 + \epsilon^2]$ for all $|x| \in [0, a]$
  • $p(x)$ is increasing in $(-\infty, -1] \cup [1, \infty)$.

Step 1

Completing Step 2

Let $P(x) = \frac{1}{2}\left(1 + \epsilon^2 + p(x + a)\right)^2 - 1$

Use simple case analyses ... and the fact that a polynomial of degree $d$ taking values in $[-b, b]$ on $[-1, 1]$ is bounded by $b|2x|^d$ for all $|x| > 1$ ... to get the following result

Lemma There is a polynomial $P(x)$ of degree $K = O(\log^2(1/\epsilon)/\epsilon^2)$ such that
  • $P(x) \ge \mathrm{sgn}(x)$ for all $x \in \mathbb{R}$
  • $P(x) \in [\mathrm{sgn}(x), \mathrm{sgn}(x) + \epsilon]$ for all $x \in [-1/2, -2a] \cup [0, 1/2]$
  • $P(x) \in [-1, 1 + \epsilon]$ for all $x \in (-2a, 0)$
  • $|P(x)| \le 2 \cdot (4x)^K$ for all $|x| \ge 1/2$.

Step 2
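
Two of the "simple case analyses" can be checked numerically. The sketch assumes the reading $P(x) = \frac{1}{2}(1 + \epsilon^2 + p(x + a))^2 - 1$ and only uses the Step 1 guarantee that $p$ is within $\epsilon^2$ of $\pm 1$ away from the origin; `P_value` and the choice $\epsilon = 0.1$ are illustrative.

```python
def P_value(p_val, eps):
    """Value of P(x) = (1 + eps**2 + p(x + a))**2 / 2 - 1, given p_val = p(x + a)."""
    return (1 + eps**2 + p_val) ** 2 / 2 - 1

eps = 0.1

# Case x in [0, 1/2]: x + a lies in [a, 1], so p(x + a) is within eps**2 of +1.
# P is increasing in p_val on this range, so the endpoints give the extremes.
pos_range = [P_value(1 - eps**2, eps), P_value(1 + eps**2, eps)]

# Case x in [-1/2, -2a]: x + a <= -a, so p(x + a) is within eps**2 of -1.
neg_range = [P_value(-1 - eps**2, eps), P_value(-1 + eps**2, eps)]
```

In the first case $P$ lands in $[1, 2(1 + \epsilon^2)^2 - 1]$ and in the second in $[-1, -1 + 2\epsilon^4]$, both inside $[\mathrm{sgn}(x), \mathrm{sgn}(x) + \epsilon]$ for small $\epsilon$. Shifting by $a$ and squaring is what turns a two-sided approximation into a one-sided upper bound.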

Completing Step 3(i)/3(ii)

Left as an exercise

Step 3

Plan of attack

  • Construct a polynomial that gives a nice pointwise approximation to the sgn function
  • Use it to construct a polynomial that upper bounds the sgn function while closely approximating it under the Gaussian distribution
  • Use it to construct a polynomial that lower bounds the sgn function while closely approximating it under the Gaussian distribution
  • Now use the fact that values taken by homogeneous 'regular' linear polynomials are distributed normally

A regular halfspace is one in which no weight is "large", i.e. if $|w_i| \le \epsilon \|w\|_2$ for all $i$, then we call the halfspace $\epsilon$-regular
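
Regularity is easy to test directly. A minimal sketch, with `is_regular` and the two example weight vectors chosen for illustration:

```python
from math import sqrt

def is_regular(w, eps):
    """True if |w_i| <= eps * ||w||_2 for every coordinate i."""
    norm = sqrt(sum(wi * wi for wi in w))
    return all(abs(wi) <= eps * norm for wi in w)

majority_like = [1.0] * 100          # every weight is 1/10 of the norm
dictator_like = [10.0] + [0.1] * 99  # one weight carries almost all of the norm

majority_is_regular = is_regular(majority_like, 0.2)
dictator_is_regular = is_regular(dictator_like, 0.2)
```

The majority-like vector passes and the dictator-like one does not: a single dominant weight makes $w \cdot x$ behave like one coin flip rather than like a Gaussian.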

An Effective Central Limit Theorem

Theorem (Berry-Esséen) Let $X_1, \ldots, X_n$ be a sequence of independent random variables satisfying $E[X_i] = 0$ for all $i$, $\sqrt{\sum_i E[X_i^2]} = \sigma$ and $\sum_i E[|X_i|^3] = \rho$. Let $S = (X_1 + \cdots + X_n)/\sigma$, let $F$ be the cumulative distribution function of $S$, and let $\Phi$ be the same for $N(0, 1)$. Then $\sup_x |F(x) - \Phi(x)| \le \rho/\sigma^3$.

slide-101
SLIDE 101

Regular Halfspaces generate Normally distributed outputs

Theorem. Let $x_1, \ldots, x_n \in_R \{-1, 1\}$ and $w_1, \ldots, w_n \in \mathbb{R}$. Let $\sigma = \|w\|_2$ and assume $|w_i| \le \tau \cdot \sigma$ for all i. Then for any $[a, b] \subset \mathbb{R}$,

$\left| \Pr[a \le w_1 x_1 + \ldots + w_n x_n \le b] - \Phi\left(\left[\frac{a}{\sigma}, \frac{b}{\sigma}\right]\right) \right| \le 2\tau$,

where Φ([c, d]) = Φ(d) − Φ(c).

Let $X_i = w_i x_i$; then $E[X_i] = 0$, $E[X_i^2] = w_i^2$, $E[|X_i|^3] = |w_i|^3$.

Theorem (Hoeffding). For any $w \in \mathbb{R}^n$ and any γ > 0, we have

$\Pr_{x \leftarrow U}[|w \cdot x| > \gamma \|w\|_2] \le 2e^{-\gamma^2/2}$.
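The step from Berry-Esséen to the 2τ bound is not spelled out on the slide. A sketch of the calculation, assuming $X_i = w_i x_i$ with uniform $x_i \in \{-1, 1\}$ and the Berry-Esséen bound $\sup_t |F(t) - \Phi(t)| \le \rho/\sigma^3$ for the normalized sum:

```latex
\rho = \sum_i |w_i|^3
     \le \Big(\max_i |w_i|\Big) \sum_i w_i^2
     \le \tau\sigma \cdot \sigma^2
\quad\Longrightarrow\quad
\sup_t |F(t) - \Phi(t)| \le \frac{\rho}{\sigma^3} \le \tau .
```

The probability of the interval [a, b] is a difference of two values of F (one of them a left limit), and each differs from the corresponding value of Φ by at most τ, which gives the total error of 2τ.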


slide-104
SLIDE 104

Let the con begin!

... for a regular halfspace h(x) = sgn(w · x − θ) with small threshold (|θ| ≤ Z/4), where Z = ε/(2a) = O((1/ε) log(1/ε)).

Upper bound the halfspace with $u(x) = P\left(\frac{w \cdot x - \theta}{Z}\right)$.

$E_x[u(x) - h(x)] \le E_{x \in [-\epsilon/Z, 0]} + E_{|x| \le 1/2} + E_{|x| \ge 1/2}[u(x) - h(x)]$

In the events below, x stands for the normalized argument (w · x − θ)/Z.

Event 1: x ∈ [−ε/Z, 0], Error: 2 + ε, Probability: ≤ 3ε

Event 2: |x| ≤ 1/2, Error: ε, Probability: ≤ 1

Event 3(i): x ∈ [1/2, 1], Error: $2 \cdot 4^K - 1$, Probability: $\le e^{-Z^2/32}$

Event 3(ii): x ∈ [1, 3/2], Error: $2 \cdot 6^K - 1$, Probability: $\le e^{-4Z^2/32}$

Event 3(iii): x ∈ [3/2, 2], Error: $2 \cdot 8^K - 1$, Probability: $\le e^{-9Z^2/32}$

Event 3(iv): ...

In all we get $E_x[u(x) - h(x)] \le O(\epsilon)$.

Note: the normalization by Z is only required to bound the contribution of events 3(...).

One can lower bound the halfspace using l(x) = −u(−x).
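The probability quoted for Event 3(i) can be traced back to Hoeffding's inequality. A sketch, assuming the weights are normalized so that $\|w\|_2 = 1$ (an assumption not stated on the slide) and using the one-sided form $\Pr[w \cdot x \ge \gamma \|w\|_2] \le e^{-\gamma^2/2}$ together with $|\theta| \le Z/4$:

```latex
\frac{w \cdot x - \theta}{Z} \ge \frac{1}{2}
\;\Longrightarrow\;
w \cdot x \ge \frac{Z}{2} + \theta \ge \frac{Z}{2} - \frac{Z}{4} = \frac{Z}{4},
\qquad
\Pr\left[ w \cdot x \ge \frac{Z}{4} \right] \le e^{-(Z/4)^2/2} = e^{-Z^2/32}.
```

The later events follow the same pattern with larger deviations: the listed probabilities decay like $e^{-j^2 Z^2/32}$ for j = 1, 2, 3, while the listed errors grow only like $2(2j+2)^K$.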


slide-116
SLIDE 116

Let the con continue!

... for a regular halfspace h(x) = sgn(w · x − θ) with large threshold (|θ| > Z/4); assume that θ > Z/4 w.l.o.t.m.g.

Let g(x) = sgn(w · x − Z/4); note that g(x) ≥ h(x).

Upper bound the halfspace with $u(x) = P\left(\frac{w \cdot x - Z/4}{Z}\right)$.

$E_x[u(x) - h(x)] = E_x[u(x) - g(x)] + E_x[g(x) - h(x)] = E_x[u(x) - g(x)] + O(\epsilon) = O(\epsilon) + O(\epsilon)$

In all we get $E_x[u(x) - h(x)] \le O(\epsilon)$.

Lower bound the halfspace using l(x) = −1: it works since the halfspace almost always outputs −1.
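To see why the constant −1 is a reasonable lower bound here, one can compute Pr[h(x) = +1] exactly for a small example. The sketch below is illustrative only: the weight vector, the threshold 3.0 and the helper name `prob_positive` are assumptions chosen for the demonstration, not values from the talk.

```python
import itertools
import math


def prob_positive(w, theta):
    """Exact Pr[w.x - theta >= 0] for uniform x in {-1, 1}^n (sgn(0) counted as +1)."""
    n = len(w)
    hits = sum(
        1
        for x in itertools.product((-1, 1), repeat=n)
        if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0
    )
    return hits / 2 ** n


n = 16
w = [1 / math.sqrt(n)] * n  # unit-norm weight vector with equal weights
theta = 3.0                 # a "large" threshold, chosen only for illustration

p = prob_positive(w, theta)
# With l(x) = -1, h(x) - l(x) is 2 when h(x) = +1 and 0 otherwise.
gap = 2 * p
# One-sided Hoeffding bound on Pr[w.x >= theta] for a unit-norm w.
hoeffding = math.exp(-theta ** 2 / 2)

print(p, gap, hoeffding)
```

The expected gap between h and the constant lower bound is exactly twice the probability that h outputs +1, and that probability is controlled by the Hoeffding tail bound.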


slide-124
SLIDE 124

Goal Accomplished!

Theorem. Any K(ε)-wise independent distribution O(ε)-fools any ε-regular halfspace, where $K(\epsilon) = O\left(\frac{1}{\epsilon^2} \log^2(1/\epsilon)\right)$.

Wait till the end for some fun facts about this statement ...


slide-126
SLIDE 126

Non-regular Halfspaces and Critical Indices

Assume $|w_1| \ge |w_2| \ge \ldots \ge |w_n|$, i.e. the weights are sorted in non-increasing order of magnitude.

The first index i from which the (sub-)halfspace $(w_i, \ldots, w_n)$ becomes ε-regular is the critical index at ε.

We shall split into cases depending on how far we need to go in order to get a regular halfspace.
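The definition above translates into a short scan over the sorted weights. This is a minimal sketch under stated assumptions: indices are 1-based to match the slides, `None` is returned when no tail is ε-regular, and the function name `critical_index` is illustrative.

```python
import math


def critical_index(w, eps):
    """Smallest 1-based i such that (|w_i|, ..., |w_n|), sorted in
    non-increasing order, is eps-regular; None if there is no such i."""
    v = sorted((abs(wi) for wi in w), reverse=True)

    # tail_sq[i] = sum of v[j]^2 over j >= i (0-based), built from the end.
    tail_sq = [0.0] * (len(v) + 1)
    for i in range(len(v) - 1, -1, -1):
        tail_sq[i] = tail_sq[i + 1] + v[i] ** 2

    # Because v is sorted, the tail starting at i is eps-regular exactly
    # when its largest weight v[i] satisfies the regularity condition.
    for i, vi in enumerate(v):
        if vi <= eps * math.sqrt(tail_sq[i]):
            return i + 1
    return None


print(critical_index([1.0] * 100, 0.2))
print(critical_index([100.0, 50.0] + [1.0] * 100, 0.2))
print(critical_index([2.0 ** -i for i in range(20)], 0.2))
```

Equal weights are regular from the start, two heavy weights push the critical index past them, and geometrically decreasing weights never become regular for this ε.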


slide-129
SLIDE 129

Small Critical Index

"Most" of the halfspace is ε-regular.

Feed in full independence for the non-regular part to fool it; hopefully not much would be needed.

If the critical index at ε is less than $L(\epsilon) = O\left(\frac{1}{\epsilon^2} \log^2(1/\epsilon)\right)$, then we are done.

Theorem. Any (K(ε) + L(ε))-wise independent distribution O(ε)-fools any halfspace with critical index less than L(ε).


slide-133
SLIDE 133

Large Critical Index

Exploit "structural properties" of non-regular halfspaces.

Weights decrease rather rapidly in non-regular regions of the halfspace ...

... and so do the norms of the tail weight vectors (i.e. $\sum_{j \ge i} w_j^2$).

$l(\epsilon) = O\left(\frac{1}{\epsilon^2} \log(1/\epsilon)\right)$


slide-137
SLIDE 137

Some Technical Results

Intuition later ...

Theorem. Let $v_1 > v_2 > \ldots > v_t > 0$ be such that $v_i \ge 3 v_{i+1}$. Then for any $x, y \in \{-1, 1\}^t$ with $x \ne y$, we have $|v \cdot x - v \cdot y| \ge v_t$.

Theorem. Let $k = \frac{4}{\epsilon^2} \log^2(10/\epsilon)$. Then with probability at least $1 - \epsilon/10$,

$\left| \theta - \sum_{i=1}^{L(\epsilon)} w_i x_i \right| \ge |w_k|/4$.

Theorem (Chebyshev). For any random variable X with E[X] = µ and Var[X] = σ², and any k > 0, $\Pr[|X - \mu| > k\sigma] \le 1/k^2$.
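The first theorem above (well-separated sums for rapidly decreasing weights) can be sanity-checked by brute force on a small instance. The sketch below is only an illustration: the helper name `min_gap` and the example vectors are assumptions, and the enumeration is exponential in t.

```python
import itertools


def min_gap(v):
    """Smallest |v.x - v.y| over distinct x, y in {-1, 1}^t."""
    values = sorted(
        sum(vi * xi for vi, xi in zip(v, x))
        for x in itertools.product((-1, 1), repeat=len(v))
    )
    # After sorting, the closest pair of values is always adjacent.
    return min(b - a for a, b in zip(values, values[1:]))


# Weights satisfying v_i >= 3 * v_{i+1}: 243, 81, 27, 9, 3, 1.
v = [3 ** (5 - i) for i in range(6)]
print(min_gap(v), v[-1])

# Without the decay condition, two different sign patterns can collide.
print(min_gap([1, 1]))
```

For the powers of three every pair of distinct sign patterns is separated by at least the smallest weight, while for equal weights the patterns (+1, −1) and (−1, +1) give the same sum.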


slide-141
SLIDE 141

Just a few more steps ...

If $\sigma_T = \sqrt{\sum_{i=L(\epsilon)}^{n} w_i^2}$, then w.h.p. $\left| \theta - \sum_{i=1}^{L(\epsilon)} w_i x_i \right| \ge \sigma_T/(4\epsilon)$.

In such a situation, unless $\left| \sum_{i=L(\epsilon)}^{n} w_i x_i \right| > \sigma_T/(4\epsilon)$, the output of the halfspace is completely decided by the first L(ε) variables.

But Chebyshev tells us that $\left| \sum_{i=L(\epsilon)}^{n} w_i x_i \right| \le \sigma_T/(4\epsilon)$ w.h.p.

Theorem. Any (L(ε) + 2)-wise independent distribution O(ε)-fools any halfspace with critical index more than L(ε).
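The Chebyshev step can be made explicit. A sketch, writing T for the tail sum of the $w_i x_i$ and assuming only that the tail variables are pairwise independent uniform signs, so that E[T] = 0 and Var[T] = $\sigma_T^2$:

```latex
\Pr\left[ |T| > \frac{\sigma_T}{4\epsilon} \right]
  = \Pr\left[ |T - E[T]| > k\,\sigma_T \right]
  \le \frac{1}{k^2} = 16\epsilon^2,
\qquad k = \frac{1}{4\epsilon}.
```

Since the variance computation uses nothing beyond pairwise independence, this is presumably where the extra "+2" in the independence requirement comes from: L(ε) variables are fixed in the head, and two more degrees of independence suffice for the tail.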


slide-145
SLIDE 145

Done!

Theorem. Any K(ε)-wise independent distribution O(ε)-fools any halfspace, where $K(\epsilon) = O\left(\frac{1}{\epsilon^2} \log^2(1/\epsilon)\right)$.

With the constants made explicit: any K(ε)-wise independent distribution 12ε-fools any halfspace, where $K(\epsilon) = 181923848 \cdot \frac{1}{\epsilon^2} \log^2(1/\epsilon)$,

i.e. the result is non-trivial only if $n > 2^{42} = 4398046511104$.

The results are tight:

Theorem ([BGGP]). There exists a C > 0 such that for every k ≥ 2,

$\max_{D \in A(n,k)} \left| \Pr_{x \in D}[\mathrm{Maj}(x) = 1] - \frac{1}{2} \right| \ge \frac{C}{\sqrt{k} \log k}$.

Easier to verify for k = n − 1.


slide-157
SLIDE 157

Post [DGJ+09] ...

[KNW10] give an alternate proof of the [DGJ+09] result based on new techniques; there is some worsening of parameters: $K(\epsilon) = \epsilon^{-2} \log^{2+o(1)}(1/\epsilon)$.

[DKN] extend the ideas used in [KNW10] to show that thresholded quadratic polynomials can be ε-fooled by $\tilde{\Omega}(\epsilon^{-9})$-wise independence.

The result extends to intersections of a constant number of halfspaces; the dependence on the number of halfspaces is polynomial.


slide-160
SLIDE 160

Post [DGJ+09] ...

[MZ] give explicit pseudorandom generators with seed length $2^{O(d)} \log n / \epsilon^{8d+3}$ for thresholded polynomials of degree d.

The construction gives improved PRG constructions for halfspaces, with seed length O(log n log(1/ε)) for ε = Ω(1/poly(n)) and seed length O(log n) for ε = Ω(1/poly(log n)).

However, non-explicit arguments show the existence of PRGs with seed length O(d log n + log(1/ε)) that fool degree-d polynomial threshold functions [MZ].


slide-164
SLIDE 164

Post [DGJ+09] ...

[GOWZ10] consider fooling functions of halfspaces under various product distributions.

They give a modification of the [MZ] construction that yields a PRG with seed length O((d log(ds/ε) + log n) · log(ds/ε)) for arbitrary decision trees of halfspaces of size s and depth d,

i.e. $\mathrm{TC}^0$ can be fooled by a seed length of $O(\log^2(n/\epsilon))$.

They also extend the construction given in [DGJ+09] to show that $\tilde{O}(d^4 s^2/\epsilon^2)$-wise independence fools arbitrary decision trees of halfspaces of size s and depth d.

[HKM09] do slightly better at fooling intersections of k regular halfspaces, using seed length $O\left(\epsilon^{-5} \log n \, \log^{9.1} k \, \log(1/\epsilon)\right)$.

slide-170
SLIDE 170

References

Louay Bazzi. Polylogarithmic independence can fool DNF formulas. In IEEE Symposium on Foundations of Computer Science, 2007.

Itai Benjamini, Ori Gurel-Gurevich, and Ron Peled. On k-wise independent events and percolation. Available at http://www.cims.nyu.edu/~peled/homepage_files/K-wise_extended_abstract_2.pdf.

Mark Braverman. Poly-logarithmic independence fools AC0 circuits. In IEEE Conference on Computational Complexity, 2009.

Corinna Cortes, Leonid Kontorovich, and Mehryar Mohri. Learning Languages with Rational Kernels. In Computational Learning Theory, 2007.

Ilias Diakonikolas, Parikshit Gopalan, Ragesh Jaiswal, Rocco Servedio, and Emanuele Viola. Bounded Independence fools Halfspaces. In IEEE Symposium on Foundations of Computer Science, 2009.

Ilias Diakonikolas, Daniel M. Kane, and Jelani Nelson. Bounded Independence Fools Degree-2 Threshold Functions. Available at http://math.harvard.edu/~dankane/deg2ptf.pdf.

Parikshit Gopalan, Ryan O'Donnell, Yi Wu, and David Zuckerman. Fooling functions of halfspaces under product distributions. Technical Report TR10-006, Electronic Colloquium on Computational Complexity, 2010.

Johan Håstad. On the size of weights for threshold gates. SIAM Journal on Discrete Mathematics, 7(3):484–492, 1994.

Prahladh Harsha, Adam Klivans, and Raghu Meka. An Invariance Principle for Polytopes. Technical Report TR09-144, Electronic Colloquium on Computational Complexity, 2009.

Daniel Kane, Jelani Nelson, and David Woodruff. On the Exact Space Complexity of Sketching and Streaming Small Norms. In 21st Annual ACM-SIAM Symposium on Discrete Algorithms, 2010.

Adam R. Klivans and Rocco A. Servedio. Learning DNF in time $2^{\tilde{O}(n^{1/3})}$. Journal of Computer and System Sciences, 68(2):303–318, 2004.

Wolfgang Maass and György Turán. How fast can a threshold gate learn? In Computational Learning Theory and Natural Learning Systems, volume I: Constraints and Prospects, pages 381–414. The MIT Press, 1994.

Raghu Meka and David Zuckerman. Pseudorandom Generators for Polynomial Threshold Functions. Available at http://arxiv.org/abs/0910.4122.

Marvin Minsky and Seymour Papert. Perceptrons. The MIT Press, 1969.

Rocco A. Servedio. Every linear threshold function has a low-weight approximator. Computational Complexity, 16(2):180–209, 2007.

Alexander A. Razborov. A Simple Proof of Bazzi's theorem. Technical Report TR08-081, Electronic Colloquium on Computational Complexity, 2008.