Families of Mass Destruction, by Shagnik Das and Tamás Mészáros



slide-1
SLIDE 1

Families of Mass Destruction

Shagnik Das¹, Tamás Mészáros¹

¹ Freie Universität Berlin

Symposium Diskrete Mathematik, Berlin 15th July 2016

slide-2
SLIDE 2

Introducing the problem Some initial bounds An exact solution Concluding remarks

Shattering sets

Definition (Shattering)

A family $\mathcal{F}$ shatters a set $A$ if its members intersect $A$ in every possible way: $\{F \cap A : F \in \mathcal{F}\} = 2^A$. Given a family $\mathcal{F}$, $\mathrm{Sh}(\mathcal{F})$ denotes the collection of sets it shatters.

Definition (Families)

Ground set: $[n] = \{1, 2, \ldots, n\}$. A family $\mathcal{F}$ is a collection of subsets: $\mathcal{F} \subseteq 2^{[n]}$. It is $k$-uniform if all members have size $k$: $\mathcal{F} \subseteq \binom{[n]}{k}$.

A (small) example

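Let F be the family {{1}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 5}, {1, 3, 4}, {2, 4, 5}, {3, 4, 5}, {1, 3, 4, 5}}. Then {1, 3, 5} is shattered: intersecting the nine members with {1, 3, 5} gives ∅, {1}, {3}, {5}, {1, 3}, {1, 5}, {3, 5} and {1, 3, 5}.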

  • S. Das

FU Berlin


slide-6
SLIDE 6


A (small) example

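Let F be the family {{1}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 5}, {1, 3, 4}, {2, 4, 5}, {3, 4, 5}, {1, 3, 4, 5}}. Then {1, 2, 3} is not shattered: every member meets {1, 2, 3} and none contains it, so the intersections ∅ and {1, 2, 3} never occur.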
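The definition of shattering can be checked mechanically on this small family. The following is a minimal sketch; the helper names `trace` and `is_shattered` are illustrative and do not come from the talk.

```python
def trace(family, a):
    """Return the distinct intersections F ∩ A over all F in the family."""
    a = frozenset(a)
    return {frozenset(f) & a for f in family}


def is_shattered(family, a):
    """A is shattered when all 2^|A| subsets of A occur as intersections."""
    return len(trace(family, a)) == 2 ** len(set(a))


# The nine-set family from the example.
FAMILY = [{1}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 5}, {1, 3, 4},
          {2, 4, 5}, {3, 4, 5}, {1, 3, 4, 5}]

print(is_shattered(FAMILY, {1, 2}))     # True
print(is_shattered(FAMILY, {1, 2, 3}))  # False
```

Every intersection is a subset of A, so counting distinct intersections is enough: there are 2^|A| of them exactly when every subset appears.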


slide-8
SLIDE 8


VC dimension and the Sauer–Shelah Lemma

Definition (VC dimension)

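The VC dimension of a family $\mathcal{F}$, denoted $\dim_{VC}(\mathcal{F})$, is the size of the largest set it shatters: $\dim_{VC}(\mathcal{F}) = \max\{|A| : A \in \mathrm{Sh}(\mathcal{F})\}$.

Lemma (Sauer, Shelah (1972))

For any set family $\mathcal{F} \subseteq 2^{[n]}$, $|\mathrm{Sh}(\mathcal{F})| \ge |\mathcal{F}|$, and the bound is best possible.

Corollary

If $\dim_{VC}(\mathcal{F}) < k$, then $|\mathcal{F}| \le \sum_{i=0}^{k-1} \binom{n}{i}$.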

Many applications: computational geometry, machine learning, ...
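For small ground sets, the lemma and its corollary can be sanity-checked by brute force. The sketch below enumerates Sh(F) directly; the function names are illustrative, and the nine-set family is the one from the earlier example on the ground set [5].

```python
from itertools import combinations
from math import comb


def shattered_sets(family, n):
    """List every subset of {1..n} that the family shatters (brute force)."""
    fam = [frozenset(f) for f in family]
    found = []
    for size in range(n + 1):
        for a in map(frozenset, combinations(range(1, n + 1), size)):
            if len({f & a for f in fam}) == 2 ** size:
                found.append(a)
    return found


def vc_dimension(family, n):
    """Size of the largest shattered set; the family must be non-empty."""
    return max(len(a) for a in shattered_sets(family, n))


def sauer_bound(n, k):
    """Corollary bound: sum of C(n, i) for i = 0..k-1."""
    return sum(comb(n, i) for i in range(k))


FAMILY = [{1}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 5}, {1, 3, 4},
          {2, 4, 5}, {3, 4, 5}, {1, 3, 4, 5}]
sh = shattered_sets(FAMILY, 5)
d = vc_dimension(FAMILY, 5)
print(len(sh) >= len(FAMILY))                # lemma: True
print(len(FAMILY) <= sauer_bound(5, d + 1))  # corollary with k = d + 1: True
```

A family closed under taking subsets, such as {∅, {1}, {2}, {3}}, shatters exactly its own members, which is one way to see that the bound cannot be improved.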


slide-12
SLIDE 12


Maximising the number of shattered sets

Question

Which families of m sets maximise the number of shattered sets?

Why maximise?

1. Machines are dangerously smart
2. An old problem with a lot of history
3. Quite a bit of fun

A rose by any other name...

◮ Universal sets, k-independent sets, covering arrays, k-faulty systems
◮ Used for software testing, derandomisation


slide-17
SLIDE 17


Maximally destructive animals

A bull in a china shop


slide-18
SLIDE 18


Maximally destructive animals

Ein Elefant im Porzellanladen (German: an elephant in a porcelain shop)


slide-20
SLIDE 20


A random family

Definition (Random family)

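Let $\mathcal{F}_{\mathrm{rand},m}$ be the family $\{F_1, F_2, \ldots, F_m\}$, where each $F_i$ is a uniformly random set in $2^{[n]}$, chosen independently.

What does $\mathcal{F}_{\mathrm{rand},m}$ shatter?

◮ Suppose $A \subseteq [n]$, $|A| = k$
◮ For each $i$, $F_i \cap A \sim \mathrm{Unif}(2^A)$
◮ Coupon collector problem with $2^k$ coupons
◮ $P(A \in \mathrm{Sh}(\mathcal{F}_{\mathrm{rand},m})) \to 1$ if $m = \omega(k 2^k)$, and $\to 0$ if $m = o(k 2^k)$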
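The coupon-collector view also gives an exact formula for the probability that a fixed k-set is shattered. As a sketch: each of the m sets lands on one of the 2^k possible intersections uniformly and independently, so inclusion-exclusion over the missed intersections gives the probability below. The function name is illustrative.

```python
from fractions import Fraction
from math import comb


def shatter_probability(k, m):
    """Exact P(all 2^k intersections appear among m uniform independent draws)."""
    n_traces = 2 ** k
    total = Fraction(0)
    for j in range(n_traces + 1):
        # j = number of intersections forced to be missed
        total += (-1) ** j * comb(n_traces, j) * Fraction(n_traces - j, n_traces) ** m
    return total


print(shatter_probability(2, 4))  # 3/32
for m in (8, 24, 100):
    print(m, float(shatter_probability(3, m)))
```

With k = 3 the probability moves from small to close to 1 as m passes through the order of k·2^k, in line with the threshold stated above.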


slide-24
SLIDE 24


A random family (continued)

Observation

◮ We need $2^k$ sets to shatter a $k$-set
◮ A family of $m$ sets shatters no set of size $k > \log m$

What does $\mathcal{F}_{\mathrm{rand},m}$ shatter?

| Set size $k$ | Sets shattered? |
|---|---|
| $\log m - \log\log m - \omega(1)$ | Almost all |
| $\log m - \log\log m + \omega(1)$ | Very few |
| $> \log m$ | None |


slide-26
SLIDE 26


A random family (continued)

[Figure: a scale of set sizes up to $n$, with $\log m - \log\log m$ and $\log m$ marked]

Could we do better?

◮ Almost all sets of size $\le \log m$ have size $\lfloor \log m \rfloor$
◮ $\mathcal{F}_{\mathrm{rand},m}$ shatters very few sets of this size


slide-28
SLIDE 28


The uniform setting

Definition

Given $0 \le k \le n$ and $0 \le m \le 2^n$, let $\mathrm{sh}_{\max}(n, k, m)$ denote the maximum number of $k$-sets in $[n]$ that can be shattered by a family of size $m$.

The smallest families

We need $2^k$ sets to shatter a single $k$-set. Can we shatter many more? What is $\mathrm{sh}_{\max}(n, k, 2^k)$?

Universal families

How many sets are needed to shatter all $k$-sets? That is, what is $m_0(n, k) = \min\{m : \mathrm{sh}_{\max}(n, k, m) = \binom{n}{k}\}$?
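For very small parameters, $\mathrm{sh}_{\max}(n, k, m)$ can be computed by exhaustive search straight from the definition. The sketch below tries every family of m distinct subsets of [n], so it is only practical for n up to about 4; the function name is illustrative.

```python
from itertools import combinations


def sh_max_bruteforce(n, k, m):
    """Maximum number of shattered k-sets over all families of m distinct subsets of [n]."""
    ground = range(1, n + 1)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(ground, r)]
    k_sets = [frozenset(c) for c in combinations(ground, k)]
    best = 0
    for family in combinations(subsets, m):
        count = sum(1 for a in k_sets if len({f & a for f in family}) == 2 ** k)
        best = max(best, count)
    return best


print(sh_max_bruteforce(3, 2, 4))  # 3
print(sh_max_bruteforce(4, 2, 4))  # 5
```

With four sets, all three pairs of [3] can be shattered at once, but only five of the six pairs of [4], so $m_0(3, 2) = 4$ while $m_0(4, 2) > 4$.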

slide-31
SLIDE 31


Universal families

Theorem (Rényi, 1971)

$m_0(n, 2) = \min\left\{m : n \le \binom{m-1}{\lceil m/2 \rceil}\right\} \sim \log n + \frac{1}{2} \log\log n$.

Theorem (Kleitman–Spencer, 1973)

$\Omega(2^k \log n) \le m_0(n, k) \le O(k 2^k \log n)$.

Theorem (Alon, 1986)

$m_0(n, k) \le 2^{O(k^4)} \log n$, via an explicit construction.

Applications: experimental design, derandomisation
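The formula for pairs is easy to evaluate numerically. The sketch below searches for the smallest m satisfying the binomial condition and prints it next to the asymptotic estimate. The function name is illustrative, and base-2 logarithms are an assumption, since the slide does not state the base.

```python
from math import comb, log2


def m0_pairs(n):
    """Smallest m with n <= C(m - 1, ceil(m / 2)), for n >= 2."""
    m = 1
    while comb(m - 1, (m + 1) // 2) < n:  # (m + 1) // 2 equals ceil(m / 2)
        m += 1
    return m


for n in (3, 4, 10, 11):
    print(n, m0_pairs(n))

n = 10 ** 6
print(m0_pairs(n), log2(n) + 0.5 * log2(log2(n)))  # exact value next to the estimate
```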


slide-34
SLIDE 34


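Shattering with $2^k$ sets

A random fact

◮ If $A \in \binom{[n]}{k}$, then $P(A \in \mathrm{Sh}(\mathcal{F}_{\mathrm{rand},2^k})) = \frac{(2^k)!}{(2^k)^{2^k}} \approx e^{-2^k}$
◮ $\Rightarrow \mathrm{sh}_{\max}(n, k, 2^k) \ge e^{-2^k} \binom{n}{k}$.

[Figure: the ground set partitioned into parts $P_1, P_2, \ldots, P_k$]

Beating random

$\Rightarrow \mathrm{sh}_{\max}(n, k, 2^k) \ge \left(\frac{n}{k}\right)^k \approx e^{-k} \binom{n}{k}$.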
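The $(n/k)^k$ bound can be reproduced with a short sketch. It rests on an assumption about the pictured construction that the slide does not spell out: that [n] is split into k equal parts and the family consists of all $2^k$ unions of parts. Under that assumption, a k-set is shattered exactly when it takes one element from each part. Function names are illustrative.

```python
from itertools import combinations


def union_of_parts_family(n, k):
    """Split {1..n} into k consecutive equal parts; return all 2^k unions of parts."""
    size = n // k  # assumes k divides n
    parts = [frozenset(range(i * size + 1, (i + 1) * size + 1)) for i in range(k)]
    family = []
    for r in range(k + 1):
        for chosen in combinations(parts, r):
            family.append(frozenset().union(*chosen))
    return family


def count_shattered(family, n, k):
    """Number of k-subsets of {1..n} shattered by the family."""
    total = 0
    for a in map(frozenset, combinations(range(1, n + 1), k)):
        if len({f & a for f in family}) == 2 ** k:
            total += 1
    return total


print(count_shattered(union_of_parts_family(6, 2), 6, 2))  # 9 = (6/2)^2
print(count_shattered(union_of_parts_family(6, 3), 6, 3))  # 8 = (6/3)^3
```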

slide-46
SLIDE 46


Our main result

Theorem (D.–Mészáros, 2016+)

If $(2^k - 1) \mid n$, we have

$\mathrm{sh}_{\max}(n, k, 2^k) = \left(\frac{n}{2^k - 1}\right)^k \cdot \frac{2^{k^2}}{k!} \prod_{i=1}^{k} \left(1 - 2^{-i}\right)$.

◮ As soon as we can shatter a single $k$-set, we can shatter a constant proportion of all $k$-sets
◮ Optimal family is explicitly constructed


slide-47
SLIDE 47


Our main result

Theorem (D.–Mészáros, 2016+)

If $n \gg k 2^k$, we have

$\mathrm{sh}_{\max}(n, k, 2^k) = (1 + o(1))\, c_k \binom{n}{k}$,

where $c_k = \prod_{i=1}^{k} \left(1 - 2^{-i}\right) \to 0.288788\ldots$ as $k \to \infty$.
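Both forms of the theorem are easy to evaluate. The sketch below computes $c_k$ exactly and evaluates the exact expression $(n/(2^k-1))^k \cdot 2^{k^2}/k! \cdot c_k$ for the case where $2^k - 1$ divides $n$. Function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def c(k):
    """c_k = prod_{i=1}^{k} (1 - 2^{-i}) as an exact fraction."""
    value = Fraction(1)
    for i in range(1, k + 1):
        value *= 1 - Fraction(1, 2 ** i)
    return value


def sh_max_exact(n, k):
    """Value of sh_max(n, k, 2^k) from the exact formula; needs (2^k - 1) | n."""
    q = 2 ** k - 1
    if n % q != 0:
        raise ValueError("the exact formula needs 2^k - 1 to divide n")
    return Fraction(n, q) ** k * Fraction(2 ** (k * k), factorial(k)) * c(k)


print(sh_max_exact(6, 2), comb(6, 2))  # 12 of the 15 pairs
print(sh_max_exact(7, 3), comb(7, 3))  # 28 of the 35 triples
print(float(c(60)))                    # close to 0.288788
```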


slide-49
SLIDE 49


A dual perspective

Definition (Incidence vector)

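Given $i \in [n]$ and a family $\mathcal{F} = \{F_1, F_2, \ldots, F_m\}$ over $[n]$, the incidence vector $w_i \in \{\pm 1\}^m$ is defined by

$w_{i,j} = \begin{cases} +1 & \text{if } i \in F_j, \\ -1 & \text{if } i \notin F_j. \end{cases}$

Observation

The set $A = \{a_1, a_2, \ldots, a_k\}$ is shattered if and only if the $m \times k$ matrix $\begin{pmatrix} w_{a_1} & w_{a_2} & \cdots & w_{a_k} \end{pmatrix}$ has all $2^k$ vectors in $\{\pm 1\}^k$ appearing as rows.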
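The dual test translates directly into code. The sketch below builds the incidence vectors and checks the row condition for a non-empty set A; the function names are illustrative, and the family is the nine-set example from the start of the talk.

```python
from itertools import product


def incidence_vectors(family, n):
    """Map each element i of {1..n} to its +1/-1 vector over the members of the family."""
    return {i: tuple(1 if i in f else -1 for f in family) for i in range(1, n + 1)}


def is_shattered_dual(family, n, a):
    """Check that the columns w_a (a in A, A non-empty) show every sign pattern as a row."""
    w = incidence_vectors(family, n)
    columns = [w[i] for i in sorted(a)]
    rows = set(zip(*columns))  # one row per member of the family
    return rows == set(product((1, -1), repeat=len(columns)))


FAMILY = [{1}, {1, 2}, {2, 3}, {2, 4}, {1, 2, 5}, {1, 3, 4},
          {2, 4, 5}, {3, 4, 5}, {1, 3, 4, 5}]
print(is_shattered_dual(FAMILY, 5, {1, 3, 5}))  # True
print(is_shattered_dual(FAMILY, 5, {1, 2, 3}))  # False
```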


slide-51
SLIDE 51


An auxiliary hypergraph

Definition (Shatter hypergraph)

The $k$-uniform shatter hypergraph $H_k = (V_k, E_k)$ has

◮ $V_k = \{\pm 1\}^{2^k}$
◮ $E_k = \left\{\{v_1, \ldots, v_k\} : \begin{pmatrix} v_1 & \cdots & v_k \end{pmatrix} \text{ has all } 2^k \text{ distinct rows}\right\}$.

Key properties

◮ Positive degree $\Leftrightarrow$ exactly $2^{k-1}$ coordinates equal to $+1$
◮ Link hypergraph of such a vertex: $H_{k-1} \otimes H_{k-1}$
◮ Positive codegree $\Rightarrow$ vectors orthogonal
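For k = 2 the shatter hypergraph is small enough to enumerate, which gives a quick check of the degree and codegree properties. The sketch below is a brute-force illustration only; the function name is illustrative, and the vertex count already makes k = 3 slow.

```python
from itertools import combinations, product


def shatter_edges(k):
    """All edges of H_k: k-sets of vectors in {+1,-1}^(2^k) whose matrix has 2^k distinct rows."""
    m = 2 ** k
    vertices = list(product((1, -1), repeat=m))
    return [vs for vs in combinations(vertices, k) if len(set(zip(*vs))) == m]


edges = shatter_edges(2)
covered = {v for e in edges for v in e}
balanced = {v for v in product((1, -1), repeat=4) if sum(v) == 0}

print(len(edges))           # 12
print(covered == balanced)  # True: positive degree iff exactly two +1 entries
print(all(sum(x * y for x, y in zip(u, v)) == 0 for u, v in edges))  # True
```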


slide-53
SLIDE 53


Hadamard matrices

Definition (Sylvester’s Hadamard matrices)

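The matrix $M_k$ is an orthogonal $2^k \times 2^k$ matrix defined recursively by $M_0 = \begin{pmatrix} 1 \end{pmatrix}$ and, for $k \ge 0$,

$M_{k+1} = \begin{pmatrix} M_k & M_k \\ M_k & -M_k \end{pmatrix}$.

$M_0 = \begin{pmatrix} 1 \end{pmatrix}$, $M_1 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $M_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$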


slide-56
SLIDE 56


The construction

Construction

1. Start with the matrix $M_k$
2. Delete the first (all '+1') column
3. Let each of the $2^k - 1$ remaining columns be the incidence vector for a set $S_i$ of $\frac{n}{2^k - 1}$ elements

$\begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$

The construction for $k = 2$


slide-59
SLIDE 59


    1 1 1 − 1 1 − 1 1 − 1 − 1 − 1 − 1 1     S1 S2 S3 The construction for k = 2


slide-64
SLIDE 64


Counting shattered sets

The construction

◮ Hadamard recursion + induction + symmetry $\Rightarrow$ vectors in the construction induce $e_k$ edges in $H_k$, where $e_k = \frac{2^{k^2}}{k!} \prod_{i=1}^{k} \left(1 - 2^{-i}\right)$.
◮ Blow-up $\Rightarrow$ $\left(\frac{n}{2^k - 1}\right)^k \cdot \frac{2^{k^2}}{k!} \prod_{i=1}^{k} \left(1 - 2^{-i}\right)$ sets shattered

The upper bound

◮ Upper bound: determine Lagrangian of $H_k$
◮ Link graphs obey similar recursion as Hadamard matrices
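The edge count $e_k$ can be compared against a direct count for small k. The sketch below counts the k-subsets of the $2^k - 1$ non-constant columns of $M_k$ whose matrix has all $2^k$ rows distinct, and sets that against the closed formula. Function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def sylvester(k):
    """Sylvester's Hadamard matrix M_k as a list of rows."""
    matrix = [[1]]
    for _ in range(k):
        matrix = [r + r for r in matrix] + [r + [-x for x in r] for r in matrix]
    return matrix


def e_formula(k):
    """e_k = 2^(k^2) / k! * prod_{i=1}^{k} (1 - 2^{-i})."""
    value = Fraction(2 ** (k * k), factorial(k))
    for i in range(1, k + 1):
        value *= 1 - Fraction(1, 2 ** i)
    return value


def e_bruteforce(k):
    """Count k-sets of non-constant columns of M_k that form an edge of H_k."""
    m = sylvester(k)
    columns = [tuple(row[c] for row in m) for c in range(1, 2 ** k)]
    return sum(1 for vs in combinations(columns, k) if len(set(zip(*vs))) == 2 ** k)


for k in range(1, 5):
    print(k, e_bruteforce(k), e_formula(k))
```

Multiplying $e_k$ by $(n/(2^k-1))^k$ then accounts for the blow-up, since each edge among the columns yields one shattered set per choice of representatives from the k blocks involved.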


slide-66
SLIDE 66


Some final thoughts

Universal constructions

◮ Random copies of Hadamard construction: $m_0(n, k) = O(k 2^k \log n)$
◮ Constant worse than for $\mathcal{F}_{\mathrm{rand},m}$
◮ But uses less randomness
◮ Can we use the Hadamard construction more efficiently?

Larger families

◮ What is $\mathrm{sh}_{\max}(n, k, m)$ for $m \ge 2^k + 1$?
◮ Lose linear algebraic characterisation of shattering
