Using Stein’s method to show Poisson and normal limit laws for fringe trees



slide-1
SLIDE 1

Using Stein’s method to show Poisson and normal limit laws for fringe trees

Cecilia Holmgren, Stockholm University
Joint work with Svante Janson, Uppsala University
AofA, Paris, June 19, 2014

slide-2
SLIDE 2

Aim of Study

◮ To show new general results, as well as provide more direct proofs of earlier results, on fringe trees (i.e., subtrees consisting of some node and all its descendants) in binary search trees and random recursive trees.

◮ To apply Stein’s method with couplings (as described by Barbour, Holst and Janson) in the study of fringe trees.

◮ To see whether the general results on fringe trees could lead to simple solutions of various problems on random trees, for example the asymptotic distribution of the number of protected nodes in the binary search tree.

slide-3
SLIDE 3

The Binary Search Tree

Start by drawing a number, a so-called key, from the set {1, 2, 3, . . . , 10} and place it in the root.

slide-4
SLIDE 4

The Binary Search Tree

Start by drawing a number, a so-called key, from the set {1, 2, 3, . . . , 10} and place it in the root.

6

slide-5
SLIDE 5

The Binary Search Tree

Draw a new number/key from the remaining numbers in the set {1, 2, 3, . . . , 10}. Compare the new key to the root’s key. If it is smaller/larger it is associated with the left/right child.

6 3

slide-6
SLIDE 6

The Binary Search Tree

Continue to draw new keys and start the comparison from the root.

6 3 8

slide-7
SLIDE 7

The Binary Search Tree

6 3 8 10

slide-8
SLIDE 8

The Binary Search Tree

6 3 8 5 10

slide-9
SLIDE 9

The Binary Search Tree

6 3 8 1 5 10

slide-10
SLIDE 10

The Binary Search Tree

6 3 8 1 5 10 4

slide-11
SLIDE 11

The Binary Search Tree

6 3 8 1 5 7 10 4

slide-12
SLIDE 12

The Binary Search Tree

6 3 8 1 5 7 10 4 9

slide-13
SLIDE 13

The Binary Search Tree

6 3 8 1 5 7 10 2 4 9
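
The construction on the preceding slides can be replayed in a few lines. This is only a sketch: the insertion order 6, 3, 8, 10, 5, 1, 4, 7, 9, 2 is read off the figures, and the names `build_bst` and `level_order` are illustrative, not from the talk.

```python
# Insertion order read off the slides (an assumption based on the figures).
ORDER = [6, 3, 8, 10, 5, 1, 4, 7, 9, 2]

def build_bst(order):
    """Insert keys one at a time, always starting the comparison at the root.
    Returns (root, left, right), where left/right map a key to its child key or None."""
    root = order[0]
    left, right = {root: None}, {root: None}
    for key in order[1:]:
        node = root
        while True:
            side = left if key < node else right  # smaller keys go left, larger go right
            if side[node] is None:
                side[node] = key
                left[key] = None
                right[key] = None
                break
            node = side[node]
    return root, left, right

def level_order(root, left, right):
    """Keys listed level by level, left to right (the order used in the figures)."""
    out, queue = [], [root]
    while queue:
        node = queue.pop(0)
        out.append(node)
        queue.extend(c for c in (left[node], right[node]) if c is not None)
    return out

root, left, right = build_bst(ORDER)
print(level_order(root, left, right))  # [6, 3, 8, 1, 5, 7, 10, 2, 4, 9]
```

The printed level order agrees with the key sequence shown under the final tree.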

slide-14
SLIDE 14

Construction with Allocated Random Time Stamps

◮ We use Devroye’s representation of the binary search tree $T_n$. We interpret the permutation as assigning a random time stamp $U_k$ to each key $k$, describing when it is inserted. We sometimes write $(k, U_k)$ to denote this connection.

◮ This tree is constructed from $(1, U_1), \dots, (10, U_{10})$, where

$1 = U_6 < U_3 < U_8 < U_{10} < U_5 < U_1 < U_4 < U_7 < U_9 < U_2 = 10$.

6 3 8 1 5 7 10 2 4 9

slide-15
SLIDE 15

Construction with Allocated Random Time Stamps

The unique binary search tree constructed from $(1, U_1), \dots, (n, U_n)$ has two characterizing properties.

◮ It is a binary search tree with respect to the first coordinates of the pairs.

◮ Along every path down from the root, the values $U_i$ are increasing.

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]
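
The two characterizing properties suggest a direct recursive construction: the pair with the smallest time stamp is the root, and the remaining pairs split by key. The sketch below uses the time stamps of the running example; the function names and the nested-tuple layout are illustrative assumptions.

```python
# Time stamps of the example: key -> U_key.
STAMPS = {1: 6, 2: 10, 3: 2, 4: 7, 5: 5, 6: 1, 7: 8, 8: 3, 9: 9, 10: 4}

def build_from_stamps(pairs):
    """Return the tree as nested tuples (key, stamp, left, right), or None if empty.
    The pair with the smallest time stamp becomes the root."""
    if not pairs:
        return None
    key, stamp = min(pairs, key=lambda p: p[1])
    return (key, stamp,
            build_from_stamps([p for p in pairs if p[0] < key]),
            build_from_stamps([p for p in pairs if p[0] > key]))

def inorder(t):
    """Keys in symmetric order; sorted exactly when t is a binary search tree."""
    return [] if t is None else inorder(t[2]) + [t[0]] + inorder(t[3])

def stamps_increase(t):
    """True if every child carries a larger time stamp than its parent."""
    if t is None:
        return True
    return all(c is None or (c[1] > t[1] and stamps_increase(c)) for c in (t[2], t[3]))

tree = build_from_stamps(list(STAMPS.items()))
print(tree[0], inorder(tree), stamps_increase(tree))
```

Both properties hold for the example, and the root is key 6, the key with time stamp 1.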

slide-16
SLIDE 16

What is a Fringe Tree?

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-17
SLIDE 17

A Subtree is Associated with Keys and Time Stamps

◮ The fringe tree rooted at 3 in the example is associated with (1, 6), (2, 10), (3, 2), (4, 7), (5, 5).

◮ We use “subtree” and “fringe tree” as synonyms.

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-18
SLIDE 18

Functions of Subtrees

◮ Each node $u$ in the binary search tree can thus be associated with a subset $T_n(u)$ of $\{(1, U_1), \dots, (n, U_n)\}$.

◮ Let $f(T)$ be a function from the set of (unlabelled) binary trees to $\mathbb{R}$. Set

$$X_n = \sum_{u} f(T_n(u)),$$

summing over all nodes $u$ in $T_n$. Since the shape of the subtree rooted at $u$ is determined by $T_n(u)$, we can use $X_n$ to count the subtrees with properties that interest us by choosing appropriate functions $f$.

slide-19
SLIDE 19

Examples of Functions of Subtrees

Let $f(T)$ be a function from the set of (unlabelled) binary trees to $\mathbb{R}$. Set

$$X_n = \sum_{u} f(T_n(u)).$$

Examples:

◮ Let $f(T_n(u)) = \mathbf{1}\{T_n(u) \approx T\}$. Then $X_n$ is the number of subtrees that are equal to $T$ (since each permutation uniquely determines a subtree shape).

◮ Let $f(T_n(u)) = \mathbf{1}\{|T_n(u)| = k\}$. Then $X_n$ is the number of subtrees with exactly $k$ nodes.

◮ Let $f(T_n(u)) = \mathbf{1}\{|T_n(u)| = 1\}$. Then $X_n$ is the number of leaves.
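
The three examples can be evaluated on the running example tree. The code is a sketch: the helper names (`bst`, `fringe_trees`, `shape`, `X`) and the nested-tuple representation are assumptions made for illustration.

```python
def bst(order):
    """Binary search tree as nested tuples (key, left, right) from an insertion order."""
    if not order:
        return None
    root = order[0]
    return (root,
            bst([x for x in order if x < root]),
            bst([x for x in order if x > root]))

def size(t):
    return 0 if t is None else 1 + size(t[1]) + size(t[2])

def fringe_trees(t):
    """Yield the fringe tree rooted at every node of t."""
    if t is not None:
        yield t
        yield from fringe_trees(t[1])
        yield from fringe_trees(t[2])

def shape(t):
    """Forget the keys: the unlabelled shape as nested pairs (left, right)."""
    return None if t is None else (shape(t[1]), shape(t[2]))

def X(t, f):
    """X_n = sum of f over all fringe trees of t."""
    return sum(f(s) for s in fringe_trees(t))

tree = bst([6, 3, 8, 10, 5, 1, 4, 7, 9, 2])
LEAF = (None, None)
LEFT_ONLY = (LEAF, None)  # a root with a single left child

leaves = X(tree, lambda s: size(s) == 1)
by_size = {k: X(tree, lambda s, k=k: size(s) == k) for k in range(1, 11)}
left_only = X(tree, lambda s: shape(s) == LEFT_ONLY)
print(leaves, by_size, left_only)
```

In the example there are 4 leaves (keys 2, 4, 7, 9), 3 fringe trees of size 2, and 2 fringe trees whose shape is a root with only a left child (rooted at 5 and 10).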

slide-20
SLIDE 20

Requirements for Being A Subtree

◮ Write $\sigma(i, k) = \{(i, U_i), \dots, (i + k - 1, U_{i+k-1})\}$ for $k \ge 1$ and $1 \le i \le n - k + 1$.

◮ We define the indicator variable $I_{i,k} = \mathbf{1}\{\sigma(i, k) \text{ is a subtree in } T_n\}$. Defining $U_0 = U_{n+1} = 0$, we see that

$$I_{i,k} = \mathbf{1}\{U_{i-1} \text{ and } U_{i+k} \text{ are the two smallest of } U_{i-1}, \dots, U_{i+k}\}.$$

◮ Note that we have two boundary cases, $i = 1$ and $i = n - k + 1$.
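
The indicator can be evaluated directly from the time stamps, without building the tree. A minimal sketch on the running example, with the convention $U_0 = U_{n+1} = 0$; the function names are illustrative.

```python
STAMPS = {1: 6, 2: 10, 3: 2, 4: 7, 5: 5, 6: 1, 7: 8, 8: 3, 9: 9, 10: 4}
n = len(STAMPS)

def U(i):
    """Time stamp of key i, with the boundary convention U_0 = U_{n+1} = 0."""
    return 0 if i in (0, n + 1) else STAMPS[i]

def I(i, k):
    """1 if the keys i, ..., i+k-1 form a fringe tree, i.e. both boundary stamps
    U_{i-1} and U_{i+k} are smaller than every stamp strictly between them."""
    inner = min(U(j) for j in range(i, i + k))
    return int(max(U(i - 1), U(i + k)) < inner)

# Number of fringe trees of each size k, summing over the admissible i.
counts = {k: sum(I(i, k) for i in range(1, n - k + 2)) for k in range(1, n + 1)}
print(I(1, 5), I(4, 3), counts)
```

Here `I(1, 5)` is 1 (the keys 1 to 5 form the fringe tree rooted at 3), while `I(4, 3)` is 0 because key 6 holds the smallest time stamp of its window.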

slide-21
SLIDE 21

Requirements for Being A Subtree

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-22
SLIDE 22

Requirements for Being A Subtree

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-23
SLIDE 23

Requirements for Being A Subtree

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-24
SLIDE 24

Requirements for Being A Subtree

[Figure: the example tree; node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-25
SLIDE 25

Cyclic Representation

◮ Devroye’s representation of the tree as $(1, U_1), \dots, (n, U_n)$ is natural and useful, but the terms with $i = 1$ and $i = n - k + 1$ have to be treated specially because of boundary effects.

◮ Boundary effects can be avoided by instead using a cyclic representation.

◮ Let $U_0, \dots, U_n \sim U(0, 1)$ be i.i.d. uniform random variables, extended periodically by $U_{i + k(n+1)} := U_i$ for $i \in \{0, \dots, n\}$, $k \in \mathbb{Z}$.

◮ When discussing these variables, we use the natural metric on $\mathbb{Z}_{n+1}$ defined by

$$|i - j|_{n+1} := \min_{\ell \in \mathbb{Z}} |i - j - \ell (n + 1)|.$$

slide-26
SLIDE 26

Cyclic Representation

Again let

$$I_{i,k} = \mathbf{1}\{U_{i-1} \text{ and } U_{i+k} \text{ are the two smallest of } U_{i-1}, \dots, U_{i+k}\},$$

but now for all $i$ and $k$.

◮ With Devroye’s representation, the number of subtrees of size $k < n$ in a binary search tree with $n$ nodes is equal to $\sum_{i=1}^{n-k+1} I_{i,k}$.

◮ Using the cyclic representation, the number of subtrees of size $k < n$ is equal to $\sum_{i=1}^{n+1} I_{i,k}$; the $I_{i,k}$ are identically distributed and

$$E(I_{i,k}) = \frac{2}{(k+1)(k+2)}.$$

slide-27
SLIDE 27

Cyclic Representation

Recall

$$I_{i,k} = \mathbf{1}\{U_{i-1} \text{ and } U_{i+k} \text{ are the two smallest of } U_{i-1}, \dots, U_{i+k}\}.$$

◮ The sum $\sum_{i=1}^{n+1} I_{i,k}$ is invariant under a cyclic shift of $U_0, \dots, U_n$. If we shift the values $U_0, U_1, U_2, \dots, U_n$ so that $U_0$ is the smallest, we are back in Devroye’s representation, where one can assume that $U_0 = U_{n+1} = 0$ (since only order relations matter for $I_{i,k}$).
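
Both claims about the cyclic representation can be checked exhaustively for a small $n$. Since only the relative order of the $U_i$ matters, the sketch below replaces the uniform variables by all permutations of $0, \dots, n$ (an assumption that is harmless for order statistics); the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def cyclic_count(u, k):
    """Sum over i = 1..n+1 of I_{i,k}, with indices taken modulo n+1 = len(u)."""
    m = len(u)
    total = 0
    for i in range(1, m + 1):
        ends = max(u[(i - 1) % m], u[(i + k) % m])
        inner = min(u[(i + j) % m] for j in range(k))
        total += ends < inner  # both boundary stamps lie below the whole interior
    return total

def bst_count(stamps, k):
    """Number of fringe trees of size k in the tree where key j has stamp stamps[j-1]."""
    def sizes(keys):
        if not keys:
            return []
        root = min(keys, key=lambda j: stamps[j - 1])
        return ([len(keys)]
                + sizes([j for j in keys if j < root])
                + sizes([j for j in keys if j > root]))
    return sizes(list(range(1, len(stamps) + 1))).count(k)

def rotate_min_first(u):
    """Cyclic shift that moves the smallest value to position 0."""
    r = u.index(min(u))
    return u[r:] + u[:r]

n = 5
perms = list(permutations(range(n + 1)))
means = {k: Fraction(sum(cyclic_count(u, k) for u in perms), len(perms))
         for k in range(1, n)}
agree = all(cyclic_count(u, k) == bst_count(rotate_min_first(u)[1:], k)
            for u in perms for k in range(1, n))
print(means, agree)
```

For $n = 5$ the exact means equal $2(n+1)/((k+1)(k+2))$ for every $k < n$, and the cyclic sum agrees with the subtree count of the tree obtained after shifting the minimum to position 0.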

slide-28
SLIDE 28

The Number of Subtrees of size k

The expected value and variance of $X_{n,k} := \sum_{i=1}^{n+1} I_{i,k}$ are easy to calculate using the cyclic representation.

Lemma 1

Let $1 \le k < n$. For the random binary search tree $T_n$,

$$E(X_{n,k}) = \frac{2(n+1)}{(k+1)(k+2)}$$

and

$$\operatorname{Var}(X_{n,k}) =
\begin{cases}
E X_{n,k} - (n+1)\,\dfrac{22k^2 + 44k + 12}{(k+1)(k+2)^2(2k+1)(2k+3)}, & k < \frac{n-1}{2}, \\[2ex]
E X_{n,k} + \dfrac{2}{n} - \dfrac{64}{(n+3)^2}, & k = \frac{n-1}{2}, \\[2ex]
E X_{n,k} - (E X_{n,k})^2 = E X_{n,k} - \dfrac{4(n+1)^2}{(k+1)^2(k+2)^2}, & k > \frac{n-1}{2}.
\end{cases}$$
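
Lemma 1 can be sanity-checked by brute force for small $n$: enumerate all $n!$ insertion orders, count the fringe trees of size $k$ in each tree, and compare the exact mean and variance with the formulas. This is a sketch under the usual assumption that the insertion order is a uniformly random permutation; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def subtree_sizes(order):
    """Sizes of all fringe trees of the BST built by inserting keys in this order."""
    if not order:
        return []
    root = order[0]
    return ([len(order)]
            + subtree_sizes([x for x in order if x < root])
            + subtree_sizes([x for x in order if x > root]))

def exact_moments(n, k):
    """Exact mean and variance of X_{n,k} over all n! insertion orders."""
    counts = [subtree_sizes(p).count(k) for p in permutations(range(1, n + 1))]
    mean = Fraction(sum(counts), len(counts))
    var = Fraction(sum(c * c for c in counts), len(counts)) - mean ** 2
    return mean, var

def lemma1(n, k):
    """Mean and variance as stated in Lemma 1 (three cases for the variance)."""
    mean = Fraction(2 * (n + 1), (k + 1) * (k + 2))
    if 2 * k < n - 1:
        var = mean - Fraction((n + 1) * (22 * k * k + 44 * k + 12),
                              (k + 1) * (k + 2) ** 2 * (2 * k + 1) * (2 * k + 3))
    elif 2 * k == n - 1:
        var = mean + Fraction(2, n) - Fraction(64, (n + 3) ** 2)
    else:
        var = mean - mean ** 2
    return mean, var

for n in range(2, 7):
    for k in range(1, n):
        print(n, k, exact_moments(n, k), exact_moments(n, k) == lemma1(n, k))
```

For $k = 1$ and $n \ge 4$ the first case reduces to $2(n+1)/45$, the variance of the number of leaves.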

slide-29
SLIDE 29

The Random Recursive Tree

Start with a root with label 0.

slide-30
SLIDE 30

The Random Recursive Tree

At stage 1 attach a new node with label 1 to the root.

1

slide-31
SLIDE 31

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 2

slide-32
SLIDE 32

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 2

slide-33
SLIDE 33

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2

slide-34
SLIDE 34

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5

slide-35
SLIDE 35

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5 6

slide-36
SLIDE 36

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5 6 7

slide-37
SLIDE 37

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5 8 6 7

slide-38
SLIDE 38

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5 8 9 6 7

slide-39
SLIDE 39

The Random Recursive Tree

At stage i (i = 1, . . . , 10), attach a new node with label i uniformly at random to one of the previous nodes 0, . . . , i − 1.

1 3 4 2 5 8 9 6 10 7
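
The attachment rule translates directly into code. The sketch below generates a random recursive tree on the nodes 0, ..., 10 and computes the size of every fringe tree; the seed and the function names are arbitrary choices for illustration.

```python
import random

def random_recursive_tree(n, rng):
    """Return parent[i] for nodes 0..n; node 0 is the root. At stage i the new
    node i picks its parent uniformly at random among 0, ..., i-1."""
    parent = {0: None}
    for i in range(1, n + 1):
        parent[i] = rng.randrange(i)
    return parent

def fringe_sizes(parent):
    """Size of the fringe tree (node plus all descendants) rooted at each node."""
    size = {v: 1 for v in parent}
    # Children carry larger labels than their parents, so a descending sweep
    # finishes every subtree before its size is added to the parent.
    for v in sorted(parent, reverse=True):
        if parent[v] is not None:
            size[parent[v]] += size[v]
    return size

parent = random_recursive_tree(10, random.Random(2014))
sizes = fringe_sizes(parent)
print(parent)
print(sizes)
```

Counting the nodes with a given fringe size in `sizes` gives the recursive-tree analogue of the subtree counts studied for the binary search tree.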

slide-40
SLIDE 40

Total Variation Distance Definition

◮ Let $(\mathcal{X}, \mathcal{A})$ be any measurable space. The total variation distance $d_{TV}$ between two probability measures $\mu_1$ and $\mu_2$ on $\mathcal{X}$ is defined to be

$$d_{TV}(\mu_1, \mu_2) := \sup_{A \in \mathcal{A}} |\mu_1(A) - \mu_2(A)|.$$

◮ Let $\mathcal{L}(X)$ denote the distribution of a random variable $X$.
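
For distributions on a finite set the supremum can be computed explicitly, and it equals half the sum of the absolute differences of the point masses. The sketch below checks this on a toy pair of distributions (the two distributions and the function names are made up for illustration).

```python
from itertools import combinations

def total_variation(p, q):
    """d_TV for distributions given as dicts outcome -> probability,
    computed as half the l1 distance between the point masses."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support) / 2

def tv_by_definition(p, q):
    """The same distance computed literally as a supremum over all events A."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for r in range(len(support) + 1):
        for event in combinations(support, r):
            diff = abs(sum(p.get(x, 0.0) for x in event)
                       - sum(q.get(x, 0.0) for x in event))
            best = max(best, diff)
    return best

p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.25, 2: 0.5}
print(total_variation(p, q), tv_by_definition(p, q))  # 0.5 0.5
```

The brute-force version enumerates all subsets and is only feasible for tiny supports; the half-l1 formula is the practical one.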

slide-41
SLIDE 41

Poisson Convergence

The following theorem, except for the explicit rate $O(1/k)$, was shown by Feng, Mahmoud and Panholzer, and by Fuchs, using variants of the method of moments. Here we provide a more direct proof using Stein’s method.

Theorem 1

Let $k = k_n$ where $k < n$. Then

$$d_{TV}(\mathcal{L}(X_{n,k}), \mathrm{Po}(\mu_{n,k})) = O\left(\frac{1}{k}\right),$$

where $X_{n,k}$ is the number of subtrees of size $k$ in the random binary search tree $T_n$ and $\mu_{n,k} := E(X_{n,k}) = \frac{2(n+1)}{(k+1)(k+2)}$. Similarly,

$$d_{TV}(\mathcal{L}(\hat{X}_{n,k}), \mathrm{Po}(\hat{\mu}_{n,k})) = O\left(\frac{1}{k}\right),$$

where $\hat{X}_{n,k}$ is the number of subtrees of size $k$ in the random recursive tree $\Lambda_n$ and $\hat{\mu}_{n,k} := E(\hat{X}_{n,k}) = \frac{n}{(k+1)(k+2)}$.

slide-42
SLIDE 42

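Poisson Convergence

Consequently, if $n \to \infty$ and $k \to \infty$, then

$$d_{TV}(\mathcal{L}(X_{n,k}), \mathrm{Po}(\mu_{n,k})) \to 0 \quad \text{and} \quad d_{TV}(\mathcal{L}(\hat{X}_{n,k}), \mathrm{Po}(\hat{\mu}_{n,k})) \to 0,$$

where we recall that $X_{n,k}$ and $\hat{X}_{n,k}$ are the numbers of subtrees of size $k$ in the random binary search tree $T_n$ and in the random recursive tree $\Lambda_n$, respectively.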

slide-43
SLIDE 43

Sketch of Proof of Theorem 1

◮ For proving Poisson convergence for sums of weakly dependent indicator variables it is often useful to find couplings.

◮ Let $\Gamma$ be a finite index set and let $(I_\alpha, \alpha \in \Gamma)$ be indicator random variables. We write $W := \sum_{\alpha \in \Gamma} I_\alpha$ and $\lambda = E(W)$.

◮ A coupling $(W, W_\alpha)$ between $W$ and a random variable $W_\alpha$ is defined on the same probability space as $W$, with the property $\mathcal{L}(W_\alpha) = \mathcal{L}(W - I_\alpha \mid I_\alpha = 1)$. Such a coupling can be used for approximating $W$ with a Poisson distribution $\mathrm{Po}(\lambda)$.

slide-44
SLIDE 44

Coupling for Poisson Approximation

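The coupling $(W, W_\alpha)$ can be constructed in the following way:

◮ We find random variables $(J_{\beta\alpha}, \beta \in \Gamma)$ defined on the same probability space as $(I_\alpha, \alpha \in \Gamma)$, in such a way that for each $\alpha \in \Gamma$, and jointly for all $\beta \in \Gamma$, $\mathcal{L}(J_{\beta\alpha}) = \mathcal{L}(I_\beta \mid I_\alpha = 1)$.

◮ Then $W_\alpha = \sum_{\beta \neq \alpha} J_{\beta\alpha}$ is defined on the same probability space as $W = \sum_{\alpha \in \Gamma} I_\alpha$ and it holds that $\mathcal{L}(W_\alpha) = \mathcal{L}(W - I_\alpha \mid I_\alpha = 1)$.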

slide-45
SLIDE 45

Poisson Approximation of the Number of Subtrees

To show Poisson convergence as $k \to \infty$ of $X_{n,k} = \sum_{i=1}^{n+1} I_{i,k}$ (the number of subtrees of size $k < n$ in the random binary search tree of size $n$), we showed that there exist appropriate couplings of the indicators $I_{i,k}$.

◮ We use that $I_{i,k}$ and $I_{j,k}$ are independent whenever $|i - j|_{n+1} \ge k + 2$. This independence follows since

$$I_{i,k} = \mathbf{1}\{U_{i-1} \text{ and } U_{i+k} \text{ are the two smallest among } U_{i-1}, \dots, U_{i+k}\},$$

which means that if $|i - j|_{n+1} \ge k + 2$, then $I_{i,k}$ and $I_{j,k}$ depend on disjoint sets of the variables $U_m$.

slide-46
SLIDE 46

A Coupling Forcing a Subtree in the Binary Search Tree

We want to show Poisson convergence as $k \to \infty$ of $X_{n,k} = \sum_{i=1}^{n+1} I_{i,k}$.

Lemma 2

Let $k \in \{1, \dots, n - 1\}$. Then for each $i \in \{1, \dots, n + 1\}$, there exists a coupling $((I_{j,k})_j, (Z^k_{ji})_j)$ such that $\mathcal{L}(Z^k_{ji}) = \mathcal{L}(I_{j,k} \mid I_{i,k} = 1)$ jointly for all $j \in \{1, \dots, n + 1\}$. Furthermore,

$$\begin{cases}
Z^k_{ji} = I_{j,k} & \text{if } |j - i|_{n+1} > k + 1, \\
Z^k_{ji} \ge I_{j,k} & \text{if } |j - i|_{n+1} = k + 1, \\
Z^k_{ji} = 0 \le I_{j,k} & \text{if } 0 < |j - i|_{n+1} \le k.
\end{cases}$$

slide-47
SLIDE 47

A Coupling Forcing A Subtree

Let $U_0 = 0$. If we condition on the keys $\{4, 5, 6\}$ forming a subtree, we only need to change the time stamps $\{U_3, U_4, U_5, U_6, U_7\}$ so that $U_3$ and $U_7$ are the two smallest of these five values. All other time stamps $U_i$ in the tree are the same after the coupling. Thus, in the coupling we swap the time stamps of keys 6 and 7: $(7, U_7 = 8)$ becomes $(7, U_7 = 1)$ and $(6, U_6 = 1)$ becomes $(6, U_6 = 8)$.

[Figure: node labels key,time stamp before the coupling: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9; after the coupling: 7,1 3,2 8,3 1,6 5,5 6,8 10,4 2,10 4,7 9,9]
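
The swap in this example can be written as a small routine. The rule used below (move the two smallest stamps of the window to its two end positions, the left-most one first) is an assumption chosen to reproduce the example on the slide; it is a sketch for interior windows only and does not handle the boundary case involving $U_0$.

```python
STAMPS = {1: 6, 2: 10, 3: 2, 4: 7, 5: 5, 6: 1, 7: 8, 8: 3, 9: 9, 10: 4}

def force_subtree(stamps, i, k):
    """Return a copy of stamps in which the keys i, ..., i+k-1 form a fringe tree.
    Only the stamps of keys i-1, ..., i+k may change (assumes 2 <= i and i+k <= n)."""
    new = dict(stamps)
    window = range(i - 1, i + k + 1)
    # Positions of the two smallest stamps in the window, left-most first.
    m1, m2 = sorted(sorted(window, key=lambda j: new[j])[:2])
    new[i - 1], new[m1] = new[m1], new[i - 1]
    new[i + k], new[m2] = new[m2], new[i + k]
    return new

def fringe_key_sets(stamps, keys=None):
    """Key sets of all fringe trees of the tree determined by the time stamps."""
    keys = sorted(stamps) if keys is None else keys
    if not keys:
        return []
    root = min(keys, key=lambda j: stamps[j])
    return ([set(keys)]
            + fringe_key_sets(stamps, [j for j in keys if j < root])
            + fringe_key_sets(stamps, [j for j in keys if j > root]))

coupled = force_subtree(STAMPS, 4, 3)
changed = {key for key in STAMPS if STAMPS[key] != coupled[key]}
print(changed, {4, 5, 6} in fringe_key_sets(coupled))
```

On the example only the stamps of keys 6 and 7 change, and afterwards $\{4, 5, 6\}$ is the key set of a fringe tree, which it was not before.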

slide-48
SLIDE 48

A Negative Relation

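◮ Recall that $(J_{\beta\alpha}, \beta \in \Gamma)$ are random variables with $\mathcal{L}(J_{\beta\alpha}) = \mathcal{L}(I_\beta \mid I_\alpha = 1)$, where $I_\alpha$, $\alpha \in \Gamma$, is the indicator random variable of the event $\alpha$.

◮ Suppose that, for each $\alpha$, the set $\Gamma_\alpha = \Gamma \setminus \{\alpha\}$ is partitioned into $\Gamma_\alpha^-$ and $\Gamma_\alpha \setminus \Gamma_\alpha^-$ in such a way that

$$J_{\beta\alpha} \le I_\beta \quad \text{if } \beta \in \Gamma_\alpha^-.$$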

slide-49
SLIDE 49

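Poisson Convergence

Theorem 2 (Barbour, Holst, Janson)

Let $W = \sum_{\alpha \in \Gamma} I_\alpha$ and $\lambda = E(W)$. Let $\Gamma_\alpha = \Gamma \setminus \{\alpha\}$ and $\Gamma_\alpha^-$ be defined as above. Then

$$d_{TV}(\mathcal{L}(W), \mathrm{Po}(\lambda)) \le \min\{1, \lambda^{-1}\} \Big( \lambda - \operatorname{Var}(W) + 2 \sum_{\alpha \in \Gamma} \sum_{\beta \in \Gamma_\alpha \setminus \Gamma_\alpha^-} E(I_\alpha I_\beta) \Big).$$

The Poisson approximation is good if the set $\Gamma_\alpha^-$ is large compared to the set $\Gamma_\alpha \setminus \Gamma_\alpha^-$.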

slide-50
SLIDE 50

Sketch of Proof of Theorem 1

Let $\Gamma := \{1, \dots, n + 1\}$ and $\Gamma_i := \Gamma \setminus \{i\}$. From Lemma 2 we see that if $|j - i|_{n+1} \neq k + 1$ then $Z^k_{ji} \le I_{j,k}$, and thus the set $\Gamma_i^- := \Gamma \setminus \{i, i \pm (k + 1)\}$ is large compared to the set $\Gamma_i \setminus \Gamma_i^-$.

Lemma 2

Let $k \in \{1, \dots, n - 1\}$. Then for each $i \in \{1, \dots, n + 1\}$, there exists a coupling $((I_{j,k})_j, (Z^k_{ji})_j)$ such that $\mathcal{L}(Z^k_{ji}) = \mathcal{L}(I_{j,k} \mid I_{i,k} = 1)$ jointly for all $j \in \{1, \dots, n + 1\}$. Furthermore,

$$\begin{cases}
Z^k_{ji} = I_{j,k} & \text{if } |j - i|_{n+1} > k + 1, \\
Z^k_{ji} \ge I_{j,k} & \text{if } |j - i|_{n+1} = k + 1, \\
Z^k_{ji} = 0 \le I_{j,k} & \text{if } 0 < |j - i|_{n+1} \le k.
\end{cases}$$
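
To see how the $O(1/k)$ rate comes out of Theorem 2, the right-hand side can be evaluated for $W = X_{n,k}$ with $\Gamma_i \setminus \Gamma_i^- = \{i \pm (k+1)\}$. The sketch below covers only the case $k < (n-1)/2$ and takes $\lambda - \operatorname{Var}(W)$ from Lemma 1. The pair probability $E(I_{i,k} I_{i+k+1,k}) = (5k+3)/((k+1)^2(2k+1)(2k+3))$ is not stated on the slides; it is an assumption obtained by a separate calculation, and the code checks that it is consistent with the variance in Lemma 1.

```python
from fractions import Fraction

def pair_probability(k):
    """Assumed value of E(I_i I_j) for |i - j| = k + 1 (not from the slides)."""
    return Fraction(5 * k + 3, (k + 1) ** 2 * (2 * k + 1) * (2 * k + 3))

def lam_minus_var(n, k):
    """lambda - Var(X_{n,k}) from the first case of Lemma 1 (k < (n-1)/2)."""
    return Fraction((n + 1) * (22 * k * k + 44 * k + 12),
                    (k + 1) * (k + 2) ** 2 * (2 * k + 1) * (2 * k + 3))

def consistent_with_lemma1(k):
    """Per-index check: Var/(n+1) = p - (2k+3) p^2 + 2 q, using independence at
    distance >= k+2 and I_i I_j = 0 at distance 1..k."""
    p = Fraction(2, (k + 1) * (k + 2))
    q = pair_probability(k)
    return p - (2 * k + 3) * p ** 2 + 2 * q == p - lam_minus_var(0, k)

def bhj_bound(n, k):
    """Right-hand side of Theorem 2 for W = X_{n,k}, assuming k < (n-1)/2."""
    assert k >= 1 and 2 * k < n - 1
    lam = Fraction(2 * (n + 1), (k + 1) * (k + 2))
    cross = (n + 1) * 2 * pair_probability(k)  # two neighbours i +/- (k+1) per index i
    return min(Fraction(1), 1 / lam) * (lam_minus_var(n, k) + 2 * cross)

for k in (5, 10, 50, 100):
    print(k, float(bhj_bound(10_000, k)), float(k * bhj_bound(10_000, k)))
```

With these inputs the product of $k$ and the bound stays bounded as $k$ grows, which is the behaviour claimed in Theorem 1.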

slide-51
SLIDE 51

A Coupling Between the Random Recursive Tree and the Binary Search Tree

◮ There is a bijection called the natural correspondence between ordered trees of size n and binary trees of size n − 1, introduced by Knuth.

◮ As noted by Fuchs, Hwang and Neininger, the natural correspondence yields a coupling between the random recursive tree of size n and the binary search tree of size n − 1.

slide-52
SLIDE 52

A Coupling Between the Random Recursive Tree and the Binary Search Tree

[Figure: (a) A random recursive tree with node labels 1 3 4 2 5 8 9 6 10 7. (b) The corresponding binary search tree with node labels key,time stamp: 6,1 3,2 8,3 1,6 5,5 7,8 10,4 2,10 4,7 9,9]

slide-53
SLIDE 53

Bijections Between Subtrees in a Random Recursive Tree and a Binary Search Tree

[Figure: (c) Two subtrees in the random recursive tree. (d) The corresponding subtrees in the binary search tree.]

slide-54
SLIDE 54

Multivariate Normal Distribution

Theorem 3

Let $T_n$ be a random binary search tree with $n$ nodes. Let $X^n_T$ be the number of subtrees equal to $T$ in the binary search tree $T_n$. Let $T^1, \dots, T^d$ be a fixed sequence of distinct binary trees, and let $\bar{X}^n_d = (X^n_{T^1}, X^n_{T^2}, \dots, X^n_{T^d})$. Let

$$\mu^n_d := \big( E(X^n_{T^1}), E(X^n_{T^2}), \dots, E(X^n_{T^d}) \big)$$

and let $\Gamma = (\gamma_{ij})_{i,j=1}^{d}$ denote the matrix with elements $\gamma_{ij} = \lim_{n \to \infty} \frac{1}{n} \operatorname{Cov}(X^n_{T^i}, X^n_{T^j})$ (where the $\gamma_{ij}$ are given explicitly). Then $\Gamma$ is non-singular and

$$\frac{\bar{X}^n_d - \mu^n_d}{\sqrt{n}} \xrightarrow{d} N(0, \Gamma).$$

There is a corresponding result for the random recursive tree.

slide-55
SLIDE 55

The Cramér–Wold Device

◮ By the Cramér–Wold device (Billingsley, Theorem 7.7), to show that $\bar{X}^n_d = (X^n_{T^1}, X^n_{T^2}, \dots, X^n_{T^d})$ converges to a multivariate normal distribution, it is enough to show that every linear combination of the components of the vector converges to a normal distribution.

◮ We use a version of Stein’s method with dependency graphs as defined by Janson et al. (and earlier used by Devroye for the 1-dimensional case), for convergence to normal distributions, to prove that $\bar{X}^n_d$ converges to a multivariate normal distribution.

slide-56
SLIDE 56

Number of Protected Nodes in the Binary Search Tree

We consider the number of so-called 2-protected nodes in binary search trees. A node is 2-protected if the shortest distance to a leaf is at least two, i.e., it is neither a leaf nor the parent of a leaf.

slide-57
SLIDE 57

Number of Protected Nodes in the Binary Search Tree

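We consider the number of so-called 2-protected nodes in binary search trees. A node is 2-protected if the shortest distance to a leaf is at least two, i.e., it is neither a leaf nor the parent of a leaf.

[Figure: pictorial identity expressing $n - X_n$ through counts of small fringe-tree shapes; the drawings of the shapes did not survive extraction.]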
slide-58
SLIDE 58

Number of Protected Nodes in the Binary Search Tree

The following theorem was shown by Mahmoud and Ward using generating functions and recurrences.

Theorem 4

Let $X_n$ be the number of protected nodes in a binary search tree. Then

$$\frac{X_n - \frac{11}{30} n}{\sqrt{n}} \xrightarrow{d} N\Big(0, \frac{29}{225}\Big).$$

◮ We provide a simple proof of this theorem using that the number of unprotected nodes equals twice the number of leaves minus the number of cherry subtrees.

◮ Hence, since any linear combination of the components of a random vector with a multivariate normal distribution is normal, Theorem 4 follows from Theorem 3.
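
The counting identity behind this proof is easy to test exhaustively on small trees. In the sketch below a cherry is taken to be a fringe tree consisting of a root with two leaf children, and the identity is checked for $n \ge 2$ (for $n = 1$ the single leaf has no parent, so the identity does not apply); the function names are illustrative.

```python
from itertools import permutations

LEAF = (None, None)
CHERRY = (LEAF, LEAF)  # a root with two leaf children

def bst_shape(order):
    """Shape of the BST built from an insertion order, as nested pairs (left, right)."""
    if not order:
        return None
    root = order[0]
    return (bst_shape([x for x in order if x < root]),
            bst_shape([x for x in order if x > root]))

def fringe_shapes(t):
    """Shapes of the fringe trees rooted at all nodes of t."""
    if t is None:
        return []
    return [t] + fringe_shapes(t[0]) + fringe_shapes(t[1])

def is_protected(t):
    """2-protected: neither a leaf nor the parent of a leaf."""
    return t != LEAF and LEAF not in t

def identity_holds(order):
    shapes = fringe_shapes(bst_shape(order))
    unprotected = sum(not is_protected(s) for s in shapes)
    return unprotected == 2 * shapes.count(LEAF) - shapes.count(CHERRY)

example = fringe_shapes(bst_shape([6, 3, 8, 10, 5, 1, 4, 7, 9, 2]))
print(sum(is_protected(s) for s in example))  # protected nodes in the example tree
print(all(identity_holds(p) for n in range(2, 8) for p in permutations(range(n))))
```

In the running example only the nodes with keys 6 and 3 are protected; the other eight nodes are the four leaves and their four parents.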

slide-59
SLIDE 59

Summary

◮ We introduced a cyclic version of Devroye’s representation to study sums of functions of fringe trees in binary search trees.

◮ Using Stein’s method, we showed that the number of subtrees of size $k < n$ in the binary search tree and in the random recursive tree (of size $n$) is approximately Poisson distributed, with the total variation distance tending to 0 as $k \to \infty$.

◮ We studied the random number of copies of a fixed subtree $T$ in the binary search tree (respectively the random recursive tree). Using the Cramér–Wold device, we showed that the vector of these counts for different fixed subtrees converges, after normalization, to a multivariate normal distribution.

slide-60
SLIDE 60

Summary

◮ We introduced certain couplings related to Stein’s method in the study of fringe trees.

◮ We showed that the natural correspondence between the random recursive tree and the binary search tree can be used to analyze fringe trees in the random recursive tree.

◮ We showed that the problem concerning the number of protected nodes in the binary search tree can be translated to a problem concerning fringe trees. Thus, the fringe tree approach leads to a simple proof that the number of protected nodes in the binary search tree is asymptotically normal.