Probabilistic Fréchet Means on Persistence Diagrams, by Paul Bendich



slide-1
SLIDE 1

Probabilistic Fréchet Means on Persistence Diagrams

Paul Bendich

Duke University :: Dept of Mathematics

July 15, 2013

Paul Bendich (Duke) Probabilistic Fréchet Means on Persistence Diagrams July 15, 2013 1 / 33

slide-2
SLIDE 2

Collaborators

This is joint work with:

◮ Liz Munch (Duke)
◮ Kate Turner (Chicago)
◮ John Harer (Duke)
◮ Sayan Mukherjee (Duke)
◮ Jonathan Mattingly (Duke)

slide-3
SLIDE 3

Main Idea and Results

New definition of mean for a set $X$ of diagrams in $(D_p, W_p)$.

Mileyko et al.:

◮ $\mu_X$ is itself a (set of) diagram(s) in $D_p$.
◮ Problem: non-uniqueness leads to discontinuity issues.

Our approach:

◮ Definition: $\mu_X \in P(D_p)$, an (atomic) probability distribution on diagrams.
◮ Theorem: $X \mapsto \mu_X$ is Hölder continuous (with exponent 1/2).

[Figure: points labeled a, b, x, y, u, v, with matchings labeled 1 and 2.]

slide-4
SLIDE 4

1. Persistence Review
2. Why Means?
3. Fréchet Means of Diagrams
4. Probabilistic Fréchet Means


slide-6
SLIDE 6

Persistence modules

A persistence module $F$ is:

◮ a family of vector spaces $\{F_\alpha\}$, $\alpha \in \mathbb{R}$, over a fixed field;
◮ a family of linear transformations $f_\alpha^\beta : F_\alpha \to F_\beta$, for all $\alpha \le \beta$, such that $\alpha \le \gamma \le \beta$ implies $f_\alpha^\beta = f_\gamma^\beta \circ f_\alpha^\gamma$.

The number $\alpha$ is a regular value of the module if:

◮ there exists $\delta > 0$ such that $f_{\alpha-\epsilon}^{\alpha+\epsilon}$ is an isomorphism for all $\epsilon < \delta$.

If $\alpha$ is not a regular value, then it is a critical value of the module. The module is tame if it has only finitely many critical values and each vector space is of finite rank.

slide-7
SLIDE 7

Persistence Modules

Given finitely many critical values $c_1 < c_2 < \dots < c_n$, interleave regular values $a_0 < c_1 < a_1 < \dots < c_n < a_n$ and set $F_i = F_{a_i}$:

\[ F_0 \to F_1 \to F_2 \to \dots \to F_{n-1} \to F_n \]

slide-8
SLIDE 8

Birth and Death

A vector $v \in F_i$ is born at $c_i$ if $v \notin \operatorname{im} f_{i-1}^{i}$.

Such a $v$ dies at $c_j$ if:

◮ $f_i^{j}(v) \in \operatorname{im} f_{i-1}^{j}$
◮ $f_i^{j-1}(v) \notin \operatorname{im} f_{i-1}^{j-1}$.

The persistence of $v$ is $c_j - c_i$.

[Figure: the spaces $F_{i-1}$, $F_i$, $F_{j-1}$, $F_j$ with the maps $f_{i-1}^{i}$, $f_i^{j-1}$, $f_{j-1}^{j}$, the images $\operatorname{im} f_{i-1}^{i}$ and $\operatorname{im} f_{i-1}^{j-1}$, and a vector $v$.]

slide-9
SLIDE 9

Persistence Diagrams

Let $P_{i,j}$ be the vector space of classes born at $c_i$ and dead at $c_j$, and $\beta_{i,j}$ its rank.
Plot a dot of multiplicity $\beta_{i,j}$ at $(c_i, c_j)$ in the plane.
Plot a dot of infinite multiplicity at every point of the diagonal $y = x$.
The result is Dgm(F).

[Figure: persistence diagrams with axes birth and death.]
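The multiplicities can also be computed from ranks of the maps in the sequence $F_0 \to \dots \to F_n$. The identity below is the standard inclusion-exclusion formula from the persistent homology literature, not something stated on the slide. The shorthand $r_i^j = \operatorname{rank} f_i^j$ (with $f_i^i$ the identity) is introduced here only for this note.

```latex
\[
\beta_{i,j} \;=\; \bigl(r_i^{\,j-1} - r_i^{\,j}\bigr) \;-\; \bigl(r_{i-1}^{\,j-1} - r_{i-1}^{\,j}\bigr),
\qquad 1 \le i < j \le n .
\]
```

The first bracket counts independent classes of $F_i$ that survive to $F_{j-1}$ but not to $F_j$; the second removes those that already come from $F_{i-1}$, leaving the classes born at $c_i$ that die at $c_j$.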

slide-10
SLIDE 10

Example: persistent homology

Let $Y \subseteq \mathbb{R}^D$ be a compact space.
For $\alpha \ge 0$, define $Y_\alpha = d_Y^{-1}[0, \alpha]$, where $d_Y$ is the distance function to $Y$.
For each $k$, we get a module $\{H_k(Y_\alpha)\}$, with maps induced by inclusion.

[Figure: persistence diagrams with axes birth and death.]
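As a small illustration of this construction, here is a minimal sketch for the simplest case only: $k = 0$ and $Y$ a finite point sample. Under that assumption $Y_\alpha$ is a union of closed balls of radius $\alpha$, and two components merge when $\alpha$ reaches half the distance between them, so a union-find pass over sorted pairwise distances recovers the $H_0$ diagram. The function name `h0_persistence` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return (birth, death) pairs of H_0 for the offsets of a finite point set.

    Assumption: Y is a finite sample, so two components of Y_alpha merge when
    alpha equals half the distance between their closest points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Candidate merge scales, one per pair of points, in increasing order.
    edges = sorted(
        (math.dist(points[i], points[j]) / 2.0, i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for alpha, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            pairs.append((0.0, alpha))  # one component dies at this scale
    if points:
        pairs.append((0.0, math.inf))  # the last component never dies
    return pairs


diagram = h0_persistence([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
print(diagram)
```

Every component is born at $\alpha = 0$ in this setting, so all dots sit on the vertical axis of the diagram. Higher-dimensional homology needs a simplicial model of $Y_\alpha$ and is outside the scope of this sketch.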

slide-11
SLIDE 11

1. Persistence Review
2. Why Means?
3. Fréchet Means of Diagrams
4. Probabilistic Fréchet Means

slide-14
SLIDE 14

Relate Multiple Samples

How do we give a summary of the data?
Will it play nicely with time-varying persistence diagrams?

slide-15
SLIDE 15

Significance Testing

Suppose we obtain N points X in the unit d-ball. We compute the diagram and are impressed with a feature. Should we be impressed?

slide-16
SLIDE 16

Towards Topological Null Hypothesis

Experiment: draw N points uniformly from the d-ball and compute the diagram.
Question: what is the expected diagram?
Hope: repeat the experiment many times, take the mean diagram as the answer.

[Figure: mean of 500 1-D PDs generated from a sample of 50 points; both axes run from 0.1 to 0.8.]
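The sampling step of the experiment can be sketched as follows. This is only one common way to draw uniform points from a ball (Gaussian direction, radius distributed as $U^{1/d}$); the slides do not say how the samples were generated, and `sample_unit_ball` is a hypothetical helper name.

```python
import math
import random


def sample_unit_ball(n_points, dim, rng):
    """Draw n_points uniformly from the unit ball in R^dim."""
    sample = []
    for _ in range(n_points):
        # A standard Gaussian vector has a uniformly distributed direction.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in g))
        # Radius U**(1/dim) makes the sample uniform in volume.
        radius = rng.random() ** (1.0 / dim)
        sample.append(tuple(radius * c / norm for c in g))
    return sample


rng = random.Random(0)  # fixed seed so the experiment is repeatable
cloud = sample_unit_ball(50, 3, rng)
```

Repeating the call with the same `rng` gives independent point clouds, one per run of the experiment.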

slide-17
SLIDE 17

Towards Topological Null Hypothesis

Experiment: draw N points uniformly from the d-cube and compute the diagram.
Question: what is the expected diagram?
Hope: repeat the experiment many times, take the mean diagram as the answer.

[Figure: mean of 500 1-D PDs generated from a sample of 510 points; both axes run from 0.1 to 0.8.]

slide-18
SLIDE 18

1. Persistence Review
2. Why Means?
3. Fréchet Means of Diagrams
4. Probabilistic Fréchet Means

slide-19
SLIDE 19

Diagrams in the Abstract

Abstract Persistence Diagram

An abstract persistence diagram is a countable multiset of points along with the diagonal, $\Delta = \{(x, x) \in \mathbb{R}^2 \mid x \in \mathbb{R}\}$, with points in $\Delta$ having infinite multiplicity.

slide-20
SLIDE 20

Wasserstein Distance on $D_p$

[Figure: points labeled a, b, c, d, x, y, z.]

p-Wasserstein distance for diagrams

Given diagrams $X$ and $Y$, the distance between them is

\[ W_p[L_q](X, Y) = \inf_{\varphi : X \to Y} \Bigl[ \sum_{x \in X} \| x - \varphi(x) \|_q^p \Bigr]^{1/p} . \]
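For very small diagrams the infimum can be evaluated by brute force. The sketch below makes several assumptions not stated on the slide: the ground norm is fixed to $q = 2$, $\varphi$ ranges over bijections, each diagram is padded with as many diagonal copies as the other has points, and a point matched to the diagonal pays its distance to the nearest diagonal point. The names `DIAG`, `ground_cost` and `wasserstein` are hypothetical. The search is factorial in the number of points, so it is an illustration rather than a practical algorithm.

```python
import math
from itertools import permutations

DIAG = None  # placeholder for a copy of the diagonal


def ground_cost(x, y, p):
    """L2 distance to the power p; DIAG stands for the nearest diagonal point."""
    if x is DIAG and y is DIAG:
        return 0.0
    if x is DIAG:
        x, y = y, x
    if y is DIAG:
        # Distance from (birth, death) to the line death = birth.
        return (abs(x[1] - x[0]) / math.sqrt(2.0)) ** p
    return math.dist(x, y) ** p


def wasserstein(X, Y, p=2):
    """Brute-force W_p[L_2] between two small diagrams of (birth, death) points."""
    left = list(X) + [DIAG] * len(Y)
    right = list(Y) + [DIAG] * len(X)
    best = min(
        sum(ground_cost(a, right[k], p) for a, k in zip(left, perm))
        for perm in permutations(range(len(right)))
    )
    return best ** (1.0 / p)


X = [(0.0, 4.0)]
Y = [(0.0, 5.0), (2.0, 2.2)]
distance = wasserstein(X, Y)
```

In this example the best bijection sends the point of `X` to `(0.0, 5.0)` and the low-persistence point `(2.0, 2.2)` to the diagonal.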

slide-21
SLIDE 21

Discrete vs continuous Wasserstein

Discrete

Given diagrams $X$ and $Y$, the distance between them is

\[ W_p[L_q](X, Y) = \inf_{\varphi : X \to Y} \Bigl[ \sum_{x \in X} \| x - \varphi(x) \|_q^p \Bigr]^{1/p} . \]

Continuous

Given probability distributions $\nu$ and $\eta$ on a metric space $(X, d_X)$, the distance between them is

\[ W_p[d_X](\nu, \eta) = \Bigl[ \inf_{\gamma \in \Gamma(\nu, \eta)} \int_{X \times X} d_X(x, y)^p \, d\gamma(x, y) \Bigr]^{1/p} \]

where $\Gamma(\nu, \eta)$ is the space of distributions on $X \times X$ with marginals $\nu$ and $\eta$ respectively.

slide-22
SLIDE 22

The metric space $(D_p, W_p)$

The space of persistence diagrams is

\[ D_p = \{ X \mid W_p[L_2](X, d_\emptyset) < \infty \} \]

along with the p-Wasserstein metric, $W_p[L_2]$.

Theorem (Mileyko et al.): $(D_p, W_p)$ is complete and separable.

slide-23
SLIDE 23

Fréchet means

Let $\nu$ be a measure on a metric space $(Y, d)$. The Fréchet variance of $\nu$ is:

\[ \mathrm{Var}_\nu = \inf_{x \in Y} \Bigl[ F_\nu(x) = \int_Y d(x, y)^2 \, d\nu(y) < \infty \Bigr] \]

The set at which the value is obtained,

\[ E(\nu) = \{ x \mid F_\nu(x) = \mathrm{Var}_\nu \}, \]

is the Fréchet expectation of $\nu$, also called the Fréchet mean.
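For orientation, the familiar Euclidean case shows how this definition recovers the usual mean. This worked example is added here and is not from the slides: take $Y = \mathbb{R}^n$ with the Euclidean distance and let $\nu$ be the uniform measure on points $y_1, \dots, y_N$, with $\bar{y}$ denoting their arithmetic mean.

```latex
\[
F_\nu(x) \;=\; \frac{1}{N} \sum_{i=1}^{N} \| x - y_i \|^2
\;=\; \| x - \bar{y} \|^2 + \frac{1}{N} \sum_{i=1}^{N} \| y_i - \bar{y} \|^2 ,
\qquad \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i .
\]
\[
E(\nu) = \{ \bar{y} \}, \qquad
\mathrm{Var}_\nu = \frac{1}{N} \sum_{i=1}^{N} \| y_i - \bar{y} \|^2 .
\]
```

Here the minimizer is unique because the first term vanishes only at $x = \bar{y}$. In a general metric space no such decomposition is available, which is why $E(\nu)$ is defined as a set.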

slide-24
SLIDE 24

Fréchet means in $D_p$: Existence

Theorem (Mileyko et al.): Let $\nu$ be a probability measure on $(D_p, B(D_p))$ with a finite second moment. If $\nu$ has compact support, then $E(\nu) \ne \emptyset$. In particular, Fréchet means of finite sets of diagrams exist.

[Figure: points a, b, c; x, y, z; and f, g, h of three diagrams.]


slide-26
SLIDE 26

Algorithm for Computation - Selections and Matchings

[Figure: points a, b, c; x, y, z; and f, g, h of three diagrams.]

Definition

Given a set of diagrams $X_1, \dots, X_N$, a selection is a choice of one point from each diagram, where that point could be $\Delta$.

slide-27
SLIDE 27

Algorithm for Computation - Selections and Matchings

Definition

The trivial selection for a particular off-diagonal point $x \in X_i$ is the selection $s_x$ which chooses $x$ for $X_i$ and $\Delta$ for every other diagram.

slide-28
SLIDE 28

Algorithm for Computation - Selections and Matchings

[Figure: points a, b, c; x, y, z; and f, g, h of three diagrams, with the matching tabulated below.]

      d⋆   d    d•
1     b    x    f
2     a    ∆    ∆
3     ∆    y    g
4     ∆    z    ∆
5     ∆    ∆    h
6     c    ∆    ∆

Definition

A matching is a set of selections so that every off-diagonal point of every diagram is part of exactly one selection.

slide-29
SLIDE 29

Algorithm for Computation - Selections and Matchings

Definition

The mean of a selection is the point which minimizes the sum of the square distances to the elements of the selection.


slide-30
SLIDE 30

Algorithm for Computation - Selections and Matchings

Definition

The mean of a matching, $\mathrm{mean}_X(G)$, is a diagram in $D_p$ with a point at the mean of each selection.
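The two definitions of mean can be sketched in a few lines. The sketch assumes the $L_2$ ground metric and that a diagonal entry of a selection contributes the squared distance to the nearest diagonal point. Under those assumptions, a selection with $k$ off-diagonal points out of $N$ entries is minimized by the weighted average of the arithmetic mean of the off-diagonal points (weight $k/N$) and that mean's projection onto $\Delta$ (weight $(N-k)/N$). The function names and the coordinates are hypothetical.

```python
def mean_of_selection(selection):
    """Minimize the sum of squared L2 distances to the entries of a selection.

    Entries are (birth, death) tuples, or None for a copy of the diagonal.
    Returns None when every entry is the diagonal.
    """
    n = len(selection)
    off = [pt for pt in selection if pt is not None]
    k = len(off)
    if k == 0:
        return None
    mean_b = sum(pt[0] for pt in off) / k
    mean_d = sum(pt[1] for pt in off) / k
    # Both coordinates of the projection of the off-diagonal mean onto the diagonal.
    proj = (mean_b + mean_d) / 2.0
    return (
        (k * mean_b + (n - k) * proj) / n,
        (k * mean_d + (n - k) * proj) / n,
    )


def mean_of_matching(matching):
    """Collect the means of all selections, skipping all-diagonal selections."""
    means = (mean_of_selection(s) for s in matching)
    return [m for m in means if m is not None]


# Two diagrams: the first selection pairs two off-diagonal points,
# the second is a trivial selection (one point, one diagonal copy).
matching = [
    [(0.0, 4.0), (2.0, 6.0)],
    [(1.0, 3.0), None],
]
mean_diagram = mean_of_matching(matching)
```

In the second selection the mean lands halfway between `(1.0, 3.0)` and its projection `(2.0, 2.0)` onto the diagonal.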

slide-31
SLIDE 31

Problem: Fréchet means need not be unique!

[Figures, shown over four overlay slides: points labeled a, b, x, y, u, v, with two matchings labeled 1 and 2.]
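A minimal numerical sketch of how such a tie can arise, using hypothetical coordinates that are not read off the slide figure: two diagrams with two points each, placed at opposite corners of a unit square far from the diagonal, so that matchings through $\Delta$ are much more expensive and are ignored here. Both bijections have the same cost, yet their midpoint diagrams differ.

```python
import math
from itertools import permutations

# Hypothetical coordinates: opposite corners of a unit square, far from the diagonal.
X1 = [(0.0, 10.0), (1.0, 11.0)]
X2 = [(0.0, 11.0), (1.0, 10.0)]


def matching_cost(perm):
    """Sum of squared L2 distances when X1[i] is matched to X2[perm[i]]."""
    return sum(math.dist(X1[i], X2[j]) ** 2 for i, j in enumerate(perm))


def matching_mean(perm):
    """Diagram of midpoints of the matched pairs."""
    return sorted(
        ((X1[i][0] + X2[j][0]) / 2.0, (X1[i][1] + X2[j][1]) / 2.0)
        for i, j in enumerate(perm)
    )


candidates = {
    perm: (matching_cost(perm), matching_mean(perm))
    for perm in permutations(range(2))
}
for perm, (cost, mean) in candidates.items():
    print(perm, cost, mean)
```

The two candidate diagrams have the same value of the Fréchet function, so neither is preferred, and a small perturbation of the inputs can switch which one is optimal.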

slide-35
SLIDE 35

1. Persistence Review
2. Why Means?
3. Fréchet Means of Diagrams
4. Probabilistic Fréchet Means

slide-36
SLIDE 36

Solution: Randomize Matchings!

Note: non-uniqueness of the mean is caused by non-uniqueness of the optimal matching.
Idea: consider all matchings, with probability weights.
Formally: if $X = \{X_1, \dots, X_N\} \subseteq D_p$, then $\mu_X \in P(D_p)$, with:

Definition

\[ \mu_X = \sum_{G} P(H = G) \, \delta_{\mathrm{mean}_X(G)} \]

[Figure: points a and b with matchings labeled 1 and 2.]


slide-38
SLIDE 38

What is H?

H is a matching-valued random variable (randomized coupling).
Perturb each diagram $X_i$ to create a random diagram $X'_i$.
Associate the optimal matching among the drawn diagrams to one of the original matchings.
This defines a probability weight on each possible matching.

[Figure: points a and b with matchings labeled 1 and 2.]

slide-39
SLIDE 39

The random diagram

Pick $\alpha > 0$.
Let $\eta \in P(\mathbb{R}^2)$ be uniform on $B_\alpha(0)$ (other choices also work).
Define $\eta_x$ to be the translation of $\eta$ to $x$.
For each $x \in X_i$, make $X'_i$ by:

1. Draw a point from $\eta_x$.
2. If it is contained in $B_{\|x - \Delta\|}(x)$, add it to $X'_i$.
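The two steps above can be sketched as follows. The sketch reads the acceptance region as the open ball around $x$ whose radius is the $L_2$ distance from $x$ to the diagonal, which is an interpretation of the slide's notation; whether the ball is open or closed is not stated. The helper names are hypothetical.

```python
import math
import random


def dist_to_diagonal(x):
    """L2 distance from the point (birth, death) to the line death = birth."""
    return abs(x[1] - x[0]) / math.sqrt(2.0)


def draw_uniform_disk(center, alpha, rng):
    """Draw a point uniformly from the disk of radius alpha around center."""
    r = alpha * math.sqrt(rng.random())  # sqrt makes the draw uniform in area
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (center[0] + r * math.cos(theta), center[1] + r * math.sin(theta))


def perturb_diagram(X, alpha, rng):
    """Build the random diagram X' from the off-diagonal points of X."""
    perturbed = []
    for x in X:
        y = draw_uniform_disk(x, alpha, rng)  # step 1: draw from eta_x
        if math.dist(x, y) < dist_to_diagonal(x):  # step 2: keep if inside the ball
            perturbed.append(y)
    return perturbed


rng = random.Random(1)
X = [(0.0, 1.0), (0.0, 0.1), (2.0, 5.0)]
X_prime = perturb_diagram(X, alpha=0.5, rng=rng)
```

With this acceptance rule every kept point stays strictly above the diagonal, and points whose distance to the diagonal exceeds `alpha` are always kept. Low-persistence points such as `(0.0, 0.1)` are sometimes dropped.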

slide-40
SLIDE 40

Example

[Figure: original points a, b, c; x, y, z; and f, g, h of three diagrams.]

[Figure: perturbed points b', x', y', z', f', g', h'.]

Optimal matching among the drawn diagrams:

      d′⋆  d′   d′•
1     b′   x′   f′
2     ∆    y′   g′
3     ∆    ∆    h′
4     ∆    z′   ∆

Associated matching among the original diagrams:

      d⋆   d    d•
1     b    x    f
2     ∆    y    g
3     ∆    ∆    h
4     ∆    z    ∆
5     a    ∆    ∆
6     c    ∆    ∆

slide-44
SLIDE 44

Main Theorem

Let $S_{M,K} \subseteq D_p$ be the diagrams with at most $K$ dots, each with persistence at most $M$.

Theorem

The map

\[ (S_{M,K})^N \to P(S_{M,NK}), \qquad X = \{X_1, \dots, X_N\} \mapsto \mu_X \]

is Hölder continuous with exponent 1/2. That is, there exists a constant $C$ such that the inequality

\[ W_2(\mu_X, \mu_Y) \le C \sqrt{W_2(X, Y)} \]

holds for all pairs of sets of N diagrams.

slide-45
SLIDE 45

Outline of the Proof

Wasserstein distance on P(Dp)

\[ W_p(\nu, \eta) = \Bigl[ \inf_{\gamma \in \Gamma(\nu, \eta)} \int_{D_p \times D_p} W_2(X, Y)^p \, d\gamma(X, Y) \Bigr]^{1/p} \]

slide-46
SLIDE 46

Outline of the Proof - Pairing

The problem

It’s easy to associate parts of the matching if a point x ∈ Xi is matched with an off-diagonal point y ∈ Yi under ϕi : Xi → Yi. What do you do with the rest of the points?

Definition

  • Xi = {x ∈ Xi | ϕi(x) = ∆}
  • Yi = {y ∈ Yi | ϕi^{-1}(y) = ∆}

GX = matchings on X1, ..., XN

[Diagram: maps i X and i Y into GX and GY, with Im (i X) ↔ Im (i X).]

slide-47
SLIDE 47

Outline of the Proof - Pairing

[Figure: two configurations, one with points x, a, z, h, b, f, g, y and one with points x, z, h, b, f, g, y, c.]

      d⋆   d    d
1     b    x    f
2     ∆    y    g
3     ∆    ∆    h
4     ∆    z    ∆
5     a    ∆    ∆

      d⋆   d    d
1     b    x    f
2     ∆    y    g
3     ∆    ∆    h
4     ∆    z    ∆
5     c    ∆    ∆

slide-48
SLIDE 48

Outline of the Proof - Big Inequality

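\[
\begin{aligned}
W_p(\mu_X, \mu_Y) \le{} & \sum_{\substack{(G,H) \in G_X \times G_Y \\ \text{paired}}} \min\{ P(H_X = G), P(H_Y = H) \} \cdot W_p(\mathrm{mean}_X(G), \mathrm{mean}_Y(H)) \\
& + \sum_{\substack{(G,H) \in G_X \times G_Y \\ \text{paired}}} | P(H_X = G) - P(H_Y = H) | \cdot M \\
& + \sum_{\substack{G \in G_X \\ \text{unpaired}}} | P(H_X = G) | \cdot M
  + \sum_{\substack{H \in G_Y \\ \text{unpaired}}} | P(H_Y = H) | \cdot M
\end{aligned}
\]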


slide-50
SLIDE 50

Further Goals

Find an explicit relation between the older definition and ours.
Do some honest statistics (laws of large numbers, ...).
Get rid of the $S_{M,K}$ crutch.