Topological Structures in the Analysis of Images and Data, Chao Chen

SLIDE 1

Topological Structures in the Analysis of Images and Data

Chao Chen

City University of New York (CUNY)

  • Oct. 2016
  • C. Chen (CUNY)

Topological Structures in the Analysis of Images and Data 1 / 43

SLIDE 2

Outline

1. Topological Structures
2. High Dimensional Data
  • Algorithms
  • Applications

SLIDE 3

Topological Structures

Global, multi-scale, and independent of geometry.

[Figure: examples of 0-dim, 1-dim, and 2-dim topological structures]

SLIDE 4

Topological Structures of Data

For a dataset, what are the components and loops of the data?

Topological data analysis (TDA): detect these structures in a robust way.

SLIDE 7

Persistent Homology: A Robust Way to Extract Topological Structures

Input: a (density) function f

Output: topological structures and their persistence

Def: given a threshold t, the superlevel set is $f^{-1}[t, +\infty) := \{x \mid f(x) \ge t\}$

SLIDE 9

Persistent Homology (continued)

  • The true structures are hidden in the superlevel sets
  • Consider the whole stack of superlevel sets
  • Identify structures that appear often (high persistence)

Output: persistence diagram, with dots representing all structures

[Figure: persistence diagram]
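A minimal sketch of the idea for the simplest case, assuming a function sampled on a 1D grid and tracking only 0-dimensional structures (connected components of the superlevel sets). The function name `superlevel_persistence_1d` and the union-find bookkeeping are illustrative choices, not taken from the talk.

```python
def superlevel_persistence_1d(f):
    """0-dim persistence pairs (birth, death) of the superlevel sets of a 1D signal."""
    n = len(f)
    order = sorted(range(n), key=lambda i: -f[i])  # sweep the threshold from high to low
    parent = [None] * n                            # None: sample not yet in the superlevel set

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in order:
        parent[i] = i                              # sample i enters at threshold f[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Roots are always the highest sample of their component.
                # Elder rule: the component with the lower peak dies at f[i].
                old, young = (ri, rj) if f[ri] >= f[rj] else (rj, ri)
                if f[young] > f[i]:                # skip zero-persistence pairs
                    pairs.append((f[young], f[i]))
                parent[young] = old
    pairs.append((max(f), float("-inf")))          # the global maximum never dies
    return sorted(pairs, key=lambda p: p[1] - p[0])  # most persistent first


diagram = superlevel_persistence_1d([0, 3, 1, 2, 0.5, 5, 0])
```

Each returned pair is one dot of the persistence diagram; a large gap between birth and death marks a structure that survives across many thresholds.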

SLIDE 12

Why Topological Structures: Cardiac data (Demo)

[Figure panels: Thresholding, Advanced]

Thresholding: local evidence, minimize energy E(y)

$$E(y) = \sum_{v} E_v(y_v), \quad y_v \in \{\mathrm{BG}, \mathrm{FG}\}$$

Advanced: pairwise local evidence

$$E(y) = \sum_{v} E_v(y_v) + \sum_{(u,v)} E_{u,v}(y_u, y_v)$$
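The two energies can be compared on a toy image. This is a sketch under stated assumptions: the unary cost (distance of the intensity from a threshold) and the Potts-style pairwise penalty with weight 0.3 are hypothetical choices, as is the 3×3 image; the slides do not specify the potentials.

```python
def unary_energy(image, labels, threshold=0.5):
    """Sum of E_v(y_v): labelling a bright pixel FG (1) or a dark pixel BG (0) is cheap."""
    total = 0.0
    for img_row, lab_row in zip(image, labels):
        for value, y in zip(img_row, lab_row):
            total += (threshold - value) if y == 1 else (value - threshold)
    return total


def pairwise_energy(labels, weight=0.3):
    """Sum of E_{u,v}(y_u, y_v): constant penalty for each 4-connected pair with different labels."""
    rows, cols = len(labels), len(labels[0])
    cuts = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                cuts += 1
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                cuts += 1
    return weight * cuts


image = [[0.9, 0.8, 0.1],
         [0.7, 0.4, 0.2],
         [0.8, 0.9, 0.1]]
# Plain thresholding minimizes the unary-only energy pixel by pixel.
thresholded = [[1 if v >= 0.5 else 0 for v in row] for row in image]
# Same labeling with the isolated dark centre pixel switched to FG.
smoothed = [[1, 1, 0], [1, 1, 0], [1, 1, 0]]

unary_only = (unary_energy(image, thresholded), unary_energy(image, smoothed))
with_pairwise = tuple(unary_energy(image, y) + pairwise_energy(y)
                      for y in (thresholded, smoothed))
```

With only unary terms the thresholded labeling wins; once the pairwise term is added, the smoother labeling has lower energy. Both terms remain local evidence.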

SLIDE 13

Why Topological Data Analysis?

Recovering missing trabeculae: [Gao, Chen, et al., IPMI'13]

SLIDE 15

Morphological Analysis

Endocardial Surface [ISBI’14]

SLIDE 16

Follow-up Questions (Ongoing)

  • Validation on a specimen
  • Homology localization problem

[Figure panels: Ground Truth, Simulation, Bad Generator, Good Generator]

SLIDE 18

Topological Information as Constraints in Segmentation

[Chen et al., CVPR 2011]

[Figure panels: Input, Stencil 1, 2, 3, 4, Final]

[Jain, Chen, et al., Computers & Graphics, 2015]

SLIDE 19

Additional Application: Multi-Layer Stencil Creation

Canvas/wall result: Website, interactive

[Jain, Chen, et al., Computers & Graphics, 2015]

SLIDE 23

Topological Structures for High Dimensional Data

Much has been done: data-centric methods, simplicial complexes, Mapper, etc.

My focus: the density function.

◮ Need a good model: high-dimensional, flexible, computable
  – graphical model
◮ Locations that contribute to major topological events: critical points

SLIDE 25

Graphical Model

Markov Random Field (MRF)

D: dimension; values/labels $\mathcal{L} = \{1, \ldots, L\}$

Configurations/labelings: $\mathcal{X} = \mathcal{L}^D = \{1, \ldots, L\}^D$

[Figure: graph on nodes V1, ..., V8; trellis between x_i and x_j showing the binary potentials θij(xi, xj) with entries θij(0, 0), θij(0, 1), θij(1, 0), θij(1, 1)]

Energy: $E(x) = \sum_{(i,j) \in \mathcal{E}} \theta_{ij}(x_i, x_j)$

Probability: $P(x) = \exp(-E(x)) / Z$
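As a concrete illustration, the energy and probability can be evaluated by brute force on a tiny chain-structured MRF. This is a sketch, assuming D = 3 binary variables (labels written 0 and 1, as in the figure) and a hypothetical potential table; enumeration is only feasible because the domain is this small.

```python
import itertools
import math


def chain_energy(x, theta):
    """E(x) = sum over edges (i, i+1) of theta[i][x_i][x_{i+1}]."""
    return sum(theta[i][x[i]][x[i + 1]] for i in range(len(x) - 1))


def distribution(theta, D, L):
    """P(x) = exp(-E(x)) / Z, computed by enumerating all L**D configurations."""
    configs = list(itertools.product(range(L), repeat=D))
    weights = [math.exp(-chain_energy(x, theta)) for x in configs]
    Z = sum(weights)  # partition function
    return {x: w / Z for x, w in zip(configs, weights)}


# Hypothetical potentials: the same 2x2 table on both edges of a 3-node chain.
edge_table = [[0.0, 1.0],
              [1.0, 0.5]]
theta = [edge_table, edge_table]
P = distribution(theta, D=3, L=2)
most_probable = max(P, key=P.get)  # the minimum-energy configuration
```

The enumeration makes the challenge visible: the table `P` has L^D entries, so anything beyond toy sizes needs the structure of the graph.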

SLIDE 27

What can we do with a graphical model?

Previously:

  • Computing the maximum a posteriori (MAP): $\arg\max_{x \in \mathcal{X}} P(x) = \arg\min_{x \in \mathcal{X}} E(x)$
  • Marginals, sampling, etc.

[Figure: P(x) with the MAP and two modes marked]

New question: what about modes (local maxima)?

SLIDE 29

Why modes?

A concise description of the probabilistic landscape

Multiple predictions

◮ The model is not perfect; there is ambiguity
◮ Multiple hypotheses: diverse and highly probable

Other applications: biology, NLP

Previous work: mean-shift

SLIDE 34

Definitions

D: the dimension; $\mathcal{L} = \{1, \ldots, L\}$: the label set; $\mathcal{X} = \mathcal{L}^D$: the domain

Given a distance function d(·, ·) and a scalar δ:

◮ Neighborhood: $N_\delta(x) = \{x' \mid d(x, x') \le \delta\}$
◮ x is a mode if it has a higher probability than all its neighbors
◮ $M_\delta$: the set of all modes for a given scale δ

$$\mathcal{X} = M_0 \supseteq M_1 \supseteq \cdots \supseteq M_\infty = \{\text{global maximum (MAP)}\}$$

[Figure: modes of P(x) at scales δ = 0, 1, 4, 7]
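The definitions can be checked by brute force on a small example. This is a sketch, assuming Hamming distance for d(·, ·) and a hypothetical energy on a 4-node binary chain (number of label changes plus a small bias against label 1); a higher probability corresponds to a lower energy.

```python
import itertools


def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))


def modes(energy, D, L, delta):
    """All x whose energy is strictly lower than that of every other point in N_delta(x)."""
    configs = list(itertools.product(range(L), repeat=D))
    result = []
    for x in configs:
        ex = energy(x)
        if all(ex < energy(y) for y in configs
               if y != x and hamming(x, y) <= delta):
            result.append(x)
    return result


def energy(x):
    # Hypothetical energy: neighbouring labels prefer to agree; label 1 costs 0.1.
    cuts = sum(a != b for a, b in zip(x, x[1:]))
    return cuts + 0.1 * sum(x)


M0 = modes(energy, D=4, L=2, delta=0)   # every configuration
M1 = modes(energy, D=4, L=2, delta=1)   # two modes: all zeros and all ones
M4 = modes(energy, D=4, L=2, delta=4)   # only the global optimum
```

The nesting on the slide shows up directly: `M4` is contained in `M1`, which is contained in `M0`. The double loop over configurations is exactly the exponential domain and exponential neighborhood that a practical algorithm has to avoid.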

SLIDE 36

Problem

Problem (M-Modes)

Given a scale δ, compute the top M elements in $M_\delta$.

Challenge: exponential domain, exponential neighborhood

Contributions

Algorithms (chains, trees):

◮ Dynamic programming (DP)
◮ Heuristic search
◮ Local neighborhood search

Applications

[AISTATS 2013, NIPS 2014, IJCAI 2016, ICML 2016]

SLIDE 39

Algorithm: Chains

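[Figure: trellis over chain nodes 1, 2, 3, ..., D; each node i has one vertex per label, from x_1 = 0, x_1 = 1 to x_D = 0, x_D = 1]

Configurations/labelings = paths

MAP: the optimal path (dynamic programming), $O(DL^2)$

◮ from right to left
◮ each step: best energy for subchain [i, D] with a given label on i

M-Best: the best M configurations/labelings

$$x^1 = \arg\min_{x \in \mathcal{X}} E(x)$$

$$x^m = \arg\min_{x \in \mathcal{X} \setminus \{x^1, \ldots, x^{m-1}\}} E(x)$$

Nilsson'98 (fancy DP): $O(DL^2 + MDL + MD \log(MD))$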
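The right-to-left dynamic program for MAP can be sketched as follows. This is an illustrative min-sum implementation assuming pairwise potentials only, stored as `theta[i][a][b]` for edge (i, i+1) with 0-based indices; the potential values are arbitrary, and the M-Best extension of Nilsson'98 is not shown.

```python
import itertools


def chain_map(theta, L):
    """Return (minimum energy, argmin labeling) of a chain in O(D * L^2)."""
    D = len(theta) + 1
    best = [0.0] * L      # best[a]: min energy of subchain [i, D] given label a on node i
    choice = []           # back-pointers, filled from right to left
    for i in range(D - 2, -1, -1):
        new_best, arg = [], []
        for a in range(L):
            costs = [theta[i][a][b] + best[b] for b in range(L)]
            b_star = min(range(L), key=costs.__getitem__)
            new_best.append(costs[b_star])
            arg.append(b_star)
        best = new_best
        choice.append(arg)
    choice.reverse()      # choice[i][a]: best label of node i+1 given label a on node i
    a = min(range(L), key=best.__getitem__)
    energy, x = best[a], [a]
    for i in range(D - 1):
        a = choice[i][a]
        x.append(a)
    return energy, tuple(x)


# Arbitrary example potentials: D = 4 nodes, L = 3 labels.
theta = [[[((3 * i + 5 * a + 7 * b) % 11) / 10 for b in range(3)]
          for a in range(3)] for i in range(3)]
map_energy, map_labeling = chain_map(theta, L=3)

# Sanity check against exhaustive enumeration of all 3**4 paths.
brute_energy = min(sum(theta[i][x[i]][x[i + 1]] for i in range(3))
                   for x in itertools.product(range(3), repeat=4))
```

Each step touches L × L label pairs on one edge, which gives the $O(DL^2)$ cost quoted above.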

SLIDE 42

Algorithm: Modes on Chains [AISTATS 2013]

Key Idea

The whole chain [1, D] → subchains [i, j] of a fixed length

Global modes → local modes

A partial labeling $x_{i:j}$

[Figure: subchain from node i to node j]

$x_{i:j}$ is a local mode iff for any $y_{i:j} \ne x_{i:j}$ s.t. $y_i = x_i$, $y_j = x_j$: $E(x_{i:j}) < E(y_{i:j})$

Lemma

Any [i, j] has $L^2$ local modes, computable in polynomial time
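The lemma can be illustrated on a small subchain: for each of the L² endpoint label pairs, the local mode is the best labeling of the interior. The sketch below finds it by brute force for readability, which is an assumption of this example; a polynomial-time version would run a chain DP over the interior instead. Potentials are arbitrary, indices are 0-based, and ties between interior labelings are assumed not to occur.

```python
import itertools


def local_modes(theta, i, j, L):
    """Map each endpoint pair (x_i, x_j) to the minimum-energy labeling of subchain [i, j]."""
    result = {}
    for a, b in itertools.product(range(L), repeat=2):
        best = None
        for interior in itertools.product(range(L), repeat=j - i - 1):
            x = (a,) + interior + (b,)
            e = sum(theta[i + k][x[k]][x[k + 1]] for k in range(j - i))
            if best is None or e < best[0]:
                best = (e, x)
        result[(a, b)] = best[1]
    return result


# Arbitrary potentials on a 5-node binary subchain (delta = 3 interior nodes).
theta = [[[((3 * e + 5 * a + 7 * b) % 11) / 10 for b in range(2)]
          for a in range(2)] for e in range(4)]
subchain_modes = local_modes(theta, i=0, j=4, L=2)
```

The result has one entry per endpoint pair, so L² = 4 here, regardless of how long the interior is.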

SLIDE 43

Algorithm: Modes on Chains

Theorem (local-global)

x is a mode iff every length-(δ + 2) partial labeling $x_{i:j}$ is a local mode.

An example: D = 7, δ = 3

[Figure: chain on nodes 1, ..., 7 covered by the partial labelings $x_{1:5}$, $x_{2:6}$, $x_{3:7}$]

SLIDE 45

Algorithm: Modes on Chains

Intuition

◮ Combinations of local modes → global modes
◮ Consistent: agree at common vertices

[Figure: overlapping subchains [1, 5], [2, 6], [3, 7]]

Step 1: construct a new chain,

◮ supernodes [i, j]
◮ labels {local modes of [i, j]}
◮ feasible only if consistent
◮ preserve the energy of the original graph

Fact

New chain labeling space: $\tilde{\mathcal{X}} = M_\delta$

SLIDE 46

Algorithm: Modes on Chains

Step 1: construct a new chain,

◮ Configuration space: $\tilde{\mathcal{X}} = M_\delta$
◮ Energy: $\tilde{E}(\tilde{x}) = E(x)$

[Figure: overlapping subchains [1, 5], [2, 6], [3, 7]]

Step 2: M-Modes is reduced to M-Best in the new chain

◮ M-Best: compute the top M configurations
◮ Use Nilsson'98

Total complexity: $O(DL^3\delta + MDL^2 + MD \log(MD))$

SLIDE 47

Trees

Chains

[Figure: subchain from node i to node j]

  • Subchain of δ plus two adjacent nodes
  • Local modes ($L^2$)

Trees

  • Subtree of size δ plus all adjacent nodes
  • Local modes (exponential in the number of adjacent nodes)

Theorem (local-global)

x is a mode iff within any subchain/subtree it is a local mode.

Can extend to any graph!

SLIDE 49

General Situations

Extending the Algorithm

  • Trees (DP) [NIPS'14]
  • Systematic search [IJCAI'16]
  • Local neighborhood search [ICML'16]

Model Unknown

Input: samples

Algorithm:

◮ Step 1: estimate a tree distribution (Chow-Liu algorithm)
◮ Step 2: compute modes

Theoretical guarantee: $P(\hat{M}_\delta = M_\delta) \to 1$ as $S \to \infty$
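Step 1 can be sketched as follows: the Chow-Liu algorithm keeps the maximum-weight spanning tree under empirical pairwise mutual information. The synthetic samples (a noisy binary chain with 10% flips), the sample size, and the Kruskal-style tree construction are assumptions of this example; fitting the tree's conditional distributions and computing modes (Step 2) are not shown.

```python
import math
import random
from collections import Counter


def mutual_information(samples, i, j):
    """Empirical mutual information between coordinates i and j."""
    n = len(samples)
    joint = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    return sum(c / n * math.log(c * n / (pi[a] * pj[b]))
               for (a, b), c in joint.items())


def chow_liu_edges(samples):
    """Edges of the maximum mutual-information spanning tree (Kruskal with union-find)."""
    D = len(samples[0])
    scored = sorted(((mutual_information(samples, i, j), i, j)
                     for i in range(D) for j in range(i + 1, D)), reverse=True)
    parent = list(range(D))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:              # adding (i, j) does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return sorted(edges)


# Synthetic samples from a binary chain 0 - 1 - 2 - 3: each node copies its
# predecessor and flips with probability 0.1.
rng = random.Random(0)
samples = []
for _ in range(2000):
    x = [rng.randint(0, 1)]
    for _ in range(3):
        x.append(1 - x[-1] if rng.random() < 0.1 else x[-1])
    samples.append(tuple(x))

tree = chow_liu_edges(samples)
```

With enough samples the recovered edges match the chain that generated the data, which is the setting in which the guarantee above says the estimated modes coincide with the true ones.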

SLIDE 51

Application: Multiple Predictions

High probability; diversity

Image partitioning task (Berkeley, Stanford datasets)

[Figure panels: Ground Truth, 1st Mode, 2nd Mode, 3rd Mode]

[Plots: Stanford RI, Berkeley RI, Stanford VOI, Berkeley VOI]

SLIDE 52

Application: Video Analysis

Gesture recognition: [Chen et al., AISTATS]

Pic from [Liu, Chen et al., CVIU]

SLIDE 53

Clustering Discrete Data [ICML 2016]

Start from each data point and run local search until it stops at a mode.

Synthetic data: D = 110, L = 4, 4 clusters; randomly perturb 5% and 10% of the attributes

Visualized in 2D (using MDS)

[Figure panels: GT/Ours, ROCK, AP, k-modes]

Performance (in NMI)

SLIDE 54

Clustering Discrete Data [ICML 2016]

  • DNA barcoding data [Kuksa & Pavlovic, BMC Bioinformatics]
  • 600 to 900 dimensions
  • Alignment-free
  • Also UCI datasets

SLIDE 55

Application: User Interaction (Ongoing)

Electron Microscopy (EM) Images of Fly/Mouse Brains

Input: 2D or 3D EM images; boundary likelihood map

Output: partitioning of the image

[Figure panels: EM Images, Likelihood, Results]

Pic from Takemura et al., Nature'13

[Uzunbaş, Chen and Metaxas, MICCAI'14, MedIA'15]

SLIDE 56

Application: User Interaction (Ongoing)

Multiple proposals for user to select and modify

SLIDE 57

Conclusion

Topological structures: global structure/prior/information

  • Individual data/images
  • Whole dataset

◮ New perspective on the model: inference and more

Thank You! Questions?

SLIDE 58

Appendix

Convergence rate for mode estimation

d: dimension; L: label set size; n: sample size

SLIDE 59

Trees: Idea 1, Fancy DP [NIPS 2014]

Build a new tree

◮ Supernodes ← subtrees
◮ Labels ← local modes
◮ M-Best configurations ← M-Modes

Issue: the number of local modes can be exponential in the tree degree, even for small δ

SLIDE 61

Trees: Idea 1, Fancy DP [NIPS 2014]

Complexity: O(D2dLδ2(L + δ)(Ld + λd) + Dλ2 + MDλ + MD log(MD))

d: tree degree
λ: max # of local modes for any ball

In practice (bounded tree degree)

SLIDE 63

Trees: Idea 2, Heuristic Search [IJCAI 2016]

Compute all local modes → only compute when necessary

Heuristic search:

◮ For each state, verify whether one local pattern is a local mode
  ⋆ if not, prune the whole subtree
◮ Many states (and thus local modes) may never be reached
◮ A*, Depth-First Branch and Bound (DFBnB)

[Figure: search tree over V1, V2, V3 with states xxx; 0xx, 1xx; 00x, 01x, 10x, 11x]

Caveat:

◮ Not any cheaper in the worst-case scenario
◮ Needs the MAP computation

SLIDE 65

Trees: Idea 2, Heuristic Search [IJCAI 2016]

Also UCI datasets.

SLIDE 66

Trees: Idea 3, Local Search

Pic from Nowozin and Lampert

SLIDE 67

Trees: Idea 3, Local Search

Each step: compute the best neighbor in $N_\delta(y)$: $\arg\min_{z \in N_\delta(y) \setminus \{y\}} E(z)$

Complexity: $O(DdL\delta^2(L + \delta))$
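The local search loop can be sketched as follows, assuming Hamming distance and a hypothetical energy on a 4-node binary chain. The best neighbor is found here by enumerating the whole Hamming ball, which is exponential in δ; this sketch does not attempt the efficient per-step computation behind the stated complexity.

```python
import itertools


def neighbors(y, L, delta):
    """All z != y within Hamming distance delta of y."""
    D = len(y)
    for k in range(1, delta + 1):
        for positions in itertools.combinations(range(D), k):
            alternatives = [[l for l in range(L) if l != y[p]] for p in positions]
            for values in itertools.product(*alternatives):
                z = list(y)
                for p, v in zip(positions, values):
                    z[p] = v
                yield tuple(z)


def local_search(energy, y, L, delta):
    """Move to the best neighbor until no neighbor has strictly lower energy."""
    y = tuple(y)
    while True:
        best = min(neighbors(y, L, delta), key=energy)
        if energy(best) >= energy(y):
            return y          # y is at least as good as every neighbor: stop
        y = best


def energy(x):
    # Hypothetical energy: neighbouring labels prefer to agree; label 1 costs 0.1.
    cuts = sum(a != b for a, b in zip(x, x[1:]))
    return cuts + 0.1 * sum(x)


end_small_scale = local_search(energy, (0, 1, 1, 1), L=2, delta=1)
end_other_start = local_search(energy, (0, 0, 1, 0), L=2, delta=1)
end_large_scale = local_search(energy, (0, 1, 1, 1), L=2, delta=4)
```

Starting points that end at the same mode can be grouped together, which is how the clustering application uses this search. At a larger scale δ the same starting point can escape to a better mode.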
