SLIDE 1

Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification

Anjan Dutta, Pau Riba, Josep Lladós, Alicia Fornés

Computer Vision Center, Autonomous University of Barcelona

ICDAR, Kyoto, Japan, 13th November, 2017

SLIDE 2

Introduction Pyramidal Graph Representation Stochastic Graphlet Embedding Experimental Validation Conclusion

Outline

◮ Introduction
◮ Pyramidal Graph Representation
◮ Stochastic Graphlet Embedding
  • Stochastic Graphlets Sampling
  • Hashed Graphlets Distribution
◮ Experimental Validation
  • Datasets
  • Results
◮ Conclusions and Future Work

2 Pyramidal Stochastic Graphlet Embedding Dutta et al.

SLIDE 3

Introduction

SLIDE 4

Introduction

Document pattern classification

◮ Word and symbol classification.
◮ Applications: document feature generation, document categorization, spam filtering, etc.

[Figure: example classes. Symbols: armchair, bed, sink, door, table, tub, window, sofa, ... Words: orders, and, letters, weight, from, twelve, The, gift, ...]

SLIDE 5

Introduction

Graph based representation

◮ Limitations of statistical pattern recognition.
◮ Advantages of structural pattern recognition.
◮ Graph based representation: relation between object parts.
◮ Invariant to rotation and affine transformation.
◮ Comparing graphs: graph matching, graph kernel.

SLIDE 6

Introduction

Motivation

◮ Document part → graph ⇒ noisy conversion.
◮ Unstable representation.

SLIDE 7

Introduction

Contribution

◮ Graph pyramid: multi-scale graph, tolerates noise, stable representation.
◮ Stochastic graphlet embedding: avoids graph matching, allows application of machine learning techniques, captures low- to high-order graphlet statistics.

SLIDE 8

Pyramidal Graph Representation

SLIDE 9

Pyramidal Graph Representation

◮ Multi-scale graph: information at different resolutions.
◮ Higher-level graphs contain more abstract information.
◮ Graph pyramid construction techniques:
  1. Girvan-Newman
  2. grPartition

SLIDE 10

Girvan-Newman Algorithm

◮ Algorithm for graph clustering (Girvan and Newman, PNAS 2002).
◮ Basic principle:
  1. Compute the centrality (betweenness) of every edge.
  2. Remove the edge with the highest score.
  3. Recompute all scores.
  4. Repeat from step 2.
◮ Results in a dendrogram in which, at the finest level, each node is an independent cluster.
◮ The algorithm stops when the given number of clusters is reached.

Figure credit: S. Papadopoulos, CERTH-ITI, 2011.
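
The four steps on this slide can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names (`edge_betweenness`, `components`, `girvan_newman_clusters`) are hypothetical, the graph is assumed to be unweighted and undirected without self-loops, and edge betweenness is computed with a Brandes-style breadth-first search.

```python
from collections import deque


def edge_betweenness(adj):
    """Edge betweenness of an unweighted, undirected graph (Brandes-style)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths and remember predecessors.
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path shares per edge.
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_clusters(edges, n_clusters):
    """Remove highest-betweenness edges until n_clusters components exist."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    clusters = components(adj)
    while len(clusters) < n_clusters:
        score = edge_betweenness(adj)          # steps 1 and 3: (re)compute scores
        if not score:                          # no edges left to remove
            break
        u, v = max(score, key=score.get)       # step 2: edge with highest score
        adj[u].discard(v)
        adj[v].discard(u)
        clusters = components(adj)             # step 4: loop until enough clusters
    return clusters
```

For example, on two triangles joined by a single bridge edge, asking for two clusters removes the bridge first and returns the two triangles.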

SLIDE 11

Pyramid Generation

◮ Pyramid construction: at a higher level, each cluster is represented as a node.
◮ Hierarchical edges: connect clustered nodes to their representative in the higher level.
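
A minimal sketch of one pyramid step, under stated assumptions: the function name `build_next_level` is hypothetical, clusters are given as a list of node sets, and two higher-level nodes are linked whenever at least one base-level edge connects their clusters. That linking rule is an assumption made for illustration; the slide does not specify it.

```python
def build_next_level(edges, clusters):
    """Contract each cluster into one node and record hierarchical edges.

    Returns (upper_edges, hierarchical_edges), where a hierarchical edge
    (node, idx) links a base-level node to the index of its cluster,
    which acts as its representative node in the higher level.
    """
    # Map every base-level node to the index of its cluster.
    representative = {}
    for idx, cluster in enumerate(clusters):
        for node in cluster:
            representative[node] = idx

    # Assumed rule: link two higher-level nodes if any base edge joins them.
    upper_edges = set()
    for u, v in edges:
        a, b = representative[u], representative[v]
        if a != b:
            upper_edges.add((min(a, b), max(a, b)))

    hierarchical_edges = sorted(representative.items())
    return sorted(upper_edges), hierarchical_edges
```

Applying the function again to its own output (with a fresh clustering of the higher-level graph) would yield the next level of the pyramid.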

SLIDE 12

Stochastic Graphlet Embedding

SLIDE 13

Stochastic Graphlets Sampling

◮ Graphlet sampling is a stochastic and recurrent procedure.
◮ It is controlled by two parameters, M and T.
◮ Basic principles:
  1. Randomly select a node v from the graph G.
  2. Add the node v to an initially empty graphlet.
  3. Recursively add T connected edges to the graphlet.
  4. Restart from step 1, M times in total.
◮ Animation: M = 10, T = 6.
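
The sampling steps above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the name `sample_graphlets` and the `seed` parameter are hypothetical, the graph is assumed to be an adjacency dictionary with sortable node labels, and each new edge is assumed to be drawn uniformly from the unused edges touching the current graphlet. A walk stops early if no such edge remains.

```python
import random


def sample_graphlets(adj, M, T, seed=None):
    """Sample up to M * T connected graphlets by restarted random growth."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    graphlets = []
    for _ in range(M):                       # step 4: M restarts
        v = rng.choice(nodes)                # step 1: random start node
        g_nodes, g_edges = {v}, set()        # step 2: graphlet holding only v
        for _ in range(T):                   # step 3: grow by one edge at a time
            frontier = sorted(
                {tuple(sorted((u, w))) for u in g_nodes for w in adj[u]} - g_edges
            )
            if not frontier:                 # nothing left to add
                break
            edge = rng.choice(frontier)
            g_edges.add(edge)
            g_nodes.update(edge)
            # Every intermediate size (1..T edges) is kept as its own graphlet.
            graphlets.append(sorted(g_edges))
    return graphlets
```

With M = 10 and T = 6 on a connected graph that has at least six edges, this produces 60 graphlets whose sizes cycle through 1 to 6 edges.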

SLIDE 14

Stochastic Graphlets Sampling

◮ A random walk process with restart.
◮ Samples M × T connected graphlets, with the number of edges varying from 1 to T.
◮ Hypothesis: the empirical distribution of a large number of sampled graphlets will be close to the actual distribution.
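
One possible way to write the hypothesis down, using notation that is assumed here and does not appear in the slides: g_1, ..., g_N are the sampled graphlets (N = M × T), τ(g) is the structural type of a graphlet, and P(c) is the actual proportion of graphlets of type c in the graph.

```latex
\hat{P}_N(c) \;=\; \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\!\left[ \tau(g_i) = c \right],
\qquad
\hat{P}_N(c) \;\approx\; P(c) \quad \text{for large } N .
```

The slides present this as a working hypothesis, so the approximation should be read as an expectation rather than a proven guarantee.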

SLIDE 15

Hashed Graphlets Distribution

◮ Graph hash functions:
  1. Degree of nodes
  2. Betweenness centrality
  3. Core numbers
  4. Clustering coefficients
◮ Probability of collision (Dutta and Sahbi, arXiv 2017).
◮ Hash functions with low probability of collision: degree of nodes, betweenness centrality.
◮ Hash function =
    degree of nodes,          if t ≤ 4
    betweenness centrality,   otherwise
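
A hedged sketch of the piecewise hash and the resulting distribution. Several details are assumptions made for illustration: t is taken to be the number of edges of a graphlet, the hash code is the sorted tuple of per-node values, t itself is included in the code so that graphlets of different sizes never share a bin, and all function names are hypothetical.

```python
from collections import Counter, deque


def degree_signature(edges):
    """Sorted node degrees of a graphlet given as an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return tuple(sorted(deg.values()))


def betweenness_signature(edges):
    """Sorted (unnormalised) node betweenness values, Brandes-style."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        dist, sigma, preds = {s: 0}, {s: 1.0}, {s: []}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is visited from both endpoints, hence the halving.
    return tuple(sorted(round(b / 2, 6) for b in bc.values()))


def graphlet_hash(edges):
    """Degree-based code for t <= 4 edges, betweenness-based code otherwise."""
    t = len(edges)
    if t <= 4:
        return (t, degree_signature(edges))
    return (t, betweenness_signature(edges))


def hashed_distribution(graphlets):
    """Empirical distribution of hash codes over the sampled graphlets."""
    counts = Counter(graphlet_hash(g) for g in graphlets)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}
```

Because the code depends only on sorted per-node values, two graphlets that differ just by node labels receive the same hash. Distinct structures can still collide, which is why the slide compares collision probabilities.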

SLIDE 16

Pyramidal Stochastic Graphlet Embedding

Summary

SLIDE 17

Experimental Validation

SLIDE 18

Datasets

HistoGraph

◮ Perfectly segmented word images from the George Washington (GW) dataset.
◮ 30 different words and six different representations.
◮ Three independent subsets: training (90 words), validation (60 words) and test (143 words).
◮ Frequency per word: training and validation sets (2 to 3), test set (3 to 5).

Figure credit: Stauffer et al., S+SSPR 2016.

SLIDE 19

Results

HistoGraph

Subset       Acc. GED   Acc. SGE   Acc. PSGE (Level 2)   Acc. PSGE (Level 3)
Keypoint     77.62      78.32      80.42 (+2.10)         78.32 (+0.00)
Grid-NNA     65.03      72.73      72.73 (+0.00)         74.13 (+1.40)
Grid-MST     74.13      76.92      75.52 (-1.40)         74.83 (-2.09)
Grid-DEL     62.94      74.83      79.02 (+4.19)         79.02 (+4.19)
Projection   81.82      79.02      79.72 (+0.70)         80.42 (+1.40)
Split        80.42      77.62      80.42 (+2.80)         77.62 (+0.00)

SLIDE 20

Datasets

GREC

◮ Graphs representing symbols from architectural and electronic drawings.
◮ 22 different classes and five different distortion levels.
◮ Preprocessing applied to clean the images and convert them to graphs.
◮ Three independent subsets: training and validation (286 symbols each), test (528 symbols).
◮ Frequency per class: training and validation sets (13), test set (24).

Figure credit: Riesen and Bunke, SSPR 2008.

SLIDE 21

Results

GREC

Method                                                 Unlabelled      Labelled
Dissimilarity Embedding (Bunke and Riesen, PR 2010)    -               95.10
Node Attribute Statistics (Gibert et al., PR 2012)     -               99.20
Fuzzy Graph Embedding (Luqman et al., PR 2013)         -               97.30
SGE (Dutta and Sahbi, arXiv 2017)                      92.80           99.62
PSGE (Level 2 / Level 3)                               93.18 (+0.38)   99.62 (+0.00) / 99.81 (+0.19)

SLIDE 22

Conclusions and Future Work

SLIDE 23

Conclusions and Future Work

◮ Proposal of pyramidal stochastic graphlet embedding.
◮ Pyramidal representation of graph tolerates noise and distortion.
◮ SGE samples low- to high-order graphlets, providing robust structural statistics.
◮ Consideration of hierarchical edges as a future line of work.

SLIDE 24

Thanks for your attention! Questions?

Anjan Dutta, PhD
Marie-Curie Postdoctoral Fellow
Computer Vision Center
Autonomous University of Barcelona
Email: adutta@cvc.uab.es
