Shape Context Matching For Efficient OCR, Sudeep Pillai, May 14, 2012



SLIDE 1

Background & Motivation Shape Context Fast Matching

Shape Context Matching For Efficient OCR

Sudeep Pillai May 14, 2012

Sudeep Pillai Shape Context Matching For Efficient OCR

SLIDE 2

Table of contents

1 Background & Motivation
    Motivation
    Background

2 Shape Context
    What is a Shape Context?
    Matching Shape Contexts
    Similarity Measure

3 Fast Matching
    Dimensionality Reduction
    Matching Shape Contexts via Pyramid Matching
    Efficient Matching

SLIDE 3

Motivation

Automatic translation/transcription of handwritten and printed text
Printed text has several geometric constraints that can be exploited for improved performance
Research has pushed hard on accuracy, with much less attention paid to computational efficiency

SLIDE 4

Optical Character Recognition

MNIST database performance

Digits are size-normalized and centered in a fixed-size image
60,000 training examples, 10,000 test examples

Classifier                          Preprocessing               Test error rate (%)
Linear classifiers
  Linear classifier (1-layer NN)    None                        12.0
  Pairwise linear classifier        Deskewing                   7.6
K-nearest neighbors
  K-NN, Euclidean (L2)              None                        3.09
  K-NN, L3                          Deskewing, noise removal    1.22
  K-NN, shape context matching      Shape context extraction    0.63

SLIDE 5

Optical Character Recognition

MNIST database performance

Digits are size-normalized and centered in a fixed-size image
60,000 training examples, 10,000 test examples

Classifier                                      Preprocessing    Test error rate (%)
SVMs
  SVM, Gaussian kernel                          None             1.4
  Virtual SVM, deg-9 poly, 2-pixel jittered     None             0.56
Neural nets
  Deep convex net, unsupervised pre-training    None             0.83
Convolutional nets
  Committee of 35 conv. nets                    Normalization    0.23

SLIDE 6

Optical Character Recognition

Figure: A few digits from the MNIST database


SLIDE 8

What is a Shape Context?

Definition (Shape): A shape is represented as a sequence of boundary points $P = \{p_1, \dots, p_n\}$, $p_i \in \mathbb{R}^2$.

Definition (Shape Context): The shape context is a descriptor of an interest point $p_i$, i.e. a histogram of the positions of the remaining points relative to $p_i$,

$$h_i(k) = \#\{\, p_j : j \neq i,\ p_j - p_i \in \mathrm{bin}(k) \,\},$$

in which the bins are uniformly divided in log-polar space.
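The definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's implementation: the function name `shape_context`, the choice of 5 radial by 12 angular bins (which gives 60 bins), and the radius limits `r_min` and `r_max` are all assumptions made for the example. Radii are divided by the mean pairwise distance, and the input is assumed to contain at least two distinct points.

```python
import math

def shape_context(points, i, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of where the other points lie relative to points[i].

    Bin counts and radius limits are illustrative assumptions; 5 x 12 gives 60 bins.
    """
    n = len(points)
    # Mean distance over all n^2 ordered point pairs (self-pairs contribute 0).
    mean_dist = sum(math.dist(p, q) for p in points for q in points) / (n * n)
    log_lo, log_hi = math.log(r_min), math.log(r_max)
    hist = [0] * (n_r * n_theta)
    xi, yi = points[i]
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue
        dx, dy = xj - xi, yj - yi
        # Normalized radius, clamped from below so that log() is defined.
        r = max(math.hypot(dx, dy) / mean_dist, r_min)
        # Radial bin is uniform in log(r); radii beyond r_max go to the last bin.
        r_bin = min(n_r - 1, int((math.log(r) - log_lo) / (log_hi - log_lo) * n_r))
        # Angular bin is uniform in [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        t_bin = min(n_theta - 1, int(theta / (2 * math.pi) * n_theta))
        hist[r_bin * n_theta + t_bin] += 1
    return hist

pts = [(0, 0), (2, 1), (3, 4), (1, 5), (-2, 3)]
moved = [(2 * x + 10, 2 * y - 4) for x, y in pts]  # scaled by 2, then translated
h = shape_context(pts, 0)
h_moved = shape_context(moved, 0)
```

Because only relative positions and normalized radii are used, `h` and `h_moved` describe the same point of the same shape even though the second copy is scaled and shifted.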

SLIDE 9

Shape Context Representation

Figure: Graphical representation of shape context bins

SLIDE 10

Shape Context Histogram

Figure: Graphical representation of shape context histograms in $\mathbb{R}^{60}$

SLIDE 11

Matching Shape Contexts

The cost of matching point $p_i$ on the first shape to point $q_j$ on the second shape (chi-square distance):

$$C_{ij} = \frac{1}{2} \sum_{k=1}^{K} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)}$$

Minimize the total matching cost over permutations $\pi$:

$$\min_{\pi} \sum_{i} C(p_i, q_{\pi(i)})$$

Optimal matching: one way to solve this assignment problem is the Hungarian method, which runs in $O(n^3)$ time.
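To make the cost and the minimization concrete, the sketch below computes the chi-square cost matrix between two small sets of histograms and then finds the cheapest one-to-one assignment. Note the substitution: instead of the Hungarian method it uses an exhaustive search over permutations, which is only practical for a handful of points but returns the same optimum. The function names (`chi2_cost`, `cost_matrix`, `best_assignment`) and the toy histograms are hypothetical.

```python
from itertools import permutations

def chi2_cost(h_i, h_j):
    """Chi-square distance between two histograms; empty bin pairs contribute 0."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h_i, h_j) if a + b > 0)

def cost_matrix(H_p, H_q):
    """C[i][j] is the cost of matching point i of the first shape to point j of the second."""
    return [[chi2_cost(hp, hq) for hq in H_q] for hp in H_p]

def best_assignment(C):
    """Exhaustive search over all permutations: O(n!), a stand-in for the Hungarian method."""
    n = len(C)
    best = min(permutations(range(n)),
               key=lambda pi: sum(C[i][pi[i]] for i in range(n)))
    return list(best), sum(C[i][best[i]] for i in range(n))

# Toy data: the second shape carries the same histograms in a different order.
H_p = [[4, 0, 1], [0, 3, 2], [1, 1, 3]]
H_q = [H_p[2], H_p[0], H_p[1]]
pi, total = best_assignment(cost_matrix(H_p, H_q))
```

For realistic point counts a polynomial-time solver is needed; SciPy's `scipy.optimize.linear_sum_assignment` is one existing option that accepts such a cost matrix.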

SLIDE 12

Properties of shape contexts

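Invariant to translation, because positions are measured relative to the reference point
Invariant to scale, because radial distances are normalized by the mean distance between the $n^2$ point pairs
Can be made invariant to rotation (by measuring angles relative to the local tangent orientation)
Tolerant to small affine distortions (log-polar bins, with spatial blur proportional to r)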

SLIDE 13

Similarity Measure

Definition: After aligning the two shapes with a cubic spline transformation T, their similarity can be measured via a weighted sum

$$D = a D_{ac} + D_{sc} + b D_{be}$$

where
$D_{sc}$ is the shape context distance
$D_{ac}$ is the appearance cost
$D_{be}$ is the bending energy, or transformation cost

SLIDE 14

Dimensionality Reduction

Approximate matching is possible with the full shape context feature
A low-dimensional feature descriptor is desirable for performance purposes
    Uniform bin approximation will make matching accuracy decline with feature dimension d2
    Multiple modalities are representable even with a reduced subspace
Use Principal Components Analysis to determine the bases that define this shape context subspace
Approximate matching can be performed faster once all $\mathbb{R}^{60}$ vectors are projected onto $\mathbb{R}^{3}$
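The projection step can be sketched without any numerical library. The code below is an assumption-laden illustration, not the method used in the talk: it estimates the top k principal directions by power iteration with deflation (a simple substitute for a full eigendecomposition) and then projects a vector onto them. The names `pca_basis` and `project` are hypothetical, and the toy data is 3-D rather than 60-D to keep the example small.

```python
import random

def pca_basis(vectors, k, iters=100, seed=0):
    """Return (mean, basis) where basis holds k approximate principal directions."""
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    X = [[v[j] - mean[j] for j in range(d)] for v in vectors]  # centered data
    basis = []
    for _ in range(k):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            # Deflation: remove the components along directions already found.
            for b in basis:
                dot = sum(wj * bj for wj, bj in zip(w, b))
                w = [wj - dot * bj for wj, bj in zip(w, b)]
            # Multiply by the scatter matrix without forming it: w <- X^T (X w).
            scores = [sum(xj * wj for xj, wj in zip(x, w)) for x in X]
            w = [sum(s * x[j] for s, x in zip(scores, X)) for j in range(d)]
            norm = sum(wj * wj for wj in w) ** 0.5
            if norm < 1e-12:  # no variance left in the remaining subspace
                w = [0.0] * d
                break
            w = [wj / norm for wj in w]
        basis.append(w)
    return mean, basis

def project(v, mean, basis):
    """Coordinates of v in the subspace spanned by the basis vectors."""
    return [sum((vj - mj) * bj for vj, mj, bj in zip(v, mean, b)) for b in basis]

# Toy data: points that vary mostly along the direction (1, 2, 0).
data = [[t, 2 * t, 0.1 * (-1) ** t] for t in range(10)]
mean, basis = pca_basis(data, 2)
coords = project(data[0], mean, basis)
```

In the setting of the slides, `vectors` would be the 60-bin histograms and `k` would be 3, so each contour point is reduced to three coefficients before matching.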

SLIDE 15

Dimensionality Reduction

Figure: Projecting histograms of contour points onto the shape context subspace. The points on the human figure on the right are colored according to their 3-D shape context subspace feature values.

SLIDE 16

Dimensionality Reduction

Figure: Visualization of the feature subspace constructed from shape context histograms for two different data sets. The RGB channels of each point on the contours are colored according to its histogram's 3-D PCA coefficient values. Set matching in this feature space means that contour points of similar color have a low matching cost, while highly contrasting colors incur a high matching cost.

SLIDE 17

Dimensionality Reduction Tradeoffs

The larger d is:
    the smaller the PCA reconstruction error
    the larger the distortion induced by the $L_1$ embedding
    the larger the cost of computing the embedding

Do we really need an $\mathbb{R}^{60}$ feature vector to represent a shape?
    Shapes are almost never similar
    Approximate measures make more sense
    Extract only the most discriminating dimensions as the descriptor

SLIDE 18

Pyramid Matching

X and Y are two sets of vectors in an $\mathbb{R}^d$ feature space
Find an approximate correspondence between X and Y

SLIDE 19

Pyramid Matching Overview

SLIDE 20

Pyramid Matching Kernels

Construct a sequence of grids at resolutions $0, \dots, L$, where the grid at resolution $l$ has $D = 2^{dl}$ cells.

Compute the histograms $H_X^l$ and $H_Y^l$, where
    $H_X^l$ and $H_Y^l$ are the histograms of X and Y at resolution $l$
    $H_X^l(i)$ and $H_Y^l(i)$ are the numbers of points of X and Y in the $i$-th cell

Compute the number of matches at each resolution using the histogram intersection:

$$I(H_X^l, H_Y^l) = \sum_{i=1}^{D} \min\big(H_X^l(i),\, H_Y^l(i)\big)$$
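The grid histograms and their intersection can be sketched as follows. This is an illustrative example under stated assumptions: feature coordinates are taken to lie in [0, 1), the grid at level l has 2^l cells along each axis, and the helper names `grid_histogram` and `intersection` are hypothetical. Sparse counters are used so that empty cells cost nothing.

```python
from collections import Counter

def grid_histogram(points, level):
    """Count points per cell of a uniform grid with 2**level cells along each axis.

    Assumes every coordinate lies in [0, 1).
    """
    side = 2 ** level
    return Counter(tuple(min(side - 1, int(c * side)) for c in p) for p in points)

def intersection(h_x, h_y):
    """Histogram intersection: sum over cells of the smaller of the two counts."""
    return sum(min(count, h_y[cell]) for cell, count in h_x.items())

X = [(0.1, 0.1), (0.6, 0.6), (0.9, 0.2)]
Y = [(0.2, 0.2), (0.7, 0.7), (0.1, 0.9)]
matches = [intersection(grid_histogram(X, l), grid_histogram(Y, l)) for l in range(4)]
# matches == [3, 2, 2, 0]: every point matches in the single coarse cell,
# and fewer pairs still share a cell as the grid becomes finer.
```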

SLIDE 21

Pyramid Matching Kernels

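Sum all the $I^l = I(H_X^l, H_Y^l)$, giving more importance to the matches found at high resolution:

$$K(X, Y) = I^L + \sum_{l=0}^{L-1} \frac{1}{2^{L-l}} \left(I^l - I^{l+1}\right) = \frac{1}{2^L} I^0 + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}} I^l$$

where $I^l - I^{l+1}$ is the number of new matches found at level $l$.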
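As a quick check of the kernel, the sketch below evaluates both forms of the sum on a small list of per-level match counts and confirms that they agree. The function names are hypothetical and the counts in `I` are made-up illustrative values, with `I[0]` the coarsest level and `I[L]` the finest.

```python
def pyramid_match_kernel(I):
    """Closed form: I[0] / 2^L plus the sum over l = 1..L of I[l] / 2^(L - l + 1)."""
    L = len(I) - 1
    return I[0] / 2 ** L + sum(I[l] / 2 ** (L - l + 1) for l in range(1, L + 1))

def pyramid_match_kernel_new_matches(I):
    """Equivalent form: finest-level matches plus down-weighted new matches per level."""
    L = len(I) - 1
    return I[L] + sum((I[l] - I[l + 1]) / 2 ** (L - l) for l in range(L))

I = [3, 2, 2, 0]  # illustrative match counts for levels 0..3
k_closed = pyramid_match_kernel(I)
k_new = pyramid_match_kernel_new_matches(I)
# Both evaluate to 3/8 + 2/8 + 2/4 + 0/2 = 1.125
```

A match first found at a coarse level contributes less than one found in a fine cell, which is how the kernel rewards features that are close together.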

SLIDE 22

Pyramid Matching (l = 0)

SLIDE 23

Pyramid Matching (l = 1)

SLIDE 24

Pyramid Matching (l = 2)

SLIDE 25

Pyramid Matching

SLIDE 26

Comparison with Optimal Matching

SLIDE 27

Vocabulary-guided Matching

Figure: The bins are concentrated on decomposing the space where features cluster, particularly for high-dimensional features (in this figure $\mathbb{R}^2$). Features are small points in red, bin centers are larger black points, and blue lines denote bin boundaries. The vocabulary-guided bins are irregularly shaped Voronoi cells.

SLIDE 28

Performance

Computing partial matching

Method                    Time complexity
Earth Mover's Distance    $O(dm^3 \log m)$
Hungarian method          $O(dm^3)$
Greedy matching           $O(dm^2 \log m)$
Pyramid match             $O(dmL)$

for sets with $O(m)$ features in $\mathbb{R}^d$ and pyramids with $L$ levels

SLIDE 29

Affine Constraints - RANSAC

Figure: Interest points computed on image 1

Figure: Interest points computed on image 2

SLIDE 30

Affine Constraints - RANSAC

Figure: Find correspondences between interest points

SLIDE 31

Affine Constraints - RANSAC

Figure: Outlier removal via RANSAC (Random Sample Consensus)

SLIDE 32

Additional improvements

RANSAC gives an initial estimate of the affine transformation between the canonical set of points and the query points
Use the affine transformation estimate to perform vocabulary-guided or geometrically guided searching and matching
MLESAC or PROSAC could be used to perform probabilistic searching
Constraints can be added to the pyramid matching scheme to reduce query time and improve robustness to partial matching
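The RANSAC step mentioned above can be sketched in plain Python. This is a minimal illustration under several assumptions, not the presenter's code: correspondences are given as index-aligned lists `src` and `dst`, each hypothesis is an exact affine fit to three randomly sampled pairs (solved with Cramer's rule), and the iteration count, tolerance, seed, and all function names are hypothetical.

```python
import math
import random

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule; None if (near) singular."""
    d = det3(A)
    if abs(d) < 1e-12:
        return None
    out = []
    for col in range(3):
        M = [row[:] for row in A]
        for r in range(3):
            M[r][col] = b[r]
        out.append(det3(M) / d)
    return out

def fit_affine(src, dst):
    """Exact affine map (x, y) -> (a*x + b*y + c, d*x + e*y + f) from three pairs."""
    A = [[x, y, 1.0] for x, y in src]
    row_u = solve3(A, [u for u, _ in dst])
    row_v = solve3(A, [v for _, v in dst])
    return None if row_u is None else (row_u, row_v)

def apply_affine(T, p):
    (a, b, c), (d, e, f) = T
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)

def ransac_affine(src, dst, iters=200, tol=0.05, seed=0):
    """Keep the sampled affine model that agrees with the most correspondences."""
    rng = random.Random(seed)
    idx = range(len(src))
    best_T, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(idx, 3)
        T = fit_affine([src[i] for i in sample], [dst[i] for i in sample])
        if T is None:  # collinear sample, no unique affine map
            continue
        inliers = [i for i in idx if math.dist(apply_affine(T, src[i]), dst[i]) <= tol]
        if len(inliers) > len(best_inliers):
            best_T, best_inliers = T, inliers
    return best_T, best_inliers

# Six correspondences follow (x, y) -> (2x - y + 1, x + 2y - 3); the last two are outliers.
src = [(0, 0), (1, 0), (0, 1), (2, 1), (1, 3), (3, 2), (4, 4), (2, 5)]
dst = [(1, -3), (3, -2), (0, -1), (4, 1), (0, 4), (5, 4), (20, -7), (-9, 15)]
T, inliers = ransac_affine(src, dst)
```

In practice the winning model would usually be refit by least squares on all of its inliers; that refinement is omitted here to keep the sketch short.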

SLIDE 33

Conclusions

Investigated and implemented a shape descriptor invariant to rotation and scale
Integrated an approximate matching scheme with linear time complexity
The scheme scales well as the database of descriptors grows
Significant improvement in speed with little loss of accuracy
Source code available soon

SLIDE 34

Conclusions

Thanks!
