Background & Motivation Shape Context Fast Matching
Shape Context Matching For Efficient OCR
Sudeep Pillai May 14, 2012
Sudeep Pillai Shape Context Matching For Efficient OCR
1 Background & Motivation
    Motivation
    Background
2 Shape Context
    What is a Shape Context?
    Matching Shape Contexts
    Similarity Measure
3 Fast Matching
    Dimensionality Reduction
    Matching Shape Contexts via Pyramid Matching
    Efficient Matching
- Automatic translation/transcription of handwritten/printed text
- Printed text has several geometric constraints that can be exploited for improved performance
- Prior work has pushed hard on accuracy, with much less attention to optimization
MNIST database performance

Digits are size-normalized and centered in a fixed-size image; 60,000 training examples, 10,000 test examples.

Classifier                          Preprocessing               Test error rate (%)
Linear classifiers
  Linear classifier (1-layer NN)    None                        12.0
  Pairwise linear classifier        Deskewing                   7.6
K-nearest neighbors
  K-NN, Euclidean (L2)              None                        3.09
  K-NN, L3                          Deskewing, noise removal    1.22
  K-NN, shape context matching      Shape context extraction    0.63
MNIST database performance (continued)

Classifier                                        Preprocessing    Test error rate (%)
SVMs
  SVM, Gaussian kernel                            None             1.4
  Virtual SVM, deg-9 poly, 2-pixel jittered       None             0.56
Neural nets
  Deep convex net, unsupervised pre-training      None             0.83
Convolutional nets
  Committee of 35 conv. nets                      Normalization    0.23
Figure: A few digits from the MNIST database
Definition (Shape): A shape is represented as a sequence of boundary points, $P = \{p_1, \ldots, p_n\}$, $p_i \in \mathbb{R}^2$.

Definition (Shape Context): The shape context of a point $p_i$ is a descriptor of that point: a histogram of the relative positions of the remaining points,
\[ h_i(k) = \#\{\, p_j : j \neq i,\ (p_j - p_i) \in \mathrm{bin}(k) \,\}, \]
in which the bins are uniformly divided in log-polar space.
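The definition above can be sketched in NumPy as follows. This is a minimal illustration, not the presenter's implementation: the function name `shape_contexts`, the 5 radial by 12 angular bin layout (chosen only because it yields 60 bins, matching the 60-dimensional histograms mentioned on later slides), and the radial limits `r_inner` and `r_outer` are all assumptions.

```python
import numpy as np

def shape_contexts(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Return an (n, n_r * n_theta) array; row i is the histogram h_i.

    Assumed layout: n_r log-spaced radial rings times n_theta angular sectors.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[None, :, :] - pts[:, None, :]        # diff[i, j] = p_j - p_i
    dist = np.hypot(diff[..., 0], diff[..., 1])
    off_diag = ~np.eye(n, dtype=bool)               # excludes j == i
    dist = dist / dist[off_diag].mean()             # normalize by mean pairwise distance
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Outer radius of each ring, log-spaced between r_inner and r_outer.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r)
    r_bin = np.searchsorted(r_edges, dist)          # value n_r means "beyond r_outer"
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    hist = np.zeros((n, n_r * n_theta))
    valid = off_diag & (r_bin < n_r)                # drop self-pairs and far points
    for i in range(n):
        k = r_bin[i, valid[i]] * n_theta + t_bin[i, valid[i]]
        hist[i] = np.bincount(k, minlength=n_r * n_theta)
    return hist
```

Because only differences `p_j - p_i` are binned and distances are divided by their mean, the resulting histograms do not change when the whole point set is shifted or uniformly rescaled.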
Figure: Graphical representation of shape context bins
Figure: Graphical representation of shape context histograms in $\mathbb{R}^{60}$
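The cost of matching point $p_i$ on the first shape to point $q_j$ on the second shape:
\[ C_{ij} = C(p_i, q_j) = \frac{1}{2} \sum_{k=1}^{K} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)} \]
Minimize the total matching cost over permutations $\pi$:
\[ \sum_i C(p_i, q_{\pi(i)}) \]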
Optimal matching: one way to solve this assignment problem is the Hungarian method, which runs in $O(n^3)$ time.
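The cost matrix and the minimization can be sketched as below. The function names are hypothetical. The exhaustive search over permutations is only a stand-in so the example stays dependency-free; it is not the Hungarian method and is usable only for a handful of points. In practice an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be applied to the same cost matrix.

```python
from itertools import permutations

import numpy as np

def chi2_cost_matrix(h_p, h_q, eps=1e-12):
    """C[i, j] = 0.5 * sum_k (h_p[i,k] - h_q[j,k])**2 / (h_p[i,k] + h_q[j,k])."""
    h_p = np.asarray(h_p, dtype=float)
    h_q = np.asarray(h_q, dtype=float)
    num = (h_p[:, None, :] - h_q[None, :, :]) ** 2
    den = h_p[:, None, :] + h_q[None, :, :]
    # Bins that are empty in both histograms contribute zero cost.
    terms = np.where(den > 0, num / np.maximum(den, eps), 0.0)
    return 0.5 * terms.sum(axis=2)

def brute_force_matching(cost):
    """Return (pi, total) minimizing sum_i cost[i, pi[i]] by exhaustive search.

    Stand-in for the Hungarian method; factorial time, tiny inputs only.
    """
    n = cost.shape[0]

    def total(pi):
        return sum(cost[i, pi[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), float(total(best))
```

For example, with histograms `[[1, 0, 2], [0, 3, 0]]` on the first shape and the same two histograms in swapped order on the second, the minimizing permutation pairs each point with its identical counterpart at zero total cost.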
- Invariant to translation and scale (as it is normalized by the mean distance of the $n^2$ point pairs)
- Can be made invariant to rotation (local tangent orientation)
- Tolerant to small affine distortion (log-polar, spatial blur proportional to $r$)
Definition: After aligning the two shapes with a cubic spline transformation $T$, their similarity can be measured via a weighted sum
\[ D = a D_{ac} + D_{sc} + b D_{be} \]
- $D_{sc}$: shape context distance
- $D_{ac}$: appearance cost
- $D_{be}$: bending energy, or transformation cost
- Approximate matching is possible with the full shape context feature
- A low-dimensional feature descriptor is desirable for performance
- With uniform bins, the accuracy of approximate matching declines as the feature dimension grows
- Multiple modalities are representable even with a reduced subspace
- Use Principal Component Analysis to determine the bases that define this shape context subspace
- Approximate matching can be performed faster once all $\mathbb{R}^{60}$ vectors are projected onto $\mathbb{R}^3$
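The projection step can be sketched with a plain SVD-based PCA. The helper names `fit_pca` and `project` are hypothetical, and the use of SVD on mean-centered data is one standard way to compute principal components, not necessarily how the presenter did it.

```python
import numpy as np

def fit_pca(histograms, n_components=3):
    """Learn a PCA subspace from an (N, 60) array of shape context histograms."""
    X = np.asarray(histograms, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(histograms, mean, basis):
    """Map (N, 60) histograms to (N, n_components) subspace coordinates."""
    return (np.asarray(histograms, dtype=float) - mean) @ basis.T
```

The basis would typically be fitted once on histograms pooled from a training set and then reused to project every query shape, so that all descriptors live in the same 3-D subspace.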
Figure: Projecting histograms of contour points onto the shape context subspace. The points on the human figure on the right are colored according to their 3-D shape context subspace feature values.
Figure: Visualization of the feature subspace constructed from shape context histograms for two different data sets. The RGB channels of each point on the contours are colored according to its histogram's 3-D PCA coefficient values. Set matching in this feature space means that contour points of similar color have a low matching cost, while highly contrasting colors incur a high matching cost.
The larger the subspace dimension $d$:
- the smaller the PCA reconstruction error
- the larger the distortion induced by the $L_1$ embedding
- the greater the complexity of computing the embedding

Do we really need an $\mathbb{R}^{60}$ feature vector to represent a shape?
- Shapes are almost never exactly alike
- Approximate measures make more sense
- Extract only the most discriminating dimensions as the descriptor
- $X$ and $Y$ are two sets of vectors in an $\mathbb{R}^d$ feature space
- Find an approximate correspondence between $X$ and $Y$
Construct a sequence of grids at resolutions $0, \ldots, L$, where the grid at resolution $l$ has $D = 2^{dl}$ cells.

Compute the histograms $H^l_X$ and $H^l_Y$, where:
- $H^l_X$ and $H^l_Y$ are the histograms of $X$ and $Y$ at resolution $l$
- $H^l_X(i)$ and $H^l_Y(i)$ are the numbers of points of $X$ and $Y$ in the $i$-th cell

Compute the number of matches at each resolution using:
\[ I(H^l_X, H^l_Y) = \sum_{i=1}^{D} \min\big(H^l_X(i),\, H^l_Y(i)\big) \]
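Sum the $I^l$ over all levels, giving more importance to the high resolutions (writing $I^l = I(H^l_X, H^l_Y)$):
\[ K(X, Y) = I^L + \sum_{l=0}^{L-1} \frac{1}{2^{L-l}} \left(I^l - I^{l+1}\right) = \frac{1}{2^L} I^0 + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}} I^l \]
where $I^l - I^{l+1}$ is the number of new matches found at level $l$.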
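The grid histograms, the intersection count and the weighted sum can be sketched together as below. This is an illustrative version with uniform grids only. It assumes the features have already been scaled into the unit cube $[0, 1)^d$, and the function names are hypothetical. Sparse `Counter` histograms are used so that the $2^{dl}$ cells are never allocated explicitly.

```python
from collections import Counter

import numpy as np

def grid_histogram(points, level):
    """Count points per cell of a uniform grid with 2**level cells per axis.

    Assumes every coordinate lies in [0, 1).
    """
    pts = np.asarray(points, dtype=float)
    side = 2 ** level
    cells = np.minimum((pts * side).astype(int), side - 1)
    return Counter(map(tuple, cells))

def intersection(hx, hy):
    """Histogram intersection: sum over cells of min(hx[cell], hy[cell])."""
    return sum(min(count, hy[cell]) for cell, count in hx.items())

def pyramid_match(X, Y, L):
    """K(X, Y) = I^0 / 2**L + sum_{l=1..L} I^l / 2**(L - l + 1)."""
    matches = [intersection(grid_histogram(X, l), grid_histogram(Y, l))
               for l in range(L + 1)]
    score = matches[0] / 2 ** L
    for l in range(1, L + 1):
        score += matches[l] / 2 ** (L - l + 1)
    return score
```

As a small check of the weighting: for `X = [[0.1, 0.1], [0.9, 0.9]]`, `Y = [[0.1, 0.1], [0.6, 0.6]]` and `L = 2`, the match counts are $I^0 = 2$, $I^1 = 2$, $I^2 = 1$, so $K = 2/4 + 2/4 + 1/2 = 1.5$. Each level costs time proportional to the number of points times $d$, which is where the $O(dmL)$ figure for the pyramid match comes from.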
Figure: The bins are concentrated on decomposing the space where features cluster, particularly for high-dimensional features (in this figure $\mathbb{R}^2$). Features are small points in red, bin centers are larger black points, and blue lines denote bin boundaries. The vocabulary-guided bins are irregularly shaped Voronoi cells.
Computing partial matching, for sets with $O(m)$ features in $\mathbb{R}^d$ and pyramids with $L$ levels:

Method                    Time complexity
Earth Mover's Distance    $O(d m^3 \log m)$
Hungarian method          $O(d m^3)$
Greedy matching           $O(d m^2 \log m)$
Pyramid match             $O(d m L)$
Figure: Interest points computed on image 1
Figure: Interest points computed on image 2
Figure: Find correspondences between interest points
Figure: Outlier removal via RANSAC (Random Sample Consensus)
- RANSAC gives an initial estimate of the affine transformation between the canonical set of points and the query points
- Use the affine transformation estimate to perform vocabulary-guided or geometrically guided searching/matching
- Could use MLESAC/PROSAC to perform probabilistic searching
- Constraints can be added to the pyramid matching scheme to reduce query time and improve robustness to partial matching
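The first point above, estimating an affine transformation from putative correspondences while rejecting outliers, can be sketched as a basic RANSAC loop. This is a generic textbook-style version under stated assumptions, not the presenter's code: the function names, the iteration count `n_iters`, the inlier tolerance `tol` and the fixed random seed are all illustrative choices.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2-D affine map A (shape 3x2) with dst ~= [src, 1] @ A."""
    src_h = np.hstack([src, np.ones((len(src), 1))])
    A, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return A

def ransac_affine(src, dst, n_iters=200, tol=0.05, seed=0):
    """Estimate an affine map from correspondences src[i] <-> dst[i]."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Three correspondences determine a 2-D affine map.
        sample = rng.choice(len(src), size=3, replace=False)
        A = fit_affine(src[sample], dst[sample])
        err = np.linalg.norm(src_h @ A - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("too few inliers to fit an affine transformation")
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

The returned inlier mask identifies which correspondences survive outlier removal, and the refitted matrix is the kind of initial estimate that could then guide the constrained matching described in the remaining points.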
- Investigated and implemented a shape descriptor invariant to rotation and scale
- Integrated an approximate matching scheme that has linear time complexity
- The scheme scales well as the database of descriptors grows
- Significant improvement in speed with little tradeoff in accuracy
- Source code available soon