Spatial Coordinate Coding To Reduce Histogram Representations, Dominant Angle And Colour Pyramid Match




SLIDE 1

Spatial Coordinate Coding To Reduce Histogram Representations, Dominant Angle And Colour Pyramid Match

  • P. Koniusz, K. Mikolajczyk

CVSSP, University of Surrey, UK {P.Koniusz, K.Mikolajczyk}@surrey.ac.uk

September 11, 2011

  • P. Koniusz, K. Mikolajczyk (CVSSP)

Spatial Coordinate Coding September 11, 2011 1 / 13

SLIDE 2

Introduction

Recognition approach (Bag of Words)

  • 1. Feature extraction: detect key-points, compute descriptors
  • 2. Visual vocabulary: cluster descriptors
  • 3. Mid-level features: build histograms, average or max pooling
  • 4. Classification: Kernel + SVM or KDA

[Figure: pipeline diagram of steps 1 to 4; codeword frequency histograms are pooled per cell (pool|LX0,LY0 … pool|LX2,LY2) over spatial pyramid match levels L0, L1, L2.]

  • Spatial Pyramid Match [S. Lazebnik, 2006] is at the heart of modern object category recognition, where it exploits spatial bias in images
  • Mid-level feature representations result from mapping low-level features (e.g. descriptors) to a given vocabulary space
  • Increasing the number of quantisation levels results in extremely long histogram vectors of 200K or more elements
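
The growth in vector length can be illustrated with a back-of-the-envelope count. The sketch below assumes the standard pyramid layout of Lazebnik et al., in which level l splits the image into 2^l x 2^l cells and every cell holds one histogram over the whole vocabulary. The function name `spm_length` and the vocabulary sizes are illustrative choices, not values taken from the slides.

```python
def spm_length(vocabulary_size: int, levels: int) -> int:
    """Length of a concatenated spatial-pyramid histogram.

    Assumes level l has 2**l x 2**l cells (4**l in total) and that each
    cell contributes one histogram with `vocabulary_size` bins.
    """
    cells = sum(4 ** level for level in range(levels))
    return vocabulary_size * cells


if __name__ == "__main__":
    # Hypothetical vocabulary sizes, chosen only to show the scaling.
    for k in (1000, 4000, 10000):
        print(k, spm_length(k, levels=3))
```

Under this assumption three levels give 1 + 4 + 16 = 21 cells, so a 10,000-word vocabulary already yields a 210,000-element vector.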

SLIDE 3

Introduction

Aim

  • To propose a new joint appearance and spatial representation
  • To reduce the resulting vector sizes, and therefore both computational and memory requirements
  • To investigate which pooling modalities (spatial, dominant angle, scale, colour bias) benefit from multiple levels of quantisation

Bias in images (Spatial Pyramid Match)

[Figure: example image partitioned into spatial pyramid cells; each cell is labelled with the objects it contains (sky, tree, ship, grass).]

Coordinate set $X_s$ of an object $s$ introduces spatial bias: $p(s \mid x) \ge p(s)$ for $x \in X_s$

SLIDE 4

Introduction

Bias in images (Dominant Edge Orientation)

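[Figure: example images with regions labelled fence and trunk.]

Trunks $t$ remain largely vertical, i.e. within an orientation set $\Theta_t$: $p(t \mid \theta) \ge p(t)$ if $\theta \in \Theta_t$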

Bias in images (Dominant Colours)

[Figure: example image with regions labelled sky and trees.]

Foliage $f$ is of a limited colour set $C_f$, thus $p(f \mid c) \ge p(f)$ if $c \in C_f$

SLIDE 5

Spatial Coordinate Coding for Soft Assignment

Descriptor to mid-level features mapping:

$$h_n = f(x_n), \quad n = 1, ..., N$$

  • $x_n \in X$ - image descriptors
  • $h_n$ - mid-level features

Mid-level features are Component Membership Probabilities of GMM:

$$h_{nk} = p(m_k \mid x_n) = \frac{g(x_n; m_k, \sigma)}{\sum_{k'=1}^{K} g(x_n; m_{k'}, \sigma)}$$

  • $m_k \in M$ - visual words
  • $\sigma$ - model parameter

Average (or maximum) pooling operation is performed on columns of matrix $H_{N \times K}$

We assume independence of visual appearance and spatial bias and code both modalities as a joint distribution (key idea):

$$g_\alpha(n, k) = \underbrace{g[(1 - \alpha)\, x_n; (1 - \alpha)\, m_k, \sigma']}_{\text{visual term}} \cdot \underbrace{g(\alpha\, x'_n; \alpha\, m'_k, \sigma')}_{\text{spatial term}}$$

  • $x'_n$, $m'_k$ - spatial counterparts of descriptor $x_n$ and visual word $m_k$
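
The membership probabilities and the pooling step on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes that g is an isotropic Gaussian with equal mixture weights (so normalising constants cancel in the ratio), and the names `soft_assign` and `pool` as well as the toy data are hypothetical.

```python
import numpy as np


def soft_assign(descriptors: np.ndarray, words: np.ndarray, sigma: float) -> np.ndarray:
    """Return H (N x K) with H[n, k] = g(x_n; m_k) / sum over k' of g(x_n; m_k').

    Assumption: g is an isotropic Gaussian with equal weights, so only the
    squared distances between descriptors and visual words matter.
    """
    # Squared Euclidean distance between every descriptor and every word.
    diff = descriptors[:, None, :] - words[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    logits = -sq_dist / (2.0 * sigma ** 2)
    # Subtract the row maximum before exponentiating for numerical stability.
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def pool(H: np.ndarray, mode: str = "average") -> np.ndarray:
    """Pool the columns of H into one K-dimensional image signature."""
    return H.mean(axis=0) if mode == "average" else H.max(axis=0)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))   # 50 toy descriptors of length 8
M = rng.normal(size=(5, 8))    # 5 toy visual words
H = soft_assign(X, M, sigma=1.0)
signature = pool(H, "average")
```

Each row of `H` sums to one, so the average-pooled signature is itself a distribution over visual words, whereas max pooling keeps the strongest response per word.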
SLIDE 6

Spatial Coordinate Coding for Sparse Coding

Mid-level features by optimising:

$$\arg\min_{h_n} \| x_n - M h_n \|^2 + \beta\, | h_n |$$

  • $M_{D \times K}$ - visual vocabulary with K atoms of length D

Spatial descriptor $x'_n$ and dictionary $M'$ terms added to the problem (key idea):

$$\arg\min_{h_n} \underbrace{(1 - \alpha)\, \| x_n - M h_n \|^2}_{\text{visual term}} + \underbrace{\alpha\, \| x'_n - M' h_n \|^2}_{\text{spatial term}} + \beta\, | h_n | \tag{1}$$

Soft Assignment and Sparse Coding can be spatially enhanced by just concatenating image descriptors with the spatial information $x'_n$, i.e. (key outcome):

$$x^{aug}_n = [\, \underbrace{\sqrt{1 - \alpha}\; x_n^T}_{\text{visual term}},\; \underbrace{\sqrt{\alpha}\; (x'_n)^T}_{\text{spatial term}} \,]^T$$
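
The key outcome can be checked numerically: the weighted two-term reconstruction error of problem (1) equals a single reconstruction error on the augmented descriptor, provided the dictionary is augmented with the same square-root weights. The sketch below is illustrative only. It assumes a 2-D spatial descriptor (for example key-point coordinates), and the helper `augment` and all toy values are hypothetical.

```python
import numpy as np


def augment(visual: np.ndarray, spatial: np.ndarray, alpha: float) -> np.ndarray:
    """Stack sqrt(1 - alpha) * visual on top of sqrt(alpha) * spatial along axis 0."""
    return np.concatenate(
        [np.sqrt(1.0 - alpha) * visual, np.sqrt(alpha) * spatial], axis=0
    )


rng = np.random.default_rng(0)
D, K, alpha = 16, 6, 0.3
x, M = rng.normal(size=D), rng.normal(size=(D, K))          # visual descriptor and dictionary
x_sp, M_sp = rng.uniform(size=2), rng.uniform(size=(2, K))  # assumed 2-D spatial parts
h = rng.normal(size=K)                                      # any candidate code

# Weighted two-term error from problem (1), without the sparsity penalty.
weighted = (1 - alpha) * np.sum((x - M @ h) ** 2) + alpha * np.sum((x_sp - M_sp @ h) ** 2)

# The same error written as one residual on augmented descriptor and dictionary.
x_aug = augment(x, x_sp, alpha)
M_aug = augment(M, M_sp, alpha)
single = np.sum((x_aug - M_aug @ h) ** 2)
```

Because the two quantities agree for any code `h`, an off-the-shelf sparse coder can be run unchanged on the augmented data.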

SLIDE 7

Experiments on Spatial Information (VOC 2010)

Spatial Coordinate Coding

  • Pascal 2010 [M. Everingham, 2010] Action Classification set
  • 9 classes, 301 training, 307 validation, and 613 testing bounding boxes
  • Soft Assignment (SA) and Spatial Coordinate Coding (SCC) with RBF χ² kernels used
  • Results reported as Mean Average Precision

Method                 Evaluation                Mean Average Precision
SA + SPM (3 levels)    validation, 1 kernel      49.8
SA + SCC               validation, 1 kernel      51.6
SA + SCC               test, multiple kernels    62.15

Spatial Coordinate Coding outperforms Spatial Pyramid Match
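
For readers unfamiliar with the RBF χ² kernel used in these experiments, a minimal sketch follows. It assumes one common definition, K(a, b) = exp(-γ Σ_i (a_i - b_i)² / (a_i + b_i)) on non-negative histograms; the slides do not state the exact variant or the value of γ, so both are assumptions, as are the function names and toy data.

```python
import numpy as np


def chi2_distance(a: np.ndarray, b: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Pairwise chi-squared distance between rows of a (n x d) and b (m x d)."""
    diff = a[:, None, :] - b[None, :, :]
    total = a[:, None, :] + b[None, :, :]
    # eps avoids division by zero when a bin is empty in both histograms.
    return np.sum(diff ** 2 / (total + eps), axis=2)


def rbf_chi2_kernel(a: np.ndarray, b: np.ndarray, gamma: float = 1.0) -> np.ndarray:
    """Kernel matrix exp(-gamma * chi2_distance) for non-negative histograms."""
    return np.exp(-gamma * chi2_distance(a, b))


rng = np.random.default_rng(0)
hists = rng.random(size=(4, 10))
hists /= hists.sum(axis=1, keepdims=True)   # L1-normalised toy histograms
K_mat = rbf_chi2_kernel(hists, hists, gamma=1.0)
```

The resulting matrix can be passed to any classifier that accepts a precomputed kernel.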

SLIDE 8

Experiments on Spatial Information (Flower 17)

Spatial Coordinate Coding

Flower 17 [M. E. Nilsback, 2008], 17 classes, 3 splits of data, each consisting of 680 training, 340 validation, and 340 testing images

Coding             Kernel    SCC      SPM
Soft Assignment    χ²        91.16    89.3 (3 levels)
Sparse Coding      linear    88.43    88.86 (4 levels)

  • Spatial Coordinate Coding is a weaker performer if Sparse Coding and a linear classifier are used
  • Pyramid Match elevates histogram data to a higher-dimensional representation (vital for a linear classifier)

SLIDE 9

Experiments on Dominant Angle Pyramid Match

Dominant Angle Pooling

  • Pascal 2007 consists of 20 object categories with high variability in intra-class appearance, rotation, and spatial position
  • Dominant Angle (DA) on descriptor level (variant, invariant, and descriptor augmentation cases):

Descriptor variant        Result
DA invariant              46.00
DA variant                50.23
DA coordinate appended    50.24

  • Dominant Angle is important in classification
  • Dominant Angle (DA) with multiple quantisation levels (DAPM) and Spatial Pyramid Match (SPM):

Pooling            Result
SPM (3 levels)     54.3
DAPM (5 levels)    53.40
DAPM + SPM         56.3

Best results achieved when using both Spatial (3 levels) and Dominant Angle Pyramid Match (5 levels)
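
The idea of pooling over several quantisation levels of the dominant angle can be sketched as follows. This is only an illustration under loud assumptions: the slides do not give the number of angular bins per level, so the doubling scheme (1, 2, 4, 8, 16), average pooling, and the function name `angle_pyramid_pool` are all hypothetical choices.

```python
import numpy as np


def angle_pyramid_pool(H, angles, bins_per_level=(1, 2, 4, 8, 16)):
    """Average-pool mid-level features H (N x K) inside dominant-angle bins.

    `angles` holds one dominant angle per descriptor, in radians in [0, 2*pi).
    `bins_per_level` is a hypothetical quantisation scheme, not from the slides.
    Returns the concatenation of one K-vector per bin per level.
    """
    n_words = H.shape[1]
    pooled = []
    for bins in bins_per_level:
        # Index of the angular bin each descriptor falls into at this level.
        idx = np.minimum((angles / (2 * np.pi) * bins).astype(int), bins - 1)
        for b in range(bins):
            mask = idx == b
            pooled.append(H[mask].mean(axis=0) if mask.any() else np.zeros(n_words))
    return np.concatenate(pooled)


rng = np.random.default_rng(0)
H_toy = rng.random(size=(100, 4))                   # toy mid-level features
theta = rng.uniform(0.0, 2 * np.pi, size=100)       # toy dominant angles
signature = angle_pyramid_pool(H_toy, theta)
```

The coarsest level (a single bin) reduces to ordinary pooling over the whole image, and finer levels add progressively more angle-specific histograms, which a kernel classifier can then weight.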

SLIDE 10

Experiments on Colour Pyramid Match

Colour Component Pooling

  • Flower 17 set used for further evaluation as it greatly benefits from colour information
  • Soft Assignment (SA) and Spatial Coordinate Coding (SCC) with RBF χ² kernels used
  • Results reported as Average Accuracy

Method                                        Average Accuracy
SCC                                           86.4%
SCC + Colour Pyramid Match                    87.4%
SCC + Colour Pyramid Match + Opponent SIFT    91.4%
MKL based approach [F. Yan, 2010]             86.7%

SLIDE 11

Conclusions

  • Spatial Coordinate Coding outperforms SPM (3 levels) (e.g. by 1.8% on Flower 17)
  • It reduces histogram sizes from e.g. 56K to 4K, bypassing Spatial Pyramid Match
  • Spatial bias does not benefit much from multi-level quantisation
  • Dominant Angle benefits from multi-level quantisation (DAPM)
  • DAPM+SPM results in 2.0% improvement on VOC 2007
  • Colour Pyramid Match further improves Spatial Coordinate Coding by 1.0% on Flower 17
  • Letting the classifier decide the right level of quantisation for multiple modalities leads to performance improvement

SLIDE 12

Thank You

SLIDE 13

References

  • S. Lazebnik et al. (2006)

Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. CVPR.

  • J. C. Van Gemert et al. (2010)

Visual Word Ambiguity. PAMI.

  • J. Yang et al. (2009)

Linear spatial pyramid matching using sparse coding for image classification. CVPR.

  • M. E. Nilsback et al. (2008)

Automated Flower Classification over a Large Number of Classes. ICCV.

  • M. Everingham et al. (2010)

The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Results. ICCV.

  • F. Yan et al. (2010)

Lp Norm Multiple Kernel Fisher Discriminant Analysis for Object and Image Categorisation. CVPR.
