SLIDE 1

Feature Detection and Matching

簡韶逸 Shao-Yi Chien Department of Electrical Engineering National Taiwan University Fall 2019

SLIDE 2
  • References:
  • Slides from Digital Visual Effects, Prof. Y.-Y. Chuang, CSIE,

National Taiwan University

  • Slides from CE 5554 / ECE 4554: Computer Vision, Prof.

J.-B. Huang, Virginia Tech

  • Slides from CSE 576 Computer Vision, Prof. Steve Seitz

and Prof. Rick Szeliski, U. Washington

  • Chap. 4 of Computer Vision: Algorithms and Applications
  • Reference papers

SLIDE 3

Outline

  • The requirement for the features
  • Points and patches
  • Feature detector
  • Feature descriptors
  • Feature matching
  • Feature tracking
  • SIFT
  • Applications
  • Recent features
  • Edges and lines
  • Appendix: MPEG-7 descriptors

SLIDE 4

How Does the Visual Cortex Represent the World?

Credit: Matt Brown

SLIDE 5

How Does the Visual Cortex Represent the World?

  • The same place?

by Diva Sian; by scgbt

SLIDE 6

How Does the Visual Cortex Represent the World?

SLIDE 7

How Does the Visual Cortex Represent the World?

  • M. Riesenhuber and T. Poggio, “Why Can't a Computer be

more Like a Brain?” Nature Neuroscience, vol. 2, no. 11, 1999.

SLIDE 8

The Requirement for the Features

  • We don't match images by comparing raw pixel values, but with some higher-level information: features

  • Requirements
  • Invariant: to lighting, color, rotation, scale, view angle…
  • Locality: features are local, so robust to occlusion and

clutter (no prior segmentation)

  • Distinctiveness: individual features can be matched to a

large database of objects

  • Quantity: many features can be generated for even

small objects

  • Efficiency: close to real-time performance

SLIDE 9

Features

  • Interest points
  • Edge and lines
  • Others

SLIDE 10

Points and Patches

SLIDE 11

Keypoint Detection and Matching Pipeline

  • Feature detection
  • Feature description
  • Feature matching
  • Feature tracking

SLIDE 12

Feature Detection – Harris Corner Detector

Suppose we only consider a small window of pixels

  • What defines whether a feature is good or bad?

Slide adapted from Darya Frolova, Denis Simakov, Weizmann Institute.

SLIDE 13

Feature Detection – Harris Corner Detector

Local measure of feature uniqueness

  • How does the window change when you shift it?
  • Shifting the window in any direction causes a big change

  • "flat" region: no change in all directions
  • "edge": no change along the edge direction
  • "corner": significant change in all directions

SLIDE 14

Aperture Problem

SLIDE 15

Aperture Problem

SLIDE 16

Aperture Problem

SLIDE 17

Feature Detection – Harris Corner Detector

  • Change of intensity for the shift (u,v):

E(u, v) = Σ_{x,y} w(x, y) [ I(x + u, y + v) − I(x, y) ]²

with window function w(x, y), intensity I(x, y), and shifted intensity I(x + u, y + v). E(u, v) is the auto-correlation function.

SLIDE 18

Feature Detection – Harris Corner Detector

E(u, v) surfaces: strong minimum (corner), strong ambiguity along one direction (edge), no stable minimum (flat region)

SLIDE 19

Feature Detection – Harris Corner Detector

  • Small motion assumption → use Taylor Series

expansion

E(u, v) = Σ_{x,y} w(x, y) [ I(x + u, y + v) − I(x, y) ]²
        ≈ Σ_{x,y} w(x, y) [ I(x, y) + ∇I(x, y) · (u, v) − I(x, y) ]²
        = Σ_{x,y} w(x, y) [ I_x(x, y) u + I_y(x, y) v ]²
        = Σ_{x,y} w(x, y) [ I_x²(x, y) u² + 2 I_x(x, y) I_y(x, y) uv + I_y²(x, y) v² ]

SLIDE 20

Feature Detection – Harris Corner Detector

For small shifts, E(u, v) ≈ [u v] M [u v]ᵀ, where M is a 2×2 matrix computed from image derivatives:

M = Σ_{x,y} w(x, y) [ I_x²     I_x I_y
                      I_x I_y  I_y²    ]

SLIDE 21

Eigenvectors of Symmetric Matrices

For a symmetric matrix A = Q Λ Qᵀ (eigenvectors q1, q2, eigenvalues λ1, λ2):

xᵀ A x = xᵀ Q Λ Qᵀ x = (Qᵀx)ᵀ Λ (Qᵀx) = zᵀ Λ z,  where z = Qᵀ x

and the constraint xᵀx = 1 implies zᵀz = 1.

SLIDE 22

Feature Detection – Harris Corner Detector

Intensity change in shifting window: eigenvalue analysis

1, 2 – eigenvalues of M

E(u, v) ≈ [u v] M [u v]ᵀ = const defines an ellipse whose axes have lengths (λ_max)^(−1/2) and (λ_min)^(−1/2), aligned with the directions of fastest and slowest intensity change, respectively.

SLIDE 23

Feature Detection – Harris Corner Detector

  • Feature scoring functions (λ0, λ1: eigenvalues of M)
  • det(M) − α · trace(M)² = λ0 λ1 − α (λ0 + λ1)²
  • λ0 − α λ1
  • det(M) / trace(M) = λ0 λ1 / (λ0 + λ1)

Ref: C. Harris and M.J. Stephens, “A combined corner and edge detector,” in Proc. Alvey Vision Conference, 1988.

SLIDE 24

Feature Detection – Harris Corner Detector

Whole feature detection flow:

  • Compute the gradient at each point in the image
  • Create the M matrix from the entries in the gradient
  • Compute the eigenvalues λ0 and λ1
  • Find points with a large feature scoring function
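The flow above can be sketched in NumPy. This is a minimal illustration, not the original formulation: it uses central-difference gradients, a 3×3 box window for w(x, y), and the det/trace form of the score with k = 0.04; `harris_response` and the test image are our own names, not from any library.

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k*trace(M)^2 at every pixel.

    Sketch only: central-difference gradients and a 3x3 box window
    w(x, y) instead of the Gaussian window used in practice.
    """
    Iy, Ix = np.gradient(img.astype(float))       # image derivatives

    def box3(a):                                  # sum over a 3x3 window
        p = np.pad(a, 1, mode="edge")
        h, w = a.shape
        return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

    Sxx, Syy, Sxy = box3(Ix * Ix), box3(Iy * Iy), box3(Ix * Iy)
    return (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

# A bright square on a dark background: the strongest response should
# appear near the square's corner; flat regions stay ~0, edges go negative.
img = np.zeros((20, 20))
img[8:, 8:] = 1.0
R = harris_response(img)
cy, cx = np.unravel_index(np.argmax(R), R.shape)
```

With this setup the response peaks at the corner of the square, is zero in flat regions, and is negative along the edges, matching the flat/edge/corner cases discussed earlier.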

SLIDE 25

Harris Detector Example

SLIDE 26

f value (red high, blue low)

SLIDE 27

Threshold (f > value)

SLIDE 28

Find Local Maxima of f

SLIDE 29

Harris Features (in Red)

SLIDE 30

Harris Detector: Invariance Properties

  • Rotation

The ellipse rotates but its shape (i.e. its eigenvalues) remains the same, so the corner response is invariant to image rotation.

SLIDE 31

Harris Detector: Invariance Properties

  • Affine intensity change: I → aI + b

✓ Only derivatives are used ⇒ invariant to intensity shift I → I + b
✓ Intensity scaling I → aI rescales the response R, so with a fixed threshold the detector is only partially invariant to affine intensity change

SLIDE 32

Harris Detector: Invariance Properties

  • Scaling

At the wrong scale, all points along a rounded corner are classified as edges; the response is not invariant to scaling.

SLIDE 33

Scale Invariant Detection

Suppose you're looking for corners. Key idea: find the scale that gives a local maximum of f

  • in both position and scale
  • One definition of f: the Harris operator
SLIDE 34

DoG – Efficient Computation

  • Computation in Gaussian scale pyramid
  • Credit: K. Grauman, B. Leibe

(Figure: Gaussian scale pyramid — the original image is smoothed with σ, σ², σ³, σ⁴ and subsampled with step σ⁴ = 2.)

SLIDE 35

Find local maxima in position-scale space of Difference-of-Gaussian

  • Credit: K. Grauman, B. Leibe

Compute the scale-normalized response L_xx(σ) + L_yy(σ) over scales σ, σ², σ³, σ⁴, σ⁵ ⇒ list of extrema (x, y, σ)
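A minimal sketch of a DoG stack in NumPy, assuming a truncated separable Gaussian kernel and an arbitrary list of scales; `blur` and `dog_stack` are illustrative names, not from any library, and the subsampling step of a real pyramid is omitted.

```python
import numpy as np

def blur(img, sigma):
    # separable Gaussian blur with a truncated kernel (radius 3*sigma)
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def dog_stack(img, sigmas):
    # differences of successively blurred images, one level per sigma pair
    blurred = [blur(img.astype(float), s) for s in sigmas]
    return [b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]

sigmas = [1.0, 1.6, 2.6, 4.1]
img = np.zeros((40, 40))
img[20, 20] = 1.0                      # a single bright point
levels = dog_stack(img, sigmas)
# constant image: DoG vanishes (away from the zero-padded borders)
flat_levels = dog_stack(np.ones((40, 40)), sigmas)
```

Keypoints would then be local maxima of |DoG| over (x, y, σ), i.e. compared against their 26 neighbors in position-scale space.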

SLIDE 36

Results: Difference-of-Gaussian

  • K. Grauman, B. Leibe
SLIDE 37
  • T. Tuytelaars, B. Leibe

Orientation Normalization

  • Compute orientation histogram
  • Select dominant orientation
  • Normalize: rotate to fixed orientation

(orientation histogram over 0 to 2π)

[Lowe, SIFT, 1999]

SLIDE 38

Basic idea:

  • Take 16x16 square window around detected feature
  • Compute edge orientation (angle of the gradient minus 90°) for each pixel
  • Throw out weak edges (threshold gradient magnitude)
  • Create histogram of surviving edge orientations

Scale Invariant Feature Transform

Adapted from slide by David Lowe

0-to-2π angle histogram

SLIDE 39

SIFT descriptor

Full version

  • Divide the 16x16 window into a 4x4 grid of cells (2x2 case shown below)
  • Compute an orientation histogram for each cell
  • 16 cells * 8 orientations = 128 dimensional descriptor

Adapted from slide by David Lowe
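The 4×4-cell, 8-bin construction above can be sketched as follows. This simplified version skips orientation normalization and Gaussian weighting, so it is not Lowe's full algorithm; `sift_like_descriptor` is a hypothetical helper name.

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-D descriptor from a 16x16 patch: 4x4 cells x 8 orientation bins.

    Simplified sketch: real SIFT also rotates the patch to its dominant
    orientation and applies Gaussian weighting, both omitted here.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)   # orientations in [0, 2*pi)
    desc = []
    for cy in range(4):
        for cx in range(4):
            sl = (slice(cy * 4, cy * 4 + 4), slice(cx * 4, cx * 4 + 4))
            hist, _ = np.histogram(ang[sl], bins=8, range=(0, 2 * np.pi),
                                   weights=mag[sl])
            desc.extend(hist)
    desc = np.array(desc)
    desc /= np.linalg.norm(desc) + 1e-12
    desc = np.minimum(desc, 0.2)                  # clip, then renormalize
    return desc / (np.linalg.norm(desc) + 1e-12)

patch = np.tile(np.arange(16.0), (16, 1))         # horizontal intensity ramp
d = sift_like_descriptor(patch)
```

For the ramp, every gradient points along +x, so only the first orientation bin of each cell is populated; the clip-at-0.2-and-renormalize step is the same trick listed in the "Details of Lowe's SIFT algorithm" slide.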

SLIDE 40

Local Descriptors: SIFT Descriptor

[Lowe, ICCV 1999]

Histogram of oriented gradients

  • Captures important texture

information

  • Robust to small translations /

affine deformations

  • K. Grauman, B. Leibe
SLIDE 41

Details of Lowe’s SIFT algorithm

  • Run DoG detector

– Find maxima in location/scale space
– Remove edge points

  • Find all major orientations

– Bin orientations into 36 bin histogram

  • Weight by gradient magnitude
  • Weight by distance to center (Gaussian-weighted mean)

– Return orientations within 0.8 of peak

  • Use parabola for better orientation fit
  • For each (x,y,scale,orientation), create descriptor:

– Sample 16×16 gradient magnitudes and relative orientations
– Bin 4×4 samples into 4×4 histograms
– Threshold values to a max of 0.2, divide by the L2 norm
– Final descriptor: 4×4×8 normalized histograms

Ref: D.G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, 2004.

SLIDE 42

SIFT Example


868 SIFT features

SLIDE 43

PCA SIFT

  • 39 x 39 patch → 3042-D vector
  • Dimension reduction to 36-D with principal

component analysis (PCA)

Ref: Y. Ke and R. Sukthankar, "PCA-SIFT: a more distinctive representation for local image descriptors," in Proc. CVPR 2004.

SLIDE 44

Gradient Location-Orientation Histogram (GLOH)

Ref: K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Trans. Pattern Analysis and Machine Intelligence, 2005.

SLIDE 45

Speeded-Up Robust Features (SURF)

  • SURF detector
  • Hessian Matrix Based Interest Points
  • Approximation for Hessian Matrix
  • Low computational cost

Ref: H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Computer Vision and Image Understanding (CVIU), vol. 110, no. 3, pp. 346–359, 2008.

SLIDE 46

Speeded-Up Robust Features (SURF)

  • SURF detector
  • Scale space representation

Box-filter sizes grow with scale: 9×9, 15×15, …

SLIDE 47

Speeded-Up Robust Features (SURF)

  • Descriptor
  • Based on sum of Haar wavelet response
  • dx,dy : wavelet responses in x & y direction
  • 4x4 sub-region
  • Calculate Σdx , Σdy, Σ|dx|, Σ|dy|
  • 4*4*4 = 64 dimensions
  • 4×4 sub-regions × (5×5 samples) = 400 wavelet-response calculations per interest point
  • Irregular pattern

(Figure: Haar wavelet responses dx, dy and the sums Σdx, Σ|dx|, Σdy, Σ|dy| per sub-region.)

SLIDE 48

Feature Matching

How to define the difference between two features f1, f2?

  • Simple approach is SSD(f1, f2)
  • sum of square differences between entries of the two descriptors
  • can give good scores to very ambiguous (bad) matches

SLIDE 49

Feature Matching

How to define the difference between two features f1, f2?

  • Better approach: ratio distance = SSD(f1, f2) / SSD(f1, f2’)
  • f2 is best SSD match to f1 in I2
  • f2’ is 2nd best SSD match to f1 in I2
  • gives large values (close to 1) for ambiguous matches
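The ratio distance can be sketched like this. Note the threshold here is applied directly to the SSD ratio (Lowe's 0.8 is usually applied to Euclidean distances, so the same number is stricter on SSDs); `ratio_match` and the toy descriptors are illustrative names.

```python
import numpy as np

def ratio_match(desc1, desc2, max_ratio=0.8):
    """Match each descriptor in desc1 against desc2 with the SSD ratio test.

    Accepts a match only when the best SSD is clearly smaller than the
    second-best; a ratio close to 1 means the match is ambiguous.
    """
    matches = []
    desc2 = np.asarray(desc2, dtype=float)
    for i, d in enumerate(np.asarray(desc1, dtype=float)):
        ssd = np.sum((desc2 - d) ** 2, axis=1)   # SSD to every f2 candidate
        j1, j2 = np.argsort(ssd)[:2]             # best and second-best
        if ssd[j1] < max_ratio * ssd[j2]:
            matches.append((i, j1))
    return matches

d1 = np.array([[0.0, 0.0]])
d2_good = np.array([[0.1, 0.0], [5.0, 5.0]])       # distinctive best match
d2_ambiguous = np.array([[1.0, 0.0], [1.01, 0.0]])  # two near-equal candidates
```

The distinctive case is kept and the ambiguous case rejected, which is exactly the behavior plain SSD matching lacks.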

SLIDE 50

Feature Matching

  • Matching: declare a match when the distance between descriptors is below a threshold
  • How to evaluate?
  • TP: true positives
  • FN: false negatives
  • FP: false positives
  • TN: true negatives

SLIDE 51

Feature Matching

  • How to evaluate?
  • True positive rate (TPR), recall
  • False positive rate (FPR), false alarm
  • Positive predictive value (PPV), precision
  • Accuracy (ACC)

TPR = TP / (TP + FN) = TP / P
FPR = FP / (FP + TN) = FP / N
PPV = TP / (TP + FP) = TP / P′ (P′: predicted positives)
ACC = (TP + TN) / (P + N)
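The four quantities are simple ratios over the confusion counts; a tiny helper (our own naming) makes the definitions concrete:

```python
def matching_metrics(tp, fp, fn, tn):
    """Evaluation metrics from confusion counts.

    P = actual positives, N = actual negatives.
    """
    p, n = tp + fn, fp + tn
    return {
        "TPR": tp / p,               # recall
        "FPR": fp / n,               # false alarm rate
        "PPV": tp / (tp + fp),       # precision
        "ACC": (tp + tn) / (p + n),  # accuracy
    }

# 10 actual matches, 10 actual non-matches, 2 errors of each kind:
m = matching_metrics(tp=8, fp=2, fn=2, tn=8)
```

Sweeping the matching threshold and plotting TPR against FPR produces the ROC curve discussed on the next slide.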

SLIDE 52

Feature Matching

  • How to evaluate?

ROC curve (Receiver Operating Characteristic): true positive rate (TPR) plotted against false positive rate (FPR) as the matching threshold varies. TPR = TP/(TP+FN), FPR = FP/(FP+TN), PPV = TP/(TP+FP), ACC = (TP+TN)/(P+N)

SLIDE 53

Feature Matching

  • Efficient matching
  • Full search
  • Indexing structure
  • Multi-dimensional hashing
  • Locality sensitive hashing (LSH)
  • K-d tree

SLIDE 54

Applications

Features are used for:

  • Image alignment (e.g., mosaics)
  • 3D reconstruction
  • Motion tracking
  • Object recognition
  • Indexing and database retrieval
  • Robot navigation
  • … other

SLIDE 55

Object Recognition (David Lowe)

SLIDE 56

BRIEF (ECCV 2010)

  • We define test τ on patch p of size S × S as
    τ(p; x, y) = 1 if p(x) < p(y), and 0 otherwise
  • where p(x) is the pixel intensity in a smoothed version of p at x = (u, v)ᵀ.
  • Choosing a set of n_d (x, y)-location pairs uniquely defines a set of binary tests.
  • We take our BRIEF descriptor to be the n_d-dimensional bitstring
    f_{n_d}(p) = Σ_{1≤i≤n_d} 2^{i−1} τ(p; x_i, y_i)
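A sketch of the binary test and bitstring, assuming the patch is already smoothed and the location pairs are drawn uniformly (the paper also studies other sampling distributions); `brief_descriptor`, `hamming`, and the pair-sampling code are our own illustrative names.

```python
import numpy as np

def brief_descriptor(patch, pairs):
    """n_d-bit BRIEF string: tau(p; x, y) = 1 if p(x) < p(y), else 0.

    `patch` should already be smoothed; `pairs` is the fixed list of
    (x, y) sampling-location pairs that defines the descriptor.
    """
    return np.array([1 if patch[x] < patch[y] else 0 for x, y in pairs],
                    dtype=np.uint8)

def hamming(a, b):
    # BRIEF descriptors are compared with the Hamming distance
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
S, n_d = 9, 32
pairs = [((int(a), int(b)), (int(c), int(d)))
         for a, b, c, d in rng.integers(0, S, (n_d, 4))]
patch = rng.random((S, S))
d1 = brief_descriptor(patch, pairs)
d2 = brief_descriptor(patch, pairs)
```

Because the descriptor is a bitstring, matching reduces to cheap Hamming-distance comparisons, which is the main speed advantage over SIFT-style floating-point descriptors.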

SLIDE 57

SLIDE 58

BRISK (ICCV 2011)

SLIDE 59

FREAK (CVPR 2012)

  • Retinal sampling pattern
  • Coarse-to-fine descriptor
  • How to select pairs?
  • Learn the best

pairs from training data

SLIDE 60

FREAK (CVPR 2012)

SLIDE 61

ORB: An efficient alternative to SIFT or SURF

  • ORB = oFAST + rBRIEF
  • oFAST: FAST Keypoint Orientation
  • rBRIEF: Rotation-Aware Brief

  • Ref: E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," in Proc. 2011 International Conference on Computer Vision, Barcelona, 2011.

SLIDE 62

FAST¹

  • Features from Accelerated Segment Test.
  • The segment test criterion operates by considering a circle of

sixteen pixels around the corner candidate p.

  • The original detector classifies p as a corner if there exists a set of n contiguous pixels in the circle which are all brighter than the intensity of the candidate pixel plus a threshold (Ip + t), or all darker than Ip − t.
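The contiguous-run check can be sketched as below (our naming); the real detector additionally uses a machine-learned decision tree over the 16 pixels for speed, which is omitted here.

```python
def fast_segment_test(ring, ip, t, n=12):
    """Segment test on the 16-pixel circle around candidate p.

    `ring` holds the 16 intensities in circular order; p is a corner if
    n contiguous pixels are all brighter than ip + t or all darker than ip - t.
    """
    def longest_circular_run(flags):
        doubled = flags + flags        # doubling handles wrap-around runs
        best = cur = 0
        for f in doubled:
            cur = cur + 1 if f else 0
            best = max(best, cur)
        return min(best, len(flags))

    brighter = [v > ip + t for v in ring]
    darker = [v < ip - t for v in ring]
    return (longest_circular_run(brighter) >= n
            or longest_circular_run(darker) >= n)

corner_ring = [200] * 12 + [100] * 4   # 12 contiguous bright pixels
edge_ring = [200, 100] * 8             # alternating: no long run
```

Doubling the flag list is a standard trick so that a run crossing the start of the circle is still counted as contiguous.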

¹ Rosten, Edward, and Tom Drummond, "Machine learning for high-speed corner detection," Computer Vision – ECCV 2006.

SLIDE 63

Orientation by Intensity Centroid

  • Moments of a patch: m_pq = Σ_{x,y} x^p y^q I(x, y)
  • With these moments we may find the centroid C = (m10/m00, m01/m00)
  • We can construct a vector from the corner's center, O, to the centroid, OC.
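A sketch of the orientation derived from the intensity centroid, with moments taken about the patch center as in ORB's oFAST; the function and variable names are ours.

```python
import numpy as np

def orientation_by_centroid(patch):
    """Patch orientation from the intensity centroid.

    Moments m_pq = sum x^p y^q I(x, y) are taken about the patch center O;
    the orientation is the angle of the vector OC: atan2(m01, m10).
    """
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    xs -= (w - 1) / 2                  # coordinates relative to the center O
    ys -= (h - 1) / 2
    m10 = float(np.sum(xs * patch))
    m01 = float(np.sum(ys * patch))
    return np.arctan2(m01, m10)

right = np.zeros((9, 9)); right[:, 6:] = 1.0   # mass on the right -> angle 0
down = np.zeros((9, 9)); down[6:, :] = 1.0     # mass below -> angle pi/2
```

Rotating the patch rotates the centroid vector by the same amount, which is why this single angle suffices to steer BRIEF on the following slides.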

SLIDE 64

Rotation Measure

  • IC: intensity centroid
  • MAX chooses the largest gradient in the keypoint patch
  • BIN forms a histogram of gradient directions at 10 degree intervals, and picks the

maximum bin.

SLIDE 65

Steered BRIEF

  • Steer BRIEF according to the orientation of keypoints.
  • Using the patch orientation θ and the corresponding rotation matrix R_θ, we construct a "steered" version S_θ of the sampling-location set S: S_θ = R_θ S
  • Now the steered BRIEF operator becomes g_n(p, θ) := f_n(p) | (x_i, y_i) ∈ S_θ

SLIDE 66

Learning Good Binary Features

  • The algorithm is:

SLIDE 67

Results (1/3)

  • Matching performance of SIFT, SURF, BRIEF with FAST, and

ORB (oFAST +rBRIEF) under synthetic rotations with Gaussian noise of 10.

Media IC & System Lab, Po-Chen Wu (吳柏辰)

SLIDE 68

Results (2/3)

  • Matching behavior under noise for SIFT and rBRIEF. The

noise levels are 0, 5, 10, 15, 20, and 25. SIFT performance degrades rapidly, while rBRIEF is relatively unaffected.

SLIDE 69

Results (3/3)

  • Test on real-world images:

SLIDE 70

Computation Time

  • The ORB system breaks down into the following times per

typical frame of size 640x480.

Setup: Intel i7 @ 2.8 GHz; Pascal 2009 dataset; 2686 images at 5 scales

SLIDE 71

OpenCV 2.4.9

  • Detector
  • "FAST" – FastFeatureDetector
  • "STAR" – StarFeatureDetector
  • "SIFT" – SIFT (nonfree module)
  • "SURF" – SURF (nonfree module)
  • "ORB" – ORB
  • "BRISK" – BRISK
  • "MSER" – MSER
  • "GFTT" – GoodFeaturesToTrackDetector
  • "HARRIS" – GoodFeaturesToTrackDetector with Harris detector

enabled

  • "Dense" – DenseFeatureDetector
  • "SimpleBlob" – SimpleBlobDetector

SLIDE 72

OpenCV 2.4.9

  • Descriptor
  • "SIFT" – SIFT
  • "SURF" – SURF
  • "BRIEF" – BriefDescriptorExtractor
  • "BRISK" – BRISK
  • "ORB" – ORB
  • "FREAK" – FREAK

SLIDE 73

Edges and Lines

SLIDE 74

  • Ref: Y. Cao, C. Wang, L. Zhang, and L. Zhang, "Edgel index for large-scale sketch-based image search," in Proc. CVPR 2011.

SLIDE 75

Edge Detection

  • Canny edge detector
  • The most widely used edge detector
  • The best you can find in existing tools like MATLAB, OpenCV…
  • Algorithm:
  • Apply Gaussian filter to reduce noise
  • Find the intensity gradients of the image
  • Apply non-maximum suppression to get rid of false edges
  • Apply double threshold to determine potential edges
  • Track edge by hysteresis: suppressing weak edges that are not

connected to strong edges

SLIDE 76

Hysteresis

  • Find connected components from strong edge pixels to

finalize edge detection
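The hysteresis step amounts to a flood fill from strong pixels through 8-connected weak pixels; a minimal sketch, assuming boolean maps produced by the double-threshold step (all names are ours):

```python
import numpy as np
from collections import deque

def hysteresis(strong, weak):
    """Keep weak edge pixels only if 8-connected to some strong edge pixel."""
    h, w = strong.shape
    out = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))   # BFS frontier: strong pixels
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and weak[ny, nx] and not out[ny, nx]):
                    out[ny, nx] = True        # promote connected weak pixel
                    queue.append((ny, nx))
    return out

strong = np.zeros((8, 8), bool); strong[0, 0] = True
weak = np.zeros((8, 8), bool)
weak[0, 1] = weak[0, 2] = True     # chain attached to the strong pixel: kept
weak[5, 5] = True                  # isolated weak pixel: suppressed
edges = hysteresis(strong, weak)
```

This realizes the rule on the slide: weak edges survive only when they form a connected component that touches a strong edge.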

SLIDE 77

Hough Transform

SLIDE 78

Hough Transform

  • Vote in (θ, d) space
  • (Many choices)

SLIDE 79

Hough Transform

  • Clear the accumulator array
  • For each detected edgel at location (x, y) with orientation θ = tan⁻¹(n_y / n_x), compute the value of d = x n_x + y n_y and increment the accumulator corresponding to (θ, d)

  • Find the peaks in the accumulator corresponding to lines
  • Optionally re-fit the lines to the constituent edgels
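The voting loop above can be sketched as follows; the bin counts, ranges, and edgel tuple format are illustrative choices, not prescribed by the algorithm.

```python
import numpy as np

def hough_accumulate(edgels, n_theta=64, n_d=64, d_max=50.0):
    """Accumulate (theta, d) votes for the line x*nx + y*ny = d.

    `edgels` is a list of (x, y, nx, ny) with (nx, ny) the unit edge normal.
    """
    acc = np.zeros((n_theta, n_d), dtype=int)
    for x, y, nx, ny in edgels:
        theta = np.arctan2(ny, nx)               # edgel normal orientation
        d = x * nx + y * ny                      # signed distance to origin
        ti = int((theta + np.pi) / (2 * np.pi) * n_theta) % n_theta
        di = int((d + d_max) / (2 * d_max) * n_d)
        if 0 <= di < n_d:
            acc[ti, di] += 1
    return acc

# 20 collinear edgels on the line x = 10 (normal (1, 0)):
edgels = [(10.0, float(y), 1.0, 0.0) for y in range(20)]
acc = hough_accumulate(edgels)
```

All collinear edgels land in one (θ, d) bin, so the line appears as a single peak in the accumulator; peak finding and optional re-fitting then proceed as in the steps above.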

SLIDE 80

Deep Features

SLIDE 81

Deep Features

  • Features extracted from Deep Neural Network
  • e.g., DeepFace (CVPR 2014)

SLIDE 82

Deep Features

  • Ref: E. Simo-Serra, E. Trulls, L. Ferraz, I. Kokkinos, P. Fua, and F. Moreno-Noguer, "Discriminative learning of deep convolutional feature point descriptors," in Proc. ICCV 2015.

Loss function:

SLIDE 83

Deep Features

SLIDE 84

Appendix: MPEG-7 Descriptors

SLIDE 85

Introduction

  • MPEG-7 is a standard for describing features of

multimedia content

  • MPEG-7 provides the world's richest set of audio-visual descriptions

  • Comprehensive scope of data interoperability
  • Based on XML
SLIDE 86

Introduction

  • General visual descriptors
  • Color
  • Texture
  • Shape
  • Motion
  • Domain-specific visual descriptors
  • Face recognition descriptors
SLIDE 87

What is Standardized?

  • Only define the descriptions
  • Not standardize how to produce the descriptions
  • Not standardize how to use the descriptions
  • Only define what is needed for the interoperability of

MPEG-7 enabled systems

SLIDE 88

What is Standardized?

SLIDE 89

Color Descriptors

  • Color Space Descriptor
  • Dominant Color Descriptor
  • Scalable Color Descriptor
  • Group of Frames (or Pictures) Descriptor
  • Color Structure Descriptor
  • Color Layout Descriptor
SLIDE 90

Example: Dominant Color Descriptor (1)

  • Compact description
  • Browsing of image databases based on single or

several color values

  • Definition:
  • F = {(ci, pi, vi), s}, (i = 1, 2, …, N) (N < 9)
  • ci : color value vector (default color space: RGB)
  • pi : percentage, with Σ_i p_i = 1
  • vi : optional color variance
  • s : spatial coherency

SLIDE 91

Dominant Color Descriptor (2)

  • Binary syntax of DCD

Field             | Number of bits | Meaning
NumberOfColors    | 3              | Number of dominant colors
SpatialCoherency  | 5              | Spatial coherency value
Percentage[]      | 5              | Normalized percentage associated with each dominant color
ColorVariance[][] | 1              | Color variance of each dominant color
Index[][]         | 1–12           | Dominant color values

SLIDE 92

Dominant Color Descriptor (3)

  • Extraction:
  • Clustering is performed in a perceptually uniform color

space (Lloyd algorithm)

  • Distortion :
  • x(n) : the color vector at pixel n
  • h(n) : perceptual weight for pixel n
  • ci : centroid of cluster Ci

D_i = Σ_n h(n) ‖x(n) − c_i‖²,  x(n) ∈ C_i

c_i = Σ_n h(n) x(n) / Σ_n h(n),  x(n) ∈ C_i

SLIDE 93

Dominant Color Descriptor (4)

  • Extraction:
  • The procedure is initialized with one cluster consisting of

all pixels and one representative color computed as the centroid of the cluster

  • The algorithm then follows a sequence of centroid

calculation and clustering steps until a stopping criterion (minimum distortion or maximum number of iterations)
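The Lloyd iteration above can be sketched as follows, assuming uniform perceptual weights h(n) = 1 and a deterministic initialization from evenly spaced pixels (the standard's extraction uses perceptual weighting and a splitting-based initialization); all names here are ours.

```python
import numpy as np

def dominant_colors(pixels, n_colors=2, iters=10):
    """Generalized Lloyd clustering sketch for the Dominant Color Descriptor."""
    pixels = np.asarray(pixels, dtype=float)
    idx = np.linspace(0, len(pixels) - 1, n_colors).astype(int)
    centroids = pixels[idx].copy()
    for _ in range(iters):
        # clustering step: assign each pixel to its nearest centroid
        dist = np.linalg.norm(pixels[:, None, :] - centroids[None], axis=2)
        labels = dist.argmin(axis=1)
        # centroid step: recompute each representative color
        for k in range(n_colors):
            if np.any(labels == k):
                centroids[k] = pixels[labels == k].mean(axis=0)
    percentages = np.bincount(labels, minlength=n_colors) / len(pixels)
    return centroids, percentages

# Two well-separated color clusters -> two dominant colors at ~50% each.
rng = np.random.default_rng(0)
dark = rng.normal(20, 2, (50, 3))
light = rng.normal(200, 2, (50, 3))
c, p = dominant_colors(np.vstack([dark, light]))
```

The returned (color, percentage) pairs correspond to the (ci, pi) elements of the descriptor, with the percentages summing to 1 as required.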

SLIDE 94

Dominant Color Descriptor (5)

  • Extraction:
  • Spatial coherency (s):
  • 4 connectivity connected component analysis
  • Individual spatial coherence: normalized average number of the

connected pixels of each dominant color

  • s = Σ_i p_i · (individual spatial coherence)_i
  • s is non-uniformly quantized to 5 bits:
    31 means highest confidence, 1 means no confidence, 0 means not computed

SLIDE 95

Dominant Color Descriptor (6)

  • Similarity Matching:
  • Since the number of representative colors is small, one can first search the database for each representative color separately, then combine the results.

SLIDE 96

Dominant Color Descriptor (7)

  • Similarity Matching:
  • Consider two DCDs F1 = {(c1i, p1i), i = 1, …, N1} and F2 = {(c2j, p2j), j = 1, …, N2}
  • Dissimilarity (D):
    D²(F1, F2) = Σ_i p1i² + Σ_j p2j² − Σ_i Σ_j 2 a_{i,j} p1i p2j
  • a_{k,l} = 1 − d_{k,l} / d_max if d_{k,l} ≤ T_d, and 0 otherwise, where d_{k,l} = ‖c_k − c_l‖ is the Euclidean distance between the two colors and d_max = α T_d

SLIDE 97

Dominant Color Descriptor (8)

  • Similarity Matching:
  • Dissimilarity (Ds):

w1 = 0.3, w2 = 0.7 (recommended)

  • Dissimilarity (Dv):
SLIDE 98

Dominant Color Descriptor (9)

  • Similarity Matching Results:
SLIDE 99

Texture Descriptors

  • Homogeneous Texture Descriptor (HTD)
  • Texture Browsing Descriptor (TBD)
  • Edge Histogram Descriptor (EHD)
SLIDE 100

Homogeneous Texture Descriptor

HTD = [f_DC, f_SD, e_1, e_2, …, e_30, d_1, d_2, …, d_30]

62 numbers (496 bits). The texture feature channels are modeled using Gabor functions in the polar frequency domain.

SLIDE 101

HTD Extraction

  • On the basis of the frequency layout and the

Gabor functions:

  • e_i : log-scaled sum of the squares of the Gabor-filtered Fourier transform coefficients of an image:

HTD = [f_DC, f_SD, e_1, e_2, …, e_30, d_1, d_2, …, d_30]

e_i = log10(1 + p_i),  p_i = Σ_{ω=0⁺}^{1} Σ_{θ=0°}^{360°} [ G_{s_i,r_i}(ω, θ) · P(ω, θ) ]²

SLIDE 102

Edge Histogram Descriptor

  • Divide the image into 4x4 subimages
  • Block-based edge extraction
SLIDE 103

Semantics of the Histogram Bins of the EHD

  • 5 edge types × 16 subimages = 80 histogram bins
SLIDE 104

Shape Descriptors

  • Region-based descriptor
  • Contour-based descriptor
  • 3-D Shape Descriptor
SLIDE 105

Chen-han Tsai

Region vs. Contour (1/2)

SLIDE 106

Region vs. Contour (2/2)

SLIDE 107

Goal of Shape Descriptors (SDs)

  • Fast search and browsing
  • Concise manner
  • Robust to
  • Scaling
  • Translation
  • Rotation
SLIDE 108

Region-based descriptor

  • Take all pixels into account
  • Project the shape onto the 2-D domain by using

ART

SLIDE 109

Angular Radial Transform

F: image function; V: ART basis functions, defined on the unit circle in a polar coordinate system:

F_nm = ⟨V_nm(ρ, θ), f(ρ, θ)⟩ = ∫₀^{2π} ∫₀^1 V*_nm(ρ, θ) f(ρ, θ) ρ dρ dθ

SLIDE 110

ART basis: real part

SLIDE 111

ART basis: imaginary part

SLIDE 112

Descriptor Representation

Angular order m: 0–11; radial order n: 0–2. Normalize by F00: the descriptor elements are F_nm / F_00 for n = 0–2, m = 0–11.

SLIDE 113

Contour-based descriptor

  • Take only border pixels into account
  • CSS representation
SLIDE 114

CSS Representation

SLIDE 115

Descriptor Representation

  • Number of Peaks [6 bits]
  • Global Curvature [2*6 bits]
  • Prototype Curvature [2*6 bits]
  • Highest Peak Y [7 bits]
  • Peak X [6 bits]
  • Peak Y [3 bits]
SLIDE 116

Motion Descriptors

  • Motion Activity Descriptor
  • Camera Motion Descriptor
  • Motion Trajectory Descriptor
  • Parametric Motion Descriptor