EE 6882 Statistical Methods for Video Indexing and Analysis, Fall 2004


EE 6882 Statistical Methods for Video Indexing and Analysis Fall 2004 Prof. Shih-Fu Chang http://www.ee.columbia.edu/~sfchang Lecture 1 - part B (9/7/04) 1 Run-through of a simple image search system Color, Texture, distance metrics, and


slide-1
SLIDE 1

1

EE 6882 Statistical Methods for Video Indexing and Analysis

Fall 2004

  • Prof. Shih-Fu Chang

http://www.ee.columbia.edu/~sfchang Lecture 1 - part B (9/7/04)

slide-2
SLIDE 2

2 EE6882-Chang

Run-through of a simple image search system

  • Color, texture, distance metrics, and evaluation issues
  • References
      • J. R. Smith and S.-F. Chang, "VisualSEEk: A Fully Automated Content-Based Image Query System," ACM Multimedia Conference, Boston, MA, Nov. 1996.
      • J. R. Smith and S.-F. Chang, "Visually Searching the Web for Content," IEEE Multimedia Magazine, Summer, Vol. 4, No. 3, pp. 12-20, 1997.
      • M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker, "Query by image and video content: The QBIC system," IEEE Computer, 28(9):23-32, 1995.
      • C. Faloutsos, R. Barber, M. Flickner, W. Niblack, D. Petkovic, and W. Equitz, "Efficient and effective querying by image content," J. of Intelligent Information Systems, 3(3/4):231-262, July 1994. (QBIC system)
      • T. Sikora, "The MPEG-7 visual standard for content description - an overview," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, pp. 696-702, June 2001.
      • B. S. Manjunath, J.-R. Ohm, V. V. Vasudevan, and A. Yamada, "Color and texture descriptors," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, pp. 703-715, June 2001.
      • Y. Rubner, C. Tomasi, and L. J. Guibas, "A Metric for Distributions with Applications to Image Databases," Proceedings of ICCV'98, Bombay, India, January 1998, pp. 59-66.
  • Thanks to John R. Smith for some slides on color/texture feature extraction
slide-3
SLIDE 3

3 EE6882-Chang

Content-based Image Retrieval System

[Diagram: User, User interface, Image thumbnails, Images & videos, Network, Query server, Image/video server, Index, Archive]

  • What functionalities should each component have?
  • What are the bottlenecks of the system?

slide-4
SLIDE 4

4 EE6882-Chang

Feature Extraction for Content-Based Image Retrieval (Color & Texture)

Why visual features?

  • Manual annotation is tedious and insufficient
  • Computers cannot understand images
  • Comparison of visual features enables comparison of visual scenes
  • Need tools for organizing, filtering, and searching through large amounts of visual data

What visual features?

  • What is available in the data?
  • What features does the human visual system (HVS) use?
  • Color: suitable for color images
  • Texture: visual patterns, surface properties, cues for depth
  • Shape: boundaries of real-world objects, edges
  • Motion: camera motion vs. object motion

slide-5
SLIDE 5

5 EE6882-Chang

Visual Features

How to use visual features?

  • Extraction
  • Representation
  • Discrimination
  • Indexing

Considerations

  • Complexity
  • Invariance: rotation, scaling, cropping, occlusion, shift, etc.
  • Dimension
  • Subjective relevance
  • Distance metric

slide-6
SLIDE 6

6 EE6882-Chang

Visual Features (cont.)

  • The fundamental approach comes from pattern recognition work
  • Group pixels, process the group, and generate a feature vector
  • Discrimination via (transform and) feature vector distance
  • Multidimensional indexing of the feature vectors
  • Do this for color and texture
  • Build a content-based image retrieval system

slide-7
SLIDE 7

7 EE6882-Chang

Color Order Systems

  • The Munsell System (1905)
      • Colors are arranged so that, as nearly as possible, the perceptual distance between adjacent colors is constant
      • The Munsell Book of Color: color chips
  • The Natural Color System (NCS) (1981)
      • Natural Color System Atlas: derived from 60,000 observations
      • Colors are described by the relative amounts of the basic colors: black, white, yellow, blue, red, and green
  • The DIN system (1981)
  • The Coloroid system (1980-1987)
  • Optical Society of America (OSA) system (1981)
  • Hunter LAB system (1981)

slide-8
SLIDE 8

8 EE6882-Chang

Color Order Systems (cont.)

Advantages of Color Order Systems

  • Easy to understand, and samples are available
  • Easy to use and to compare colors side by side
  • Number and spacing of samples can be adapted to the application

Disadvantages

  • Too many color order systems, with no translation between them
  • Color comparison is valid only under the required illuminant
  • User perception differs
  • Application to self-luminous colors (e.g., monitors and computer displays) is not easy

slide-9
SLIDE 9

9 EE6882-Chang

Color Representation

What is COLOR?

  • A weighted combination of stimuli at three principal wavelengths in the visible spectrum (from blue = 400 nm to red = 700 nm), with responses denoted (β, γ, ρ)
  • Examples:
      • λ = 500 nm: (β, γ, ρ) = (20, 40, 20)
      • B = 100: (β, γ, ρ) = (100, 5, 4)
      • G = 100: (β, γ, ρ) = (0, 100, 75)
      • R = 100: (β, γ, ρ) = (0, 0, 100)

[Oberle]

slide-10
SLIDE 10

10 EE6882-Chang

Tri-stimulus Representation

Compute the weights α1, α2, α3 such that the response (β, γ, ρ) is the same as that of the original color.

[Figure: primaries P1(λ), P2(λ), P3(λ), weighted by α1, α2, α3, are presented to the HVS and produce the same response (β, γ, ρ)]

E.g., use R, G, B as the primary colors P1, P2, P3.

slide-11
SLIDE 11

11 EE6882-Chang

Color Spaces and Color Order Systems

Color Spaces

  • RGB: cube in Euclidean space
  • Standard representation used in color displays
  • Drawbacks
      • RGB basis is not related to human color judgments
      • Intensity should form one of the dimensions of color
      • Important perceptual components of color are hue, brightness, and saturation

$$ r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B} $$

slide-12
SLIDE 12

12 EE6882-Chang

Color Spaces and Color Order Systems

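  • HSI: cone (cylindrical coordinates)

$$ \begin{pmatrix} I \\ V_1 \\ V_2 \end{pmatrix} = \begin{pmatrix} 1/3 & 1/3 & 1/3 \\ -1/\sqrt{6} & -1/\sqrt{6} & 2/\sqrt{6} \\ 1/\sqrt{6} & -1/\sqrt{6} & 0 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix} $$

$$ H = \tan^{-1}\left( \frac{V_2}{V_1} \right), \quad S = \left( V_1^2 + V_2^2 \right)^{1/2} $$

  • Opponent: Cartesian

$$ \begin{pmatrix} R\text{-}G \\ Bl\text{-}Y \\ W\text{-}Bk \end{pmatrix} = \begin{pmatrix} 1 & -2 & 1 \\ -1 & -1 & 2 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix} $$

  • YIQ: NTSC television standard

$$ \begin{pmatrix} I \\ Q \\ Y \end{pmatrix} = \begin{pmatrix} 0.6 & -0.28 & -0.32 \\ 0.21 & -0.52 & 0.31 \\ 0.3 & 0.59 & 0.11 \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix} $$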
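A minimal sketch of the linear color-space transforms on this slide, assuming RGB values scaled to [0, 1]. The YIQ coefficients are the rounded values from the slide; the opponent matrix is a reading of the slide's garbled layout, and the names `YIQ`, `OPPONENT`, and `transform` are illustrative only.

```python
# Rows are (I, Q, Y), using the rounded coefficients shown on the slide.
YIQ = [
    (0.60, -0.28, -0.32),  # I
    (0.21, -0.52, 0.31),   # Q
    (0.30, 0.59, 0.11),    # Y
]

# Rows are (R-G, Bl-Y, W-Bk); assumed reading of the slide's opponent matrix.
OPPONENT = [
    (1, -2, 1),
    (-1, -1, 2),
    (1, 1, 1),
]


def transform(matrix, rgb):
    """Apply a 3x3 matrix (list of rows) to an (R, G, B) triple."""
    return tuple(sum(c * v for c, v in zip(row, rgb)) for row in matrix)


# White has no chrominance: I and Q vanish and Y carries all the intensity.
i, q, y = transform(YIQ, (1.0, 1.0, 1.0))
```

For an achromatic input, both chrominance-like channels come out at (numerically) zero in either space, which is a quick sanity check on the signs of the coefficients.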

slide-13
SLIDE 13

13 EE6882-Chang

Perceptual Representation Of HSI Space

  • Brightness varies along the vertical axis
  • Hue varies along the circumference
  • Saturation varies along the radius

slide-14
SLIDE 14

14 EE6882-Chang

Color Coordinate Systems

From Jain’s DIP book

slide-15
SLIDE 15

15 EE6882-Chang

Color Coordinate Systems (cont.)

slide-16
SLIDE 16

16 EE6882-Chang

Color Space Quantization

How many colors to keep?

  • IBM QBIC: 16M (RGB) → 4096 (RGB) → 64 (Munsell) colors
  • Columbia U. VisualSEEk: 16M (RGB) → 166 (HSV) colors (18 hue, 3 sat, 3 val, 4 gray)
  • Stricker and Orengo (Similarity of Color Images)
      • 16M (RGB) → 16 hues, 4 val, 4 sat = 256 (HSV) colors
      • 16M (RGB) → 8 hues, 2 val, 2 sat = 32 (HSV) colors
  • Swain and Ballard (Color Indexing)
      • 16M (RGB) → 8 wb, 16 rg, 16 by = 2048 (OPP) colors

  • Independent quantization: each color dimension is quantized independently
  • Joint quantization: color dimensions are quantized jointly
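As an illustration of independent quantization, the sketch below maps an 8-bit RGB pixel to one of 166 HSV bins (18 hues x 3 saturations x 3 values, plus 4 grays). Only the bin counts come from the slide; the uniform bin boundaries, the saturation threshold `gray_sat`, and the function name are assumptions made for this example, not the VisualSEEk implementation.

```python
import colorsys


def quantize_hsv166(r, g, b, gray_sat=0.1):
    """Map an 8-bit RGB pixel to a bin index in [0, 166).

    Bins 0..161 are chromatic (18 hue x 3 sat x 3 val); bins 162..165 are grays.
    gray_sat is a hypothetical threshold below which a pixel counts as gray.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < gray_sat:
        # Near-achromatic pixel: quantize brightness only, into 4 gray levels.
        return 162 + min(int(v * 4), 3)
    hue_bin = min(int(h * 18), 17)
    sat_bin = min(int((s - gray_sat) / (1.0 - gray_sat) * 3), 2)
    val_bin = min(int(v * 3), 2)
    return (hue_bin * 3 + sat_bin) * 3 + val_bin


black_bin = quantize_hsv166(0, 0, 0)        # darkest gray bin
white_bin = quantize_hsv166(255, 255, 255)  # brightest gray bin
```

Each dimension is cut independently here; a joint quantizer would instead pick cell boundaries in the full three-dimensional space (for example by clustering).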

slide-17
SLIDE 17

17 EE6882-Chang

Color Histogram

Feature extraction from color images

  • Choose a GOOD color space
  • Quantize the color space to reduce the number of colors
  • Represent image color content using a color histogram
  • The feature vector IS the color histogram

$$ h_{RGB}[r, g, b] = \sum_m \sum_n \begin{cases} 1 & \text{if } I_R[m,n] = r,\ I_G[m,n] = g,\ I_B[m,n] = b \\ 0 & \text{otherwise} \end{cases} $$

A color histogram represents the distribution of colors, where each histogram bin corresponds to a color in the quantized color space.
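The counting rule above can be sketched in a few lines. This example assumes an image stored as rows of 8-bit (R, G, B) tuples and a uniform quantization into `levels` bins per channel; both the storage format and the final normalization to unit sum are choices made here for illustration, not part of the slide's definition.

```python
def color_histogram(image, levels=4):
    """Joint RGB histogram with `levels` bins per channel, normalized to sum to 1.

    image: list of rows, each row a list of (r, g, b) tuples with values 0..255.
    levels: bins per channel; assumed to divide 256 evenly.
    """
    step = 256 // levels
    hist = [0.0] * (levels ** 3)
    count = 0
    for row in image:
        for r, g, b in row:
            # Flatten the quantized (r, g, b) triple into a single bin index.
            idx = ((r // step) * levels + (g // step)) * levels + (b // step)
            hist[idx] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist


tiny = [[(255, 0, 0), (255, 0, 0)],
        [(0, 0, 255), (10, 10, 10)]]
hist = color_histogram(tiny)  # 64 bins; half of the mass falls in the red bin
```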

slide-18
SLIDE 18

18 EE6882-Chang

Color Histogram (cont.)

Advantages of color histograms

  • Compact representation of color information
  • Global color distribution
  • Histogram distance metrics

Disadvantages

  • High dimensionality
  • No information about spatial positions of colors

slide-19
SLIDE 19

19 EE6882-Chang

Other Histogram Metrics

  • L1 distance

$$ D_1(i, i+1) = \sum_j \left| H_i(j) - H_{i+1}(j) \right| $$

  • L2 distance

$$ D_2(i, i+1) = \sum_j \left( H_i(j) - H_{i+1}(j) \right)^2 $$

  • Histogram intersection

$$ D_I(i, i+1) = 1 - \frac{\sum_j \min\left( H_i(j), H_{i+1}(j) \right)}{\min\left( \sum_j H_i(j), \sum_j H_{i+1}(j) \right)} $$

  • Quadratic distance

$$ D_Q(i, i+1) = \sum_{j_1} \sum_{j_2} \left( H_i(j_1) - H_{i+1}(j_1) \right) \alpha(j_1, j_2) \left( H_i(j_2) - H_{i+1}(j_2) \right) $$

where $\alpha(j_1, j_2)$ is the correlation between colors $j_1$ and $j_2$, e.g., $1 - d$.

  • Other histograms
      • Edge histogram + total edge count
      • Texture
      • Issue: quality of edge and texture extraction, lighting (dark frame)
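The four histogram distances on this slide can be sketched directly on plain Python lists. The function names are illustrative, and the similarity matrix `alpha` in the usage example (1 minus normalized bin distance) is only one possible choice, assumed here for a three-bin toy case.

```python
def l1_distance(h1, h2):
    """Sum of absolute bin differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def l2_distance(h1, h2):
    """Sum of squared bin differences (no square root, as on the slide)."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))


def intersection_distance(h1, h2):
    """One minus the histogram intersection, normalized by the smaller mass."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return 1.0 - overlap / min(sum(h1), sum(h2))


def quadratic_distance(h1, h2, alpha):
    """Quadratic form (h1 - h2)^T alpha (h1 - h2) with cross-bin weights alpha."""
    diff = [a - b for a, b in zip(h1, h2)]
    n = len(diff)
    return sum(diff[j1] * alpha[j1][j2] * diff[j2]
               for j1 in range(n) for j2 in range(n))


h_a = [0.5, 0.5, 0.0]
h_b = [0.0, 0.5, 0.5]
# Assumed similarity: 1 - |j1 - j2| / 2 for three ordered bins.
alpha = [[1.0, 0.5, 0.0],
         [0.5, 1.0, 0.5],
         [0.0, 0.5, 1.0]]
d_q = quadratic_distance(h_a, h_b, alpha)
```

Unlike the bin-by-bin measures, the quadratic form lets mass in perceptually similar bins partially cancel, which is why the choice of `alpha` matters.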

slide-20
SLIDE 20

Color Coherence Vector

Not just a count of colors: adjacency is also checked.

Processing steps: Color quantization → Region segmentation → Labeling

[Figure: example pixel grid after color quantization, after region segmentation, and after labeling into connected regions A to E]

Regions:  A   B   C   D   E
Color:    1   2   1   3   1
Size:    12  15   3   1   5

Color coherence vector (α: pixels in coherent regions, β: pixels in incoherent regions):

Color:  1   2   3
α:     17  15   0
β:      3   0   1

$$ G_I = \left( (\alpha_1, \beta_1), \ldots, (\alpha_n, \beta_n) \right), \quad H_I = \left( \alpha_1 + \beta_1, \ldots, \alpha_n + \beta_n \right) $$

$$ \Delta_G = \sum_{i=1}^{n} \left( |\alpha_i - \alpha_i'| + |\beta_i - \beta_i'| \right), \quad \Delta_H = \sum_{i=1}^{n} \left| (\alpha_i + \beta_i) - (\alpha_i' + \beta_i') \right| $$

$$ \Delta_G \ge \Delta_H \quad \text{(by the triangle inequality)} $$
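The region table on this slide can be turned into a coherence vector with a short accumulation loop. The sketch below assumes regions are already segmented and given as (color, size) pairs, and that a region is coherent when its size exceeds a threshold `tau`; the value `tau = 4` is an assumption that happens to reproduce the slide's numbers, and the function names are illustrative.

```python
def coherence_vector(regions, tau):
    """Return {color: (alpha, beta)} from (color, size) region pairs.

    alpha counts pixels in regions larger than tau (coherent),
    beta counts pixels in the remaining regions (incoherent).
    """
    ccv = {}
    for color, size in regions:
        alpha, beta = ccv.get(color, (0, 0))
        if size > tau:
            alpha += size
        else:
            beta += size
        ccv[color] = (alpha, beta)
    return ccv


def ccv_distance(g1, g2):
    """Delta_G: compare coherent and incoherent counts separately."""
    colors = set(g1) | set(g2)
    return sum(abs(g1.get(c, (0, 0))[0] - g2.get(c, (0, 0))[0]) +
               abs(g1.get(c, (0, 0))[1] - g2.get(c, (0, 0))[1])
               for c in colors)


def histogram_distance(g1, g2):
    """Delta_H: compare only the per-color totals alpha + beta."""
    colors = set(g1) | set(g2)
    return sum(abs(sum(g1.get(c, (0, 0))) - sum(g2.get(c, (0, 0))))
               for c in colors)


# Regions A..E from the slide as (color, size).
slide_regions = [(1, 12), (2, 15), (1, 3), (3, 1), (1, 5)]
ccv = coherence_vector(slide_regions, tau=4)
```

Two images with identical color totals but different spatial coherence give a zero histogram distance and a nonzero coherence-vector distance, which is the point of the inequality above.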

slide-21
SLIDE 21

21 EE6882-Chang

Consideration of Metrics

Limitations of Euclidean Metric

  • Cannot distinguish classes
  • Correlation between features
  • Curved boundaries
      • Change features
      • Use Mahalanobis distance
  • Distinctive subclasses
      • Use clustering
  • Complex features
      • Use better features

[Figure: scatter plots of two classes illustrating these cases]

slide-22
SLIDE 22

Mahalanobis Metric

$$ D_{mah}^2(x_1, x_2) = (x_1 - x_2)^T C_x^{-1} (x_1 - x_2) $$

Covariance matrix:

$$ C = \begin{pmatrix} c(1,1) & c(1,2) & \cdots & c(1,d) \\ \vdots & \vdots & & \vdots \\ c(d,1) & c(d,2) & \cdots & c(d,d) \end{pmatrix}, \quad c(i,j) = \frac{1}{N-1} \sum_{k=1}^{N} \left( x_k(i) - m(i) \right) \left( x_k(j) - m(j) \right) $$

N: number of samples

[Figure: scatter plots of features $x_i$ and $x_j$ for $c = -s_i s_j$, $c = -\frac{1}{2} s_i s_j$, $c = 0$, $c = \frac{1}{2} s_i s_j$, and $c = s_i s_j$; $s_i$, $s_j$: standard deviations]

Eigen-decomposition:

$$ C_x = [e_1 | e_2 | \ldots | e_d] \, \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_d) \, [e_1 | e_2 | \ldots | e_d]^T $$

$$ C_x^{-1} = [e_1 | e_2 | \ldots | e_d] \, \left( \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_d) \right)^{-1} \, [e_1 | e_2 | \ldots | e_d]^T $$

[Figure: data cloud with eigenvectors $e_1$, $e_2$]

The metric projects the data onto the eigenvectors, divides by the standard deviation along each eigen-dimension, and computes the Euclidean distance.

slide-23
SLIDE 23

23 EE6882-Chang

Mahalanobis Metric (cont.)

Advantages of the Mahalanobis metric

  • Accounts for scaling of the coordinate axes
  • Invariant under linear transformation

$$ \text{If } y = Ax \ \Rightarrow \ C_y = A C_x A^T, \quad D_y^2 = D_x^2 $$

  • Corrects for correlation
  • Provides curved as well as linear decision boundaries

Potential issue

  • Needs enough training data to estimate the covariance matrix, which has d(d+1)/2 independent elements

[Figure: the same data plotted with one axis in km and in cm]

[Figure: minimum-distance classifier. The input $x_i$ is fed to c Mahalanobis distance units with parameters $(m_1, C_1)$ through $(m_c, C_c)$; a minimum selector outputs the selected class]
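A small sketch of the Mahalanobis distance and its invariance to axis rescaling, restricted to two-dimensional features so the covariance inverse can be written in closed form. The sample points, the scaling factor, and the function names are made up for illustration.

```python
def covariance_2d(samples):
    """Unbiased 2x2 sample covariance (divides by N - 1)."""
    n = len(samples)
    mean = [sum(s[k] for s in samples) / n for k in (0, 1)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
             for j in (0, 1)] for i in (0, 1)]


def mahalanobis_sq(x1, x2, cov):
    """Squared Mahalanobis distance (x1 - x2)^T cov^{-1} (x1 - x2) in 2-D."""
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [x1[0] - x2[0], x1[1] - x2[1]]
    return sum(d[i] * inv[i][j] * d[j] for i in (0, 1) for j in (0, 1))


# Toy data: the first axis has twice the spread of the second.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
cov = covariance_2d(points)
d_along_x = mahalanobis_sq(points[0], points[1], cov)
d_along_y = mahalanobis_sq(points[0], points[2], cov)

# Rescale the first axis (y = A x with A = diag(100, 1)) and recompute.
scaled = [(100.0 * x, y) for x, y in points]
d_scaled = mahalanobis_sq(scaled[0], scaled[1], covariance_2d(scaled))
```

In this toy data a step of 2 along the first axis and a step of 1 along the second are equally far in Mahalanobis terms, and rescaling the first axis leaves the distance unchanged, whereas the Euclidean distance would change by the scale factor.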

slide-24
SLIDE 24

24 EE6882-Chang

Earth Mover’s Distance (EMD)

  • Rubner, Tomasi, Guibas '98
  • Transportation Problem [Dantzig '51]
      • I: set of suppliers
      • J: set of consumers
      • $c_{ij}$: cost of shipping a unit of supply from i to j
  • Problem: find the optimal set of flows $f_{ij}$ such that

$$ \text{minimize} \quad \sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij} $$

subject to

$$ f_{ij} \ge 0, \quad i \in I,\ j \in J \quad \text{(no reverse shipping)} $$

$$ \sum_{i \in I} f_{ij} = y_j, \quad j \in J \quad \text{(satisfy each consumer's need/capacity)} $$

$$ \sum_{j \in J} f_{ij} \le x_i, \quad i \in I \quad \text{(each supplier's limit)} $$

$$ \sum_{j \in J} y_j \le \sum_{i \in I} x_i \quad \text{(feasibility)} $$

slide-25
SLIDE 25

25 EE6882-Chang

Advantage of EMD

  • Efficient implementations exist (simplex method)
  • Also supports partial matching (|I| ≠ |J|, e.g., histograms defined in different color spaces or at different scales)
  • If the masses of the two distributions are equal, then EMD is a true metric
  • Allows flexible structures, e.g., matching multiple regions in each image
      • Multiple regions in one image, each region represented by an individual feature vector
      • Region set: {R1, R2, R3}
      • Region set: {R1', R2', R3', R4'}
      • $C_{ij}$ = dist(Ri, Rj'), which can itself be based on EMD

slide-26
SLIDE 26

26 EE6882-Chang

EMD of Color Histogram

$$ h = \left( h(1), h(2), \ldots, h(M) \right), \quad g = \left( g(1), g(2), \ldots, g(N) \right), \quad \text{assume } \sum_{j=1}^{N} g(j) \le \sum_{i=1}^{M} h(i) $$

$$ EMD(h, g) = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij}}{\sum_{i=1}^{M} \sum_{j=1}^{N} f_{ij}} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} C_{ij} f_{ij}}{\sum_{j=1}^{N} g(j)} $$

evaluated at the optimal flows $f_{ij}$.

  • h plays the role of the earth and g the role of the holes; the flows fill up each hole
  • $C_{ij}$: distance between color i in the color space of h and color j in the color space of g
  • $f_{ij}$: move $f_{ij}$ units of mass from i in h to j in g
  • Normalization by the denominator term avoids bias toward low-mass distributions
  • Experimental results: [Rubner, Tomasi, Guibas '98]

slide-27
SLIDE 27

27 EE6882-Chang

EMD with Pre-filtering

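$$ d_{EMD} \ge d_{pre}; \quad \text{if } d_{pre} > TH \text{ then reject the candidate} $$

$$ \sum_i \sum_j f_{ij} \, d(p_i, q_j) \ \ge \ \left\| \sum_i \sum_j f_{ij} (p_i - q_j) \right\| = \left\| \sum_i \sum_j f_{ij} \, p_i - \sum_i \sum_j f_{ij} \, q_j \right\| = \left\| \sum_i x_i p_i - \sum_j y_j q_j \right\| $$

The last step uses $\sum_j f_{ij} = x_i$ and $\sum_i f_{ij} = y_j$, which holds when the two distributions have equal total mass.

For color histograms

  • $p_i$, $q_j$: colors of the i-th and j-th bins
  • $x_i$, $y_j$: histogram values
  • EMD ≥ distance between the average colors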
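The pre-filtering bound can be checked numerically in a special case that needs no linear-programming solver. For two equal-mass histograms over ordered 1-D bins with ground distance |i - j|, the optimal transport cost is known to equal the sum of absolute differences of the cumulative histograms; the sketch below uses that shortcut rather than the general simplex formulation, and all names are illustrative.

```python
def emd_1d(h, g):
    """EMD between equal-mass 1-D histograms with ground distance |i - j|.

    Uses the cumulative-difference shortcut valid for this special case,
    then normalizes by the total mass moved.
    """
    work = 0.0
    carry = 0.0  # mass that still has to be shifted to later bins
    for a, b in zip(h, g):
        carry += a - b
        work += abs(carry)
    return work / sum(h)


def mean_position(h):
    """Average bin position, the 1-D analogue of the average color."""
    return sum(i * w for i, w in enumerate(h)) / sum(h)


def prefilter_bound(h, g):
    """Cheap lower bound on emd_1d: distance between the mean positions."""
    return abs(mean_position(h) - mean_position(g))


h = [0.5, 0.0, 0.5]
g = [0.0, 1.0, 0.0]
bound = prefilter_bound(h, g)   # means coincide, so the bound is uninformative
exact = emd_1d(h, g)            # mass still has to move by one bin
```

When the bound already exceeds the rejection threshold, the exact distance need not be computed; when it is small, as in this example, the full computation is still required.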

slide-28
SLIDE 28

28 EE6882-Chang

slide-29
SLIDE 29

29 EE6882-Chang

Texture

What is texture?

  • Has a structured or repetitious pattern, e.g., a checkered surface
  • Has a statistical pattern, e.g., grass, sand, rocks

Why texture?

  • Application to satellite images and medical images
  • Describes the contents of real-world images, e.g., clouds, fabrics, surfaces, wood, stone

Challenging issues

  • Rotation and scale invariance (3D)
  • Segmentation/extraction of texture regions from images
  • Texture in noise

slide-30
SLIDE 30

30 EE6882-Chang

Approaches to texture features

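Fourier Domain Energy Distribution

  • Angular features (directionality)

$$ V_{\theta_1 \theta_2} = \iint |F(u, v)|^2 \, du \, dv, \quad \text{where } \theta_1 \le \tan^{-1}\left( \frac{v}{u} \right) \le \theta_2 $$

  • Radial features (coarseness)

$$ V_{r_1 r_2} = \iint |F(u, v)|^2 \, du \, dv, \quad \text{where } r_1^2 \le u^2 + v^2 < r_2^2 $$

[Figure: frequency plane with axes $\omega_x$, $\omega_y$, showing an angular wedge at angle $\phi$ and a ring at radius r]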

slide-31
SLIDE 31

31 EE6882-Chang

Approaches to texture

Co-occurrence Matrix (image with m levels)

  • Popular early texture approach

$$ Q_{R(d,\theta)} = \begin{pmatrix} Q(1,1) & \cdots & Q(1,m) \\ \vdots & & \vdots \\ Q(m,1) & \cdots & Q(m,m) \end{pmatrix} $$

where $Q(i, j)$ counts the pixel pairs with $I[x, y] = i$ and $I[x', y'] = j$, and $R(d, \theta)$ is the relation between the two pixels (e.g., 'top'):

$$ x' = x + d \cos(\theta), \quad y' = y + d \sin(\theta) $$

[Figure: pixels $P_0$ and $P_1$ separated by distance d at angle θ]
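A minimal sketch of building a co-occurrence matrix for one displacement (d, θ). It assumes gray levels are integers 0..m-1 (the slide indexes them 1..m), rounds the displacement to whole pixels, ignores pairs that fall outside the image, and normalizes counts to relative frequencies; these are choices made for the example.

```python
import math


def cooccurrence(image, m, d=1, theta=0.0):
    """Normalized m x m co-occurrence matrix for displacement (d, theta).

    image: list of rows of integer gray levels in 0..m-1.
    Entry [i][j] is the fraction of valid pixel pairs with level i at (x, y)
    and level j at (x + d cos(theta), y + d sin(theta)).
    """
    dx = int(round(d * math.cos(theta)))
    dy = int(round(d * math.sin(theta)))
    rows, cols = len(image), len(image[0])
    q = [[0.0] * m for _ in range(m)]
    pairs = 0
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                q[image[y][x]][image[y2][x2]] += 1.0
                pairs += 1
    # Assumes at least one valid pair exists for the chosen displacement.
    return [[v / pairs for v in row] for row in q]


patch = [[0, 0, 1],
         [0, 0, 1],
         [2, 2, 1]]
q_right = cooccurrence(patch, m=3, d=1, theta=0.0)  # horizontal neighbors
```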

slide-32
SLIDE 32

32 EE6882-Chang

Co-occurrence Matrix

(also called Spatial Grey-Level Dependence, SGLD)

Measures on $Q_{R(d,\theta)}(i, j)$, written $Q_R(i, j)$ below:

  • Energy

$$ E(d, \theta) = \sum_i \sum_j Q_R(i, j)^2 $$

  • Entropy

$$ H(d, \theta) = -\sum_i \sum_j Q_R(i, j) \log Q_R(i, j) $$

  • Correlation

$$ C(d, \theta) = \frac{\sum_i \sum_j (i - \mu_x)(j - \mu_y) \, Q_R(i, j)}{\sigma_x \sigma_y} $$

  • Inertia

$$ I(d, \theta) = \sum_i \sum_j (i - j)^2 \, Q_R(i, j) $$

  • Local homogeneity

$$ L(d, \theta) = \sum_i \sum_j \frac{1}{1 + (i - j)^2} \, Q_R(i, j) $$

  • These are statistical measures; none corresponds to a visual component.
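The five measures can be computed from a normalized co-occurrence matrix as sketched below. The matrix is assumed to sum to 1 and to be indexed from 0; the energy is taken as the sum of squared entries, and the function name and dictionary keys are illustrative.

```python
import math


def texture_measures(q):
    """Energy, entropy, correlation, inertia, and local homogeneity of q.

    q: square list-of-lists co-occurrence matrix, normalized to sum to 1.
    """
    m = len(q)
    cells = [(i, j, q[i][j]) for i in range(m) for j in range(m)]
    energy = sum(p * p for _, _, p in cells)
    entropy = -sum(p * math.log(p) for _, _, p in cells if p > 0)
    inertia = sum((i - j) ** 2 * p for i, j, p in cells)
    homogeneity = sum(p / (1 + (i - j) ** 2) for i, j, p in cells)
    mu_x = sum(i * p for i, _, p in cells)
    mu_y = sum(j * p for _, j, p in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * p for i, _, p in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * p for _, j, p in cells))
    cov = sum((i - mu_x) * (j - mu_y) * p for i, j, p in cells)
    # Correlation is undefined for a constant image; report 0.0 in that case.
    correlation = cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0
    return {"energy": energy, "entropy": entropy, "correlation": correlation,
            "inertia": inertia, "homogeneity": homogeneity}


# All mass on the diagonal: neighboring pixels always share the same level.
diagonal = [[0.5, 0.0],
            [0.0, 0.5]]
stats = texture_measures(diagonal)
```

For a purely diagonal matrix the inertia is zero and the homogeneity and correlation are at their maximum, which matches the intuition of a smooth, slowly varying texture along the chosen direction.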

slide-33
SLIDE 33

33 EE6882-Chang

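Laws Filters [1980]

  • Non-Fourier type basis
  • Matched better to intuitive texture features
  • Examples of filters (total 12)

$$ \begin{pmatrix} -1 & -4 & -6 & -4 & -1 \\ -2 & -8 & -12 & -8 & -2 \\ 0 & 0 & 0 & 0 & 0 \\ 2 & 8 & 12 & 8 & 2 \\ 1 & 4 & 6 & 4 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & -2 & 0 & 1 \\ 2 & 0 & -4 & 0 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ -2 & 0 & 4 & 0 & -2 \\ -1 & 0 & 2 & 0 & -1 \end{pmatrix} \quad \begin{pmatrix} 1 & -4 & 6 & -4 & 1 \\ -4 & 16 & -24 & 16 & -4 \\ 6 & -24 & 36 & -24 & 6 \\ -4 & 16 & -24 & 16 & -4 \\ 1 & -4 & 6 & -4 & 1 \end{pmatrix} $$

  • Measure the energy of the output from each filter

[Figure: image I passed through the filter bank, producing 12 outputs]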

slide-34
SLIDE 34

34 EE6882-Chang

Tamura Texture

  • Methods for approximating intuitive texture features
  • Example: 'coarseness'; others: 'contrast', 'directionality'
  • Step 1: compute averages at different scales (1x1, 2x2, 4x4 pixels, ...)

$$ \forall (x, y): \quad A_k(x, y) = \sum_{i = x - 2^{k-1}}^{x + 2^{k-1} - 1} \ \sum_{j = y - 2^{k-1}}^{y + 2^{k-1} - 1} \frac{f(i, j)}{2^{2k}} $$

  • Step 2: compute the neighborhood difference at each scale

$$ \forall (x, y): \quad E_{k,h}(x, y) = \left| A_k(x + 2^{k-1}, y) - A_k(x - 2^{k-1}, y) \right| $$

  • Step 3: select the scale with the largest variation

$$ \forall (x, y): \quad \text{determine } k \text{ such that } E_k = \max(E_1, E_2, \ldots, E_L), \quad S_{Best}(x, y) = 2^k $$

  • Step 4: compute the coarseness

$$ F_{CRS} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} S_{Best}(i, j) $$

slide-35
SLIDE 35

38 EE6882-Chang

Content-based Image and Video Retrieval System

[Diagram: User, User interface, Image thumbnails, Images & videos, Network, Query server, Image/video server, Index, Archive]

  • What are the bottlenecks of the system?
  • What functionalities should each component have?

slide-36
SLIDE 36

39 EE6882-Chang

Evaluation

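  • Precision / Recall
      • Precision: C/B
      • Recall: C/A
      • B ↑: Precision ↓, Recall ↑

[Figure: Venn diagram of the ground truth in the DB (A) and the returned result (B), overlapping in C; plot of precision versus recall]

  • Rank similarity
      • Simple measure: $\alpha_i \cdot |R_i - R_i'|$

[Table: columns #, Image ID, Rank, Correct Rank; rows 1, 2, ..., N (example values 5, 7, 20)]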

slide-37
SLIDE 37

40 EE6882-Chang

Evaluation

N images, M benchmark queries, K returned results

$$ V_n = \begin{cases} 1 & \text{"Relevant"} \\ 0 & \text{"Irrelevant"} \end{cases}, \quad n = 0, \ldots, N - 1 $$

  • Detections

$$ A_k = \sum_{n=0}^{k-1} V_n $$

  • False alarms

$$ B_k = \sum_{n=0}^{k-1} (1 - V_n) $$

  • Misses

$$ C_k = \sum_{n=0}^{N-1} V_n - A_k $$

  • Correct dismissals

$$ D_k = \sum_{n=0}^{N-1} (1 - V_n) - B_k $$

slide-38
SLIDE 38

41 EE6882-Chang

Evaluation

Given the size of the returned results, K:

  • Recall

$$ R_k = A_k / (A_k + C_k) $$

  • Precision

$$ P_k = A_k / (A_k + B_k) $$

  • Fall-out

$$ F_k = B_k / (B_k + D_k) $$

[Figure: overlap of the "Returned" set and the "Relevant Ground Truth" set, with regions $A_k$, $B_k$, $C_k$, $D_k$]
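The counts and ratios on the evaluation slides follow directly from a ranked relevance list. The sketch below assumes `v` holds the 0/1 relevance of all N database items in ranked order and that k items are returned; the function names and the toy list are illustrative, and no guard is included for empty denominators.

```python
def retrieval_counts(v, k):
    """Return (A_k, B_k, C_k, D_k) for ranked relevance list v and cutoff k.

    A_k: detections, B_k: false alarms, C_k: misses, D_k: correct dismissals.
    """
    a = sum(v[:k])                   # relevant items among the first k
    b = k - a                        # irrelevant items among the first k
    c = sum(v) - a                   # relevant items that were not returned
    d = (len(v) - sum(v)) - b        # irrelevant items that were not returned
    return a, b, c, d


def recall_precision_fallout(v, k):
    """Return (R_k, P_k, F_k) as defined on the slide."""
    a, b, c, d = retrieval_counts(v, k)
    return a / (a + c), a / (a + b), b / (b + d)


# Toy ranking of N = 10 items, 4 of them relevant.
ranking = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
recall, precision, fallout = recall_precision_fallout(ranking, k=4)
```

Sweeping k from 1 to N and collecting the (recall, precision) pairs gives the points of a precision-recall curve.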

slide-39
SLIDE 39

42 EE6882-Chang

Evaluation Measures

  • 1. Precision-Recall curve ($P_k$ vs. $R_k$)
  • 2. Receiver Operating Characteristic (ROC curve) ($A_k$ vs. $B_k$)
  • 3. Relative Operating Characteristic ($A_k$ vs. $F_k$)
  • 4. R value: $P_k$ at the cut-off point $k = \sum_{n=0}^{N-1} V_n$
  • 5. 3-point $P_k$ value: average of $P_k$ at $R_k = 0.2, 0.5, 0.8$

[Figure: curves with axes $P_k$ versus $R_k$, and $A_k$ (hit) versus $B_k$ (false)]