Mathematical and Perceptual Models for Image Segmentation



SLIDE 1

Mathematical and Perceptual Models for Image Segmentation

Thrasos Pappas Electrical & Computer Engineering Department Northwestern University

pappas@ece.northwestern.edu http://www.ece.northwestern.edu/~pappas

Banff, July 27, 2005

SLIDE 2

Thrasos Pappas, Banff, July 27, 2005

People

• Junqing Chen, Unilever Research
• Dejan Depalov, Northwestern University
• Aleksandra Mojsilovic, IBM T.J. Watson Research Center
• Bernice Rogowitz, IBM T.J. Watson Research Center
• Dongge Li, Motorola Labs
• Bhavan Gandhi, Motorola Labs

SLIDE 3

Problem

[Figure: example images, their “ideal” segmentations, and semantic categories. Labels shown: people, sky, mountain, manmade, cityscape, water, landscape, forest, outdoor.]
SLIDE 4

Semantic Information Extraction

• Motivation
  – Proliferation of image and video acquisition devices (digital still and video cameras, image and video phones, PDAs)
  – World rich in digital visual content
  – Large personal repositories (consumer market)
  – Increasing processing capabilities
• Goal: Intelligent content management
  – Semantic labeling
  – Content organization
  – Efficient retrieval
• Techniques
  – Image and video segmentation
  – Extracting semantically related features
  – Relating features to semantic categories

SLIDE 5

Challenges

• What are the important semantic categories?
• How to link the low-level features to semantically important categories?

SLIDE 6

Semantic Categories

• Recent perceptual experiments by Mojsilovic and Rogowitz identified important semantic categories that humans use for image classification
• Conjecture: Semantic categories can be derived from combinations of low-level image features

[Figure axes: Less human-like to More human-like; Man-made to Natural]

SLIDE 7

Bridging the Semantic Gap

• High level: Semantics
  – Use segment descriptors and statistical techniques to relate segments (first) and scenes (later) to semantic categories/labels
• Medium level: Perceptually uniform segments
  – Incorporate knowledge of human perception and image characteristics into feature extraction and algorithm design
• Low level: Primitives

SLIDE 8

Adaptive Clustering Algorithm

SLIDE 9

Adaptive Clustering Algorithm

[Figure panels: K-means Class Labels; ACA Class Labels; Original Image]

SLIDE 10

Adaptive Clustering Algorithm (ACA)

• K-means clustering (LBG)
  – Based on image histogram
  – No spatial constraints
  – Each cluster is characterized by constant intensity
• Add spatial constraints
  – Region model: Markov/Gibbs random field
• Make it adaptive
  – Cluster centers spatially varying
  – Texture model: spatially varying mean + white Gaussian noise (WGN)
• MAP estimate of segmentation $x$ given observation $y$:

$p(x \mid y) \propto p(y \mid x)\, p(x)$

SLIDE 11

ACA

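• K-means minimizes

$\sum_s \left( y_s - \mu^{x_s} \right)^2$

• Adaptive clustering maximizes

$p(x \mid y) \propto \exp\left\{ -\sum_s \frac{1}{2\sigma^2} \left( y_s - \mu_s^{x_s} \right)^2 - \sum_C V_C(x) \right\}$

• Or, equivalently, minimizes

$\sum_s \frac{1}{2\sigma^2} \left( y_s - \mu_s^{x_s} \right)^2 + \sum_C V_C(x)$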
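The cost that adaptive clustering minimizes (a data term measuring the squared distance of each pixel from its class's local intensity, plus a sum of clique potentials) can be evaluated directly for a candidate labeling. The sketch below is illustrative only: the slide does not give the form of the clique potential, so it assumes a Potts-style potential on horizontal and vertical pixel pairs (-beta when the two labels agree, +beta otherwise), and the function name and parameters are hypothetical.

```python
def aca_energy(y, x, mu, sigma=1.0, beta=0.5):
    """Cost = sum_s (y_s - mu_s^{x_s})^2 / (2 sigma^2) + sum_C V_C(x).

    y  : 2-D list of observed intensities
    x  : 2-D list of integer class labels (same shape as y)
    mu : mu[k][r][c] is the local intensity of class k at pixel (r, c)
    V_C is an ASSUMED Potts-style potential on horizontal and vertical
    pixel pairs: -beta if the two labels agree, +beta otherwise.
    """
    rows, cols = len(y), len(y[0])
    data = 0.0
    prior = 0.0
    for r in range(rows):
        for c in range(cols):
            # Data term: distance of the pixel from its class's local mean.
            data += (y[r][c] - mu[x[r][c]][r][c]) ** 2 / (2.0 * sigma ** 2)
            # Prior term: visit each right and down neighbour pair once.
            if c + 1 < cols:
                prior += -beta if x[r][c] == x[r][c + 1] else beta
            if r + 1 < rows:
                prior += -beta if x[r][c] == x[r + 1][c] else beta
    return data + prior
```

A labeling that follows the intensity edge should score lower than one that mislabels a pixel, since the mislabeled pixel adds a large data term.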

SLIDE 12

ACA: Local Intensity Function Estimation

• Given $x$, the segmentation into classes
• Estimate the intensity function $\mu_s^{x_s}$ for each class $x_s$ at each point $s$ in the image ($\forall x_s, s$)
• Use a hierarchy of window sizes
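One way to picture the local intensity estimate: for every pixel and every class, average the observed intensities of the pixels in a surrounding window that currently carry that class label. The sketch below assumes a square window and leaves the estimate undefined (`None`) when the class does not occur in the window; both choices, and all names, are assumptions made for illustration.

```python
def local_class_means(y, x, n_classes, half_window):
    """Return mu[k][r][c]: mean of y over pixels labelled k inside the
    (2 * half_window + 1)-wide window centred on (r, c), or None when the
    window holds no pixel of class k (an assumed convention)."""
    rows, cols = len(y), len(y[0])
    mu = [[[None] * cols for _ in range(rows)] for _ in range(n_classes)]
    for r in range(rows):
        for c in range(cols):
            sums = [0.0] * n_classes
            counts = [0] * n_classes
            for rr in range(max(0, r - half_window), min(rows, r + half_window + 1)):
                for cc in range(max(0, c - half_window), min(cols, c + half_window + 1)):
                    sums[x[rr][cc]] += y[rr][cc]
                    counts[x[rr][cc]] += 1
            for k in range(n_classes):
                if counts[k] > 0:
                    mu[k][r][c] = sums[k] / counts[k]
    return mu
```

A hierarchy of window sizes would call this repeatedly with a decreasing `half_window`, re-estimating the labels between calls.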

SLIDE 13

ACA

SLIDE 14

ACA: Region Estimation

• Given $\mu_s^{x_s}, \ \forall x_s, s$
• Maximize $p(x \mid y)$ (too difficult)
• Maximize marginal densities (Iterated Conditional Modes):

$p(x_s \mid y, x_q, \forall q \neq s) = p(x_s \mid y, x_q, q \in N_s)$
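A single Iterated Conditional Modes pass can be sketched as a raster scan in which each pixel takes the label that minimizes its local cost given the current labels of its neighbors. The 4-pixel neighborhood and the Potts-style pair potential (-beta for equal labels, +beta otherwise) are assumptions made for this illustration, as are the function and parameter names.

```python
def icm_sweep(y, x, mu, sigma=1.0, beta=0.5):
    """One raster-scan ICM pass over the label field.

    y  : 2-D list of observed intensities
    x  : 2-D list of current integer labels (not modified)
    mu : mu[k][r][c] is the local intensity of class k at pixel (r, c)
    Neighbourhood (4 pixels) and pair potential are ASSUMED, not sourced.
    """
    rows, cols, n_classes = len(y), len(y[0]), len(mu)
    x = [row[:] for row in x]  # work on a copy of the labels
    for r in range(rows):
        for c in range(cols):
            neighbours = [x[rr][cc]
                          for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                          if 0 <= rr < rows and 0 <= cc < cols]

            def local_cost(k):
                data = (y[r][c] - mu[k][r][c]) ** 2 / (2.0 * sigma ** 2)
                prior = sum(-beta if k == q else beta for q in neighbours)
                return data + prior

            # Updated labels are used immediately by later pixels in the scan.
            x[r][c] = min(range(n_classes), key=local_cost)
    return x
```

With `beta=0` the spatial term vanishes and each pixel simply takes the class with the nearest local intensity; a positive `beta` lets the neighbors overrule an isolated noisy pixel.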

SLIDE 15

K-means vs. ACA

SLIDE 16

K-means Clustering

SLIDE 17

K-means Clustering

SLIDE 18

ACA: Local Intensity Functions (15x15)

SLIDE 19

ACA: Model (15x15)

SLIDE 20

Adaptive Clustering Algorithm

[Figure panels: ACA Class Labels; ACA Model (7x7); Original Image]

SLIDE 21

Adaptive Clustering Algorithm

[Figure panels: ACA Class Labels; ACA Model (15x15); Original Image]

SLIDE 22

Adaptive Clustering Algorithm

[Figure panels: ACA Class Labels; ACA Model (31x31); Original Image]

SLIDE 23

Image Restoration Models

• Simple space-varying image model [Kuan et al. ’85]
  – Space-varying mean + white Gaussian noise
• Spatially adaptive LMMSE estimator
  – Use local sample mean and local sample variance
• No explicit model for region boundaries
  – Computes sample mean/variance across boundaries
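To illustrate the boundary issue, here is a 1-D sketch of a local LMMSE estimate for additive white noise: the output is the local mean plus a gain times the deviation from it, where the gain is the estimated signal variance over the local sample variance. This is a commonly used textbook form; whether it matches the exact variant in [Kuan et al. ’85] is an assumption, and the function name and window handling are hypothetical.

```python
def local_lmmse(y, noise_var, half_window=1):
    """1-D local LMMSE sketch: x_hat = m + gain * (y - m), with
    gain = max(v - noise_var, 0) / v, where m and v are the sample mean
    and sample variance inside a sliding window (truncated at the ends)."""
    out = []
    for i in range(len(y)):
        window = y[max(0, i - half_window): i + half_window + 1]
        mean = sum(window) / len(window)
        var = sum((v - mean) ** 2 for v in window) / len(window)
        signal_var = max(var - noise_var, 0.0)
        gain = signal_var / var if var > 0 else 0.0
        out.append(mean + gain * (y[i] - mean))
    return out
```

For the step signal `[0, 0, 10, 10]` with a large `noise_var`, the window around the second sample straddles the edge, so the estimate is pulled toward the cross-boundary mean (10/3) instead of staying at 0. This is the behavior the last bullet describes.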

SLIDE 24

K-means vs. ACA

SLIDE 25

ACA

SLIDE 26

Adaptive Perceptual Color-Texture Segmentation

SLIDE 27

Natural Textures

• Combine color composition, spatial characteristics
• Non-uniform statistical characteristics (lighting, perspective)
• Perceptually uniform
• Need spatially adaptive features
• Small number of parameters

SLIDE 28

Texture Synthesis [Portilla-Simoncelli’00]

SLIDE 29

Adaptive Perceptual Color-Texture Segmentation

[Block diagram: Original → Color Composition Feature Extraction (output: slowly varying dominant colors); Grayscale → Spatial Texture Feature Extraction (output: texture class labels); both outputs feed the final segmentation.]

SLIDE 30

Dominant Colors

• Human eye cannot simultaneously perceive a large number of colors
  – Even though, under appropriate adaptation, it can distinguish more than 2M colors
• Small set of color categories
  – Efficient representation
  – Easier to capture invariant properties of object appearance
• Color categories are related to the statistical structure of the perceived environment
  – K-means clustering to compute color categories [Yendrikovskij’00]

SLIDE 31

Spatially Adaptive Dominant Colors

• Dominant colors [Ma’97, Mojsilovic’00]
  – For class of images
  – For a given image
• Current approaches to extract dominant colors:
  – K-means (VQ) [LBG’80]
  – Mean-shift [Comaniciu-Meer’97]
  – Assumption: constant dominant colors
• Proposed approach:
  – Spatially adaptive dominant colors
  – Use ACA

SLIDE 32

Comparison with Mean-Shift

[Figure labels: Original Image; ACA (4 colors); quantization; over-segmentation; under-segmentation]

SLIDE 33

Color Composition Feature

• Constant dominant colors:

$f_c = \{ (c_i, p_i),\ i = 1, \ldots, n,\ p_i \in [0, 1] \}$

• Spatially adaptive dominant colors:

$f_c(s, N_s) = \{ (c_i, p_i),\ i = 1, \ldots, n,\ p_i \in [0, 1] \}$

where $c_i$ is a color and $p_i$ is its percentage.

• ACA adapts to local characteristics.
• Dominant colors are relatively constant in a small neighborhood, so they can be approximated by the intensity at the center of the window.

SLIDE 34

Color Feature Similarity Metric

• Optimal Color Composition Distance (OCCD) [Mojsilovic’00]
  – Quantize color component based on percentage
  – Find best color correspondence
  – Then compute distance as sum of distances between matched colors (in a given colorspace)

SLIDE 35

Illustration of OCCD computation

[Figure: color compositions A and B with their matched color pairs; the color swatches did not survive extraction.]

A: ( ,30) ( ,30) ( ,20) ( ,20)
B: ( ,40) ( ,30) ( ,30)
Link distances shown: 131, 30, 55, 61
OCCD dist = 61*.3 + 55*.2 + 30*.1 + 131*.1 = 45.4

• Color quantization unit p = 10
• Weight of the link is Cmax − cost (color distance in Lab color space, Cmax = 376)
• Solve the maximum graph matching problem using Gabow’s algorithm.
• Apply color metric to resulting graph.
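The OCCD steps can be sketched as follows: split each color composition into equal-percentage units, match units one-to-one so that the total color distance is smallest, and average the matched distances. Maximizing link weights of the form Cmax − cost is equivalent to minimizing total cost, so this sketch swaps in a plain Hungarian-style minimum-cost assignment in place of Gabow's algorithm. The Lab values in the usage note are hypothetical (the slide's swatches are lost), and all function names are illustrative.

```python
import math


def min_cost_assignment(cost):
    """Total cost of the cheapest perfect matching for a square cost
    matrix (O(n^3) Hungarian method with row/column potentials)."""
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials
    v = [0.0] * (n + 1)          # column potentials
    p = [0] * (n + 1)            # p[j] = row currently matched to column j
    way = [0] * (n + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:           # flip the matching along the path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sum(cost[p[j] - 1][j - 1] for j in range(1, n + 1))


def occd(feature_a, feature_b, unit=10):
    """feature_*: list of (lab_color, percentage) pairs summing to 100."""
    def to_units(feature):
        units = []
        for color, pct in feature:
            units.extend([color] * round(pct / unit))
        return units

    ua, ub = to_units(feature_a), to_units(feature_b)
    if len(ua) != len(ub):
        raise ValueError("features must quantize to the same number of units")
    cost = [[math.dist(a, b) for b in ub] for a in ua]
    return min_cost_assignment(cost) / len(ua)
```

For example, with hypothetical compositions `[((50, 0, 0), 50), ((50, 10, 0), 50)]` and `[((50, 0, 0), 70), ((50, 10, 0), 30)]`, eight of the ten units match at zero distance and two match at distance 10, so the result is 2.0. The slide's own figure follows the same pattern: each matched distance is weighted by the fraction of units it covers.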

SLIDE 36

Spatial Texture Features

• Grayscale image component (vs. achromatic pattern map)
• Multiscale frequency decomposition
  – DWT (9/7 Daubechies)
  – Steerable filters [Freeman-Adelson’91]
  – Gabor filters [Daugman’86]
• Energy of subband coefficients is sparse
  – Use local median energy

SLIDE 37

Steerable Pyramid Decomposition

[Figure panels: Ideal spectrum, 1-level decomposition; Ideal spectrum, 2-level decomposition. Frequency axes run from −π to π.]

SLIDE 38

Steerable Pyramid Decomposition

[Figure panels: Ideal spectrum; Actual spectrum. Frequency axes run from −π to π.]

SLIDE 39

Smooth vs. Non-smooth Classification

• For each pixel:
  – Smax = maximum of the 4 subband responses
  – Si = index of the maximum coefficient
  – Local median energy extraction on Smax
  – 2-level K-means on local median (check validity of smooth/non-smooth clusters)
  – Use threshold provided by subjective test
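The first three per-pixel steps can be sketched in a few lines: take the largest of the four subband responses and remember which subband produced it, then replace each Smax value with the median over a local window. Using the magnitude of each response and a square, edge-truncated window are assumptions of this sketch; the function names are hypothetical.

```python
from statistics import median


def max_response(subbands):
    """subbands: list of equally sized 2-D lists, one per orientation.
    Returns (s_max, s_idx): per-pixel largest magnitude (an ASSUMED
    energy measure) and the index of the subband that produced it."""
    rows, cols = len(subbands[0]), len(subbands[0][0])
    s_max = [[0.0] * cols for _ in range(rows)]
    s_idx = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            mags = [abs(band[r][c]) for band in subbands]
            s_idx[r][c] = max(range(len(mags)), key=mags.__getitem__)
            s_max[r][c] = mags[s_idx[r][c]]
    return s_max, s_idx


def local_median(img, half_window):
    """Median of img over a square window centred on each pixel."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [img[rr][cc]
                      for rr in range(max(0, r - half_window), min(rows, r + half_window + 1))
                      for cc in range(max(0, c - half_window), min(cols, c + half_window + 1))]
            out[r][c] = median(window)
    return out
```

The median map would then be split into smooth and non-smooth pixels by two-class K-means or by a fixed threshold; neither the threshold value nor the validity check is given on the slide, so they are left out here.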

SLIDE 40

Classification of Non-smooth Regions

• Construct local histogram of Si
• “Complex”: no dominant orientation, i.e., no index dominates (1st and 2nd maximum of histogram are close, or maximum is not large enough)
• Otherwise classify according to dominant orientation (max index) as “horizontal,” “vertical,” “+45,” or “-45”
• Can be used with any multiscale frequency decomposition

[Figure panels: Max Indices Si; Texture classes]
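The histogram rule can be sketched as a small function over the Si values inside a local window. The two thresholds (`min_share` for "maximum is not large enough" and `min_margin` for "1st and 2nd maximum are close") and the mapping from subband index to orientation name are hypothetical placeholders; in the talk the thresholds are tuned by subjective tests.

```python
from collections import Counter

# ASSUMED mapping from subband index to orientation name.
ORIENTATIONS = ("horizontal", "vertical", "+45", "-45")


def classify_nonsmooth(indices, min_share=0.4, min_margin=0.1):
    """indices: non-empty sequence of max-subband indices Si in a window.
    Returns an orientation name, or "complex" when no index dominates."""
    counts = Counter(indices)
    total = sum(counts.values())
    ranked = counts.most_common(2)
    top_idx, top_n = ranked[0]
    second_n = ranked[1][1] if len(ranked) > 1 else 0
    share = top_n / total                  # how large the maximum is
    margin = (top_n - second_n) / total    # gap to the second maximum
    if share < min_share or margin < min_margin:
        return "complex"
    return ORIENTATIONS[top_idx]
```

A window where 80% of the pixels share one index is labeled with that orientation, while an even split between two indices falls back to "complex".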

SLIDE 41

Multi-scale Texture Classification

• Apply texture classification at each scale
• Combine texture classes from different scales based on the following rules:
  – “smooth”: “smooth” at all scales
  – “vertical,” “horizontal,” “+45°,” “-45°”: consistent texture classification across all scales. Note: “complex” or “smooth” is consistent with any single direction
  – “complex”: none of the above satisfied
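These combination rules translate almost directly into code. The sketch below takes one label per scale for a given pixel; the function name and the plain-string labels are illustrative choices.

```python
DIRECTIONS = {"horizontal", "vertical", "+45", "-45"}


def combine_scales(labels):
    """Combine per-scale texture labels for one pixel.

    - "smooth" only if every scale says "smooth"
    - a direction if exactly one direction appears across scales
      ("complex" and "smooth" are compatible with any single direction)
    - "complex" otherwise
    """
    if all(label == "smooth" for label in labels):
        return "smooth"
    directions = {label for label in labels if label in DIRECTIONS}
    if len(directions) == 1:
        return directions.pop()
    return "complex"
```

For instance, `["vertical", "complex", "smooth"]` combines to "vertical", while `["vertical", "horizontal"]` combines to "complex".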

SLIDE 42

Image Segmentation

• “Smooth” regions:
  – Based on ACA
  – Merge based on color difference along border of each region pair
  – Small border regions merged with non-smooth
• “Texture” regions:
  – Initial segmentation by region growing
  – Iterative border refinement

[Figure panels: After Merge; Before Merge; Crude segmentation; Final segmentation]

SLIDE 43

Initial Segmentation by Region Growing

• Starting from any pixel in the textured regions, grow by adding nearby pixels with similar color features (in the OCCD sense).
• Use higher threshold if pixels belong to same texture class; lower threshold if pixels belong to different texture classes
• Hierarchical grid approach
• Paint the resulting segment with average color of that region.

[Figure panels: ACA image; Texture classes; Crude segmentation]

SLIDE 44

Hierarchical Grid Approach

[Figure legend: black = non-texture region; white = textured region]

• Do initial region growing on coarse grid using OCCD
• Reduce grid spacing (by half)
• Find OCCD to the classified neighbors. If close to none, create new texture class.
• Add simple spatial constraints (MRF-type) to OCCD distance
• Repeat until all pixels are classified.
• Faster without loss of accuracy

SLIDE 45

Why MRF Constraints Are Necessary

[Figure: crude and final segmentations for β = 0, β = 0.5, and β = 1.0]

SLIDE 46

Iterative Border Refinement

[Figure: real boundary between Region 1 and Region 2, with a misclassified area]

• Color features in inner window represent local features
• Color features in outer window represent region-wide characteristics
• Window pairs used: {35/11, 21/9, 11/5, 11/3}

SLIDE 47

Results with steerable filters, without Perceptual Tuning

[Figure panels: ACA; Segmentation; Original; Texture Classes]

SLIDE 48

Results with steerable filters, with Perceptual Tuning

[Figure panels: ACA; Segmentation; Original; Texture Classes]

SLIDE 49

Perceptual Tuning

• Smooth vs. non-smooth classification
• Thresholds for dominant orientation
  – Horizontal, vertical, +45, -45, complex classification
• Threshold for color feature similarity
• Texture window size
  – Varies with scale

SLIDE 50

Texture Discrimination Test*

• Setup:
  – Viewing distance: about 2 feet
  – Subjects with normal (corrected) vision and normal color vision
  – 37 texture images from photo CD at 4-5 scales

* http://www.ece.northwestern.edu/~pappas/research/texture_perception_test/

SLIDE 51

Test I: Texture Classification

• Classify image into:
  – SMOOTH: Uniform or slowly varying image intensity; no objects or sharp boundaries present.
  – TEXTURE: Approximately uniform texture patterns; may be slowly varying (further classification into horizontal, vertical, +45, -45, complex categories)
  – OTHER: None of the above, e.g., non-uniform texture, multiple regions, multiple objects

SLIDE 52

Test II: Texture Similarity

• Similarity scores:
  – 0: dissimilar
  – 1: somewhat similar
  – 2: similar
  – 3: same texture

SLIDE 53

Segmentation Results

SLIDE 54

Segmentation Evaluation Metric

SLIDE 55

Human Segmentation Examples

• No “ground truth” for natural image segmentation
• The segmentations of different people are consistent.

SLIDE 56

Segmentation Evaluation Metric

[Martin’01]

• Quantify the consistency between segmentations of different granularities; allow mutual refinements
• Local error measure (asymmetric):

$E(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|}$

• Local Consistency Error (LCE):

$LCE(S_1, S_2) = \frac{1}{n} \sum_i \min\{ E(S_1, S_2, p_i),\ E(S_2, S_1, p_i) \}$

• Global Consistency Error (GCE):

$GCE(S_1, S_2) = \frac{1}{n} \min\left\{ \sum_i E(S_1, S_2, p_i),\ \sum_i E(S_2, S_1, p_i) \right\}$

• GCE ≥ LCE
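Both measures can be computed from two label maps flattened to one label per pixel. The local error at a pixel is the fraction of its region in the first segmentation that falls outside its region in the second; LCE takes the smaller of the two directions per pixel, while GCE picks one direction for the whole image. The function name and the flat-list input format below are illustrative choices.

```python
from collections import Counter


def consistency_errors(s1, s2):
    """s1, s2: equal-length sequences of region labels, one per pixel.
    Returns (gce, lce)."""
    n = len(s1)
    size1 = Counter(s1)                 # |R(S1, p)| per region label
    size2 = Counter(s2)                 # |R(S2, p)| per region label
    overlap = Counter(zip(s1, s2))      # size of each region intersection
    # E(S1, S2, p) = |R(S1, p) \ R(S2, p)| / |R(S1, p)|, and the reverse.
    e12 = [(size1[a] - overlap[(a, b)]) / size1[a] for a, b in zip(s1, s2)]
    e21 = [(size2[b] - overlap[(a, b)]) / size2[b] for a, b in zip(s1, s2)]
    gce = min(sum(e12), sum(e21)) / n
    lce = sum(min(x, y) for x, y in zip(e12, e21)) / n
    return gce, lce
```

When one segmentation is a pure refinement of the other (for example `[0, 0, 0, 0]` versus `[0, 0, 1, 1]`), both errors are zero, which is the tolerance for differing granularity that the first bullet describes.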

SLIDE 57

Comparison with JSEG Segmentation

[Figure columns: Human Segmentation; Proposed Approach; JSEG (merge=0.4). Scores shown, in extraction order: GCE=0.33, LCE=0.28; GCE=0.04, LCE=0.02; GCE=0.08, LCE=0.07; GCE=0.04, LCE=0.04]

SLIDE 58

Comparison with JSEG Segmentation

[Figure columns: Human Segmentation; Proposed Approach; JSEG (merge=0.4). Scores shown, in extraction order: GCE=0.26, LCE=0.17; GCE=0.1, LCE=0.07; GCE=0.11, LCE=0.08; GCE=0.09, LCE=0.04]

SLIDE 59

Segment Classification

SLIDE 60

Semantic Information Extraction at Segment Level

[Diagram: Segments as Medium Level Descriptors. The original image is divided into segments (segment 1, segment 2, segment 3), each described by:]

• Dominant colors (ACA): dominant colors and percentages, quantized
• Spatial texture: smooth, horizontal, vertical, +45, -45, complex
• Plus: location, shape, size

SLIDE 61

Color Naming Syntax

• Achromatic: black, gray, white
• Lightness: blackish, very-dark, dark, medium, light, very-light, whitish
• Saturation: grayish, moderate, medium, strong, vivid
• Hue (secondary): reddish, brownish, yellowish, greenish, bluish, purplish, pinkish
• Hue (primary): red, orange, brown, yellow, green, blue, purple, pink, beige, magenta, olive

267 quantization points (NBS, Mojsilovic’02)
Eleven Colors That Are Almost Never Confused (Boynton’89)

SLIDE 62

Labels

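[Label hierarchy diagram; the parent-child links did not survive extraction.]

• Segment labels
  – Top-level categories: Man Made, Natural, Animal, People
  – Intermediate categories: Vegetation, Sky, Landform
  – Labels: Mountain, Woods/Bushes, Grass, Night-sky, Day-sky, Flower, Ground, Snow, Sun, Cityscape, Building, Face, Bridge, Person, Water, Car, Crowd, Boat, Airplane, Forest, Clouds, Pavement, Sunrise/Sunset, Other Man Made
• Scene labels
  – Indoor
  – Outdoor: street, skyline, beach, garden, night scene, day scene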

SLIDE 63

Database

• Training
• Testing
• Corel: 12,000
• Key Photos: 2,000
• Other: 600
• Corbis

SLIDE 64

Annotation Aide

• XML output

SLIDE 65

Results

• 1600 photos
• No humans or animals
• 4000 manually labeled segments
• 80% training, 20% testing
• Fisher Linear Discriminant method
• 14 colors, 6 textures

SLIDE 66

Results

• Recall: correctly labeled / total relevant segments
• Precision: correctly labeled / total assigned to label by algorithm
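As a small worked illustration of these two ratios, the sketch below computes precision and recall for one label from parallel lists of true and predicted segment labels. The function name, the example labels, and the convention of returning 0.0 when a denominator is zero are assumptions of this sketch.

```python
def precision_recall(true_labels, predicted_labels, label):
    """Precision and recall of `label` over parallel label sequences."""
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == label and p == label)
    relevant = sum(1 for t, _ in pairs if t == label)   # truly `label`
    assigned = sum(1 for _, p in pairs if p == label)   # predicted `label`
    precision = correct / assigned if assigned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall
```

With true labels `["sky", "sky", "water", "grass"]` and predictions `["sky", "water", "water", "sky"]`, the label "water" has recall 1.0 (the one water segment is found) but precision 0.5 (only one of the two segments labeled water is correct).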