Hypercolumns for Object Segmentation and Fine-grained Localization (PowerPoint presentation)



SLIDE 1

Hypercolumns for Object Segmentation and Fine-grained Localization

Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik

Presented by Göksu Erdoğan

SLIDE 2

Image Classification

horse, person, building

Slide credit: Bharath Hariharan

SLIDE 3

Object Detection

Slide credit: Bharath Hariharan

SLIDE 4

Simultaneous Detection and Segmentation

Detect and segment every instance of the category in the image

  • B. Hariharan, P. Arbelaez, R. Girshick, and J. Malik. Simultaneous detection and segmentation. In ECCV, 2014.

Slide credit: Bharath Hariharan

SLIDE 5

SDS vs. Semantic Segmentation

Slide credit: Bharath Hariharan

SLIDE 6

Simultaneous Detection and Part Labeling

Detect and segment every instance of the category in the image and label its parts

Slide credit: Bharath Hariharan

SLIDE 7

Simultaneous Detection and Keypoint Prediction

Detect every instance of the category in the image and mark its keypoints

Slide credit: Bharath Hariharan

SLIDE 8

Motivation

§ Task: Assign category labels to images or bounding boxes
§ General approach: use the output of the last layer of a CNN
§ This is the most sensitive to category-level semantic information
§ The information is generalized over in the top layer

§ Is the output of the last layer of a CNN appropriate for finer-grained problems?

SLIDE 9

Motivation

§ Not the optimal representation!
§ The last layer of a CNN is mostly invariant to 'nuisance' variables such as pose, illumination, articulation, precise location…
§ Pose and nuisance variables are precisely what we are interested in.

§ How can we get such information?

SLIDE 10

Motivation

§ It is present in intermediate layers

§ Less sensitive to semantics

SLIDE 11

Motivation

§ Top layers lose localization information
§ Bottom layers are not semantic enough

§ Combine both

SLIDE 12

Detection and Segmentation

Simultaneous detection and segmentation

  • B. Hariharan, P. Arbelaez, R. Girshick, and J. Malik. Simultaneous detection and segmentation. In ECCV, 2014.

SLIDE 13

Combining features across multiple levels:

Combine subsampled intermediate layers with the top layer (difference: upsampling)

Pedestrian Detection with Unsupervised Multi-Stage Feature Learning, Sermanet et al.

Pedestrian Detection

SLIDE 14

Framework

§ Start from a detection (R-CNN)
§ Heatmaps
§ Use category-specific, instance-specific information to…
§ Classify each pixel in the detection window

Slide credit: Bharath Hariharan

SLIDE 15

One Framework, Many Tasks:

Task / Classification target:

§ SDS: Does the pixel belong to the object?
§ Part labeling: Which part does the pixel belong to?
§ Pose estimation: Does it lie on/near a particular keypoint?

Slide credit: Bharath Hariharan

SLIDE 16

Heatmaps for each task

§ Segmentation:
§ Probability that a particular location is inside the object

§ Part Labeling:
§ Separate heatmap for each part
§ Each heatmap is the probability that a location belongs to that part

§ Keypoint Prediction:
§ Separate heatmap for each keypoint
§ Each heatmap is the probability of the keypoint being at a particular location

SLIDE 17

Hypercolumns

Slide credit: Bharath Hariharan

SLIDE 18

Hypercolumns

§ Term derived from Hubel and Wiesel

§ Re-imagines old ideas:
§ Jets (Koenderink and van Doorn)
§ Pyramids (Burt and Adelson)
§ Filter Banks (Malik and Perona)

Slide credit: Bharath Hariharan

SLIDE 19

Computing the Hypercolumn Representation

§ Upsample each feature map F to f, the size of the box
§ f_i = Σ_k α_ik F_k : the feature vector at location i
§ α_ik : interpolation weight depending only on the positions of i and k in the box
§ Concatenate the upsampled features from every layer into one long vector per location
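The upsample-and-concatenate step can be sketched in NumPy. This is a minimal illustration, not the authors' implementation; `bilinear_upsample` and `hypercolumn` are hypothetical helper names, and the α_ik weights appear as the products of 1-D interpolation coefficients:

```python
import numpy as np

def bilinear_upsample(F, out_h, out_w):
    """Bilinearly upsample an (h, w, c) feature map to (out_h, out_w, c)."""
    h, w, c = F.shape
    ys = np.linspace(0, h - 1, out_h)   # source row coordinates
    xs = np.linspace(0, w - 1, out_w)   # source column coordinates
    out = np.empty((out_h, out_w, c))
    for i, y in enumerate(ys):
        y0 = int(y); y1 = min(y0 + 1, h - 1); wy = y - y0
        for j, x in enumerate(xs):
            x0 = int(x); x1 = min(x0 + 1, w - 1); wx = x - x0
            # the four alpha_ik weights for this output location
            top = (1 - wx) * F[y0, x0] + wx * F[y0, x1]
            bot = (1 - wx) * F[y1, x0] + wx * F[y1, x1]
            out[i, j] = (1 - wy) * top + wy * bot
    return out

def hypercolumn(feature_maps, out_h, out_w):
    """Upsample each layer's map to the box size and concatenate per pixel."""
    ups = [bilinear_upsample(F, out_h, out_w) for F in feature_maps]
    return np.concatenate(ups, axis=-1)

# toy maps at two resolutions, standing in for two CNN layers
maps = [np.random.rand(4, 4, 8), np.random.rand(2, 2, 16)]
hc = hypercolumn(maps, 8, 8)
print(hc.shape)  # (8, 8, 24): one 24-dim hypercolumn per pixel
```

Each output pixel thus carries features from every chosen layer, coarse maps being interpolated the most.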

SLIDE 20

Interpolating into grid of classifiers

§ Fully connected layers contribute a global instance-specific bias
§ A different classifier for each location contributes a separate instance-specific bias

§ Simplest way to get location-specific classifiers:
§ train a separate classifier at each of the 50x50 locations

§ What would be the problems of this approach?

SLIDE 21

Interpolating into grid of classifiers

1. Reduces the amount of data for each classifier during training
2. Computationally expensive
3. Classifiers vary with location
4. Risk of overfitting

How can we escape from these problems?

SLIDE 22

Interpolate into coarse grid of classifiers

§ Train a coarse KxK grid of classifiers and interpolate between them
§ Interpolate a grid of functions instead of values
§ Each classifier in the grid is a function g_k(·)
§ g_k(feature vector) = probability
§ Score of the i'th pixel: p_i = Σ_k α_ik g_k(f_i)
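The interpolated scoring rule p_i = Σ_k α_ik g_k(f_i) can be sketched as follows. This is a toy NumPy version under assumed shapes (`interpolated_score` is a hypothetical name; the classifiers are linear-plus-sigmoid, matching logistic regression):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def interpolated_score(f, y, x, H, W, Wk, bk):
    """Score feature vector f at pixel (y, x) of an H x W window by
    bilinearly interpolating the outputs of a K x K grid of classifiers.
    Wk: (K, K, C) classifier weights, bk: (K, K) biases."""
    K = Wk.shape[0]
    # map the pixel to continuous classifier-grid coordinates
    gy = min(max((y + 0.5) / H * K - 0.5, 0.0), K - 1.0)
    gx = min(max((x + 0.5) / W * K - 0.5, 0.0), K - 1.0)
    y0, x0 = int(gy), int(gx)
    y1, x1 = min(y0 + 1, K - 1), min(x0 + 1, K - 1)
    wy, wx = gy - y0, gx - x0
    score = 0.0
    # alpha_ik weights = products of 1-D interpolation coefficients
    for yy, ay in ((y0, 1 - wy), (y1, wy)):
        for xx, ax in ((x0, 1 - wx), (x1, wx)):
            score += ay * ax * sigmoid(Wk[yy, xx] @ f + bk[yy, xx])
    return score
```

Because the α_ik sum to one, the interpolated score stays a valid probability, and nearby pixels get smoothly varying classifiers.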

SLIDE 23

Training classifiers

§ Interpolation is not used at training time
§ Divide each box into a KxK grid
§ Training data for the k'th classifier consists only of pixels from the k'th grid cell across all training instances
§ Train with logistic regression
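A toy version of this training scheme, assuming hypercolumn features have already been extracted (`cell_index`, `train_cell_classifiers`, and the plain gradient-descent loop are illustrative choices, not the paper's solver):

```python
import numpy as np

def cell_index(y, x, H, W, K):
    """Which cell of a K x K grid over an H x W box pixel (y, x) falls in."""
    return min(y * K // H, K - 1), min(x * K // W, K - 1)

def train_cell_classifiers(X, Y, coords, H, W, K, lr=0.1, steps=200):
    """One logistic-regression classifier per grid cell, trained only on
    the pixels that fall in that cell (no interpolation at train time).
    X: (N, C) features, Y: (N,) 0/1 labels, coords: list of (y, x)."""
    C = X.shape[1]
    Wk, bk = np.zeros((K, K, C)), np.zeros((K, K))
    cells = np.array([cell_index(y, x, H, W, K) for y, x in coords])
    for ky in range(K):
        for kx in range(K):
            m = (cells[:, 0] == ky) & (cells[:, 1] == kx)
            if not m.any():
                continue  # no training pixels landed in this cell
            Xc, Yc = X[m], Y[m]
            w, b = np.zeros(C), 0.0
            for _ in range(steps):
                p = 1.0 / (1.0 + np.exp(-(Xc @ w + b)))  # predictions
                g = p - Yc                               # logistic-loss gradient
                w -= lr * Xc.T @ g / len(Yc)
                b -= lr * g.mean()
            Wk[ky, kx], bk[ky, kx] = w, b
    return Wk, bk
```

At test time the resulting (Wk, bk) grid would be interpolated as on the previous slide, rather than applied cell-by-cell.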

SLIDE 24

Hypercolumns

Slide credit: Bharath Hariharan

SLIDE 25

Efficient pixel classification

§ Upsampling large feature maps is expensive!
§ If classification and upsampling are linear:
§ classification ∘ upsampling = upsampling ∘ classification

§ Linear classification = 1x1 convolution
§ Extension: use an nxn convolution

§ Classification = convolve, upsample, sum, sigmoid
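The commutation trick can be checked numerically: because both the 1x1-convolution classifier and upsampling are linear, their order does not matter, so the expensive upsampling can be applied to a single score map instead of many feature channels. A sketch using nearest-neighbour upsampling for simplicity (`upsample_nn` is an assumed helper; the real pipeline applies the sigmoid only after summing the upsampled scores across layers):

```python
import numpy as np

def upsample_nn(F, s):
    """Nearest-neighbour upsampling by an integer factor s (a linear op)."""
    return np.repeat(np.repeat(F, s, axis=0), s, axis=1)

F = np.random.rand(3, 3, 5)   # small feature map: h x w x channels
w = np.random.rand(5)         # linear classifier = a 1x1 convolution

# order 1: upsample the 5-channel features, then classify (expensive)
a = upsample_nn(F, 4) @ w
# order 2: classify the small map, then upsample one score map (cheap)
b = upsample_nn((F @ w)[..., None], 4)[..., 0]

print(np.allclose(a, b))  # True: the two orders give identical scores
```

The saving grows with the channel count: one upsampled map instead of hundreds of upsampled feature channels.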

SLIDE 26

Efficient pixel classification

Slide credit: Bharath Hariharan

SLIDE 27

Efficient pixel classification

Slide credit: Bharath Hariharan

SLIDE 28

Efficient pixel classification

Slide credit: Bharath Hariharan

SLIDE 29

Representation as a neural network

SLIDE 30

Training classifiers

§ Take MCG candidates that overlap the ground truth by 70% or more
§ For each candidate, find the most-overlapping ground-truth instance
§ Crop the ground truth to the expanded bounding box of the candidate
§ Label locations positive or negative according to the task

SLIDE 31

Experiments

SLIDE 32

Evaluation Metric

§ Similar to the bounding box detection metric
§ box overlap = area(B ∩ B_gt) / area(B ∪ B_gt)

§ If box overlap > threshold, the detection is correct

Slide credit: Bharath Hariharan

SLIDE 33

Evaluation Metric

§ Similar to the bounding box detection metric
§ But with segments instead of bounding boxes
§ Each detection/GT comes with a segment

§ segment overlap = area(S ∩ S_gt) / area(S ∪ S_gt)

§ If segment overlap > threshold, the detection is correct
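The segment-overlap criterion is just mask IoU; a minimal sketch (`segment_overlap` and `is_correct` are assumed helper names):

```python
import numpy as np

def segment_overlap(pred, gt):
    """IoU of two binary masks: area(pred ∩ gt) / area(pred ∪ gt)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum()) / float(union)

def is_correct(pred, gt, threshold=0.5):
    """A detection counts as correct if its segment overlap exceeds the threshold."""
    return segment_overlap(pred, gt) > threshold

a = np.zeros((4, 4), dtype=int); a[:2, :] = 1  # top half of the window
b = np.zeros((4, 4), dtype=int); b[:, :2] = 1  # left half of the window
print(segment_overlap(a, b))  # 4 / 12 ≈ 0.333: they share only the 2x2 corner
```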

Slide credit: Bharath Hariharan

SLIDE 34

Task 1: SDS

§ System 1:
§ Refinement step with the hypercolumn representation
§ Features:

§ Top-level fc7 features
§ conv4 features
§ pool2 features
§ 1/0 according to whether the location was inside the original region candidate or not
§ Coarse 10x10 discretization of the original candidate into a 100-dimensional vector

§ 10x10 grid of classifiers
§ Project predictions onto superpixels and average

SLIDE 35

Task 1: SDS, System 1

SLIDE 36

Task 1: SDS

§ System 2:
§ MCG instead of selective search
§ Expand the set of boxes by adding nearby high-scoring boxes after NMS

SLIDE 37

Task 1: SDS

SLIDE 38

Hypercolumns vs Top Layer

SLIDE 39

Hypercolumns vs Top Layer

Slide credit: Bharath Hariharan

SLIDE 40

Task 2: Part Labeling

Slide credit: Bharath Hariharan

SLIDE 41

Task 2: Part Labeling

SLIDE 42

Task 2: Part Labeling

SLIDE 43

Task 3: Keypoint Prediction

SLIDE 44

Task 3: Keypoint Prediction

SLIDE 45

Task 3: Keypoint Prediction

SLIDE 46

Conclusion

§ A general framework for fine-grained localization that:

§ Leverages information from multiple CNN layers
§ Achieves state-of-the-art performance on SDS and part labeling, and accurate results on keypoint prediction

Slide credit: Bharath Hariharan

SLIDE 47

Future Work

§ Applying the hypercolumn representation to other fine-grained tasks:

§ Attribute classification
§ Action classification
§ …

SLIDE 48

Questions???

SLIDE 49

THANK YOU ☺