YIN XU: Image Segmentation & Retrieval

SLIDE 1

CS688: Large-Scale Image & Video Retrieval (Spring 2020)

YIN XU

SLIDE 2

1. Image Segmentation & Retrieval

What is image segmentation? What’s the relationship to image retrieval?

2. Current challenges & solutions:

Challenges: intra-class inconsistency & inter-class indistinction
Solutions: point-based & contour-based

3. PointRend: Image Segmentation as Rendering
4. Summary

SLIDE 3

3 5/12/2020

"Two men riding on a bike in front of a building on the road. And there is a car."

What is semantic segmentation?

Idea: recognize and understand what is in the image at the pixel level.

SLIDE 4

Why semantic segmentation?

1. Robot vision and understanding
2. Autonomous driving
3. Medical image analysis

SLIDE 5

Interesting topics of segmentation:

1. 2D images: (general) semantic segmentation, instance segmentation
2. 3D images: Point clouds
3. Video segmentation

SLIDE 6

Semantic segmentation: the process of assigning a class label to every pixel in the image
Instance segmentation: treats multiple objects of the same class as distinct individual objects (instances)
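
To make the difference concrete, here is a minimal sketch on a toy label map. It splits one semantic class into instances with 4-connected components. This is only a stand-in chosen for illustration (an assumption, not a method from the slides): a real instance segmentation model predicts instances directly and can separate touching objects, which connected components cannot. The function name and the toy map are hypothetical.

```python
from collections import deque

import numpy as np


def instances_from_semantic(semantic, target_class):
    """Split one semantic class into instances via 4-connected components.

    Toy stand-in for an instance segmentation model: touching objects of
    the same class would be merged here.
    """
    height, width = semantic.shape
    instance_ids = np.zeros((height, width), dtype=np.int64)  # 0 = not this class
    next_id = 0
    for row in range(height):
        for col in range(width):
            if semantic[row, col] != target_class or instance_ids[row, col] != 0:
                continue
            next_id += 1
            instance_ids[row, col] = next_id
            queue = deque([(row, col)])
            while queue:  # breadth-first flood fill of one object
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and semantic[nr, nc] == target_class
                            and instance_ids[nr, nc] == 0):
                        instance_ids[nr, nc] = next_id
                        queue.append((nr, nc))
    return instance_ids, next_id


# Semantic map: 0 = background, 1 = "person" (two separate people).
semantic_map = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
])
instance_map, num_instances = instances_from_semantic(semantic_map, target_class=1)
```

In the semantic map both people share label 1; in the instance map each person gets its own id.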

SLIDE 7

Segmentation-based Retrieval (mainly for object-based retrieval):

1. Avoid a large number of regions in one image
--- manageable regions / objects
2. Extract regions with simple boundaries (avoiding disturbance):
--- segmented regions can serve as the unit of retrieval
3. Build a robust dataset descriptor
--- reduces the search space
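
A rough sketch of how segmented regions can act as the unit of retrieval. Everything here is an assumption made for illustration: the descriptor is just the mean colour of each region, the match is a brute-force nearest neighbour, and the names `region_descriptors`, `retrieve`, `img_a` and `img_b` are hypothetical. A real system would use learned features and an index structure.

```python
import numpy as np


def region_descriptors(image, segments):
    """Return {segment_id: mean colour} for every segment in the label map.

    image: (H, W, 3) float array; segments: (H, W) integer label map.
    """
    return {
        int(seg_id): image[segments == seg_id].mean(axis=0)
        for seg_id in np.unique(segments)
    }


def retrieve(query_descriptor, database):
    """Return the (image_name, segment_id) whose descriptor is closest to the query."""
    best_key, best_dist = None, float("inf")
    for image_name, descriptors in database.items():
        for seg_id, descriptor in descriptors.items():
            dist = float(np.linalg.norm(descriptor - query_descriptor))
            if dist < best_dist:
                best_key, best_dist = (image_name, seg_id), dist
    return best_key, best_dist


# Toy database: img_a has a red region (0) and a blue region (1); img_b is all green.
img_a = np.zeros((2, 2, 3))
img_a[0] = [1.0, 0.0, 0.0]
img_a[1] = [0.0, 0.0, 1.0]
seg_a = np.array([[0, 0], [1, 1]])
img_b = np.zeros((2, 2, 3))
img_b[..., 1] = 1.0
seg_b = np.zeros((2, 2), dtype=int)

database = {
    "img_a": region_descriptors(img_a, seg_a),
    "img_b": region_descriptors(img_b, seg_b),
}
query = np.array([0.1, 0.0, 0.9])  # a bluish query region
match, distance = retrieve(query, database)
```

Because each image contributes only a handful of region descriptors, the search runs over a few objects per image rather than over every pixel or patch.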

SLIDE 8

Challenges:

Intra-class inconsistency: the same semantic label but different appearances
Inter-class indistinction: different semantic labels but similar appearances

SLIDE 9

Deep Snake for Real-Time Instance Segmentation

SLIDE 10

Deep Snake for Real-Time Instance Segmentation, CVPR 2020

SLIDE 11

Efficient Segmentation: Learning Downsampling Near Semantic Boundaries, ICCV 2019

Steps:

1) Compute the boundary map from the given semantic labels.
2) For each pixel, find the closest pixel on the boundary.
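
The two steps can be sketched on a toy label map as follows. This is a brute-force illustration under my own assumptions, not the implementation from the ICCV 2019 paper: a pixel counts as boundary if any 4-neighbour has a different label, and the nearest boundary pixel is found by comparing against all boundary pixels. The function names are hypothetical.

```python
import numpy as np


def boundary_map(labels):
    """Mark pixels that have a 4-neighbour with a different label."""
    boundary = np.zeros(labels.shape, dtype=bool)
    diff_h = labels[:, :-1] != labels[:, 1:]   # label change between left/right neighbours
    diff_v = labels[:-1, :] != labels[1:, :]   # label change between top/bottom neighbours
    boundary[:, :-1] |= diff_h
    boundary[:, 1:] |= diff_h
    boundary[:-1, :] |= diff_v
    boundary[1:, :] |= diff_v
    return boundary


def nearest_boundary_pixel(boundary):
    """For each pixel, return the (row, col) of the closest boundary pixel.

    Brute force, O(pixels * boundary pixels); assumes at least one boundary pixel.
    """
    height, width = boundary.shape
    boundary_coords = np.argwhere(boundary)                          # (B, 2)
    rows, cols = np.mgrid[0:height, 0:width]
    pixel_coords = np.stack([rows.ravel(), cols.ravel()], axis=1)    # (H*W, 2)
    diff = pixel_coords[:, None, :] - boundary_coords[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)                                # (H*W, B)
    nearest = boundary_coords[sq_dist.argmin(axis=1)]
    return nearest.reshape(height, width, 2)


# Toy labels: left three columns are class 0, right three columns are class 1.
labels = np.zeros((4, 6), dtype=int)
labels[:, 3:] = 1
boundary = boundary_map(labels)
nearest = nearest_boundary_pixel(boundary)
```

For larger images a distance transform would replace the brute-force search.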

SLIDE 12

Upsampling + correction

SLIDE 13

[Figure labels: Coarse features; FG predictions; Coarse prediction; N*C*7*7 and N*C*7*7 combined by cat into N*2*C*7*7; iteratively "rendering" up to the target size]

From 7*7 to 224*224:

log2(224 / 7) = log2(32) = 5 iterations (each iteration doubles the resolution)
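
The count follows from doubling the resolution at every step: 7, 14, 28, 56, 112, 224. A small sketch to check the arithmetic (the helper name is hypothetical):

```python
def num_upsampling_steps(start_size, target_size, factor=2):
    """Count how many x`factor` upsampling steps take start_size to target_size."""
    steps = 0
    size = start_size
    while size < target_size:
        size *= factor
        steps += 1
    if size != target_size:
        raise ValueError("target_size is not start_size * factor**k")
    return steps


steps_to_224 = num_upsampling_steps(7, 224)
sizes = [7 * 2 ** i for i in range(steps_to_224 + 1)]
```

The same helper gives 4 steps for 7*7 to 112*112.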

SLIDE 14

Correction: 3-layer MLP

Notes:

Last step of segmentation:

-- map every pixel's feature vector to a K-d space (with a 1*1 conv), where K is the number of classes
-- apply argmax() over the K scores (pixel classification)
--- use the resulting index as the pixel's class
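
A minimal NumPy sketch of this last step. A 1*1 convolution is the same linear map applied independently at every pixel, so it is written here as an `einsum`. The weights are random placeholders, and the shapes (N=1, C=4, K=3) are assumptions chosen for the example.

```python
import numpy as np


def classify_pixels(features, weight, bias):
    """Project C-dim pixel features to K class scores, then take the argmax.

    features: (N, C, H, W); weight: (K, C); bias: (K,).
    """
    scores = np.einsum("kc,nchw->nkhw", weight, features)
    scores = scores + bias[None, :, None, None]
    return scores.argmax(axis=1)  # (N, H, W); each entry is a class index


rng = np.random.default_rng(0)
features = rng.normal(size=(1, 4, 3, 3))   # N=1, C=4, H=W=3
weight = rng.normal(size=(3, 4))           # K=3 classes, placeholder weights
bias = np.zeros(3)
label_map = classify_pixels(features, weight, bias)
```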

Steps:

1) Upsample (bilinear interpolation)
2) Uncertainty calculation:
-- the difference between the highest and second-highest confidence (a small difference means high uncertainty)
-- set a threshold of 0.5
3) Generate k*N points from a uniform distribution, then select the top β*N most uncertain ones.
4) Feed the selected pixels into the 3-layer MLP
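
Steps 2 and 3 can be sketched as below. Assumptions to note: candidates are drawn on the integer pixel grid (so duplicates are possible), the default values `k=3` and `beta=0.75` are illustrative and not taken from the slides, and the function names are hypothetical. For a binary mask with foreground probability p, the top-two difference equals 2*|p - 0.5|, which is one way to read the 0.5 threshold.

```python
import numpy as np


def top2_margin(probs):
    """probs: (K, H, W) class probabilities. Returns the (H, W) margin top1 - top2.

    A small margin means the two best classes are nearly tied, i.e. uncertain.
    """
    sorted_probs = np.sort(probs, axis=0)  # ascending along the class axis
    return sorted_probs[-1] - sorted_probs[-2]


def select_uncertain_points(probs, num_points, k=3, beta=0.75, rng=None):
    """Draw k*num_points uniform candidates, keep the beta*num_points most uncertain."""
    rng = np.random.default_rng() if rng is None else rng
    _, height, width = probs.shape
    num_candidates = k * num_points
    rows = rng.integers(0, height, size=num_candidates)
    cols = rng.integers(0, width, size=num_candidates)
    margins = top2_margin(probs)[rows, cols]
    num_keep = int(beta * num_points)
    keep = np.argsort(margins)[:num_keep]  # smallest margin = most uncertain
    return rows[keep], cols[keep]


# Toy binary prediction: confident everywhere except near the edge at columns 3 and 4.
foreground = np.full((8, 8), 0.05)
foreground[:, 4:] = 0.95
foreground[:, 3] = 0.45
foreground[:, 4] = 0.55
probs = np.stack([1.0 - foreground, foreground])  # (K=2, H=8, W=8)

margin = top2_margin(probs)
point_rows, point_cols = select_uncertain_points(
    probs, num_points=8, rng=np.random.default_rng(0)
)
```

The selected points are the ones that would be passed on to the 3-layer MLP in step 4.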

SLIDE 15

Correction: 3-layer MLP

[Figure labels: upsample (N,K,W,H to N,K,2*W,2*H); uncertainty; selection; sampling (N,K,2*W,2*H)]
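
One upsample + correct step, sketched end to end for the shapes in the figure. This is a simplified illustration and not the PointRend implementation: the MLP uses random placeholder weights, it sees only the K upsampled scores at each point (no additional feature maps), and the most uncertain pixels are picked deterministically over the whole map instead of from sampled candidates. All function names are hypothetical.

```python
import numpy as np


def bilinear_upsample_2x(x):
    """x: (N, K, H, W) -> (N, K, 2H, 2W), using half-pixel-centre alignment."""
    def axis_weights(size):
        # Map output pixel centres back to input coordinates, clamped at the border.
        src = np.clip((np.arange(2 * size) + 0.5) / 2 - 0.5, 0, size - 1)
        lo = np.floor(src).astype(int)
        hi = np.minimum(lo + 1, size - 1)
        return lo, hi, src - lo

    r_lo, r_hi, r_frac = axis_weights(x.shape[2])
    c_lo, c_hi, c_frac = axis_weights(x.shape[3])
    r_frac = r_frac[None, None, :, None]
    rows = x[:, :, r_lo, :] * (1 - r_frac) + x[:, :, r_hi, :] * r_frac
    return rows[:, :, :, c_lo] * (1 - c_frac) + rows[:, :, :, c_hi] * c_frac


def mlp3(point_inputs, weights):
    """3-layer MLP with ReLU between layers. point_inputs: (P, D) -> (P, K)."""
    hidden = point_inputs
    for i, (w, b) in enumerate(weights):
        hidden = hidden @ w + b
        if i < len(weights) - 1:
            hidden = np.maximum(hidden, 0.0)
    return hidden


def refine_step(coarse_scores, weights, num_points):
    """One upsample + correct step on a single image. coarse_scores: (K, H, W)."""
    scores = bilinear_upsample_2x(coarse_scores[None])[0]      # (K, 2H, 2W)
    top2 = np.sort(scores, axis=0)[-2:]
    margin = top2[1] - top2[0]                                 # small = uncertain
    flat_idx = np.argsort(margin.ravel())[:num_points]
    rows, cols = np.unravel_index(flat_idx, margin.shape)
    point_inputs = scores[:, rows, cols].T                     # (P, K)
    scores[:, rows, cols] = mlp3(point_inputs, weights).T      # overwrite only these pixels
    return scores, (rows, cols)


rng = np.random.default_rng(0)
coarse = rng.normal(size=(3, 7, 7))                            # K=3 classes on a 7*7 grid
weights = [
    (rng.normal(size=(3, 16)), np.zeros(16)),
    (rng.normal(size=(16, 16)), np.zeros(16)),
    (rng.normal(size=(16, 3)), np.zeros(3)),
]
refined, (rows, cols) = refine_step(coarse, weights, num_points=20)
```

Only the 20 selected pixels are recomputed; every other pixel keeps its bilinearly interpolated score, which is where the savings over dense prediction come from.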

SLIDE 16

Sampling Steps: from 7*7 to 112*112

When N = 28 ∗ 28

SLIDE 17

[Figure labels: Key-point sampling; segmentation]

SLIDE 18

Instance Segmentation

[Figure labels: PointRend (segmentation); PointRend: instance]

SLIDE 19

[Figure labels: PointRend (segmentation); PointRend: instance]

SLIDE 20

SLIDE 21

SLIDE 22

Summary:

Problem: inconsistent segmentation around edge regions
Method: key-point detection + pixel-wise correction
Components:
1) Sampling method: coarse prediction + uncertainty
2) Pixel correction: 3-layer MLP
3) Process: iteratively apply upsampling + correction

Personal thoughts:

Advantages: 1) fine-grained segmentation 2) edge preservation
Disadvantages: may not be that useful for general semantics.

SLIDE 23

CS688: Large-Scale Image & Video Retrieval (Spring 2020)

YIN XU

Q & A