Efficient Edge Estimation Instructor - Simon Lucey 16-623 - - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Efficient Edge Estimation

Instructor - Simon Lucey

16-623 - Designing Computer Vision Apps

slide-2
SLIDE 2

Today

  • Motivation.
  • What is an Edge?
  • Oriented Filters.
  • Learning Efficient Edges.
slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6

Persistent versus Occluding Edges

  • Texture and geometric edges are “persistent” across viewpoints.
  • Occluding edges are “unique” to viewpoint, but potentially provide rich information about a 3D object.

Taken from Ham, Singh and Lucey “Occlusions are Fleeting - Texture is Forever: Moving Past Brightness Constancy”

slide-7
SLIDE 7

Persistent versus Occluding Edges

(a) Rendered Glass (b) Non-persistent Edges (c) Persistent + 3D Poly Edges

Taken from Ham, Singh and Lucey “Occlusions are Fleeting - Texture is Forever: Moving Past Brightness Constancy”

slide-8
SLIDE 8

Taken from Ham, Singh and Lucey “Occlusions are Fleeting - Texture is Forever: Moving Past Brightness Constancy”

slide-9
SLIDE 9

Today

  • Motivation.
  • What is an Edge?
  • Oriented Filters.
  • Learning Efficient Edges.
slide-10
SLIDE 10

Figure panels: original image; 1968; Canny edge detector; human annotator.

Taken from Isola et al. “Crisp Boundary Detection Using Pointwise Mutual Information”

slide-11
SLIDE 11

Edges are Semantic

  • No mathematical definition of an edge, contour or boundary.
  • Classic problem in computer vision.
  • By definition they are semantic.
  • Early work focussed on oriented filters.
  • e.g. Canny edge detectors.
  • Developed by John F. Canny in 1986.

“John Canny”

slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14

D.H. Hubel & T.N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106, 1962.

slide-15
SLIDE 15

D.H. Hubel & T.N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106, 1962.

slide-16
SLIDE 16

D.H. Hubel & T.N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106, 1962.

slide-17
SLIDE 17

D.H. Hubel & T.N. Wiesel. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. The Journal of Physiology, 160(1):106, 1962.

slide-18
SLIDE 18

Olshausen & Field 1996

slide-19
SLIDE 19
slide-20
SLIDE 20

$$X = [\, x_1, x_2, \ldots, x_{N-1}, x_N \,] \in \mathbb{R}^{M \times N}$$

slide-21
SLIDE 21

Figure: two histograms of p(x) against x (x from −6 to 6), with entropies of 3.19 bits and 1.41 bits.

Olshausen & Field 1996

$$H(x) = -\sum_{n=1}^{N} p(x_n) \cdot \log[p(x_n)]$$
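The entropy formula on this slide can be sketched in a few lines. This is a minimal illustration, assuming a base-2 logarithm (because the slide reports values in bits); the two histograms below are made-up examples, not the distributions plotted on the slide.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum p * log2(p); zero-probability bins contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical histograms: a flat one and a sharply peaked (sparse) one.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]

h_uniform = entropy_bits(uniform)
h_peaked = entropy_bits(peaked)
```

The peaked histogram has the lower entropy, which matches the slide's point that a more sharply peaked response distribution needs fewer bits.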

slide-22
SLIDE 22

Figure: the factorization X = D × Z (with an additive term), with entries labelled “Not Always Zero” and “Always Zero”.

H. Lee, A. Ng, et al. 2007

Olshausen & Field 1996
slide-23
SLIDE 23

Visualizing CNNs

slide-24
SLIDE 24

Today

  • Motivation.
  • What is an Edge?
  • Oriented Filters.
  • Learning Efficient Edges.
slide-25
SLIDE 25

1D Filter

slide-26
SLIDE 26

1D Filter

slide-27
SLIDE 27

1D Filter

slide-28
SLIDE 28

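2D Filter

$$\begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} \quad \text{(Prewitt)}$$

$$\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} \quad \text{(Sobel)}$$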
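As a rough sketch of how the Prewitt and Sobel kernels on this slide behave, the snippet below builds both, checks that each is the outer product of a smoothing column and a difference row, and applies the Sobel kernel to a toy step edge. The helper `correlate2d_valid` and the toy image are illustrative assumptions, not part of the lecture.

```python
import numpy as np

# Horizontal-derivative kernels as written on the slide.
prewitt_x = np.array([[1, 0, -1],
                      [1, 0, -1],
                      [1, 0, -1]], dtype=float)
sobel_x = np.array([[1, 0, -1],
                    [2, 0, -2],
                    [1, 0, -1]], dtype=float)

# Both are separable: a vertical smoothing column times a horizontal difference row.
diff = np.array([1, 0, -1], dtype=float)
is_separable = (np.array_equal(prewitt_x, np.outer([1, 1, 1], diff))
                and np.array_equal(sobel_x, np.outer([1, 2, 1], diff)))

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no flipping); keep fully covered positions only."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    out = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: bright left half, dark right half (a vertical step edge).
image = np.zeros((5, 6))
image[:, :3] = 1.0
response = correlate2d_valid(image, sobel_x)
```

The response is non-zero only in the columns whose 3×3 window straddles the step, which is the behaviour one would expect from a horizontal-derivative filter.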

slide-29
SLIDE 29

2D Filter


slide-30
SLIDE 30

2D Filter

slide-31
SLIDE 31

99.6% sparse per patch

slide-32
SLIDE 32

Figure: entries labelled “Not Always Zero” and “Always Zero”.

$$x = d_1 * z_1 + \ldots + d_K * z_K$$

M. Zeiler, et al. 2010
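The Zeiler et al. slide depicts a signal x as a sum of filters d_1 … d_K, each convolved with a mostly-zero map z_1 … z_K. A minimal 1D sketch of that synthesis step is given below; the filters, the spike positions and the function name `reconstruct` are hypothetical, and nothing here attempts the harder problem of inferring the z maps.

```python
import numpy as np

def reconstruct(filters, codes):
    """Return sum_k (d_k * z_k), using 'full' 1D convolution for each term."""
    return sum(np.convolve(z, d, mode="full") for d, z in zip(filters, codes))

# Two made-up filters.
d1 = np.array([1.0, 2.0, 1.0])
d2 = np.array([1.0, -1.0, 0.0])

# Sparse codes: a single non-zero entry in each map.
z1 = np.zeros(8)
z1[2] = 1.0
z2 = np.zeros(8)
z2[5] = 3.0

x = reconstruct([d1, d2], [z1, z2])
```

Each spike in a code map simply stamps a scaled copy of its filter into the output at that position, so very few non-zero coefficients are needed to describe the signal.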
slide-33
SLIDE 33

Edges

Adapted from: Elder “Are Edges Incomplete?” IJCV 1999.

slide-34
SLIDE 34

Naive Approach?

How do we recover edges?

slide-35
SLIDE 35

Canny Edge Detector

Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince

slide-36
SLIDE 36

Canny Edge Detector

Compute horizontal and vertical gradient images h and v

Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
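A minimal sketch of the gradient step follows. It assumes plain central differences and skips the Gaussian smoothing that normally precedes this step; the names `h` and `v` follow the slide, everything else (function name, toy ramp image) is illustrative.

```python
import numpy as np

def gradient_images(image):
    """Central-difference gradients; border pixels are left at zero."""
    img = np.asarray(image, dtype=float)
    h = np.zeros_like(img)
    v = np.zeros_like(img)
    h[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0   # change along x (columns)
    v[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0   # change along y (rows)
    return h, v

# Toy image: intensity increases by 1 per column, constant down each column.
image = np.tile(np.arange(6, dtype=float), (5, 1))
h, v = gradient_images(image)

magnitude = np.hypot(h, v)        # gradient strength per pixel
orientation = np.arctan2(v, h)    # gradient direction per pixel, in radians
```

The magnitude and orientation images computed at the end are what the later Canny steps (direction quantization, non-maximal suppression, thresholding) operate on.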

slide-37
SLIDE 37

Oriented Filters

  • It seems inefficient to have to search over all possible orientations.
  • Instead, one can express all orientations as a linear combination of x- and y-gradient filters.
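The linear-combination idea can be checked numerically. The sketch below assumes Sobel kernels as the x- and y-gradient filters and an arbitrary angle of 30°; because filtering is linear, filtering with the combined kernel gives the same result as combining the two basis responses, so only two filtering passes are ever needed. (For Sobel kernels the combined kernel is only an approximation to a truly rotated derivative; the identity between the two computations below is exact.)

```python
import numpy as np

gx = np.array([[1, 0, -1],
               [2, 0, -2],
               [1, 0, -1]], dtype=float)   # x-gradient filter (Sobel, assumed)
gy = gx.T                                   # y-gradient filter

def filter_valid(image, kernel):
    """Correlate image with kernel, keeping fully covered positions only."""
    kh, kw = kernel.shape
    rows = image.shape[0] - kh + 1
    cols = image.shape[1] - kw + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8))
theta = np.deg2rad(30.0)

# Option 1: build an oriented kernel and filter with it.
direct = filter_valid(image, np.cos(theta) * gx + np.sin(theta) * gy)

# Option 2: filter once with each basis kernel, then combine the responses.
rx, ry = filter_valid(image, gx), filter_valid(image, gy)
combined = np.cos(theta) * rx + np.sin(theta) * ry
```

Option 2 is the efficient one: `rx` and `ry` are computed once and can then be recombined for any number of angles at the cost of two multiplies and an add per pixel.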

slide-38
SLIDE 38

Canny Edge Detector

Quantize to 4 directions

Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
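One way to quantize gradient orientation to four directions is sketched below. The bin layout (0°, 45°, 90°, 135°, with opposite directions treated as the same) is an assumption about the convention; the function name is hypothetical.

```python
import numpy as np

def quantize_orientation(h, v):
    """Map each gradient (h, v) to a bin 0..3 meaning 0, 45, 90 or 135 degrees."""
    angle = np.degrees(np.arctan2(v, h)) % 180.0   # fold opposite directions together
    return np.round(angle / 45.0).astype(int) % 4  # nearest multiple of 45 degrees

# Five example gradients: right, down-right, down, down-left, left.
h = np.array([1.0, 1.0, 0.0, -1.0, -1.0])
v = np.array([0.0, 1.0, 1.0, 1.0, 0.0])
bins = quantize_orientation(h, v)
```

Folding by 180° reflects that, for suppression purposes, a gradient pointing left and one pointing right select the same pair of neighbours.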

slide-39
SLIDE 39

Canny Edge Detector

Non-maximal suppression

Adapted from: Computer vision: models, learning and inference. Simon J.D. Prince
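A simple sketch of non-maximal suppression is given below, assuming the gradient direction has already been quantized to four bins (0 = 0°, 1 = 45°, 2 = 90°, 3 = 135°) and that rows increase downwards. The offset table and function name are illustrative; border pixels are simply dropped.

```python
import numpy as np

# Neighbour offsets (row, col) along the gradient for bins 0, 45, 90, 135 degrees.
OFFSETS = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}

def non_max_suppression(magnitude, direction):
    """Keep a pixel only if it is no smaller than both neighbours along its gradient."""
    rows, cols = magnitude.shape
    out = np.zeros_like(magnitude)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dr, dc = OFFSETS[int(direction[r, c])]
            m = magnitude[r, c]
            if m >= magnitude[r + dr, c + dc] and m >= magnitude[r - dr, c - dc]:
                out[r, c] = m
    return out

# Toy input: a blurred vertical ridge, gradient pointing horizontally everywhere.
magnitude = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
direction = np.zeros((5, 5), dtype=int)
thinned = non_max_suppression(magnitude, direction)
```

The three-pixel-wide ridge is thinned to a single column, which is the purpose of this step: one response per edge crossing.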

slide-40
SLIDE 40

Non-Max Suppression

  • An interesting view of non-max suppression is in terms of sparse coding.

$$X = D \times Z$$

  • Non-max suppression attempts to enforce that Z is sparse.
slide-41
SLIDE 41

Canny Edge Detector

Hysteresis Thresholding
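Hysteresis thresholding can be sketched as a flood fill: pixels above a high threshold seed the edge map, and pixels above a low threshold are kept only if they connect to a seed. The 8-connected neighbourhood, the threshold values and the function name below are assumptions made for illustration.

```python
from collections import deque

import numpy as np

def hysteresis_threshold(magnitude, low, high):
    """Boolean edge map: strong pixels plus weak pixels 8-connected to a strong one."""
    strong = magnitude >= high
    candidate = magnitude >= low
    edges = strong.copy()
    rows, cols = magnitude.shape
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and candidate[rr, cc] and not edges[rr, cc]:
                    edges[rr, cc] = True
                    queue.append((rr, cc))
    return edges

# One strong pixel, two weak pixels attached to it, and one isolated weak pixel.
magnitude = np.array([[0.9, 0.5, 0.5, 0.0, 0.5]])
edges = hysteresis_threshold(magnitude, low=0.4, high=0.8)
```

The isolated weak pixel at the end of the row is rejected even though it has the same strength as the two that are kept, which is what distinguishes hysteresis from a single global threshold.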

slide-42
SLIDE 42

Classic Edge Detector - Problems

Figure panels: Simple Step Edge and its Filtering Response; Texture Edge and its Filtering Response.

(a) Original Color Image (b) Appearance Edges Found by Linear Filtering

Taken from: “Occlusion Boundaries: Low-Level Detection to High-Level Reasoning” - A. Stein (Ph.D. Thesis)

slide-43
SLIDE 43

Today

  • Motivation.
  • What is an Edge?
  • Oriented Filters.
  • Learning Efficient Edges.
slide-44
SLIDE 44

What defines an edge?

  • Color.
  • Brightness.
  • Texture.
  • Continuity.
  • Symmetry.
slide-45
SLIDE 45

Edge Estimation as a Learning Problem

Taken from Martin et al. “Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues”

slide-46
SLIDE 46

Current State of the Art

Figure panels: Sobel & Feldman 1968; Arbeláez et al. 2011 (gPb); Dollár & Zitnick 2013 (SE); Our method; Human labelers.

Taken from Isola et al. “Crisp Boundary Detection Using Pointwise Mutual Information”

Isola et al. 2014

slide-47
SLIDE 47

Current State of the Art - Speed

Method                 ODS   OIS   AP    R50   FPS
Human                  .80   .80   -     -     -
Canny                  .60   .63   .58   .75   15
Felz-Hutt [16]         .61   .64   .56   .78   10
Normalized Cuts [10]   .64   .68   .45   .81   -
Mean Shift [9]         .64   .68   .56   .79   -
Hidayat-Green [23]     .62†  -     -     -     20
BEL [13]               .66†  -     -     -     1/10
Gb [30]                .69   .72   .72   .85   1/6
gPb + GPU [8]          .70†  -     -     -     1/2‡
ISCRA [42]             .72   .75   .46   .89   1/30‡
gPb-owt-ucm [1]        .73   .76   .73   .89   1/240
Sketch Tokens [31]     .73   .75   .78   .91   1
DeepNet [27]           .74   .76   .76   -     1/5‡
SCG [41]               .74   .76   .77   .91   1/280
SE+multi-ucm [2]       .75   .78   .76   .91   1/15
SE                     .73   .75   .77   .90   30
SE+SH                  .74   .76   .79   .93   12.5
SE+MS                  .74   .76   .78   .90   6
SE+MS+SH               .75   .77   .80   .93   2.5

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

“Requires GPU!!!” Dollar & Zitnick

slide-48
SLIDE 48

Edge Detection as Classification

positives

Hard!

{ 0, 1 }

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-49
SLIDE 49

Edges have Structure

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-50
SLIDE 50

Structured Random Forests

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-51
SLIDE 51

Structured Prediction

Taken from Kontschieder “Structured Class-Labels in Random Forests for Semantic Image Labelling”

slide-52
SLIDE 52

Structured versus Pixel Prediction

pixel output ☹ structured output ☺

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-53
SLIDE 53

Multiscale Detection

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-54
SLIDE 54

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-55
SLIDE 55

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-56
SLIDE 56

Taken from Dollar & Zitnick “Fast Edge Detection Using Structured Forests”

slide-57
SLIDE 57

Things to try in iOS - GPUImage

slide-58
SLIDE 58

More to read…

  • Prince et al.
  • Chapter 13, Sections 1-2.
  • Corke et al.
  • Chapter 13, Section 2.
  • P. Isola et al. “Crisp Boundary Detection Using Pointwise Mutual Information”, ECCV 2014.
  • P. Dollar & C. L. Zitnick, “Fast Edge Detection Using Structured Forests”, PAMI 2015.

Fast Edge Detection Using Structured Forests

Piotr Dollár and C. Lawrence Zitnick
Microsoft Research
{pdollar,larryz}@microsoft.com
arXiv:1406.5549v2 [cs.CV] 25 Nov 2014

Abstract: Edge detection is a critical component of many vision systems, including object detectors and image segmentation algorithms. Patches of edges exhibit well-known forms of local structure, such as straight lines or T-junctions. In this paper we take advantage of the structure present in local image patches to learn both an accurate and computationally efficient edge detector. We formulate the problem of predicting local edge masks in a structured learning framework applied to random decision forests. Our novel approach to learning decision trees robustly maps the structured labels to a discrete space on which standard information gain measures may be evaluated. The result is an approach that obtains realtime performance that is orders of magnitude faster than many competing state-of-the-art approaches, while also achieving state-of-the-art edge detection results on the BSDS500 Segmentation dataset and NYU Depth dataset. Finally, we show the potential of our approach as a general purpose edge detector by showing our learned edge models generalize well across datasets.

1 INTRODUCTION

Edge detection has remained a fundamental task in computer vision since the early 1970’s [18], [15], [43]. The detection of edges is a critical preprocessing step for a variety of tasks, including object recognition [47], [17], segmentation [33], [1], and active contours [26]. Traditional approaches to edge detection use a variety of methods for computing color gradients followed by non-maximal suppression [7], [19], [50]. Unfortunately, many visually salient edges do not correspond to color gradients, such as texture edges [34] and illusory contours [39]. State-of-the-art edge detectors [1], [41], [31], [21] use multiple features as input, including brightness, color, texture and depth gradients computed over multiple scales. Since visually salient edges correspond to a variety of visual phenomena, finding a unified approach to edge detection is difficult.

Motivated by this observation several recent papers have explored the use of learning techniques for edge detection [13], [49], [31], [27]. These approaches take an image patch and compute the likelihood that the center pixel contains an edge. Optionally, the independent edge predictions may then be combined using global reasoning [1], [41], [49], [2].

Edges in a local patch are highly interdependent [31]. They often contain well-known patterns, such as straight lines, parallel lines, T-junctions or Y-junctions [40], [31]. Recently, a family of learning approaches called structured learning [36] has been applied to problems exhibiting similar characteristics. For instance, [29] applies structured learning to the problem of semantic image labeling for which local image labels are also highly interdependent.

In this paper we propose a generalized structured learning approach that we apply to edge detection. This approach allows us to take advantage of the inherent structure in edge patches, while being surprisingly computationally efficient. We can compute edge maps in realtime, which is orders of magnitude faster than competing state-of-the-art approaches.

Fig. 1. Edge detection results using three versions of our Structured Edge (SE) detector demonstrating tradeoffs in accuracy vs. runtime. We obtain realtime performance while simultaneously achieving state-of-the-art results. ODS numbers were computed on BSDS [1] on which the popular gPb detector [1] achieves a score of .73. The variants shown include SE, SE+SH, and SE+MS+SH, see §4 for details.

A random forest framework is used to capture the structured information [29]. We formulate the problem of edge detection as predicting local segmentation masks given input image patches. Our novel approach to learning decision trees uses structured labels to determine the splitting function at each branch in the tree. The structured labels are robustly mapped to a discrete space on which standard information gain measures may be evaluated. Each forest predicts a patch of edge pixel labels that are aggregated across the image to compute our final edge map, see Figure 1. Since the aggregated edge maps may be diffuse, the edge maps may optionally be sharpened using local color and depth cues.

We show state-of-the-art results on both the BSDS500 [1] and the NYU Depth dataset [44]. We demonstrate the potential of our approach as a general purpose edge detector by showing the strong cross dataset generalization of our learned edge models.
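The introduction describes patch-level edge predictions being aggregated across the image into one edge map. A minimal sketch of one possible aggregation is shown below: overlapping patch predictions are averaged per pixel. Plain averaging, the `(top-left corner, patch)` input format and the function name are assumptions made for illustration, not the authors' stated procedure.

```python
import numpy as np

def aggregate_patches(predictions, image_shape):
    """Average overlapping patch predictions into a single edge map.

    predictions: iterable of ((row, col), patch), where (row, col) is the
    top-left corner of the patch in the image (hypothetical format).
    """
    total = np.zeros(image_shape)
    count = np.zeros(image_shape)
    for (r, c), patch in predictions:
        ph, pw = patch.shape
        total[r:r + ph, c:c + pw] += patch
        count[r:r + ph, c:c + pw] += 1
    # Pixels covered by no patch stay at zero.
    return np.divide(total, count, out=np.zeros(image_shape), where=count > 0)

# Two overlapping 2x2 predictions on a 2x3 image.
predictions = [((0, 0), np.ones((2, 2))),
               ((0, 1), np.zeros((2, 2)))]
edge_map = aggregate_patches(predictions, (2, 3))
```

Where the two patches overlap they disagree, and the averaged value of 0.5 illustrates why an aggregated edge map can look diffuse before any sharpening step.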