Perspective click-and-drag area selections in pictures

  1. Perspective click-and-drag area selections in pictures. Frank NIELSEN, www.informationgeometry.org, Sony Computer Science Laboratories, Inc. Machine Vision Applications (MVA), 21st May 2013. © 2013 Frank Nielsen

  2. Traditional click-and-drag rectangular selection → Fails for selecting parts in photos.

  3. Traditional click-and-drag rectangular selection → Fails for selecting parts in photos: cannot capture “New” without part of “Court”. Man-made environments contain many perspectively slanted planar parts.

  4. Perspective click’n’drag: intelligent UI (= computer vision + human-computer interface) → Image “parsing” of perspective rectangles (automatic/semi-automatic/manual).

  5. Video demonstrations: perspective click-and-drag + perspective copy/paste/swap.

  6. Perspective click’n’drag: Outline
     1. Preprocessing: detect and structure perspective parts
        1.1 Quad detector:
            ◮ Image segmentation
            ◮ Outer contour quad fitting
            ◮ Quad recognition
        1.2 Quad homography tree
     2. Interactive user interface: perspective quad selection based on click-and-drag UI (= diagonal selection)
     3. Application example: interactive image editing (swap)

  7. Preprocessing workflow

  8. Quad detection: Sobel/Hough transform. How to detect convex quads in images? Indoor robotics [6] uses vanishing points. Limitation of the Hough transform [8] on the Sobel image: the combinatorial line arrangement costs $O(n^4)$... → good for a limited number of detected lines (blackboard detection [8], name card detection, etc.).

  9. Quad detection: Image segmentation (SRM) → Fast Statistical Region Merging (SRM) [4]. Source code available in Java™, Matlab®, Python®, C, etc.

  10. Quad detection: Image segmentation (SRM)

  11. Quad detector
      ◮ For each segmented region, consider its exterior contour $C$ (a polygon).
      ◮ Compute the contour diameter $P_1P_3$.
      ◮ Compute the uppermost $P_2$ and bottommost $P_4$ extremal points.
      ◮ Calculate the symmetric Hausdorff distance between the quad $Q = (P_1, P_2, P_3, P_4)$ and the contour $C$.
      ◮ Accept the region as a quad when the distance falls below a prescribed threshold.
      All quads are convex and clockwise oriented.
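The quad-fitting steps can be sketched in Java as follows. This is only an illustration under stated assumptions, not the author's implementation: the class and method names are hypothetical, the contour is a dense array of `{x, y}` points, "uppermost" and "bottommost" are interpreted as the contour points farthest from the diameter line on each side, the Hausdorff distance is approximated by sampling the quad edges, and the resulting quad is not reordered clockwise.

```java
class QuadFitSketch {
    /** Squared Euclidean distance between two points {x, y}. */
    static double dist2(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return dx * dx + dy * dy;
    }

    /** Cross product whose sign tells on which side of line (a, b) the point c lies. */
    static double cross(double[] a, double[] b, double[] c) {
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
    }

    /** Distance from point p to the segment [a, b]. */
    static double pointSegment(double[] p, double[] a, double[] b) {
        double len2 = dist2(a, b);
        double t = len2 == 0 ? 0
                : ((p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])) / len2;
        t = Math.max(0, Math.min(1, t));
        double[] proj = { a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]) };
        return Math.sqrt(dist2(p, proj));
    }

    /** Candidate quad (P1, P2, P3, P4): diameter endpoints plus the farthest point on each side. */
    static double[][] fitQuad(double[][] contour) {
        double[] p1 = contour[0], p3 = contour[0];
        double best = -1;
        for (double[] a : contour)          // brute-force diameter, quadratic in the contour size
            for (double[] b : contour)
                if (dist2(a, b) > best) { best = dist2(a, b); p1 = a; p3 = b; }
        double[] p2 = p1, p4 = p1;
        double max = 0, min = 0;
        for (double[] c : contour) {        // assumed reading of "uppermost" / "bottommost"
            double s = cross(p1, p3, c);
            if (s > max) { max = s; p2 = c; }
            if (s < min) { min = s; p4 = c; }
        }
        return new double[][] { p1, p2, p3, p4 };
    }

    /** Sampled symmetric Hausdorff distance between the quad boundary and the contour points. */
    static double hausdorff(double[][] quad, double[][] contour, int samplesPerEdge) {
        double contourToQuad = 0;
        for (double[] c : contour) {
            double d = Double.MAX_VALUE;
            for (int i = 0; i < 4; i++) d = Math.min(d, pointSegment(c, quad[i], quad[(i + 1) % 4]));
            contourToQuad = Math.max(contourToQuad, d);
        }
        double quadToContour = 0;
        for (int i = 0; i < 4; i++) {
            double[] a = quad[i], b = quad[(i + 1) % 4];
            for (int k = 0; k <= samplesPerEdge; k++) {
                double t = (double) k / samplesPerEdge;
                double[] q = { a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]) };
                double d = Double.MAX_VALUE;
                for (double[] c : contour) d = Math.min(d, Math.sqrt(dist2(q, c)));
                quadToContour = Math.max(quadToContour, d);
            }
        }
        return Math.max(contourToQuad, quadToContour);
    }
}
```

A region would then be accepted when `hausdorff(fitQuad(contour), contour, n)` falls below the chosen threshold; a square-shaped contour yields a distance near zero, while a circular one does not.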

  12. Quad detection: Image segmentation (SRM) ... or any closed-contour image segmentation → run at different scales (e.g., parameter $Q$ in SRM). Alternatively, one can also use mean-shift [9], normalized cuts [7], etc. Why? To increase the chance that some parameter setting detects a given quad. → We end up with a quad soup.

  13. Multi-segmentation: increases the chance of recognizing quads, but yields a quad soup. (Figure panels: $Q = 128$, $Q = 10$, $Q = 0.3$, $Q = 0.25$.)

  14. Nested convex quad hierarchy
      ◮ From a quad soup, sort the quads in decreasing order of area in a priority queue.
      ◮ Add the image boundary quad $Q_0$ as the root of the quad tree $\mathcal{Q}$.
      ◮ Greedy selection: add a quad from the queue if and only if it is fully contained in another quad of $\mathcal{Q}$.
      ◮ When adding a quad $Q_i$, compute the homography [2] $H_i$ of the quad to the unit square and its inverse $H_i^{-1}$.
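The greedy construction above might look like the following Java sketch. The names (`QuadTreeSketch`, `Node`, `build`) are hypothetical, quads are `double[4][2]` corner arrays, a plain sort stands in for the priority queue, and containment of a convex quad is tested by checking its four corners. Each accepted quad is attached to the deepest node that contains it; overlap between sibling quads is not checked, since the slide does not specify it.

```java
import java.util.ArrayList;
import java.util.List;

class QuadTreeSketch {
    static class Node {
        final double[][] quad;
        final List<Node> children = new ArrayList<>();
        Node(double[][] quad) { this.quad = quad; }
    }

    /** Area of a quad given as four {x, y} corners (shoelace formula). */
    static double area(double[][] q) {
        double s = 0;
        for (int i = 0; i < 4; i++) {
            double[] a = q[i], b = q[(i + 1) % 4];
            s += a[0] * b[1] - b[0] * a[1];
        }
        return 0.5 * Math.abs(s);
    }

    /** True when p lies inside or on the convex quad q, for either orientation. */
    static boolean containsPoint(double[][] q, double[] p) {
        boolean pos = false, neg = false;
        for (int i = 0; i < 4; i++) {
            double[] a = q[i], b = q[(i + 1) % 4];
            double c = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]);
            if (c > 0) pos = true;
            if (c < 0) neg = true;
        }
        return !(pos && neg);
    }

    /** A convex quad contains another one when it contains all four of its corners. */
    static boolean containsQuad(double[][] outer, double[][] inner) {
        for (double[] v : inner) if (!containsPoint(outer, v)) return false;
        return true;
    }

    /** Deepest node of the subtree whose quad fully contains q, or null if none does. */
    static Node deepestContaining(Node node, double[][] q) {
        if (!containsQuad(node.quad, q)) return null;
        for (Node child : node.children) {
            Node hit = deepestContaining(child, q);
            if (hit != null) return hit;
        }
        return node;
    }

    /** Builds the hierarchy from the image boundary quad and a quad soup. */
    static Node build(double[][] imageQuad, List<double[][]> soup) {
        List<double[][]> sorted = new ArrayList<>(soup);
        sorted.sort((a, b) -> Double.compare(area(b), area(a))); // decreasing area
        Node root = new Node(imageQuad);
        for (double[][] q : sorted) {
            Node parent = deepestContaining(root, q);
            if (parent != null) parent.children.add(new Node(q)); // otherwise the quad is dropped
        }
        return root;
    }
}
```

Processing quads by decreasing area means that a container is always inserted before the quads nested inside it. The per-quad homographies would be computed and stored on each `Node` at insertion time.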

  15. Do not explicitly unwarp perspective rectangles. Many existing systems first unwarp... (figure panels: source, segmented, unwarped). Mobile cell phone signage recognition [5], AR systems, etc.

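  16. Perspective click’n’drag: User interaction
      Perspective sub-rectangle selection: click on a corner $p_1$ and drag to the opposite corner $p_3$. Find the deepest quad $Q$ in the quad hierarchy $\mathcal{Q}$ that contains both points $p_1$ and $p_3$, and let $H$ be its homography to the unit square.
      ◮ Map the diagonal to the unit square: $\tilde{p}'_1 = H\tilde{p}_1$ and $\tilde{p}'_3 = H\tilde{p}_3$.
      ◮ Regular dragging in the unit square: the remaining corners $p'_2$ and $p'_4$ of the axis-parallel rectangle with diagonal $p'_1p'_3$ have homogeneous coordinates of the form $(x', y', 1)^\top$, each taking $x'$ from one diagonal endpoint and $y'$ from the other.
      ◮ Perspective dragging in the image: map these corners back with $H^{-1}$, giving $p_2 \leftarrow p'_2$ and $p_4 \leftarrow p'_4$.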
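A minimal Java sketch of this selection step is given below, assuming the homography $H$ of the enclosing quad is already available as a 3x3 `double[][]`. The class and method names are hypothetical, and which of the two completed corners is called $p_2$ or $p_4$ is an arbitrary choice made here for illustration.

```java
class PerspectiveDragSketch {
    /** Applies the 3x3 homography h to the point (x, y) and returns inhomogeneous coordinates. */
    static double[] apply(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w
        };
    }

    /** Inverse of a 3x3 matrix via the adjugate (assumes a non-zero determinant). */
    static double[][] invert(double[][] m) {
        double a = m[0][0], b = m[0][1], c = m[0][2];
        double d = m[1][0], e = m[1][1], f = m[1][2];
        double g = m[2][0], h = m[2][1], i = m[2][2];
        double det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
        return new double[][] {
            { (e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det },
            { (f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det },
            { (d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det }
        };
    }

    /** Completes the perspective rectangle (p1, p2, p3, p4) from its dragged diagonal (p1, p3). */
    static double[][] select(double[][] h, double[] p1, double[] p3) {
        double[][] hInv = invert(h);
        double[] q1 = apply(h, p1[0], p1[1]);     // diagonal endpoints in the unit square
        double[] q3 = apply(h, p3[0], p3[1]);
        double[] p2 = apply(hInv, q3[0], q1[1]);  // assumed labelling: x' from p3', y' from p1'
        double[] p4 = apply(hInv, q1[0], q3[1]);  // assumed labelling: x' from p1', y' from p3'
        return new double[][] { p1, p2, p3, p4 };
    }
}
```

Redrawing the polygon returned by `select` on every mouse-drag event would give the perspective rubber-band effect; with an identity homography the result reduces to the usual axis-parallel rectangle.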

  17. Some examples of perspective click-and-drag selections: regular vs. perspective rectangle UI selection.

  18. Implementation details: Primitives on convex quads
      By convention, order quads clockwise, so that the determinant is positive for the two quad-induced triangles:
      $$\det = \begin{vmatrix} x_1 - x_3 & x_2 - x_3 \\ y_1 - y_3 & y_2 - y_3 \end{vmatrix}$$
      ◮ Predicate $p \in Q = (p_1, p_2, p_3, p_4)$? Two triangle queries: $p \in (p_1, p_2, p_3)$ or $p \in (p_3, p_4, p_1)$.
      ◮ Area of a quad: sum, over the two quad triangles, of one half of the absolute value of the determinant.

  19. In class Quadrangle:

```java
// Minimal stand-in for the Feature point type (assumed: fields x, y and a homogeneous weight w).
class Feature {
    double x, y, w;
    Feature(double x, double y, double w) { this.x = x; this.y = y; this.w = w; }
}

class Quadrangle {
    Feature p1, p2, p3, p4; // corners, in clockwise order

    // Area of the triangle (p1, p2, p3): half of the absolute determinant
    double area(Feature p1, Feature p2, Feature p3) {
        double res = (p1.x - p3.x) * (p2.y - p1.y) - (p1.x - p2.x) * (p3.y - p1.y);
        return 0.5 * Math.abs(res);
    }

    // Area of the quad: sum of its two triangles
    double area() {
        return area(p1, p2, p3) + area(p1, p3, p4);
    }

    // Clockwise or aligned order predicate
    boolean CW(Feature a, Feature b, Feature c) {
        double det = (a.x - c.x) * (b.y - c.y) - (b.x - c.x) * (a.y - c.y);
        return det >= 0.0;
    }

    // Determine if a pixel falls inside the quadrangle or not
    boolean inside(int x, int y) {
        Feature p = new Feature(x, y, 1.0);
        return CW(p1, p2, p) && CW(p2, p3, p) && CW(p3, p4, p) && CW(p4, p1, p);
    }
}
```

  20. Homography estimation
      Projective geometry, homogeneous and inhomogeneous coordinates:
      $$\tilde{p}'_i = \begin{bmatrix} \tilde{x}'_i \\ \tilde{y}'_i \\ w'_i \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ w_i \end{bmatrix} = H\tilde{p}_i, \qquad w'_i = h_{31}x_i + h_{32}y_i + h_{33}w_i$$
      $$x'_i = \frac{\tilde{x}'_i}{w'_i} = \frac{h_{11}x_i + h_{12}y_i + h_{13}w_i}{h_{31}x_i + h_{32}y_i + h_{33}w_i}, \qquad y'_i = \frac{\tilde{y}'_i}{w'_i} = \frac{h_{21}x_i + h_{22}y_i + h_{23}w_i}{h_{31}x_i + h_{32}y_i + h_{33}w_i}$$
      $A_i$ block matrix (taking $w_i = 1$):
      $$x'_i\,(h_{31}x_i + h_{32}y_i + h_{33}) = h_{11}x_i + h_{12}y_i + h_{13},$$
      $$y'_i\,(h_{31}x_i + h_{32}y_i + h_{33}) = h_{21}x_i + h_{22}y_i + h_{23}.$$
      Solve for $A_i h = 0$.

  21. Homography estimation using the inhomogeneous system
      Assume $h_{33} \neq 0$ (and set $h_{33} = 1$).
      $$\underbrace{\begin{bmatrix} x'_1 \\ y'_1 \\ x'_2 \\ y'_2 \\ x'_3 \\ y'_3 \\ x'_4 \\ y'_4 \end{bmatrix}}_{b} = \underbrace{\begin{bmatrix}
      x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1x'_1 & -y_1x'_1 \\
      0 & 0 & 0 & x_1 & y_1 & 1 & -x_1y'_1 & -y_1y'_1 \\
      x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2x'_2 & -y_2x'_2 \\
      0 & 0 & 0 & x_2 & y_2 & 1 & -x_2y'_2 & -y_2y'_2 \\
      x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3x'_3 & -y_3x'_3 \\
      0 & 0 & 0 & x_3 & y_3 & 1 & -x_3y'_3 & -y_3y'_3 \\
      x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4x'_4 & -y_4x'_4 \\
      0 & 0 & 0 & x_4 & y_4 & 1 & -x_4y'_4 & -y_4y'_4
      \end{bmatrix}}_{B} \underbrace{\begin{bmatrix} h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32} \end{bmatrix}}_{h'}$$
      Linear system written $Bh' = b$. For four pairs, $h' = B^{-1}b$.
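As an illustration, the 8x8 system $Bh' = b$ can be assembled and solved directly. The Java sketch below uses Gauss-Jordan elimination with partial pivoting rather than forming $B^{-1}$ explicitly; the class and method names are hypothetical, and the code assumes $h_{33} \neq 0$ and four point pairs with no three points collinear.

```java
class HomographySketch {
    /** Solves B h' = b with h33 fixed to 1; src and dst each hold four {x, y} points. */
    static double[][] fromFourPairs(double[][] src, double[][] dst) {
        double[][] m = new double[8][];          // augmented matrix [B | b]
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1], xp = dst[i][0], yp = dst[i][1];
            m[2 * i]     = new double[] { x, y, 1, 0, 0, 0, -x * xp, -y * xp, xp };
            m[2 * i + 1] = new double[] { 0, 0, 0, x, y, 1, -x * yp, -y * yp, yp };
        }
        for (int col = 0; col < 8; col++) {
            int piv = col;                       // partial pivoting: largest entry in the column
            for (int r = col + 1; r < 8; r++)
                if (Math.abs(m[r][col]) > Math.abs(m[piv][col])) piv = r;
            if (Math.abs(m[piv][col]) < 1e-12)
                throw new IllegalArgumentException("degenerate point configuration");
            double[] tmp = m[col]; m[col] = m[piv]; m[piv] = tmp;
            for (int r = 0; r < 8; r++) {        // eliminate the column from every other row
                if (r == col) continue;
                double f = m[r][col] / m[col][col];
                for (int c = col; c < 9; c++) m[r][c] -= f * m[col][c];
            }
        }
        double[] h = new double[8];
        for (int i = 0; i < 8; i++) h[i] = m[i][8] / m[i][i];
        return new double[][] { { h[0], h[1], h[2] }, { h[3], h[4], h[5] }, { h[6], h[7], 1 } };
    }

    /** Applies the homography to (x, y) and returns inhomogeneous coordinates. */
    static double[] apply(double[][] h, double x, double y) {
        double w = h[2][0] * x + h[2][1] * y + h[2][2];
        return new double[] {
            (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w
        };
    }
}
```

For the quad-to-unit-square case, `src` holds the four quad corners and `dst` the corners (0,0), (1,0), (1,1), (0,1) in the matching order.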

  22. Homography estimation using the normalized DLT
      Singular value decomposition of the matrix $A$ that stacks the blocks $A_i$:
      $$A = UDV^\top = \sum_{i=1}^{9} \lambda_i u_i v_i^\top$$
      The solution is the right singular vector corresponding to the smallest singular value (the last column vector $v_9$ of $V$). When $\lambda_9 = 0$, the system is exactly determined. When $\lambda_9 > 0$, the system is over-determined and $\lambda_9$ is an indicator of the goodness of fit of the solution $h = v_9$. In practice, this estimation procedure is highly unstable numerically [2]: points first need to be normalized so that their centroid defines the origin and the diameter is set to $\sqrt{2}$.
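The normalization step can be sketched in Java as below. This assumes the common Hartley-style convention in which the mean distance of the points to the origin becomes $\sqrt{2}$; the slide's "diameter" may refer to a slightly different scale measure. The class and method names are hypothetical.

```java
class NormalizeSketch {
    /** Returns the 3x3 similarity T moving the centroid to the origin and scaling the mean distance to sqrt(2). */
    static double[][] normalizer(double[][] pts) {
        double cx = 0, cy = 0;
        for (double[] p : pts) { cx += p[0]; cy += p[1]; }
        cx /= pts.length;
        cy /= pts.length;
        double mean = 0;
        for (double[] p : pts) mean += Math.hypot(p[0] - cx, p[1] - cy);
        mean /= pts.length;
        double s = Math.sqrt(2) / mean;          // assumed scale convention (mean distance)
        return new double[][] { { s, 0, -s * cx }, { 0, s, -s * cy }, { 0, 0, 1 } };
    }

    /** Applies the similarity t to every point and returns the normalized copies. */
    static double[][] applyAll(double[][] t, double[][] pts) {
        double[][] out = new double[pts.length][];
        for (int i = 0; i < pts.length; i++) {
            out[i] = new double[] {
                t[0][0] * pts[i][0] + t[0][2],
                t[1][1] * pts[i][1] + t[1][2]
            };
        }
        return out;
    }
}
```

With normalizers $T$ and $T'$ for the source and destination points, the homography $\hat{H}$ estimated on the normalized points is converted back by $H = T'^{-1}\hat{H}T$.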

  23. Image editing: Selection swaps. $H_{12}$ from $Q_1$ to $Q_2$ is obtained by composition: $H_{12} = H_1 H_2^{-1}$ and $H_{21} = H_{12}^{-1} = H_2 H_1^{-1}$. → Use backward pixel mapping [3] to avoid holes. Forward mapping: SRC → DEST ($H$). Backward mapping: DEST → SRC ($H^{-1}$).
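Backward mapping can be illustrated with the short Java sketch below: every destination pixel is sent through the inverse homography and looks up its source pixel, so no destination pixel is left unfilled. The names are hypothetical, images are plain `int[][]` arrays indexed `[y][x]`, sampling is nearest-neighbour for brevity, and the caller is assumed to supply the destination-to-source matrix `hInv`.

```java
class BackwardMapSketch {
    /** Fills a width x height destination image by looking up each pixel in src through hInv. */
    static int[][] warp(int[][] src, double[][] hInv, int width, int height, int fill) {
        int[][] dst = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double w = hInv[2][0] * x + hInv[2][1] * y + hInv[2][2];
                if (Math.abs(w) < 1e-12) { dst[y][x] = fill; continue; } // point mapped to infinity
                double sx = (hInv[0][0] * x + hInv[0][1] * y + hInv[0][2]) / w;
                double sy = (hInv[1][0] * x + hInv[1][1] * y + hInv[1][2]) / w;
                int ix = (int) Math.round(sx), iy = (int) Math.round(sy);  // nearest neighbour
                boolean in = iy >= 0 && iy < src.length && ix >= 0 && ix < src[0].length;
                dst[y][x] = in ? src[iy][ix] : fill;
            }
        }
        return dst;
    }
}
```

A forward loop over source pixels would instead scatter values into the destination and could skip some destination pixels when the mapping stretches the image, which is the hole problem the slide refers to. For a swap, the same routine would be run once in each direction, restricted to the pixels inside each selected quad.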

  24. Image editing: Selection swaps

  25. Image editing: Selection swaps

  26. Image editing: Selection swaps

  27. Image editing: Selection swaps

  28. Perspective Click-and-Drag UI: Conclusion
      ◮ Simple UI system relying on computer vision.
      ◮ Extend to other input formats: stereo pairs, RGBZ images, etc.
      ◮ Implemented using processing.org (2500+ lines).
      Ongoing work:
      ◮ Rely on efficient quad detection: extensive benchmarking (BSDS500, Corel, ImageNet, etc. databases).
      ◮ Extend to various perspectively slanted shapes (e.g., ball → ellipsoid).
      ◮ Robust multiple quad-to-square homography estimations [1]?
      www.informationgeometry.org
