
Applied Bayesian Nonparametrics


  1. Applied Bayesian Nonparametrics: Hierarchical Models. Erik Sudderth, Brown University. Work by E. Sudderth, A. Torralba, W. Freeman, & A. Willsky. IJCV 2008: Describing Visual Scenes using Transformed Objects & Parts. CVPR 2006: Depth from Familiar Objects: A Hierarchical Model for 3D Scenes. NIPS 2005: Describing Visual Scenes using Transformed Dirichlet Processes. Building on work by Y. W. Teh, M. Jordan, M. Beal, & D. Blei. JASA 2006: Hierarchical Dirichlet Processes.

  2. Learning with Topic Models. Framework for unsupervised discovery of low-dimensional latent structure from bag-of-words representations. [Word cloud: model, neural, stochastic, recognition, algorithms, nonparametric, neuroscience, gradient, statistics, dynamical, vision, Bayesian] • pLSA: Probabilistic Latent Semantic Analysis (Hofmann 2001) • LDA: Latent Dirichlet Allocation (Blei, Ng, & Jordan 2003) • HDP: Hierarchical Dirichlet Processes (Teh, Jordan, Beal, & Blei 2006)

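  3. Hierarchical Dirichlet Process. J groups of data (documents, images, ...): $G_0 \sim \mathrm{DP}(\gamma, H)$ and $G_j \sim \mathrm{DP}(\alpha, G_0)$, with $E[\pi_j] = \beta$, where $\beta$ denotes the global mixture weights of $G_0$ and $\pi_j$ the mixture weights of group $j$.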
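The two-level construction on this slide can be illustrated with a truncated stick-breaking sketch. This is not code from the talk: the function names, the truncation level, and the use of a finite Dirichlet draw for the group weights are assumptions made for illustration. It draws global weights β for G_0, then group weights π_j centred on β, so that the average of the π_j approaches β.

```python
import numpy as np

def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights (hypothetical helper, not from the slides)."""
    fractions = rng.beta(1.0, concentration, size=truncation)
    fractions[-1] = 1.0  # the last stick takes whatever mass remains
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - fractions[:-1])))
    return fractions * remaining

def sample_hdp_weights(gamma, alpha, n_groups, truncation=20, seed=0):
    """Draw global weights beta and per-group weights pi_j under a finite truncation."""
    rng = np.random.default_rng(seed)
    beta = stick_breaking(gamma, truncation, rng)  # weights of G_0 ~ DP(gamma, H)
    # Assumption: with a finite truncation, the weights of G_j ~ DP(alpha, G_0)
    # are Dirichlet(alpha * beta), which has mean beta.
    pi = rng.dirichlet(alpha * beta, size=n_groups)
    return beta, pi

beta, pi = sample_hdp_weights(gamma=5.0, alpha=10.0, n_groups=3)
print(beta.round(3))
print(pi.round(3))
```

Larger values of alpha keep each group's weights closer to the shared β, while smaller values let groups concentrate on a few of the shared components.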

  4. Chinese Restaurant Franchise

  5. Local Visual Features: Superpixels. Inspired by the successes of topic models for text data, some have proposed learning from local image features. • Partition image into ~1,000 superpixels • Goal: Reduce dimensionality and aggregate information spatially, hopefully not across object boundaries!

  6. Local Visual Features: Interest Regions. Affinely Adapted Harris Corners; Maximally Stable Extremal Regions; Linked Sequences of Canny Edges. • Some invariance to lighting & pose variations • Dense, multiscale over-segmentation of image

  7. A Discrete Feature Vocabulary. SIFT Descriptors (Lowe, IJCV 2004) • Normalized histograms of orientation energy • Compute ~1,000 word dictionary via K-means • Map each feature to nearest visual word. Notation: $w_{ji}$ is the appearance of feature i in image j; $v_{ji}$ is the 2D position of feature i in image j.
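The dictionary step above can be sketched with a plain Lloyd's K-means in NumPy. The function names `build_vocabulary` and `quantize`, the random initialisation, and the iteration count are assumptions for illustration; the slide only states that a dictionary of about 1,000 words is computed via K-means and that each feature maps to its nearest visual word. Random vectors stand in for real SIFT descriptors here.

```python
import numpy as np

def quantize(descriptors, centers):
    """Map each descriptor to the index of its nearest centre (its visual word)."""
    # Squared Euclidean distance between every descriptor and every centre.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, n_words, n_iters=20, seed=0):
    """Lloyd's K-means; returns an (n_words, dim) array of word centres."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise centres at distinct, randomly chosen descriptors.
    start = rng.choice(len(descriptors), size=n_words, replace=False)
    centers = descriptors[start].astype(float)
    for _ in range(n_iters):
        words = quantize(descriptors, centers)
        for k in range(n_words):
            members = descriptors[words == k]
            if len(members) > 0:  # keep the old centre if a cluster empties
                centers[k] = members.mean(axis=0)
    return centers

# Toy usage with random 128-dimensional vectors in place of SIFT descriptors.
toy = np.random.default_rng(1).normal(size=(500, 128))
vocabulary = build_vocabulary(toy, n_words=10)
print(quantize(toy[:5], vocabulary))
```

The distance computation materialises an array of size features × words × dimensions, which is fine for a toy example but would need batching at the scale the slide describes.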

  8. The World as a Bag of Visual Words. Fei-Fei & Perona, CVPR 2005: topics as visual themes composing a known set of scene categories. Sivic, Russell, Efros, Zisserman, & Freeman, ICCV 2005: topics as visual object classes within a (carefully chosen) image collection.

  9. Images as more than Bags of Features • How do I know this is ocean beneath a clear sky? • How many bicycles and tricycles am I looking at? Why are we trying to squeeze images into topic models? There are many more tools available by adapting nonparametric and hierarchical Bayesian models.

  10. Visual Object Categorization • GOAL: Visually recognize and localize object categories • Robustly learn appearance models from few examples

  11. Part-Based Models for Objects. Pictorial Structures (Fischler & Elschlager, 1973); Generalized Cylinders (Marr & Nishihara, 1978); Recognition by Components (Biederman, 1987); Constellation Model (Perona, Weber, Welling, Fergus, Fei-Fei, from 2000); Efficient Matching (Felzenszwalb & Huttenlocher, 2005); Discriminative Parts (Felzenszwalb, McAllester, Ramanan, from 2008)

  12. Counting Objects & Parts. How many parts? How many objects?

  13. Generative Model for Objects. For each image: sample a reference position. For each feature: • Randomly choose one part • Sample from that part’s feature distribution
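The sampling steps above can be written as a short sketch. The slide does not give the distributions, so several choices here are assumptions made for illustration: a uniform prior on the reference position, a categorical distribution over visual words for appearance, and a Gaussian offset from the reference position for feature location. All function and parameter names are hypothetical.

```python
import numpy as np

def sample_image(part_weights, word_probs, offset_means, offset_covs,
                 n_features, rng, image_size=(640.0, 480.0)):
    """Sample one image's features from a fixed-order part-based model."""
    # Assumption: the reference position is uniform over the image.
    ref = rng.uniform(low=0.0, high=image_size)
    # For each feature, randomly choose one part.
    parts = rng.choice(len(part_weights), size=n_features, p=part_weights)
    # Sample appearance (a visual word index) from the chosen part.
    n_words = word_probs.shape[1]
    words = np.array([rng.choice(n_words, p=word_probs[k]) for k in parts])
    # Sample position as the reference plus a part-specific Gaussian offset.
    offsets = np.array([rng.multivariate_normal(offset_means[k], offset_covs[k])
                        for k in parts])
    return ref, parts, words, ref + offsets

rng = np.random.default_rng(0)
ref, parts, words, positions = sample_image(
    part_weights=np.array([0.6, 0.4]),
    word_probs=np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    offset_means=np.array([[0.0, 0.0], [40.0, 0.0]]),
    offset_covs=np.stack([np.eye(2) * 25.0, np.eye(2) * 25.0]),
    n_features=5, rng=rng)
print(parts, words)
print(positions.round(1))
```

This version fixes the number of parts in advance; the nonparametric models later in the deck instead let the number of parts grow with the data.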

  14. Objects as Distributions. [Figure labels: Pr(part); feature appearance, Pr(appearance | part); feature position, Pr(position | part)] • Parts are defined by parameters which encode distributions on visual features • Objects are defined by distributions on the infinitely many potential part parameters

  15. A Nonparametric Part-Based Model. [Plot: # parts vs. # images; panels: 4 Images, 16 Images, 64 Images]

  16. Generalizing Across Categories. Can we transfer knowledge from one object category to another?

  17. Learning Shared Parts • Objects are often locally similar in appearance • Discover parts shared across categories: How many total parts should we share? How many parts should each category use?

  18. Hierarchical DP Object Model. [Graphical model diagram; surviving labels: H, R, G_0, G_1, G_2, v, w, N, J, L]

  19. Hierarchical DP Object Model. Discrete Data: Teh et al., 2004. [Graphical model diagram; surviving labels: H, R, G_0, G_1, G_2, v, w, N, J, L]

  20. Chinese Restaurant Franchise

  21. Sharing Parts: 16 Categories • Caltech 101 Dataset (Li & Perona) • Bikes from Graz-02 (Opelt & Pinz) • Horses (Borenstein & Ullman) • Google ... • Cat & dog faces (Vidal-Naquet & Ullman)

  22. Visualization of Shared Parts. [Figure panels: Pr(position | part), Pr(appearance | part)]

  23. Visualization of Shared Parts. [Figure panels: Pr(position | part), Pr(appearance | part)]

  24. Visualization of Shared Parts. [Figure panels: Pr(position | part), Pr(appearance | part)]

  25. Visualization of Part Densities. Hierarchical Clustering of Pr(part | object)

  26. Detection Task

  27. Detection Results. 6 Training Images per Category (ROC Curves). Shared Parts more accurate than Unshared Parts. Modeling feature positions improves shared detection, but hurts unshared detection.

  28. Detection Results. 6 Training Images per Category (ROC Curves). Detection vs. Training Set Size (Area Under ROC).

  29. Sharing Simplifies Models

  30. Scenes, Objects, and Parts. [Figure levels: Scene, Objects, Parts, Features]

  31. Contextual Transfer Learning

  32. Object vs. Visual Categories • Assume training data contains object category labels • Discover underlying visual categories automatically

  33. Multiple Object Scenes • How many cars are there? • Where are those cars in the scene? Standard dependent Dirichlet process models (Gelfand et al., 2005) inappropriate

  34. Spatial Transformations • Let global DP clusters model objects in a canonical coordinate frame • Generate images via a random set of transformations: a parameterized family of transformations shifts each cluster from the canonical coordinate frame to the object location in a given image. Related work: Layered Motion Models (Darrell & Pentland 1991, Wang & Adelson 1994, Jojic & Frey 2001); Nonparametric Transformation Densities (Learned-Miller & Viola 2000)
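A minimal sketch of the shift described above, assuming the cluster is a 2D Gaussian in the canonical frame and the transformation family is pure translation (the slide says "shift" but does not define the family further). The function names are hypothetical. Each object instance in a scene gets its own shift, so one canonical cluster can explain several objects at different locations.

```python
import numpy as np

def transform_cluster(mean, cov, shift):
    """Translate a canonical Gaussian cluster; translation leaves the covariance unchanged."""
    mean = np.asarray(mean, dtype=float) + np.asarray(shift, dtype=float)
    return mean, np.asarray(cov, dtype=float)

def sample_scene(canonical_mean, canonical_cov, shifts, n_per_instance, rng):
    """Sample feature positions for several shifted copies of one canonical cluster."""
    positions, labels = [], []
    for t, shift in enumerate(shifts):
        mean, cov = transform_cluster(canonical_mean, canonical_cov, shift)
        positions.append(rng.multivariate_normal(mean, cov, size=n_per_instance))
        labels.append(np.full(n_per_instance, t))  # which instance made each feature
    return np.vstack(positions), np.concatenate(labels)

rng = np.random.default_rng(0)
positions, labels = sample_scene(
    canonical_mean=[0.0, 0.0],
    canonical_cov=np.diag([25.0, 9.0]),
    shifts=[[100.0, 50.0], [400.0, 300.0]],  # two instances of the same object
    n_per_instance=4, rng=rng)
print(labels)
print(positions.round(1))
```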

  35. A Toy World: Bars & Blobs

  36. Transformed Dirichlet Process. [Graphical model diagram annotated with Mixture Parameters and Transformations; surviving labels: H, R, G_0, G_j, G_1, G_2, G_3, v, N, J]

  37. Importance of Transformations. [Figure panels: TDP, HDP]

  38. Counting & Locating Objects • How many cars are there? Dirichlet Processes • Where are those cars in the scene? Transformations

  39. Visual Scene TDP. [Graphical model diagram. Global Density: object category, part size & shape, transformation prior. Transformed Densities: object category, part size & shape, instance locations. 2D Image Features: appearance w, location v. Other surviving labels: R, G_0, G_j, o, F, H, N, J]

  40. Street Scene Visual Categories

  41. Street Scene Segmentations

  42. Segmentation Performance

  43. Extension: 3D Scenes. Office Scene (green: near, red: far) • Segmentation easier in 3D • Identifying known objects regularizes depth estimation

  44. 3D Structure from Stereo. [Figure panels: Reference (left) Image, Potential Matches, Depth Densities, Overhead View] Depth ∝ 1 / Disparity

  45. Greedy Depth Estimates. [Figure panels: Reference (left) Image, Potential Matches, Depth Densities; green: near, red: far]

  46. 3D Transformed DP: Office Scenes. [Labels: Computer Screen, Background, Bookshelves, Desk]

  47. Stereo Test Image. Simultaneous object recognition & coarse 3D reconstruction
