
343H: Honors AI, Lecture 26: More applications (4/29/2014), Kristen Grauman, UT Austin



  1. 343H: Honors AI Lecture 26: More applications 4/29/2014 Kristen Grauman UT Austin

  2. This week
     • Tournament Wed night (tomorrow), 7 pm
       • We’ll meet here
       • Submit final agent by tonight
       • Otherwise we’ll take your last qualifying entry
     • Class Thursday
       • Course wrap-up, exam details, tournament recap/awards, surveys

  3. Last time
     • Neural networks
     • Visual recognition
     • Face detection
     • Gender recognition
     • Boosting
     • Multi-class SVMs
     • Classifier cascades

  4. Today
     • Deep learning for image recognition
     • Body pose estimation from decision forests
     • Non-parametric scene recognition

  5. How many computers to identify a cat? [Le, Ng, Dean, et al. 2012]

  6. Perceptron Slide credit: Dan Klein and Pieter Abbeel

  7. Two-layer neural network Slide credit: Dan Klein and Pieter Abbeel

  8. N-layer neural network Slide credit: Dan Klein and Pieter Abbeel

  9. Auto-encoder (sketch) Slide credit: Dan Klein and Pieter Abbeel

  10. Training procedure: stacked auto-encoder
      • Auto-encoder
        • Layer 1 = “compressed” version of input layer
      • Stacked auto-encoder
        • For every image, make a compressed image (= layer 1 response to image)
        • Learn Layer 2 by using compressed images as input, and as output to be predicted
        • Repeat similarly for Layer 3, 4, etc.
      • Some details left out
        • Typically, in between layers, responses get agglomerated from several neurons (“pooling” / “complex cells”)
      Slide credit: Dan Klein and Pieter Abbeel
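The layer-by-layer procedure on this slide can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the network from the lecture: the sigmoid encoder, linear decoder, squared-error loss, learning rate, epoch count, and all function names (`train_autoencoder`, `train_stacked`, `encode`) are assumptions, and the pooling step mentioned on the slide is left out.

```python
import math
import random

def sigmoid(a):
    # Written via tanh so that large |a| cannot overflow.
    return 0.5 * (1.0 + math.tanh(0.5 * a))

def encode(layer, x):
    """Compressed version of x: the hidden-layer response."""
    W, b = layer
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bk)
            for row, bk in zip(W, b)]

def train_autoencoder(data, n_hidden, epochs=200, lr=0.1, seed=0):
    """Train one auto-encoder by SGD to reconstruct its own input.

    Returns only the encoder (W, b); the decoder is discarded.
    """
    rng = random.Random(seed)
    d = len(data[0])
    W = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(n_hidden)]
    b = [0.0] * n_hidden
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(d)]
    c = [0.0] * d
    for _ in range(epochs):
        for x in data:
            z = encode((W, b), x)
            x_hat = [sum(v * zk for v, zk in zip(row, z)) + cj
                     for row, cj in zip(V, c)]
            # Gradient of 0.5 * ||x_hat - x||^2 at the output and hidden layers.
            d_out = [xh - xj for xh, xj in zip(x_hat, x)]
            d_hid = [sum(V[j][k] * d_out[j] for j in range(d)) * z[k] * (1.0 - z[k])
                     for k in range(n_hidden)]
            for j in range(d):
                for k in range(n_hidden):
                    V[j][k] -= lr * d_out[j] * z[k]
                c[j] -= lr * d_out[j]
            for k in range(n_hidden):
                for i in range(d):
                    W[k][i] -= lr * d_hid[k] * x[i]
                b[k] -= lr * d_hid[k]
    return W, b

def train_stacked(data, hidden_sizes):
    """Greedy stacking: each layer is trained on the previous layer's codes."""
    layers, current = [], data
    for n_hidden in hidden_sizes:
        layer = train_autoencoder(current, n_hidden)
        layers.append(layer)
        current = [encode(layer, x) for x in current]  # the "compressed images"
    return layers

# Toy 4-pixel "images" (made-up data), compressed to 3 and then 2 units.
toy_images = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
stack = train_stacked(toy_images, [3, 2])
```

Each call to `train_autoencoder` sees only the codes produced by the layer below it, which is the "use compressed images as input, and as output to be predicted" step.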

  11. Final result: trained neural network Slide credit: Dan Klein and Pieter Abbeel

  12. Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake CVPR 2011

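  13. Toy example: distinguish left (L) and right (R) sides of the body
      [Figure: a decision tree applied to an image window centred at x. The root node tests $f(I, x; \Delta_1) > \theta_1$ and a second node tests $f(I, x; \Delta_2) > \theta_2$. Each test has a “no” and a “yes” branch, and each leaf stores a distribution P(c) over the classes L and R.]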

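  14. [Breiman et al. 84]
      • Node n receives the set $Q_n = \{(I, x)\}$ of training pixels (for all pixels), with body part distribution $P_n(c)$
      • The test $f(I, x; \Delta_n) > \theta_n$ (no / yes) splits $Q_n$ into left and right subsets $Q_l$ and $Q_r$, with distributions $P_l(c)$ and $P_r(c)$; a good split reduces entropy
      • Take $(\Delta, \theta)$ that maximises information gain, where E denotes entropy:
        $$\Delta E = -\frac{|Q_l|}{|Q_n|} E(Q_l) - \frac{|Q_r|}{|Q_n|} E(Q_r)$$
      • Goal: drive entropy at leaf nodes to zero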
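A minimal sketch of the split selection on this slide, under some assumptions: each training pixel is reduced to a single precomputed scalar feature value standing in for f(I, x; Δ), only the threshold θ is searched, and the helper names are hypothetical. The gain below also includes the parent entropy E(Q_n); that term is the same for every candidate split, so it does not change which split wins.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropies of the two children."""
    n = len(parent)
    return (entropy(parent)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))

def best_threshold(features, labels):
    """Try each observed feature value as theta; keep the highest-gain split."""
    best_theta, best_gain = None, -1.0
    for theta in sorted(set(features)):
        left = [c for f, c in zip(features, labels) if f <= theta]   # test fails
        right = [c for f, c in zip(features, labels) if f > theta]   # test passes
        gain = information_gain(labels, left, right)
        if gain > best_gain:
            best_theta, best_gain = theta, gain
    return best_theta, best_gain

# Made-up feature responses for four pixels from left (L) and right (R) body sides.
theta, gain = best_threshold([0.1, 0.2, 0.8, 0.9], ["L", "L", "R", "R"])
```

On this toy data the chosen threshold separates the two classes perfectly, so both children have zero entropy, which is the "drive entropy at leaf nodes to zero" goal in miniature.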

  15. [Amit & Geman 97] [Breiman 01] [Geurts et al. 06]
      [Figure: the same input (I, x) is passed through tree 1, …, tree T, giving per-tree distributions $P_1(c), \dots, P_T(c)$.]
      • Trained on different random subset of images
      • “bagging” helps avoid over-fitting
      • Average tree posteriors:
        $$P(c \mid I, x) = \frac{1}{T} \sum_{t=1}^{T} P_t(c \mid I, x)$$
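The averaging step can be written directly. In this sketch each tree's leaf distribution is assumed to be a plain dictionary from class label to probability; the function name and the example numbers are made up for illustration.

```python
def forest_posterior(tree_posteriors):
    """Average per-tree class distributions: P(c) = (1/T) * sum_t P_t(c)."""
    T = len(tree_posteriors)
    classes = set().union(*tree_posteriors)
    # A class missing from a tree's leaf counts as probability 0 for that tree.
    return {c: sum(p.get(c, 0.0) for p in tree_posteriors) / T for c in classes}

# Three hypothetical trees voting on one pixel.
posterior = forest_posterior([
    {"L": 0.9, "R": 0.1},
    {"L": 0.6, "R": 0.4},
    {"L": 0.3, "R": 0.7},
])
```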

  16. • Define 3D world space density
        [Equation shown as an image; its annotated terms are: 3D coord, pixel index i, pixel weight, 3D coord of i-th pixel, bandwidth, inferred probability, depth at i-th pixel]
      • Mean shift for mode detection …
      • Hypothesize body joints

  17. Mean shift
      [Figure labels: search window, center of mass, mean shift vector]
      Slide by Y. Ukrainitz & B. Sarel

  18. Mean shift clustering
      • Cluster: all data points in the attraction basin of a mode
      • Attraction basin: the region for which all trajectories lead to the same mode
      Slide by Y. Ukrainitz & B. Sarel
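A small sketch of the two mean shift slides, assuming a flat (uniform) kernel so that the "search window" is simply a ball of fixed radius. The function names, the radius, the merge tolerance, and the toy points are all illustrative assumptions.

```python
import math

def mean_shift_mode(start, points, radius, tol=1e-6, max_iter=100):
    """Repeatedly move the window centre to the centre of mass of the points inside it."""
    center = tuple(start)
    for _ in range(max_iter):
        window = [p for p in points if math.dist(p, center) <= radius]
        if not window:
            break
        mass = tuple(sum(coords) / len(window) for coords in zip(*window))
        shift = math.dist(mass, center)  # length of the mean shift vector
        center = mass
        if shift < tol:
            break
    return center

def mean_shift_cluster(points, radius, merge_tol=1e-3):
    """Points whose trajectories end at the same mode share a cluster."""
    modes, assignment = [], []
    for p in points:
        m = mean_shift_mode(p, points, radius)
        for j, existing in enumerate(modes):
            if math.dist(m, existing) < merge_tol:
                assignment.append(j)
                break
        else:
            modes.append(m)
            assignment.append(len(modes) - 1)
    return modes, assignment

# Two well-separated blobs (made-up 2D data).
pts = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
modes, assignment = mean_shift_cluster(pts, radius=1.0)
```

Starting the search from every data point and grouping by the mode reached is a direct reading of the "attraction basin" definition; practical implementations usually use a smooth kernel and avoid the quadratic cost.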

  19. Nearest Neighbor classification
      • Assign label of nearest training data point to each test data point
      [Figure, from Duda et al.: Voronoi partitioning of feature space for 2-category 2D data. Black = negative, red = positive. The novel test example is closest to a positive example from the training set, so classify it as positive.]

  20. K-Nearest Neighbors classification
      • For a new point, find the k closest points from training data
      • Labels of the k points “vote” to classify
      [Figure, k = 5: black = negative, red = positive. If the query lands here, the 5 NN consist of 3 negatives and 2 positives, so we classify it as negative.]
      Source: D. Lowe
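The voting rule on the two nearest-neighbor slides fits in a few lines. This sketch assumes Euclidean distance and a brute-force search over the training set; the function name and the toy points are made up, chosen so that the query reproduces the 3-negatives-versus-2-positives situation described above.

```python
import math
from collections import Counter

def knn_classify(query, points, labels, k=5):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Made-up 2D training set: three negatives near the origin, three positives further out.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0), (5.0, 5.0)]
labels = ["neg", "neg", "neg", "pos", "pos", "pos"]

five_nn = knn_classify((1.0, 1.0), train, labels, k=5)  # 3 neg vs 2 pos among the 5 NN
one_nn = knn_classify((2.5, 2.5), train, labels, k=1)   # plain nearest neighbor
```

Setting `k=1` gives the plain nearest-neighbor rule from the previous slide.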

  21. 6+ million geotagged photos by 109,788 photographers, annotated by Flickr users

  22. Global texture: capturing the “Gist” of the scene
      • Capture global image properties while keeping some spatial information
      • Gist descriptor
      Oliva & Torralba IJCV 2001, Torralba et al. CVPR 2003

  23. [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

  24. The Importance of Data [Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

  25. Recap
      • Deep learning for image recognition
      • Body pose estimation from decision forests
      • Non-parametric scene recognition
      • Visual recognition tasks with supervised classification
      • Variety of features and models
      • Training data quality and/or quantity essential
