343H: Honors AI. Lecture 26: More applications. 4/29/2014. Kristen Grauman, UT Austin.


SLIDE 1

343H: Honors AI

Lecture 26: More applications 4/29/2014 Kristen Grauman UT Austin

SLIDE 2

This week

  • Tournament Wed night (tomorrow), 7 pm
    • We’ll meet here
  • Submit final agent by tonight
    • Otherwise we’ll take your last qualifying entry
  • Class Thursday
    • Course wrap-up, exam details, tournament recap/awards, surveys

SLIDE 3

Last time

  • Neural networks
  • Visual recognition
  • Face detection
  • Gender recognition
  • Boosting
  • Multi-class SVMs
  • Classifier cascades
SLIDE 4

Today

  • Deep learning for image recognition
  • Body pose estimation from decision forests
  • Non-parametric scene recognition
SLIDE 5

How many computers does it take to identify a cat?

[Le, Ng, Dean, et al. 2012]

SLIDE 6

Perceptron

Slide credit: Dan Klein and Pieter Abbeel
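The perceptron’s mistake-driven update can be sketched in a few lines. This is a minimal NumPy sketch, not from the slides: the toy data, epoch count, and the bias-as-extra-feature trick are illustrative choices.

```python
import numpy as np

def perceptron_train(X, y, epochs=20):
    """Train a binary perceptron; labels y are in {-1, +1}.
    A constant 1 feature is appended so the bias lives inside w."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                w += yi * xi              # nudge weights toward the correct side
    return w

def perceptron_predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.where(Xb @ w > 0, 1, -1)

# Linearly separable toy data: class +1 sits above the line x0 + x1 = 1
X = np.array([[0.0, 0.0], [0.2, 0.3], [1.0, 1.0], [0.9, 0.8]])
y = np.array([-1, -1, 1, 1])
w = perceptron_train(X, y)
print(perceptron_predict(w, X))  # recovers the training labels
```

For separable data like this, the updates stop once every point is on the correct side of the learned hyperplane.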

SLIDE 7

Two-layer neural network

Slide credit: Dan Klein and Pieter Abbeel

SLIDE 8

N-layer neural network

Slide credit: Dan Klein and Pieter Abbeel

SLIDE 9

Auto-encoder (sketch)

Slide credit: Dan Klein and Pieter Abbeel

SLIDE 10

Training procedure: stacked auto-encoder

  • Auto-encoder
    • Layer 1 = “compressed” version of input layer
  • Stacked auto-encoder
    • For every image, make a compressed image (= layer 1 response to the image)
    • Learn Layer 2 by using the compressed images as both the input and the output to be predicted
    • Repeat similarly for Layer 3, 4, etc.
  • Some details left out
    • Typically, in between layers the responses get agglomerated from several neurons (“pooling” / “complex cells”)

Slide credit: Dan Klein and Pieter Abbeel
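The greedy layer-wise procedure above can be sketched as a toy NumPy implementation. Simplifying assumptions not in the slides: one sigmoid hidden layer per auto-encoder, a linear decoder, full-batch gradient descent, no pooling between layers, and made-up sizes and hyperparameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, steps=500, lr=0.5, seed=0):
    """Train a one-hidden-layer auto-encoder by gradient descent.
    Returns an encode() function (the 'compressed' representation)
    plus the reconstruction-loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(steps):
        H = sigmoid(X @ W1 + b1)          # compressed code
        Xhat = H @ W2 + b2                # linear reconstruction
        err = Xhat - X
        losses.append(float((err ** 2).mean()))
        # backprop of the mean squared reconstruction error
        dXhat = 2 * err / err.size
        dW2 = H.T @ dXhat; db2 = dXhat.sum(0)
        dH = dXhat @ W2.T
        dZ = dH * H * (1 - H)             # sigmoid derivative
        dW1 = X.T @ dZ; db1 = dZ.sum(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return (lambda A: sigmoid(A @ W1 + b1)), losses

# Greedy layer-wise stacking: layer 2 is trained on layer 1's codes.
rng = np.random.default_rng(1)
X = rng.random((64, 8))
enc1, losses1 = train_autoencoder(X, n_hidden=4)
enc2, losses2 = train_autoencoder(enc1(X), n_hidden=2)
deep_code = enc2(enc1(X))   # 2-D code produced by the stacked encoder
print(deep_code.shape)      # (64, 2)
```

Note the key trick from the slide: the second auto-encoder never sees the raw images, only the layer-1 codes, which it uses as both its input and its reconstruction target.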

SLIDE 11

Final result: trained neural network

Slide credit: Dan Klein and Pieter Abbeel

SLIDE 12

Real-Time Human Pose Recognition in Parts from Single Depth Images. Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake. CVPR 2011.

SLIDE 13

Toy example: distinguish left (L) and right (R) sides of the body.

[Figure: an image window centred at x is passed down a small decision tree. Internal nodes test feature responses f(I, x; Δ1) > θ1 and f(I, x; Δ2) > θ2, branching on yes/no; each leaf stores a class distribution P(c) over L and R.]

SLIDE 14

Training one tree [Breiman et al. 84]:

  • Node n receives the set Q_n = {(I, x)} (over all pixels) and splits it with the test f(I, x; Δ_n) > θ_n: “no” goes to the left child (Q_l), “yes” to the right child (Q_r); each node stores a distribution P_n(c) over body parts c.
  • Take (Δ, θ) that maximises the information gain (i.e., reduces entropy):

    ΔE = E(Q_n) − |Q_l|/|Q_n| · E(Q_l) − |Q_r|/|Q_n| · E(Q_r)

    where E(Q) is the entropy of the body-part distribution in Q.
  • Goal: drive the entropy at the leaf nodes to zero.
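Choosing the split that maximises the information gain can be sketched as a single-node example. The feature responses, thresholds, and labels below are made up for illustration, and entropy is measured in bits.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (bits) of the empirical class distribution."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_split(features, labels, thresholds):
    """Pick the threshold maximising the information gain
    E(Q_n) - |Q_l|/|Q_n| E(Q_l) - |Q_r|/|Q_n| E(Q_r),
    as when training one node of a decision tree."""
    n = len(labels)
    base = entropy(labels)
    best = (None, -1.0)
    for theta in thresholds:
        right = features > theta             # the "yes" branch
        ql, qr = labels[~right], labels[right]
        if len(ql) == 0 or len(qr) == 0:
            continue                         # degenerate split, skip
        gain = base - len(ql)/n * entropy(ql) - len(qr)/n * entropy(qr)
        if gain > best[1]:
            best = (theta, gain)
    return best

# Toy data: the feature response separates classes L and R perfectly at 0.5
f = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
c = np.array(['L', 'L', 'L', 'R', 'R', 'R'])
theta, gain = best_split(f, c, thresholds=[0.25, 0.5, 0.75])
print(theta, gain)  # 0.5 1.0  (a perfect split gains the full 1 bit)
```

A perfect split drives the entropy of both children to zero, which is exactly the training goal stated on the slide.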

SLIDE 15

  • Each tree is trained on a different random subset of the images: “bagging” helps avoid over-fitting [Amit & Geman 97] [Breiman 01] [Geurts et al. 06]
  • At test time, average the tree posteriors: the same (I, x) is passed down every tree 1 … T, each yielding P_t(c), and

    P(c | I, x) = (1/T) Σ_{t=1}^{T} P_t(c | I, x)
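The forest’s posterior averaging is a one-line reduction over the per-tree distributions. The three per-tree posteriors below are made up for illustration.

```python
import numpy as np

# Each tree t yields P_t(c | I, x) for the same pixel; the forest output
# is the average P(c|I,x) = (1/T) * sum_t P_t(c|I,x).
tree_posteriors = np.array([
    [0.7, 0.2, 0.1],   # tree 1's distribution over 3 body parts
    [0.5, 0.4, 0.1],   # tree 2
    [0.6, 0.1, 0.3],   # tree 3
])
forest_posterior = tree_posteriors.mean(axis=0)
print(forest_posterior)  # close to [0.6, 0.233, 0.167]
```

Because each row is a valid distribution, the average is too, so the forest’s output can be used directly as a per-pixel body-part posterior.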

SLIDE 16

Step 3: hypothesize body joints.

  • Define a 3D world-space density over joint proposals:

    f(x̂) ∝ Σ_i w_i · exp( −‖ (x̂ − x̂_i) / b ‖² )

    where i is the pixel index, x̂_i is the 3D coordinate back-projected from the i-th pixel, b is the bandwidth, and the pixel weight w_i combines the inferred body-part probability with the depth at the i-th pixel.
  • Mean shift for mode detection.

SLIDE 17

Mean shift

[Figure: a search window is repeatedly re-centred on the center of mass of the points inside it; the mean shift vector is the offset from the window centre to that center of mass.]

Slide by Y. Ukrainitz & B. Sarel

SLIDE 18
Mean shift clustering

  • Attraction basin: the region for which all trajectories lead to the same mode
  • Cluster: all data points in the attraction basin of a mode

Slide by Y. Ukrainitz & B. Sarel
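A minimal mean-shift sketch (Gaussian kernel, fixed bandwidth; the blob data and all parameters are illustrative, and the optional per-point weights mirror the weighted density used for joint proposals): starting points in different attraction basins converge to different modes, which is exactly how mean-shift clustering groups points.

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth, steps=50, weights=None):
    """Follow the mean-shift trajectory from `start` to a density mode.
    Each step moves to the (weighted) center of mass of the points,
    using Gaussian weights exp(-||x - p||^2 / bandwidth^2)."""
    x = np.asarray(start, dtype=float)
    w = np.ones(len(points)) if weights is None else weights
    for _ in range(steps):
        d2 = ((points - x) ** 2).sum(axis=1)
        k = w * np.exp(-d2 / bandwidth ** 2)
        x_new = (k[:, None] * points).sum(axis=0) / k.sum()
        if np.allclose(x_new, x):   # converged: mean-shift vector ~ zero
            break
        x = x_new
    return x

# Two well-separated blobs: starts in different attraction basins
# converge to different modes.
rng = np.random.default_rng(0)
blob_a = rng.normal([0, 0], 0.1, (30, 2))
blob_b = rng.normal([5, 5], 0.1, (30, 2))
points = np.vstack([blob_a, blob_b])
mode_a = mean_shift_mode(points, start=[0.5, 0.5], bandwidth=1.0)
mode_b = mean_shift_mode(points, start=[4.5, 4.5], bandwidth=1.0)
print(mode_a.round(1), mode_b.round(1))  # two distinct modes near the blob centers
```

Clustering then amounts to running this trajectory from every data point and grouping points whose trajectories end at the same mode.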

SLIDE 19

Nearest Neighbor classification

  • Assign the label of the nearest training data point to each test data point

[Figure: Voronoi partitioning of feature space for 2-category 2D data (from Duda et al.). Black = negative, red = positive. A novel test example falls closest to a positive training example, so it is classified as positive.]

SLIDE 20

K-Nearest Neighbors classification

  • For a new point, find the k closest points from the training data
  • Labels of the k points “vote” to classify

[Figure (k = 5): black = negative, red = positive. If the query lands here, its 5 nearest neighbors are 3 negatives and 2 positives, so it is classified as negative.]

Source: D. Lowe
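Both slides condense into one function: k = 1 gives plain nearest-neighbor classification. The toy 2-D points below are made up, chosen so the first query’s 5 nearest neighbors are 3 negatives and 2 positives, as in the slide’s example.

```python
import numpy as np
from collections import Counter

def knn_classify(train_X, train_y, query, k=5):
    """Classify `query` by a majority vote of its k nearest training
    points (k = 1 is plain nearest-neighbor classification)."""
    d = np.linalg.norm(train_X - query, axis=1)   # Euclidean distances
    nearest = np.argsort(d)[:k]                   # indices of the k closest
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D data: negatives near the origin, positives near (1, 1)
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [0.4, 0.5]])
train_y = np.array(['neg', 'neg', 'neg', 'pos', 'pos', 'pos', 'pos'])
print(knn_classify(train_X, train_y, np.array([0.15, 0.15]), k=5))  # neg
```

The query’s 5 nearest neighbors are the three negatives plus the two closest positives, so the vote is 3–2 for negative.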

SLIDE 21

6+ million geotagged photos by 109,788 photographers

Annotated by Flickr users

SLIDE 22

Global texture: capturing the “Gist” of the scene

Oliva & Torralba IJCV 2001, Torralba et al. CVPR 2003

Capture global image properties while keeping some spatial information

Gist descriptor
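As a rough illustration of the idea only: this is NOT the Oliva & Torralba Gist implementation (which uses a bank of Gabor filters over multiple scales and orientations). The toy sketch below merely bins gradient-orientation energy on a coarse spatial grid, showing how a descriptor can capture global layout while discarding fine detail.

```python
import numpy as np

def gist_like(image, grid=4, n_orient=4):
    """Toy gist-like descriptor: per grid cell, sum gradient magnitude
    into a few orientation bins. Global layout survives; detail does not."""
    gy, gx = np.gradient(image.astype(float))      # per-axis gradients
    mag = np.hypot(gx, gy)
    orient = np.arctan2(gy, gx) % np.pi            # orientation in [0, pi)
    bins = np.minimum((orient / np.pi * n_orient).astype(int), n_orient - 1)
    h, w = image.shape
    desc = np.zeros((grid, grid, n_orient))
    cy, cx = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            cell = slice(i * cy, (i + 1) * cy), slice(j * cx, (j + 1) * cx)
            for o in range(n_orient):
                desc[i, j, o] = mag[cell][bins[cell] == o].sum()
    return desc.ravel()   # a grid*grid*n_orient global descriptor

img = np.zeros((64, 64))
img[:, 32:] = 1.0                     # a single vertical edge
d = gist_like(img)
print(d.shape)  # (64,)
```

All of the edge energy lands in one orientation bin of the grid cells the edge passes through, so the descriptor records roughly where and how the scene is structured without encoding individual pixels.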

SLIDE 23

[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

SLIDE 24

The Importance of Data

[Hays and Efros. im2gps: Estimating Geographic Information from a Single Image. CVPR 2008.]

SLIDE 25

Recap

  • Deep learning for image recognition
  • Body pose estimation from decision forests
  • Non-parametric scene recognition
  • Visual recognition tasks with supervised classification
  • Variety of features and models
  • Training data quality and/or quantity essential