Visual Recognition: Prospects for Image & Video Analytics


slide-1
SLIDE 1

Visual Recognition: Prospects for Image & Video Analytics

Jitendra Malik University of California at Berkeley

slide-2
SLIDE 2

Computer Vision Group UC Berkeley

Classification & Segmentation

Tiger Grass Water Sand

outdoor

wildlife Tiger tail eye legs head back shadow mouth

slide-3
SLIDE 3

PASCAL Visual Object Challenge

slide-4
SLIDE 4

We want to locate the object

  • Orig. Image

Segmentation

  • Orig. Image

Segmentation

slide-5
SLIDE 5

Fifty years of computer vision 1963-2013

  • 1960s: Beginnings in artificial intelligence, image processing and pattern recognition
  • 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins …
  • 1980s: Vision as applied mathematics: geometry, multi-scale analysis, probabilistic modeling, control theory, optimization
  • 1990s: Geometric analysis largely completed, vision meets graphics, statistical learning approaches resurface
  • 2000s: Significant advances in visual recognition, range of practical applications

slide-6
SLIDE 6
slide-7
SLIDE 7

Handwritten digit recognition (MNIST, USPS)

  • LeCun’s Convolutional Neural Networks variations (0.8%, 0.6% and 0.4% error on MNIST)
  • Tangent Distance (Simard, LeCun & Denker: 2.5% error on USPS)
  • Randomized Decision Trees (Amit, Geman & Wilder: 0.8% error)
  • K-NN based Shape context/TPS matching (Belongie, Malik & Puzicha: 0.6% error on MNIST)

slide-8
SLIDE 8

EZ-Gimpy Results (Mori & Malik, 2003)

  • 171 of 192 images correctly identified: 92%

horse smile canvas spade join here

slide-9
SLIDE 9

Results on various images submitted to the CMU on-line face detector http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

Face Detection

Carnegie Mellon University

slide-10
SLIDE 10

Multiscale sliding window

Ask this question repeatedly, varying position, scale, category…

Paradigm introduced by Rowley, Baluja & Kanade 96 for face detection

Viola & Jones 01, Dalal & Triggs 05, Felzenszwalb, McAllester, Ramanan 08
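The repeated question on this slide can be sketched as a loop over scales and positions. This is a minimal sketch, not any of the cited detectors: `score_window` is a hypothetical stand-in for a trained classifier (it just returns mean intensity), and the window size, stride, scale factors, and threshold are assumed values. The loop over categories is omitted.

```python
import numpy as np

def downscale(image, factor):
    """Nearest-neighbour downscale by an integer factor (assumed for simplicity)."""
    return image[::factor, ::factor]

def score_window(patch):
    """Hypothetical stand-in for a trained classifier: mean patch intensity."""
    return float(patch.mean())

def sliding_window_detect(image, window=8, stride=4, factors=(1, 2), threshold=0.9):
    """Return (row, col, size, score) boxes in original-image coordinates."""
    detections = []
    for f in factors:                                   # vary scale
        scaled = downscale(image, f)
        h, w = scaled.shape
        for r in range(0, h - window + 1, stride):      # vary vertical position
            for c in range(0, w - window + 1, stride):  # vary horizontal position
                s = score_window(scaled[r:r + window, c:c + window])
                if s >= threshold:
                    # Map the box back to original-image coordinates.
                    detections.append((r * f, c * f, window * f, s))
    return detections

# Toy image: a bright 16x16 square on a dark 32x32 background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0
detections = sliding_window_detect(image)
```

On the toy image, the coarser scale yields a single box covering the whole square, while the finer scale yields several smaller boxes inside it; real detectors typically merge such overlapping responses.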

slide-11
SLIDE 11

Caltech-101 [Fei-Fei et al. 04]

  • 102 classes, 31-300 images/class
slide-12
SLIDE 12

Caltech 101 classification results (even better by combining cues..)

slide-13
SLIDE 13

PASCAL Visual Object Challenge

slide-14
SLIDE 14
slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17

Trying to find stick figures is hard (and unnecessary!)

Generalized Cylinders (Binford, Marr & Nishihara); Geons (Biederman)

slide-18
SLIDE 18

Person detection is challenging

slide-19
SLIDE 19

Can we build upon the success of faces and pedestrians?

  • Pattern matching
  • Capture patterns that are common and visually characteristic
  • Are these the only two common and characteristic patterns?

Rowley, Baluja, Kanade CVPR96; Viola and Jones, IJCV01 …; Dalal and Triggs, CVPR05 …

slide-20
SLIDE 20

Poselets

We will train classifiers for these different visual patterns

slide-21
SLIDE 21

Segmenting people

[Bourdev, Maji, Brox and Malik, ECCV10]

Best person segmentation on PASCAL 2010 dataset

slide-22
SLIDE 22

Describing people

  • “A person with long pants”
  • “A man with short hair and long sleeves”
  • “A man with short hair, glasses, short sleeves and shorts”
  • “A woman with long hair, glasses and long pants” (??)

slide-23
SLIDE 23

Male or female?

slide-24
SLIDE 24

Gender classifier per poselet is much easier to train

slide-25
SLIDE 25

Is male

slide-26
SLIDE 26

Has long hair

slide-27
SLIDE 27

Wears long pants

slide-28
SLIDE 28

Wears a hat

slide-29
SLIDE 29

Wears long sleeves

slide-30
SLIDE 30

Wears glasses

slide-31
SLIDE 31

Actions in still images have characteristic:

  • pose and appearance
  • interaction with objects and agents

slide-32
SLIDE 32

Some discriminative poselets

slide-33
SLIDE 33

Problem: Human Activity Recognition

12/20/2011 SMARTS Annual Review 2011

Approach: Learn pose and appearance specific for an action

Mean Performance: 59.7% correct

slide-34
SLIDE 34

Results: Top Confusions

slide-35
SLIDE 35

Low-Cost Automated Tuberculosis Diagnostics Using Mobile Microscopy

Jeannette Chang¹, Pablo Arbelaez¹, Neil Switz², Clay Reber², Asa Tapley²,³, Lucian Davis³, Adithya Cattamanchi³, Daniel Fletcher², and Jitendra Malik¹

¹ Department of Electrical Engineering and Computer Science, UC Berkeley
² Department of Bioengineering, UC Berkeley
³ Medical School and San Francisco General Hospital, UC San Francisco

slide-36
SLIDE 36

Why Tuberculosis?

  • Mortality and Treatment¹
  • TB is the second leading cause of death from infectious disease worldwide (after HIV/AIDS)
  • Highly effective antibiotic treatment
  • Current Diagnostics
  • Technicians screen microscopic images of sputum smears manually
  • Other methods include culture and PCR
  • Tremendous potential benefit from automated processing or classification

Examples of sputum smears with TB bacteria. Brightfield (top) and fluorescent (bottom) microscopy.²

1. http://www.who.int/tb/publications/global_report/2011/gtbr11_full.pdf
2. http://www.thehindu.com/health/rx/article21138.ece

slide-37
SLIDE 37

Candidate TB Blob Identification → Feature Extraction → Linear SVM Classification

Input image from CellScope device

Each candidate TB object is characterized by a feature vector containing 8 Hu moment invariants and 14 geometric/photometric descriptors.

[Figure: array of candidate TB objects, with output vector $(y_1, \ldots, y_O)^\top$]

[Figure: candidate TB objects sorted by their SVM output confidence scores in decreasing order (row-wise, from top to bottom)]

[Figure: bar plot of SVM output confidence scores for the sorted candidate TB objects. x-axis: Candidate Object Index; y-axis: SVM Output Confidence Score]

Sample subset of candidate TB objects with corresponding confidence scores: 0.918, 0.885, 0.389, 0.374, 0.008, 0.002, 0.001, 0.000
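The last stage of the pipeline, scoring each candidate with a linear SVM and sorting by confidence, might be sketched as follows. The weights, bias, and feature values are random placeholders, and mapping the SVM margin to a 0 to 1 score with a logistic function is an assumption; the slides do not state how the confidence scores are obtained.

```python
import numpy as np

N_FEATURES = 22  # 8 Hu moment invariants + 14 geometric/photometric descriptors

def svm_confidence(features, weights, bias):
    """Linear SVM decision value w.x + b, squashed into the unit interval with
    a logistic function (an assumed calibration, not taken from the slides)."""
    margin = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-margin))

def rank_candidates(features, weights, bias):
    """Return candidate indices and scores sorted by decreasing confidence."""
    scores = svm_confidence(features, weights, bias)
    order = np.argsort(-scores)
    return order, scores[order]

rng = np.random.default_rng(0)
features = rng.normal(size=(5, N_FEATURES))  # placeholder feature vectors
weights = rng.normal(size=N_FEATURES)        # placeholder "trained" weights
order, sorted_scores = rank_candidates(features, weights, bias=0.0)
```

Here `order` gives the display order of the candidate objects and `sorted_scores` the heights of the corresponding bars.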

slide-38
SLIDE 38

Sample positive objects Sample negative objects

Sample Candidate Objects

slide-39
SLIDE 39

Patches in Descending Order of Confidence

slide-40
SLIDE 40

[Figure: bar plot of normalized SVM weights per feature, on a scale from 0 to 1]

Features listed in descending order of normalized SVM weights: MinIntensity, φ1, Perimeter, FilledArea, Area, φ5, φ7, φ6, φ4, φ11, MaxIntensity, EulerNumber, Extent, φ3, ConvexArea, Solidity, MajorAxisLength, EquivDiameter, φ2, MinorAxisLength, Eccentricity, MeanIntensity.
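One plausible reading of "normalized SVM weights" is the absolute weight of each feature divided by the largest absolute weight, so that the top feature scores 1. A minimal sketch under that assumption, using made-up weights for three of the feature names:

```python
def rank_features(weights):
    """Sort feature names by |weight| / max |weight|, largest first."""
    largest = max(abs(w) for w in weights.values())
    normalized = {name: abs(w) / largest for name, w in weights.items()}
    return sorted(normalized.items(), key=lambda item: item[1], reverse=True)

# Placeholder weights for illustration only, not values from the slides.
example_weights = {"MinIntensity": -2.0, "Perimeter": 1.0, "Eccentricity": 0.5}
ranking = rank_features(example_weights)
```

Taking the absolute value means a feature with a large negative weight still ranks as influential.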

[Figure: SS/RP curves. x-axis: Sensitivity (Recall); y-axis: Specificity or Precision. Avg spec: 0.96744, Avg prec: 0.95389, cost exp: 7. Curves: train-SS, train-RP, test-SS, test-RP]

Object-Level Performance (Uganda Data)
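The quantities plotted on sensitivity/specificity (SS) and recall/precision (RP) curves can be computed from object labels and confidence scores at a given threshold. The sketch below uses made-up labels and scores; sweeping the threshold would trace out the curves.

```python
def confusion_counts(labels, scores, threshold):
    """Count true/false positives and negatives at a confidence threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label:
            tp += 1
        elif predicted_positive and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sensitivity_specificity_precision(labels, scores, threshold):
    """Sensitivity (recall) = TP/(TP+FN), specificity = TN/(TN+FP),
    precision = TP/(TP+FP); each is 0.0 when its denominator is zero."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity, precision

# Made-up example: 1 = TB object, 0 = non-TB object.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]
sens, spec, prec = sensitivity_specificity_precision(labels, scores, threshold=0.5)
```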

slide-41
SLIDE 41

Slide-Level Performance (Uganda Data)