Unsupervised Discovery of Mid-level Discriminative Patches



  1. Unsupervised Discovery Of Mid-level Discriminative Patches Saurabh Singh (ss1@andrew.cmu.edu), RI

  2. Which representation seems intuitive?

  3. Spectrum of Visual Features, from Low-Level to High-Level: Pixel, Filter-Banks, Sparse-SIFT, Image Segments, Visual Words, Parts, Objects

  4. Visual Words or Letters?

  5. Spectrum of Visual Features, from Low-Level to High-Level: Pixel, Filter-Banks, Sparse-SIFT, Image Segments, Visual Words, Our Approach (Mid-Level Discriminative Patches), Parts, Objects

  6. Discriminative Patches. Two key requirements: 1. Representative: need to occur frequently enough. 2. Discriminative: need to be different enough from the rest of the visual world.

  7. First some examples

  8. Unsupervised Discovery of Discriminative Patches. Given a “discovery dataset”, find a relatively small number of discriminative patches that represent it well. We assume access to a “natural world” dataset, which captures the visual statistics of the world in general. Dataset: a subset of Pascal VOC 2007 with six categories.

  9. Visual Word Approach • Sample a lot of patches from the discovery dataset (represented in terms of their features*) at various locations and scales. • Perform some form of unsupervised clustering (e.g. K-Means). This doesn’t work well. * We use Histogram of Oriented Gradients (HOG) features.
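A minimal sketch of this visual-word baseline, assuming the patches have already been sampled and cropped as same-size grayscale arrays. HOG features are confirmed by the slide; scikit-image's `hog` and scikit-learn's `KMeans` are stand-ins chosen here, not tools named in the talk.

```python
# Visual-word baseline sketch: HOG features + plain K-Means clustering.
# Library choices are illustrative; the talk does not name an implementation.
import numpy as np
from skimage.feature import hog
from sklearn.cluster import KMeans

def hog_descriptor(patch):
    """Compute a HOG descriptor for one grayscale patch (2-D array)."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def visual_word_baseline(patches, n_clusters=1000):
    """Cluster sampled patches into 'visual words' with unsupervised K-Means."""
    features = np.stack([hog_descriptor(p) for p in patches])
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(features)
    return kmeans  # kmeans.labels_ gives each patch's cluster (visual word)
```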

  10. K-Means Clusters

  11. Chicken-Egg Problem • If we know that a set of patches is visually similar, we can easily learn a distance metric for them. • If we know the distance metric, then we can easily find other members.

  12. Discriminative Clustering • Initialize using K-Means. • Train a discriminative classifier to represent the distance function (treating other clusters as negative examples). • Re-assign each patch to the cluster whose classifier gives the highest score. • Repeat.

  13. Discriminative Clustering* • Initialize using K-Means. • Train a discriminative classifier to represent the distance function (using the “natural world” dataset as negative data). • Detect the patches and assign them to clusters. • Repeat.
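A compact sketch of the iterative procedure on slides 12-13, assuming HOG features are precomputed as rows of feature matrices. `LinearSVC` stands in for the per-cluster discriminative classifier; the function names, hyperparameters, and cluster count are illustrative choices, not the paper's exact settings.

```python
# Sketch of Discriminative Clustering* (slides 12-13).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def train_cluster_classifiers(feats, labels, natural_feats, n_clusters):
    """One classifier per cluster: its members vs. 'natural world' negatives."""
    classifiers = []
    for k in range(n_clusters):
        members = feats[labels == k]
        if len(members) == 0:
            classifiers.append(None)
            continue
        X = np.vstack([members, natural_feats])
        y = np.r_[np.ones(len(members)), np.zeros(len(natural_feats))]
        classifiers.append(LinearSVC(C=0.1).fit(X, y))
    return classifiers

def assign_to_clusters(feats, classifiers):
    """Detection step: give each patch to the highest-scoring classifier."""
    scores = np.column_stack([
        clf.decision_function(feats) if clf is not None
        else np.full(len(feats), -np.inf)
        for clf in classifiers])
    return scores.argmax(axis=1)

def discriminative_clustering(discovery_feats, natural_feats,
                              n_clusters=50, n_iters=4):
    # Initialize with K-Means, then alternate training and detection.
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(discovery_feats)
    for _ in range(n_iters):
        clfs = train_cluster_classifiers(discovery_feats, labels,
                                         natural_feats, n_clusters)
        labels = assign_to_clusters(discovery_feats, clfs)
    return labels, clfs
```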

  14. Discriminative Clustering* [Example clusters: initial vs. final.]

  15. Discriminative Clustering+ • Split the discovery dataset into two equal parts {Training, Validation}. • Perform the training step of Discriminative Clustering* on the Training set. • Perform the detection step of Discriminative Clustering* on the Validation set. • Exchange the roles of the Training and Validation sets. • Repeat.
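A skeleton of this cross-validated variant, reusing the hypothetical helpers from the previous sketch. The split, role exchange, and alternation follow the bullet points above; everything else (names, iteration count, seeding) is an assumption.

```python
# Skeleton of Discriminative Clustering+ (slide 15): two halves of the
# discovery set alternate between training and validation roles.
import numpy as np

def discriminative_clustering_plus(discovery_feats, natural_feats, init_labels,
                                   n_clusters=50, n_iters=4, seed=0):
    rng = np.random.default_rng(seed)
    half_mask = rng.permutation(len(discovery_feats)) < len(discovery_feats) // 2
    halves = [np.where(half_mask)[0], np.where(~half_mask)[0]]
    labels = init_labels.copy()
    train, val = 0, 1
    for _ in range(n_iters):
        # Train per-cluster classifiers on the training half only...
        clfs = train_cluster_classifiers(discovery_feats[halves[train]],
                                         labels[halves[train]],
                                         natural_feats, n_clusters)
        # ...then detect on the held-out half and re-assign its patches.
        labels[halves[val]] = assign_to_clusters(discovery_feats[halves[val]], clfs)
        # Exchange the roles of the training and validation halves.
        train, val = val, train
    return labels
```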

  16. Discriminative Clustering+ [Example cluster evolution: K-Means initialization, then iterations 1-4.]

  17. Discriminative Clustering+ [Another example cluster evolution: K-Means initialization, then iterations 1-4.]

  18. More Results

  19. Image in terms of D+ Patches

  20. Ranking Patches • Purity: homogeneity of the cluster, approximated by the mean SVM score of its top few members. • Discriminativeness: how rare the patches are in the “natural world”, approximated by how often the cluster fires in the “discovery dataset” relative to its firings in both datasets combined. The final ranking combines both.
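A hedged formalization of the two scores, with notation introduced here for illustration (the slide gives only verbal definitions). Let $s_c(\cdot)$ be cluster $c$'s SVM score and $x_1,\dots,x_r$ its top $r$ detections:

\[
\text{purity}(c) \approx \frac{1}{r}\sum_{i=1}^{r} s_c(x_i),
\qquad
\text{disc}(c) \approx \frac{\#\{\text{firings of } c \text{ on the discovery set}\}}{\#\{\text{firings of } c \text{ on discovery and natural world combined}\}}.
\]

The final ranking combines the two scores; the exact combination is not given on the slide.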

  21. Top Ranked Patches

  22. Doublets: Spatially Consistent Pairs

  23. Doublets: Refinement

  24. Discovered Doublets

  25. Discovered Doublets

  26. Evaluation • Comparison with Visual Words • A dictionary of 1000 visual words is compared against 1000 discriminative clusters.

  27. Evaluation: Purity. [Plot: cluster purity vs. number of clusters (up to 1000), comparing Visual Words with our approach.]

  28. Evaluation: Coverage. [Plot: dataset coverage vs. number of clusters (up to 1000), comparing Visual Words with our approach.]

  29. Supervised Image Classification

                           Bus    Horse  Train  Sofa   Dining Table  Motorbike  Average
      Vis-Word             0.45   0.70   0.60   0.59   0.41          0.51       0.54
      D-Pats               0.60   0.82   0.61   0.67   0.55          0.67       0.65
      D-Pats + Doublets    0.62   0.82   0.61   0.67   0.57          0.68       0.66

  30. Going Further: More Supervision • Discovering using category labels. • Per-category clustering.

  31. Using Labels (table, horse). [Two precision-recall curves: AP 0.356 with AP at 0.1 recall 0.098, and AP 0.340 with AP at 0.1 recall 0.094.]

  32. Using Labels. [Two precision-recall curves: AP 0.270 with AP at 0.1 recall 0.088, and AP 0.240 with AP at 0.1 recall 0.084.]

  33. Per-Category Clustering • Discovery Dataset: Images belonging to a single category

  34. Top Patches Per-Scene Bookstore Cloister Buffet Bowling

  35. Top Patches Per-Scene Computer Room Laundromat Shoe Shop Waiting Room

  36. Thank You. Fun fact: only ~300,000 CPU hours consumed.

  37. • Histogram of gradient orientations (orientation and position) • Weighted by magnitude. *Borrowed from Alyosha’s slides
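A bare-bones illustration of that idea (not the full HOG pipeline): bin gradient orientations per cell and weight each vote by the gradient magnitude; block normalization and interpolation are omitted. Cell size and bin count here are illustrative defaults.

```python
# Magnitude-weighted orientation histograms per cell: the core of HOG.
import numpy as np

def orientation_histograms(gray, cell=8, bins=9):
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0           # unsigned orientation
    h, w = gray.shape
    hists = np.zeros((h // cell, w // cell, bins))
    for i in range(h // cell):
        for j in range(w // cell):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            a = ang[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            idx = np.minimum((a / (180.0 / bins)).astype(int), bins - 1)
            np.add.at(hists[i, j], idx, m)                  # magnitude-weighted votes
    return hists
```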

  38. Average Precision. [Precision-recall curve illustration; formulas from Wikipedia.]
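For reference, the standard definition the slide alludes to: average precision is the area under the precision-recall curve, usually computed as a finite sum over ranked detections,

\[
\text{AP} = \int_0^1 p(r)\,dr \;\approx\; \sum_{n} \big(R_n - R_{n-1}\big)\, P_n ,
\]

where $P_n$ and $R_n$ are the precision and recall at the $n$-th threshold.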

  39. Spatial Pyramid. [Diagram: grids at levels 0, 1, and 2; the level histograms are weighted by 1/4, 1/4, and 1/2 respectively.]
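The 1/4, 1/4, 1/2 weights are the standard spatial pyramid weighting of Lazebnik et al. for $L = 2$ levels:

\[
w_0 = \frac{1}{2^{L}}, \qquad w_\ell = \frac{1}{2^{\,L-\ell+1}} \;\;(\ell \ge 1)
\quad\Longrightarrow\quad
(w_0, w_1, w_2) = \left(\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{2}\right) \text{ for } L = 2 .
\]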
