Learning the right thing with visual attributes, Kristen Grauman

Learning the right thing with visual attributes
Kristen Grauman
Department of Computer Science, University of Texas at Austin
With Chao-Yeh Chen, Aron Yu, and Dinesh Jayaraman

Beyond image labels
What does it mean to understand an image?
Labels (Cow, Tree, Grass) vs. the story of an image: "A lone cow grazes in a bright green pasture near an old tree, probably in the Scottish Highlands."

Attributes
Examples: high heel, flat, metallic, brown, red, has-ornaments, four-legged, outdoors, indoors
• Mid-level semantic properties shared by objects
• Human-understandable and machine-detectable
[Ferrari & Zisserman 2007, Kumar et al. 2008, Farhadi et al. 2009, Lampert et al. 2009, Endres et al. 2010, Wang & Mori 2010, Berg et al. 2010, Parikh & Grauman 2011, …]

Using attributes: Visual search
• Person search [Kumar et al. 2008, Feris et al. 2013]. Example: "Suspect #1: Male, sunglasses, black and white hat, blue shirt"
• Relative feedback [Kovashka et al. 2012]. Example: "Like this… but more ornate"

Using attributes: Interactive recognition
Computer vision: "Cone-shaped beak?" User: "Yes." Computer vision: "American Goldfinch?"
[Branson et al. 2010, 2013]

Using attributes: Semantic supervision
• Zero-shot learning [Lampert et al. 2009]. Band-tailed pigeons: white collar, yellow feet, yellow bill, red breast.
• Training with relative descriptions [Parikh & Grauman 2011, Shrivastava & Gupta 2012]. Mules: shorter legs than donkeys, shorter tails than horses.
• Annotator rationales [Donahue & Grauman 2011]. Figure labels: "Strong body", "HOT", "NOT HOT".

Problem
With attributes, it’s easy to learn the wrong thing.
• Incidental correlations
• Spatially overlapping properties
• Subtle visual differences
• Partially category-dependent
• Variance in human-perceived definitions
…yet applications demand that correct meaning be captured!

Goal
Learn the right thing.
• How to decorrelate attributes that often occur simultaneously?
• Are attributes really class-independent?
• How to detect fine-grained attribute differences?

The curse of correlation
What will be learned from this training set?
Object learning: Cat

The curse of correlation
What will be learned from this training set?
Attribute learning: Forest animal? Brown? Has ears? Combinations?
Problem: Attributes that often co-occur cannot be distinguished by the learner.

The curse of correlation
Forest animal vs. Brown
Problem: Attributes that often co-occur cannot be distinguished by the learner.
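The co-occurrence problem can be made concrete with a toy example. This is a minimal sketch, not the experiment from the talk: the synthetic features, the least-squares classifier, and all names are illustrative assumptions. When "forest animal" and "brown" always co-occur in training, the two label vectors are identical, so any deterministic learner returns the same classifier for both and cannot tell them apart on a brown non-forest animal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: every forest animal is also brown,
# so the two attribute label vectors are identical.
n = 200
forest = rng.integers(0, 2, size=n)
brown = forest.copy()

# Two feature dimensions (one driven by each attribute) plus a bias column.
X = np.column_stack([forest, brown]) + 0.1 * rng.standard_normal((n, 2))
X = np.column_stack([X, np.ones(n)])

def fit_least_squares(X, y):
    """Fit a linear classifier to +/-1 targets by least squares."""
    w, *_ = np.linalg.lstsq(X, 2.0 * y - 1.0, rcond=None)
    return w

w_forest = fit_least_squares(X, forest)
w_brown = fit_least_squares(X, brown)

# Test image: brown cue present, forest cue absent.
x_test = np.array([0.0, 1.0, 1.0])
score_forest = float(x_test @ w_forest)
score_brown = float(x_test @ w_brown)

# Both detectors are the same function, so they give the same score.
print(np.allclose(w_forest, w_brown), score_forest, score_brown)
```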

Idea: Resist the urge to share
Forest animal and Brown “compete” for features.
Problem: Attributes that often co-occur cannot be distinguished by the learner.
JAYARAMAN ET AL., CVPR 2014

Semantic attribute groups
• Closely related attributes may share features
• Assume attribute “groups” from external knowledge.
JAYARAMAN ET AL., CVPR 2014

Standard approach: learning separately
[Equation not recovered: per-attribute loss function over the feature dimensions.]
JAYARAMAN ET AL., CVPR 2014

Proposed group-based formulation
Group-wise weight matrix with attribute groups S1, S2, S3 (e.g., color, texture, motion).
[Figure: steps labeled "compute row L2 norms", "penalize row L1 norms", and "penalize row L2 norms", annotated with "in-group sharing", "inter-group competition", and "feature sharing".]
JAYARAMAN ET AL., CVPR 2014
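The slide names the ingredients (row L2 norms within a group, an L1 combination across groups), but the formula itself did not survive extraction. A minimal sketch follows, assuming a squared L1-of-L2 penalty per feature row, and contrasting it with the L2,1 norm commonly used for all-sharing multi-task feature learning. The function names, the squared form, and the toy weight matrices are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def all_sharing_penalty(W):
    """L2,1 norm: sum over feature rows of the L2 norm across all attributes."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def group_competition_penalty(W, groups):
    """Per feature row: L2 norm within each group, summed (L1) across groups, squared."""
    total = 0.0
    for row in W:
        group_norms = [np.linalg.norm(row[idx]) for idx in groups]
        total += sum(group_norms) ** 2
    return float(total)

# W has one row per feature dimension and one column per attribute.
groups = [[0, 1], [2, 3]]  # hypothetical: two attribute groups

# One feature row used by both groups...
W_shared = np.array([[1.0, 1.0, 1.0, 1.0],
                     [0.0, 0.0, 0.0, 0.0]])
# ...versus each group claiming its own feature row.
W_split = np.array([[1.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 1.0]])

print(all_sharing_penalty(W_shared), all_sharing_penalty(W_split))
print(group_competition_penalty(W_shared, groups),
      group_competition_penalty(W_split, groups))
```

Under these assumptions the L2,1 norm is lower for `W_shared` (it rewards sharing a feature across all attributes), while the group penalty is lower for `W_split` (groups are pushed to use different features, and attributes inside a group can still share).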

Formulation effect
• Ours: inter-group competition, in-group sharing
• Standard: no relationships among attributes
• Sparse features multi-task learning: sharing and conflation across groups
[Figure: feature weights for Forest animal and Brown under each of the three formulations.]
JAYARAMAN ET AL., CVPR 2014

Results – Attribute detection
[Figure: bar charts of AP on the Birds, Pascal, and Animals datasets; series labels not recoverable.]
By decorrelating attributes, our attribute detectors generalize much better to novel unseen categories.
(*) Argyriou et al., Multi-task Feature Learning, NIPS 2007
(~) Farhadi et al., Describing Objects by Their Attributes, CVPR 2009
JAYARAMAN ET AL., CVPR 2014

Attribute detection example
Success cases: No mouth, Not brown underparts, No eye, Not boxy, No ear
Failure cases: No feather, Not furry, Eyeline, Black breast, Not vegetation
JAYARAMAN ET AL., CVPR 2014

Attribute localization examples
[Figure: localizations for Brown wing, Blue back, Olive back, and Crested head; rows compare Standard and Ours.]
Our method avoids conflation to learn the correct semantic attribute.
JAYARAMAN ET AL., CVPR 2014

Problem
Are attributes really category-independent?
Fluffy dog = Fluffy towel?

An intuitive but impractical solution
• Learn category-specific attributes? (e.g., fluffy dogs vs. non-fluffy dogs)
Impractical! Would need examples for all category-attribute combinations…

Idea: Analogous attributes
• Given a sparse set of category-specific models, infer “missing” analogous attribute classifiers
[Figure: (1) learned category-sensitive attribute classifiers for categories (Dog, Equine) and attributes (Striped, Brown, Spotted), where some category-attribute pairs have no training examples; (2) inferred attribute classifier for a missing pair; (3) prediction: "A striped dog? Yes."]
Chen & Grauman, CVPR 2014

Transfer via tensor completion
• Construct sparse object-attribute classifier tensor W (category × attribute)
• Discover low-d latent factors and infer missing classifiers (the analogous attributes)
Bayesian probabilistic tensor factorization [Xiong et al., SDM 2010].
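The idea of filling in missing classifiers from low-dimensional latent factors can be sketched in a few lines. Note that this swaps in a plain low-rank CP factorization fit by gradient descent, not the Bayesian probabilistic tensor factorization cited on the slide. The function name, tensor sizes, rank, and step size are all assumptions chosen for a small synthetic example.

```python
import numpy as np

def complete_classifier_tensor(W, observed, rank=2, steps=3000, lr=0.005, seed=0):
    """Fill missing (category, attribute) classifiers with a low-rank CP model.

    W: array (n_categories, n_attributes, dim) of classifier weight vectors.
    observed: boolean array (n_categories, n_attributes); False marks a missing classifier.
    Returns the completed tensor and the training loss at every step.
    """
    rng = np.random.default_rng(seed)
    n_cat, n_att, dim = W.shape
    C = 0.5 * rng.standard_normal((n_cat, rank))   # category factors
    A = 0.5 * rng.standard_normal((n_att, rank))   # attribute factors
    V = 0.5 * rng.standard_normal((dim, rank))     # classifier-dimension factors
    mask = observed[:, :, None].astype(float)
    W_obs = np.where(mask > 0, W, 0.0)             # ignore unobserved entries
    losses = []
    for _ in range(steps):
        pred = np.einsum("ck,ak,dk->cad", C, A, V)
        resid = mask * (pred - W_obs)
        losses.append(0.5 * float(np.sum(resid ** 2)))
        # Gradients of the squared error on observed entries only.
        grad_C = np.einsum("cad,ak,dk->ck", resid, A, V)
        grad_A = np.einsum("cad,ck,dk->ak", resid, C, V)
        grad_V = np.einsum("cad,ck,ak->dk", resid, C, A)
        C -= lr * grad_C
        A -= lr * grad_A
        V -= lr * grad_V
    return np.einsum("ck,ak,dk->cad", C, A, V), losses

# Synthetic low-rank tensor with roughly 30% of the classifiers hidden.
rng = np.random.default_rng(1)
n_cat, n_att, dim, true_rank = 6, 5, 4, 2
W_true = np.einsum("ck,ak,dk->cad",
                   rng.standard_normal((n_cat, true_rank)),
                   rng.standard_normal((n_att, true_rank)),
                   rng.standard_normal((dim, true_rank)))
observed = rng.random((n_cat, n_att)) < 0.7
W_hat, losses = complete_classifier_tensor(W_true, observed)
print(losses[0], losses[-1])
```

Each slice `W_hat[c, a]` at an unobserved pair plays the role of an inferred analogous attribute classifier for category `c` and attribute `a`.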

Datasets
• ImageNet attributes [Russakovsky & Fei-Fei 2010]
  – 9600 images
  – 384 object categories
  – 25 attributes
  – 1498 object-attribute pairs available
• SUN attributes [Patterson & Hays 2012]
  – 14340 images
  – 280 object categories
  – 59 attributes
  – 6118 object-attribute pairs available

Inferring class-sensitive attributes
84 total attributes, 664 object/scene classes
Our approach infers all 18K “missing” classifiers → savings of 348K labeled images
Category-sensitive outperforms status quo 76% of the time, average gain of 15 points in AP
[Figure: bar chart of average mAP; series labels not recoverable.]
Chen & Grauman, CVPR 2014

Which attributes are analogous?
[Figure: four numbered panels (1 to 4) of attribute groupings; the cell layout is not recoverable. Attributes shown include colors and materials (brown, red, yellow, white, gray, long, shiny, wet, wooden, smooth, rough, tiles, metal, wire, railing, carpet, leaves, foliage, grass, paper) and activities (socializing, eating, gaming, working, cleaning, congregating, conducting business, sailing/boating).]
Chen & Grauman, CVPR 2014

Problem: Fine-grained attribute comparisons
Which is more comfortable?

Relative attributes
Use ordered image pairs (e.g., “smiling more than”) to train a ranking function over image features.
[Equation not recovered: ranking function = …, defined on image features.]
[Parikh & Grauman, ICCV 2011; Joachims 2002]
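Training a ranking function from ordered pairs can be sketched as follows. This is a minimal illustration assuming a linear ranking function w · x and a pairwise hinge loss in the spirit of RankSVM [Joachims 2002], optimized here with plain subgradient descent. The function name, hyperparameters, and synthetic data are assumptions and do not reproduce the cited formulation.

```python
import numpy as np

def train_ranker(X, ordered_pairs, reg=0.01, lr=0.1, epochs=200):
    """Learn w so that w @ X[i] > w @ X[j] for every ordered pair (i, j).

    Minimizes a regularized pairwise hinge loss by subgradient descent.
    """
    diffs = np.array([X[i] - X[j] for i, j in ordered_pairs])
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        violated = diffs @ w < 1.0  # pairs not yet ranked with margin 1
        grad = reg * w - diffs[violated].sum(axis=0) / len(diffs)
        w -= lr * grad
    return w

# Synthetic data: the hidden attribute strength is the first feature.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))
pairs = []
for _ in range(100):
    i, j = rng.choice(len(X), size=2, replace=False)
    # Order each pair so the first image has "more" of the attribute.
    pairs.append((i, j) if X[i, 0] > X[j, 0] else (j, i))

w = train_ranker(X, pairs)
accuracy = float(np.mean([w @ X[i] > w @ X[j] for i, j in pairs]))
print(w, accuracy)
```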

Relative attributes
Rather than simply labeling images with their properties (not bright, smiling, not natural)…

Relative attributes
…we can compare images by the “strength” of each attribute (bright, smiling, natural).

Idea: Local learning for fine-grained relative attributes
• Lazy learning: train query-specific model on the fly.
• Local: use only pairs that are similar/relevant to test case.
[Figure: a test comparison and its relevant nearby training pairs.]
Yu & Grauman, CVPR 2014
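The two bullets above (train on the fly, using only nearby pairs) can be sketched as follows. The pair-to-pair distance, the hinge-loss ranker, and every name here are illustrative assumptions. In particular, the distance below is a simple stand-in and not necessarily the metric used in the cited work.

```python
import numpy as np

def pair_distance(test_pair, train_pair):
    """Distance between two image pairs, using the better of the two alignments."""
    (a, b), (c, d) = test_pair, train_pair
    straight = np.linalg.norm(a - c) + np.linalg.norm(b - d)
    crossed = np.linalg.norm(a - d) + np.linalg.norm(b - c)
    return min(straight, crossed)

def nearest_pairs(test_pair, train_pairs, k):
    """Return the k training pairs closest to the test pair."""
    dists = [pair_distance(test_pair, p) for p in train_pairs]
    return [train_pairs[i] for i in np.argsort(dists)[:k]]

def fit_local_ranker(pairs, reg=0.01, lr=0.1, epochs=200):
    """Linear ranker from (more, less) feature pairs via a pairwise hinge loss."""
    diffs = np.array([more - less for more, less in pairs])
    w = np.zeros(diffs.shape[1])
    for _ in range(epochs):
        violated = diffs @ w < 1.0
        w -= lr * (reg * w - diffs[violated].sum(axis=0) / len(diffs))
    return w

def compare_locally(test_pair, train_pairs, k=20):
    """True if the first test image is predicted to show more of the attribute."""
    w = fit_local_ranker(nearest_pairs(test_pair, train_pairs, k))
    a, b = test_pair
    return bool(w @ a > w @ b)

# Synthetic training pairs: attribute strength is the first feature,
# and each pair is stored as (more, less).
rng = np.random.default_rng(0)
train_pairs = []
for _ in range(200):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    train_pairs.append((x, y) if x[0] > y[0] else (y, x))

test_pair = (np.array([2.0, 0.0]), np.array([-2.0, 0.0]))
neighbors = nearest_pairs(test_pair, train_pairs, k=20)
first_has_more = compare_locally(test_pair, train_pairs, k=20)
print(len(neighbors), first_has_more)
```

The model is rebuilt for every query, which is the cost of lazy learning. In exchange, each ranker only has to explain comparisons that resemble the test case.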

Idea: Local learning for fine-grained relative attributes
[Figure: global vs. local ranking direction w, used to decide which of test images 1 and 2 shows more or less of the attribute.]
Yu & Grauman, CVPR 2014

UT Zappos50K Dataset
Large shoe dataset, consisting of 50,025 catalog images from Zappos.com
• 4 relative attributes
• High quality pairwise labels from mTurk workers
• 6,751 ordered labels + 4,612 “equal” labels
• 4,334 twice-labeled fine-grained labels (no “equal” option)
[Figure: example coarse and fine-grained comparison pairs for the attribute “open”.]
Yu & Grauman, CVPR 2014

Results: Fine-grained attributes
[Figure: accuracy of comparisons over all attributes, and accuracy on the 30 hardest test pairs; values not recoverable.]
Yu & Grauman, CVPR 2014

Predicting useful neighborhoods
• Most relevant points = most similar points?
• Pose as large-scale multi-label classification problem
[Figure: in training, each instance x_n and its neighborhood indicator vector y_n = [1, 0, 1, 1, 0, …, 0, 1] are mapped into a compressed label space; at test time, the indicator vector y_q = [0, 0, 0, 1, 1, …, 1, 0] for a query x_q is reconstructed from the predicted compressed labels.]
[Yu & Grauman NIPS 2014]
