Where have we been? Where are we going? Li Fei-Fei

The Beginning: CVPR 2009
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, ImageNet: A Large-Scale Hierarchical Image Database. IEEE Computer Vision and Pattern Recognition (CVPR), 2009.

The Impact of ImageNet

on Google Scholar: 4,386 Citations, 2,847 Citations …and many more.

From Challenge Contestants to Startups

A Revolution in Deep Learning
• “Why Deep Learning is Suddenly Changing Your Life” by Roger Parloff
• “The Great Artificial Intelligence Awakening” by Gideon Lewis-Kraus
• “The data that transformed AI research, and possibly the world” by Dave Gershgorn

“The ImageNet of x”
• SpaceNet: DigitalGlobe, CosmiQ Works, NVIDIA
• MusicNet: J. Thickstun et al, 2017
• Medical ImageNet: Stanford Radiology, 2017
• ShapeNet: A. Chang et al, 2015
• EventNet: G. Ye et al, 2015
• ActivityNet: F. Heilbron et al, 2015

An Explosion of Datasets
• 1627 Hosted Datasets
• 276 Commercial Competitions
• 1919 Student Competitions
• 1MM Data Scientists
• 4MM ML Models Submitted

“Datasets, not algorithms, might be the key limiting factor to development of human-level artificial intelligence.” Alexander Wissner-Gross, Edge.org, 2016

The Untold History of ImageNet

Hardly the First Image Dataset
• Segmentation (2001): D. Martin, C. Fowlkes, D. Tal, J. Malik
• CMU/VASC Faces (1998): H. Rowley, S. Baluja, T. Kanade
• FERET Faces (1998): P. Phillips, H. Wechsler, J. Huang, P. Raus
• COIL Objects (1996): S. Nene, S. Nayar, H. Murase
• MNIST digits (1998-10): Y. LeCun & C. Cortes
• KTH human action (2004): I. Laptev & B. Caputo
• Sign Language (2008): P. Buehler, M. Everingham, A. Zisserman
• UIUC Cars (2004): S. Agarwal, A. Awan, D. Roth
• 3D Textures (2005): S. Lazebnik, C. Schmid, J. Ponce
• CUReT Textures (1999): K. Dana, B. Van Ginneken, S. Nayar, J. Koenderink
• ESP (2006): Ahn et al, 2006
• CAVIAR Tracking (2005): R. Fisher, J. Santos-Victor, J. Crowley
• Middlebury Stereo (2002): D. Scharstein, R. Szeliski
• CalTech 101/256 (2005): Fei-Fei et al, 2004; Griffin et al, 2007
• LabelMe (2005): Russell et al, 2005
• Lotus Hill (2007): Yao et al, 2007
• TinyImage (2008): Torralba et al. 2008
• PASCAL (2007): Everingham et al, 2009
• MSRC (2006): Shotton et al. 2006

A Profound Machine Learning Problem Within Visual Learning

Machine Learning 101: Complexity, Generalization, Overfitting
[Figure: Error versus Capacity, with Training Error and Generalization Error curves, the Underfitting Zone, the Overfitting Zone, the Generalization Gap, and the Optimal Capacity marked]

One-Shot Learning Fei-Fei et al, 2003, 2004


How Children Learn to See

[Figure: Error versus Capacity, with Training Error and Generalization Error curves, the Underfitting Zone, the Overfitting Zone, the Generalization Gap, and the Optimal Capacity marked]

A new way of thinking… To shift the focus of Machine Learning for visual recognition from modeling… to data. Lots of data.

Internet Data Growth 1990-2010
[Chart: Global Data Traffic (PB/month), axis ticks at 3,750, 7,500, 11,250 and 15,000. Source: Cisco]

What is WordNet?
• Original paper by [George Miller, et al 1990] cited over 5,000 times
• Organizes over 150,000 words into 117,000 categories called synsets.
• Establishes ontological and lexical relationships in NLP and related tasks.

Christiane Fellbaum: Senior Research Scholar, Computer Science Department, Princeton; President, Global WordNet Consortium

Individually Illustrated WordNet Nodes: A massive ontology of images to transform computer vision
• jacket: a short coat
• German shepherd: breed of large shepherd dogs used in police work and as a guide for the blind.
• microwave: kitchen appliance that cooks food by passing an electromagnetic wave through it.
• mountain: a land mass that projects well above its surroundings; higher than a hill.

Comrades: Prof. Kai Li (Princeton); Jia Deng, 1st Ph.D. student (Princeton)

Step 1: Ontological structure based on WordNet (Entity - Mammal - Dog - German Shepherd)

Step 2: Populate categories with thousands of images from the Internet (Dog - German Shepherd)

Step 3: Clean results by hand (Dog - German Shepherd)

Three Attempts at Launching ImageNet

1st Attempt: The Psychophysics Experiment
ImageNet PhD Students; Miserable Undergrads

1st Attempt: The Psychophysics Experiment
• # of synsets: 40,000 (subject to: imageability analysis)
• # of candidate images to label per synset: 10,000
• # of people needed to verify: 2-5
• Speed of human labeling: 2 images/sec (one fixation: ~200 msec)
• Massive parallelism (N ~ 10^2 to 10^3)
• 40,000 × 10,000 × 3 / 2 = 600,000,000 sec ≈ 19 years of labeling time, to be divided by N
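The estimate on this slide can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the choice of 3 verifiers per image is the value used in the slide's formula (it lies inside the stated 2-5 range).

```python
# Back-of-the-envelope labeling time, using the numbers on the slide.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # ignores leap years


def labeling_seconds(synsets: int, images_per_synset: int,
                     verifiers_per_image: int, images_per_second: float) -> float:
    """Total human time, in seconds, to verify every candidate image."""
    judgments = synsets * images_per_synset * verifiers_per_image
    return judgments / images_per_second


def wall_clock_years(total_seconds: float, parallel_workers: int) -> float:
    """Elapsed years if the work is split evenly across parallel workers."""
    return total_seconds / parallel_workers / SECONDS_PER_YEAR


# 3 verifiers per image: the value used in the slide's formula.
total = labeling_seconds(40_000, 10_000, 3, 2)
single_worker_years = wall_clock_years(total, 1)

print(f"{total:,.0f} sec = {single_worker_years:.1f} years for one labeler")
for n in (100, 1000):  # the slide's N ~ 10^2 to 10^3
    print(f"N={n}: {wall_clock_years(total, n) * 365:.0f} days")
```

With one labeler the total comes to roughly 19 years; spread evenly over 100 labelers it drops to about 69 days, which is why parallelism dominates the rest of the story.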

2nd Attempt: Human-in-the-Loop Solutions

2nd Attempt: Human-in-the-Loop Solutions
• Machine-generated datasets can only match the best algorithms of the time.
• Human-generated datasets transcend algorithmic limitations, leading to better machine perception.

3rd Attempt: A Godsend Emerges
ImageNet PhD Students; Crowdsourced Labor
49k Workers from 167 Countries, 2007-2010

The Result: ImageNet Goes Live in 2009

What We Did Right

While Others Targeted Detail…
• LabelMe: Per-Object Regions and Labels (Russell et al, 2005)
• Lotus Hill: Hand-Traced Parse Trees (Yao et al, 2007)

… We Targeted Scale
• SUN, 131K [Xiao et al. ’10]
• LabelMe, 37K [Russell et al. ’07]
• ImageNet, 15M [Deng et al. ’09]
• PASCAL VOC, 30K [Everingham et al. ’06-’12]
• Caltech101, 9K [Fei-Fei, Fergus, Perona, ’03]

Additional Goals (Carnivore - Canine - Dog - Working Dog - Husky)
• High Resolution: To better replicate human visual acuity
• High-Quality Annotation: To create a benchmarking dataset and advance the state of machine perception, not merely reflect it
• Free of Charge: To ensure immediate application and a sense of community

An Emphasis on Community and Achievement: ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2010-2017)

ILSVRC Contributors
• Alex Berg (UNC Chapel Hill), Jia Deng (Univ. of Michigan), Zhiheng Huang (Stanford), Aditya Khosla (Stanford), Jonathan Krause (Stanford), Fei-Fei Li (Stanford)
• Sean Ma, Eunbyung Park, Olga Russakovsky, Sanjeev Satheesh, Hao Su, Wei Liu (UNC Chapel Hill and Stanford)

Our Inspiration: PASCAL VOC 2005-2012

Our Inspiration: PASCAL VOC
Mark Everingham, 1973-2012
Mark Everingham Prize @ ECCV 2016: Alex Berg, Jia Deng, Fei-Fei Li, Wei Liu, Olga Russakovsky

Participation and Performance
[Chart, 2010-2016: Number of Entries (labeled values 29, 35, 81, 123, 157, 172); Classification Errors (top-5), from 0.28 down to 0.03; Average Precision for Object Detection, from 0.23 up to 0.66]
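For readers unfamiliar with the top-5 classification error plotted above: a prediction counts as correct when the true label appears anywhere among the five highest-scoring classes. A minimal sketch in plain Python follows; the function name and the toy scores are illustrative assumptions, not the challenge's evaluation code.

```python
from typing import Sequence


def top_k_error(scores: Sequence[Sequence[float]],
                labels: Sequence[int], k: int = 5) -> float:
    """Fraction of examples whose true label is NOT among the k highest-scoring classes."""
    if len(scores) != len(labels) or len(labels) == 0:
        raise ValueError("scores and labels must be non-empty and the same length")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this example.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (made up): 2 examples, 6 classes, scores increase with class index.
toy_scores = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
              [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]]
toy_labels = [0, 5]  # class 0 is ranked last of 6, class 5 is ranked first
toy_error = top_k_error(toy_scores, toy_labels)
```

In the toy data the first example misses (its true class is ranked sixth) and the second hits, so half of the examples count as errors.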

What we did to make ImageNet better

Lack of Details

Lack of Details… ILSVRC Detection Challenge
Statistics (PASCAL VOC 2012 vs. ILSVRC 2013):
• Object classes: 20 vs. 200 (10x)
• Training images: 5.7K vs. 395K (70x)
• Training objects: 13.6K vs. 345K (25x)

Evaluation of ILSVRC Detection
Need to annotate the presence of all classes (to penalize false detections)
# images: 400K
# classes: 200
# annotations = 80M!
Per-image labels (+ present, - absent) for Table, Chair, Horse, Dog, Cat, Bird:
+ + - - - -
+ - - - + -
+ + - - - -

Evaluation of ILSVRC Detection Hierarchical annotation J. Deng, O. Russakovsky, J. Krause, M. Bernstein, A. Berg, & L. Fei-Fei. CHI, 2014
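To illustrate why a hierarchy can reduce annotation cost, here is a deliberately simplified sketch. It is not the method of the CHI 2014 paper; the two-level grouping of the six classes and all function names are hypothetical. The idea shown is only that one "no" at a parent node settles every class beneath it without further questions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical two-level hierarchy over six example classes.
HIERARCHY: Dict[str, List[str]] = {
    "furniture": ["table", "chair"],
    "animal": ["horse", "dog", "cat", "bird"],
}


def annotate_naive(present: Set[str]) -> Tuple[Dict[str, bool], int]:
    """Ask one yes/no question per class."""
    classes = [c for group in HIERARCHY.values() for c in group]
    return {c: c in present for c in classes}, len(classes)


def annotate_hierarchical(present: Set[str]) -> Tuple[Dict[str, bool], int]:
    """Ask about each group first; skip its classes when the group is absent."""
    labels: Dict[str, bool] = {}
    questions = 0
    for group, classes in HIERARCHY.items():
        questions += 1  # "Does the image contain any <group>?"
        group_present = any(c in present for c in classes)
        for c in classes:
            if group_present:
                questions += 1  # "Does the image contain a <c>?"
                labels[c] = c in present
            else:
                labels[c] = False  # inferred from the parent, no question asked
    return labels, questions


# `present` stands in for the human annotator's answers.
image = {"table", "chair"}  # an image with a table and a chair, no animals
naive_labels, naive_q = annotate_naive(image)
hier_labels, hier_q = annotate_hierarchical(image)

# Exhaustive labeling at the scale quoted for ILSVRC detection:
# 400K images x 200 classes.
EXHAUSTIVE_ANNOTATIONS = 400_000 * 200
```

Both strategies produce the same labels for this image, but the hierarchical one asks 4 questions instead of 6. The saving depends on most groups being absent from most images; an image containing something from every group would cost more questions than the naive approach.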

What does classifying 10K+ classes tell us? J. Deng, A. Berg & L. Fei-Fei, ECCV, 2010

Fine-Grained Recognition: “Cardigan Welsh Corgi” vs. “Pembroke Welsh Corgi”

Fine-Grained Recognition: cars, 2567 classes, 700k images [Gebru, Krause, Deng, Fei-Fei, CHI 2017]

Expected Outcomes
• Machine learning advances and changes dramatically
• Breakthroughs in object recognition
• ImageNet becomes a benchmark

Unexpected Outcomes

Neural Nets are Cool Again! Krizhevsky, Sutskever & Hinton, NIPS 2012 (13,259 Citations)

And Cooler and Cooler…
• “AlexNet” [Krizhevsky et al. NIPS 2012]
• “GoogLeNet” [Szegedy et al. CVPR 2015]
• “VGG Net” [Simonyan & Zisserman, ICLR 2015]
• “ResNet” [He et al. CVPR 2016]

A Deep Learning Revolution: Neural Nets, GPUs

Ontological Structure Not Used as Much

[Figure: “is a” hierarchy rooted at Thing, with intermediate nodes Animalia; Chordate, Arthropoda; Mammal, Insect; Primate, Carnivora, Diptera, Marsupial; Hominidae, Pongidae, Felidae, Muscidae; Homo, Pan, Felis, Musca; Sapiens, Troglodytes, Domestica, Leo, Domestica; and leaves Wombat, Human, Chimpanzee, House Cat, Lion, Housefly] Deng, Krause, Berg & Fei-Fei, CVPR 2012
