Learn CNNs from Large-scale Web Images without Human Annotations
Weilin Huang
Malong Technologies
How to Train a High-Performa[...]
June 26, 2017
ImageNet, Webvision
AlexNet, GoogleNet, VggNet, ResNet, DenseNet
Learning resource (what to do); learning strategy (how to do); model capability (how to do)
Performance on ImageNet has saturated: from [...]0% (2009) to ~2.2% (2017).
Develop new approaches that work on large-scale data in real-world scenarios.
Data, model architecture, loss, and training strategy are all important.
Training CNNs from web images is among the most common tasks in industry.
Train CNNs without human labelling --> weakly-supervised learning
Program Chairs General Chairs
Wen Li, Limin Wang, Wei Li
WebVision dataset: [...] images, 1,000 semantic concepts from ILSVRC 2012
Wen Li, Limin Wang, Wei Li, Eirikur Agustsson [...]. WebVision Database: Visual Learning and Understanding from Web Data. arXiv:1708.02862, 2017.
[Figure: example categories: Tench, Terrapin, Caretta]
Robust algorithms
Cleansing methods
--> Need a small, manually labeled set
Difficult to identify mislabeled samples from hard training samples
“Humans and animals learn much better when the examples are not randomly presented but organized in a meaningful order which illustrates gradually more concepts, and gradually more complex ones.”
Steps:
arXiv:1707.00183, 2017.
(Rodriguez and Laio, Science, 2014.)
[Figure: categories such as Tench and Terrapin, each split into Subset 1, Subset 2, Subset 3 (Subset 1 ... Subset N)]
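The slide cites Rodriguez and Laio (Science, 2014), whose clustering method rests on two per-sample quantities: a local density (the number of neighbours within a cutoff distance) and the distance to the nearest sample of higher density. The sketch below computes both in plain Python. The step that ranks samples by density and cuts them into equal-sized subsets is an assumption made for illustration, since the slides do not spell out how the subsets are formed; the function names and the cutoff `d_c` are likewise hypothetical.

```python
import math


def density_peak_stats(points, d_c):
    """Return (rho, delta) per point, as defined by Rodriguez and Laio (2014)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # delta_i: distance to the nearest point with strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, a highest-density point gets its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def split_by_density(points, d_c, n_subsets=3):
    """Assumption: rank samples by density, then cut into equal-sized subsets.

    Returns lists of point indices, densest subset first.
    """
    rho, _ = density_peak_stats(points, d_c)
    order = sorted(range(len(points)), key=lambda i: -rho[i])
    size = max(1, math.ceil(len(order) / n_subsets))
    return [order[s:s + size] for s in range(0, len(order), size)]
```

In practice the points would be feature vectors of the images in one category, so that the densest subset gathers the most mutually similar images and the sparsest subset gathers the outliers.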
[Figure: Subset 1, Subset 2, and Subset 3 alongside Task One, Task Two, and Task Three]
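\[
\frac{\partial f}{\partial w_{ij}^{m}} = \sum_{t} \sum_{k=1}^{n_t} \frac{\partial f}{\partial O_{k}^{m}} \cdot \frac{\partial O_{k}^{m}}{\partial w_{ij}^{m}} \cdot r_t
\]
where t runs over the subtasks (3 subtasks) and r_t is the sample weight, r = {1, 0.5, 0.5}: r_1 = 1, r_2 = 0.5, r_3 = 0.5.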
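To make the role of the sample weights concrete, here is a minimal sketch of a weighted gradient for a toy linear softmax classifier, where each sample's contribution is scaled by the weight of the subtask it belongs to ({1, 0.5, 0.5} as on the slide). The linear model, the cross-entropy loss, and all names are assumptions chosen for illustration; the slides do not specify the network or the loss.

```python
import math

# Sample weights per subtask, as given on the slide.
SUBTASK_WEIGHTS = {1: 1.0, 2: 0.5, 3: 0.5}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_gradient(W, batch, weights=SUBTASK_WEIGHTS):
    """Gradient of the weighted cross-entropy w.r.t. W for logits = W x.

    batch holds (x, label, subtask_id) triples; each sample's gradient is
    multiplied by the weight of its subtask before being accumulated.
    """
    grad = [[0.0] * len(W[0]) for _ in W]
    for x, label, subtask_id in batch:
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
        probs = softmax(logits)
        r = weights[subtask_id]
        for i, p in enumerate(probs):
            err = p - (1.0 if i == label else 0.0)  # d(loss)/d(logit_i)
            for j, xj in enumerate(x):
                grad[i][j] += r * err * xj  # chain rule, scaled by r
    return grad
```

With these weights, a sample from subtask 2 or 3 moves the parameters half as far as the same sample would from subtask 1, which is one way to keep noisier subsets from dominating the update.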
[Figure: training pipelines for Model-A, Model-B, Model-C, and Model-D. Labels: Meta Data, Subset 1, Curriculum 2 Subsets, Curriculum 3 Subsets, Curriculum Learning, Task Three]
[Figure labels: Baseline Model 1, Baseline Model 2, Curriculum Model 1, Curriculum Model 2; Task One]
Conv.1: Input Data -> 9x9, 5x5, and 7x7 convolutions -> Feature Map-1, Feature Map-2, Feature Map-3 -> Concat -> Feature Map
Enhances low-level features, which improves performance (by about 0.5%).
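The multi-scale first layer can be sketched as three parallel convolutions over the same input whose outputs are stacked along a channel axis. The version below is a dependency-free illustration on a single-channel image with one filter per branch. The uniform averaging kernels are placeholders for learned weights, and the zero padding that keeps all maps the same size is an assumption, since the slide gives only the kernel sizes and the concatenation.

```python
def conv2d_same(image, kernel):
    """Zero-padded 2-D cross-correlation that preserves the input size."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as 0
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def box_kernel(size):
    """Placeholder weights: a uniform averaging kernel (a real layer learns these)."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def multi_scale_conv1(image, sizes=(9, 5, 7)):
    """Run one convolution per kernel size and stack the maps as channels."""
    return [conv2d_same(image, box_kernel(s)) for s in sizes]
```

Because every branch pads to the input size, the three feature maps align pixel for pixel and can be concatenated directly before the next layer.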
[...]ng loss of four different models with Inception_v2 (also compared with K-means clustering in curriculum design)
Table 1. Different models based on Inception_v2 on validation [...]
Table 2. Model-D with various amounts of highly noisy data.
Table 3. Model-D with various networks
Improved on 668 categories, degraded on 195 categories, and unchanged on 137.
--> Train high-performance CNNs from large-scale web images
--> Better generalization capability
--> Improve our products where real-world data is crawled from the Internet with less human labelling, or where labels are inconsistent
--> Will develop [...]supervised and weakly-supervised approaches
--> Handle label inconsistency and data imbalance
We live in a world of products. In retail, manufacturing, and security scenarios, products need to be routinely recognized at a high-level, microscopic-level, and even the invisible (x-ray) level. If a machine can “see” products as well as people can, higher efficiency can be achieved in retail product checkouts, higher quality in manufacturing product testing, and higher safety via baggage scanning of products – just to name a few. Using breakthrough GPU-powered semi-supervised deep learning algorithms, scientists at Malong invented product recognition technology which operates at human-level performance across the full-stack of visual input levels – the big, the small, and the invisible, to help improve efficiency, quality, and safety, for our world.