INSIDE-OUTSIDE NET (ION): DETECTING OBJECTS IN CONTEXT WITH SKIP POOLING AND RECURRENT NEURAL NETWORKS
Sean Bell (Cornell University), Kavita Bala (Cornell University), Larry Zitnick (Microsoft Research, now at FAIR), Ross Girshick (Microsoft Research, now at FAIR)
ION TEAM
Sean Bell, Kavita Bala (Cornell University); Ross Girshick, Larry Zitnick (Microsoft Research, now both at FAIR)
SUMMARY: MS COCO DETECTION

                    test-competition   test-dev   Runtime
Competition         31.0%              31.2%      2.7 s
Post-Competition    -                  33.1%      5.5 s
Best Student Entry (3rd Place Overall)
Key pieces:
- New ION detector (+5.1 mAP)
- Better proposals, more data (+3.9 mAP)
- Better training/testing (+4.1 mAP)
(single ConvNet model, no ensembling)
Tech report: http://arxiv.org/pdf/1512.04143.pdf
ION DETECTOR
+5.1 mAP on COCO test-dev compared to Fast R-CNN
FAST R-CNN [GIRSHICK 2015]
[Architecture diagram: Input -> ConvNet -> conv5 -> "ROI Pooling" -> fc6 -> fc7 -> bbox, cls (feature extraction, then classification)]
Can we improve on feature extraction?
- For small objects, the footprint on conv5 might only cover a 1x1 cell, which gets upsampled to 7x7
- Only local features (inside the ROI) are used for classification
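The small-object problem above is a consequence of conv5's stride. A back-of-envelope sketch (assuming VGG16's effective stride of 16; not the authors' code):

```python
# Why small ROIs lose detail at conv5: an ROI is projected onto the
# conv5 feature map by dividing its coordinates by the stride (16 for
# VGG16), then ROI pooling resamples that footprint to a fixed 7x7 grid.

def conv5_footprint(roi_w, roi_h, stride=16):
    """Approximate size (cells) of an ROI's footprint on conv5."""
    return max(1, round(roi_w / stride)), max(1, round(roi_h / stride))

# A 20x20-pixel object covers roughly one conv5 cell, which ROI pooling
# then upsamples to 7x7 -- almost all spatial detail is already gone.
print(conv5_footprint(20, 20))    # -> (1, 1)
print(conv5_footprint(224, 224))  # -> (14, 14)
```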
LET'S ADD SKIP CONNECTIONS
[Diagram: conv3, conv4, conv5 -> concatenate -> dim reduction -> fc6 -> fc7 -> bbox, cls (feature extraction, then classification)]
[Sermanet 2013] [Hariharan 2015] [Liu 2015]
PROBLEM: FEATURE AMPLITUDE
- Different layers have very different amplitudes [Liu 2015]
- We must account for this to combine features
- L2 normalize to length 1, and then re-scale
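A minimal sketch of the normalize-then-rescale step (assumed shapes and an illustrative fixed scale value; not the authors' code):

```python
import numpy as np

# Combine ROI-pooled features from multiple layers: L2-normalize each
# layer's activations along the channel axis, rescale by a (learnable)
# scalar, and concatenate.

def l2_normalize_rescale(x, scale, eps=1e-12):
    """x: (channels, 7, 7) ROI-pooled features from one layer."""
    norm = np.sqrt((x ** 2).sum(axis=0, keepdims=True)) + eps
    return scale * x / norm

rng = np.random.default_rng(0)
conv3 = rng.normal(scale=10.0, size=(256, 7, 7))  # large amplitude
conv5 = rng.normal(scale=0.1, size=(512, 7, 7))   # small amplitude

# After normalization, both layers contribute at a comparable magnitude.
combined = np.concatenate([l2_normalize_rescale(conv3, scale=1000.0),
                           l2_normalize_rescale(conv5, scale=1000.0)],
                          axis=0)
print(combined.shape)  # -> (768, 7, 7)
```

In the full model the scale is a learned parameter per layer; here it is fixed only to show the mechanics.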
COMBINING ACROSS LAYERS
[Diagram: ROI-pooled conv3, conv4, conv5 -> normalize, concatenate, re-scale -> fc6 -> fc7 -> bbox, cls]
RESCALING FEATURE AMPLITUDES
[Bar chart: VOC2007 mAP (y-axis 20-80) as layers are added]

                            conv5   conv5+4   conv5+4+3   conv5+4+3+2
Naive                       70.8    69.7      63.6        49.3
With L2 norm + rescaling    71.5    74.4      74.6        74.6
(conv5 alone is the same as Fast R-CNN)
ION: INSIDE-OUTSIDE NET
Base ConvNet: VGG16 [Simonyan 2014]
[Diagram: conv5 convolutional features]
LATERAL RNN (MOVES ACROSS AN IMAGE)
[Diagram: hidden state -> output, which we interpret as context features]
- Repeat for each row
- Can compute each column in parallel
- We can also move in 4 different directions
[Schuster 1997], [Graves 2009], [Byeon 2015], [Visin 2015]
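The row-by-row sweep above can be sketched as follows (assumed shapes and weights; not the authors' implementation). Rows are visited sequentially; within each row, every column advances in a single matrix multiply, which is why columns are "free" to parallelize:

```python
import numpy as np

def lateral_rnn_down(x, W_ih, W_hh, bias):
    """Downward lateral RNN. x: (rows, cols, c_in).
    Returns hidden states at every position, shape (rows, cols, c_hid)."""
    rows, cols, _ = x.shape
    c_hid = W_hh.shape[0]
    h = np.zeros((cols, c_hid))
    out = np.zeros((rows, cols, c_hid))
    for r in range(rows):                  # sequential over rows
        # all columns update in one matrix multiply (parallel)
        h = np.maximum(0.0, x[r] @ W_ih.T + h @ W_hh.T + bias)  # ReLU
        out[r] = h
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 16))
out = lateral_rnn_down(x,
                       W_ih=rng.normal(size=(32, 16)) * 0.1,
                       W_hh=np.eye(32),   # identity init, as in IRNN
                       bias=np.zeros(32))
print(out.shape)  # -> (8, 8, 32)
```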
RNN IMPLEMENTATION
[Diagram: four RNNs sweep conv5 right, left, up, and down]
Abstract away the complexity: transpose everything to left-to-right and write a single GPU implementation
ReLU RNN: "IRNN" [Le 2015]
Merge the hidden-to-output transitions into a single conv
Share the input-to-hidden transition
Features used by our detector
Our final architecture: stack 2 RNNs together
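The four directional sweeps can be reduced to one routine by flipping and transposing the input, in the spirit of the "transpose everything and write a single GPU implementation" idea (a sketch with assumed shapes; not the authors' code):

```python
import numpy as np

def sweep_down(x, W_hh):
    """ReLU IRNN sweeping top-to-bottom over rows of x: (rows, cols, c).
    The shared input-to-hidden projection is assumed already applied."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        h = np.maximum(0.0, x[r] + h @ W_hh.T)
        out[r] = h
    return out

def four_direction_rnn(feat, W_hh):
    """Run the same sweep in 4 directions via flips/transposes, then
    concatenate along channels. A single 1x1 conv (not shown) would then
    merge the four hidden-to-output transitions."""
    down  = sweep_down(feat, W_hh)
    up    = np.flip(sweep_down(np.flip(feat, 0), W_hh), 0)
    right = np.swapaxes(sweep_down(np.swapaxes(feat, 0, 1), W_hh), 0, 1)
    left  = np.flip(np.swapaxes(
        sweep_down(np.swapaxes(np.flip(feat, 1), 0, 1), W_hh), 0, 1), 1)
    return np.concatenate([down, up, right, left], axis=-1)

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 32))
context = four_direction_rnn(feat, W_hh=np.eye(32))  # IRNN: identity init
print(context.shape)  # -> (8, 8, 128)
```

Stacking two of these (the final architecture) just feeds the merged output of the first 4-direction RNN into a second one.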
RNN: SPATIAL DEPENDENCY
ION: INSIDE-OUTSIDE NET
Base ConvNet: VGG16 [Simonyan 2014]
Main changes:
- Inside: Skip connections with L2 normalization
- Outside: Stacked 4-direction RNNs for context
BETTER PROPOSALS, MORE DATA
+3.9 mAP on COCO test-dev, compared to Selective Search
Faster R-CNN [Ren 2015]
REGION PROPOSAL NETWORK (RPN)
- Original RPN [Ren 2015] used 9 anchors: 3 scales x 3 aspect ratios. RPN works well for VOC, but not COCO
- We extend this to 22 anchors: 7 scales x 3 aspect ratios, plus 32x32

Avg. Recall:
- Selective Search [Uijlings 2013]: 41.7%
- MCG [Arbelaez 2014]: 51.6%
- RPN with 10 anchors [Ren 2015]: 39.9%
- RPN with 22 anchors: 44.1%
- We mix MCG with RPN, which performs better than either individually
(1000 of each for training, 2000 of each for testing)
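A sketch of how the extended anchor set is enumerated (the particular scale values below are illustrative assumptions, not the authors' exact configuration; only the count 7 scales x 3 ratios + one 32x32 anchor = 22 comes from the slide):

```python
def make_anchors(scales, aspect_ratios, extra=((32.0, 32.0),)):
    """Return (width, height) pairs. Each anchor has area scale**2 and
    aspect ratio h/w = r; the extra 32x32 anchor covers tiny objects."""
    anchors = []
    for s in scales:
        for r in aspect_ratios:
            w = s / r ** 0.5
            h = s * r ** 0.5
            anchors.append((w, h))
    anchors.extend(extra)
    return anchors

# 7 scales x 3 aspect ratios + 32x32 = 22 anchors
anchors = make_anchors(scales=[64, 90, 128, 180, 256, 362, 512],
                       aspect_ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # -> 22
```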
BETTER TRAINING/TESTING
+4.1 mAP on COCO test-dev, compared to Fast R-CNN setup
TRAINING IMPROVEMENTS (mAP on test-dev)
- No dropout (+0.6 mAP)
- Train for longer with larger mini-batches: 4 images (512 ROIs total) per batch (+0.8 mAP)
- Regularize with semantic segmentation predictions (+1.3 mAP) (see tech report)
TESTING IMPROVEMENTS
- We use iterative box regression and weighted voting, from MR-CNN [Gidaris 2015]
- Helps on PASCAL (+2.0 mAP)
- Reduces score on COCO (-0.5 mAP), since COCO requires precise localization
- New thresholds: NMS ~0.45, voting ~0.85 (+1.3 mAP)
- Left-right flips: evaluate on original and flipped image and average (+0.8 mAP)
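The flip-averaging step can be sketched as follows (assumed box layout `[x1, y1, x2, y2]`; not the authors' code). The key detail is that boxes predicted on the flipped image must be mirrored back before averaging:

```python
import numpy as np

def flip_boxes(boxes, image_width):
    """boxes: (n, 4) array of [x1, y1, x2, y2]; mirror horizontally."""
    out = boxes.copy()
    out[:, 0] = image_width - boxes[:, 2]
    out[:, 2] = image_width - boxes[:, 0]
    return out

def average_with_flip(scores, boxes, scores_f, boxes_f, image_width):
    """Average predictions from the original image (scores, boxes) and
    its left-right flip (scores_f, boxes_f), for the same set of ROIs."""
    return (0.5 * (scores + scores_f),
            0.5 * (boxes + flip_boxes(boxes_f, image_width)))

# If the flipped pass predicts exactly the mirrored box, averaging
# reproduces the original box; scores are averaged directly.
scores = np.array([[0.9, 0.1]])
boxes = np.array([[10.0, 10.0, 30.0, 40.0]])
avg_s, avg_b = average_with_flip(scores, boxes,
                                 np.array([[0.7, 0.3]]),
                                 np.array([[70.0, 10.0, 90.0, 40.0]]),
                                 image_width=100)
print(avg_s, avg_b)  # avg_b -> [[10. 10. 30. 40.]]
```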
COMPARISON TO RESNET (WINNER) [HE 2015]
Our single-model (post-competition) result: 33.1% mAP. ResNet-101 and ION are potentially complementary and could be combined.
CONCLUSION
Improvement breakdown:
- New ION detector (+5.1 mAP)
- Better proposals, more data (+3.9 mAP)
- Better training/testing (+4.1 mAP)
Tech Report: http://arxiv.org/pdf/1512.04143.pdf
Thanks:
- NVIDIA (GPU donation)
- Microsoft Research (internship)
Sean Bell Kavita Bala Larry Zitnick Ross Girshick
ION
EXTRA SLIDES
SURPRISING FIND: H2H TRANSITION NOT NEEDED
We use the hidden-to-hidden (H2H) transition for our submission, but there is barely any drop without it!
WHAT ABOUT OTHER CONTEXT METHODS?
IS SEGMENTATION LOSS WORTH IT?
Train: 1.5x-2x slower. Test: +1 mAP, same speed.
HOW MANY RNN LAYERS?
WHY CONV3, CONV4, CONV5?
RESULTS (VOC 2007 TEST)

METHOD                                        mAP
Fast R-CNN [Girshick 2015]                    70.0
Faster R-CNN [Ren 2015]                       73.2
conv3 + conv4 + conv5                         75.6
+ RNN + segmentation loss                     76.5
+ second bbox regression + weighted voting    78.5
+ no dropout                                  79.2
ACTIVATIONS
[Figures: input images with positive and negative activations]
RNN ACTIVATIONS
[Figures: input images with positive and negative RNN activations]
RECURRENT NEURAL NETWORKS
[Karpathy 2015]
TYPES OF RNNS
- "Vanilla" RNN (tanh) [Rumelhart 1986]
- LSTM (Long Short-Term Memory) [Hochreiter and Schmidhuber 1997]
- GRU (Gated Recurrent Unit) [Cho 2014]
CAN WE USE RELU WITH AN RNN?
- Replacing tanh with ReLU gave huge gains for AlexNet [Krizhevsky 2012]
- Is there some way to use ReLU with RNNs?