A Good Box is not a Guarantee of a Good Mask

LVIS Challenge 2020
A Good Box is not a Guarantee of a Good Mask
Jingru Tan (1), Gang Zhang (2), Hanming Deng (3), Changbao Wang (3), Lewei Lu (3), Quanquan Li (3)
(1) Tongji University, (2) Tsinghua University, (3) Sensetime Research

Overview
Introduction of LVIS: long tail distribution; high quality mask annotations
Training Pipeline: representation learning stage; fine-tuning stage
Our Results: improvements & tricks
Challenges in LVIS: inconsistent annotations; objects that are hard to represent with boxes

Introduction of LVIS: Long Tail Distribution
The classifier is heavily biased towards head categories. Tail categories are hard to classify.

Introduction of LVIS: High Quality Mask Annotations
COCO: coarse polygon annotations
LVIS: precise polygon annotations

Training Pipeline
Representation learning: learn a universal representation
Fine-tuning: balance the classifier (for the long-tail distribution) and pay more attention to mask prediction (for high quality masks)

Representation Learning
Equalization Loss (Equalization Loss for Long-tailed Object Recognition, CVPR 2020)
Repeat Factor Sampling (LVIS: A Dataset for Large Vocabulary Instance Segmentation, CVPR 2019)
Mosaic & Rotate & Multi-Scale Training (YOLOv4: Optimal Speed and Accuracy of Object Detection, arXiv preprint)
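Repeat Factor Sampling can be illustrated with a small sketch. It follows the rule described in the cited LVIS paper: a category-level repeat factor r(c) = max(1, sqrt(t / f(c))), where f(c) is the fraction of training images containing category c, and an image-level factor equal to the maximum over the categories in the image. The function name, the input format, and the threshold used in the usage note are assumptions for illustration, not the authors' code.

```python
import math
from collections import Counter


def repeat_factors(image_categories, t=0.001):
    """Return one repeat factor per image.

    image_categories: list of sets, the category names present in each image
                      (hypothetical input format).
    t: frequency threshold; categories rarer than t get oversampled.
    """
    num_images = len(image_categories)
    # Count in how many images each category appears.
    image_count = Counter()
    for cats in image_categories:
        image_count.update(set(cats))
    # Category-level repeat factor r(c) = max(1, sqrt(t / f(c))).
    cat_rf = {
        c: max(1.0, math.sqrt(t / (n / num_images)))
        for c, n in image_count.items()
    }
    # Image-level repeat factor: the largest factor among its categories.
    return [
        max((cat_rf[c] for c in cats), default=1.0)
        for cats in image_categories
    ]
```

With 99 images containing only a head category and one image that also contains a rare category, only the last image receives a factor above 1, so it is seen more often per epoch.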

Representation Learning: Self-Training
To further improve model performance, pseudo labels are inferred on LVIS and on external datasets such as Open Images for self-training. During self-training, we ignore all proposals matched with pseudo boxes.

                    AP@Segm  AP@r  AP@c  AP@f  AP@BBox
Baseline            26.2     17.1  26.2  30.2  27.0
Open Images         26.8     17.5  27.2  30.5  28.1
Ignore LVIS pseudo  26.8     17.0  27.1  30.9  27.8
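The "ignore proposals matched with pseudo boxes" idea can be sketched as a label-assignment step. This is a minimal illustration only: the IoU threshold of 0.5, the label convention (1 = positive, 0 = background, -1 = ignored), and the choice to check annotated boxes before pseudo boxes are all assumptions, since the slides do not give these details.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_labels(proposals, gt_boxes, pseudo_boxes, iou_thr=0.5):
    """Label each proposal: 1 positive, 0 background, -1 ignored.

    Proposals that overlap a pseudo box are ignored, so they are used
    neither as positives nor as background in the loss.
    """
    labels = []
    for p in proposals:
        if any(iou(p, g) >= iou_thr for g in gt_boxes):
            labels.append(1)      # matched to an annotated box
        elif any(iou(p, q) >= iou_thr for q in pseudo_boxes):
            labels.append(-1)     # matched to a pseudo box: ignore
        else:
            labels.append(0)      # plain background
    return labels
```

The intent of ignoring is that a possibly wrong pseudo label does not push the model in either direction.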

Fine-tuning – BBox Head: Balanced Classifier
The classifier is heavily biased towards head categories.
Balanced Group Softmax (Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax, CVPR 2020)
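The general idea behind a group-wise softmax can be sketched as follows: categories are grouped by training instance count, and the softmax is taken inside each group (with an extra "others" entry per group) so that head categories do not compete directly with tail categories. This is a simplified sketch and not the cited method's full implementation; the bin edges (10, 100, 1000), the function names, and the dictionary inputs are assumptions for illustration.

```python
import math


def assign_groups(instance_counts, edges=(10, 100, 1000)):
    """Map each category to a group index based on its instance count."""
    return {
        cat: sum(n >= e for e in edges)
        for cat, n in instance_counts.items()
    }


def group_softmax(logits, others_logits, groups):
    """Softmax computed separately inside each group.

    logits: {category: logit}
    others_logits: {group index: logit of that group's 'others' entry}
    groups: {category: group index}, e.g. from assign_groups
    """
    probs = {}
    for g in set(groups.values()):
        members = [c for c in logits if groups[c] == g]
        zs = [logits[c] for c in members] + [others_logits[g]]
        m = max(zs)  # subtract the max for numerical stability
        denom = sum(math.exp(z - m) for z in zs)
        for c in members:
            probs[c] = math.exp(logits[c] - m) / denom
    return probs
```

In this toy setting a rare category with a modest logit keeps a usable probability, because a frequent category with a large logit sits in a different group.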

Fine-tuning – Mask Head: A Good Box is not a Guarantee of a Good Mask

Category          Frequency  BBox AP  Mask AP  Area Mask/BBox  Mask AP - BBox AP
coatrack          common     73.6     10.1     0.29            -63.5
tripod            frequent   40.9     3.8      0.22            -37.1
necklace          frequent   32.8     3.0      0.17            -29.8
ski pole          frequent   36.2     7.5      0.15            -28.7
fork              frequent   47.2     21.9     0.26            -25.3
windshield wiper  frequent   29.6     5.2      0.26            -24.4
giraffe           frequent   79.2     60.7     0.33            -18.5

Mask AP - BBox AP < -5.0: 270 of 1203 categories (about 22%)
Large BBox, Small Mask (Mask AP - BBox AP < -5.0 and Area Mask/BBox < 0.5): 168 of 1203 categories
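The gap statistic and the "large box, small mask" filter from this slide can be reproduced with a few lines. The rows are the seven categories listed on the slide; the function names and default thresholds are illustrative, with the thresholds taken from the criteria stated above (gap below -5.0, area ratio below 0.5).

```python
# (category, bbox_ap, mask_ap, mask_to_box_area_ratio), from the slide.
ROWS = [
    ("coatrack", 73.6, 10.1, 0.29),
    ("tripod", 40.9, 3.8, 0.22),
    ("necklace", 32.8, 3.0, 0.17),
    ("ski pole", 36.2, 7.5, 0.15),
    ("fork", 47.2, 21.9, 0.26),
    ("windshield wiper", 29.6, 5.2, 0.26),
    ("giraffe", 79.2, 60.7, 0.33),
]


def ap_gap(bbox_ap, mask_ap):
    """Mask AP minus BBox AP, rounded to one decimal like the slide."""
    return round(mask_ap - bbox_ap, 1)


def large_box_small_mask(rows, gap_thr=-5.0, ratio_thr=0.5):
    """Categories whose mask AP trails box AP and whose mask fills
    less than ratio_thr of the box area."""
    return [
        name
        for name, bbox_ap, mask_ap, ratio in rows
        if ap_gap(bbox_ap, mask_ap) < gap_thr and ratio < ratio_thr
    ]


for name, bbox_ap, mask_ap, ratio in ROWS:
    print(f"{name:17s} gap={ap_gap(bbox_ap, mask_ap):6.1f} ratio={ratio:.2f}")
```

All seven listed categories satisfy both criteria, which is why they serve as examples of the problem.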

Fine-tuning – Mask Head: A Good Box is not a Guarantee of a Good Mask
The smaller the mask/bbox area ratio, the larger the gap between mask AP and bbox AP.

Fine-tuning – Mask Head: Some Examples
[Figure: example images for tripod, giraffe, and ski pole]

Fine-tuning – Mask Head: Solution for Categories with Small Ratio
New strategy for mask proposal assignment (Feature Pyramid Networks for Object Detection, CVPR 2017)

              AP@Segm  AP@r  AP@c  AP@f  AP@BBox
Baseline      34.7     26.1  34.9  38.1  37.6
Ratio Assign  35.0     26.4  35.2  38.6  37.6
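For context, the baseline rule from the cited FPN paper assigns a proposal of width w and height h to pyramid level k = floor(k0 + log2(sqrt(w * h) / 224)), clamped to the available levels. The sketch below shows only that baseline. The slides do not spell out how Ratio Assign changes it, so no attempt is made to reproduce the new strategy here. The function name and the level range 2 to 5 are assumptions.

```python
import math


def fpn_level(w, h, k0=4, k_min=2, k_max=5, canonical=224):
    """Baseline FPN level for a proposal of size w x h (in pixels).

    The level depends only on the box size, not on how much of the
    box the object's mask actually covers.
    """
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / canonical))
    return max(k_min, min(k_max, k))
```

For example, `fpn_level(224, 224)` gives level 4 and `fpn_level(112, 112)` gives level 3. A thin object such as a ski pole and a compact object with the same box size land on the same level under this rule.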

Fine-tuning – Mask Head: Solution for Categories with Small Ratio
Balanced Mask Loss (BML): addresses foreground/background imbalance

          AP@Segm  AP@r  AP@c  AP@f  AP@BBox
Baseline  34.7     26.1  34.9  38.1  37.6
+ BML     35.0     26.1  35.3  38.5  37.6

Fine-tuning – Mask Head: with Ratio Assign & Balanced Mask Loss

Category          Frequency  Area Mask/BBox  AP Gap  AP Gap (Ours)  Improvement
coatrack          common     0.29            -63.5   -60.1          +3.4
tripod            frequent   0.22            -37.1   -33.1          +4.0
necklace          frequent   0.17            -29.8   -27.7          +2.1
ski pole          frequent   0.15            -28.7   -24.5          +4.2
fork              frequent   0.26            -25.3   -21.3          +4.0
windshield wiper  frequent   0.26            -24.4   -21.1          +3.3
giraffe           frequent   0.33            -18.5   -14.4          +4.1

However, this is still an open problem, and we leave further research to future work.

Fine-tuning – Mask Head: Predicting High Quality Mask

                          AP@Segm  AP@r  AP@c  AP@f  AP@BBox
Baseline                  34.7     26.1  34.9  38.1  37.6
+ Ratio Assign            35.0     26.2  35.2  38.5  37.6
+ Balanced Mask Loss      35.2     26.0  35.4  38.9  37.6
+ Boundary Supervision*   35.6     26.9  35.6  39.3  37.6
+ 7 Convs for Mask Head   35.8     26.8  35.9  39.6  37.6
+ Deformable RoI Pooling  36.1     28.8  35.8  39.8  38.3

* Boundary-preserving Mask R-CNN, ECCV 2020

Our Results: Baseline
Chart: baseline 19.2

Our Results: Data Augmentation (Mosaic, Rotate)
Chart: baseline 19.2, data augmentation 20.3

Our Results: Equalization Loss
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4

Our Results: Repeat Factor Sampling
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2

Our Results: HTC w/o Semantic Branch
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8

Our Results: ResNeSt101 + DCN + 400-1400 Multi-Scale Training
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0

Our Results: Some Tricks
Make the sampling probability in mosaic align with RFS
Make rotated boxes align with rotated masks
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2
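One plausible reading of "make rotated boxes align with rotated masks" is that, after a rotation augmentation, the box should be recomputed from the rotated mask. The alternative, enclosing the rotated corners of the old box, can be much looser. The sketch below contrasts the two options on a polygon mask. This interpretation, the function names, and the polygon representation are assumptions, not details given in the slides.

```python
import math


def rotate_points(points, angle_deg, center):
    """Rotate (x, y) points by angle_deg around center."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy = center
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def enclosing_box(points):
    """Tight axis-aligned box (x1, y1, x2, y2) around a set of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def box_from_rotated_mask(polygon, angle_deg, center):
    """Rotate the mask polygon, then take its tight box."""
    return enclosing_box(rotate_points(polygon, angle_deg, center))


def box_from_rotated_corners(box, angle_deg, center):
    """Rotate the four corners of the old box, then enclose them."""
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    return enclosing_box(rotate_points(corners, angle_deg, center))
```

For a thin diagonal object, rotating by 45 degrees makes the mask nearly axis-aligned, so the mask-derived box becomes narrow while the corner-derived box grows, and the two no longer agree.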

Our Results: Self-Training
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2, self training 33.7

Our Results: Mask Scoring + Pseudo Ignore + ResNeSt269
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2, self training 33.7, mask scoring + pseudo ignore + S269 36.5

Our Results: Balanced Group Softmax
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2, self training 33.7, mask scoring + pseudo ignore + S269 36.5, balanced group softmax 37.6

Our Results: High Quality Mask
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2, self training 33.7, misc 36.5, balanced group softmax 37.6, high quality mask 38.8
misc: mask scoring, pseudo ignore, ResNeSt269

Our Results: Testing Time Augmentation
Chart: baseline 19.2, data augmentation 20.3, EQL 22.4, RFS 26.2, HTC 28.8, S101 & DCN 32.0, tricks 33.2, self training 33.7, misc 36.5, balanced group softmax 37.6, high quality mask 38.8, TTA 41.5
TTA: (1) multi-scale testing, (2) scale-aware inference, (3) revised Soft-NMS
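The contribution of each step in the results chart can be read off by differencing consecutive scores. The short sketch below does that arithmetic. The scores are those shown on the chart, while the variable and function names are illustrative.

```python
# (step label, cumulative score) in the order shown on the chart.
LADDER = [
    ("baseline", 19.2),
    ("data augmentation", 20.3),
    ("EQL", 22.4),
    ("RFS", 26.2),
    ("HTC", 28.8),
    ("S101 & DCN", 32.0),
    ("tricks", 33.2),
    ("self training", 33.7),
    ("misc", 36.5),
    ("balanced group softmax", 37.6),
    ("high quality mask", 38.8),
    ("TTA", 41.5),
]


def step_gains(ladder):
    """Gain of each step over the previous one, rounded to one decimal."""
    return {
        name: round(score - prev, 1)
        for (_, prev), (name, score) in zip(ladder, ladder[1:])
    }


gains = step_gains(LADDER)
largest = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"{name:24s} +{gain:.1f}")
```

By this reading, Repeat Factor Sampling is the largest single step (+3.8), and the steps together add 22.3 points over the baseline.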

Challenges in LVIS: Not-well Boxable Objects
Fire Hose (Mask AP 3.9), Hose (Mask AP 6.5)

Challenges in LVIS: Categories that are Hard to Detect
Hook (Mask AP 7.3), Stirrup (Mask AP 1.2)

Challenges in LVIS: Inconsistent Annotations
Crib (Mask AP - BBox AP = -51.6)

LVIS Challenge 2020
Thank you
