Multiple Instance Detection Network with Online Instance Classifier Refinement



SLIDE 1

Multiple Instance Detection Network with Online Instance Classifier Refinement

Peng Tang (pengtang@hust.edu.cn), Huazhong University of Science and Technology

SLIDE 2

Weakly-supervised visual learning (WSVL)

  • Weakly-supervised visual learning is a new trend at CVPR

Searching the CVPR 2017 main conference program (http://cvpr2017.thecvf.com/program/main_conference) for the keywords “weakly supervised” (14 papers), “weakly-supervised” (5 papers), “multi-instance” (1 paper), and “multiple instance” (3 papers) returns 23 of 783 papers in total.

SLIDE 3

WSVL avoids the expensive human annotations

Tasks                                           Training info                  Testing output
Weakly-supervised object detection (WSOD)       Image-level                    Bounding box
Weakly-supervised semantic segmentation         Image-level                    Pixel-level
Weakly-supervised instance segmentation         Image-level/bounding box       Instance pixel-level
Semi-supervised object detection/segmentation   Partially fully-labeled data   Bounding box/pixel-level

SLIDE 7

How to do WSOD

Solving this problem by multiple instance learning

  • Image as bag, since the image label is given
  • Proposals (Selective Search, Edge Boxes, BING) as instances
  • Proposal descriptors: deep CNN features, Fisher vectors
  • Number of proposals: ~2k (SS), ~4k (EB)
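The bag/instance view above can be sketched in a few lines. This is only an illustration: the `Bag` class, the `bag_scores` helper, and the toy boxes and scores are hypothetical names and values, and max pooling is the classic MIL assumption rather than the pooling used later in this deck.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Bag:
    """One training image: proposals are the instances, only the bag label is known."""
    proposals: List[Box]
    image_labels: List[int]  # one 0/1 entry per class


def bag_scores(instance_scores: List[List[float]]) -> List[float]:
    """Max-pool per-class instance scores into one score per class for the bag."""
    num_classes = len(instance_scores[0])
    return [max(row[c] for row in instance_scores) for c in range(num_classes)]


# Toy image with three proposals and two classes (class 0 present, class 1 absent).
bag = Bag(
    proposals=[(0, 0, 50, 50), (10, 10, 90, 90), (60, 60, 100, 100)],
    image_labels=[1, 0],
)
scores = [[0.2, 0.1], [0.9, 0.05], [0.4, 0.3]]  # scores[r][c] for proposal r, class c
print(bag_scores(scores))  # [0.9, 0.3]
```

Training only compares the pooled bag scores with `image_labels`; which proposal is responsible for a positive label is never given.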

Possible solutions to this problem: clustering-based, matching-based, co-segmentation-based, topic-model-based, and multiple-instance-learning-based methods.

SLIDE 11

What is the core problem in WSOD

  • Is it a bird?

The answer is YES for all of them! However, only some of them are correct detection results (IoU > 0.5).

This is the ambiguity.
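The IoU > 0.5 criterion can be made concrete with a small sketch. The boxes below are hypothetical, and the function uses continuous coordinates (no +1 pixel convention), which is an assumption.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 100, 100)  # hypothetical box around the whole bird
head_only = (0, 0, 50, 50)       # a box that contains only a part of the bird
near_whole = (0, 0, 100, 80)     # a box that covers most of the bird

print(iou(ground_truth, head_only))   # 0.25 -> not a correct detection
print(iou(ground_truth, near_whole))  # 0.8  -> correct detection
```

Both boxes would answer "yes" to "is it a bird?", but only the second passes the IoU > 0.5 test.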

SLIDE 12

What is the core problem in WSOD

Previous methods tend to localize parts of objects instead of whole objects.

Result of MIDN/WSDDN [4]

SLIDE 13

Multiple Instance Detection Network with Online Instance Classifier Refinement

SLIDE 14

Motivation

Result of MIDN/WSDDN [4]

Proposals that overlap heavily with a detected part may cover the whole object, or at least contain a larger portion of it.

SLIDE 15

Motivation

Result of MIDN/WSDDN [4] vs. result of OICR

Propagating the scores to the highly overlapping proposals alleviates the problem caused by ambiguity.

SLIDE 17

The multi-instance detection network (MIDN)

  • The basic network is that of WSDDN [4] by H. Bilen and A. Vedaldi
  • Single network, end-to-end training
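A minimal sketch of WSDDN-style two-stream scoring may help here. It assumes the usual description of that network: one softmax over classes, one softmax over proposals, an element-wise product, and a sum over proposals. The function names, the toy logits, and the binary cross-entropy on the image labels are assumptions for illustration, not the authors' code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def midn_scores(cls_logits, det_logits):
    """Both inputs are indexed [proposal][class]; returns proposal and image scores."""
    num_props, num_classes = len(cls_logits), len(cls_logits[0])
    # Classification stream: softmax over classes, separately for each proposal.
    cls = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over proposals, separately for each class.
    det = [softmax([det_logits[r][c] for r in range(num_props)])
           for c in range(num_classes)]
    proposal_scores = [[cls[r][c] * det[c][r] for c in range(num_classes)]
                       for r in range(num_props)]
    # Image-level score: sum of proposal scores, so each value lies in (0, 1).
    image_scores = [sum(proposal_scores[r][c] for r in range(num_props))
                    for c in range(num_classes)]
    return proposal_scores, image_scores


def image_level_loss(image_scores, labels, eps=1e-6):
    """Binary cross-entropy between image scores and 0/1 image labels (assumed loss)."""
    loss = 0.0
    for p, y in zip(image_scores, labels):
        p = min(max(p, eps), 1.0 - eps)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss


# Three proposals, two classes; proposal 0 is strongly associated with class 0.
cls_logits = [[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]]
det_logits = [[3.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
proposal_scores, image_scores = midn_scores(cls_logits, det_logits)
print(image_scores, image_level_loss(image_scores, [1, 0]))
```

Only the image-level loss is supervised; the per-proposal scores are a by-product, which is why the top proposal often settles on a discriminative part.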

SLIDE 21

The network for OICR

  • Additional blocks (instance classifiers) for score propagation
  • In-network supervision

SLIDE 25

Effective online training/refinement

  • The top-scoring proposal can always detect at least parts of objects.
  • Proposals having high spatial overlap with detected parts may cover a larger portion of the object.
  • Proposals with high spatial overlap could share similar label information.
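The three observations above suggest how supervision for a refinement classifier could be generated from the previous stage's scores. The sketch below is a simplified reading, not the released implementation: `refine_labels`, the 0.5 overlap threshold, the background index, and the rule that every proposal inherits the score of the top proposal it overlaps most are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def refine_labels(boxes, prev_scores, image_labels, iou_threshold=0.5):
    """Propagate each present class from its top-scoring proposal to overlapping ones."""
    background = len(image_labels)       # extra class index for background
    labels = [background] * len(boxes)
    weights = [0.0] * len(boxes)
    best_iou = [-1.0] * len(boxes)
    for c, present in enumerate(image_labels):
        if not present:
            continue                      # classes absent from the image are ignored
        top = max(range(len(boxes)), key=lambda r: prev_scores[r][c])
        for r, box in enumerate(boxes):
            overlap = iou(box, boxes[top])
            if overlap > best_iou[r]:
                best_iou[r] = overlap
                weights[r] = prev_scores[top][c]
                labels[r] = c if overlap > iou_threshold else background
    return labels, weights


boxes = [(0, 0, 50, 50), (0, 0, 60, 70), (200, 200, 260, 260)]
prev_scores = [[0.7, 0.1], [0.2, 0.1], [0.05, 0.6]]  # prev_scores[r][c]
labels, weights = refine_labels(boxes, prev_scores, image_labels=[1, 0])
print(labels)   # [0, 0, 2]
print(weights)  # [0.7, 0.7, 0.7]
```

The larger box (index 1) scored low in the previous stage but now receives the object label, because it overlaps the top-scoring part box by more than the threshold.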

SLIDE 27

Effective online training/refinement

The loss weight controls the learning process.
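The slide does not spell out the loss, so the following is only a sketch under one assumption: each proposal's cross-entropy term is multiplied by a weight, for example the previous-stage score of the top proposal that produced its label. The function name and toy numbers are hypothetical.

```python
import math


def weighted_refinement_loss(probs, labels, weights, eps=1e-6):
    """Mean over proposals of -w_r * log p_r[label_r], with p_r a softmax output."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        total -= w * math.log(max(p[y], eps))
    return total / len(probs)


# Two proposals, two object classes plus background (index 2).
probs = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
labels = [0, 2]

early = weighted_refinement_loss(probs, labels, weights=[0.1, 0.1])  # unsure supervision
late = weighted_refinement_loss(probs, labels, weights=[0.9, 0.9])   # confident supervision
print(early, late)
```

With small weights the noisy early supervision contributes little to the gradient; as the previous stage grows more confident, the same labels count for more.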

SLIDE 29

Experimental Results

  • The influence of the number of refinements and of different refinement strategies

SLIDE 30

Experimental Results

  • Detection results (mAP) on the VOC 2007 test set
  • Detection results (mAP) on the VOC 2012 test set

SLIDE 33

Conclusion

  • Advantages
      • Our method improves the detection results substantially through the OICR strategy, especially for rigid objects.
      • The network can be trained very efficiently (online).
  • Limitations
      • The performance is poor for non-rigid objects, because these objects often deform greatly while their representative parts deform much less.

SLIDE 34

  • Preprint is available at https://arxiv.org/abs/1704.00138.
  • Code is available at https://github.com/ppengtang/oicr.

Thank you!