Revisiting Class Activation Mapping for Learning from Imperfect Data



SLIDE 1

Revisiting Class Activation Mapping for Learning from Imperfect Data

The 2nd Learning from Imperfect Data (LID) Workshop

Wonho Bae*, Junhyug Noh*, Jinhwan Seo, and Gunhee Kim

SLIDE 2

Challenge Results

  • 1st place, Track 3: Weakly Supervised Object Localization
  • 2nd place, Track 1: Weakly Supervised Semantic Segmentation

SLIDE 3

Weakly-Supervised Object Localization

[Figure: an input image and the output localization, labeled "monkey"]

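SLIDE 4

Class Activation Mapping (CAM)

[Figure: CAM pipeline. CNN → feature maps $F_1, F_2, F_3, \dots, F_K$ → GAP → classifier weights $w_{1,c}, w_{2,c}, \dots, w_{K,c}$ → class $c$: monkey]

CAM (Class Activation Maps):

$M_c = w_{1,c} \cdot F_1 + w_{2,c} \cdot F_2 + w_{3,c} \cdot F_3 + \cdots + w_{K,c} \cdot F_K$

Localization result: resize $M_c$ and threshold it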
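
The pipeline on this slide can be sketched in a few lines. This is a minimal, dependency-free illustration that assumes the usual CAM formulation ($M_c = \sum_k w_{k,c} F_k$); the function names, the toy values, and the threshold ratio `theta=0.2` are hypothetical and not taken from the slides, and the resize step is omitted.

```python
def global_average_pool(feature_maps):
    """GAP: one scalar per channel, the mean over all spatial positions."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def class_activation_map(feature_maps, class_weights):
    """M_c(h, x) = sum_k w_{k,c} * F_k(h, x)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * channel[h][x] for w, channel in zip(class_weights, feature_maps))
            for x in range(width)
        ]
        for h in range(height)
    ]


def localize(cam, theta=0.2):
    """Box (top, left, bottom, right) around pixels above theta * max(cam)."""
    threshold = theta * max(max(row) for row in cam)
    hits = [(h, x) for h, row in enumerate(cam) for x, v in enumerate(row) if v > threshold]
    if not hits:  # e.g. an all-zero map: nothing to localize
        return None
    rows = [h for h, _ in hits]
    cols = [x for _, x in hits]
    return min(rows), min(cols), max(rows), max(cols)


# Toy input: K = 2 channels on a 3x3 grid (hypothetical values).
F = [
    [[0.0, 1.0, 0.0], [0.0, 4.0, 2.0], [0.0, 0.0, 0.0]],
    [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]],
]
w_c = [0.5, 1.0]  # hypothetical classifier weights for one class c

pooled = global_average_pool(F)      # what the classifier sees during training
cam = class_activation_map(F, w_c)   # what is used for localization
box = localize(cam, theta=0.2)
```

Plain lists keep the sketch free of dependencies; with real feature maps an array library would be the natural choice.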

SLIDE 5

Class Activation Mapping (CAM)

[Figure: the CAM pipeline diagram from Slide 4, repeated]

SLIDE 6

Class Activation Mapping (CAM) for Track 3

[Figure: CAM pipeline. CNN → feature maps $F_1, \dots, F_K$ → GAP → classifier weights $w_{k,c}$ → class $c$: monkey; CAM: $M_c = w_{1,c} \cdot F_1 + w_{2,c} \cdot F_2 + \cdots + w_{K,c} \cdot F_K$; localization result: resize $M_c$ and threshold it]

SLIDE 7

How to Grasp Whole Object Region?

  • [HaS] Singh, et al. ICCV 2017
  • [AE] Wei, et al. CVPR 2017
  • [ACoL] Zhang, et al. CVPR 2018
  • [ADL] Choe, et al. CVPR 2019

SLIDE 8

Our Approach

  • Motivation
      • The information needed to capture the whole area of the object already exists in the feature maps
  • Problem
      • The three modules of CAM (M1–M3) do not take the phenomena (P1–P3) into account
      • As a result, localization is limited to small discriminative regions of the object
  • Solution
      • Use this information correctly by making simple modifications to the three modules

[Figure: CAM pipeline annotated with its three modules. M1: Global Average Pooling (GAP); M2: Class Activation Maps (CAM), $M_c = w_{1,c} \cdot F_1 + \cdots + w_{K,c} \cdot F_K$; M3: Thresholding, which gives the localization result after resizing. Phenomena observed in the feature map ($F$): P1, P2, P3, each shown as an image]

SLIDE 9

Our Approach (1) Thresholded Average Pooling

  • Problem: Global Average Pooling (GAP) under P1

[Figure: CAM pipeline with modules M1: Global Average Pooling (GAP), M2: Class Activation Maps (CAM), and M3: Thresholding; phenomenon P1 observed in the feature map ($F$) is shown as an image]

SLIDE 10

Our Approach (1) Thresholded Average Pooling

  • Problem: Global Average Pooling (GAP) under P1

SLIDE 11

Our Approach (1) Thresholded Average Pooling

  • Problem: Global Average Pooling (GAP) under P1

[Figure: two example channels, $F_1$ and $F_2$.
Classification phase: GAP values 2.5 and 9.9, with weights $w_{1,c}$ (= 0.04) and $w_{2,c}$ (= 0.01), giving 0.100 + 0.099 + ⋯
Localization phase: $w_{1,c} \cdot F_1$ (max: 64.7) and $w_{2,c} \cdot F_2$ (max: 59.2)]
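
A short worked calculation may make the point of this slide concrete. It assumes the pairing suggested by the figure's own arithmetic (0.04 × 2.5 = 0.100 and 0.01 × 9.9 = 0.099), that is, channel 1 has GAP value 2.5 and weight 0.04 while channel 2 has GAP value 9.9 and weight 0.01. The peak contributions in the second line are computed here and do not appear on the slide.

```latex
\begin{aligned}
\text{Classification: } & w_{1,c}\,\mathrm{GAP}(F_1) = 0.04 \times 2.5 = 0.100,
  \quad w_{2,c}\,\mathrm{GAP}(F_2) = 0.01 \times 9.9 = 0.099 \\
\text{Localization (peak): } & w_{1,c}\,\max F_1 = 0.04 \times 64.7 = 2.588,
  \quad w_{2,c}\,\max F_2 = 0.01 \times 59.2 = 0.592
\end{aligned}
```

Under this reading, the two channels contribute almost equally to the class score, yet the first channel's peak contribution to the CAM is roughly four times larger.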

SLIDE 12

Our Approach (1) Thresholded Average Pooling

  • Problem: Global Average Pooling (GAP) under P1
  • Solution: Thresholded Average Pooling (TAP)
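
The difference between the two pooling schemes can be illustrated with a small sketch. It assumes that TAP averages only the activations above a per-channel threshold, taken here as a fraction `theta_tap` of the channel maximum; the slides name the method and give the value 0.05 but not the formula, so that definition, the function names, and the toy channels are assumptions.

```python
def flatten(channel):
    """Turn a 2D channel (list of rows) into a flat list of activations."""
    return [v for row in channel for v in row]


def global_average_pool(channel):
    """GAP: mean over every spatial position, active or not."""
    values = flatten(channel)
    return sum(values) / len(values)


def thresholded_average_pool(channel, theta_tap=0.05):
    """TAP (assumed form): mean over activations above theta_tap * max."""
    values = flatten(channel)
    tau = theta_tap * max(values)
    kept = [v for v in values if v > tau]
    return sum(kept) / len(kept) if kept else 0.0


# Hypothetical channels: a strong response on a small area vs. a weak, broad one.
small_region = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 8.0, 8.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
broad_region = [[2.0] * 4 for _ in range(4)]

gap_small = global_average_pool(small_region)       # diluted by the inactive area
tap_small = thresholded_average_pool(small_region)  # reflects the active area only
gap_broad = global_average_pool(broad_region)
tap_broad = thresholded_average_pool(broad_region)
```

In this toy case GAP ranks the small, strongly activated channel below the broad one because it divides by the full spatial size, while the thresholded average does not depend on how large the inactive area is.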

SLIDE 13

Our Approach (2) Negative Weight Clamping

  • Problem: Class Activation Maps (CAM) under P2

[Figure: CAM pipeline with modules M1: Global Average Pooling (GAP), M2: Class Activation Maps (CAM), and M3: Thresholding; phenomenon P2 observed in the feature map ($F$) is shown as an image]

SLIDE 14

Our Approach (2) Negative Weight Clamping

  • Problem: Class Activation Maps (CAM) under P2

[Figure: CAM from positive weights only − CAM from negative weights only = CAM from both]

SLIDE 15

Our Approach (2) Negative Weight Clamping

  • Problem: Class Activation Maps (CAM) under P2

[Figure: IoA between the ground truth boxes and the CAMs, shown separately for positive weights and negative weights]
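
The slide does not define IoA. One plausible reading, assumed here, is "intersection over area": the fraction of the thresholded CAM region that falls inside the ground truth box. The sketch below implements only that assumed definition; the function name, the mask, and the box are hypothetical.

```python
def intersection_over_area(mask, box):
    """Fraction of active mask pixels that lie inside box.

    mask: 2D list of 0/1 values (thresholded CAM region).
    box:  (top, left, bottom, right), inclusive pixel coordinates.
    """
    top, left, bottom, right = box
    area = sum(v for row in mask for v in row)
    if area == 0:  # empty region: define the ratio as 0
        return 0.0
    inside = sum(
        mask[h][x]
        for h in range(top, bottom + 1)
        for x in range(left, right + 1)
    )
    return inside / area


# Hypothetical 3x3 mask with five active pixels, four of them inside the box.
mask = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
gt_box = (0, 0, 1, 1)
ioa = intersection_over_area(mask, gt_box)
```

Under this definition a high IoA for negative-weight CAMs would mean that the regions being subtracted lie largely inside the object.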

SLIDE 16

Our Approach (2) Negative Weight Clamping

  • Problem: Class Activation Maps (CAM) under P2
  • Solution: Negative Weight Clamping (NWC)
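
A minimal sketch of the idea, assuming that NWC means setting negative class weights to zero before the weighted sum that builds the CAM. The slides name the technique without giving a formula, so this form, the function names, and the toy values are assumptions.

```python
def weighted_sum(feature_maps, weights):
    """sum_k weights[k] * F_k, computed position by position."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * channel[h][x] for w, channel in zip(weights, feature_maps))
            for x in range(width)
        ]
        for h in range(height)
    ]


def cam_plain(feature_maps, class_weights):
    """Standard CAM: negative weights subtract from the map."""
    return weighted_sum(feature_maps, class_weights)


def cam_nwc(feature_maps, class_weights):
    """CAM with negative weights clamped to zero (assumed form of NWC)."""
    clamped = [max(w, 0.0) for w in class_weights]
    return weighted_sum(feature_maps, clamped)


# Hypothetical example: the second channel fires on part of the object
# but carries a negative weight for this class.
F = [
    [[2.0, 2.0], [0.0, 0.0]],
    [[0.0, 4.0], [0.0, 0.0]],
]
w_c = [1.0, -0.5]

plain = cam_plain(F, w_c)  # top-right pixel is cancelled out
nwc = cam_nwc(F, w_c)      # top-right pixel is kept
```

In the toy input, the negatively weighted channel erases half of the activated row in the plain CAM, while the clamped version keeps the full row.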

SLIDE 17

Our Approach (3) Percentile as a Thresholding Standard

  • Problem: Maximum as a Standard (MaS) under P3

[Figure: CAM pipeline with modules M1: Global Average Pooling (GAP), M2: Class Activation Maps (CAM), and M3: Thresholding; phenomenon P3 observed in the feature map ($F$) is shown as an image]

SLIDE 18

Our Approach (3) Percentile as a Thresholding Standard

  • Problem: Maximum as a Standard (MaS) under P3

[Figure panels: number of channels with activation above a threshold; result with CAM; CAM values in descending order. Horizontal axes: 100 − percentile (%). The localization threshold is marked on the plots.]

SLIDE 19

Our Approach (3) Percentile as a Thresholding Standard

  • Problem: Maximum as a Standard (MaS) under P3
  • Solution: Percentile as a Standard (PaS)
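
The contrast between the two standards can be sketched as follows, assuming that both set the localization threshold to a ratio `theta` times a reference value, which is the maximum of the CAM for MaS and a high percentile of the CAM for PaS. The slides do not spell out the formula, so this form, the nearest-rank percentile, the ratio 0.2, and the toy map are assumptions; only the percentile value 98 comes from the deck.

```python
import math


def flatten(cam):
    return [v for row in cam for v in row]


def percentile(values, q):
    """Nearest-rank percentile: the ceil(q% * n)-th smallest value."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def threshold_mas(cam, theta):
    """Maximum as a Standard: threshold relative to the single largest value."""
    return theta * max(flatten(cam))


def threshold_pas(cam, theta, q=98):
    """Percentile as a Standard (assumed form): relative to the q-th percentile."""
    return theta * percentile(flatten(cam), q)


def count_above(cam, threshold):
    return sum(1 for v in flatten(cam) if v > threshold)


# Hypothetical 10x10 CAM: a 4x5 object region at 10.0 with one outlier peak.
cam = [[0.0] * 10 for _ in range(10)]
for h in range(2, 6):
    for x in range(2, 7):
        cam[h][x] = 10.0
cam[3][4] = 100.0

kept_mas = count_above(cam, threshold_mas(cam, theta=0.2))
kept_pas = count_above(cam, threshold_pas(cam, theta=0.2, q=98))
```

With the outlier present, the maximum-based threshold keeps only the peak pixel, whereas the percentile-based threshold keeps the whole 20-pixel region.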

SLIDE 20

Experimental Setting

  • Backbone: ResNet50-SE
  • Batch size: 210
  • Input size: 384×384
  • Random crop size: 336×336
  • TAP threshold ($\tau_{tap}$): 0.05
  • PaS percentile ($i$): 98
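
For reference, the settings above could be collected in a small configuration object. The class and field names below are hypothetical; only the values come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters as listed on the slide (field names are illustrative)."""
    backbone: str = "ResNet50-SE"
    batch_size: int = 210
    input_size: int = 384        # images resized to 384x384
    random_crop_size: int = 336  # random 336x336 crops
    tap_threshold: float = 0.05
    pas_percentile: int = 98


config = ExperimentConfig()
```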

SLIDE 21

Results on Validation Set

  • Results with different components
      • To preserve the details of the masks, we also applied a fully connected CRF.
      • The performance gradually improves as each component is added.

SLIDE 22

Leaderboard

  • Track 3: Weakly Supervised Object Localization

SLIDE 23

Qualitative Results

[Figure: four pairs of localization results, labeled "CAM" and "+ Ours"]

SLIDE 24

Expansion to Track 1

SLIDE 25

Expansion to Track 1

Our target!

SLIDE 26

Class Activation Mapping (CAM) for Track 1

[Figure: CAM pipeline. CNN → feature maps $F_1, \dots, F_K$ → GAP → classifier weights $w_{k,c}$ → class $c$: monkey; CAM: $M_c = w_{1,c} \cdot F_1 + w_{2,c} \cdot F_2 + \cdots + w_{K,c} \cdot F_K$; localization result: resize $M_c$ and threshold it]

SLIDE 27

Leaderboard

  • Track 1: Weakly Supervised Semantic Segmentation

SLIDE 28

Thank You!