Angular Visual Hardness, Beidi Chen, Department of Computer Science (PowerPoint presentation)




SLIDE 1

Angular Visual Hardness

Beidi Chen

Department of Computer Science, Rice University

Collaborators: Weiyang Liu, Animesh Garg, Zhiding Yu, Anshumali Shrivastava, Jan Kautz, and Anima Anandkumar

SLIDE 2

PART 0: Motivation

SLIDE 3

Applications Discoveries Background Motivation Conclusion

Human Visual Hardness

Image Degradation

Semantic Ambiguity

SLIDE 4

Gap between human visual system and CNNs

Class Name    Softmax Score    Human Selection Frequency

Hard for humans and easy for CNNs:
Nail          0.93             0.2
Oil Filter    0.998            0.2

Easy for humans and hard for CNNs:
Golf Ball     0.001            1.0
Radio         0.001            1.0

SLIDE 5

Agenda

Part 1: Background
Part 2: Discoveries
Part 3: Applications
Part 4: Conclusion

SLIDE 6

PART 01: Background

SLIDE 7

Do ImageNet Classifiers Generalize to ImageNet?

Human Labeling Interface

Recht et al. “Do ImageNet Classifiers Generalize to ImageNet?” ICML 2019

SLIDE 8

Loss function of CNNs in visual recognition

Softmax cross-entropy loss

Annotated terms in the loss: the angle between feature and classifier, the magnitude information, and the model confidence
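The three annotations correspond to the standard way of rewriting the softmax cross-entropy loss in terms of norms and angles. The block below is a sketch of that decomposition, not a copy of the slide's equation: it assumes bias terms are omitted, and the symbols $w_i$ (classifier for class $i$) and $\theta_{(x, w_i)}$ (angle between feature $x$ and $w_i$) are notation chosen here.

```latex
% Softmax cross-entropy for a feature x with label y (bias terms omitted by assumption).
% Each logit is an inner product, which splits into magnitudes and an angle:
%   <w_i, x> = ||w_i|| ||x|| cos(theta_{(x, w_i)})
\mathcal{L}(x, y)
  = -\log
    \frac{\exp\!\big(\|w_y\|\,\|x\|\cos\theta_{(x, w_y)}\big)}
         {\sum_{i} \exp\!\big(\|w_i\|\,\|x\|\cos\theta_{(x, w_i)}\big)}
```

The fraction inside the logarithm is the softmax score for the true class (the model confidence); the norms carry the magnitude information, and the cosine terms carry the angle between feature and classifier.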

SLIDE 9

PART 02: Discoveries

SLIDE 10

Proposal: Angular Visual Hardness (AVH)

Given a sample x with label y:

AVH(x) = A(x, w_y) / Σ_i A(x, w_i)

where A(u, v) = arccos(⟨u, v⟩ / (||u|| ||v||)) is the angle between two vectors and w_i is the classifier for the i-th class.

Figure labels: norm ||x||, angle θ(x, w_y), classifier w_y

Theoretical Foundation: Soudry et al. “The Implicit Bias of Gradient Descent on Separable Data” ICLR 2018
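A minimal sketch of computing an AVH score for one sample, assuming the definition from the accompanying paper: the angle between the feature and its true-class classifier, divided by the sum of its angles to all class classifiers. The helper names (`angle`, `avh`) and the 2-D toy vectors are hypothetical and only for illustration.

```python
import math

def angle(u, v):
    """Angle in radians between vectors u and v (arccos of cosine similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = dot / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos)))

def avh(x, classifiers, y):
    """AVH of feature x with label y: angle to w_y over the sum of angles to all w_i."""
    angles = [angle(x, w) for w in classifiers]
    return angles[y] / sum(angles)

# Hypothetical 2-D setup with three class classifiers (toy values, not real weights).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
x_easy = [2.0, 0.2]   # nearly aligned with w_0
x_hard = [0.5, 0.5]   # halfway between w_0 and w_1

avh_easy = avh(x_easy, W, y=0)
avh_hard = avh(x_hard, W, y=0)

# Rescaling the feature changes its norm but leaves its AVH unchanged.
avh_scaled = avh([10 * v for v in x_easy], W, y=0)
```

In this toy setup the well-aligned sample gets a lower AVH than the ambiguous one, and the score depends only on direction, not on ||x||.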

SLIDE 11

Simple Example: AVH vs. ||x||

Panels: raw data, color map of AVH, color map of ||x||

SLIDE 12

CNN characteristics vs. human selection frequency

SLIDE 13

AVH is well aligned with human selection frequency

Spearman rank correlations
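As a reminder of what a Spearman rank correlation measures, here is a small pure-Python sketch: rank both lists (averaging ranks for ties) and take the Pearson correlation of the ranks. The function names and the score values are made up for illustration and are not results from the talk.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(a), average_ranks(b))

# Made-up per-image values: a hardness score and a human selection frequency.
avh_scores = [0.10, 0.35, 0.22, 0.41, 0.18]
human_freq = [1.0, 0.3, 0.6, 0.2, 0.8]
rho = spearman(avh_scores, human_freq)
```

In this toy data the hardest images by score are exactly the ones humans select least often, so the ranks are perfectly reversed and `rho` is -1. A hardness measure that aligns with human judgment shows up as a strong negative correlation with selection frequency.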

SLIDE 14

Discovery 1

Panels: AlexNet, VGG19, ResNet50 (each shown twice)

AVH hits a plateau very early even when the accuracy or loss is still improving

SLIDE 15

Discovery 2

AVH is an indicator of a model's generalization ability

Panels: AlexNet, VGG19, ResNet50 (each shown twice)

SLIDE 16

Panels: AlexNet, VGG19, ResNet50 (each shown twice)

Discovery 3

The norm of feature embeddings keeps increasing during training

SLIDE 17

Discovery 4

AVH’s correlation with human selection frequency holds across models throughout training

Panels: AlexNet, VGG19, ResNet50 (each shown twice)

SLIDE 18

Discovery 5

The norm’s correlation with human selection frequency is not consistent

Panels: AlexNet, VGG19, ResNet50 (each shown twice)

SLIDE 19

Conjecture on training dynamics of CNNs

  • Softmax cross-entropy loss first optimizes the angles among different classes, while the norm fluctuates and increases very slowly.
  • The angles then become more stable and change very slowly, while the norm increases rapidly.
  • Easy examples: once the angles have decreased enough for correct classification, the softmax cross-entropy loss can be further minimized by increasing the norm.
  • Hard examples: the plateau is caused by being unable either to decrease the angle enough to classify the example correctly or to increase the norm, since doing so would hurt the loss.
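The easy/hard distinction in the conjecture can be checked with a small numerical sketch. It assumes a bias-free linear classifier with two unit-norm class vectors, and the toy features and scale factors are invented for illustration: scaling up the norm of a feature lowers the loss when its angle already favors the true class and raises the loss when it does not.

```python
import math

def cross_entropy(x, classifiers, y):
    """Softmax cross-entropy of feature x with label y for a bias-free linear classifier."""
    logits = [sum(a * b for a, b in zip(w, x)) for w in classifiers]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[y]

def scaled(x, s):
    """Return x with its norm multiplied by s (direction unchanged)."""
    return [s * v for v in x]

# Hypothetical unit-norm classifiers for two classes.
W = [[1.0, 0.0], [0.0, 1.0]]

x_easy = [0.9, 0.1]  # label 0, closer in angle to w_0 (correctly classified)
x_hard = [0.4, 0.6]  # label 0, but closer in angle to w_1 (misclassified)

scales = (1, 5, 10)
easy_losses = [cross_entropy(scaled(x_easy, s), W, 0) for s in scales]
hard_losses = [cross_entropy(scaled(x_hard, s), W, 0) for s in scales]
```

Here `easy_losses` shrinks as the scale grows, while `hard_losses` grows. That matches the intuition that the norm can only help once the angle is right.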

SLIDE 20

PART 03: Applications

SLIDE 21

Self-training and Domain Adaptation

Segmentation classes (figure legend): road, sidewalk, building, wall, fence, pole, traffic light, traffic sign, vegetation, terrain, sky, person, rider, car, truck, bus, train, motorcycle, bike

Zou et al. “Unsupervised domain adaptation for semantic segmentation via class-balanced self-training” ECCV

SLIDE 22

AVH for Self-training and Domain Adaptation

Replace the softmax-based confidence with an AVH-based one during sample selection. Similarly, use AVH-based pseudo-labels.
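A rough sketch of what angle-based sample selection could look like. It assumes a per-class confidence of the form (π − angle to the class classifier), normalized over classes, and a single global threshold. Both choices are assumptions made for this illustration rather than formulas taken from the slide, and class-balanced thresholds as in Zou et al. are not modeled. All names and values are hypothetical.

```python
import math

def angle(u, v):
    """Angle in radians between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

def angular_confidence(x, classifiers):
    """Per-class confidence: (pi - angle), normalized to sum to 1 (assumed form)."""
    scores = [math.pi - angle(x, w) for w in classifiers]
    total = sum(scores)
    return [s / total for s in scores]

def select_pseudo_labels(features, classifiers, threshold):
    """Keep (index, label) pairs whose top angular confidence reaches the threshold."""
    selected = []
    for idx, x in enumerate(features):
        conf = angular_confidence(x, classifiers)
        label = max(range(len(conf)), key=conf.__getitem__)
        if conf[label] >= threshold:
            selected.append((idx, label))
    return selected

# Hypothetical classifiers and unlabeled target-domain features.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
target_features = [[3.0, 0.1], [0.5, 0.5], [-0.2, 2.0]]

picked = select_pseudo_labels(target_features, W, threshold=0.45)
```

With these toy values the first and third features are kept with pseudo-labels 0 and 1, while the second, which sits exactly between two classifiers, is rejected. Because the confidence depends only on angles, a large feature norm alone cannot push a sample over the threshold.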

SLIDE 23

Main Results

SLIDE 24

Inner Metric

Examples chosen by AVH but not Softmax

SLIDE 25

AVH-based loss for Domain Generalization

AVH-based Loss:

SLIDE 26

PART 04: Conclusion

SLIDE 27

Summary

  • Propose AVH as a measure of visual hardness
  • Validate that AVH has a statistically significantly stronger correlation with human selection frequency
  • Make observations on the dynamic evolution of AVH scores during ImageNet training
  • Show the superiority of AVH with its application to self-training for unsupervised domain adaptation and domain generalization

SLIDE 28

Discussions

  • Connection to deep metric learning
  • Connection to fairness in machine learning
  • Connection to knowledge transfer and curriculum learning
  • Uncertainty estimation (aleatoric and epistemic)
  • Adversarial example: a counterexample?

Figure: trajectory of an adversarial example switching from one class to another
SLIDE 29

THANKS

Paper URL Contact: beidi.chen@rice.edu