SLIDE 1

Shallow-Deep Networks: Understanding and Mitigating Network Overthinking

Yiğitcan Kaya, Sanghyun Hong, Tudor Dumitraș

University of Maryland, College Park
ICML 2019 - Long Beach, CA

SLIDE 2

What is overthinking?

We, especially grad students, often think more than needed to solve a problem.

SLIDE 3

What is overthinking?

We, especially grad students, often think more than needed to solve a problem.

i. Wastes our valuable energy (wasteful)

SLIDE 4

What is overthinking?

We, especially grad students, often think more than needed to solve a problem.

i. Wastes our valuable energy (wasteful)
ii. Causes us to make mistakes (destructive)

SLIDE 5

Do deep neural networks overthink too?

Without requiring the full depth, DNNs can correctly classify the majority of samples.

Experiments on four recent CNNs and three common image classification tasks.

SLIDE 6

Do deep neural networks overthink too?

Without requiring the full depth, DNNs can correctly classify the majority of samples.

i. Wastes computation for up to 95% of the samples (wasteful)

SLIDE 7

Do deep neural networks overthink too?

Without requiring the full depth, DNNs can correctly classify the majority of samples.

i. Wastes computation for up to 95% of the samples (wasteful)
ii. Occurs in ~50% of all misclassifications (destructive)

SLIDE 8

How do we detect overthinking?

Internal classifiers allow us to observe whether the DNN correctly classifies the sample at an earlier layer.

SLIDE 9

How do we detect overthinking?

Internal classifiers allow us to observe whether the DNN correctly classifies the sample at an earlier layer.

➢ Our generic Shallow-Deep Network (SDN) modification introduces internal classifiers to DNNs.

SLIDE 10

The SDN modification

[Figure: the original CNN (conv1 → conv2 → conv3 → conv4 → fully connected layers) compared with its SDN modification, which attaches internal classifiers to the internal layers; each internal classifier emits an internal prediction before the final classifier emits the final prediction.]

Applied to VGG, ResNet, WideResNet and MobileNet.
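As a rough illustration, the modification can be sketched in plain Python; everything below (the block and classifier stand-ins, the `ShallowDeepNet` class) is an illustrative assumption, not the authors' implementation:

```python
# Toy sketch of a Shallow-Deep Network: a backbone is a sequence of
# "blocks", and internal classifiers are attached after selected blocks
# so the network can emit a prediction before reaching its final layer.
# All names here are illustrative stand-ins, not the authors' code.

def make_block(scale):
    # Stand-in for a convolutional block: just scales the feature value.
    return lambda x: x * scale

def make_classifier(threshold):
    # Stand-in for a classifier head: predicts class 1 if the feature
    # exceeds a threshold, with a crude "confidence" score.
    def classify(x):
        label = 1 if x > threshold else 0
        confidence = min(abs(x - threshold) / threshold, 1.0)
        return label, confidence
    return classify

class ShallowDeepNet:
    def __init__(self, blocks, internal_heads, final_head):
        self.blocks = blocks                  # backbone feature blocks
        self.internal_heads = internal_heads  # {block_index: classifier}
        self.final_head = final_head

    def forward_all(self, x):
        """Run the full network, collecting every internal prediction."""
        internal = []
        for i, block in enumerate(self.blocks):
            x = block(x)
            if i in self.internal_heads:
                internal.append(self.internal_heads[i](x))
        return internal, self.final_head(x)

net = ShallowDeepNet(
    blocks=[make_block(2.0), make_block(1.5), make_block(1.2)],
    internal_heads={0: make_classifier(1.0), 1: make_classifier(2.0)},
    final_head=make_classifier(3.0),
)
internal_preds, final_pred = net.forward_all(1.0)
```

In the real setting the blocks would be convolutional stages of VGG, ResNet, WideResNet or MobileNet, and each head a small classifier trained on that stage's features.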

SLIDE 11

The SDN modification

Challenge

How to train accurate internal classifiers?

SLIDE 12

The SDN modification

Challenge

How to train accurate internal classifiers?

Prior Work

Claims that this hurts the accuracy of off-the-shelf DNNs and proposes a unique architecture [1].

[1] Huang, Gao, et al. "Multi-scale dense convolutional networks for efficient prediction." ICLR 2018

SLIDE 13

The SDN modification

Challenge

How to train accurate internal classifiers?

Results

Our modification often improves the original accuracy by up to 10%. (See our poster)

SLIDE 14

The wasteful effect of overthinking

[Figure: an early internal classifier already predicts "Horse" correctly, and the final classifier also predicts "Horse"; the remaining computation is wasteful for this correct classification.]

SLIDE 15

The wasteful effect of overthinking

Challenge

How can we know where in the DNN to stop?

SLIDE 16

The wasteful effect of overthinking

Challenge

How can we know where in the DNN to stop?

Our Solution

Classification confidence of the internal classifiers

SLIDE 17

The wasteful effect of overthinking

Our Solution

Classification confidence of the internal classifiers

Results

A confidence-based early exit scheme reduces the average inference cost by up to 50%. (See our poster)
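The early-exit scheme can be sketched as follows, in plain Python; the function name, the threshold value, and the toy probability vectors are illustrative assumptions, not the authors' code:

```python
# Confidence-based early exit: walk the internal classifiers in order of
# depth and stop at the first one whose top-class probability clears a
# confidence threshold; fall back to the final classifier otherwise.
# This is a sketch of the idea, not the authors' implementation.

def early_exit(prob_vectors, threshold=0.9):
    """prob_vectors: per-classifier probability lists, shallow to deep;
    the last entry is the final classifier. Returns (exit_index, label)."""
    for i, probs in enumerate(prob_vectors[:-1]):
        confidence = max(probs)
        if confidence >= threshold:          # confident enough: stop here
            return i, probs.index(confidence)
    final = prob_vectors[-1]                 # no early exit: use full depth
    return len(prob_vectors) - 1, final.index(max(final))

# An easy sample exits at the first internal classifier, skipping the
# deeper (more expensive) layers entirely.
easy = [[0.95, 0.03, 0.02], [0.97, 0.02, 0.01], [0.99, 0.005, 0.005]]
print(early_exit(easy))   # → (0, 0)
```

Averaged over a dataset, samples that exit early are the ones whose computation the full network would have wasted.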

SLIDE 18

The destructive effect of overthinking

[Figure: an internal classifier correctly predicts "Horse", but the final classifier predicts "Dog"; the extra depth is destructive to the correct classification.]

SLIDE 19

The destructive effect causes disagreement

[Figure: the internal classifiers predict "Horse" while the final classifier predicts "Dog"; the internal predictions disagree with the final prediction.]

SLIDE 20

The destructive effect causes disagreement

Challenge

How can we quantify the internal disagreement?

Our Solution

The confusion metric

SLIDE 21

The destructive effect causes disagreement

Our Solution

The confusion metric

Results

Confusion indicates whether a misclassification is likely, making it a reliable error indicator. (See our poster)
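One simple way to instantiate such a disagreement score, in plain Python; this is a hedged sketch that only captures the idea of quantifying internal disagreement, not the paper's exact confusion formula:

```python
# Illustrative disagreement score: the fraction of internal classifiers
# whose prediction disagrees with the final prediction. The paper defines
# its own confusion metric; this function is only a stand-in for the idea.

def confusion(internal_labels, final_label):
    """0.0 means every internal classifier agrees with the final
    prediction; 1.0 means total disagreement."""
    if not internal_labels:
        return 0.0
    disagreements = sum(1 for lbl in internal_labels if lbl != final_label)
    return disagreements / len(internal_labels)

# Agreement all the way down: low confusion, the prediction is trustworthy.
print(confusion(["horse", "horse", "horse"], "horse"))   # → 0.0
# Internal classifiers mostly say "horse" but the network outputs "dog":
# high confusion signals a likely misclassification.
print(confusion(["horse", "horse", "dog"], "dog"))
```

A high score flags exactly the destructive cases from the previous slides, where the early layers were right and the full network was wrong.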

SLIDE 22

The destructive effect causes disagreement

Our Solution

The confusion metric

Results

Backdoor attacks [2] also increase the confusion of the victim DNN for malicious samples. (See our poster)

[2] Gu, Tianyu, et al. "BadNets: Evaluating Backdooring Attacks on Deep Neural Networks." IEEE Access 7 (2019): 47230-47244.

SLIDE 23

Implications

  • Eliminating overthinking would lead to a significant boost in accuracy and a reduction in inference time.

  • We need DNNs that can adjust their complexity based on the required feature complexity.

SLIDE 24

Thank you!

Don’t overthink! Come and see our poster!

Pacific Ballroom – Poster #24 – 06:30-09:00 PM


For more details, visit our website http://shallowdeep.network