Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning

SLIDE 1

Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning

Yu-An Chung (1), Hsuan-Tien Lin (1), Shao-Wen Yang (2)

(1) Dept. of Computer Science and Information Engineering, National Taiwan University, Taiwan

(2) Intel Labs, Intel Corporation, USA

IJCAI 2016

Y.-A. Chung, H.-T. Lin, S.-W. Yang Cost-sensitive Deep Learning IJCAI 2016 1 / 17

SLIDE 2

Outline

1. Cost-sensitive Classification Setup
2. Estimate the costs - Regression Network
3. A novel Cost-aware Pre-training Technique
4. Conclusions

SLIDE 3

Cost-sensitive Classification Setup

SLIDE 4

Cost-sensitive Classification Setup

What is the status of the patient?

Possible statuses: H1N1-infected, Cold-infected, Healthy

A classification problem: grouping patients by status.

Which mistake is more serious? Predicting H1N1 as Healthy vs. predicting Cold as Healthy.

SLIDE 5

Cost-sensitive Classification Setup

Cost-sensitive Classification

Measuring the misclassification costs with a cost matrix C (rows: actual class; columns: predicted class):

            H1N1    Cold    Healthy
H1N1           0    1000     100000
Cold         100       0       3000
Healthy      100      30          0

C(i, j): cost of classifying a class-i example as class j

Regular classification is a special case of cost-sensitive classification (take C(i, j) = 1 for i ≠ j and 0 otherwise).

Cost-sensitive Classification Setup

Input: a training set $S = \{(x_n, y_n)\}_{n=1}^{N}$ and a cost matrix $C$, where $x_n \in \mathcal{X}$ and $y_n \in \mathcal{Y} = \{1, 2, \ldots, K\}$

Goal: use $S$ and $C$ to train a classifier $g : \mathcal{X} \to \mathcal{Y}$ such that the expected cost $C(y, g(x))$ on a test example $(x, y)$ is minimal
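To make the setup concrete, the sketch below averages C(y, g(x)) over a handful of labeled predictions, using the H1N1 cost matrix from this slide. It assumes a zero cost on the diagonal (correct predictions). The helper names `average_cost` and `zero_one_cost` and the toy labels are hypothetical, chosen only for illustration.

```python
# Cost matrix from the H1N1 example: rows = actual class, columns = predicted class.
# Class indices: 0 = H1N1, 1 = Cold, 2 = Healthy.
# Assumption: correct predictions (the diagonal) cost 0.
COST = [
    [0, 1000, 100000],
    [100, 0, 3000],
    [100, 30, 0],
]


def average_cost(cost, y_true, y_pred):
    """Mean of C(y, g(x)) over a set of labeled examples."""
    total = sum(cost[y][p] for y, p in zip(y_true, y_pred))
    return total / len(y_true)


def zero_one_cost(num_classes):
    """Cost matrix of regular classification: 1 per mistake, 0 otherwise."""
    return [[0 if i == j else 1 for j in range(num_classes)]
            for i in range(num_classes)]


# Toy labels and predictions (made up for illustration).
y_true = [0, 1, 2, 2]
y_pred = [2, 1, 2, 1]

print(average_cost(COST, y_true, y_pred))              # 25007.5
print(average_cost(zero_one_cost(3), y_true, y_pred))  # 0.5
```

Both runs contain the same two mistakes. The 0/1 matrix weighs them equally, while the H1N1 matrix lets the single "H1N1 predicted as Healthy" error dominate the average.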

SLIDE 6

Cost-sensitive Classification Setup

Our Contributions

Where are we?

                                            Shallow Models (e.g., SVM)   Deep Learning
Regular (Cost-insensitive) Classification   Well-studied                 Popular and under active study
Cost-sensitive Classification               Well-studied                 Our work

First work that studies Cost-sensitive Deep Learning

1. a novel Cost-sensitive Loss (CSL) for training any deep model (end-to-end)
2. a Cost-sensitive Autoencoder (CAE) equipped with CSL for pre-training deep models (layer-wise)
3. a combination of 1) and 2) as a complete Cost-sensitive Deep Neural Network (CSDNN) solution
4. extensive experimental results showing that deep models indeed outperformed shallow ones (potential to study more!)

SLIDE 7

Estimate the costs - Regression Network

SLIDE 8

Estimate the costs - Regression Network

Regression Network

Network: estimates the per-class costs, with output $r_k(x)$ for class $k$

Training: motivated by an earlier cost-sensitive SVM work, this work derives a Cost-sensitive Loss (CSL) that trains the network cost-sensitively (see paper or poster for details)

Prediction: $g(x) \equiv \operatorname{argmin}_{1 \le k \le K} r_k(x)$
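The prediction rule can be sketched in a few lines: given the network's estimated per-class costs, pick the class with the smallest estimate. The function name `predict` and the example cost values are hypothetical, and classes are indexed from 0 here rather than from 1 as on the slide.

```python
def predict(estimated_costs):
    """Return the 0-based class index k that minimizes the estimated cost r_k(x).

    Ties are broken in favor of the lowest index.
    """
    return min(range(len(estimated_costs)), key=lambda k: estimated_costs[k])


# Hypothetical network outputs r_1(x), r_2(x), r_3(x) for one input x.
r = [850.0, 12.5, 40.0]
print(predict(r))  # 1
```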

SLIDE 9

A novel Cost-aware Pre-training Technique

SLIDE 10

A novel Cost-aware Pre-training Technique

Recap on Unsupervised Pre-training

A classical way of training DNNs, in two steps:

1. Unsupervised layer-wise pre-training
   - Autoencoder, Restricted Boltzmann Machine (RBM)
   - Several Autoencoders or RBMs can then be stacked to form a DNN.
2. End-to-end supervised fine-tuning

Cost-aware Pre-training

Embed the proposed Cost-sensitive Loss (CSL) into the Autoencoder:
   - yields a cost-sensitive version of the Autoencoder (CAE)
   - conducts cost-related feature extraction
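The two-step recipe above can be read as the following control flow. This is only a structural sketch: `pretrain_layer` and `finetune` are hypothetical callables standing in for whatever layer trainer (AE, RBM, or CAE) and fine-tuning procedure are used, and the stand-in versions below perform no real training.

```python
from typing import Any, Callable, List, Sequence, Tuple


def pretrain_then_finetune(
    inputs: Sequence[Any],
    labels: Sequence[Any],
    layer_sizes: Sequence[int],
    pretrain_layer: Callable[[Sequence[Any], int], Tuple[Any, Sequence[Any]]],
    finetune: Callable[[List[Any], Sequence[Any], Sequence[Any]], Any],
) -> Any:
    """Greedy layer-wise pre-training followed by end-to-end fine-tuning."""
    layers: List[Any] = []
    representation = inputs
    for size in layer_sizes:
        # Step 1: train one layer on the previous layer's output, then
        # feed its encoding forward as the input of the next layer.
        layer, representation = pretrain_layer(representation, size)
        layers.append(layer)
    # Step 2: stack the pre-trained layers and fine-tune them together
    # on the labeled data.
    return finetune(layers, inputs, labels)


# Stand-in callables so the control flow can be executed.
def fake_pretrain_layer(representation, size):
    encoded = [[0.0] * size for _ in representation]
    return {"units": size}, encoded


def fake_finetune(layers, inputs, labels):
    return [layer["units"] for layer in layers]


stacked = pretrain_then_finetune(
    [[1.0, 0.0, 1.0]], [2], [4, 2], fake_pretrain_layer, fake_finetune
)
print(stacked)  # [4, 2]
```

Under this reading, cost-aware pre-training changes only what `pretrain_layer` optimizes; the stacking loop itself stays the same.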

SLIDE 11

A novel Cost-aware Pre-training Technique

Autoencoder (AE)

Autoencoder (AE): let $L_{CE}$ denote the reconstruction error of the AE, which is to be minimized (CE stands for cross-entropy).

SLIDE 12

A novel Cost-aware Pre-training Technique

Cost-sensitive Autoencoder (CAE) for cost-aware pre-training

CAE: reconstruct $x$ and estimate $C$ simultaneously

Objective function for CAE: $(1 - \beta) \cdot L_{CE} + \beta \cdot L_{CSL}$, with $\beta \in [0, 1]$

When $\beta = 0$, CAE ≡ AE
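The weighting in the CAE objective can be illustrated numerically. In the sketch below, the reconstruction term is written as a binary cross-entropy, which assumes inputs scaled to [0, 1]. The CSL term is not defined on these slides (see the paper), so it is passed in as an already-computed number. The function names and the sample values are hypothetical.

```python
import math


def reconstruction_cross_entropy(x, x_hat, eps=1e-12):
    """Binary cross-entropy between an input x in [0, 1] and its reconstruction.

    Assumption: this stands in for L_CE; eps guards against log(0).
    """
    return -sum(
        xi * math.log(max(xh, eps)) + (1.0 - xi) * math.log(max(1.0 - xh, eps))
        for xi, xh in zip(x, x_hat)
    )


def cae_objective(l_ce, l_csl, beta):
    """Convex combination (1 - beta) * L_CE + beta * L_CSL, with beta in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return (1.0 - beta) * l_ce + beta * l_csl


l_ce = reconstruction_cross_entropy([1.0, 0.0], [0.5, 0.5])  # equals 2 * ln 2
l_csl = 10.0  # placeholder value; the real CSL is defined in the paper

print(cae_objective(l_ce, l_csl, 0.0) == l_ce)  # True: beta = 0 recovers the plain AE
print(cae_objective(2.0, 10.0, 0.25))           # 4.0
```

Setting `beta` to 1 would instead train the layer on the cost term alone, with no reconstruction pressure.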

SLIDE 13

A novel Cost-aware Pre-training Technique

Experimental Results (Selected)

3 methods were compared to show the validity of CSL and CAE:

Method       Cost-sensitive pre-training?   Cost-sensitive training?
DNN          no                             no
DNN + CSL    no                             yes
CSDNN        yes                            yes

SLIDE 14

Conclusions

SLIDE 15

Conclusions

Conclusions

- CSL: makes any deep model cost-sensitive (see paper for details)
- CSDNN = CAE pre-training + CSL fine-tuning: both techniques lead to significant improvements
- Extensive experimental results showed the superiority of CSDNN (see paper or poster)

SLIDE 16

Conclusions

Thank you!

SLIDE 17

Supplementary Materials

β vs. Test Costs

[Figure: test cost (vertical axis) versus β (horizontal axis), one panel per dataset: MNIST_imb, bg-img-rot_imb, SVHN_imb, CIFAR-10_imb]
