Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning

Yu-An Chung (1), Hsuan-Tien Lin (1), Shao-Wen Yang (2)
(1) Dept. of Computer Science and Information Engineering, National Taiwan University, Taiwan
(2) Intel Labs, Intel Corporation, USA

IJCAI 2016

Y.-A. Chung, H.-T. Lin, S.-W. Yang, Cost-sensitive Deep Learning, IJCAI 2016

Outline
1. Cost-sensitive Classification Setup
2. Estimate the costs - Regression Network
3. A novel Cost-aware Pre-training Technique
4. Conclusions


Cost-sensitive Classification Setup: What is the status of the patient?

Possible statuses: H1N1-infected, Cold-infected, Healthy.
- This is a classification problem: grouping patients by status.
- Which mistake is more serious: predicting H1N1 as Healthy, or predicting Cold as Healthy?

Cost-sensitive Classification Setup: Cost-sensitive Classification

Measuring the misclassification costs with a cost matrix $C$ (rows: actual class; columns: predicted class):

Actual \ Predicted    H1N1    Cold    Healthy
H1N1                     0    1000     100000
Cold                   100       0       3000
Healthy                100      30          0

- $C(i, j)$: cost of classifying a class-$i$ example as class $j$.
- Regular classification is a special case of cost-sensitive classification (take $C(i, j) = 0$ when $i = j$ and $C(i, j) = 1$ otherwise).

Cost-sensitive classification setup
- Input: a training set $S = \{(x_n, y_n)\}_{n=1}^{N}$ and a cost matrix $C$, where $x_n \in \mathcal{X}$ and $y_n \in \mathcal{Y} = \{1, 2, \ldots, K\}$.
- Goal: use $S$ and $C$ to train a classifier $g \colon \mathcal{X} \to \mathcal{Y}$ such that the expected cost $C(y, g(x))$ on a test example $(x, y)$ is minimal.
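To make the setup concrete, the following is a minimal NumPy sketch that stores the cost matrix from the slide and averages $C(y, g(x))$ over a handful of predictions. The helper name `average_cost`, the 0-based class indices, and the toy labels are assumptions made for illustration; they do not come from the talk.

```python
import numpy as np

# Cost matrix from the slide: rows = actual class, columns = predicted class.
# Classes are indexed from 0 here (the slide uses 1..K).
CLASSES = ["H1N1", "Cold", "Healthy"]
C = np.array([
    [0,   1000, 100000],
    [100,    0,   3000],
    [100,   30,      0],
])


def average_cost(cost_matrix, y_true, y_pred):
    """Mean of C(y, g(x)) over the given examples (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return cost_matrix[y_true, y_pred].mean()


# Regular classification as a special case: cost 0 on the diagonal, 1 elsewhere.
ZERO_ONE = 1 - np.eye(len(CLASSES), dtype=int)

# Toy labels (illustrative only): two sick patients are predicted Healthy.
y_true = [0, 1, 2, 0]
y_pred = [2, 2, 2, 0]

cost_sensitive = average_cost(C, y_true, y_pred)     # (100000 + 3000 + 0 + 0) / 4
error_rate = average_cost(ZERO_ONE, y_true, y_pred)  # 2 mistakes out of 4
print(cost_sensitive, error_rate)
```

Both mistakes count equally under the 0/1 matrix, while the cost matrix makes the H1N1-as-Healthy mistake dominate the average.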

Cost-sensitive Classification Setup: Our Contributions

Where are we?

                                             Shallow models (e.g., SVM)    Deep learning
Regular (cost-insensitive) classification    Well-studied                  Popular and ongoing
Cost-sensitive classification                Well-studied                  Our work

First work that studies cost-sensitive deep learning:
1. a novel Cost-sensitive Loss (CSL) for training any deep model (end-to-end);
2. a Cost-sensitive Autoencoder (CAE) equipped with CSL for pre-training deep models (layer-wise);
3. a combination of 1) and 2) as a complete Cost-sensitive Deep Neural Network (CSDNN) solution;
4. extensive experimental results showing that deep models indeed outperformed shallow ones (potential to study more!).


Estimate the costs - Regression Network: Regression Network

- Network: estimates the per-class costs; output $r_k(x)$ is the estimated cost of predicting class $k$ for input $x$.
- Training: motivated by an earlier cost-sensitive SVM work, a Cost-sensitive Loss (CSL) that trains the network cost-sensitively is derived in this work (see paper or poster for details).
- Prediction: $g(x) \equiv \operatorname{argmin}_{1 \le k \le K} r_k(x)$
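The prediction rule can be sketched in a few lines of NumPy. The function name `predict_min_cost` and the example values of $r_k(x)$ are hypothetical; the sketch only shows the argmin step and says nothing about how the network or CSL produce the estimates.

```python
import numpy as np


def predict_min_cost(estimated_costs):
    """Pick, for each example, the class with the smallest estimated cost.

    estimated_costs has shape (n_examples, K); row n holds r_1(x_n), ..., r_K(x_n).
    The returned class indices are 0-based, whereas the slides use 1..K.
    """
    return np.argmin(np.asarray(estimated_costs), axis=1)


# Made-up network outputs for two examples and K = 3 classes.
r = np.array([
    [50.0, 900.0, 80000.0],
    [120.0, 20.0, 5.0],
])
pred = predict_min_cost(r)
print(pred)
```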


A novel Cost-aware Pre-training Technique: Recap on Unsupervised Pre-training

A classical way of training DNNs, in two steps:
1. Unsupervised layer-wise pre-training
   - Autoencoder, Restricted Boltzmann Machine (RBM)
   - Several autoencoders or RBMs can then be stacked to form a DNN.
2. End-to-end supervised fine-tuning

Cost-aware pre-training
- Embed the proposed Cost-sensitive Loss (CSL) into the autoencoder.
- The result is a cost-sensitive version of the autoencoder (CAE).
- The CAE conducts cost-related feature extraction.
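As a rough illustration of the first step (plain, cost-insensitive layer-wise pre-training), the sketch below trains one small autoencoder per layer and feeds each layer's codes to the next. Everything specific here is an assumption for the example: sigmoid units, inputs scaled to [0, 1], full-batch gradient descent, the hyperparameters, and the names `train_autoencoder` and `pretrain_stack`. Supervised fine-tuning and the cost-aware variant are not shown.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_autoencoder(x, n_hidden, epochs=100, lr=0.5, seed=0):
    """Fit one sigmoid autoencoder with cross-entropy loss; return the encoder only."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W_enc = rng.normal(scale=0.1, size=(d, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(scale=0.1, size=(n_hidden, d))
    b_dec = np.zeros(d)
    for _ in range(epochs):
        h = sigmoid(x @ W_enc + b_enc)        # encode
        x_hat = sigmoid(h @ W_dec + b_dec)    # decode (reconstruct)
        # Gradient of the mean cross-entropy w.r.t. the decoder pre-activation.
        grad_out = (x_hat - x) / n
        # Back-propagate through the decoder weights and the hidden sigmoid.
        grad_hid = (grad_out @ W_dec.T) * h * (1.0 - h)
        W_dec -= lr * (h.T @ grad_out)
        b_dec -= lr * grad_out.sum(axis=0)
        W_enc -= lr * (x.T @ grad_hid)
        b_enc -= lr * grad_hid.sum(axis=0)
    return W_enc, b_enc


def pretrain_stack(x, layer_sizes):
    """Greedy layer-wise pre-training: each autoencoder sees the previous codes."""
    params, inputs = [], x
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(inputs, n_hidden)
        params.append((W, b))
        inputs = sigmoid(inputs @ W + b)  # codes become the next layer's input
    return params


rng = np.random.default_rng(1)
x = rng.random((8, 6))  # toy data in [0, 1]
stack = pretrain_stack(x, layer_sizes=[4, 3])
```

The returned encoder weights would then initialize the hidden layers of a DNN before end-to-end fine-tuning.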

A novel Cost-aware Pre-training Technique: Autoencoder (AE)

Autoencoder (AE): let $L_{CE}$ denote the reconstruction error of the AE to be minimized (CE stands for cross-entropy).
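One common way to compute a cross-entropy reconstruction error is sketched below. It assumes inputs scaled to [0, 1] and sigmoid output units, neither of which is stated on the slide, and the function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode x to a hidden code h, then decode h to a reconstruction x_hat."""
    h = sigmoid(x @ W_enc + b_enc)
    x_hat = sigmoid(h @ W_dec + b_dec)
    return h, x_hat


def reconstruction_cross_entropy(x, x_hat, eps=1e-12):
    """Cross-entropy between x and x_hat, summed over features, averaged over examples."""
    x_hat = np.clip(x_hat, eps, 1.0 - eps)  # avoid log(0)
    per_example = -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat), axis=1)
    return per_example.mean()


# Toy data and randomly initialized weights (illustrative only).
rng = np.random.default_rng(0)
x = rng.random((4, 6))
W_enc, b_enc = rng.normal(scale=0.1, size=(6, 3)), np.zeros(3)
W_dec, b_dec = rng.normal(scale=0.1, size=(3, 6)), np.zeros(6)

h, x_hat = autoencoder_forward(x, W_enc, b_enc, W_dec, b_dec)
loss_ce = reconstruction_cross_entropy(x, x_hat)
```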

A novel Cost-aware Pre-training Technique: Cost-sensitive Autoencoder (CAE) for cost-aware pre-training

- CAE: reconstruct $x$ and estimate $C$ simultaneously.
- Objective function for CAE: $(1 - \beta) \times L_{CE} + \beta \times L_{CSL}$, with $\beta \in [0, 1]$.
- When $\beta = 0$, CAE $\equiv$ AE.
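The weighting in the CAE objective can be illustrated with a small helper. The name `cae_objective` is hypothetical, and the two loss values are taken as already-computed numbers, since the slides defer the definition of $L_{CSL}$ to the paper.

```python
def cae_objective(loss_ce, loss_csl, beta):
    """Combine the two losses as (1 - beta) * L_CE + beta * L_CSL."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return (1.0 - beta) * loss_ce + beta * loss_csl


# Made-up loss values, for illustration only.
loss_ce, loss_csl = 2.0, 10.0

plain_ae = cae_objective(loss_ce, loss_csl, beta=0.0)    # reduces to L_CE (plain AE)
csl_only = cae_objective(loss_ce, loss_csl, beta=1.0)    # only the cost-sensitive term
mixed = cae_objective(loss_ce, loss_csl, beta=0.25)      # 0.75 * 2.0 + 0.25 * 10.0
print(plain_ae, csl_only, mixed)
```

Setting `beta=0.0` recovers the plain autoencoder objective, matching the note that CAE reduces to AE in that case.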

A novel Cost-aware Pre-training Technique: Experimental Results (Selected)

Three methods were compared to show the validity of CSL and CAE:

Method       Cost-sensitive pre-training?    Cost-sensitive training?
DNN          no                              no
DNN + CSL    no                              yes
CSDNN        yes                             yes


Conclusions

- CSL: makes any deep model cost-sensitive (see paper for details).
- CSDNN = CAE pre-training + CSL fine-tuning: both techniques lead to significant improvements.
- Extensive experimental results showed the superiority of CSDNN (see paper or poster).

Thank you!

Supplementary Materials: β vs. Test Costs

[Figure: test cost as a function of β ∈ [0, 1], one panel per dataset: MNIST imb, bg-img-rot imb, SVHN imb, CIFAR-10 imb.]
