Data Valuation using Reinforcement Learning

SLIDE 1

Jinsung Yoon, Sercan O. Arik, Tomas Pfister

Google Cloud AI

Data Valuation using Reinforcement Learning

2020 International Conference on Machine Learning (ICML 2020)

SLIDE 2

Problem Definition

  • What is data valuation?

○ How much does each data sample contribute to the trained model?

Amirata Ghorbani, James Y. Zou, Data Shapley: Equitable Valuation of Data for Machine Learning, ICML, 2019

SLIDE 3

Objective & Use-cases

  • Learn in a reliable way
  • Data valuation

○ Fair valuation for the labelers and data providers
○ Insights about the dataset

Ruoxi Jia et al., Towards Efficient Data Valuation Based on the Shapley Value, AISTATS, 2019

SLIDE 4

Objective & Use-cases

  • Learn in a reliable way
  • Corrupted sample discovery

[Figure: high-value samples vs. low-value samples]

SLIDE 5

Objective & Use-cases

  • Learn in a reliable way
  • Robust learning with noisy (or cheaply acquired) datasets

○ Augmented learning

[Figure: high-valued samples and cheaply acquired samples]

Amirata Ghorbani, James Y. Zou, Data Shapley: Equitable Valuation of Data for Machine Learning, ICML, 2019

SLIDE 6

Objective & Use-cases

  • Learn in a reliable way
  • Domain adaptation

○ Assign higher values to samples from the target distribution

[Figure: a training set with Types A, B, C and D, and a target set with Type D; high-valued samples are highlighted]

SLIDE 7

Related work - Leave-one-out

  • Not reasonable when two training samples are similar: removing either one barely changes the model, so both receive a low value.
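The leave-one-out weakness can be reproduced on a toy problem. The sketch below is illustrative only: the 1-nearest-neighbour predictor, the helper names `nn_accuracy` and `leave_one_out_values`, and the data are assumptions made for this example and are not taken from the slides.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label)


def nn_accuracy(train: Sequence[Sample], valid: Sequence[Sample]) -> float:
    """Accuracy of a 1-nearest-neighbour rule (toy stand-in for a predictor)."""
    if not train:
        return 0.0
    correct = 0
    for x, y in valid:
        _, pred = min(train, key=lambda s: abs(s[0] - x))
        correct += int(pred == y)
    return correct / len(valid)


def leave_one_out_values(train: List[Sample], valid: List[Sample]) -> List[float]:
    """Value of sample i = score with all data minus score without sample i."""
    full = nn_accuracy(train, valid)
    return [
        full - nn_accuracy(train[:i] + train[i + 1:], valid)
        for i in range(len(train))
    ]


# Hypothetical data: the last two training samples are exact duplicates.
train = [(0.0, 0), (1.0, 1), (1.0, 1)]
valid = [(0.1, 0), (0.9, 1)]

values = leave_one_out_values(train, valid)
print(values)  # [0.5, 0.0, 0.0]
```

Both duplicates receive a value of zero, even though removing them together would lose class 1 entirely.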

Amirata Ghorbani, James Y. Zou, Data Shapley: Equitable Valuation of Data for Machine Learning, ICML, 2019

SLIDE 8

Related work - Data Shapley

  • Computational complexity is exponential in the number of samples.
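To see where the exponential cost comes from, here is a minimal sketch of the exact Shapley value computed by enumerating every subset. The function `exact_shapley` and the toy `coverage_utility` are hypothetical names introduced for this example; practical Data Shapley methods rely on approximations rather than this enumeration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List


def exact_shapley(n: int, utility: Callable[[FrozenSet[int]], float]) -> List[float]:
    """Exact Shapley values; visits all 2^(n-1) subsets for each of n samples."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a subset of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi += weight * (utility(s | {i}) - utility(s))
        values.append(phi)
    return values


# Hypothetical utility: fraction of the two classes covered by the subset.
classes = ["a", "b", "b"]  # samples 1 and 2 are duplicates


def coverage_utility(subset: FrozenSet[int]) -> float:
    return len({classes[j] for j in subset}) / 2


phi = exact_shapley(len(classes), coverage_utility)
print([round(v, 3) for v in phi])  # [0.5, 0.25, 0.25]
```

Unlike leave-one-out, the duplicates share the credit for class "b", but every extra sample doubles the number of subsets to evaluate.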

Amirata Ghorbani, James Y. Zou, Data Shapley: Equitable Valuation of Data for Machine Learning, ICML, 2019

SLIDE 9

Challenges & Motivation

  • The search space is extremely large.

○ Impossible to explore the entire space.

  • Training processes can be non-differentiable.

○ The selection operation (i.e., the sampler block) is non-differentiable.
○ Performance metrics can be non-differentiable (accuracy, AUC).
○ End-to-end back-propagation may not be possible.

  • Reinforcement learning is an efficient way to explore a large search space and to handle non-differentiable processes.
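The points above can be illustrated with a generic REINFORCE-style update. This is a sketch under stated assumptions, not the authors' implementation: the valuator is a plain table of per-sample logits, the predictor is a 1-nearest-neighbour rule, the reward is validation accuracy, and the names `nn_accuracy` and `reinforce_selection` plus all hyperparameters are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label)


def nn_accuracy(train: Sequence[Sample], valid: Sequence[Sample]) -> float:
    """Non-differentiable reward: accuracy of a 1-nearest-neighbour rule."""
    if not train:
        return 0.0
    hits = sum(min(train, key=lambda s: abs(s[0] - x))[1] == y for x, y in valid)
    return hits / len(valid)


def reinforce_selection(train: List[Sample], valid: List[Sample],
                        steps: int = 2000, lr: float = 0.2,
                        seed: int = 0) -> List[float]:
    """Learn per-sample selection probabilities with a score-function gradient."""
    rng = random.Random(seed)
    logits = [0.0] * len(train)
    baseline = None
    for _ in range(steps):
        probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        # Non-differentiable sampler: keep sample i with probability probs[i].
        mask = [1 if rng.random() < p else 0 for p in probs]
        selected = [s for s, m in zip(train, mask) if m]
        reward = nn_accuracy(selected, valid)
        if baseline is None:
            baseline = reward
        advantage = reward - baseline
        # Gradient of log Bernoulli(mask_i; sigmoid(logit_i)) w.r.t. logit_i
        # is (mask_i - prob_i); no gradient flows through the reward itself.
        for i in range(len(logits)):
            logits[i] += lr * advantage * (mask[i] - probs[i])
        baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Hypothetical data: the last training sample carries a wrong label.
train = [(0.0, 0), (0.2, 0), (0.4, 0), (1.0, 1), (1.2, 1), (1.4, 1), (0.1, 1)]
valid = [(0.08, 0), (0.12, 0), (0.3, 0), (1.1, 1), (1.3, 1)]

probs = reinforce_selection(train, valid)
print([round(p, 2) for p in probs])
```

Neither the sampling step nor the accuracy metric needs to be differentiable here; only the log-probability of the sampled mask is differentiated.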

SLIDE 10

High-level figure for DVRL

  • Jointly train selector and predictor in an end-to-end way.

SLIDE 11

Problem formulation

  • Components

○ Training set:
○ Validation set:
○ Predictor model:
○ Data valuation model:

[Equation annotations: "To minimize the validation loss", "Weighted optimization for predictor"]
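The symbols on this slide did not survive extraction. One way to write the bilevel objective that the two annotations describe is sketched below; every symbol is an assumption introduced here, and the empirical averages stand in for whatever expectations the original slide used.

```latex
% Assumed notation (not recovered from the slide):
%   D   = \{(x_i, y_i)\}_{i=1}^{N}        training set
%   D^v = \{(x^v_k, y^v_k)\}_{k=1}^{L}    validation set
%   f_\theta                              predictor model
%   h_\phi(x, y) \in [0, 1]               data valuation model
\begin{aligned}
\min_{h_\phi}\;& \frac{1}{L}\sum_{k=1}^{L}
    \mathcal{L}_h\bigl(f_\theta(x^v_k),\, y^v_k\bigr)
  && \text{(minimize the validation loss)} \\
\text{s.t.}\;& f_\theta = \arg\min_{\hat f}\; \frac{1}{N}\sum_{i=1}^{N}
    h_\phi(x_i, y_i)\,\mathcal{L}_f\bigl(\hat f(x_i),\, y_i\bigr)
  && \text{(weighted optimization for the predictor)}
\end{aligned}
```

The inner problem trains the predictor on value-weighted training losses; the outer problem adjusts the valuation model so that the resulting predictor performs well on the validation set.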

SLIDE 12

Block diagram

SLIDE 13

Experiments - How to quantitatively evaluate the data valuation?

  • Remove high / low valued samples
  • Corrupted sample discovery
  • Robust learning with noisy data
  • Domain adaptation

SLIDE 14

Results - Remove high / low valued samples

  • Standard supervised learning setting (train, validation, and test datasets come from the same distribution)

  • Remove high valued samples: Fastest performance degradation
  • Remove low valued samples: Slowest performance degradation
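The removal protocol can be sketched as follows: rank the training samples by value, drop the top or bottom k, retrain, and record test accuracy. The 1-nearest-neighbour predictor, the helper names, the data (which include two mislabelled points), and the hand-assigned values are all hypothetical; they only stand in for a real predictor and a real valuator.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature, label)


def nn_accuracy(train: Sequence[Sample], test: Sequence[Sample]) -> float:
    """Accuracy of a 1-nearest-neighbour rule (toy stand-in for retraining)."""
    if not train:
        return 0.0
    hits = sum(min(train, key=lambda s: abs(s[0] - x))[1] == y for x, y in test)
    return hits / len(test)


def removal_curve(train: List[Sample], test: List[Sample],
                  values: List[float], removed_counts: List[int],
                  remove_high: bool) -> List[float]:
    """Test accuracy after removing the k highest- or lowest-valued samples."""
    order = sorted(range(len(train)), key=lambda i: values[i], reverse=remove_high)
    curve = []
    for k in removed_counts:
        kept = [train[i] for i in order[k:]]
        curve.append(nn_accuracy(kept, test))
    return curve


# Hypothetical data: the last two training samples are mislabelled.
train = [(0.0, 0), (1.0, 1), (0.3, 0), (0.7, 1), (0.45, 1), (0.6, 0)]
values = [0.9, 0.8, 0.6, 0.5, 0.1, 0.05]  # hand-assigned, not learned
test = [(0.1, 0), (0.4, 0), (0.9, 1), (0.62, 1)]

low_curve = removal_curve(train, test, values, [0, 2, 4], remove_high=False)
high_curve = removal_curve(train, test, values, [0, 2, 4], remove_high=True)
print(low_curve)   # [0.5, 1.0, 1.0]
print(high_curve)  # [0.5, 0.5, 0.0]
```

With a good valuator, the curve for removing high-valued samples should fall quickly, while the curve for removing low-valued samples should hold up or improve.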

SLIDE 15

Results - Corrupted sample discovery

  • Corrupted sample setting (20% label noise)
  • DVRL achieves the highest True Positive Rate (TPR) for corrupted sample discovery among the compared methods
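A discovery curve of this kind can be computed by inspecting samples from the lowest value upward and counting how many corrupted samples have been found. The function name `discovery_curve` and the example values are hypothetical; the 20% corruption rate mirrors the setting above.

```python
from typing import List


def discovery_curve(values: List[float], is_corrupted: List[bool],
                    inspected_counts: List[int]) -> List[float]:
    """Fraction of corrupted samples found among the k lowest-valued samples."""
    total = sum(is_corrupted)
    if total == 0:
        raise ValueError("no corrupted samples to discover")
    order = sorted(range(len(values)), key=lambda i: values[i])
    curve = []
    for k in inspected_counts:
        found = sum(is_corrupted[i] for i in order[:k])
        curve.append(found / total)
    return curve


# Hypothetical values for 10 samples, 2 of which (20%) have corrupted labels.
values = [0.9, 0.1, 0.8, 0.2, 0.7, 0.6, 0.15, 0.95, 0.85, 0.75]
is_corrupted = [False, True, False, True, False,
                False, False, False, False, False]

curve = discovery_curve(values, is_corrupted, [1, 2, 3])
print(curve)  # [0.5, 0.5, 1.0]
```

A valuator that ranks corrupted samples lower reaches a TPR of 1.0 after inspecting fewer samples.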

SLIDE 16

Results - Robust learning with noisy labels (40%)

  • Demonstrates the scalability of DVRL to complex models (WideResNet-28-10 and ResNet-32) and large datasets (CIFAR)
  • State-of-the-art robust learning performance

Mengye Ren et al., Learning to Reweight Examples for Robust Deep Learning, ICML, 2018

SLIDE 17

Results - Domain adaptation on Retail dataset

[Figure: three training settings evaluated on a Type D testing set. Train-on-All: training set with Types A, B, C and D. Train-on-Rest: training set with Types A, B and C. Train-on-Specific: training set with Type D only.]

SLIDE 18

Results - Domain adaptation on Retail dataset

  • Significant gain in the Train-on-Rest setting (largest domain mismatch)
  • Reasonable gain in the Train-on-All setting (most common setting)
  • Marginal gain in the Train-on-Specific setting (no domain mismatch)

SLIDE 19

Results - Domain adaptation in other domains

  • Main source of gain:

○ DVRL jointly optimizes the data valuator and the corresponding predictor model

Amirata Ghorbani, James Y. Zou, Data Shapley: Equitable Valuation of Data for Machine Learning, ICML, 2019

SLIDE 20

Discussion: How many validation samples are needed?

  • A small number of validation samples is enough for DVRL training.
  • Reasonable performance even with 10 validation samples on the Adult dataset.

SLIDE 21

Codebase of DVRL

  • DVRL on GitHub: https://github.com/google-research/google-research/tree/master/dvrl
  • DVRL on AI Hub: https://aihub.cloud.google.com/u/0/p/products%2Fcb6b588c-1582-4868-a944-dc70ebe61a36