Low-Cost Learning via Active Data Procurement (EC 2015) - PowerPoint PPT Presentation

SLIDE 1

Low-Cost Learning via Active Data Procurement

EC 2015 · Jacob Abernethy, Yiling Chen, Chien-Ju Ho, Bo Waggoner

SLIDE 2

General problem: buy data for learning

Learners LLC ("We Buy Data!") buys data and outputs a hypothesis (predictor) h

SLIDE 3

General problem: buy data for learning

Example: each person has medical data … learn to predict disease

SLIDE 4

Example task: classification

  • Data point: pair (x, label) where the label is one of two classes
  • Hypothesis: hyperplane separating the two types
  • Loss: 0 if h(x) = correct label, 1 if incorrect label
  • Goal: pick h with low expected loss on a new data point
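The hyperplane hypothesis and 0/1 loss above can be sketched in a few lines of Python; the hyperplane parameters and data points here are hypothetical, purely for illustration.

```python
import numpy as np

def predict(w, b, x):
    """Hyperplane hypothesis h: the sign of w·x + b is the predicted label."""
    return 1 if np.dot(w, x) + b >= 0 else -1

def zero_one_loss(w, b, x, label):
    """0 if h(x) equals the correct label, 1 if incorrect."""
    return 0 if predict(w, b, x) == label else 1

# Hypothetical 2-D example: the hyperplane x1 + x2 = 0 separates the two classes.
w, b = np.array([1.0, 1.0]), 0.0
print(zero_one_loss(w, b, np.array([2.0, 1.0]), 1))    # correct side -> loss 0
print(zero_one_loss(w, b, np.array([-2.0, -1.0]), 1))  # wrong side -> loss 1
```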

SLIDE 5

General Goal: learn a good hypothesis by purchasing data from the crowd

SLIDE 6

This paper:

  • 1. price data actively based on value
  • 2. machine-learning style bounds
  • 3. transform learning algs to mechanisms (learning alg → mechanism)


SLIDE 8

How to assess the value/price of data?

SLIDE 9

Use the learner’s current hypothesis!


SLIDE 11

Our model

Agents’ data points z1, z2, … are drawn i.i.d. from a distribution; the mechanism outputs a hypothesis h. Each agent t also has a cost ct of revealing their data, which:

  • lies in [0,1]
  • is worst-case, arbitrarily correlated with the data
  • arrives online
SLIDE 12

Agent-mechanism interaction

At each time t = 1, …, T:

  • 1. mechanism posts menu (e.g. data: 65, 30, 65 at prices $0.22, $0.41, $0.88)
SLIDE 13

Agent-mechanism interaction

At each time t = 1, …, T:

  • 1. mechanism posts menu (e.g. data: 65, 30, 65 at prices $0.22, $0.41, $0.88)
  • 2. agent arrives with (zt, ct):
    ○ accepts: mechanism learns (zt, ct) and pays price(zt)
    ○ rejects: mechanism sees the rejection and pays nothing
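The interaction protocol above can be sketched as a simple loop; `post_menu` and the agent sequence are placeholders for illustration, not part of the paper's formal model.

```python
def interact(post_menu, agents):
    """Run the agent-mechanism protocol: at each time t the mechanism posts a
    price menu; the arriving agent with (data z_t, cost c_t) accepts iff the
    offered price for z_t covers the cost of revealing the data."""
    purchased, spent = [], 0.0
    for t, (z, c) in enumerate(agents):
        menu = post_menu(t)          # menu maps a data point to an offered price
        price = menu(z)
        if price >= c:               # accept: mechanism learns (z, c), pays price(z)
            purchased.append((z, c))
            spent += price
        # reject: mechanism sees only the rejection and pays nothing
    return purchased, spent

# Hypothetical run: a flat $0.50 menu against three agents.
agents = [(65, 0.22), (30, 0.41), (65, 0.88)]
print(interact(lambda t: (lambda z: 0.50), agents))
```

The third agent's cost exceeds the offered price, so only the first two data points are purchased.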
SLIDE 14

This paper:

  • 1. price data actively based on value
  • 2. machine-learning style bounds
  • 3. transform learning algs to mechanisms (learning alg → mechanism)

SLIDE 15

What is the “classic” learning problem?

Data points z1, z2, … are drawn i.i.d. from a distribution; the learning alg outputs a hypothesis h.

SLIDE 16

Classic ML bounds

E[loss(h)] ≤ E[loss(h*)] + O(√(VC-dim / T))

where h is the alg’s hypothesis, h* is the optimal hypothesis, T is the # of data points, and VC-dim is a measure of problem difficulty.
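A quick numeric reading of the O(√(VC-dim / T)) term (the constant is chosen arbitrarily here, standing in for the one hidden by the big-O):

```python
import math

def excess_risk_bound(vc_dim, T, C=1.0):
    """The O(sqrt(VC-dim / T)) excess-risk term: harder classes (larger VC
    dimension) need more data points T for the same guarantee. C stands in
    for the unspecified constant hidden by the big-O."""
    return C * math.sqrt(vc_dim / T)

print(excess_risk_bound(10, 1000))  # 0.1
print(excess_risk_bound(10, 4000))  # 0.05: quadrupling the data halves the bound
```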

SLIDE 17

Main result

For a variety of learning problems:

E[loss(h)] ≤ E[loss(h*)] + O(√(γ / B))

where h is our hypothesis, h* is the optimal hypothesis, B is the budget constraint, and γ ∈ [0,1] is a measure of “problem difficulty”.

(Assume: γ is approximately known in advance)

SLIDE 18

Main result

For a variety of learning problems:

E[loss(h)] ≤ E[loss(h*)] + O(√(γ / B))

where h is our hypothesis, h* is the optimal hypothesis, B is the budget constraint, and γ ∈ [0,1] is a measure of “problem difficulty”.

(Assume: γ is approximately known in advance)

γ ≈ average cost × difficulty: “if the problem is cheap or easy or has good correlations, we do well”

SLIDE 19

Related work in purchasing data (by type of goal and model):

  • this work
  • Meir, Procaccia, Rosenschein 2012
  • Cummings, Ligett, Roth, Wu, Ziani 2015
  • Dekel, Fischer, Procaccia 2008
  • Ghosh, Ligett, Roth, Schoenebeck 2014
  • Horel, Ioannidis, Muthukrishnan 2014
  • Roth, Schoenebeck 2012
  • Ligett, Roth 2012
  • Cai, Daskalakis, Papadimitriou 2015

SLIDE 20

Key features/ideas of this paper:

  • 1. price data actively based on value
  • 2. machine-learning style bounds
  • 3. transform learning algs to mechanisms (learning alg → mechanism)

SLIDE 21

Learning algorithms: FTRL

  • Follow-The-Regularized-Leader (FTRL): Multiplicative Weights, Online Gradient Descent, …
  • FTRL algs do “no regret” learning:
    ○ output a hypothesis at each time
    ○ want low total loss
  • we interface with FTRL as a black box … but analysis relies on “opening the box”
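As a concrete FTRL instance, here is a minimal Multiplicative Weights sketch over a finite set of experts; the step size and loss sequence are illustrative choices, not values from the talk.

```python
import math

def multiplicative_weights(loss_sequence, eta=0.5):
    """Multiplicative Weights, an instance of Follow-The-Regularized-Leader
    with an entropy regularizer: output a hypothesis (a distribution over
    experts) at each time, then exponentially downweight experts by their loss."""
    n = len(loss_sequence[0])
    w = [1.0] * n
    hypotheses = []
    for losses in loss_sequence:
        total = sum(w)
        hypotheses.append([wi / total for wi in w])  # hypothesis at time t
        w = [wi * math.exp(-eta * l) for wi, l in zip(w, losses)]
    return hypotheses

# Expert 0 always suffers zero loss, so mass shifts toward it over time.
hyps = multiplicative_weights([[0.0, 1.0]] * 5)
print(hyps[0], hyps[-1])
```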

SLIDE 22

Our mechanism

At each time t = 1, …, T:

  • 1. post menu: price(z) ~ distribution(ht, z), where ht is the Alg’s current hypothesis

SLIDE 23

Our mechanism

At each time t = 1, …, T:

  • 1. post menu: price(z) ~ distribution(ht, z), where ht is the Alg’s current hypothesis
  • 2. agent arrives with (zt, ct):
    ○ accepts: de-biased data is fed to the Alg
    ○ rejects: the null data point is fed to the Alg
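One standard way to realize the “de-biased data” step is inverse-probability weighting; this sketch illustrates that idea under stated assumptions, not the paper's exact construction.

```python
def debiased_point(z, accept_prob):
    """Scale a purchased data point by 1 / Pr[accept] so that, averaged over
    the randomness of the posted price, the data fed to the learning alg is an
    unbiased estimate of the true point; rejections contribute the null point 0."""
    return z / accept_prob

# If z = 2.0 is purchased with probability 0.5, the fed point is 4.0 on
# acceptance and 0 on rejection: expectation 0.5 * 4.0 + 0.5 * 0 = 2.0.
print(debiased_point(2.0, 0.5))
```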

SLIDE 24

Analysis idea: use the no-regret setting!

  • Propose regret minimization with purchased data
  • Prove upper and lower bounds on regret
  • low regret ⇒ good prediction on new data (main result)
SLIDE 25

Summary

Problem: learn a good hypothesis by buying data from arriving agents.

For a variety of learning problems: E[loss(h)] ≤ E[loss(h*)] + O(√(γ / B))

SLIDE 26

Key ideas

  • 1. price data actively based on value
  • 2. machine-learning style bounds
  • 3. transform learning algs to mechanisms (learning alg → mechanism)

SLIDE 27

Future work

  • Improve bounds (no-regret: gap between lower and upper bounds)
  • Propose a “universal quantity” to replace γ in bounds (analogue of VC-dimension)
  • Variants of the model, better batch mechanisms
  • Explore black-box use of learning algs in mechanisms
SLIDE 28

Thanks!

SLIDE 29

Additional slides

SLIDE 30

What would you do before this work?

Naive 1: post a price of 1, obtain B points, run a learner on them.
Naive 2: post lower prices, obtain biased data, do what??
Roth-Schoenebeck (EC 2012): draw prices from a distribution, obtain biased data, de-bias it.

  • Batch setting (offer each data point the same price distribution)
  • Each agent has a number; the task is to estimate the mean
  • Derives a price distribution to minimize the variance of the estimate
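The Roth-Schoenebeck idea can be sketched for mean estimation with uniform random prices; the agent population and the Uniform(0,1) price distribution here are hypothetical choices for illustration.

```python
import random

def debiased_mean_estimate(agents, rng):
    """Batch setting: post an independent Uniform(0,1) price to each agent, who
    accepts iff price >= cost. Weighting each purchased value v by
    1 / Pr[accept] = 1 / (1 - cost) de-biases the estimate of the mean."""
    total = 0.0
    for value, cost in agents:
        price = rng.random()             # price ~ Uniform(0, 1)
        if price >= cost:                # agent accepts the offer
            total += value / (1.0 - cost)
    return total / len(agents)

# Hypothetical population: everyone holds value 1.0 at cost 0.5.
rng = random.Random(0)
print(debiased_mean_estimate([(1.0, 0.5)] * 10000, rng))  # close to the true mean 1.0
```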

SLIDE 31

Related work (by type of goal and model)

Goals: ML-style risk bounds; minimize variance; or a related goal.
Models: agents can fabricate data (as in peer-prediction); principal-agent style, data depends on effort; agents cannot fabricate data but have costs (this work).

Papers: this work; Meir, Procaccia, Rosenschein 2012; Cummings, Ligett, Roth, Wu, Ziani 2015; Dekel, Fischer, Procaccia 2008; Ghosh, Ligett, Roth, Schoenebeck 2014; Horel, Ioannidis, Muthukrishnan 2014; Roth, Schoenebeck 2012; Ligett, Roth 2012; Cai, Daskalakis, Papadimitriou 2015.

SLIDE 32

Simulation results

MNIST dataset: handwritten digit classification. Toy problem: classify (1 or 4) vs (9 or 8). Brighter green = higher cost.

SLIDE 33

Simulation results

  • T = 8503; train on half, test on half
  • Alg: Online Gradient Descent
  • Naive: pay 1 until the budget is exhausted, then run the alg
  • Baseline: run the alg on all data points (no budget)
  • Large γ: bad correlations; small γ: independent cost/data

SLIDE 34

“value” and pricing distribution?

  • Value of data = size of the gradient of the loss, ǁ∇loss(ht, zt)ǁ (“how much you learn from the loss”)
  • Pricing distribution: Pr[price ≥ x] = min{1, ǁ∇loss(ht, zt)ǁ / (K·x)}
  • K = normalization constant proportional to γ = (1/T) ∑t ǁ∇loss(ht, zt)ǁ ct
    (assume approximate knowledge of K … in practice, can estimate it online)
  • Distribution is derived by optimizing the regret bound of the mechanism for an “at-cost” variant of the no-regret setting
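Sampling from a survival function of the form Pr[price ≥ x] = min{1, g/(K·x)} can be done by inverting it; the cap at a maximum price of 1 is an assumption for illustration, as are the numeric parameters.

```python
import random

def sample_price(grad_norm, K, rng, cap=1.0):
    """Inverse-transform sample from Pr[price >= x] = min(1, grad_norm / (K*x)):
    solve grad_norm / (K*x) = u for x with u ~ Uniform(0,1). Data with a larger
    loss gradient (more to be learned from it) draws higher prices."""
    u = rng.random()
    if u <= 0.0:
        return cap
    return min(grad_norm / (K * u), cap)

rng = random.Random(1)
samples = [sample_price(0.1, 1.0, rng) for _ in range(10000)]
print(sum(s >= 0.5 for s in samples) / len(samples))  # roughly Pr[price >= 0.5] = 0.2
```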

SLIDE 35

Pricing distribution