SLIDE 1

Learnability and models of decision making under uncertainty

Pathikrit Basu · Federico Echenique

Caltech

Virginia Tech DT Workshop – April 6, 2018


SLIDE 3

“To think is to forget a difference, to generalize, to abstract. In the overly replete world of Funes, there were nothing but details.”

Jorge Luis Borges, “Funes el memorioso”

SLIDE 4

Motivation

Complex models vs. Occam’s razor:

◮ Use a model of economic behavior to infer welfare.
◮ Make choices for the agent.
◮ Complex models lead to overfitting.

“Uniform learnability” ⇔ no overfitting ⇔ simplicity (these are applications of old ideas in ML).

SLIDE 5

Setup

◮ Ω a finite state space.
◮ x ∈ X = R^Ω are acts.
◮ ≿ ⊆ X × X = Z is a preference.
◮ P is a class of preferences.

SLIDE 6

Learning (informal)

Model: P.
Data: choices generated by some ≿ ∈ P. The choices are among pairs (x, y) ∈ Z drawn from some unknown µ ∈ ∆(Z).

(Uniform) learning: get arbitrarily close to ≿, with high probability, after a finite sample.
(Uniform) poly-time learnable: get arbitrarily close to ≿, with high probability, with a sample size that does not explode with |Ω|.

SLIDE 7

Our results

                           Learnable   Sample complexity (in |Ω|)
Expected utility              ✓          Linear
Maxmin (2 states)             ✓          NA
Maxmin (> 2 states)           ✗          +∞
Choquet expected utility      ✓          Exponential

Table: Summary

SLIDE 8

Digression

What is a normal Martian?

SLIDES 9–21

Digression

[Figure sequence: scatter plots with axes labeled height and weight, building up the “normal Martian” classification example.]

SLIDE 22

VC dimension

Let P be a collection of sets. A finite set A is always rationalized (“shattered”) by P if, no matter how A is labeled, P can rationalize it.

The Vapnik–Chervonenkis (VC) dimension of a collection of subsets is the largest cardinality of a set that can always be rationalized.

VC(rectangles) = 4. VC(all finite sets) = ∞.
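The shattering definition is directly checkable by brute force. A minimal sketch (illustrative, not from the paper; the helper names are ours): for axis-aligned rectangles, the only candidate rectangle for a labeling is the bounding box of the points labeled 1, so a four-point “diamond” is shattered while adding its center breaks shattering.

```python
# Sketch (illustrative): brute-force shattering test for
# axis-aligned rectangles, hinting at VC(rectangles) = 4.
from itertools import product

def rectangle_rationalizes(points, labels):
    """Is there an axis-aligned rectangle containing exactly the
    points labeled 1?  The bounding box of the 1-labeled points is
    contained in any such rectangle, so it is the only candidate."""
    ones = [p for p, a in zip(points, labels) if a == 1]
    if not ones:
        return True  # an empty rectangle rationalizes the all-0 labeling
    x0, x1 = min(p[0] for p in ones), max(p[0] for p in ones)
    y0, y1 = min(p[1] for p in ones), max(p[1] for p in ones)
    def inside(p):
        return x0 <= p[0] <= x1 and y0 <= p[1] <= y1
    return all(inside(p) == (a == 1) for p, a in zip(points, labels))

def shattered(points):
    """A set is shattered iff every labeling is rationalized."""
    return all(rectangle_rationalizes(points, labels)
               for labels in product([0, 1], repeat=len(points)))

diamond = [(0, 1), (0, -1), (1, 0), (-1, 0)]
print(shattered(diamond))             # True:  VC(rectangles) >= 4
print(shattered(diamond + [(0, 0)]))  # False: the center cannot be excluded
```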


SLIDES 23–24

VC dimension

ΠP(k) = the largest number of labelings that can be rationalized for a dataset of cardinality k. A measure of how “rich” or “complex” P is; of how prone it is to overfitting.

Observe: if k ≤ VC(P), then ΠP(k) = 2^k.

Thm (Sauer’s lemma): If VC(P) = d, then ΠP(k) ≤ (ke/d)^d for k > d.
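A quick numeric sanity check (not from the slides): threshold sets {x : x ≤ t} on the line have VC dimension d = 1, and k distinct points admit exactly the k + 1 “prefix” labelings, which indeed sits below Sauer’s bound (ke/d)^d = ke.

```python
# Sketch: compare the exact growth function of threshold sets on
# the line (VC dimension d = 1) against Sauer's bound (k e / d)^d.
import math

d = 1
for k in range(2, 8):
    exact = k + 1                      # labelings of k points by {x <= t}
    sauer = (k * math.e / d) ** d      # Sauer's bound, valid for k > d
    print(k, exact, round(sauer, 2), exact <= sauer)
```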

SLIDE 25

Data

A dataset consists of a finite set of labeled pairs (xi, yi) ∈ Z:

(x1, y1)   a1
(x2, y2)   a2
. . .      . . .
(xn, yn)   an

with labels ai ∈ {0, 1}, where ai = 1 iff xi is chosen over yi.

SLIDE 26

Data

A dataset is a finite sequence D ∈ ⋃_{n≥1} (Z × {0, 1})^n. The set of all datasets is denoted by D.

SLIDE 27

Learning

A learning rule is a map σ : D → P.
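For the expected-utility class one concrete learning rule is easy to write down. The sketch below is an illustration, not the paper’s construction: it fits a linear utility u ∈ R^Ω by a perceptron-style pass over the labeled pairs, so that u·(x − y) > 0 whenever x was chosen over y; the fitted u defines the output preference σ(D). The function name and the Gaussian test data are our assumptions.

```python
# Sketch of one learning rule for P_EU (illustrative): a perceptron-style
# fit of a linear utility u with u.(x - y) > 0 when x is chosen over y
# (label 1) and u.(x - y) < 0 when y is chosen (label 0).
import numpy as np

def fit_eu_preference(dataset, n_states, epochs=200):
    """dataset: list of ((x, y), a) with x, y in R^n_states, a in {0, 1}."""
    u = np.zeros(n_states)
    for _ in range(epochs):
        updated = False
        for (x, y), a in dataset:
            d = np.asarray(x) - np.asarray(y)
            sign = 1.0 if a == 1 else -1.0
            if sign * (u @ d) <= 0:   # pair is misclassified
                u += sign * d          # perceptron update
                updated = True
        if not updated:                # u is consistent with all the data
            break
    return u  # represents the preference: x >= y  iff  u.(x - y) >= 0

# Example: three states, pairs labeled by the true utility (3, 1, 1).
true_u = np.array([3.0, 1.0, 1.0])
rng = np.random.default_rng(0)
pairs = list(zip(rng.normal(size=(50, 3)), rng.normal(size=(50, 3))))
data = [((x, y), int(true_u @ (x - y) >= 0)) for x, y in pairs]
u_hat = fit_eu_preference(data, n_states=3)
print(u_hat / np.linalg.norm(u_hat))  # roughly proportional to true_u
```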

SLIDE 28

Data generating process

Given ≿ ∈ P:

◮ µ ∈ ∆(Z) (full support)
◮ (x, y) drawn iid ∼ µ
◮ (x, y) labeled according to ≿.

SLIDE 29

Learning

Distance between ≿, ≿′ ∈ P:

dµ(≿, ≿′) = µ(≿ △ ≿′), where

≿ △ ≿′ = {(x, y) ∈ Z : x ≿ y and not x ≿′ y} ∪ {(x, y) ∈ Z : not x ≿ y and x ≿′ y}.
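For two EU preferences this distance is easy to estimate by Monte Carlo: draw pairs from µ and count how often the two preferences disagree. A sketch, with the assumption (ours, purely for illustration) that µ draws both acts from a standard Gaussian:

```python
# Sketch: Monte Carlo estimate of d_mu between two EU preferences,
# represented by utility vectors u and v (x >= y iff u.(x - y) >= 0).
import numpy as np

def d_mu_estimate(u, v, n_samples=100_000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, len(u)))
    y = rng.normal(size=(n_samples, len(u)))
    d = x - y
    disagree = (d @ u >= 0) != (d @ v >= 0)  # pairs in the symmetric difference
    return disagree.mean()                   # estimate of mu(>= triangle >=')

u = np.array([3.0, 1.0, 1.0])
v = np.array([2.0, 1.5, 1.0])
print(d_mu_estimate(u, v))  # fraction of pairs ranked differently by u and v
```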

SLIDE 30

Learning

P′ ⊆ P is learnable if ∃ a learning rule σ s.t. ∀ε, δ > 0 ∃ s(ε, δ) ∈ N s.t. ∀n ≥ s(ε, δ),

(∀≿ ∈ P′)(∀µ ∈ ∆f(Z)) µ^n(dµ(σn, ≿) > ε) < δ,

where σn denotes the preference output by σ on an n-sample.

SLIDE 31

Decisions under uncertainty

◮ Ω a finite state space.
◮ x ∈ X = R^Ω are acts.
◮ ≿ ⊆ X × X = Z is a preference.
◮ P is a class of preferences.

SLIDE 32

Decisions under uncertainty

x, y ∈ X are comonotonic if there are no ω, ω′ s.t. x(ω) > x(ω′) but y(ω) < y(ω′).
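The definition is directly checkable in code: two acts are comonotonic iff no pair of states is ranked strictly oppositely by the two. A small sketch (illustrative):

```python
# Sketch: check whether two acts x, y in R^Omega are comonotonic,
# i.e. no pair of states is ranked strictly oppositely by x and y.
from itertools import combinations

def comonotonic(x, y):
    return not any((x[w] - x[v]) * (y[w] - y[v]) < 0
                   for w, v in combinations(range(len(x)), 2))

print(comonotonic([3, 2, 1], [5, 5, 0]))  # True: no opposite strict ranking
print(comonotonic([3, 2, 1], [0, 1, 2]))  # False: states 0 and 2 reversed
```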

SLIDE 33

Axioms

◮ (Weak order) ≿ is complete and transitive.
◮ (Independence) ∀x, y, z ∈ X and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)z ≿ λy + (1 − λ)z.
◮ (Continuity) ∀x ∈ X,
  Ux = {y ∈ X | y ≿ x} and Lx = {y ∈ X | x ≿ y} are closed.
◮ (Convex) ∀x ∈ X, the upper contour set
  Ux = {y ∈ X | y ≿ x} is a convex set.

SLIDE 34

Axioms

◮ (Comonotonic Independence) ∀x, y, z ∈ X that are pairwise comonotonic and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)z ≿ λy + (1 − λ)z.
◮ (C-Independence) ∀x, y ∈ X, constant act c ∈ X, and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)c ≿ λy + (1 − λ)c.

SLIDE 35

Decisions under uncertainty

◮ PEU: the set of preferences satisfying weak order and independence.
◮ PMEU: the set of preferences satisfying weak order, monotonicity, c-independence, continuity, convexity and homotheticity.
◮ PCEU: the set of preferences satisfying comonotonic independence, continuity and monotonicity.

SLIDE 36

Decisions under uncertainty

Theorem

◮ VC(PEU) = |Ω| + 1.
◮ If |Ω| ≥ 3, then VC(PMEU) = +∞ and PMEU is not learnable.
◮ If |Ω| = 2, then VC(PMEU) ≤ 8 and PMEU is learnable.
◮ C(|Ω|, |Ω|/2) ≤ VC(PCEU) ≤ (|Ω|!)²(2|Ω| + 1) + 1.

SLIDE 37

Decisions under uncertainty

Corollary

◮ PEU, PCEU and, when |Ω| = 2, PMEU are learnable.
◮ PEU requires a minimum sample size that grows linearly with |Ω|.
◮ PCEU requires a minimum sample size that grows exponentially with |Ω|.
◮ PMEU is not learnable when |Ω| ≥ 3.

SLIDE 38

Ideas in the proof

For EU (Radon’s theorem): If A ⊆ R^n and |A| ≥ n + 2, then A can be partitioned as A = A1 ∪ A2 with A1 ∩ A2 = ∅ and cvh(A1) ∩ cvh(A2) ≠ ∅. Since an EU preference labels a pair by the sign of a linear functional, no functional can be ≥ 0 on all of A1 and < 0 on all of A2 when their convex hulls meet; so no set of |Ω| + 2 pairs can be shattered.
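Radon’s theorem is constructive: any affine dependence among the n + 2 points yields the partition, by splitting the coefficients by sign. A sketch (illustrative; the function name is ours):

```python
# Sketch: compute a Radon partition of n + 2 points in R^n.
# Any nonzero solution of  sum_i c_i a_i = 0,  sum_i c_i = 0
# splits the points into A1 = {c_i > 0} and A2 = {c_i <= 0},
# whose convex hulls share the point  sum_{c_i > 0} (c_i / s) a_i.
import numpy as np

def radon_partition(points):
    pts = np.asarray(points, dtype=float)      # shape (n + 2, n)
    m = np.vstack([pts.T, np.ones(len(pts))])  # affine-dependence system
    _, _, vt = np.linalg.svd(m)
    c = vt[-1]                                 # nonzero null-space vector
    pos = c > 0
    s = c[pos].sum()
    witness = (c[pos] / s) @ pts[pos]          # point in both convex hulls
    return pts[pos], pts[~pos], witness

A1, A2, w = radon_partition([[0, 0], [2, 0], [0, 2], [1, 1]])
print(A1, A2, w)  # w lies in cvh(A1) and in cvh(A2)
```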

SLIDE 39

Ideas in the proof

For maxmin, |Ω| ≥ 3. The model can be characterized by a single upper contour set {x : x ≿ 0}, which is a closed convex cone. Consider the circle C of points in the hyperplane {x ∈ R^Ω : Σi xi = 1} at distance 1 from (1/2, . . . , 1/2). For any n, choose n points x1, . . . , xn on C and label any subset: the closed conic hull of the labeled points will exclude all the unlabeled points. Hence arbitrarily large sets are shattered, and VC(PMEU) = +∞.
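A numerical illustration of this shattering argument for |Ω| = 3 (our sketch, assuming scipy is available; conic-hull membership is a feasibility LP): every point on the circle is extreme, so the conic hull of any labeled subset excludes the rest.

```python
# Sketch: points on the circle C in {x in R^3 : sum x = 1} at distance 1
# from (1/2, 1/2, 1/2) are extreme, so no point lies in the closed conic
# hull of the others.
import numpy as np
from scipy.optimize import linprog

def in_conic_hull(p, gens):
    """Feasibility LP: is p = sum_i lam_i g_i for some lam >= 0?"""
    res = linprog(c=np.zeros(len(gens)), A_eq=np.array(gens).T, b_eq=p,
                  bounds=[(0, None)] * len(gens))
    return res.success

center = np.full(3, 1 / 3)              # projection of (1/2, 1/2, 1/2)
r = np.sqrt(1 - 1 / 12)                 # radius of the circle in the plane
e1 = np.array([1, -1, 0]) / np.sqrt(2)  # orthonormal basis of the plane
e2 = np.array([1, 1, -2]) / np.sqrt(6)
thetas = np.linspace(0, 2 * np.pi, 8, endpoint=False)
pts = [center + r * (np.cos(t) * e1 + np.sin(t) * e2) for t in thetas]

# Label an arbitrary subset; the conic hull of the labeled points
# excludes every unlabeled point.
labeled, unlabeled = pts[:4], pts[4:]
print([in_conic_hull(p, labeled) for p in unlabeled])  # all False
```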

SLIDE 40

Ideas in the proof

For CEU: in a large enough sample, a large enough number of acts must be pairwise comonotonic. Apply ideas similar to those used for EU to the comonotonic acts (via comonotonic independence). This shows that the VC dimension is finite (and an exact upper bound can be calculated).

SLIDE 41

Ideas in the proof

For the exponential lower bound: choose exponentially many events in Ω that are unordered by set inclusion, and consider a dataset of bets on these events. Because the events are unordered, one can construct a CEU preference that rationalizes any labeling of the data.
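One such family (our reading, matching the C(|Ω|, |Ω|/2) lower bound in the theorem) is all events of size |Ω|/2: no one contains another, and by Sperner’s theorem it is the largest antichain. A quick count of its exponential growth:

```python
# Sketch: the family of all events of size |Omega|/2 is an antichain
# (no event contains another); its size C(|Omega|, |Omega|/2) grows
# exponentially in |Omega|.
from math import comb

for n in range(2, 21, 2):
    print(n, comb(n, n // 2))  # e.g. 20 -> 184756
```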