Learnability and models of decision making under uncertainty

Pathikrit Basu, Federico Echenique (Caltech)
Virginia Tech DT Workshop, April 6, 2018

"To think is to forget a difference, to generalize, to abstract. In the overly replete world of Funes, there were nothing but details."
— Jorge Luis Borges, "Funes el memorioso"
Motivation
Complex models vs. Occam’s razor:
◮ Use a model of economic behavior to infer welfare.
◮ Make choices for the agent.
◮ Complex models lead to overfitting.

"Uniform learnability" ⇔ no overfitting ⇔ simplicity (these are applications of old ideas in ML).
Setup
◮ Ω is a finite state space.
◮ x ∈ X = R^Ω are acts.
◮ ≿ ⊆ X × X = Z is a preference.
◮ P is a class of preferences.
Learning (informal)
Model: P.
Data: choices generated by some ≿ ∈ P. The choices are among pairs (x, y) ∈ Z drawn from some unknown µ ∈ ∆(Z).
(Uniform) learning: get arbitrarily close to ≿, with high probability, after a finite sample.
(Uniform) poly-time learnable: get arbitrarily close to ≿, with high probability, with a sample size that does not explode with |Ω|.
Our results
Model                      Learnable   Sample complexity (in |Ω|)
Expected utility           Yes         Linear
Maxmin (2 states)          Yes         NA
Maxmin (states > 2)        No          +∞
Choquet expected utility   Yes         Exponential

Table: Summary
Digression

What is a normal Martian?

[Series of figures: scatter plots of Martians by height and weight, omitted.]
VC dimension
Let P be a collection of sets. A finite set A is always rationalized ("shattered") by P if, no matter how A is labeled, P can rationalize it.
The Vapnik-Chervonenkis (VC) dimension of a collection of subsets is the largest cardinality of a set that can always be rationalized.
VC(axis-parallel rectangles in the plane) = 4. VC(all finite sets) = ∞.
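To make the definition concrete, here is a small illustrative sketch (ours, not from the talk): it checks whether a set of points in the plane is shattered by axis-parallel rectangles, by testing for each labeling whether the bounding box of the positively labeled points excludes every negative point.

```python
import itertools

def rect_rationalizes(points, labels):
    """Check whether some axis-parallel rectangle contains exactly the points labeled 1."""
    pos = [p for p, a in zip(points, labels) if a == 1]
    neg = [p for p, a in zip(points, labels) if a == 0]
    if not pos:
        return True  # the empty rectangle works
    # The minimal candidate is the bounding box of the positive points.
    lo = (min(x for x, _ in pos), min(y for _, y in pos))
    hi = (max(x for x, _ in pos), max(y for _, y in pos))
    return not any(lo[0] <= x <= hi[0] and lo[1] <= y <= hi[1] for x, y in neg)

def shattered(points):
    """A set is shattered if every one of the 2^|A| labelings can be rationalized."""
    return all(rect_rationalizes(points, labels)
               for labels in itertools.product([0, 1], repeat=len(points)))

# A "diamond" of 4 points is shattered; no 5-point set is, hence VC(rectangles) = 4.
print(shattered([(0, 1), (1, 0), (2, 1), (1, 2)]))  # True
```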
VC dimension
Π_P(k) = the largest number of labelings that can be rationalized for a dataset of cardinality k. A measure of how "rich" or "complex" P is; how prone to overfitting.

Observe: if k ≤ VC(P), then Π_P(k) = 2^k.
Thm (Sauer's lemma): If VC(P) = d, then Π_P(k) ≤ (ek/d)^d for k > d.
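A quick arithmetic illustration (ours): past the VC dimension d, Sauer's bound (ek/d)^d grows only polynomially in k, while the number of conceivable labelings 2^k grows exponentially.

```python
import math

d = 5  # a hypothetical VC dimension
for k in [10, 20, 40, 80]:
    sauer = (math.e * k / d) ** d  # Sauer's bound on Π_P(k), valid for k > d
    print(f"k={k:3d}  2^k={2**k:>25}  (ek/d)^d={sauer:>14.0f}")
```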
Data
A dataset consists of a finite set of labeled pairs (xi, yi) ∈ Z:
(x1, y1) → a1
(x2, y2) → a2
. . .
(xn, yn) → an,
with labels ai ∈ {0, 1}, where ai = 1 iff xi is chosen over yi.
Data
A dataset is a finite sequence D ∈ ⋃_{n≥1} (Z × {0, 1})^n. The set of all datasets is denoted by D.
Learning
A learning rule is a map σ : D → P.
Data generating process
Given ≿ ∈ P:
◮ µ ∈ ∆(Z) (full support)
◮ (x, y) drawn i.i.d. ∼ µ
◮ (x, y) labeled according to ≿.
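As a concrete illustration of this process (a sketch with assumed names; the normal µ and the vector p are our choices, not the paper's), consider an expected-utility preference with x ≿ y iff p·x ≥ p·y:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 3                           # |Ω|
p = np.array([0.5, 0.3, 0.2])          # hypothetical EU preference: x ≿ y iff p·x ≥ p·y

def draw_labeled_pair():
    """Draw (x, y) iid from µ (here, acts with standard normal payoffs) and label by ≿."""
    x, y = rng.normal(size=n_states), rng.normal(size=n_states)
    return (x, y), int(p @ x >= p @ y)  # label 1 iff x is chosen over y

dataset = [draw_labeled_pair() for _ in range(10)]
```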
Learning
Distance between ≿, ≿′ ∈ P:
dµ(≿, ≿′) = µ(≿ △ ≿′), where
≿ △ ≿′ = {(x, y) ∈ Z : x ≿ y and not x ≿′ y} ∪ {(x, y) ∈ Z : not x ≿ y and x ≿′ y}.
That is, dµ is the µ-probability of drawing a pair on which the two preferences disagree.
Learning
P′ ⊆ P is learnable if ∃ a learning rule σ s.t. ∀ε, δ > 0 ∃ s(ε, δ) ∈ N s.t. ∀n ≥ s(ε, δ),
(∀≿ ∈ P′)(∀µ ∈ ∆f(Z))  µ^n(dµ(σn, ≿) > ε) < δ,
where σn denotes the preference output by σ on an n-sample labeled by ≿.
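A minimal Monte Carlo sketch of the definition (illustrative only; the ERM learning rule and all names here are our choices, not the paper's): learn an EU preference from labeled pairs and estimate dµ on held-out draws.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, n_train, n_test = 3, 200, 5000
p_true = rng.dirichlet(np.ones(n_states))          # the unknown EU preference ≿

def label(p, x, y):
    return int(p @ x >= p @ y)

# Training sample: pairs drawn iid from µ (standard normal acts), labeled by ≿.
train = [(rng.normal(size=n_states), rng.normal(size=n_states)) for _ in range(n_train)]
labels = [label(p_true, x, y) for x, y in train]

# Learning rule σ: empirical risk minimization over random candidate EU preferences.
candidates = rng.dirichlet(np.ones(n_states), size=500)
errors = [sum(label(q, x, y) != a for (x, y), a in zip(train, labels)) for q in candidates]
p_hat = candidates[np.argmin(errors)]

# Estimate dµ(σn, ≿): the probability of disagreement on fresh pairs.
test = [(rng.normal(size=n_states), rng.normal(size=n_states)) for _ in range(n_test)]
d_mu = np.mean([label(p_true, x, y) != label(p_hat, x, y) for x, y in test])
print(f"estimated d_mu = {d_mu:.3f}")
```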
Decisions under uncertainty
◮ Ω is a finite state space.
◮ x ∈ X = R^Ω are acts.
◮ ≿ ⊆ X × X = Z is a preference.
◮ P is a class of preferences.
Decisions under uncertainty
x, y ∈ X are comonotonic if there are no ω, ω′ s.t. x(ω) > x(ω′) but y(ω) < y(ω′).
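The definition transcribes directly into code (our illustrative sketch):

```python
def comonotonic(x, y):
    """True iff no pair of states ranks the two acts in opposite orders."""
    n = len(x)
    return not any(x[i] > x[j] and y[i] < y[j] for i in range(n) for j in range(n))

# These two acts move together across states...
print(comonotonic([1, 2, 3], [5, 5, 9]))   # True
# ...while these rank the states in opposite orders.
print(comonotonic([1, 2, 3], [3, 2, 1]))   # False
```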
Axioms
◮ (Weak order) ≿ is complete and transitive.
◮ (Independence) ∀x, y, z ∈ X and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)z ≿ λy + (1 − λ)z.
◮ (Continuity) ∀x ∈ X,
  Ux = {y ∈ X | y ≿ x} and Lx = {y ∈ X | x ≿ y} are closed.
◮ (Convexity) ∀x ∈ X, the upper contour set
  Ux = {y ∈ X | y ≿ x} is a convex set.
Axioms
◮ (Comonotonic Independence) ∀x, y, z ∈ X that are pairwise comonotonic and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)z ≿ λy + (1 − λ)z.
◮ (C-Independence) ∀x, y ∈ X, constant act c ∈ X, and λ ∈ (0, 1),
  x ≿ y iff λx + (1 − λ)c ≿ λy + (1 − λ)c.
Decisions under uncertainty
◮ PEU: the set of preferences satisfying weak order and independence.
◮ PMEU: the set of preferences satisfying weak order, monotonicity, c-independence, continuity, convexity, and homotheticity.
◮ PCEU: the set of preferences satisfying comonotonic independence, continuity, and monotonicity.
Decisions under uncertainty
Theorem
◮ VC(PEU) = |Ω| + 1.
◮ If |Ω| ≥ 3, then VC(PMEU) = +∞ and PMEU is not learnable.
◮ If |Ω| = 2, then VC(PMEU) ≤ 8 and PMEU is learnable.
◮ (|Ω| choose |Ω|/2) ≤ VC(PCEU) ≤ (|Ω|!)^2 (2|Ω| + 1) + 1.
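The lower bound for PCEU is the central binomial coefficient, which already grows exponentially in |Ω|, as a quick computation (ours) shows:

```python
import math

for n in [2, 4, 8, 16, 32]:
    print(n, math.comb(n, n // 2))  # C(n, n/2) grows like 2^n / sqrt(n)
```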
Decisions under uncertainty
Corollary
◮ PEU, PCEU and, when |Ω| = 2, PMEU are learnable.
◮ PEU requires a minimum sample size that grows linearly with |Ω|.
◮ PCEU requires a minimum sample size that grows exponentially with |Ω|.
◮ PMEU is not learnable when |Ω| ≥ 3.
Ideas in the proof
For EU: If A ⊆ R^n and |A| ≥ n + 2, then A = A1 ∪ A2 with A1 ∩ A2 = ∅ and cvh(A1) ∩ cvh(A2) ≠ ∅ (Radon's theorem). Since an EU preference labels points by a linear functional, it cannot assign opposite labels to two sets whose convex hulls intersect, so no set of n + 2 points can be shattered.
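Radon's theorem can be checked numerically. In the sketch below (ours, for illustration), a nonzero λ with Σλᵢ = 0 and Σλᵢxᵢ = 0 is read off the null space of a small linear system; its positive and negative parts give the two groups, and the common point of their convex hulls is computed directly.

```python
import numpy as np

def radon_partition(points):
    """Partition n+2 points in R^n into two groups whose convex hulls intersect."""
    pts = np.asarray(points, dtype=float)    # shape (n+2, n)
    m, n = pts.shape
    # Find a nonzero lam with sum(lam) = 0 and sum(lam_i * x_i) = 0:
    # n+1 equations in m = n+2 unknowns, so a null vector always exists.
    A = np.vstack([np.ones(m), pts.T])
    lam = np.linalg.svd(A)[2][-1]            # right singular vector spanning the null space
    pos, neg = lam > 1e-12, lam < -1e-12
    meet = pts[pos].T @ lam[pos] / lam[pos].sum()  # common point of the two hulls
    return pts[pos], pts[neg], meet

A1, A2, meet = radon_partition([[0, 0], [1, 0], [0, 1], [1, 1]])
print(A1, A2, meet)  # the diagonals of the unit square meet at (0.5, 0.5)
```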
Ideas in the proof
For maxmin, |Ω| ≥ 3. The model can be characterized by a single upper contour set {x : x ≿ 0}, which is a closed convex cone. Consider a circle C in the hyperplane {x ∈ R^Ω : Σᵢ xᵢ = 1} at distance 1 from (1/2, . . . , 1/2). For any n, choose n points x1, . . . , xn on C and label any subset: the closed conic hull of the labeled points will exclude all the non-labeled points.
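The conic-hull claim can be tested numerically. The sketch below (ours; as a stand-in for the slide's construction, the circle is placed in the plane Σᵢxᵢ = 1 around that plane's closest point to the origin) checks cone membership by a linear feasibility problem with scipy:

```python
import numpy as np
from scipy.optimize import linprog

def in_conic_hull(generators, q):
    """Is q a nonnegative linear combination of the generator vectors?"""
    G = np.asarray(generators, dtype=float).T  # columns are generators
    res = linprog(c=np.zeros(G.shape[1]), A_eq=G, b_eq=np.asarray(q, dtype=float),
                  bounds=[(0, None)] * G.shape[1])
    return res.success                         # feasible <=> q lies in the cone

# n points on a circle in the plane {x in R^3 : sum(x) = 1}, centered at (1/3, 1/3, 1/3).
n = 8
u = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)    # orthonormal basis of the
v = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)    # plane's direction space
ts = 2 * np.pi * np.arange(n) / n
pts = [np.ones(3) / 3 + np.cos(t) * u + np.sin(t) * v for t in ts]

# Label an arbitrary subset: its conic hull excludes every unlabeled circle point.
labeled = [pts[i] for i in (0, 2, 5)]
print([in_conic_hull(labeled, q) for q in (pts[1], pts[3], pts[0])])  # [False, False, True]
```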
Ideas in the proof
For CEU: in a large enough sample, a large enough number of acts must be pairwise comonotonic. Apply ideas similar to those used for EU to the comonotonic acts (via comonotonic independence). This shows that the VC dimension is finite (and an exact upper bound can be calculated).
Ideas in the proof
For the exponential lower bound: choose exponentially many events in Ω that are pairwise unordered by set inclusion, and consider a dataset of bets on each event. Since the events are unordered, one can construct a CEU preference that explains any labeling of the data.