Strategy-proof estimators for simple regression By Javier Perote (University of Salamanca) and Juan Perote-Peña (University of Zaragoza)

MOTIVATION
• First, this continues a research project that introduces private information and strategic considerations into well-known “aggregation” and “decision” techniques such as:
– Operations Research (PERT, queuing theory, linear programming, …)
– Multicriteria decision making
– Clustering techniques
– Econometrics
• Are these techniques “robust” to individual manipulation of private information?

MOTIVATION
• Secondly, strategic data manipulation evokes the literature on “robustness” to random contamination and on outlier detection: most of the estimators proposed in that literature use the properties of the median to aggregate data
• Interestingly, the median as a device for aggregating information is strategy-proof in some contexts, e.g., when individuals have “single-peaked” preferences on a single dimension in public goods allocation problems
• Can the incentives literature (from social choice theory) answer questions in econometrics?

STRUCTURE OF THE PAPER
• First, we argue that the informational problem can be very important in some econometric studies. Designing estimators that are robust to data manipulation can therefore be useful
• Secondly, we examine the most popular estimator, OLS, and show that it may lead to sample contamination (it is NOT robust)
• Then, we propose a whole family of estimators for the simple regression case that can be proved to be immune to this kind of data contamination
• Finally, we compare some of them with OLS in a Monte Carlo experiment

WHAT KIND OF PROBLEM?
• Some econometric problems use information reported or declared by agents or individuals, for example in questionnaires, that cannot be easily and costlessly observed or verified (i.e., it is the agent’s private information)
• The information extracted from the data is (or can be) used to allocate “something” or to assess policies that might be important to the agents
• Therefore, the agents might be tempted to report false information if they think that the data-handling process can be profitably manipulated

AN EXAMPLE
• A big firm or a government department has a number of divisions (perhaps located in different regions)
• Measures of the output “produced” by the divisions, for instance the number of clients served in a month, cannot be verified without substantial costs (inventory costs, monitoring costs, etc.)
• Therefore, the information about each division’s output is privately owned by the division manager and is reported by him to the firm’s manager

THE MODEL WITH THE EXAMPLE
• Some of the inputs affecting each division’s output are known to the planner (the firm’s boss), maybe because the planner himself “allocated” them in the past (e.g., the number of workers in each division, the estimated demand in each region, the division’s monthly budget, etc.)
• $N = \{1, 2, \ldots, n\}$: set of divisions (= agents)
• $i, j \in N$: each agent is also an “observation”
• $\forall i \in N$, $y_i$: division i’s measure of (true) output
• $\forall i \in N$, $\tilde{y}_i$: division i’s reported output

THE MODEL WITH THE EXAMPLE
• $\forall i \in N$, $x_i$: publicly known explanatory variable
• True data generating process: $y_i = \beta_0 + \beta_1 x_i + e_i$, where $i = 1, \ldots, n$ and $e_i \sim N(0, \sigma)$ is an i.i.d. random variable (error term or random shock)
• Let the true sample be $(X, Y)$, with
$X = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_i \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$ and $Y = \begin{bmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{bmatrix}$
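The data generating process above can be simulated with a short sketch. The function name, the parameter values and the x values are illustrative assumptions, not taken from the slides, and $\sigma$ is assumed here to denote the standard deviation of the error term.

```python
import random


def simulate_true_sample(x, beta0, beta1, sigma, seed=0):
    """Draw y_i = beta0 + beta1 * x_i + e_i with e_i i.i.d. normal.

    Assumption: sigma is the standard deviation of e_i.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]


# Hypothetical inputs known to the planner (made-up values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# True outputs y_i, privately known to each division.
y = simulate_true_sample(x, beta0=0.5, beta1=2.0, sigma=1.0)

# With sigma = 0 the sample lies exactly on the line 0.5 + 2x.
y_noiseless = simulate_true_sample(x, beta0=0.5, beta1=2.0, sigma=0.0)
```

The planner observes x but not y; what reaches the planner is whatever each division chooses to report.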

THE MODEL WITH THE EXAMPLE
• A regression estimator is a function “T” of the sample $(X, Y)$: $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, Y)$
• The estimated or predicted values of the response variable for each observation are generated as: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i \ \forall i \in N$
• And the residuals $\hat{e}_i$, $\forall i \in N$, are the differences: $\hat{e}_i = y_i - \hat{y}_i$
• The most widely used estimator is the OLS one: $\hat{\beta}_{OLS} = \arg\min \sum_{i=1}^{n} \hat{e}_i^2, \ \forall (X, Y)$
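As a minimal sketch of the definitions above, the closed-form OLS solution for simple regression can be written in plain Python. The function names and the data are illustrative assumptions; the data are made up and lie exactly on y = 1 + 2x so the result is easy to check.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (b0, b1) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def predict(b0, b1, x):
    """Predicted values y_hat_i = b0 + b1 * x_i."""
    return [b0 + b1 * xi for xi in x]


def residuals(y, y_hat):
    """Residuals e_hat_i = y_i - y_hat_i."""
    return [yi - yh for yi, yh in zip(y, y_hat)]


# Made-up sample lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

b0, b1 = ols_simple(x, y)
e_hat = residuals(y, predict(b0, b1, x))
```

On this sample the fitted line recovers the intercept 1 and the slope 2, and every residual is zero up to floating-point rounding.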

THE MODEL WITH THE EXAMPLE
• When the true sample $(X, Y)$ is known to the planner, the OLS estimator is unbiased with minimum variance (good properties)
• But when the true sample is unknown, the only information the planner receives is $(X, \tilde{Y})$ instead of $(X, Y)$. Applying OLS to the reported sample $(X, \tilde{Y})$ only maintains these good properties when no agent lies (i.e., $\tilde{Y} = Y$)
• QUESTION: In which cases will the agents lie?

THE MODEL WITH THE EXAMPLE
• We must assume some “preferences” guiding the agents’ declaring behaviour. We opt for the…
• SINGLE-PEAKEDNESS ASSUMPTION:
• Agent $i \in N$ with true response value $y_i$ has single-peaked preferences $R_i$ on the real line $E$ (with strict part $P_i$) if:
• (i) $\forall v \in E$, $v \neq y_i$: $y_i \, P_i \, v$
• (ii) $\forall v, v' > 0$, $v > v' \Rightarrow (y_i + v') \, P_i \, (y_i + v)$ and $(y_i - v') \, P_i \, (y_i - v)$

EXAMPLE OF SINGLE-PEAKEDNESS
• Possible single-peaked preferences for $i \in N$
[Figure: preference “intensity” plotted over the real line $E$ representing predicted values $\hat{y}_i$, with the peak at $y_i$]

THE MODEL WITH THE EXAMPLE
• Let us use the partitioned notation: $Y = (y_i, Y_{-i})$
• Def: Regression estimator $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, \tilde{y}_i, \tilde{Y}_{-i})$ is manipulable at sample $(X, \tilde{Y}) \in Z$ by observation $i \in \{1, \ldots, n\}$ if there exist single-peaked preferences $R_i$ on $E$ with peak $y_i$ and a report $\tilde{y}_i \neq y_i$ such that
$[\hat{\beta}_0(X, \tilde{y}_i, \tilde{Y}_{-i}) + \hat{\beta}_1(X, \tilde{y}_i, \tilde{Y}_{-i}) x_i] \; P_i \; [\hat{\beta}_0(X, y_i, \tilde{Y}_{-i}) + \hat{\beta}_1(X, y_i, \tilde{Y}_{-i}) x_i]$
• Def: Regression estimator $\hat{\beta} = (\hat{\beta}_0, \hat{\beta}_1)' = T(X, \tilde{y}_i, \tilde{Y}_{-i})$ is strategy-proof if it is NOT manipulable at any sample $(X, \tilde{Y}) \in Z$ for any observation $i \in \{1, \ldots, n\}$
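The manipulability definition can be explored with a brute-force search over candidate lies. This is only a sketch under stated assumptions: preferences are taken to be the symmetric distance to the true value (one particular single-peaked preference), `mean_estimator` is a hypothetical toy estimator used for illustration, and all names and data are made up. Finding a profitable lie shows manipulability; finding none on a finite grid proves nothing.

```python
def mean_estimator(x, reports):
    """Hypothetical toy estimator: intercept = mean of reports, slope = 0."""
    return sum(reports) / len(reports), 0.0


def prediction(estimator, x, reports, i):
    """Predicted value for observation i under the given estimator."""
    b0, b1 = estimator(x, reports)
    return b0 + b1 * x[i]


def find_profitable_lie(estimator, x, others_reports, i, true_y, candidate_lies):
    """Return the first lie that moves agent i's prediction strictly
    closer to true_y than truth-telling does, or None if none is found.

    Assumption: agent i ranks predictions by distance to true_y.
    The entry others_reports[i] is ignored and overwritten.
    """
    truthful = list(others_reports)
    truthful[i] = true_y
    honest_gap = abs(prediction(estimator, x, truthful, i) - true_y)
    for lie in candidate_lies:
        lied = list(truthful)
        lied[i] = lie
        if abs(prediction(estimator, x, lied, i) - true_y) < honest_gap - 1e-12:
            return lie
    return None


# Made-up sample: agent at index 2 has true output 6.0.
x = [1.0, 2.0, 3.0]
reports = [1.0, 2.0, 6.0]
lie = find_profitable_lie(mean_estimator, x, reports, 2, 6.0, [8.0, 10.0, 12.0])
```

Here truth-telling gives a prediction of 3.0, while reporting 8.0 raises it to about 3.67, closer to the true 6.0, so the toy mean estimator is manipulable at this sample.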

SOME EXAMPLES
• The workers’ union’s wage setting problem
[Figure: axes $w_i$ and $L_i$; labels $\tilde{y}_i = w_i L_i + FB_i$ and $p_i q_i - r K_i$]

SOME EXAMPLES
• The efficiency frontier estimation problem
• DGP: $\log r_i = \beta_0 + \beta_1 \log \sigma_i + e_i$
[Figure: $\log r_i$ plotted against $\log \sigma_i$, with a fitted line of intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$]

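SOME EXAMPLES
• The tax pay-as-you-go rates allocation problem
• $t_i$: PAYG average tax rate; $I_i$: income
• PAYG tax schedule: $\hat{t}_i = \hat{\beta}_0 + \hat{\beta}_1 I_i$
[Figure: $t_i$ plotted against $I_i$ (ticks at 20%, 30% and 10,000 $), with a fitted line of intercept $\hat{\beta}_0$ and slope $\hat{\beta}_1$]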

OLS IS NOT STRATEGY-PROOF
• Example:
[Figure: true response variables $y_i$ for 5 observations plotted against $x_i$, with observation 2 at $(x_2, y_2)$; the OLS estimator generates the regression line]
• Lie: $\tilde{y}_2 \neq y_2$. The regression line shifts slightly downwards, and the new prediction for $x_2$ is closer to the true $y_2$
• By lying and under-reporting $y_2$, agent 2 can be better off
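The manipulation described in this example can be reproduced numerically. The five data points below are made up for illustration (the slides do not give numbers), and the function names are assumptions; agent 2 of the slide is index 1 in the code.

```python
def ols_simple(x, y):
    """Closed-form OLS for simple regression; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def prediction_for(x, reports, i):
    """OLS prediction for observation i given the reported sample."""
    b0, b1 = ols_simple(x, reports)
    return b0 + b1 * x[i]


# Made-up true sample: all points on y = 2x except agent 2, who lies below it.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_true = [2.0, 2.0, 6.0, 8.0, 10.0]
agent = 1  # agent 2 in the slide's 1-based numbering

honest = prediction_for(x, y_true, agent)       # about 3.4

lied_reports = list(y_true)
lied_reports[agent] = 0.0                       # under-report: 0.0 instead of 2.0
lied = prediction_for(x, lied_reports, agent)   # about 2.8

honest_gap = abs(honest - y_true[agent])        # about 1.4
lied_gap = abs(lied - y_true[agent])            # about 0.8
```

Under-reporting pulls the fitted line down, so agent 2's prediction moves from about 3.4 to about 2.8, closer to the true value 2.0.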

A STRATEGY-PROOF ESTIMATOR
• Only recommended for the case of $Z = (X, \tilde{Y})$ such that $x_i > 0 \ \forall i \in N$ and $\beta_0 = 0$: it is an extension of the median voter theorem: the MV estimator, defined as:
$\hat{\beta}_1 = \operatorname{med}_{i \in N} \left\{ \frac{\tilde{y}_i}{x_i} \right\}, \quad \hat{\beta}_0 = \operatorname{med}_{i \in N} \left( \tilde{y}_i - \hat{\beta}_1 x_i \right)$
[Figure: case of 5 observations in the $(x_i, \tilde{y}_i)$ plane, with the five slopes $\tilde{y}_i / x_i$ labelled 1 to 5; $\hat{\beta}_1$ is the median of the slopes]
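A minimal sketch of the MV estimator as defined above, using the standard library's `statistics.median`. The sample is made up for illustration, and the grid search over lies is only a spot check on this one sample under distance-based preferences, not a proof of strategy-proofness.

```python
from statistics import median


def mv_estimator(x, reports):
    """MV estimator: b1 = median of slopes, b0 = median of residuals."""
    if any(xi <= 0 for xi in x):
        raise ValueError("the MV estimator assumes x_i > 0 for all i")
    b1 = median([r / xi for r, xi in zip(reports, x)])
    b0 = median([r - b1 * xi for r, xi in zip(reports, x)])
    return b0, b1


def prediction_for(x, reports, i):
    """MV prediction for observation i given the reported sample."""
    b0, b1 = mv_estimator(x, reports)
    return b0 + b1 * x[i]


# Made-up sample of 5 observations with x_i > 0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_true = [1.0, 5.0, 6.0, 6.0, 12.5]
agent = 1  # true output 5.0

b0, b1 = mv_estimator(x, y_true)  # slopes 1, 2.5, 2, 1.5, 2.5 have median 2
honest_gap = abs(prediction_for(x, y_true, agent) - y_true[agent])

# Try lies from 0.0 to 20.0 in steps of 0.5 and record the best gap achieved.
gaps = []
for k in range(41):
    lied = list(y_true)
    lied[agent] = 0.5 * k
    gaps.append(abs(prediction_for(x, lied, agent) - y_true[agent]))
best_gap = min(gaps)
```

On this sample the honest prediction for the agent is 4.0, and no report on the grid brings the prediction closer to the true value 5.0: exaggerating upwards leaves both medians unchanged, while under-reporting only lowers the prediction.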
