Combining Estimates from Related Surveys via Bivariate Models
(Application: using ACS estimates to improve estimates from smaller U.S. surveys)
William R. Bell and Carolina Franco, U.S. Census Bureau
2016 Ross-Royall Symposium
February 26, 2016
Bell & Franco, Combining estimates from related surveys, February 26, 2016

Disclaimer: This report is released to inform interested parties of ongoing research and to encourage discussion. The views expressed on statistical, methodological, technical, or operational issues are those of the author(s) and not necessarily those of the U.S. Census Bureau.

Introduction
Investigate the potential of using bivariate models to borrow strength from estimates from a large survey to improve related estimates from smaller surveys.
Motivation: “Large survey” is the Census Bureau’s American Community Survey (ACS), the largest U.S. household survey.
Approach is simple and requires no covariates from auxiliary information.
Real examples show that large reductions in standard errors of estimates are possible.

ACS: The Largest U.S. Household Survey
American Community Survey (ACS)
  Conducted annually (data collected throughout the year) and has replaced the decennial census long form sample.
  Samples approximately 3.5 million addresses each year.
  Encompasses a broad range of topics: demographic, income, health insurance, employment, disabilities, occupations, housing, education, veteran status, etc.
  Produces estimates annually based on 1 or 5 years of data.

Three Smaller U.S. Surveys
Survey of Income and Program Participation (SIPP) Disability Module
  Approx. 37,000 households and 70,000 persons in the 2008 panel.
  Detailed questions about many different aspects of disability.
National Health Interview Survey (NHIS)
  About 110,000 persons in the Family Core component, 2013.
  Questions about a broad range of health topics asked in personal household interviews.
  Estimates used to track health status, health care access, and progress toward achieving national health objectives.
Current Population Survey (CPS) Annual Social and Economic Supplement
  Samples about 100,000 addresses.
  Provides official national estimates of income and poverty.

Four Applications
1. SIPP estimates of U.S. state disability rates.
   ACS variable: estimate of state disability rates (types of disabilities and the time frames differ from SIPP).
2. NHIS estimates of U.S. state uninsured rates.
   ACS variable: estimate of U.S. state uninsured rates (questions asked and the mode of survey delivery and design differ from NHIS).
3. CPS estimates of per capita expenditure on health insurance premiums by state.
   ACS variable: estimated per capita income by state.
4. ACS 1-yr estimates (of anything! Take county rates of children in poverty to illustrate).
   2nd variable: corresponding previous ACS 5-yr estimates (larger sample size, but less current).

Univariate Gaussian Shrinkage Model for Survey Estimates
For $m$ small areas:
$$y_i = Y_i + e_i, \quad i = 1, \ldots, m$$
$$Y_i = \mu + u_i$$
$y_i$ is the direct survey estimate of $Y_i$, the population characteristic of interest for area $i$.
$e_i$ is the sampling error in $y_i$, generally assumed to be $N(0, v_i)$, independent, with $v_i$ known.
$u_i$ is the area $i$ random effect, usually assumed to be i.i.d. $N(0, \sigma_u^2)$ and independent of the $e_i$.

Shrinkage Estimation (Stein 1956, Carter and Rolph 1974)
Best linear predictor of $Y_i$ ($\mu$ and $\sigma_u^2$ known):
$$\hat{Y}_i = (1 - \gamma_i)\, y_i + \gamma_i\, \mu, \quad \text{where} \quad \gamma_i = \frac{v_i}{v_i + \sigma_u^2}$$
The weighted average $\hat{Y}_i$ “shrinks” the direct estimate $y_i$ towards the overall mean $\mu$.
The smaller the sampling variance $v_i$, the more weight is placed on the direct survey estimate $y_i$.
Parameters unknown: estimate by ML or REML, or take a Bayesian approach.
Fay and Herriot (1979) extended the approach to shrink $y_i$ towards a regression mean $\mu_i = x_i'\beta$, and applied this approach to small area estimation.
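The shrinkage formula can be illustrated with a minimal Python sketch. It assumes $\mu$ and $\sigma_u^2$ are known, as on the slide. The function name `shrink_univariate` and all numeric inputs are hypothetical and chosen only for illustration; they are not data from the talk.

```python
def shrink_univariate(y, v, mu, sigma2_u):
    """Best linear predictor of each Y_i when mu and sigma2_u are known."""
    estimates = []
    for y_i, v_i in zip(y, v):
        # gamma_i is the weight placed on the overall mean mu
        gamma_i = v_i / (v_i + sigma2_u)
        estimates.append((1.0 - gamma_i) * y_i + gamma_i * mu)
    return estimates


# Hypothetical direct estimates and sampling variances for three areas
y = [0.10, 0.20, 0.30]
v = [0.0025, 0.01, 0.04]
mu = 0.20          # assumed known overall mean
sigma2_u = 0.01    # assumed known random-effect variance

shrunk = shrink_univariate(y, v, mu, sigma2_u)
print(shrunk)  # approximately [0.12, 0.20, 0.22]
```

The third area has the largest sampling variance, so its estimate is pulled most strongly towards $\mu$ (weight $\gamma_3 = 0.8$), while the first area keeps most of its direct estimate ($\gamma_1 = 0.2$).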

Bivariate Gaussian Model
$$y_{1i} = Y_{1i} + e_{1i} = (\mu_1 + u_{1i}) + e_{1i}$$
$$y_{2i} = Y_{2i} + e_{2i} = (\mu_2 + u_{2i}) + e_{2i}, \quad i = 1, \ldots, m$$
$$\begin{pmatrix} u_{1i} \\ u_{2i} \end{pmatrix} \overset{\text{i.i.d.}}{\sim} N(0, \Sigma), \quad \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}$$
$$\begin{pmatrix} e_{1i} \\ e_{2i} \end{pmatrix} \overset{\text{ind.}}{\sim} N(0, V_i), \quad V_i = \begin{pmatrix} v_{1i} & 0 \\ 0 & v_{2i} \end{pmatrix}$$
$y_{1i}$ is the direct estimate of the quantity of interest $Y_{1i}$, and $y_{2i}$ is the direct estimate from another survey of a related quantity $Y_{2i}$.
Note that $V_i$ assumes the sampling errors $e_{1i}$ and $e_{2i}$ are uncorrelated. This can be generalized.
The alternative of simply including $y_{2i}$ as a regression covariate in the model would ignore the sampling errors in the $y_{2i}$!

Estimation/Inference for Model Parameters
Unknown parameters: $\mu_1$, $\mu_2$, $\sigma_{11}$, $\sigma_{22}$, and $\sigma_{12}$ or $\rho = \sigma_{12} / \sqrt{\sigma_{11}\sigma_{22}}$.
Sampling variances $v_{1i}$ and $v_{2i}$ are treated as known (really estimated using survey microdata).
Can estimate unknown parameters by ML or REML.
We shall use a Bayesian approach with flat priors on $\mu_1$, $\mu_2$, $\sigma_{11} > 0$, $\sigma_{22} > 0$ and $\rho \in (-1, 1)$.
Approach was implemented in JAGS.
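Both ML estimation and the flat-prior Bayesian approach rest on the marginal likelihood of the direct estimates. Under the bivariate model, $y_i = \mu + u_i + e_i$ with $u_i$ and $e_i$ independent, so $y_i \sim N(\mu, \Sigma + V_i)$ independently across areas. The Python sketch below evaluates that log-likelihood. It is only an illustration, not the authors' JAGS implementation; the function name `log_likelihood` and its argument layout are assumptions.

```python
import math


def log_likelihood(y1, y2, v1, v2, mu1, mu2, s11, s22, rho):
    """Marginal Gaussian log-likelihood: y_i ~ N(mu, Sigma + V_i), independent over i."""
    s12 = rho * math.sqrt(s11 * s22)
    total = 0.0
    for y1i, y2i, v1i, v2i in zip(y1, y2, v1, v2):
        # Entries of the 2x2 marginal covariance Sigma + V_i (V_i diagonal)
        a = s11 + v1i
        d = s22 + v2i
        det = a * d - s12 * s12
        r1 = y1i - mu1
        r2 = y2i - mu2
        # Quadratic form (y_i - mu)' (Sigma + V_i)^{-1} (y_i - mu)
        quad = (d * r1 * r1 - 2.0 * s12 * r1 * r2 + a * r2 * r2) / det
        total += -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    return total
```

With flat priors, the posterior density is proportional to the exponential of this function over the allowed parameter range, so the same quantity could in principle be maximized for ML or passed to a generic sampler.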

Prediction When Model Parameters are Known
In matrix notation, $y_i = Y_i + e_i = (\mu + u_i) + e_i$, and
$$\hat{Y}_i^{BP} = E(Y_i \mid y_i) = \mu + \Sigma(\Sigma + V_i)^{-1}(y_i - \mu)$$
$$\mathrm{MSE}(\hat{Y}_i^{BP}) = \mathrm{Var}(Y_i \mid y_i) = \Sigma - \Sigma(\Sigma + V_i)^{-1}\Sigma$$
We are interested in predicting $Y_{1i}$ only, not $Y_{2i}$.
$\hat{Y}_{1i}^{BP}$ is a linear combination of $\mu_1$, $(y_{1i} - \mu_1)$, and $(y_{2i} - \mu_2)$.
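For the first component, the matrix formulas reduce to a few scalar operations because $\Sigma + V_i$ is $2 \times 2$. The Python sketch below computes the best predictor of $Y_{1i}$ and its MSE for a single area, assuming all model parameters are known. The function name `predict_y1` and the numeric inputs are hypothetical.

```python
def predict_y1(y1, y2, v1, v2, mu1, mu2, s11, s22, s12):
    """Return (best predictor of Y_1i, its MSE) for one area, parameters known."""
    # Determinant of Sigma + V_i, where V_i = diag(v1, v2)
    a = s11 + v1
    d = s22 + v2
    det = a * d - s12 * s12
    # First row of Sigma (Sigma + V_i)^{-1}: weights on (y1 - mu1) and (y2 - mu2)
    w1 = (s11 * d - s12 * s12) / det
    w2 = s12 * v1 / det
    pred = mu1 + w1 * (y1 - mu1) + w2 * (y2 - mu2)
    # (1,1) element of Sigma - Sigma (Sigma + V_i)^{-1} Sigma
    mse = s11 - (w1 * s11 + w2 * s12)
    return pred, mse


# Hypothetical values: with s12 = 0 the result matches univariate shrinkage
pred_uni, mse_uni = predict_y1(0.30, 0.25, 0.04, 0.001, 0.20, 0.20, 0.01, 0.01, 0.0)
# Same inputs with correlated random effects (rho = 0.8)
pred_biv, mse_biv = predict_y1(0.30, 0.25, 0.04, 0.001, 0.20, 0.20, 0.01, 0.01, 0.008)
```

Setting `s12 = 0` gives zero weight to $y_{2i}$, which recovers the univariate predictor; a nonzero `s12` lets the more precise second estimate lower the MSE.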

MSE % Reductions from Shrinkage Estimation
Direct estimation to univariate shrinkage:
$$100 \times \left[ 1 - \frac{\mathrm{Var}(Y_{1i} \mid y_{1i})}{v_{1i}} \right]$$
(more reduction as $v_{1i}$ increases)
Univariate to bivariate shrinkage:
$$100 \times \left[ 1 - \frac{\mathrm{Var}(Y_{1i} \mid y_{1i}, y_{2i})}{\mathrm{Var}(Y_{1i} \mid y_{1i})} \right]$$
(more reduction as $v_{2i}$ decreases and as $\rho$ increases)
Direct estimation to bivariate shrinkage:
$$100 \times \left[ 1 - \frac{\mathrm{Var}(Y_{1i} \mid y_{1i}, y_{2i})}{v_{1i}} \right]$$
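The three percentage reductions can be computed directly from the model parameters when these are treated as known. The Python sketch below does this for one area, using $\mathrm{Var}(Y_{1i} \mid y_{1i}) = \sigma_{11} v_{1i} / (\sigma_{11} + v_{1i})$ and the (1,1) element of $\Sigma - \Sigma(\Sigma + V_i)^{-1}\Sigma$. The function name `mse_reductions` and the inputs are hypothetical, not values from the applications.

```python
def mse_reductions(v1, v2, s11, s22, rho):
    """Percent MSE reductions for Y_1i in one area, model parameters known."""
    s12 = rho * (s11 * s22) ** 0.5
    # Univariate posterior variance Var(Y_1i | y_1i)
    var_uni = s11 * v1 / (s11 + v1)
    # Bivariate posterior variance Var(Y_1i | y_1i, y_2i)
    det = (s11 + v1) * (s22 + v2) - s12 ** 2
    w1 = (s11 * (s22 + v2) - s12 ** 2) / det
    w2 = s12 * v1 / det
    var_biv = s11 - (w1 * s11 + w2 * s12)
    return {
        "direct_to_univariate": 100.0 * (1.0 - var_uni / v1),
        "univariate_to_bivariate": 100.0 * (1.0 - var_biv / var_uni),
        "direct_to_bivariate": 100.0 * (1.0 - var_biv / v1),
    }


# Hypothetical area: noisy first survey, precise second survey, rho = 0.8
red = mse_reductions(v1=0.04, v2=0.001, s11=0.01, s22=0.01, rho=0.8)
# Same area with rho = 0: the second survey then adds nothing
red_zero = mse_reductions(v1=0.04, v2=0.001, s11=0.01, s22=0.01, rho=0.0)
```

For a single area the reductions compose multiplicatively: the remaining fraction of MSE from direct to bivariate equals the product of the remaining fractions from the two intermediate steps.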

Application I: 2010 Disability Rates for U.S. States: SIPP borrowing from ACS
$y_{1i}$ = SIPP disability estimate, $y_{2i}$ = ACS disability estimate.
Smoothing of SIPP direct sampling variance estimates is applied.
$\hat{\rho} = 0.82$
Univariate shrinkage yields an MSE decrease of 2% to 67% from direct, with a median of 19%.
The MSE decrease from bivariate vs. univariate model is 6% to 59%, with a median of 29%.
The MSE decrease from bivariate vs. direct is 8% to 86%, with a median decrease of 43%.
