 
              Lessons Learned from Implementing Response Propensity Models in the 2013 Census Test Gina Walejko, PhD U.S. Census Bureau, Center for Survey Measurement The views expressed on statistical, methodological, technical, or operational issues are those of the author and not necessarily those of the U.S. Census Bureau.
Outline  2013 Census Test  Day 0 Model  Case Prioritization Model  Systems infrastructure  Takeaways: Goals of test; Case control & tracking; Model; Test execution 2
2013 Census Test  Operational study of Census NRFU procedures  In field Nov. - Dec. 2013  2,077 cases in Philadelphia  Used existing Census Bureau survey infrastructure  One goal included implementing automated propensity model to assign “high priority” cases to CAPI interviewers  Score propensity to complete on next attempt  Assign highest seven scores to interviewers daily 3
2013 Census Test  Operational study of NRFU procedures  In field Nov. - Dec. 2013  2,077 cases in Philadelphia  Used existing Census Bureau infrastructure  One goal included implementing automated propensity model to assign “high priority” cases to CAPI interviewers  Score propensity to complete on next attempt  Assign highest seven scores to each interviewer  CAPI interviewers assigned instructions based on daily model output. 4
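The daily assignment rule on this slide (score each case, then flag the highest seven scores for each interviewer) can be sketched as follows. This is a minimal illustration, not the Census Bureau's production program: the function name, the tuple layout, and the tie-breaking rule are all assumptions made for the example.

```python
from collections import defaultdict


def assign_high_priority(scores, top_n=7):
    """Return the top_n highest-propensity cases for each interviewer.

    scores: iterable of (case_id, interviewer_id, propensity) tuples.
    The layout and the default of seven mirror the slide; everything
    else here is a hypothetical illustration.
    """
    by_interviewer = defaultdict(list)
    for case_id, interviewer_id, propensity in scores:
        by_interviewer[interviewer_id].append((propensity, case_id))

    assignments = {}
    for interviewer_id, cases in by_interviewer.items():
        # Highest propensity first; ties broken by case_id (an assumption)
        # so the same input always yields the same instructions.
        ranked = sorted(cases, key=lambda pc: (-pc[0], pc[1]))
        assignments[interviewer_id] = [case_id for _, case_id in ranked[:top_n]]
    return assignments
```

An interviewer with fewer than seven open cases simply receives all of them as high priority under this sketch.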
Day 0 Model  Used for cases with no contact attempt data  Used 2010 decennial information to predict household’s likelihood of response on the first personal visit  Housing unit status variables, refusal indicators, respondent information, and household characteristics  Ran model once on all cases 5
Case Prioritization Model  $\log\left(\frac{p_{ic}}{1 - p_{ic}}\right) = \beta_0 + \beta_1 x_{ic1} + \cdots + \beta_k x_{ick}$  $x_{ic1}, \ldots, x_{ick}$ are the covariates on the next contact attempt, $c$, for the $i$th case associated with that contact.  $\beta_1, \ldots, \beta_k$ are regression parameters estimated from all prior contact attempts. 6
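A minimal sketch of how a fitted model of this form could score one case on its next contact attempt, assuming a logistic (log-odds) link. The function name and argument layout are hypothetical, and the coefficient values in any call are placeholders rather than estimates from the test.

```python
import math


def propensity(intercept, coefs, covariates):
    """Convert a linear predictor on the log-odds scale to a probability.

    intercept, coefs: regression parameters (placeholders, not real estimates).
    covariates: covariate values for one case on its next contact attempt.
    """
    if len(coefs) != len(covariates):
        raise ValueError("coefs and covariates must have the same length")
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    # Inverse logit: maps any real-valued linear predictor into (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))
```

Because the parameters are re-estimated from all prior contact attempts, the same case can receive a different score each day even when its own covariates do not change.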
How?  Available in same place: load frame file, collect paradata, know workload  Daily: execute program (data setup, model, business rules), generate instructions  Put instructions on server  Transmit instructions to interviewers 7
Lessons Learned 8
1. Goals of Model-Based Intervention  What is the purpose of intervention?  e.g. reduce cost, reduce nonresponse (NR) bias, increase response rate (RR)  What are the anomalies of your data collection?  e.g. proxy interviews on occupied housing units (decennial Census)  e.g. vacant housing units (decennial Census, ACS) 9
2. Case Control & Tracking Considerations  If working with multiple modes, must identify cases in each.  Mode switches (e.g. CATI to CAPI)  Late returns (i.e. in CAPI but completed via self-response)  Decide how to handle reassignments.  Two copies of one case (e.g. one on interviewer laptop and one awaiting supervisor review)  May be reassigned after instructions already delivered to interviewer  For data analyses:  If you cannot guarantee interviewer will receive instructions, attach what interviewer saw to contact attempt data.  Save daily propensities.  For monitoring purposes:  Consider what data need to be stored for reports 10
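One way to follow the data-analysis advice above is to store each day's instructions and propensities keyed by case and date, then join them onto the contact attempt records. The sketch below assumes hypothetical field names (`case_id`, `date`, `priority`, `propensity`); it is an illustration of the idea, not the record layout used in the test.

```python
def attach_instructions(attempts, daily_instructions):
    """Attach the instruction and propensity in effect to each contact attempt.

    attempts: list of dicts with at least "case_id" and "date" keys.
    daily_instructions: dict mapping (case_id, date) to a dict with
        "priority" and "propensity" keys, saved once per day.
    All field names are assumptions made for this example.
    """
    enriched = []
    for attempt in attempts:
        seen = daily_instructions.get((attempt["case_id"], attempt["date"]))
        record = dict(attempt)  # copy so the raw contact data stay untouched
        # None marks attempts made without any delivered instruction.
        record["instruction_seen"] = seen["priority"] if seen else None
        record["propensity"] = seen["propensity"] if seen else None
        enriched.append(record)
    return enriched
```

Keeping a missing value rather than dropping the attempt preserves cases that were reassigned after instructions had already been delivered.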
3. Model Considerations  If modeling contacts, decide how to manage certain attempts.  Non-sample unit members  Telephone contact attempts  Decide how to handle certain case dispositions.  Vacants and cases that are not housing units  Cases closed by supervisor’s action  Late returns completed via another mode of contact  No contacts (i.e. case “untouched” by interviewer) 11
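The decisions listed above can be written down as explicit business rules applied before the model is fit. The sketch below shows one possible choice (excluding the listed attempt types and dispositions); the slide only says these decisions must be made, so the exclusion rule, the code values, and the field names are all assumptions for illustration.

```python
# Hypothetical codes; real paradata would use the survey's own coding scheme.
EXAMPLE_EXCLUDED_ATTEMPT_TYPES = {"non_sample_unit_member", "telephone"}
EXAMPLE_EXCLUDED_DISPOSITIONS = {
    "vacant",
    "not_housing_unit",
    "closed_by_supervisor",
    "late_return",
    "untouched",
}


def select_modeling_records(attempts, excluded_attempt_types, excluded_dispositions):
    """Keep only contact attempts that the chosen rules allow into the model.

    attempts: list of dicts with "attempt_type" and "disposition" keys
    (assumed field names).
    """
    kept = []
    for attempt in attempts:
        if attempt["attempt_type"] in excluded_attempt_types:
            continue  # e.g. attempts with non-sample unit members
        if attempt["disposition"] in excluded_dispositions:
            continue  # e.g. vacants or cases closed by supervisor's action
        kept.append(attempt)
    return kept
```

Passing the rule sets as arguments keeps each decision visible and easy to change if, for example, telephone attempts should be modeled after all.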
4. Execution  Obtain interviewer compliance.  Design of case management systems (two approaches)  Train (untrain?), incentivize, supervise, but monitor.  Account for non-compliance in CAPI simulations.
Percent Days with Compliant Transmissions:
 Attempted All High Priority Cases: 45.22
 Did Not Attempt All High Priority Cases but Did Not Attempt Other Cases: 6.96
 Did Not Attempt All High Priority Cases and Attempted Other Cases: 47.83 12
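A compliance breakdown like the one on this slide can be computed by classifying each interviewer-day into one of the three categories and tabulating percentages. This is a sketch under assumptions: the two boolean flags per day and the category labels are hypothetical, and the input below is not the test's data.

```python
from collections import Counter


def classify_day(attempted_all_high_priority, attempted_other):
    """Assign one interviewer-day to a compliance category (labels are illustrative)."""
    if attempted_all_high_priority:
        return "attempted_all_high_priority"
    if attempted_other:
        return "missed_high_priority_attempted_other"
    return "missed_high_priority_no_other"


def percent_by_category(days):
    """days: iterable of (attempted_all_high_priority, attempted_other) booleans."""
    counts = Counter(classify_day(hp, other) for hp, other in days)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {category: round(100.0 * n / total, 2) for category, n in counts.items()}
```

The same per-day categories could feed the CAPI simulations mentioned above, so that simulated interviewers ignore instructions at an observed rate rather than complying perfectly.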
Questions? gina.k.walejko@census.gov Thanks to: Tamara Adams, Karen Bagwell, Stephanie Coffey, Jaya Damineni, Chandra Erdman, Susanne Johnson, Ganesan Kakkan, Scott Konicki, Peter Miller, Shadana Myers, and many others! 13
Back-up Slides 14
Case Prioritization Model (cont’d)
Type of input: Description
 Frame data: Case’s initial propensity to respond predicted by Day 0 Model; study treatment; whether or not the sample unit is in multi-unit structure
 Paradata: Mode of each contact attempt (telephone or in-person); total number of contact attempts already made on sample unit; if contact made with household member during current or any previous contact attempts; if potential respondent expressed reluctance during current or previous contact attempts; if contact performed during “peak” hours, weekend or after 6:00 p.m. on weekday
 Admin. records: If more than one person in the housing unit; if all sample unit members are less than 30 years old; if all sample unit members are 70+ years old; if there are children under 5 years old in the house 16
Systems Testing  Use an initial test to generate data to build program. Conduct subsequent test that includes instructions generated by program.  Examine diverse set of scenarios.  Interview: e.g. multiple refusals, partial complete, case put on hold, case reassigned, case recycled from different mode  Interviewer: e.g. didn’t work first day of field period, got let go before finishing all cases, didn’t follow expected procedures  Systems: e.g. instructions did not generate, data transmissions did not work  Test exact scenarios.  Be prepared to adjust scenarios if results aren’t what were expected. 17
Day 0 Model Details • Three main-effects stepwise models run on 2010 NRFU cases in the Philadelphia MSA to determine variables significant in predicting likelihood of response at the first, second, and third personal visit • Some manual examination and variable changes made to increase model parsimony • Due to the high predictive value of the main-effects models, 2-way interactions excluded • Data split into two panels, and parameters calculated on one panel and scored on the other panel to test model • Determined that first personal visit model most appropriate because we wanted to predict likelihood of a completed response at the first contact. • Predictive value was determined using concordance, how often the model correctly predicted that a response occurred within the 1st, 2nd, or 3rd visit. 18
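To make the concordance check concrete, the sketch below computes a pairwise concordance statistic on the held-out panel. It assumes the common logistic regression definition (the share of responder/nonresponder pairs in which the responder received the higher predicted propensity, with ties counted as one half); the slide does not state which definition was used, so treat this as one plausible reading.

```python
def concordance(outcomes, predictions):
    """Pairwise concordance between 0/1 outcomes and predicted propensities.

    outcomes: list of 0/1 response indicators from the scoring panel.
    predictions: predicted propensities for the same cases, in the same order.
    """
    responders = [p for y, p in zip(outcomes, predictions) if y == 1]
    nonresponders = [p for y, p in zip(outcomes, predictions) if y == 0]
    pairs = len(responders) * len(nonresponders)
    if pairs == 0:
        raise ValueError("need at least one responder and one nonresponder")

    score = 0.0
    for p_resp in responders:
        for p_non in nonresponders:
            if p_resp > p_non:
                score += 1.0  # concordant pair
            elif p_resp == p_non:
                score += 0.5  # tied pair counts half
    return score / pairs
```

A value of 0.5 corresponds to scores that rank responders no better than chance, and 1.0 to a perfect ranking.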