Predicting Constituency Vote Shares from Pre-Election Polls



SLIDE 1

Predicting Constituency Vote Shares from Pre-Election Polls

Chris Hanretty (UEA) Benjamin E. Lauderdale (LSE) Nick Vivyan (Durham University)

1 / 24

SLIDE 2

#1: The problem

SLIDE 3

Constituency-level election prediction in the UK

  • Generating constituency-level polling estimates for the 632 constituencies (England, Wales, Scotland) is infeasible.
  • For a sample of 500 per constituency, we would need a national sample of 316,000.
  • Uniform national swing is a reasonable approximation, but could be wrong in any given election.
  • How can we combine national polling data and other sources of relevant information to generate better constituency-level predictions?
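The sample-size arithmetic and the uniform national swing baseline mentioned above can be sketched in a few lines. This is only an illustration: the vote shares in the example are made up, and `uniform_swing` is a hypothetical helper name, not something from the talk.

```python
# Sample size needed to poll every constituency directly.
N_CONSTITUENCIES = 632   # England, Wales and Scotland (from the slide)
SAMPLE_PER_SEAT = 500    # target respondents per constituency (from the slide)

required_national_sample = N_CONSTITUENCIES * SAMPLE_PER_SEAT
print(required_national_sample)  # 316000


def uniform_swing(prev_seat_share, prev_national_share, new_national_share):
    """Uniform national swing: add the national change to the seat's previous share."""
    swing = new_national_share - prev_national_share
    predicted = prev_seat_share + swing
    # Clip to the valid percentage range.
    return max(0.0, min(100.0, predicted))


# Hypothetical example: a party won 40% in a seat last time,
# and its national share moved from 32% to 36%.
print(uniform_swing(40.0, 32.0, 36.0))  # 44.0
```

Uniform swing uses no current information about the individual seat, which is why it can go wrong when the change in support differs across constituencies.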

SLIDE 4

#2: Using information about constituencies

SLIDE 7

Constituency-level information about constituencies

  • Principle:
      • People who live in constituencies with similar characteristics are more similar in their voting intentions
      • Each respondent we poll tells us a little bit about respondents in similar constituencies
  • Procedure: Multilevel Regression
      • Build a regression model to predict individual-level votes with constituency-level characteristics
      • Use regression estimates to predict vote shares in each constituency
  • Caveats:
      • Only as helpful as the predictive power of the variables we use
      • Vote in the last election is a very powerful predictor (results come out close to uniform swing)
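One way to see how a multilevel regression borrows strength across similar constituencies is the partial-pooling formula for a random-intercept model with known variances. The sketch below is an illustration under simplifying assumptions, not the authors' model: the regression prediction, the variances `sigma2` and `tau2`, and all numbers are hypothetical.

```python
def partial_pool(raw_mean, n, regression_pred, sigma2, tau2):
    """Precision-weighted average of a seat's raw poll mean and its regression prediction.

    raw_mean        -- observed mean support among the n respondents in the seat
    regression_pred -- prediction from constituency-level covariates
    sigma2          -- assumed within-seat (respondent-level) variance
    tau2            -- assumed between-seat variance around the regression line
    """
    if n == 0:
        return regression_pred  # no local data: rely on the regression alone
    w = (n / sigma2) / (n / sigma2 + 1.0 / tau2)  # weight on the local data
    return w * raw_mean + (1.0 - w) * regression_pred


# Hypothetical seat: 20 respondents, 50% raw support, regression predicts 38%.
small = partial_pool(0.50, 20, 0.38, sigma2=0.25, tau2=0.0025)
# Same seat with 500 respondents: the estimate moves towards the raw mean.
large = partial_pool(0.50, 500, 0.38, sigma2=0.25, tau2=0.0025)
print(round(small, 3), round(large, 3))
```

With few respondents the estimate stays close to the regression prediction, so the quality of the constituency-level predictors matters most exactly where the local sample is thinnest.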

SLIDE 10

Individual-level information about constituencies

  • Principle:
      • People who share demographic characteristics are more similar in their voting intentions
      • Each respondent we poll tells us a little bit about respondents with similar characteristics
  • Procedure: Multilevel Regression + Post-stratification (MRP)
      • Build a regression model to predict individual-level votes with individual-level characteristics
      • Use Census data to determine how many of each type of person is in each constituency (construct post-stratification weights)
      • Use regression estimates plus post-stratification weights to predict vote shares in each constituency
  • Caveats:
      • Only as helpful as the predictive power of the variables we use
      • UK Census data availability/categories are a constraint
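The post-stratification step can be sketched as a census-weighted average of the model's predictions for each type of person. This is a minimal sketch with made-up cells, predictions and counts. In a full MRP model the cell predictions would typically also vary by constituency, which is omitted here for brevity.

```python
def poststratify(cell_predictions, cell_counts):
    """Weight each demographic cell's predicted vote share by its census count."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("constituency has no population in any cell")
    return sum(cell_predictions[cell] * count
               for cell, count in cell_counts.items()) / total


# Hypothetical model output: predicted support for a party by (age, tenure) cell.
predicted_support = {
    ("18-34", "renter"): 0.20,
    ("18-34", "owner"): 0.30,
    ("35+", "renter"): 0.35,
    ("35+", "owner"): 0.50,
}

# Hypothetical census counts for two constituencies with different make-ups.
census = {
    "Seat A": {("18-34", "renter"): 4000, ("18-34", "owner"): 1000,
               ("35+", "renter"): 2000, ("35+", "owner"): 3000},
    "Seat B": {("18-34", "renter"): 1000, ("18-34", "owner"): 1000,
               ("35+", "renter"): 2000, ("35+", "owner"): 6000},
}

estimates = {seat: poststratify(predicted_support, counts)
             for seat, counts in census.items()}
for seat, share in estimates.items():
    print(seat, round(share, 3))
```

The two seats get different estimates purely because of their demographic composition, which is also why the categories the Census publishes limit which individual-level variables can be used.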

SLIDE 13

Geographic information about constituencies

  • Principle:
      • People in nearby constituencies are more similar in their voting intentions
      • Each respondent we poll tells us a little bit about respondents in nearby constituencies
  • Procedure: Spatially Correlated Random Effects (SCRE)
      • Build a regression model where the constituency-level random effects are spatially correlated
      • Use regression estimates to predict vote shares in each constituency
  • Caveats:
      • Only as helpful as the predictive power of geography
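The slides do not say which spatial prior is used for the random effects. One common choice, assumed here purely for illustration, is the intrinsic conditional autoregressive (ICAR) prior, under which each seat's effect is centred on the average of its neighbours' effects with variance shrinking as the number of neighbours grows. The adjacency map, effect values and `sigma2` below are hypothetical.

```python
def icar_conditional(seat, effects, adjacency, sigma2):
    """Conditional mean and variance of one seat's effect given its neighbours (ICAR prior)."""
    neighbours = adjacency[seat]
    if not neighbours:
        raise ValueError(f"{seat} has no neighbours")
    mean = sum(effects[nb] for nb in neighbours) / len(neighbours)
    variance = sigma2 / len(neighbours)
    return mean, variance


# Hypothetical adjacency: four seats in a row, A-B-C-D.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
# Hypothetical current values of the constituency random effects (logit scale).
effects = {"A": 0.4, "B": 0.2, "C": -0.2, "D": -0.4}

mean_b, var_b = icar_conditional("B", effects, adjacency, sigma2=0.1)
print(mean_b, var_b)
```

Under this assumption, a seat with few respondents of its own is pulled towards what the poll shows in adjacent seats.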

SLIDE 14

More information is better

  • We don’t need to choose between individual, constituency, and geographic data
  • We can combine all three.
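The slides do not give the model equation. Assuming a logistic regression with additive terms, combining the three sources amounts to summing their contributions on the logit scale before converting to a probability. The function name and all values below are hypothetical.

```python
import math


def vote_probability(individual_terms, constituency_term, spatial_term, intercept=0.0):
    """Inverse-logit of an additive predictor built from the three information sources."""
    eta = intercept + sum(individual_terms) + constituency_term + spatial_term
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical logit-scale contributions for one respondent type in one seat.
p = vote_probability(
    individual_terms=[0.3, -0.1],  # e.g. age group and housing tenure effects
    constituency_term=0.4,         # e.g. from lagged vote share and density
    spatial_term=-0.2,             # spatially correlated random effect
    intercept=-0.8,
)
print(round(p, 3))
```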

SLIDE 15

#3: Revisiting 2010

SLIDE 16

Survey data

  • 2010 British Election Study CIPS data (unweighted)
  • 12,177 total sample size
  • 632 constituencies in England, Wales and Scotland
  • Mean of 19.3 respondents per constituency (range: 3 to 46)
  • How well could we have predicted the 2010 constituency-level results given these data?
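To see why the raw per-constituency samples are too small on their own, the figures above can be checked with a quick calculation. The margin-of-error formula assumes simple random sampling, and the 35% support level is a hypothetical value chosen for illustration.

```python
import math

total_sample = 12177   # from the slide
n_seats = 632          # from the slide

mean_per_seat = total_sample / n_seats
print(round(mean_per_seat, 1))  # 19.3


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical party on 35% in a seat with 19 respondents.
moe = margin_of_error(0.35, 19)
print(round(100 * moe, 1), "percentage points")
```

A margin of error above 20 percentage points for a typical seat is far too wide to call a constituency race, which motivates pooling information across seats.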

SLIDE 17

Actual and survey-based national vote shares

Party               Actual vote share (%)   Raw survey vote share (%)
Conservatives       36.1                    35.6
Labour              29.0                    26.0
Liberal Democrats   23.0                    27.1
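The gap between the raw survey and the actual result can be computed directly from the figures above. The snippet only restates the numbers on the slide; the variable names are illustrative.

```python
# (actual share, raw survey share) in percent, as reported on the slide.
shares = {
    "Conservatives": (36.1, 35.6),
    "Labour": (29.0, 26.0),
    "Liberal Democrats": (23.0, 27.1),
}

# Survey minus actual, in percentage points.
errors = {party: round(survey - actual, 1)
          for party, (actual, survey) in shares.items()}
mean_abs_error = round(sum(abs(e) for e in errors.values()) / len(errors), 2)

for party, err in errors.items():
    print(f"{party}: {err:+.1f}")
print("Mean absolute error:", mean_abs_error)
```

The unweighted sample understates Labour by 3.0 points and overstates the Liberal Democrats by 4.1 points, so even the national picture needs correction before any constituency-level modelling.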

SLIDE 18

Individual, constituency, and geographic data

  • Constituency-level (UK Census)
      • Lagged vote shares (2005 on 2010 boundaries)
      • Log population density
      • Log of median earnings
      • Religious composition (4 levels)
      • Region (11 levels)
  • Individual-level (UK Census)
      • Male/Female
      • Renter/Owner
      • Private/Public Sector
      • Age Group (8 levels)
      • Education Qualifications (6 levels)
      • Social Grade (4 levels)
  • Geographic (UK Ordnance Survey)
      • Constituency adjacency

SLIDE 19

Predicted vs Actual Conservative Vote, estimated using disaggregation

[Figure: predicted against actual Conservative vote share (vote.con); axes labelled Predicted and Actual, ticks from 20 to 100]

MAE = 9.26, r = 0.72

SLIDE 20

Predicted vs Actual Conservative Vote, estimated using MRP with SCRE

[Figure: predicted against actual Conservative vote share (vote.con); axes labelled Predicted and Actual, ticks from 20 to 100]

MAE = 4.47, r = 0.96
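The two summary statistics reported on these plots, mean absolute error (MAE) and the Pearson correlation r, can be computed as follows. The predicted and actual values here are made-up examples, not the 2010 data.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual vote shares."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical vote shares (percent) for four constituencies.
predicted = [30.0, 45.0, 50.0, 20.0]
actual = [34.0, 41.0, 55.0, 18.0]

print(mean_absolute_error(predicted, actual))  # 3.75
print(round(pearson_r(predicted, actual), 2))
```

MAE is in percentage points and penalises bias as well as noise, while r only measures how well the predictions rank and track the constituencies, so the two are worth reporting together.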

SLIDE 21

Model comparison

[Figure: MAE (axis from 4 to 8) by model, in three panels: Conservative, Labour, Lib Dem. Models compared:]

  • MRP with SCRE and seat-level predictors
  • SCRE with seat-level predictors
  • MRP with seat-level predictors
  • Spatially uncorrelated REs with seat-level predictors
  • Spatially uncorrelated REs only
  • Direct

SLIDE 22

Robustness to reduced sample size

[Figure: MAE (axis from 4 to 12) by model and sample size (N = 12177, N = 4000, N = 2000), in three panels: Conservative, Labour, Lib Dem. Models compared:]

  • Global smoothing
  • Seat-level predictors
  • ILPP
  • Local smoothing
  • ILPP + local smoothing

SLIDE 23

#4: Conclusion

SLIDE 24

Summary

  • Multiple kinds of information can be used to generate constituency-level estimates of vote intention from a medium-sized national poll:
      • Individual-level demographics
      • Constituency-level characteristics
      • Geographic proximity
  • The payoff is large: a 10x-100x improvement in the effective sample size.
  • http://constituencyopinion.org.uk/
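As a rough reading of the effective sample size claim, the multipliers can be applied to the mean of 19.3 respondents per constituency reported for the 2010 survey. Treating the multiplier as applying to the per-seat mean is an assumption made for illustration, and `effective_sample` is a hypothetical helper.

```python
MEAN_RESPONDENTS_PER_SEAT = 19.3  # from the 2010 survey summary
N_SEATS = 632


def effective_sample(n, multiplier):
    """Direct-polling sample that would be equivalent after an effective-size gain."""
    return n * multiplier


for multiplier in (10, 100):
    per_seat = effective_sample(MEAN_RESPONDENTS_PER_SEAT, multiplier)
    national = per_seat * N_SEATS
    print(f"{multiplier}x: about {per_seat:.0f} per seat, {national:,.0f} nationally")
```

Under this reading, the modelled estimates behave like direct polls of roughly 190 to 1,900 respondents per seat.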
