
SLIDE 1

Predicting Constituency Vote Shares from Pre-Election Polls

Chris Hanretty (UEA), Benjamin E. Lauderdale (LSE), Nick Vivyan (Durham University)

1 / 24

SLIDE 2

#1:

The problem

2 / 24

SLIDE 3

Constituency-level election prediction in the UK

Generating constituency-level polling estimates directly for the 632 constituencies in England, Wales and Scotland is infeasible.
A sample of 500 per constituency would require a national sample of 316,000.
Uniform national swing is a reasonable approximation, but could be wrong in any given election.
How can we combine national polling data and other sources of relevant information to generate better constituency-level predictions?
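
As a rough illustration of the two points above, the sketch below reproduces the sample-size arithmetic and a uniform national swing baseline. The seat names, vote shares and the function name are made up for the example; only the 632 and 500 figures come from the slide.

```python
# Sketch: sample-size arithmetic and a uniform national swing (UNS) baseline.
# Seat names and vote shares are hypothetical, chosen only for illustration.

N_CONSTITUENCIES = 632
PER_SEAT_SAMPLE = 500
required_national_sample = N_CONSTITUENCIES * PER_SEAT_SAMPLE  # 316,000


def uniform_national_swing(last_seat_share, last_national_share, polled_national_share):
    """Add the national change in a party's share to its previous share in a seat."""
    swing = polled_national_share - last_national_share
    # Clamp so that the predicted share stays between 0 and 100 percent.
    return min(100.0, max(0.0, last_seat_share + swing))


# Hypothetical previous results (percent) for one party in three seats.
last_results = {"Seat A": 45.0, "Seat B": 30.0, "Seat C": 2.0}
predictions = {
    seat: uniform_national_swing(share, last_national_share=33.0, polled_national_share=36.0)
    for seat, share in last_results.items()
}
```

Every seat moves by the same 3 points here, which is exactly the assumption that can fail in a given election.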

3 / 24

SLIDE 4

#2:

Using information about constituencies

4 / 24

SLIDE 7

Constituency-level information about constituencies

Principle:
People who live in constituencies with similar characteristics are more similar in their voting intentions
Each respondent we poll tells us a little bit about respondents in similar constituencies

Procedure: Multilevel Regression
Build a regression model to predict individual-level votes with constituency-level characteristics
Use regression estimates to predict vote shares in each constituency

Caveats:
Only as helpful as the predictive power of the variables we use
Vote in last election is very powerful (near uniform swing)
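
The procedure can be sketched in a few lines. This is a simplified stand-in, not the model behind the slides: it fits an ordinary least-squares line of individual votes on one constituency-level predictor (lagged vote share) and then pulls each seat's raw survey mean toward the fitted line. The pooling constant `pooling_k`, the function names and the data are assumptions made for the example.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_seats(responses, lagged_share, pooling_k=20.0):
    """responses: list of (seat, vote), vote is 1 if the respondent backs the party.
    lagged_share: seat -> previous vote share as a proportion (0..1).
    pooling_k: assumed pooling strength; larger values trust the regression more."""
    xs = [lagged_share[seat] for seat, _ in responses]
    ys = [vote for _, vote in responses]
    intercept, slope = fit_line(xs, ys)

    by_seat = defaultdict(list)
    for seat, vote in responses:
        by_seat[seat].append(vote)

    estimates = {}
    for seat, lag in lagged_share.items():
        estimate = intercept + slope * lag  # regression prediction for the seat
        votes = by_seat.get(seat, [])
        if votes:
            raw_mean = sum(votes) / len(votes)
            weight = len(votes) / (len(votes) + pooling_k)
            estimate += weight * (raw_mean - estimate)  # partial pooling
        estimates[seat] = min(1.0, max(0.0, estimate))
    return estimates


# Hypothetical data: Seat C has no respondents but still gets a prediction.
lagged_share = {"Seat A": 0.2, "Seat B": 0.6, "Seat C": 0.4}
responses = (
    [("Seat A", 1)] + [("Seat A", 0)] * 4
    + [("Seat B", 1)] * 3 + [("Seat B", 0)] * 2
)
estimates = predict_seats(responses, lagged_share)
```

Seat C is predicted purely from its lagged vote share, which shows how respondents elsewhere inform seats that were barely polled.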

7 / 24

SLIDE 10

Individual-level information about constituencies

Principle:
People who share demographic characteristics are more similar in their voting intentions
Each respondent we poll tells us a little bit about respondents with similar characteristics

Procedure: Multilevel Regression + Post-stratification (MRP)
Build a regression model to predict individual-level votes with individual-level characteristics
Use Census data to determine how many of each type of person is in each constituency (construct post-stratification weights)
Use regression estimates plus post-stratification weights to predict vote shares in each constituency

Caveats:
Only as helpful as the predictive power of the variables we use
UK Census data availability/categories are a constraint
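
The post-stratification step is a weighted average, sketched below. The demographic cells, the per-cell predictions (which would come from the regression model) and the census counts are all invented for the example.

```python
def poststratify(cell_predictions, census_counts):
    """Average the per-type predictions in each seat, weighting each type
    by the number of people of that type living in the seat."""
    estimates = {}
    for seat, counts in census_counts.items():
        total = sum(counts.values())
        weighted = sum(cell_predictions[cell] * n for cell, n in counts.items())
        estimates[seat] = weighted / total
    return estimates


# Hypothetical model output: predicted support by (tenure, age group).
cell_predictions = {
    ("owner", "55+"): 0.50,
    ("owner", "18-54"): 0.40,
    ("renter", "55+"): 0.30,
    ("renter", "18-54"): 0.20,
}

# Hypothetical census counts of each type of person per seat.
census_counts = {
    "Seat A": {("owner", "55+"): 400, ("owner", "18-54"): 300,
               ("renter", "55+"): 100, ("renter", "18-54"): 200},
    "Seat B": {("owner", "55+"): 100, ("owner", "18-54"): 200,
               ("renter", "55+"): 200, ("renter", "18-54"): 500},
}

seat_estimates = poststratify(cell_predictions, census_counts)
# Seat A: about 0.39, Seat B: about 0.29
```

The same cell predictions give different seat estimates only because the population mix differs, which is why the available Census categories constrain what the model can use.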

10 / 24

SLIDE 13

Geographic information about constituencies

Principle:
People in nearby constituencies are more similar in their voting intentions
Each respondent we poll tells us a little bit about respondents in nearby constituencies

Procedure: Spatially Correlated Random Effects (SCRE)
Build a regression model where the constituency-level random effects are spatially correlated
Use regression estimates to predict vote shares in each constituency

Caveats:
Only as helpful as the predictive power of geography
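
The slides do not spell out the spatial model, so the following is only a crude illustration of the idea: each seat's effect is repeatedly pulled toward the average of its neighbours' effects, with less pull where the seat has more respondents of its own. The `strength` parameter, the function name and the data are assumptions for the example.

```python
def smooth_effects(raw_effect, n_respondents, neighbours, strength=5.0, sweeps=200):
    """raw_effect: seat -> observed deviation from the regression prediction.
    n_respondents: seat -> number of respondents (0 if the seat was not polled).
    neighbours: seat -> list of adjacent seats.
    strength: assumed weight given to each neighbour link."""
    effect = {seat: raw_effect.get(seat, 0.0) for seat in neighbours}
    for _ in range(sweeps):
        for seat, adjacent in neighbours.items():
            if not adjacent:
                continue  # an isolated seat keeps its raw effect
            neighbour_mean = sum(effect[a] for a in adjacent) / len(adjacent)
            data_weight = n_respondents.get(seat, 0)
            prior_weight = strength * len(adjacent)
            effect[seat] = (
                data_weight * raw_effect.get(seat, 0.0) + prior_weight * neighbour_mean
            ) / (data_weight + prior_weight)
    return effect


# Hypothetical chain of seats A - B - C, where B has no respondents.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
raw_effect = {"A": 0.10, "C": -0.02}
n_respondents = {"A": 20, "B": 0, "C": 20}

effect = smooth_effects(raw_effect, n_respondents, neighbours)
```

Seat B ends up at the average of its two neighbours, while A and C move only part of the way toward B because they have their own data.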

13 / 24

SLIDE 14

More information is better

We don’t need to choose between individual, constituency, and geographic data

We can combine all three.

14 / 24

SLIDE 15

#3:

Revisiting 2010

15 / 24

SLIDE 16

Survey data

2010 British Election Study CIPS data (unweighted)
12,177 total sample size
632 constituencies in England, Wales and Scotland
Mean of 19.3 respondents per constituency (range: 3 to 46)
How well could we have predicted the 2010 constituency-level results given these data?

16 / 24

SLIDE 17

Actual and survey-based national vote shares

Party               Actual vote share (%)   Raw survey vote share (%)
Conservatives       36.1                    35.6
Labour              29.0                    26.0
Liberal Democrats   23.0                    27.1
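
To make the gap between the raw survey and the result explicit, the short sketch below subtracts the actual share from the raw survey share for each party, using only the figures in the table. The variable names are chosen for the example.

```python
# Vote shares in percent, copied from the table above.
actual = {"Conservatives": 36.1, "Labour": 29.0, "Liberal Democrats": 23.0}
raw_survey = {"Conservatives": 35.6, "Labour": 26.0, "Liberal Democrats": 27.1}

# Positive values mean the raw survey overstates the party.
survey_error = {
    party: round(raw_survey[party] - actual[party], 1) for party in actual
}
# Conservatives -0.5, Labour -3.0, Liberal Democrats +4.1
```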

17 / 24

SLIDE 18

Individual, constituency, and geographic data

Constituency-level (UK Census):
Lagged vote shares (2005 on 2010 boundaries)
Log population density
Log of median earnings
Religious composition (4 levels)
Region (11 levels)

Individual-level (UK Census):
Male/Female
Renter/Owner
Private/Public Sector
Age Group (8 levels)
Education Qualifications (6 levels)
Social Grade (4 levels)

Geographic (UK Ordnance Survey):
Constituency adjacency

18 / 24

SLIDE 19

Predicted vs Actual Conservative Vote, estimated using disaggregation

[Figure: scatter plot of predicted against actual Conservative vote share (vote.con); axis ticks from 20 to 100 on both axes.]

MAE = 9.26, r = 0.72
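
For reference, the two summary statistics can be computed as below, assuming MAE is the mean absolute error and r the Pearson correlation between predicted and actual shares. The three data points are invented and are not taken from the figure.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual vote shares."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical vote shares (percent) for three seats.
predicted = [30.0, 40.0, 50.0]
actual_shares = [32.0, 38.0, 53.0]

mae = mean_absolute_error(predicted, actual_shares)
r = pearson_r(predicted, actual_shares)
```

MAE is in percentage points of vote share, so it reads directly as the typical size of a seat-level miss.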

19 / 24

SLIDE 20

Predicted vs Actual Conservative Vote, estimated using MRP with SCRE

[Figure: scatter plot of predicted against actual Conservative vote share (vote.con); axis ticks from 20 to 100 on both axes.]

MAE = 4.47, r = 0.96

20 / 24

SLIDE 21

Model comparison

[Figure: MAE by model, one panel each for Conservative, Labour and Lib Dem; x-axis MAE, ticks at 4, 6 and 8.]

Models compared:
MRP with SCRE and seat-level predictors
SCRE with seat-level predictors
MRP with seat-level predictors
Spatially uncorrelated REs with seat-level predictors
Spatially uncorrelated REs only
Direct

21 / 24

SLIDE 22

Robustness to reduced sample size

[Figure: MAE by method and sample size, one panel each for Conservative, Labour and Lib Dem; x-axis MAE, ticks from 4 to 12.]

Methods compared:
Global smoothing
Seat-level predictors
ILPP
Local smoothing
ILPP + local smoothing

Sample sizes: N = 12177, N = 4000, N = 2000

22 / 24

SLIDE 23

#4:

Conclusion

23 / 24

SLIDE 24

Summary

Multiple kinds of information can be used to generate constituency-level estimates of vote intention from a medium-sized national poll:
Individual-level demographics
Constituency-level characteristics
Geographic proximity
The payoff is large: a 10x-100x improvement in the effective sample size.

http://constituencyopinion.org.uk/

24 / 24