Predicting the Results of the Scottish Referendum Zhou Fang - - PowerPoint PPT Presentation

slide-1
SLIDE 1

Predicting the Results of the Scottish Referendum

Zhou Fang

slide-2
SLIDE 2

(Most of this presentation was prepared on the 31st of July. We’ll have a look at how things have changed later…)

slide-3
SLIDE 3

The Referendum

Scots (and people living in Scotland) are due to vote on the 18th of September on whether or not Scotland becomes an independent country.

  • Huge public interest
  • Intensely fought campaign
  • Result will shape the future of the UK

slide-4
SLIDE 4

Opinion Polls

Several companies conduct opinion polls on how people plan to vote in the referendum. (We focus on the more mainstream pollsters of the British Polling Council.)

Polls produce a wealth of data on public opinion with respect to the referendum. In the past, poll data has been used very successfully to predict the results of, for example, the 2012 US elections.

Can this be done for the Scottish Referendum?

slide-5
SLIDE 5

A look at the polls

Running average of polls

slide-6
SLIDE 6

Nate Silver

"There's virtually no chance that the 'yes' side will win. If you look at the polls, it's pretty definitive really where the no side is at 60-55% and the yes side is about 40 or so."

"There is a wide variety of polls and they all show the 'no' vote ahead, some by modest margins and some by overwhelming margins. The best you can do is take an average of those."

(13th August 2013)

slide-7
SLIDE 7

Another look at the polls

Let’s focus on the Yes share of the (non-undecided) vote:

Y / (Y + N)

To smooth the polls, we can also use Princeton professor Sam Wang’s median-based method, which was also successful in 2012. We get:

slide-8
SLIDE 8

Yes share of vote with 1 month rolling medians
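
The Yes share and a one-month rolling median can be sketched in a few lines of Python. This is a plain trailing-window median offered as an illustration; it is not necessarily Sam Wang's exact procedure, and the function names and the 30-day window are assumptions made for this sketch.

```python
from datetime import timedelta
from statistics import median


def yes_share(yes_pct, no_pct):
    """Yes share of the decided vote: Y / (Y + N)."""
    return yes_pct / (yes_pct + no_pct)


def rolling_median(polls, window_days=30):
    """Trailing-window median of the Yes share.

    polls: list of (date, yes_pct, no_pct) tuples (hypothetical input format).
    Returns a list of (date, median Yes share over the preceding window_days).
    """
    polls = sorted(polls)
    out = []
    for d, _, _ in polls:
        start = d - timedelta(days=window_days)
        # Keep polls published after `start` and up to and including `d`.
        window = [yes_share(y, n) for (pd, y, n) in polls if start < pd <= d]
        out.append((d, median(window)))
    return out
```

A median is less sensitive than a mean to a single outlying poll, which is presumably why it is attractive here, but it still treats all pollsters as interchangeable.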

slide-9
SLIDE 9

But...

  • Difficult to interpret as a prediction for Day 0
  • Very non-smooth
  • Ignores effect of different polling companies

○ Could differences be due to this?

slide-10
SLIDE 10

Yes share of vote with 1 month rolling medians

slide-11
SLIDE 11

Yes share of vote, with lines aggregating polls from the same pollster

slide-12
SLIDE 12

Spline model

Assuming

  • the underlying pattern of variation is smooth
  • the ‘house effect’ of each pollster is constant over time

it is natural to opt for a spline model to smooth the data and make extrapolations:

$$\min_{f,\,A} \; \lVert \mathrm{YesVote}(t,i) - f(t) - A_i \rVert^2 + P(f)$$

with P a smoothness penalty, and t, i the day and pollster associated with each poll. Applying to the data, we get:

slide-13
SLIDE 13

Results of spline model with house effect adjustments (Using package ‘mgcv’ in R)
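
For readers without R, the structure of the model (a smooth trend plus a constant house effect per pollster) can be illustrated with a small NumPy sketch. This is not the mgcv fit used for the slide: it swaps the spline for one trend value per day with a squared second-difference penalty (a Whittaker-style smoother). The function name, the `lam` value and the sum-to-zero convention for house effects are all assumptions of this sketch.

```python
import numpy as np


def fit_trend_with_house_effects(day, pollster, yes_share, n_pollsters,
                                 end_day=None, lam=50.0):
    """Penalised least squares for yes_share ~ f(day) + A[pollster].

    day:      integer day of each poll (e.g. negative days before the vote)
    pollster: integer pollster index (0 .. n_pollsters - 1) of each poll
    end_day:  last day of the trend grid; set to 0 to extrapolate to Day 0
    lam:      weight of the second-difference penalty (arbitrary default)
    Returns (f, house): f has one value per day from min(day) to end_day,
    house holds the per-pollster offsets, centred to sum to zero.
    """
    day = np.asarray(day, dtype=int)
    pollster = np.asarray(pollster, dtype=int)
    y = np.asarray(yes_share, dtype=float)
    start = day.min()
    end = day.max() if end_day is None else end_day
    n_days = end - start + 1
    n = len(y)

    # Each poll picks out one trend value and one house effect.
    X = np.zeros((n, n_days + n_pollsters))
    X[np.arange(n), day - start] = 1.0
    X[np.arange(n), n_days + pollster] = 1.0

    # Second differences of f, scaled so their squared norm is lam * penalty.
    D = np.diff(np.eye(n_days), n=2, axis=0)
    penalty_rows = np.hstack([np.sqrt(lam) * D,
                              np.zeros((n_days - 2, n_pollsters))])

    lhs = np.vstack([X, penalty_rows])
    rhs = np.concatenate([y, np.zeros(n_days - 2)])
    beta, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)

    # A constant can move freely between f and the house effects;
    # fix it by centring the house effects on zero.
    f, house = beta[:n_days], beta[n_days:]
    shift = house.mean()
    return f + shift, house - shift
```

With `end_day=0` the last element of `f` plays the role of f(0); because the penalty only sees second differences, the extrapolation beyond the last poll is a straight-line continuation of the fitted trend.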

slide-14
SLIDE 14

Is this enough?

In principle we can make a prediction for referendum day by taking f(0), and some average, say, across the polling companies. However...

slide-15
SLIDE 15

Sampling and weighting

Different pollsters represent different methods of

  • sampling
  • weighting (especially to political affiliation)
  • asking the question

This can make a big difference! Few previous referendums, so difficult to say which procedure is correct.

slide-16
SLIDE 16

‘Game changer’?

Sudden changes of public opinion in the last few months of a campaign do happen ... even without an obvious ‘event’ to explain them...

Even just before the election, opinion polls can fail if pollsters make wrong assumptions about whether people who say they will vote actually go and vote.

It is thus not so easy to predict. For example, applying the method to another referendum 75 days before the end:

slide-17
SLIDE 17

AV referendum at 75 days out

slide-18
SLIDE 18

AV referendum, all the data

slide-19
SLIDE 19

Including the error

We have very little data to use to fully account for these effects. To at least incorporate them, adopt a randomisation-based approach.

  1. Randomly select a polling company and make a prediction at Day 0.
  2. Randomly add on +/- a Day-75 prediction error from one of a number of similar previous elections with opinion poll results.

Do this many times to create a distribution of predicted results.
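
The two randomisation steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the "+/-" is read as a random sign applied to the absolute historical error, and the inputs (per-pollster Day 0 predictions and Day-75 errors from earlier votes) have to be supplied by the user.

```python
import random


def simulate_yes_share(day0_predictions, historical_errors,
                       n_sims=10000, seed=0):
    """Simulate Yes shares by resampling pollsters and past errors.

    day0_predictions:  Day 0 Yes-share prediction for each polling company
    historical_errors: absolute Day-75 prediction errors from earlier votes
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sims):
        base = rng.choice(day0_predictions)   # step 1: pick a pollster
        err = rng.choice(historical_errors)   # step 2: pick a past error
        sign = rng.choice((-1.0, 1.0))        # assumed: error equally likely either way
        sims.append(base + sign * err)
    return sims


def no_win_probability(sims):
    """Fraction of simulations in which Yes falls short of 50%."""
    return sum(s < 0.5 for s in sims) / len(sims)
```

Resampling whole pollsters and whole historical errors avoids having to model either source of uncertainty parametrically, at the cost of relying on a very small set of past elections.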

slide-20
SLIDE 20

Simulated Yes votes. (Considered elections: AV vote, 2010 general election, Welsh devolution referendum, 2011 Scottish Parliament election)

slide-21
SLIDE 21

Conclusion

  • According to our model and simulation, at the end of July No has approximately a 69% chance of winning the coming referendum.
  • This value will change as more polls come in and we get closer to referendum day.
  • Clearly a lot of assumptions have been made!

Curiously, our value is essentially identical to the value obtained by David Bell’s (University of Stirling) analysis of bets made on prediction markets. (The Independence Referendum: Predicting the Outcome)

slide-22
SLIDE 22

What happened since?

Since the presentation was given:

  • There have been two televised debates
  • A number of additional polls have been published
  • We are now closer to the referendum...
slide-23
SLIDE 23

Yes share of vote, updated - dashed lines denote debates

slide-24
SLIDE 24

Yes share of vote, updated - Blue is the original spline, Red is with newer data

slide-25
SLIDE 25

Simulated referendum vote shares

slide-26
SLIDE 26

Simulated referendum vote shares - final estimate and win probability

slide-27
SLIDE 27

Any questions?

zhou.fang@bioss.ac.uk
zhou.zfang@gmail.com