Forecasting the FIFA World Cup: Combining goal- and result-based team ability parameters

Feb 25, 2024

SLIDE 1

Forecasting the FIFA World Cup

Combining goal- and result-based team ability parameters

Pieter Robberechts, Jesse Davis  http://people.cs.kuleuven.be/pieter.robberechts

SLIDE 2

Introduction

A popular research topic since the 1960s.

Two popular approaches:

1. Goal-based models

Model the number of goals scored by both teams

2. Result-based models

Model win-draw-loss outcomes directly

SLIDE 3

Match outcome prediction

Typical approach:

1. Estimate team abilities based on historical match data
2. Use them to predict future match outcomes

Data → Team ratings → Predictions

SLIDE 4

Match outcome prediction

Data scraped from:

Post-WWII international games from http://eloratings.net
Betting odds from http://betexplorer.com/

SLIDE 5

Match outcome prediction

Two rating systems were explored:

ELO ratings (result-based)
ODM ratings (goal-based)

[Table: teams ranked by rating strength; visible strength values: 2320, 2237, 2220, 2207, ...]

SLIDE 6

The ELO rating system 

A Result-based rating system

Given:

$R_H$, $R_A$: current home and away team ratings

$S_H$: actual score of the home team

$$S_H = \begin{cases} 1 & \text{if the home team won} \\ 0.5 & \text{if the game was a draw} \\ 0 & \text{if the home team lost} \end{cases}$$

Then:

Expected score for the home team:

$$E_H = \frac{1}{1 + 10^{-(R_H - R_A)/400}}$$

Updated rating of the home team:

$$R'_H = R_H + k(S_H - E_H)$$
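A minimal Python sketch of the Elo update on this slide, assuming the standard logistic form of the expected score. The ratings 1900 and 1800 and the value k = 20 are hypothetical, chosen only for illustration; they are not values from the talk.

```python
def expected_home_score(r_home: float, r_away: float) -> float:
    """Expected score E_H of the home team under the standard Elo curve."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / 400.0))


def update_home_rating(r_home: float, r_away: float, s_home: float, k: float) -> float:
    """Home rating after a game with actual score s_home (1, 0.5 or 0)."""
    return r_home + k * (s_home - expected_home_score(r_home, r_away))


# Hypothetical example: a 1900-rated home team beats an 1800-rated away team.
e_home = expected_home_score(1900.0, 1800.0)
new_rating = update_home_rating(1900.0, 1800.0, s_home=1.0, k=20.0)
print(round(e_home, 3), round(new_rating, 1))
```

Because the home team was already the favourite, the win moves its rating up by less than k/2; an upset would move the ratings further.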

SLIDE 7

The ELO rating system 

A Result-based rating system

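Problem:

Not all games are handled with the same seriousness
Most games are played against weak opponents

The update factor k therefore depends on the game:

$$k = k_0 \, w_i \, (1 + \delta)^{\gamma}$$

$k_0$: recentness factor
$w_i$: competitiveness factor
$\delta$: margin of victory
$\gamma$: margin of victory weight

$$R'_H = R_H + k(S_H - E_H)$$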
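As a small illustration of the game-dependent update factor, the sketch below evaluates k = k0 · wi · (1 + δ)^γ. It assumes that δ is the absolute goal difference and that wi is a weight per match type; the numeric values are hypothetical and are not the tuned parameters from the talk.

```python
def k_factor(k0: float, w: float, goal_diff: int, gamma: float) -> float:
    """Update factor k = k0 * w * (1 + delta) ** gamma.

    k0        -- recentness factor
    w         -- competitiveness factor of the match type (assumed weight)
    goal_diff -- goal difference; its absolute value is used as delta (assumption)
    gamma     -- margin of victory weight
    """
    delta = abs(goal_diff)
    return k0 * w * (1.0 + delta) ** gamma


# Hypothetical values: a competitive game (w = 2) won by three goals.
k_big_win = k_factor(k0=10.0, w=2.0, goal_diff=3, gamma=0.5)
# The same game type ending in a draw (delta = 0).
k_draw = k_factor(k0=10.0, w=2.0, goal_diff=0, gamma=0.5)
print(k_big_win, k_draw)
```

With γ = 0 the margin of victory has no effect, and with w < 1 a friendly game would move the ratings less than a competitive one.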

SLIDE 8

Offense-Defense ratings 

A Goal-based rating system

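Given:

$$A_{ij} = \begin{cases} \text{score team } j \text{ generated against team } i \\ 0 & \text{otherwise} \end{cases}$$

Then:

Offensive rating of team $j$:

$$o_j = \sum_{i=1}^{n} \frac{A_{ij}}{d_i}$$

Defensive rating of team $i$:

$$d_i = \sum_{j=1}^{n} \frac{A_{ij}}{o_j}$$

where $n$ is the number of teams.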
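The offensive and defensive ratings are defined in terms of each other, so in practice they are computed by alternating the two sums until they stop changing. The sketch below does this in plain Python for a hypothetical three-team score matrix, where `A[i][j]` is the score team j generated against team i. It assumes every team has scored and conceded at least once, otherwise a division by zero would occur; the matrix and the iteration count are illustrative choices, not data from the talk.

```python
def odm_ratings(A, iterations=100):
    """Alternate the offence and defence sums of the Offense-Defense model.

    A[i][j] is the score team j generated against team i (0 if none).
    Returns (o, d): offensive and defensive ratings per team.
    """
    n = len(A)
    d = [1.0] * n  # initial defensive ratings
    o = [1.0] * n
    for _ in range(iterations):
        # o_j = sum_i A_ij / d_i
        o = [sum(A[i][j] / d[i] for i in range(n)) for j in range(n)]
        # d_i = sum_j A_ij / o_j
        d = [sum(A[i][j] / o[j] for j in range(n)) for i in range(n)]
    return o, d


# Hypothetical scores: team 0 scored 3 against team 1 and 4 against team 2, etc.
A = [
    [0, 1, 1],  # goals conceded by team 0
    [3, 0, 1],  # goals conceded by team 1
    [4, 2, 0],  # goals conceded by team 2
]
o, d = odm_ratings(A)
print([round(x, 3) for x in o], [round(x, 3) for x in d])
```

A higher offensive rating indicates a stronger attack, while a lower defensive rating indicates a stronger defence, since it reflects fewer goals conceded relative to the opponents' attacking strength.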

SLIDE 9

Offense-Defense ratings 

A Goal-based rating system

Problem:

Large disparities between the number of games played and the strength of the opponents

Teams in different confederations rarely play each other

Solution:

Update ratings sequentially. For each team:

Pre-game ratings = weighted sum of the team's post-game ratings
Post-game ratings = ODM procedure with the pre-game ratings as initial ratings

SLIDE 10

Match outcome prediction 

Via team rating systems

Two prediction models were explored:

Ordered logit regression (result-based)
Bivariate Poisson regression (goal-based)

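[Figure: a predictor takes the ratings of both teams (Elo, att, def) as input and outputs outcome probabilities, e.g. [0.43, 0.33, 0.24] for "Belgium wins", "It's a tie" and "England wins". Open question: home advantage?]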
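To make the result-based predictor concrete, here is a minimal sketch of how an ordered logit model turns a rating difference into win, draw and loss probabilities. The coefficient `beta`, the two cutpoints and the optional `home_adv` term are hypothetical parameters introduced for illustration; the fitted values and exact feature set used in the talk are not given on the slide.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(rating_diff, beta, c_low, c_high, home_adv=0.0):
    """Return (P(home win), P(draw), P(away win)) from an ordered logit model.

    The latent strength is x = beta * rating_diff + home_adv, and the
    cutpoints c_low < c_high separate away win / draw / home win.
    """
    x = beta * rating_diff + home_adv
    p_away = sigmoid(c_low - x)
    p_draw = sigmoid(c_high - x) - sigmoid(c_low - x)
    p_home = 1.0 - sigmoid(c_high - x)
    return p_home, p_draw, p_away


# Hypothetical parameters and a 100-point rating advantage for the home team.
probs = ordered_logit_probs(rating_diff=100.0, beta=0.005, c_low=-0.5, c_high=0.5)
even = ordered_logit_probs(rating_diff=0.0, beta=0.005, c_low=-0.5, c_high=0.5)
print([round(p, 3) for p in probs])
```

Because the three outcomes share one latent scale, the model respects their ordering: raising the rating difference shifts probability from "away win" through "draw" towards "home win".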

SLIDE 11

Tuning the predictive power

$$\mathrm{RPS} = \frac{1}{r-1} \sum_{k=1}^{r-1} \left( \sum_{l=1}^{k} (\hat{p}_l - y_l) \right)^2$$

How accurate are our predictions? 3 possible interpretations:

1. How many games are predicted correctly?

→ Accuracy

2. How certain was the model about the true outcome?

→ Logarithmic loss

3. How certain was the model about the true ordered outcome?

→ Ranked Probability Score (RPS)
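The three measures can be compared on a single prediction. The sketch below scores the example probabilities [0.43, 0.33, 0.24] (home win, draw, away win) from the earlier slide against each possible outcome; the function names are illustrative and the outcome is encoded as an index into the ordered list.

```python
import math


def is_correct(probs, outcome):
    """Accuracy for one game: was the most likely outcome the true one?"""
    predicted = max(range(len(probs)), key=lambda i: probs[i])
    return predicted == outcome


def log_loss(probs, outcome):
    """Negative log of the probability assigned to the true outcome."""
    return -math.log(probs[outcome])


def rps(probs, outcome):
    """Ranked Probability Score over r ordered outcomes (lower is better)."""
    r = len(probs)
    cumulative_diff = 0.0
    total = 0.0
    for k in range(r - 1):
        observed = 1.0 if k == outcome else 0.0
        cumulative_diff += probs[k] - observed
        total += cumulative_diff ** 2
    return total / (r - 1)


probs = [0.43, 0.33, 0.24]  # home win, draw, away win
for outcome, name in enumerate(["home win", "draw", "away win"]):
    print(name, is_correct(probs, outcome),
          round(log_loss(probs, outcome), 4), round(rps(probs, outcome), 5))
```

Unlike the logarithmic loss, the RPS takes the ordering into account: with these probabilities an away win is penalised more heavily than a draw, because it lies further from the predicted home win.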

SLIDE 12

Tuning the predictive power

[Diagram: the dataset is split into a training set, a validation set and a test set; the best model is applied to the test set.]

Minimise RPS with the L-BFGS algorithm:

Until convergence:
    For each game ∈ Training set:
        update_rating(game)
        If game ∈ Validation set:
            make_prediction(game)
        End if
    End for
    Compute average RPS
    Update rating and prediction model parameters
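The tuning loop can be sketched end to end on toy data. To stay dependency-free, this sketch swaps the L-BFGS optimiser for a plain grid search over a single parameter, the Elo factor k; everything else about the setup (the game list, the start rating of 1500, the fixed ordered logit parameters) is hypothetical. It also assumes that each prediction is made from the ratings as they stood before the game, which is one reading of the order of `update_rating` and `make_prediction` on the slide.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(r_home, r_away, beta=0.005, cut=0.5):
    """Hypothetical ordered logit: (P(home win), P(draw), P(away win))."""
    x = beta * (r_home - r_away)
    p_away = sigmoid(-cut - x)
    p_home = 1.0 - sigmoid(cut - x)
    return p_home, 1.0 - p_home - p_away, p_away


def rps(probs, outcome):
    cum, total = 0.0, 0.0
    for i in range(len(probs) - 1):
        cum += probs[i] - (1.0 if i == outcome else 0.0)
        total += cum ** 2
    return total / (len(probs) - 1)


def average_rps(k, games):
    """Replay all games in order with Elo factor k; score validation games."""
    ratings, scores = {}, []
    for home, away, outcome, in_validation in games:
        r_h = ratings.get(home, 1500.0)
        r_a = ratings.get(away, 1500.0)
        if in_validation:
            # Predict from pre-game ratings (assumption, avoids leakage).
            scores.append(rps(predict(r_h, r_a), outcome))
        expected = 1.0 / (1.0 + 10.0 ** (-(r_h - r_a) / 400.0))
        actual = (1.0, 0.5, 0.0)[outcome]  # outcome 0 = home win, 1 = draw, 2 = away win
        ratings[home] = r_h + k * (actual - expected)
        ratings[away] = r_a - k * (actual - expected)
    return sum(scores) / len(scores)


# Toy games: (home, away, outcome index, is the game in the validation set?)
games = [
    ("A", "B", 0, False), ("A", "C", 0, False), ("B", "C", 0, False),
    ("A", "B", 0, True), ("B", "C", 0, True), ("A", "C", 0, True),
]
candidates = [0.0, 10.0, 20.0, 40.0]
best_k = min(candidates, key=lambda k: average_rps(k, games))
print(best_k, round(average_rps(best_k, games), 4))
```

In a fuller setup the grid search would be replaced by a gradient-based optimiser over all rating and prediction parameters at once, with the test set held back until the best configuration is chosen.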

SLIDE 13

Challenge I: Match outcome prediction

The models were validated on the 2002, 2006, 2010 and 2014 World Cups.

[Chart: Accuracy, LogLoss and RPS per World Cup (2002, 2006, 2010, 2014, all) for ELO ordered logit, ELO bivariate Poisson, Random forest, Bookmakers, ELO+ODM ordered logit, ELO+ODM bivariate Poisson, ODM ordered logit and ODM bivariate Poisson.]

SLIDE 14

Challenge I: Match outcome prediction

The models were also compared with the 2017 Soccer Prediction Challenge submissions.

[Chart: Accuracy and RPS for Bookmakers, ELO ordered logit, ELO+ODM ordered logit, Berrar et al., Hubáček et al., Constantinou and Tsokos et al.]

SLIDE 15

Challenge II: Tournament elimination

How accurately can we predict the round of elimination of each team in previous World Cups?

[Chart: Accuracy, LogLoss and RPS for Elo and Elo+ODM on the 2002, 2006, 2010 and 2014 World Cups, with FiveThirtyEight shown for 2014.]

SLIDE 16

Our predictions

SLIDE 17

Others' predictions

Metrics: Accuracy, LogLoss, RPS
Models: FiveThirtyEight, Zeileis et al., Groll et al., Our model, UBS

0,5 0,563 0,594 0,563 0,531 0,201 0,224 0,186 0,185 0,182 0,192 0,132 0,126 0,127 0,124

Tournament elimination

SLIDE 18

Online interactive

https://dtai.cs.kuleuven.be/sports/worldcup18/

SLIDE 19

Thanks!

Any questions?

Interactive at: https://dtai.cs.kuleuven.be/sports/worldcup18/