A Comparison of Covariate-based Prediction Methods for FIFA World Cups
A. Groll, Faculty of Statistics, TU Dortmund University
(joint work with J. Abedieh, C. Ley, A. Mayr, T. Kneib, G. Schauberger, G. Tutz & H. Van Eetvelde)
Zurich R User Group Meetup, October 25th 2018, University of Zurich
A. Groll (TU Dortmund) Predicting International Soccer Tournaments 1 / 38

Who will celebrate?
Sources: youtube.com, EMAJ Magazine, youfrisky.com, Bailiwick Express

Who will cry?
Sources: youtube.com, pinterest, BBC, Daily Mail

Theoretical Background

Part I: Regression-based Methods

Model for international soccer tournaments

$$y_{ijk} \mid x_{ik}, x_{jk} \sim \mathrm{Pois}(\lambda_{ijk}), \qquad i, j \in \{1, \ldots, n\},\ i \neq j$$

$$\lambda_{ijk} = \exp\left(\beta_0 + (x_{ik} - x_{jk})^\top \beta\right)$$

● $n$: Number of teams
● $y_{ijk}$: Number of goals scored by team $i$ against opponent $j$ at tournament $k$
● $x_{ik}, x_{jk}$: Covariate vectors of team $i$ and opponent $j$, varying over tournaments
● $\beta$: Parameter vector of covariate effects
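The rate formula above can be illustrated with a minimal Python sketch (the talk itself works in R). All numbers, and the names `expected_goals` and `poisson_pmf`, are hypothetical and chosen only for illustration; they are not estimates from the talk.

```python
import math


def expected_goals(beta0, beta, x_team, x_opponent):
    """Poisson rate lambda_ijk = exp(beta0 + (x_ik - x_jk)^T beta)."""
    linear_predictor = beta0 + sum(
        b * (xi - xj) for b, xi, xj in zip(beta, x_team, x_opponent)
    )
    return math.exp(linear_predictor)


def poisson_pmf(y, lam):
    """Probability of observing exactly y goals under Pois(lam)."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


# Hypothetical intercept, covariate effects and covariate vectors.
beta0 = 0.1
beta = [0.02, -0.01]
x_i = [10.0, 5.0]   # covariates of team i at tournament k
x_j = [4.0, 20.0]   # covariates of opponent j at tournament k

lam_ij = expected_goals(beta0, beta, x_i, x_j)  # goals of i against j
lam_ji = expected_goals(beta0, beta, x_j, x_i)  # goals of j against i

# Probability that team i scores 0, 1, 2 or 3 goals.
goal_probabilities = [poisson_pmf(y, lam_ij) for y in range(4)]
```

Because the covariates enter only through the difference $x_{ik} - x_{jk}$, swapping the two teams flips the sign of the covariate part of the linear predictor, so the product of the two rates always equals $\exp(2\beta_0)$.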

Regularized estimation

Maximize penalized log-likelihood

$$l_p(\beta_0, \beta) = l(\beta_0, \beta) - \lambda J(\beta) = l(\beta_0, \beta) - \lambda \sum_{i=1}^{p} |\beta_i|,$$

with lasso penalty term (Tibshirani, 1996):

$$J(\beta) = \sum_{i=1}^{p} |\beta_i|.$$

The model can be estimated with the R package glmnet (Friedman et al., 2010).

Versions used for: EURO 2012 (Groll and Abedieh, 2013); World Cup 2014 (Groll et al., 2015); EURO 2016 (Groll et al., 2018)
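As a rough illustration of the objective on this slide, the following Python sketch evaluates a lasso-penalized Poisson log-likelihood. It only evaluates the objective and does not reproduce the fitting algorithm of glmnet. The data and the function names are hypothetical. Note that $\lambda$ here is the tuning parameter of the penalty, not a Poisson rate, and that the intercept $\beta_0$ is not penalized.

```python
import math


def poisson_loglik(beta0, beta, rows, goals):
    """Poisson log-likelihood l(beta0, beta) with log link.

    rows  : list of covariate vectors (e.g. differences x_ik - x_jk)
    goals : list of observed goal counts, one per row
    """
    total = 0.0
    for x, y in zip(rows, goals):
        eta = beta0 + sum(b * v for b, v in zip(beta, x))
        # log Pois(y | exp(eta)) = y * eta - exp(eta) - log(y!)
        total += y * eta - math.exp(eta) - math.lgamma(y + 1)
    return total


def lasso_penalty(beta):
    """J(beta) = sum_i |beta_i|; the intercept is not part of beta."""
    return sum(abs(b) for b in beta)


def penalized_loglik(beta0, beta, rows, goals, tuning):
    """l_p(beta0, beta) = l(beta0, beta) - tuning * J(beta)."""
    return poisson_loglik(beta0, beta, rows, goals) - tuning * lasso_penalty(beta)


# Hypothetical toy data: two observations, two covariates.
rows = [[1.0, -2.0], [0.5, 0.0]]
goals = [2, 0]
value = penalized_loglik(0.1, [0.3, -0.2], rows, goals, tuning=2.0)
```

Larger values of the tuning parameter make non-zero coefficients more expensive, which is what shrinks some covariate effects exactly to zero when the objective is maximized.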

Part II: Ranking Methods

Independent Poisson ranking model

$$Y_{ijm} \sim \mathrm{Pois}(\lambda_{ijm}),$$

$$\lambda_{ijm} = \exp\left(\beta_0 + (r_i - r_j) + h \cdot I(\text{team } i \text{ playing at home})\right)$$

● $n$: Number of teams
● $M$: Number of matches
● $y_{ijm}$: Number of goals scored by team $i$ against opponent $j$ in match $m$
● $r_i, r_j$: strength / ability parameters of team $i$ and team $j$
● $h$: home effect; added if team $i$ plays at home

Independent Poisson ranking model

Likelihood function:

$$L = \prod_{m=1}^{M} \left( \frac{\lambda_{ijm}^{y_{ijm}}}{y_{ijm}!} \exp(-\lambda_{ijm}) \cdot \frac{\lambda_{jim}^{y_{jim}}}{y_{jim}!} \exp(-\lambda_{jim}) \right)^{w_{\text{type},m} \cdot w_{\text{time},m}},$$

with weights

$$w_{\text{time},m}(t_m) = \left(\frac{1}{2}\right)^{t_m / \text{Half period}} \quad \text{and} \quad w_{\text{type},m} \in \{1, 2, 3, 4\} \text{ (depending on type of match)}.$$

Different extensions, for example, bivariate Poisson models.

Ley et al. (2018) show that bivariate Poisson with Half Period of 3 years is best for prediction.
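The weighting scheme can be sketched in Python as follows. This is a minimal illustration of the weighted log-likelihood, not the authors' implementation. The match records, the dictionary keys and the strength values are hypothetical, and the sketch assumes that the home effect enters $\lambda_{jim}$ symmetrically (added when team $j$ plays at home).

```python
import math


def time_weight(years_ago, half_period):
    """w_time = (1/2)^(t / half_period): a match half_period years old counts half."""
    return 0.5 ** (years_ago / half_period)


def poisson_logpmf(y, lam):
    """log of lam^y / y! * exp(-lam)."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def weighted_loglik(matches, strengths, beta0, home_effect, half_period):
    """Logarithm of the weighted likelihood L of the independent Poisson model."""
    total = 0.0
    for m in matches:
        diff = strengths[m["team_i"]] - strengths[m["team_j"]]
        lam_ij = math.exp(beta0 + diff + home_effect * m["i_at_home"])
        # Assumption: the home effect applies to team j in the same way.
        lam_ji = math.exp(beta0 - diff + home_effect * m["j_at_home"])
        weight = m["w_type"] * time_weight(m["years_ago"], half_period)
        total += weight * (
            poisson_logpmf(m["goals_i"], lam_ij)
            + poisson_logpmf(m["goals_j"], lam_ji)
        )
    return total


# Hypothetical strengths and a single hypothetical match record.
strengths = {"A": 0.4, "B": -0.1}
matches = [
    {"team_i": "A", "team_j": "B", "goals_i": 2, "goals_j": 1,
     "i_at_home": 1, "j_at_home": 0, "w_type": 3, "years_ago": 1.5},
]
loglik = weighted_loglik(matches, strengths, beta0=0.2, home_effect=0.3, half_period=3.0)
```

With a half period of 3 years, a match played 3 years ago receives time weight 0.5 and a match played 6 years ago receives 0.25, so recent results dominate the estimated strengths.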

Part III: Random Forests

Random Forests

● introduced by Breiman (2001)
● principle: aggregation of a (large) number of classification / regression trees
  ⇒ can be used both for classification & regression purposes
● final predictions: single tree predictions are aggregated, either by majority vote (classification) or by averaging (regression)
● feature space is partitioned recursively; each partition has its own prediction
● find the split with the strongest difference between the two new partitions w.r.t. some criterion
● observations within the same partition should be as similar as possible, observations from different partitions very different (w.r.t. the response variable)
● a single tree is usually pruned (lowers variance but increases bias)
● visualized in a dendrogram
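The split search and the two aggregation rules listed above can be sketched in a few lines of Python. The slide leaves the split criterion open; this sketch assumes the within-partition sum of squares, which is one common choice for regression trees and not necessarily the criterion used in the talk. All function names and data are hypothetical.

```python
from collections import Counter


def sum_of_squares(values):
    """Sum of squared deviations from the mean (0 for an empty partition)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Threshold on one feature that minimises the within-partition sum of squares.

    Observations with x <= threshold go left, all others go right.
    """
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        score = sum_of_squares(left) + sum_of_squares(right)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


def aggregate_regression(tree_predictions):
    """Forest prediction for regression: average of the single tree predictions."""
    return sum(tree_predictions) / len(tree_predictions)


def aggregate_classification(tree_votes):
    """Forest prediction for classification: majority vote over the trees."""
    return Counter(tree_votes).most_common(1)[0][0]


# Hypothetical feature values and goal counts.
x = [1, 2, 3, 10, 11, 12]
y = [0, 0, 1, 3, 4, 3]
threshold = best_split(x, y)
```

Applying `best_split` recursively to each new partition yields the recursive partitioning of the feature space described in the bullets.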

Dendrogram of regression tree

[Figure: regression tree. Node 1 splits on Rank (p < 0.001) at −15; node 3 splits on Oddset (p = 0.003) at −0.003; terminal nodes 2 (n = 139), 4 (n = 213) and 5 (n = 160) each show the number of goals on an axis from 0 to 8.]

Exemplary regression tree for FIFA World Cup 2002–2014 data using the function ctree from the R package party (Hothorn et al., 2006). Response: Number of goals; predictors: only FIFA Rank and Oddset are used.

Random Forests

● repeatedly grow different regression trees
● main goal: decrease variance
  ⇒ decrease correlation between single trees.
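Why lower correlation helps can be seen from the standard variance formula for an average of identically distributed predictions: if each of $B$ trees has variance $\sigma^2$ and any two trees have correlation $\rho$, the average has variance $\rho\sigma^2 + (1 - \rho)\sigma^2 / B$. The Python sketch below simply evaluates this formula; the function name and the numbers are hypothetical.

```python
def ensemble_variance(tree_variance, correlation, n_trees):
    """Variance of the average of n_trees equally correlated tree predictions.

    Var = rho * sigma^2 + (1 - rho) * sigma^2 / B
    """
    return (
        correlation * tree_variance
        + (1.0 - correlation) * tree_variance / n_trees
    )


# Hypothetical setting: 100 trees, each with variance 1.
strongly_correlated = ensemble_variance(1.0, 0.5, 100)
weakly_correlated = ensemble_variance(1.0, 0.1, 100)
```

The first term does not shrink as more trees are added, so growing more trees alone cannot push the variance below $\rho\sigma^2$; only decorrelating the trees reduces that floor.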
