On the Dependency of Soccer Scores - A Sparse Bivariate Poisson

On the Dependency of Soccer Scores - A Sparse Bivariate Poisson Model for the UEFA EURO 2016 A. Groll & A. Mayr & T. Kneib & G. Schauberger Department of Statistics, Georg-August-University Göttingen MathSport International


  1. On the Dependency of Soccer Scores - A Sparse Bivariate Poisson Model for the UEFA EURO 2016 A. Groll ∗ & A. Mayr & T. Kneib & G. Schauberger ∗ Department of Statistics, Georg-August-University Göttingen MathSport International 2017 Conference, Padua, June 28th Groll et al. (MathSport 2017) A Sparse Model for the EURO 2016 1 / 22

  2. Who will celebrate?

  3. Who will cry?

  4. Theoretical Background

  6. Aims
     The main aims are to
     ● find an explicit model for the exact numbers of goals
     ● include covariates
     ● adjust for possible correlations between the numbers of goals of both competing teams.
     ⇒ Different approaches for
     ● EURO 2012 (Groll and Abedieh, 2013)
     ● World Cup 2014 (Groll, Schauberger and Tutz, 2015)
     ● EURO 2016

  7. Univariate Model for International Soccer Tournaments
     $y_{ijk} \mid \boldsymbol{x}_{ik}, \boldsymbol{x}_{jk} \sim Po(\lambda_{ijk}), \quad i, j \in \{1, \ldots, n\},\ i \neq j$
     $\log(\lambda_{ijk}) = \beta_0 + \xi_{ik} - \delta_{jk}$
     $n$: number of teams
     $y_{ijk}$: number of goals scored by team $i$ against opponent $j$ at tournament $k$
     $\boldsymbol{x}_{ik}, \boldsymbol{x}_{jk}$: covariate vectors of team $i$ and opponent $j$, varying over tournaments
     e.g. EURO 2012 (Groll and Abedieh, 2013):
     $\xi_{ik} = \boldsymbol{x}_{ik}^T \boldsymbol{\beta}_{\xi} + b_i$
     $\delta_{jk} = \boldsymbol{x}_{jk}^T \boldsymbol{\beta}_{\delta} + b_j$

  9. Univariate Model for International Soccer Tournaments
     $y_{ijk} \mid \boldsymbol{x}_{ik}, \boldsymbol{x}_{jk} \sim Po(\lambda_{ijk}), \quad i, j \in \{1, \ldots, n\},\ i \neq j$
     $\log(\lambda_{ijk}) = \beta_0 + \xi_{ik} - \delta_{jk}$
     $n$: number of teams
     $y_{ijk}$: number of goals scored by team $i$ against opponent $j$ at tournament $k$
     $\boldsymbol{x}_{ik}, \boldsymbol{x}_{jk}$: covariate vectors of team $i$ and opponent $j$, varying over tournaments
     e.g. World Cup 2014 (Groll, Schauberger and Tutz, 2015):
     $\xi_{ik} = \boldsymbol{x}_{ik}^T \boldsymbol{\beta} + att_i$
     $\delta_{jk} = \boldsymbol{x}_{jk}^T \boldsymbol{\beta} + def_j$
     ⇒ $\log(\lambda_{ijk}) = \beta_0 + (\boldsymbol{x}_{ik} - \boldsymbol{x}_{jk})^T \boldsymbol{\beta} + att_i - def_j$
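
The World Cup 2014 form of the linear predictor can be sketched in a few lines of Python. This is only an illustration of the formula on the slide, not the authors' code: the function names `goal_rate` and `poisson_pmf`, the coefficient values, and the covariate vectors are all hypothetical.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for Y ~ Po(lam)."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def goal_rate(beta0, beta, x_team, x_opp, att_i=0.0, def_j=0.0):
    """lambda_ijk = exp(beta0 + (x_ik - x_jk)^T beta + att_i - def_j)."""
    diff = [a - b for a, b in zip(x_team, x_opp)]
    eta = beta0 + sum(b * d for b, d in zip(beta, diff)) + att_i - def_j
    return math.exp(eta)

# Hypothetical coefficients and covariates, for illustration only.
beta0, beta = 0.2, [0.5, -0.1]
x_i, x_j = [1.0, 0.3], [0.4, 0.8]

lam_ij = goal_rate(beta0, beta, x_i, x_j)  # scoring rate of team i against j
lam_ji = goal_rate(beta0, beta, x_j, x_i)  # scoring rate of team j against i

# Probabilities that team i scores 0, 1, ..., 5 goals.
probs_i = [poisson_pmf(y, lam_ij) for y in range(6)]
```

Because the covariates enter only through the difference of the two teams' vectors, swapping the teams flips the sign of that term. With the team-specific effects set to zero, the product of the two rates is therefore exp(2 β0).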

  10. Correlation between Scores of Both Teams
     Dixon and Coles (1997) compared the marginal distributions of scores with the joint distribution ⇒ correlation!
     Source: Dixon and Coles (1997)
     ⇒ Introduction of an additional dependence parameter
     But: They did not compare conditional distributions!
     ⇒ Their linear predictors are not independent!

  13. The Bivariate Poisson Distribution
     $X_k \overset{ind.}{\sim} Po(\lambda_k), \quad k = 1, 2, 3, \quad \lambda_k > 0$
     ⇒ $Y_1 = X_1 + X_3$ and $Y_2 = X_2 + X_3$ follow a joint bivariate Poisson distribution
     $(Y_1, Y_2) \sim Po_2(\lambda_1, \lambda_2, \lambda_3)$
     Probability function:
     $$P_{Y_1, Y_2}(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2) = \exp(-(\lambda_1 + \lambda_2 + \lambda_3)) \, \frac{\lambda_1^{y_1}}{y_1!} \, \frac{\lambda_2^{y_2}}{y_2!} \sum_{k=0}^{\min(y_1, y_2)} \binom{y_1}{k} \binom{y_2}{k} \, k! \left( \frac{\lambda_3}{\lambda_1 \lambda_2} \right)^k$$
     ● $E(Y_1) = \lambda_1 + \lambda_3$
     ● $E(Y_2) = \lambda_2 + \lambda_3$
     ● $cov(Y_1, Y_2) = \lambda_3$
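
The probability function and the three moment identities can be checked numerically. The sketch below is an illustration only: the function names and the truncation at 40 goals are assumptions made for this example, and it needs Python 3.8+ for `math.comb`.

```python
import math

def bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3):
    """P(Y1 = y1, Y2 = y2) for (Y1, Y2) ~ Po2(lam1, lam2, lam3)."""
    base = (math.exp(-(lam1 + lam2 + lam3))
            * lam1**y1 / math.factorial(y1)
            * lam2**y2 / math.factorial(y2))
    ratio = lam3 / (lam1 * lam2)
    series = sum(math.comb(y1, k) * math.comb(y2, k) * math.factorial(k) * ratio**k
                 for k in range(min(y1, y2) + 1))
    return base * series

def moments(lam1, lam2, lam3, max_goals=40):
    """Total mass, E(Y1), E(Y2) and cov(Y1, Y2) on a truncated grid."""
    total = m1 = m2 = m12 = 0.0
    for y1 in range(max_goals + 1):
        for y2 in range(max_goals + 1):
            p = bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3)
            total += p
            m1 += y1 * p
            m2 += y2 * p
            m12 += y1 * y2 * p
    return total, m1, m2, m12 - m1 * m2

# With lam1 = lam2 = lam3 = 1 the slide's formulas give E(Y1) = E(Y2) = 2, cov = 1.
total, mean1, mean2, cov = moments(1.0, 1.0, 1.0)
```

For λ3 = 0 only the k = 0 term of the sum survives, so the joint probability reduces to the product of two independent Poisson probabilities.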

  14. The Bivariate Poisson Distribution
     [Figure: two plots of the bivariate Poisson probability function. First panel: λ1 = 2, λ2 = 2, λ3 = 0. Second panel: λ1 = 1, λ2 = 1, λ3 = 1.]

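  16. Re-parametrization of bivariate Poisson distribution
     Replace $\lambda_1 = \gamma_1 \gamma_2$ and $\lambda_2 = \gamma_1 / \gamma_2$:
     $$P_{Y_1, Y_2}(y_1, y_2) = P(Y_1 = y_1, Y_2 = y_2) = \exp\left(-\left(\gamma_1 (\gamma_2 + \gamma_2^{-1}) + \lambda_3\right)\right) \frac{(\gamma_1 \gamma_2)^{y_1}}{y_1!} \, \frac{(\gamma_1 / \gamma_2)^{y_2}}{y_2!} \sum_{k=0}^{\min(y_1, y_2)} \binom{y_1}{k} \binom{y_2}{k} \, k! \left( \frac{\lambda_3}{\gamma_1^2} \right)^k$$
     $\gamma_1 = \exp(\beta_0)$
     $\gamma_2 = \exp(\tilde{\boldsymbol{x}}^T \boldsymbol{\beta})$
     ⇒ $\lambda_1 = \exp(\beta_0 + \tilde{\boldsymbol{x}}^T \boldsymbol{\beta})$
     ⇒ $\lambda_2 = \exp(\beta_0 - \tilde{\boldsymbol{x}}^T \boldsymbol{\beta})$
     $\lambda_3 = \exp(\alpha_0 + |\tilde{\boldsymbol{x}}|^T \boldsymbol{\alpha})$
     with $\tilde{\boldsymbol{x}} = \boldsymbol{x}_1 - \boldsymbol{x}_2$.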
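
The mapping from coefficients to the three rates can be written out as follows. This is a minimal sketch of the formulas on the slide; the function name `rates_from_predictors` and all numeric values are hypothetical.

```python
import math

def rates_from_predictors(beta0, beta, alpha0, alpha, x1, x2):
    """Return (lam1, lam2, lam3) from the re-parametrized predictors."""
    x_tilde = [a - b for a, b in zip(x1, x2)]            # x~ = x1 - x2
    gamma1 = math.exp(beta0)
    gamma2 = math.exp(sum(b * d for b, d in zip(beta, x_tilde)))
    lam1 = gamma1 * gamma2                               # exp(beta0 + x~^T beta)
    lam2 = gamma1 / gamma2                               # exp(beta0 - x~^T beta)
    lam3 = math.exp(alpha0 + sum(a * abs(d) for a, d in zip(alpha, x_tilde)))
    return lam1, lam2, lam3

# Hypothetical coefficients and covariates, for illustration only.
beta0, beta = 0.1, [0.4, -0.2]
alpha0, alpha = -1.0, [-0.3, 0.1]
x_a, x_b = [1.2, 0.5], [0.7, 0.9]

rates_ab = rates_from_predictors(beta0, beta, alpha0, alpha, x_a, x_b)
rates_ba = rates_from_predictors(beta0, beta, alpha0, alpha, x_b, x_a)
```

Swapping the two teams exchanges λ1 and λ2, while λ3 is unchanged because it depends only on the absolute covariate differences. The product λ1 λ2 always equals γ1², which is why the sum in the probability function uses λ3 / γ1².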

  18. Bivariate Poisson Model for Football Results
     $(y_{ik}, y_{jk}) \mid \boldsymbol{x}_{ik}, \boldsymbol{x}_{jk} \sim Po_2(\gamma_1, \gamma_{ijk2}, \lambda_{ijk3})$
     ● $\gamma_1 = \exp(\beta_0)$
     ● $\gamma_{ijk2} = \exp((\boldsymbol{x}_{ik} - \boldsymbol{x}_{jk})^T \boldsymbol{\beta})$
     ● $\lambda_{ijk3} = \exp(\alpha_0 + |\boldsymbol{x}_{ik} - \boldsymbol{x}_{jk}|^T \boldsymbol{\alpha})$
     ⇒ Framework of the so-called Generalized Additive Model for Location, Scale and Shape (GAMLSS; Rigby and Stasinopoulos, 2005)
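
Once the three rates for a match are available, the joint distribution yields probabilities for every exact score and, by summation, for win, draw and loss. The sketch below is one possible illustration rather than the authors' procedure: the function names and the truncation at 25 goals are assumptions, and the two parameter settings are the ones from the figure slide above.

```python
import math

def bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3):
    """P(Y1 = y1, Y2 = y2) for (Y1, Y2) ~ Po2(lam1, lam2, lam3)."""
    base = (math.exp(-(lam1 + lam2 + lam3))
            * lam1**y1 / math.factorial(y1)
            * lam2**y2 / math.factorial(y2))
    ratio = lam3 / (lam1 * lam2)
    series = sum(math.comb(y1, k) * math.comb(y2, k) * math.factorial(k) * ratio**k
                 for k in range(min(y1, y2) + 1))
    return base * series

def outcome_probabilities(lam1, lam2, lam3, max_goals=25):
    """(P(win), P(draw), P(loss)) for the first team, on a truncated score grid."""
    win = draw = loss = 0.0
    for y1 in range(max_goals + 1):
        for y2 in range(max_goals + 1):
            p = bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3)
            if y1 > y2:
                win += p
            elif y1 == y2:
                draw += p
            else:
                loss += p
    return win, draw, loss

# Both settings have expected goals E(Y1) = E(Y2) = 2.
independent = outcome_probabilities(2.0, 2.0, 0.0)  # lam3 = 0: no covariance
dependent = outcome_probabilities(1.0, 1.0, 1.0)    # lam3 = 1: cov(Y1, Y2) = 1
```

Although both settings share the same expected goals, the dependent one assigns a higher probability to a draw. The reason is that the goal difference Y1 − Y2 = X1 − X2 does not involve the shared component X3, so moving mass from λ1 and λ2 into λ3 concentrates the difference around zero.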

  21. Boosting for GAMLSS
     ● R package gamboostLSS (Hofner, Mayr and Schmid, 2015)
     ● Allows for variable selection within the GAMLSS framework
     ● Provides a large number of pre-specified distributions
       – Negative binomial distribution
       – Zero-inflated Poisson distribution
       – ...
     ● Mostly restricted to univariate responses; first approach for the bivariate normal distribution by Andreas Mayr
     ● Users can specify new distributions (also bivariate) by providing
       – loss/risk function → neg. log-likelihood
       – neg. gradient of loss function → score function
       – possibly suitable offsets for linear predictors
     ⇒ We implemented the bivariate Poisson distribution
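
The two ingredients named above, a loss and its negative gradient, can be illustrated for the bivariate Poisson distribution. The following Python sketch is not the gamboostLSS interface and not the authors' implementation. As a simplifying assumption it puts a log link on each of λ1, λ2, λ3 directly instead of using the (γ1, γ2, λ3) parametrization, and all function names are hypothetical. The score uses the standard identities ∂P(y1, y2)/∂λ1 = P(y1 − 1, y2) − P(y1, y2), the analogous one for λ2, and ∂P(y1, y2)/∂λ3 = P(y1 − 1, y2 − 1) − P(y1, y2).

```python
import math

def pmf(y1, y2, lam1, lam2, lam3):
    """Bivariate Poisson probability; zero for negative counts."""
    if y1 < 0 or y2 < 0:
        return 0.0
    base = (math.exp(-(lam1 + lam2 + lam3))
            * lam1**y1 / math.factorial(y1)
            * lam2**y2 / math.factorial(y2))
    ratio = lam3 / (lam1 * lam2)
    series = sum(math.comb(y1, k) * math.comb(y2, k) * math.factorial(k) * ratio**k
                 for k in range(min(y1, y2) + 1))
    return base * series

def loss(y1, y2, eta1, eta2, eta3):
    """Negative log-likelihood of one match, with lam_k = exp(eta_k)."""
    lam1, lam2, lam3 = math.exp(eta1), math.exp(eta2), math.exp(eta3)
    return -math.log(pmf(y1, y2, lam1, lam2, lam3))

def negative_gradient(y1, y2, eta1, eta2, eta3):
    """Score: derivative of the log-likelihood w.r.t. eta1, eta2, eta3."""
    lam1, lam2, lam3 = math.exp(eta1), math.exp(eta2), math.exp(eta3)
    p = pmf(y1, y2, lam1, lam2, lam3)
    d1 = lam1 * (pmf(y1 - 1, y2, lam1, lam2, lam3) / p - 1.0)
    d2 = lam2 * (pmf(y1, y2 - 1, lam1, lam2, lam3) / p - 1.0)
    d3 = lam3 * (pmf(y1 - 1, y2 - 1, lam1, lam2, lam3) / p - 1.0)
    return d1, d2, d3

# Hypothetical predictor values for a match that ended 2:1.
score = negative_gradient(2, 1, 0.3, -0.2, -0.5)
```

A finite-difference comparison of `loss` against `negative_gradient` is a cheap way to catch sign or indexing mistakes before plugging such functions into a boosting routine.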

  22. Application to the UEFA European Championship 2016

  24. Covariates
     ● Economic Factors:
       – GDP per capita
       – population
     ● Sportive Factors:
       – Home advantage
       – ODDSET odds
       – market value
       – FIFA rank
       – UEFA points
