The Player Kernel
Lucas Maystre, Victor Kristof, Antonio González Ferrer, Matthias Grossglauser School of Computer and Communication Sciences, EPFL MLSA workshop @ ECML-PKDD — September 19th, 2016
Our entry to the EURO 2016 Prediction Competition, Challenge 1.
Task: probabilistic prediction of match outcomes.
Example: for Italy vs. Belgium, the predictor outputs the probability vector [0.37, 0.31, 0.32] over the outcomes "Italy wins", "Belgium wins" and "It's a draw".
Key challenges with national teams:
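M national teams. Team u has "strength" $s_u$.
$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v)]} = \frac{1}{1 + \exp(-s^\top x)}$
where $s = (s_1, \ldots, s_M)^\top$ and $x = (\ldots, 1, \ldots, -1, \ldots)^\top$ has entry $1$ for team u, $-1$ for team v, and $0$ elsewhere.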
Source: http://www.estadao.com.br/infograficos/onde-atuam-os-736-jogadores-da-copa-2010,esportes,280906
Many players play against each other in club competitions.
Can we transfer information from club matches to international matches?
Club matches and international matches share the same parameters. Embed teams in the space of players.
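$s_u = \sum_{i \in L_u} \tilde{s}_i$
$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v)]} = \frac{1}{1 + \exp(-\tilde{s}^\top z)}$
where $\tilde{s} = (\tilde{s}_1, \ldots, \tilde{s}_P)^\top$ and $z = (\ldots, 1, 1, 1, \ldots, -1, -1, -1, \ldots)^\top$ has entry $1$ for each player in $L_u$, $-1$ for each player in $L_v$, and $0$ elsewhere.
The number of parameters explodes.
Seems like it will lead to statistical and computational issues.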
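As a rough sketch of the player-space embedding, the snippet below builds the vector z for one match and checks that the inner product with the player strengths equals the difference of team strengths. It assumes NumPy; the helper `lineup_difference`, the toy line-ups of three players each, and all numeric values are hypothetical.

```python
import numpy as np

def lineup_difference(lineup_u, lineup_v, num_players):
    """Vector z with +1 for players of team u, -1 for players of team v."""
    z = np.zeros(num_players)
    z[list(lineup_u)] = 1.0
    z[list(lineup_v)] = -1.0
    return z

# Hypothetical strengths for P = 6 players (illustrative values only).
s_tilde = np.array([0.4, -0.1, 0.3, 0.2, 0.0, -0.3])
lineup_u, lineup_v = [0, 1, 2], [3, 4, 5]

s_u = s_tilde[lineup_u].sum()  # team strength = sum of player strengths
s_v = s_tilde[lineup_v].sum()
z = lineup_difference(lineup_u, lineup_v, len(s_tilde))

# Both expressions give the same "strength" difference.
print(s_u - s_v, s_tilde @ z)
```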
Keep a distribution over parameters instead of optimizing the likelihood:
posterior distribution ∝ likelihood × prior distribution (e.g. Gaussian)
Statistical issues solved? Not clear.
Computational issues solved? No.
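In the end, we are only interested in $p(\tilde{s}^\top z \mid \mathcal{D})$, the posterior over the "strength" difference.
Accurate estimation of all parameters is not necessary.
Inference can be done in the dual space:
$f(z) = \tilde{s}^\top z$
$p(f \mid \mathcal{D}) \propto p(\mathcal{D} \mid f) \times p(f)$
$\mathrm{Cov}[f(z), f(z')] = \sigma^2 z^\top z'$
$f(z) \sim \mathrm{GP}[0, k(z, z')]$ with $k(z, z') = \sigma^2 z^\top z'$: the player kernel!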
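To make the dual view concrete, here is a small sketch that computes the Gram matrix of the linear kernel $\sigma^2 z^\top z'$ for a handful of matches. It assumes NumPy; the function name `player_kernel`, the toy matches and the value of `sigma2` are hypothetical. The matrix has one row and column per match, so its size does not depend on the number of players.

```python
import numpy as np

def player_kernel(Z1, Z2, sigma2=1.0):
    """Gram matrix with entries sigma2 * z^T z' for rows z of Z1, z' of Z2."""
    return sigma2 * (Z1 @ Z2.T)

# Three toy matches over P = 6 players, two players per side.
# Each row is a vector z: +1 for one side, -1 for the other, 0 otherwise.
Z = np.array([
    [ 1,  1, -1, -1,  0,  0],   # players {0, 1} vs. {2, 3}
    [ 1,  0, -1,  0,  1, -1],   # players {0, 4} vs. {2, 5}
    [-1, -1,  1,  1,  0,  0],   # players {2, 3} vs. {0, 1} (sides swapped)
], dtype=float)

K = player_kernel(Z, Z, sigma2=0.5)
print(K)
```

Each entry counts players who appear on the same side in both matches, minus players who appear on opposite sides, scaled by `sigma2`. Matches with overlapping line-ups are therefore correlated under the prior, and swapping sides flips the sign.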
[Figure: a family of models related along three directions (Kernel, Bayesian, Classification): Linear Regression, Logistic Regression, Kernel Regression, Kernel Classification, Bayesian Linear Regression, Bayesian Logistic Regression, GP Regression, GP Classification.]
Credit: Zoubin Ghahramani
Collected data on 24 887 matches from main football competitions over the last 10 years. 33 157 distinct players appear in the dataset.
Logarithmic loss against competing approaches in 2008, 2012 and 2016.
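For reference, the logarithmic loss of a probabilistic prediction is the negative log of the probability assigned to the outcome that occurred, averaged over matches. The sketch below uses only the standard library; the function name `mean_log_loss` and the example predictions are hypothetical and are not results from the talk.

```python
import math

def mean_log_loss(predictions, outcomes):
    """Average of -log(probability assigned to the observed outcome)."""
    losses = [-math.log(p[o]) for p, o in zip(predictions, outcomes)]
    return sum(losses) / len(losses)

# Hypothetical predictions over three outcomes, with observed outcome indices.
predictions = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
outcomes = [0, 1]
print(mean_log_loss(predictions, outcomes))

# A uniform predictor over three outcomes scores log(3) on every match.
uniform = mean_log_loss([[1 / 3, 1 / 3, 1 / 3]], [2])
print(uniform)
```

Lower is better, and the uniform predictor gives a simple baseline to compare against.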
Rao and Kupper (1967) proposed the following extension.
$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v - \alpha)]}$
$P(u \equiv v) = 1 - P(u \succ v) - P(v \succ u) \propto P(u \succ v) \cdot P(v \succ u)$
A draw is (essentially) equivalent to one win and one loss.
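The three outcome probabilities of this extension can be checked numerically. The sketch below uses only the standard library; the function name `rao_kupper` and the parameter values are hypothetical. It also verifies that the draw probability equals $(e^{2\alpha} - 1) \cdot P(u \succ v) \cdot P(v \succ u)$, which follows from expanding the two logistic terms.

```python
import math

def rao_kupper(s_u, s_v, alpha):
    """Return (P(u wins), P(draw), P(v wins)) for a margin alpha >= 0."""
    p_win = 1.0 / (1.0 + math.exp(-(s_u - s_v - alpha)))
    p_loss = 1.0 / (1.0 + math.exp(-(s_v - s_u - alpha)))
    p_draw = 1.0 - p_win - p_loss
    return p_win, p_draw, p_loss

# Hypothetical strengths and draw margin (illustrative values only).
alpha = 0.6
p_win, p_draw, p_loss = rao_kupper(0.3, 0.1, alpha)
print(p_win, p_draw, p_loss)

# The draw probability is proportional to the product of win and loss.
print(p_draw, (math.exp(2 * alpha) - 1.0) * p_win * p_loss)
```

With `alpha = 0` the draw probability vanishes and the model reduces to the two-outcome form.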