The Player Kernel
Lucas Maystre, Victor Kristof, Antonio González Ferrer, Matthias Grossglauser
School of Computer and Communication Sciences, EPFL
MLSA workshop @ ECML-PKDD, September 19th, 2016

Context
Our entry to the EURO 2016 Prediction Competition, Challenge 1.
Task: probabilistic prediction of match outcomes.
Example: for the match Italy vs. Belgium, the predictor outputs the probabilities [0.37, 0.32, 0.31], one for each of the three outcomes "Italy wins", "Belgium wins" and "It's a draw".

Starting point
$M$ national teams. Team $u$ has "strength" $s_u$.
$$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v)]} = \frac{1}{1 + \exp(-\mathbf{s}^\top \mathbf{x})},$$
where $\mathbf{s} = [s_1, \ldots, s_M]^\top$ and $\mathbf{x}$ has $+1$ in position $u$, $-1$ in position $v$ and $0$ elsewhere.
Key challenges with national teams:
1. They play few matches every year: recent data is sparse.
2. Their squads change frequently: old data is stale.

Inspiration
Many players play against each other in club competitions.
Can we transfer information from club matches to international matches?
Source: http://www.estadao.com.br/infograficos/onde-atuam-os-736-jogadores-da-copa-2010,esportes,280906

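Main idea
Embed teams in the space of players:
$$s_u = \sum_{i \in L_u} \tilde{s}_i,$$
where $L_u$ is the set of players of team $u$. Then
$$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v)]} = \frac{1}{1 + \exp(-\tilde{\mathbf{s}}^\top \mathbf{z})},$$
where $\tilde{\mathbf{s}} = [\tilde{s}_1, \ldots, \tilde{s}_P]^\top$ and $\mathbf{z}$ has $+1$ for each player in $L_u$, $-1$ for each player in $L_v$ and $0$ elsewhere.
Club matches and international matches share the same parameters.
The number of parameters explodes. This seems likely to lead to statistical and computational issues.
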
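As a rough sketch of the embedding on this slide, the snippet below builds the match vector $\mathbf{z}$ from two sets of players and checks that $\tilde{\mathbf{s}}^\top \mathbf{z}$ equals the difference of the summed player strengths. The helper name `match_vector`, the player indices and the strength values are assumptions chosen for illustration.

```python
import numpy as np

def match_vector(players_u, players_v, num_players):
    """Return z with +1 for players of team u, -1 for players of team v, 0 elsewhere."""
    z = np.zeros(num_players)
    z[list(players_u)] = 1.0
    z[list(players_v)] = -1.0
    return z

# Hypothetical setting: P = 6 players, three per side (illustrative values only).
player_strengths = np.array([0.4, 0.1, -0.2, 0.3, 0.0, -0.1])
players_u, players_v = [0, 1, 2], [3, 4, 5]

z = match_vector(players_u, players_v, len(player_strengths))
s_u = player_strengths[players_u].sum()  # team strength = sum of player strengths
s_v = player_strengths[players_v].sum()
p_u_wins = 1.0 / (1.0 + np.exp(-(player_strengths @ z)))
print(s_u - s_v, player_strengths @ z, p_u_wins)
```

The same `player_strengths` vector would be used whether the two sides are clubs or national teams, which is how information can flow from one kind of match to the other.
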
Bayesian approach
Keep a distribution over parameters instead of optimizing the likelihood.
$$p(\tilde{\mathbf{s}} \mid \mathcal{D}) \propto p(\mathcal{D} \mid \tilde{\mathbf{s}}) \times p(\tilde{\mathbf{s}})$$
posterior distribution ∝ likelihood × prior distribution (e.g. Gaussian)
Statistical issues solved? Not clear.
Computational issues solved? No.

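Dual viewpoint
In the end, we are only interested in the "strength" difference, i.e. in $p(\tilde{\mathbf{s}}^\top \mathbf{z} \mid \mathcal{D})$.
Accurate estimation of all parameters is not necessary.
Define $f(\mathbf{z}) = \tilde{\mathbf{s}}^\top \mathbf{z}$, so that
$$p(f \mid \mathcal{D}) \propto p(\mathcal{D} \mid f) \times p(f), \qquad \mathrm{Cov}[f(\mathbf{z}), f(\mathbf{z}')] = \sigma^2 \mathbf{z}^\top \mathbf{z}'.$$
Inference can be done in the dual space:
$$f(\mathbf{z}) \sim \mathcal{GP}[0, k(\mathbf{z}, \mathbf{z}')].$$
The covariance function $k(\mathbf{z}, \mathbf{z}') = \sigma^2 \mathbf{z}^\top \mathbf{z}'$ is the player kernel!
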
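Because every entry of $\mathbf{z}$ is $+1$, $-1$ or $0$, the inner product $\mathbf{z}^\top \mathbf{z}'$ reduces to counting players that the two matches have in common: players appearing on the same side contribute $+1$, players appearing on opposite sides contribute $-1$. The sketch below computes the kernel that way. It assumes each match is represented as a pair of player-ID sets; the function name `player_kernel` and the example matches are hypothetical, and this is not necessarily how the authors implemented it.

```python
def player_kernel(match_a, match_b, sigma2=1.0):
    """k(z, z') = sigma^2 * z^T z', computed from sets of player IDs.

    Each match is a pair (players of first team, players of second team).
    """
    (u_a, v_a), (u_b, v_b) = match_a, match_b
    same_side = len(u_a & u_b) + len(v_a & v_b)      # shared players, same side
    opposite_side = len(u_a & v_b) + len(v_a & u_b)  # shared players, swapped sides
    return sigma2 * (same_side - opposite_side)

# Two hypothetical matches over 8 players (IDs 0..7), three players per side.
m1 = ({0, 1, 2}, {3, 4, 5})
m2 = ({0, 1, 6}, {3, 7, 2})
print(player_kernel(m1, m2))  # players 0, 1, 3 on the same side, player 2 swapped
```

The cost of one kernel evaluation depends on the size of the lineups, not on the total number of players $P$, and the Gram matrix over $N$ matches is $N \times N$ regardless of $P$.
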
The cube
[Figure: a cube whose eight vertices are Linear Regression, Logistic Regression, Bayesian Linear Regression, Bayesian Logistic Regression, Kernel Regression, Kernel Classification, GP Regression and GP Classification; its three axes are labeled "Classification", "Bayesian" and "Kernel".]
Credit: Zoubin Ghahramani

Dataset
Collected data on 24 887 matches from the main football competitions over the last 10 years.
33 157 distinct players appear in the dataset.

Results
Logarithmic loss against competing approaches in 2008, 2012 and 2016.
[Figure: three panels, one each for 2008, 2012 and 2016.]

Ternary outcomes
Rao and Kupper (1967) proposed the following extension.
$$P(u \succ v) = \frac{1}{1 + \exp[-(s_u - s_v - \alpha)]}$$
$$P(u \equiv v) = 1 - P(u \succ v) - P(v \succ u) \propto P(u \succ v) \cdot P(v \succ u)$$
A draw is (essentially) equivalent to one win and one loss.

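A small numerical sketch of the Rao and Kupper extension follows. The function name `outcome_probabilities` and the input values are illustrative assumptions; the check at the end uses the fact that, for this model, the constant of proportionality between the draw probability and the product of the two win probabilities is $e^{2\alpha} - 1$.

```python
import math

def outcome_probabilities(s_u, s_v, alpha):
    """Return (P(u wins), P(draw), P(v wins)) for a draw margin alpha >= 0."""
    p_u = 1.0 / (1.0 + math.exp(-(s_u - s_v - alpha)))
    p_v = 1.0 / (1.0 + math.exp(-(s_v - s_u - alpha)))
    p_draw = 1.0 - p_u - p_v
    return p_u, p_draw, p_v

# Hypothetical strengths and draw margin (illustrative values only).
alpha = 0.6
p_u, p_draw, p_v = outcome_probabilities(0.3, 0.1, alpha)
print(p_u, p_draw, p_v)

# The draw probability is proportional to the product of the win probabilities.
print(p_draw, (math.exp(2 * alpha) - 1.0) * p_u * p_v)
```

With `alpha = 0` the draw probability vanishes and the model reduces to the binary one; larger values of `alpha` shift more probability mass to the draw.
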