Pairwise Comparisons with Flexible Time-Dynamics
Lucas Maystre, Victor Kristof, Matthias Grossglauser KDD Research Track 2 — August 6th, 2019
Pairwise comparison data

Association football example:

Team 1       Team 2     Score
France       Portugal   2-5
Luxembourg   Greece     0-3
...          ...        ...
Turkey       Slovakia   1-1
Bulgaria     Kosovo     0-1

Questions we might want to ask:
“How can we quantify the skill of France?”
“How likely is South Korea to beat Germany?”
→ we need pairwise comparison models.
Each item $i$ has a latent skill $s_i$, and $p(i \succ j)$ is the probability that “i wins over j” [Zermelo, 1928] [Thurstone, 1927].

Given data $\mathcal{D}$, the posterior distribution is

$$p(s \mid \mathcal{D}) \propto p(s) \prod_{(i,j) \in \mathcal{D}} p(i \succ j)$$

[Figure: example skills for Germany, Brazil, France, Switzerland and Iceland; values shown: 0.68, 1.09, 1.18]
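To make the static model concrete, here is a minimal sketch of two common choices for $p(i \succ j)$ as a function of the skill difference $s_i - s_j$: a logistic link, in the spirit of [Zermelo, 1928], and a unit-scale probit link, in the spirit of [Thurstone, 1927]. The function names and the skill values are illustrative assumptions, not taken from the slides.

```python
import math

def win_prob_logistic(s_i: float, s_j: float) -> float:
    """Logistic link: p(i beats j) = 1 / (1 + exp(-(s_i - s_j)))."""
    return 1.0 / (1.0 + math.exp(-(s_i - s_j)))

def win_prob_probit(s_i: float, s_j: float) -> float:
    """Probit link: p(i beats j) = Phi(s_i - s_j), Phi = standard normal CDF."""
    return 0.5 * (1.0 + math.erf((s_i - s_j) / math.sqrt(2.0)))

# Hypothetical skills, for illustration only.
skills = {"A": 1.2, "B": 0.7}
p_logit = win_prob_logistic(skills["A"], skills["B"])
p_probit = win_prob_probit(skills["A"], skills["B"])
print(f"logistic: {p_logit:.3f}, probit: {p_probit:.3f}")
```

Under both links, only the difference in skills matters, and equal skills give a win probability of one half.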
Data come with a timestamp.

Date         Team 1       Team 2     Score
1923-09-15   France       Portugal   2-5
1923-09-16   Luxembourg   Greece     0-3
...          ...          ...        ...
2018-06-21   Turkey       Slovakia   1-1
2018-06-21   Bulgaria     Kosovo     0-1

Questions we might want to ask:
“How strong was France in 1972? In 2018?”
“How likely is South Korea to beat Germany today?”
→ we need dynamic models.
Skill becomes a (latent) stochastic process $s_i(t)$. Its covariance function defines the time dynamics.
[Figure: four panels illustrating different time dynamics: Brownian motion; smooth dynamics; mean-reverting, stationary dynamics; discontinuities]
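As a rough illustration of how the covariance function shapes the dynamics, the sketch below defines three standard kernels with NumPy and draws one skill trajectory from each. The kernel choices are assumptions: a Wiener kernel for Brownian motion, a Matérn 1/2 (exponential) kernel for mean-reverting stationary dynamics, and a squared-exponential kernel as one possible "smooth" kernel. The function names and hyperparameter values are hypothetical.

```python
import numpy as np

def wiener(t1, t2, var=1.0):
    """Brownian motion: k(t, t') = var * min(t, t'), for t, t' >= 0."""
    return var * np.minimum(t1[:, None], t2[None, :])

def matern12(t1, t2, var=1.0, lscale=1.0):
    """Mean-reverting, stationary: k(t, t') = var * exp(-|t - t'| / lscale)."""
    return var * np.exp(-np.abs(t1[:, None] - t2[None, :]) / lscale)

def squared_exp(t1, t2, var=1.0, lscale=1.0):
    """Smooth dynamics: k(t, t') = var * exp(-(t - t')^2 / (2 * lscale^2))."""
    d = t1[:, None] - t2[None, :]
    return var * np.exp(-0.5 * (d / lscale) ** 2)

def sample_path(kernel, ts, rng, jitter=1e-6):
    """Draw one zero-mean trajectory on the grid ts via a Cholesky factor."""
    cov = kernel(ts, ts) + jitter * np.eye(len(ts))  # jitter for numerical stability
    return np.linalg.cholesky(cov) @ rng.standard_normal(len(ts))

ts = np.linspace(0.1, 10.0, 50)
rng = np.random.default_rng(0)
paths = {k.__name__: sample_path(k, ts, rng) for k in (wiener, matern12, squared_exp)}
```

Kernels can also be summed, which gives combinations such as a constant offset plus a time-varying component.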
$$p(s_1, \ldots, s_M \mid \mathcal{D}) \propto \prod_{i=1}^{M} p(s_i) \prod_{(i,j,t) \in \mathcal{D}} p(i \succ j \mid t), \qquad s_i = [s_i(t_1) \cdots s_i(t_N)]$$

$$q(s_1, \ldots, s_M) \doteq \prod_{i} \mathcal{N}(s_i \mid \mu_i, \Sigma_i), \qquad \min_{\{\mu_i, \Sigma_i\}} \mathrm{KL}(q \,\|\, p)$$

Alternative viewpoint:

$$p(i \succ j \mid t) \approx \mathcal{N}[s_i(t) \mid \tilde{\mu}_{it}, \tilde{\sigma}_{it}] \cdot \mathcal{N}[s_j(t) \mid \tilde{\mu}_{jt}, \tilde{\sigma}_{jt}]$$

$$q(s_i) \propto p(s_i) \prod_{t \in \mathcal{D}} \mathcal{N}[s_i(t) \mid \tilde{\mu}_{it}, \tilde{\sigma}_{it}]$$
for each observation: approximate $p(i \succ j \mid t)$ by $\mathcal{N}[s_i(t) \mid \tilde{\mu}_{it}, \tilde{\sigma}_{it}] \times \mathcal{N}[s_j(t) \mid \tilde{\mu}_{jt}, \tilde{\sigma}_{jt}]$
Using EP [Minka, 2001] or CVI [Khan et al., 2017]

for each item: recompute skill posterior $q(s_i) \propto p(s_i) \prod_{t \in \mathcal{D}} \mathcal{N}[s_i(t) \mid \tilde{\mu}_{it}, \tilde{\sigma}_{it}]$
Using SSM reformulation [Hartikainen & Särkkä, 2010]

Converges in a few linear-time iterations
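The "recompute skill posterior" step combines a Gaussian prior with independent Gaussian pseudo-observations, so it has a closed form. The sketch below shows that closed form in its dense (batch) version, assuming a zero-mean prior with covariance matrix `K` and treating each $\tilde{\sigma}_{it}$ as a variance; both are assumptions made for illustration. The function name is hypothetical, and this dense solve costs cubic time in the number of observations, unlike the linear-time state-space reformulation mentioned on the slide.

```python
import numpy as np

def posterior_from_pseudo_obs(K, mu_tilde, var_tilde):
    """Combine the prior N(0, K) with independent Gaussian pseudo-observations
    N[s(t_n) | mu_tilde[n], var_tilde[n]] and return the mean and covariance
    of the resulting Gaussian q(s)."""
    A = K + np.diag(var_tilde)
    mean = K @ np.linalg.solve(A, mu_tilde)
    cov = K - K @ np.linalg.solve(A, K)
    return mean, cov

# Illustrative values: exponential kernel on three timestamps.
ts = np.array([0.0, 1.0, 2.0])
K = np.exp(-np.abs(ts[:, None] - ts[None, :]))
mu_tilde = np.array([0.5, 1.0, 0.2])
var_tilde = np.array([1.0, 0.5, 2.0])
mean, cov = posterior_from_pseudo_obs(K, mu_tilde, var_tilde)
```

A pseudo-observation with a very large variance carries almost no information, so the posterior at that point stays close to the prior.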
[O'Hagan, 1978]: $s_i(t)$ is a Gaussian process
[Figure: graphical model with latent skills $s_{i1}, s_{i2}, \ldots, s_{iN}$ at times $t_1, t_2, \ldots, t_N$ and observations $y_1, y_2, \ldots, y_N$]

[Hartikainen & Särkkä, 2010]: $\bar{s}_i(t)$ is a Gauss-Markov process
[Figure: graphical model with latent states $\bar{s}^k_{i1}, \bar{s}^k_{i2}, \ldots, \bar{s}^k_{iN}$ (each marked K) at times $t_1, t_2, \ldots, t_N$ and observations $y_1, y_2, \ldots, y_N$]
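One simple case where the Gaussian-process and Gauss-Markov views coincide is the Matérn 1/2 (exponential) kernel, whose state is one-dimensional. The sketch below is an illustration under that assumption, with hypothetical function names and timestamps: it checks numerically that a first-order Markov recursion reproduces the kernel's covariance matrix, and it draws a trajectory in linear time using only the previous state.

```python
import numpy as np

def matern12_kernel(ts, var=1.0, lscale=1.0):
    """Batch (GP) view: full N x N covariance matrix."""
    return var * np.exp(-np.abs(ts[:, None] - ts[None, :]) / lscale)

def markov_covariance(ts, var=1.0, lscale=1.0):
    """Covariance implied by the recursion
    s(t_{n+1}) = a_n * s(t_n) + e_n, with a_n = exp(-(t_{n+1} - t_n) / lscale),
    e_n ~ N(0, var * (1 - a_n**2)), started from s(t_1) ~ N(0, var)."""
    n = len(ts)
    a = np.exp(-np.diff(ts) / lscale)
    cov = np.zeros((n, n))
    cov[0, 0] = var
    for k in range(1, n):
        # Cov[s_k, s_m] = a_{k-1} * Cov[s_{k-1}, s_m] for every m < k
        cov[k, :k] = a[k - 1] * cov[k - 1, :k]
        cov[:k, k] = cov[k, :k]
        # Var[s_k] = a^2 * Var[s_{k-1}] + noise variance
        cov[k, k] = a[k - 1] ** 2 * cov[k - 1, k - 1] + var * (1.0 - a[k - 1] ** 2)
    return cov

def sample_markov(ts, rng, var=1.0, lscale=1.0):
    """Draw a trajectory in O(N) time, one transition per timestamp."""
    s = np.empty(len(ts))
    s[0] = rng.normal(0.0, np.sqrt(var))
    for k in range(1, len(ts)):
        a = np.exp(-(ts[k] - ts[k - 1]) / lscale)
        s[k] = a * s[k - 1] + rng.normal(0.0, np.sqrt(var * (1.0 - a ** 2)))
    return s

ts = np.array([0.0, 0.3, 1.0, 2.5, 2.6])  # sorted, irregular timestamps
same = np.allclose(matern12_kernel(ts), markov_covariance(ts))
path = sample_markov(ts, np.random.default_rng(0))
```

Because each step depends only on the previous state, filtering and smoothing recursions can process the observations of an item in a single pass, which is what makes linear-time inference possible.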
Dataset          N           Timespan
ATP tennis       618,934     1991-2017
NBA basketball   67,642      1946-2018
FIFA football    19,158      1908-2018
Chess            7,169,202   1475-2017
StarCraft WoL    61,657      -
StarCraft HotS   28,582      -

Measure predictive performance, compare against:
Constant: $s_i(t) \equiv s_i$
Elo: $s_i \leftarrow s_i + \lambda \frac{\partial}{\partial s_i} \log p(i \succ j)$
TrueSkill: $s_i(t + \Delta t) = s_i(t) + \mathcal{N}[0, \sigma^2 \Delta t]$
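The gradient-style update $s_i \leftarrow s_i + \lambda \frac{\partial}{\partial s_i} \log p(i \succ j)$ can be sketched in a few lines once a concrete form for $p(i \succ j)$ is fixed. The example below assumes a logistic link, for which the derivative is $1 - p(i \succ j)$; the function name and the step size are hypothetical. The loser's skill is moved by the symmetric update, since the derivative with respect to $s_j$ has the opposite sign.

```python
import math

def gradient_update(s_i: float, s_j: float, lam: float = 0.1):
    """One update after observing 'i wins over j', assuming a logistic link."""
    p = 1.0 / (1.0 + math.exp(-(s_i - s_j)))  # p(i beats j) before the update
    grad = 1.0 - p                            # d/ds_i of log p(i beats j)
    return s_i + lam * grad, s_j - lam * grad

# An upset (the weaker item wins) moves the skills more than an expected win.
upset = gradient_update(0.0, 1.0)
expected = gradient_update(1.0, 0.0)
```

The step is largest when the outcome was considered unlikely, and it shrinks toward zero as the predicted win probability approaches one.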
Dataset           Constant   Elo     TrueSkill   kickscore   Covariance function
ATP tennis        0.581      0.563   0.563       0.552       Affine + Wiener
NBA basketball    0.692      0.634   0.634       0.630       Constant + Matérn 1/2
World football    0.929      0.950   0.937       0.926       Constant + Matérn 1/2
ChessBase small   1.030      1.035   1.030       1.026       Constant + Wiener

(lower is better)

kickscore outperforms baselines on all datasets
Best dynamics vary across datasets
Multi-threaded implementation in Go
Impact of variational approximation
$$p(s_1, \ldots, s_M \mid \mathcal{D}) \propto \prod_{i=1}^{M} p(s_i) \prod_{(i,j,t) \in \mathcal{D}} p(i \succ j \mid t) \approx q(s_1, \ldots, s_M) \doteq \prod_{i} \mathcal{N}(s_i \mid \mu_i, \Sigma_i)$$

predictive accuracy is preserved
Extensions of the model
likelihoods (Poisson, Gaussian, ...)
https://kickoff.ai
Key contributions:
flexible time-dynamics
Try it out: pip install kickscore