Feature Extraction and Aggregation for Predicting the Euro 2016



SLIDE 1

Feature Extraction and Aggregation for Predicting the Euro 2016

Maryam Tavakol, Hamid Zafartavanaelmi, and Ulf Brefeld

Riva del Garda, Sep 19, 2016

SLIDE 2

Agenda

  • Introduction
  • Feature Extraction
  • Prediction & Learning
  • Performance Analysis
  • Summary

SLIDE 3

Introduction

SLIDE 4

Feature Extraction

  • Based on available data from the past tournaments
  • General country data
  • FIFA ranking, FIFA points, UEFA ranking, etc.
  • Normalising features using min-max rescaling, which keeps the order
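
As a minimal sketch of the min-max rescaling mentioned on this slide: each feature is mapped linearly to [0, 1], so the relative order of the teams is unchanged. The function name `min_max_rescale` and the FIFA point values are made up for illustration and do not come from the presentation.

```python
def min_max_rescale(values):
    """Rescale a list of numbers linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical FIFA points for four teams (made-up numbers).
fifa_points = [1300.0, 1100.0, 900.0, 1200.0]
scaled = min_max_rescale(fifa_points)
```

Because the mapping is increasing, sorting the teams by `scaled` gives the same ranking as sorting them by `fifa_points`.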

SLIDE 5

Feature Extraction

  • Player specific data
  • Market value, age, number of matches/goals, etc.
  • Obtaining the current squads
  • Goal/play ratio; host advantage for France
  • Averaging for all players of a team
  • Normalising features using min and max rescaling
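
The aggregation step (averaging player attributes over a squad) might look like the sketch below. The squads, attribute choice, and the helper name `team_averages` are hypothetical; the slides do not give the actual data layout.

```python
from statistics import mean

# Hypothetical squads: each player is (market value in million EUR, age).
squads = {
    "Team A": [(40.0, 24), (20.0, 30), (30.0, 27)],
    "Team B": [(5.0, 28), (15.0, 26)],
}


def team_averages(players):
    """Average each player attribute over the whole squad."""
    columns = zip(*players)  # one tuple of values per attribute
    return [mean(col) for col in columns]


team_features = {team: team_averages(players) for team, players in squads.items()}
```

Each team ends up with one value per attribute, which can then be rescaled across teams in the same way as the country-level features.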

SLIDE 6

Add a New Feature

SLIDE 7

Club Division

Juventus: club rank = 2
Lazio: club rank = 212

SLIDE 8

Team-Club Harmony

(Normalised club rank) × (number of players)

Country    Num of Players   Club             Club Rank
Spain      5                Barcelona        1
Italy      6                Juventus         2
France     2                Juventus         2
Germany    5                Bayern Munich    4
Belgium    3                Liverpool        42
Poland     3                Legia            52
Portugal   4                Sporting CP      179
Wales      3                Crystal Palace   0*
Iceland    2                Hammarby         0*
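
A rough sketch of how the team-club harmony product could be computed from the data listed above. The slides do not say how the club rank is normalised or what "0*" means, so two assumptions are made here: the rank is rescaled so that the best-ranked club gets 1.0 and the worst-ranked club gets 0.0, and "0*" is read as "club not ranked" and given a normalised value of 0.0.

```python
# Rows from the slide: (country, number of players, club, club rank).
# "0*" in the source is stored as None (ASSUMPTION: the club is unranked).
rows = [
    ("Spain", 5, "Barcelona", 1),
    ("Italy", 6, "Juventus", 2),
    ("France", 2, "Juventus", 2),
    ("Germany", 5, "Bayern Munich", 4),
    ("Belgium", 3, "Liverpool", 42),
    ("Poland", 3, "Legia", 52),
    ("Portugal", 4, "Sporting CP", 179),
    ("Wales", 3, "Crystal Palace", None),
    ("Iceland", 2, "Hammarby", None),
]

ranks = [rank for _, _, _, rank in rows if rank is not None]
best, worst = min(ranks), max(ranks)


def normalised_rank(rank):
    """ASSUMED normalisation: 1.0 for the best club, 0.0 for the worst or unranked."""
    if rank is None:
        return 0.0
    return (worst - rank) / (worst - best)


# Harmony = (normalised club rank) x (number of players from that club).
harmony = {country: normalised_rank(rank) * n for country, n, _, rank in rows}
```

Under these assumptions a country scores highest when many of its players share a top-ranked club. A side effect of this particular rescaling is that the worst-ranked club in the list also maps to 0.0, so a different normalisation may well have been used in the original work.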

SLIDE 9

Prediction

  • A score per country is defined as a weighted sum of features, i.e., a linear function
  • The probabilities are computed based on the obtained scores

$s_i = \theta_i^\top x_i$
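
The linear score is simply a dot product between a weight vector and a feature vector. The sketch below illustrates this with made-up weights and feature values; the function name `score` is hypothetical.

```python
def score(theta, x):
    """Weighted sum of features: the dot product of weights and feature values."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    return sum(t * v for t, v in zip(theta, x))


# Hypothetical weights and rescaled features for one country.
theta = [0.5, 0.25, 0.25]
x = [1.0, 0.0, 0.5]
s = score(theta, x)
```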

SLIDE 10

Prediction

  • Win probability for team i
  • Loss probability for team j
  • Probability of draw

SLIDE 11

Learning

  • Capture the outcome probabilities from the head-to-head record of each pair of countries
  • Germany vs. France: 27 times
  • 10 wins for Germany, 12 for France, and 5 draws

$p_w^G = \frac{10}{27}, \quad p_w^F = \frac{12}{27}, \quad p_d = \frac{5}{27}$
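
The Germany vs. France example can be reproduced with a few lines of code. This sketch uses relative frequencies from the head-to-head record; the function name `outcome_probabilities` is made up for illustration.

```python
from fractions import Fraction


def outcome_probabilities(wins_a, wins_b, draws):
    """Relative frequencies of the three outcomes in a head-to-head record."""
    total = wins_a + wins_b + draws
    if total == 0:
        raise ValueError("no head-to-head matches recorded")
    return (
        Fraction(wins_a, total),
        Fraction(wins_b, total),
        Fraction(draws, total),
    )


# Germany vs. France: 10 wins for Germany, 12 for France, 5 draws.
p_w_G, p_w_F, p_d = outcome_probabilities(10, 12, 5)
```

Using `Fraction` keeps the values exact, so the three probabilities sum to exactly 1.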

SLIDE 12

Learning

  • Converting probabilities to scores
  • Obtaining parameters from the closed-form solution of a ridge regression problem

$\hat{\theta} = (X^\top X + I)^{-1} X^\top \hat{s}$
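
A small, dependency-free sketch of the closed-form ridge solution: rather than inverting the matrix, it solves the linear system (XᵀX + I) θ = Xᵀs by Gaussian elimination. The regularisation weight is fixed to 1, matching the "+ I" term in the formula as it appears on the slide. The function name `ridge_closed_form` and the toy data are assumptions; in practice a linear algebra library would normally be used instead.

```python
def ridge_closed_form(X, s):
    """Solve (X^T X + I) theta = X^T s by Gaussian elimination."""
    n, d = len(X), len(X[0])
    # A = X^T X + I (d x d matrix), b = X^T s (vector of length d).
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) + (1.0 if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(X[k][i] * s[k] for k in range(n)) for i in range(d)]
    # Forward elimination. A is symmetric positive definite, so every
    # pivot is non-zero and no row swaps are needed.
    for i in range(d):
        for r in range(i + 1, d):
            factor = A[r][i] / A[i][i]
            for c in range(i, d):
                A[r][c] -= factor * A[i][c]
            b[r] -= factor * b[i]
    # Back substitution.
    theta = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(A[i][c] * theta[c] for c in range(i + 1, d))
        theta[i] = (b[i] - tail) / A[i][i]
    return theta


# Toy example (made-up numbers): two samples, two features.
X = [[1.0, 0.0], [0.0, 1.0]]
s_hat = [2.0, 4.0]
theta_hat = ridge_closed_form(X, s_hat)
```

With X equal to the identity, XᵀX + I is twice the identity, so the solution is half of the target scores.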

SLIDE 13

Performance Analysis

  • Compare prediction results to the actual tournament outcome
  • Until Quarter-Final (QF)
  • Evaluation by multi-class logarithmic loss

$\text{Logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij})$
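
The multi-class log loss can be computed directly from its definition. In the sketch below, `y_true` holds one-hot rows (one row per match, one column per outcome) and `p_pred` holds the predicted probabilities. The clipping constant `eps` is a common practical safeguard against log(0) and is not taken from the slides; the example data are made up.

```python
import math


def multiclass_log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-probability assigned to the true outcomes."""
    n = len(y_true)
    total = 0.0
    for y_row, p_row in zip(y_true, p_pred):
        for y, p in zip(y_row, p_row):
            # Clip p away from 0 so that log(p) stays finite.
            total += y * math.log(max(p, eps))
    return -total / n


# Two matches, three outcomes (win, draw, loss), uniform predictions.
y = [[1, 0, 0], [0, 0, 1]]
p = [[1 / 3, 1 / 3, 1 / 3], [1 / 3, 1 / 3, 1 / 3]]
loss = multiclass_log_loss(y, p)
```

For reference, a uniform prediction over three outcomes always yields a loss of ln 3, roughly 1.0986, regardless of the actual results.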

SLIDE 14

Overall Performance

  • Error of prediction for 45 matches before QF
  • Average error: 1.3187

[Figure: log loss]

SLIDE 15

Insufficient Data

  • Relation of performance with amount of historical data

[Figure: error per country (log loss) against the number of historical records]

SLIDE 16

Sufficient Data

  • Reduction of error from 1.3187 to 1.1129 for teams with more than 4 historical records

[Figure: log loss]

SLIDE 17

Role of Past Euros

  • Eliminating teams with fewer than 2 appearances in past Euro cups, error: 0.9680

[Figure: log loss]

SLIDE 18

Baseline

  • Compare to a simple baseline (based on FIFA ranking only)

[Figure: log loss]

SLIDE 19

Summary

  • Collecting data
  • Feature extraction/cleaning
  • New feature: team-club harmony
  • Learn a linear model
  • Effect of historical data on the performance

SLIDE 20

Questions?

Thanks for your attention

Email: tavakol@leuphana.de