
Predicting biathlon shooting performance using machine learning

Thomas Maier¹, Daniel Meister², Severin Trösch¹, Jon Peter Wehrlin¹

¹ Eidgenössische Hochschule für Sport Magglingen EHSM
² Datahouse AG


Introduction

  • Shooting is crucial for end ranking (~50%) (Luchsinger et al. 2017)
  • Influence of fatigue and biomechanical parameters (Hoffmann et al. 1992; Sattlecker et al. 2017)
  • Shooting mode, athlete level, variation in performance (Luchsinger et al. 2017; Skattebo & Losnegard 2017)

  • How predictable are individual shots?

Data

  • World Cup, World Championships and Olympic Games

(only single athlete categories)

  • From HoRa, supplier of the target system
  • Training data: 2012/13 – 2015/16
  • Test data: 2016/17

Total of 152’640 shots


Data … as PDF

(Image: xkcd)


Tidy data

One row for each shot


Reorganise data with dplyr


Gather data
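
The slides do this step in R with dplyr and a gather operation. As a rough Python analogue, the sketch below turns one row per shooting bout into one row per shot. The column names (athlete, race, mode, shot_1 … shot_5) and the toy values are assumptions for illustration and are not taken from the HoRa data.

```python
# Hypothetical wide records: one row per shooting bout, one column per shot
# (1 = hit, 0 = miss). Names and values are illustrative only.
wide_rows = [
    {"athlete": "A", "race": "sprint-01", "mode": "prone",
     "shot_1": 1, "shot_2": 1, "shot_3": 0, "shot_4": 1, "shot_5": 1},
    {"athlete": "A", "race": "sprint-01", "mode": "standing",
     "shot_1": 1, "shot_2": 0, "shot_3": 1, "shot_4": 1, "shot_5": 0},
]


def gather_shots(rows, n_shots=5):
    """Reshape one row per bout into one row per shot (long / tidy form)."""
    tidy = []
    for row in rows:
        # Everything that is not a shot column identifies the bout.
        id_cols = {k: v for k, v in row.items() if not k.startswith("shot_")}
        for i in range(1, n_shots + 1):
            tidy.append({**id_cols, "shot_number": i, "hit": row[f"shot_{i}"]})
    return tidy


tidy_rows = gather_shots(wide_rows)
for r in tidy_rows[:3]:
    print(r)
```

Each resulting row carries the bout identifiers plus a shot number and a binary outcome, which is the shape the later feature engineering and models work on.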


Feature Engineering (29 Variables)


Rolling functions with zoo
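
The slides compute rolling features with the R package zoo. A minimal Python sketch of one such feature, a preceding hit rate, is given below. The window length and the function name are assumptions; the slides do not state which windows were used.

```python
from collections import deque


def preceding_hit_rate(hits, window=50):
    """Hit rate over up to `window` shots before each shot.

    The current shot is excluded, so the feature never contains the
    outcome it is meant to predict. Returns None where no earlier shot
    exists. `window=50` is an illustrative default, not the authors' value.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` outcomes
    rates = []
    for hit in hits:
        rates.append(sum(recent) / len(recent) if recent else None)
        recent.append(hit)
    return rates


rates = preceding_hit_rate([1, 1, 0, 1, 0], window=3)
print(rates)
```

In practice the function would be applied per athlete (and per shooting mode) on chronologically sorted shots.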


Analysis

Exploratory Data Analysis

  • 95% Confidence limits
  • Pearson Correlations
  • Chi-squared and Mann-Whitney U tests
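
As an illustration of the confidence limits, the sketch below computes a 95% interval for a hit rate. The slides do not say which interval was used; the Wilson score interval here is an assumption, chosen because it behaves well for proportions close to 1. The counts are made up.

```python
import math


def wilson_interval(hits, shots, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    p = hits / shots
    denom = 1 + z ** 2 / shots
    centre = (p + z ** 2 / (2 * shots)) / denom
    half_width = z * math.sqrt(p * (1 - p) / shots
                               + z ** 2 / (4 * shots ** 2)) / denom
    return centre - half_width, centre + half_width


# Illustrative counts only: 85 hits out of 100 shots.
low, high = wilson_interval(85, 100)
print(f"hit rate 0.85, 95% limits: {low:.3f} to {high:.3f}")
```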

Machine Learning

  • LogReg: logistic regression using only one input variable
  • XGB: extreme gradient boosting with trees
  • NNet: artificial neural network
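
To make the LogReg baseline concrete, here is a minimal sketch of a logistic regression with a single input, fitted by plain gradient descent. It assumes the input is a preceding hit rate; the toy data, learning rate and epoch count are illustrative and do not reproduce the authors' caret setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg_1d(xs, ys, lr=0.5, epochs=2000):
    """Fit P(hit) = sigmoid(a + b * x) by batch gradient descent on log loss."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(a + b * x) - y  # derivative of log loss w.r.t. the logit
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Toy data (assumed): athletes with a 0.6 preceding hit rate hit 6 of 10,
# athletes with a 0.9 preceding hit rate hit 9 of 10.
xs = [0.6] * 10 + [0.9] * 10
ys = [1] * 6 + [0] * 4 + [1] * 9 + [0] * 1

a, b = fit_logreg_1d(xs, ys)
print("P(hit | x=0.6) =", round(sigmoid(a + b * 0.6), 3))
print("P(hit | x=0.9) =", round(sigmoid(a + b * 0.9), 3))
```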

[Figure: schematic diagrams of LogReg, XGB and NNet]

XGB: sequential trees to fit errors of previous trees
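
The idea of sequential trees fitting the errors of previous trees can be sketched in a few lines. The version below is a deliberately simplified stand-in: it boosts one-split regression stumps on squared error with one feature, whereas XGBoost itself uses a regularised objective and second-order gradient information. All names, data and settings are assumptions.

```python
def fit_stump(xs, residuals):
    """Best one-split regression stump (least squares) on a single feature."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so that both sides of the split are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=20, learning_rate=0.3):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    trees = []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(t(x) for t in trees)


# Toy data (assumed): feature = preceding hit rate, target = hit (1) or miss (0).
xs = [0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
ys = [0, 0, 1, 1, 1, 1]
model = boost(xs, ys)
print(round(model(0.5), 3), round(model(0.9), 3))
```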

Time sliced cross-validation

[Figure: along the time axis, successive training windows are each followed by a prediction window within the training data; the test data follows at the end]
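
A minimal sketch of how such time slices can be generated for chronologically ordered rows is shown below. The parameter names loosely mirror caret's createTimeSlices (initialWindow, horizon, fixedWindow), but whether the authors used that function, and with which settings, is not stated. Stepping forward by one full horizon per slice is this sketch's own choice.

```python
def time_slices(n_samples, initial_window, horizon, fixed_window=True):
    """Return (train_indices, test_indices) pairs that respect time order.

    Rows are assumed to be sorted chronologically. Every test block lies
    strictly after its training block, so no future data leaks into training.
    """
    slices = []
    start, end = 0, initial_window
    while end + horizon <= n_samples:
        # Fixed window: training block slides forward; otherwise it grows.
        train = list(range(start if fixed_window else 0, end))
        test = list(range(end, end + horizon))
        slices.append((train, test))
        start += horizon
        end += horizon
    return slices


slices = time_slices(n_samples=10, initial_window=4, horizon=2)
for train, test in slices:
    print("train", train, "-> predict", test)
```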


Caret – ML model wrapper


Final model configurations


Results – Exploratory Analysis

Hit rate varies between: Athletes > disciplines > shooting modes > shot number


Results – ML Models

  • All models show low predictive power
  • Complex models show about the same performance as LogReg


Discussion

  • Largest differences in hit rates between athletes
  • Individual preceding mode-specific hit rate holds almost all predictive information
  • Individual shots can be modelled as Bernoulli trials

→ explains observed variation

  • High random influence in competition results (± 1-2 hits / competition)
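
The size of this random influence follows directly from the Bernoulli model: if every shot is an independent trial with hit probability p, the number of hits in n shots is binomial with standard deviation sqrt(n · p · (1 − p)). The sketch below evaluates this for a 20-shot competition; the hit probability of 0.85 is an assumed, illustrative value.

```python
import math


def hits_mean_sd(hit_prob, n_shots):
    """Mean and standard deviation of hits when shots are independent Bernoulli trials."""
    mean = n_shots * hit_prob
    sd = math.sqrt(n_shots * hit_prob * (1 - hit_prob))
    return mean, sd


# Assumed values: hit probability 0.85, 20 shots per competition.
mean_hits, sd_hits = hits_mean_sd(0.85, 20)
print(f"expected hits: {mean_hits:.1f}, standard deviation: {sd_hits:.2f}")
```

With these assumed numbers the standard deviation is about 1.6 hits, which is in line with the ± 1-2 hits per competition quoted above.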

"Selina was really concentrated today, so she was able to access her true potential. She is a professional athlete!"

A Swiss coach

"Irene was losing her confidence midway where she started to think too much, the pressure was too high on the last two shots."

Another Swiss coach

(Image: xkcd)


The hot hand [in basketball] is a massive and widespread cognitive illusion.

Daniel Kahneman


Final thoughts…

  • Not everyone understands probabilities / randomness
  • Not everyone is interested in the complexity of your models
  • Coaches / customers / executives / the public …

… are interested in stories and specific instructions


Thomas Maier
Senior Data Scientist
Datahouse AG
Alte Börse - Zürich
044 289 92 63
thomas.maier@datahouse.ch
