Supervised Learning
Part 1 — Theory
Sven Krippendorf, Workshop on Big Data in String Theory, Boston, 01.12.2017
Content
Theory
Applications: Mathematica
Discussion
Def: Supervised Learning
Supervised learning is the machine learning task of inferring a function from labelled training data. Workflow:
[Figure: data points in the x-y plane]
Classify data into two classes:
Class 1: above line
Class 2: below line
Input: data points
Which line?
separating the data sets:
$w \cdot x_i - b \geq 1$
margin width: $\frac{2}{|w|}$
$w \cdot x_i - b \leq -1$
Optimisation problem with constraints; the dual problem is obtained using Lagrange multipliers. It can then be solved with quadratic programming algorithms, available in Mathematica, Matlab, Python, etc.
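A minimal sketch of a linear maximum-margin classifier on toy data, in plain Python. Note the swap: instead of the dual problem and quadratic programming, this sketch minimises a soft-margin hinge loss by sub-gradient descent, which is simpler to write down without a solver. The data set, the regularisation strength `lam`, the learning rate `lr` and the number of epochs are all illustrative assumptions.

```python
import math

# Toy data (hypothetical): class +1 lies above the line y = x, class -1 below it.
points = [(1, 3), (2, 5), (3, 6), (4, 8), (3, 1), (5, 2), (6, 3), (8, 4)]
labels = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(points, labels, lam=0.01, lr=0.01, epochs=2000):
    """Minimise lam/2 * |w|^2 + mean hinge loss by sub-gradient descent."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]
        grad_b = 0.0
        for (x, y), t in zip(points, labels):
            # A point contributes only if it violates t * (w.x - b) >= 1.
            if t * (w[0] * x + w[1] * y - b) < 1:
                grad_w[0] -= t * x / n
                grad_w[1] -= t * y / n
                grad_b += t / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


def predict(w, b, point):
    """Return +1 if the point lies on the positive side of w.x - b = 0, else -1."""
    return 1 if w[0] * point[0] + w[1] * point[1] - b >= 0 else -1


w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, p) for p in points]
margin_width = 2 / math.hypot(w[0], w[1])  # distance between the two margin lines

print("w =", w, "b =", b)
print("margin width 2/|w| =", margin_width)
print("all points classified correctly:", predictions == labels)
```

For larger data sets a library routine based on quadratic programming would be the more usual choice; the sketch is only meant to make the constraints and the margin 2/|w| concrete.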
Different representation of data via kernel map:
$\{x, y\} \to \{x^2, y^2\}$
$\{x, y\} \to \{x^2, y\}$
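A small sketch of why such a change of representation can help, using the first map {x, y} → {x², y²}. The data (points on two concentric circles of radius 1 and 3) and the separating line u + v = 4 are assumptions chosen for illustration: no straight line separates the circles in the original plane, but in the mapped plane u + v equals the squared radius, so a straight line does.

```python
import math


def feature_map(point):
    """Map {x, y} -> {x^2, y^2}."""
    x, y = point
    return (x * x, y * y)


# Hypothetical data: class 1 on a circle of radius 1, class 2 on a circle of radius 3.
angles = [2 * math.pi * k / 8 for k in range(8)]
inner = [(math.cos(a), math.sin(a)) for a in angles]
outer = [(3 * math.cos(a), 3 * math.sin(a)) for a in angles]


def is_outer(mapped):
    """In the mapped plane (u, v), the straight line u + v = 4 separates the classes."""
    u, v = mapped
    return u + v > 4


inner_flags = [is_outer(feature_map(p)) for p in inner]
outer_flags = [is_outer(feature_map(p)) for p in outer]

print("inner circle flagged as outer:", inner_flags)
print("outer circle flagged as outer:", outer_flags)
```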
distinguish classes.
[Figure: plot with axes T and #]
$(x - \mu_0)^T \Sigma_0^{-1} (x - \mu_0) + \log|\Sigma_0| - (x - \mu_1)^T \Sigma_1^{-1} (x - \mu_1) - \log|\Sigma_1| < \text{threshold}$
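The discriminant above can be evaluated directly once means and covariances are given. The following sketch does this for two-dimensional data in plain Python. The means, covariance matrices and the threshold of 0 are made-up illustrative values, and the convention that a value below the threshold assigns x to class 0 is an assumption, since the slide does not state which class the inequality selects.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def mahalanobis_sq(x, mu, sigma):
    """(x - mu)^T Sigma^{-1} (x - mu) for two-dimensional vectors."""
    dx = [x[0] - mu[0], x[1] - mu[1]]
    s_inv = inv2(sigma)
    return sum(dx[i] * s_inv[i][j] * dx[j] for i in range(2) for j in range(2))


def discriminant(x, mu0, sigma0, mu1, sigma1):
    """Left-hand side of the inequality on the slide."""
    return (mahalanobis_sq(x, mu0, sigma0) + math.log(det2(sigma0))
            - mahalanobis_sq(x, mu1, sigma1) - math.log(det2(sigma1)))


def classify(x, mu0, sigma0, mu1, sigma1, threshold=0.0):
    """Assumed convention: below the threshold means class 0, otherwise class 1."""
    return 0 if discriminant(x, mu0, sigma0, mu1, sigma1) < threshold else 1


# Hypothetical class parameters.
mu0, sigma0 = (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]
mu1, sigma1 = (3.0, 3.0), [[2.0, 0.0], [0.0, 2.0]]

print(classify((0.2, -0.1), mu0, sigma0, mu1, sigma1))
print(classify((3.0, 3.0), mu0, sigma0, mu1, sigma1))
```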
neighbours.
k small: more noise, boundaries clear
k large: less noise, boundaries less clear
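A sketch of classification by majority vote among the k nearest neighbours, showing the effect of k on noise. The training points, the single deliberately mislabelled point and the query are hypothetical and chosen so that k = 1 follows the noisy label while k = 5 does not.

```python
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points closest to the query."""
    def dist_sq(p):
        return (p[0] - query[0]) ** 2 + (p[1] - query[1]) ** 2

    order = sorted(range(len(train_points)), key=lambda i: dist_sq(train_points[i]))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: cluster "A" near (1, 1), cluster "B" near (5, 5),
# plus one noisy point labelled "B" sitting inside cluster "A".
train_points = [(1, 1), (1.5, 1), (1, 1.5), (0.5, 1), (1.2, 1.1),
                (5, 5), (5.5, 5), (5, 5.5), (4.5, 5)]
train_labels = ["A", "A", "A", "A", "B", "B", "B", "B", "B"]

query = (1.3, 1.2)
label_k1 = knn_predict(train_points, train_labels, query, k=1)
label_k5 = knn_predict(train_points, train_labels, query, k=5)

print("k = 1:", label_k1)  # follows the single noisy neighbour
print("k = 5:", label_k5)  # majority of the surrounding cluster
```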
from true output.
[Diagram: … input layer, hidden layer; each layer labelled d, w, b]
$\mathrm{softmax}(d_i) = \frac{e^{d_i}}{\sum_j e^{d_j}}$
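The softmax formula can be sketched in a few lines of plain Python. Subtracting the maximum before exponentiating is a standard numerical-stability trick and not something stated on the slide; it leaves the result unchanged because the common factor cancels between numerator and denominator. The input values are illustrative.

```python
import math


def softmax(d):
    """softmax(d_i) = exp(d_i) / sum_j exp(d_j), computed in a numerically stable way."""
    shift = max(d)  # cancels in the ratio, avoids overflow for large entries
    exps = [math.exp(v - shift) for v in d]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)
print("sum =", sum(probs))
```

The outputs are positive and sum to one, so they can be read as class probabilities.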
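[Figure: diagram with nodes labelled 6, 1, 2, 3]
$W_{dP_1} = X_{23} Y_{31} Z_{12} - X_{12} Y_{31} Z_{23} + X_{36} Y_{62} Z_{23} - X_{23} Y_{62} Z_{36} - X_{36} Y_{23} Z_{12} \Phi_{61} + X_{12} Y_{23} Z_{36} \Phi_{61}$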
Part 2 — Applications
… simply because it’s quick for me and I assume people are familiar with it.
Part 3 — Discussion