Statistical Machine Learning
Lecture 04: Optimization Refresher
Kristian Kersting TU Darmstadt
Summer Term 2020
- K. Kersting based on Slides from J. Peters· Statistical Machine Learning· Summer Term 2020
$\sum_{i=1}^{N} \left( y_i - \phi(x_i)^\top \theta \right)^2$
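The sum of squared errors above is minimized in closed form by a linear least-squares solve. The following is a minimal sketch, not the lecture's own code: the feature map `features` (a bias term plus the raw input) and the toy data are assumptions chosen only for illustration, since the slides do not specify φ here.

```python
import numpy as np

def features(x):
    # Hypothetical feature map phi(x) = [1, x]; the slides do not specify phi.
    return np.stack([np.ones_like(x), x], axis=1)

def squared_error(theta, Phi, y):
    # J(theta) = sum_i (y_i - phi(x_i)^T theta)^2
    residual = y - Phi @ theta
    return float(residual @ residual)

# Noise-free toy data generated from y = 1 + 2x (an assumption for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

Phi = features(x)  # design matrix, one row phi(x_i)^T per sample
theta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # least-squares minimizer
```

On this toy data the recovered parameters should match the generating coefficients up to floating-point error, and the squared error should be close to zero.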
$1 - \theta_1^2 - \theta_2^2$
$\theta_1^* = \theta_2^* = \tfrac{1}{2}\lambda^* = \tfrac{1}{2}$
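The solution line above can be checked numerically. This sketch assumes the example is the standard textbook problem of maximizing $1 - \theta_1^2 - \theta_2^2$ subject to $\theta_1 + \theta_2 = 1$, with Lagrangian $L = 1 - \theta_1^2 - \theta_2^2 + \lambda(\theta_1 + \theta_2 - 1)$; the problem statement is an assumption reconstructed from the surviving fragments. Under that assumption the stationarity conditions are linear and can be solved directly.

```python
import numpy as np

# Unknowns ordered as (theta1, theta2, lam).
# Assumed Lagrangian: L = 1 - theta1^2 - theta2^2 + lam * (theta1 + theta2 - 1)
A = np.array([
    [-2.0, 0.0, 1.0],   # dL/dtheta1 = -2*theta1 + lam = 0
    [0.0, -2.0, 1.0],   # dL/dtheta2 = -2*theta2 + lam = 0
    [1.0, 1.0, 0.0],    # constraint: theta1 + theta2 = 1
])
b = np.array([0.0, 0.0, 1.0])

theta1, theta2, lam = np.linalg.solve(A, b)
```

The first two rows give $\theta_1 = \theta_2 = \lambda/2$, and the constraint then fixes $\lambda = 1$, which agrees with the values stated on the slide.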
1 In words: Add the constraints to the objective function using nonnegative Lagrange multipliers. Then solve for the primal variables θ that minimize this Lagrangian. The solution gives the primal variables θ as functions of the Lagrange multipliers λ. Now maximize the resulting function with respect to the dual variables λ under the derived constraints.
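As a small worked illustration of this recipe, consider a one-dimensional problem that is not taken from the slides and is chosen here only because every step can be done by hand: minimize $\theta^2$ subject to $\theta \ge 1$.

```latex
\begin{align*}
&\text{Primal: } \min_{\theta} \; \theta^2 \quad \text{s.t. } 1 - \theta \le 0 \\
&L(\theta, \lambda) = \theta^2 + \lambda (1 - \theta), \qquad \lambda \ge 0 \\
&\frac{\partial L}{\partial \theta} = 2\theta - \lambda = 0
  \;\Rightarrow\; \theta(\lambda) = \tfrac{\lambda}{2} \\
&g(\lambda) = L(\theta(\lambda), \lambda) = \lambda - \tfrac{\lambda^2}{4} \\
&\frac{d g}{d \lambda} = 1 - \tfrac{\lambda}{2} = 0
  \;\Rightarrow\; \lambda^* = 2 \ge 0, \quad \theta^* = \theta(\lambda^*) = 1 \\
&g(\lambda^*) = 1 = (\theta^*)^2
\end{align*}
```

In this example the dual optimum equals the primal optimum, so there is no duality gap; in general the dual only provides a lower bound on the primal minimum.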
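$\max_{y} \min_{x} \phi(x, y) \le \min_{x} \max_{y} \phi(x, y)$
$\min_{\theta} \max_{\lambda \ge 0} L(\theta, \lambda) \ge \max_{\lambda \ge 0} \min_{\theta} L(\theta, \lambda)$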
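The max-min inequality holds for any function of two arguments, which can be checked numerically on a finite grid. In this sketch the table `phi` is a hypothetical stand-in for φ(x, y), with rows indexing x and columns indexing y; the random values and the helper name `max_min_and_min_max` are assumptions for illustration only.

```python
import numpy as np

def max_min_and_min_max(phi):
    # phi[i, j] plays the role of phi(x_i, y_j): rows index x, columns index y.
    max_min = phi.min(axis=0).max()   # max over y of (min over x)
    min_max = phi.max(axis=1).min()   # min over x of (max over y)
    return max_min, min_max

rng = np.random.default_rng(0)
phi = rng.normal(size=(5, 7))         # arbitrary table of function values
max_min, min_max = max_min_and_min_max(phi)
gap = min_max - max_min               # nonnegative by the inequality
```

The same argument applied to the Lagrangian, with θ in the role of x and λ ≥ 0 in the role of y, gives the weak duality statement in the second line.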
[Figure: two surface plots over variables θ1 and θ2; the second panel is titled "Rosenbrock's function"]
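Rosenbrock's function, named in the figure above, is a common test problem because its minimum lies in a narrow curved valley. The sketch below uses the usual parametrization $f(\theta_1, \theta_2) = (1 - \theta_1)^2 + 100\,(\theta_2 - \theta_1^2)^2$; the constants 1 and 100 are an assumption, since the slides do not state which variant is plotted.

```python
import numpy as np

def rosenbrock(theta):
    # f(theta) = (1 - theta1)^2 + 100 * (theta2 - theta1^2)^2 (assumed constants)
    t1, t2 = theta
    return (1.0 - t1) ** 2 + 100.0 * (t2 - t1 ** 2) ** 2

def rosenbrock_grad(theta):
    # Analytic gradient of the function above.
    t1, t2 = theta
    d1 = -2.0 * (1.0 - t1) - 400.0 * t1 * (t2 - t1 ** 2)
    d2 = 200.0 * (t2 - t1 ** 2)
    return np.array([d1, d2])

def numerical_grad(f, theta, eps=1e-6):
    # Central finite differences, used only to sanity-check the analytic gradient.
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = eps
        grad[k] = (f(theta + step) - f(theta - step)) / (2.0 * eps)
    return grad

theta_min = np.array([1.0, 1.0])      # global minimum of this parametrization
theta_start = np.array([-1.2, 1.0])   # a commonly used starting point
```

At `theta_min` both the function value and the gradient vanish, while at `theta_start` the gradient is large and dominated by the steep valley walls.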
[Figure: plot over variables θ1 and θ2]
[Figure: two plots over variables θ1 and θ2]
[Figure: plot over variables θ1 and θ2]
[Figure: plot over variables θ1 and θ2]
[Figure: two plots over variables θ1 and θ2]
$\delta\theta_n^\top H \, \delta\theta_j = 0 \quad \text{for } 0 \le j < n$
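The condition above says that each new search direction is conjugate, with respect to H, to all earlier ones. A minimal sketch of how such directions arise is the linear conjugate gradient method on a quadratic $\tfrac{1}{2}\theta^\top H \theta - b^\top \theta$. This is an illustration under stated assumptions rather than the lecture's algorithm: H is taken to be symmetric positive definite, the function name `conjugate_gradient` and the 2×2 example are hypothetical, and the Fletcher-Reeves formula is one of several possible choices for the direction update.

```python
import numpy as np

def conjugate_gradient(H, b, theta0):
    # Minimizes 0.5 * theta^T H theta - b^T theta for symmetric positive definite H.
    theta = theta0.astype(float)
    r = b - H @ theta                 # residual = negative gradient
    d = r.copy()                      # first direction: steepest descent
    directions = []
    for _ in range(len(b)):
        if np.linalg.norm(r) < 1e-12:
            break
        Hd = H @ d
        alpha = (r @ r) / (d @ Hd)    # exact line search along d
        theta = theta + alpha * d
        r_new = r - alpha * Hd
        beta = (r_new @ r_new) / (r @ r)   # Fletcher-Reeves coefficient
        directions.append(d)
        d = r_new + beta * d          # next direction, H-conjugate to the previous ones
        r = r_new
    return theta, directions

H = np.array([[4.0, 1.0], [1.0, 3.0]])   # hypothetical SPD Hessian
b = np.array([1.0, 2.0])
theta, dirs = conjugate_gradient(H, b, np.zeros(2))
conjugacy = dirs[1] @ H @ dirs[0]        # should vanish up to rounding
```

For an n-dimensional quadratic with exact line searches, the method reaches the minimizer in at most n steps, which is why the two-dimensional example needs only two directions.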
[Figure: three plots of a quadratic function over variables θ1 and θ2]
[Figure: plot over variables θ1 and θ2]