Basics of Numerical Optimization: Preliminaries
Ju Sun
Computer Science & Engineering University of Minnesota, Twin Cities
February 11, 2020
Supervised learning as function approximation
Underlying true function: $f_0$
$\min_{f \in \mathcal{H}} \; \ldots$
$\min_{W} \; \ldots$
$\frac{\partial f}{\partial x_j}(x)$
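Quotient rule (for scalar-valued $f, g$ with $g(x) \neq 0$): $f/g$ is differentiable at $x$ and
$$\nabla\left(\frac{f}{g}\right)(x) = \frac{g(x)\,\nabla f(x) - f(x)\,\nabla g(x)}{g^2(x)}.$$
Chain rule (for scalar-valued $h$): $\nabla (h \circ f)(x) = J_f^\top(x)\, \nabla h(f(x)).$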
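The chain rule $\nabla (h \circ f)(x) = J_f^\top(x)\, \nabla h(f(x))$ can be sanity-checked numerically. The sketch below is only an illustration: the maps `f` and `h`, the test point, and the step size are arbitrary choices made here, not taken from the slides.

```python
# Hypothetical example maps: f: R^2 -> R^2 and h: R^2 -> R.
def f(x):
    return [x[0] * x[1], x[0] + x[1] ** 2]

def jacobian_f(x):
    # Row k holds the partial derivatives of the k-th output of f.
    return [[x[1], x[0]], [1.0, 2.0 * x[1]]]

def h(y):
    return y[0] ** 2 + 3.0 * y[1]

def grad_h(y):
    return [2.0 * y[0], 3.0]

def chain_rule_grad(x):
    # Computes J_f(x)^T * grad_h(f(x)).
    J = jacobian_f(x)
    g = grad_h(f(x))
    return [sum(J[k][i] * g[k] for k in range(len(g))) for i in range(len(x))]

def finite_diff_grad(func, x, eps=1e-6):
    # Central differences, one coordinate at a time.
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        grad.append((func(xp) - func(xm)) / (2.0 * eps))
    return grad

x0 = [1.5, -0.5]
analytic = chain_rule_grad(x0)
numeric = finite_diff_grad(lambda x: h(f(x)), x0)
max_gap = max(abs(a - n) for a, n in zip(analytic, numeric))
```

Here `max_gap` should be tiny, since the composite is a polynomial and central differences are accurate up to rounding and a small truncation error.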
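Second-order partial derivatives: $\frac{\partial^2 f}{\partial x_j \partial x_i}(x) = \left[\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)\right](x).$
Symmetry: if $\frac{\partial^2 f}{\partial x_j \partial x_i}(x)$ and $\frac{\partial^2 f}{\partial x_i \partial x_j}(x)$ exist and both are continuous at $x$, then they are equal.
Hessian: $\nabla^2 f(x) = \left[\frac{\partial^2 f}{\partial x_j \partial x_i}(x)\right]_{j,i}$, whose $(j,i)$-th entry is $\frac{\partial^2 f}{\partial x_j \partial x_i}(x).$
If all $\frac{\partial^2 f}{\partial x_j \partial x_i}(x)$ exist and are continuous, $f$ is twice continuously differentiable.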
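Symmetry of the mixed second partial derivatives can be observed numerically by differentiating an analytic gradient with finite differences, so that entry $(j,i)$ approximates $\frac{\partial}{\partial x_j}\left(\frac{\partial f}{\partial x_i}\right)$. This is a minimal sketch; the function $f(x) = x_0^2 x_1 + \sin(x_0 x_1)$, the test point, and the step size are assumptions made for illustration.

```python
import math

def grad_f(x):
    # Analytic gradient of the hypothetical f(x) = x0^2 * x1 + sin(x0 * x1).
    c = math.cos(x[0] * x[1])
    return [2.0 * x[0] * x[1] + x[1] * c, x[0] ** 2 + x[0] * c]

def hessian_fd(grad, x, eps=1e-5):
    # H[j][i] approximates d/dx_j of (df/dx_i) by central differences.
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        gp, gm = grad(xp), grad(xm)
        for i in range(n):
            H[j][i] = (gp[i] - gm[i]) / (2.0 * eps)
    return H

x0 = [0.7, -1.2]
H = hessian_fd(grad_f, x0)
asymmetry = abs(H[0][1] - H[1][0])
```

Because the two off-diagonal entries are computed from different gradient components, a small `asymmetry` is a genuine check rather than a consequence of the formula used.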
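Taylor's theorem, vector version ($f: \mathbb{R}^n \to \mathbb{R}$):
First order: $f(x + \delta) = f(x) + \langle \nabla f(x), \delta \rangle + o(\|\delta\|_2)$ as $\delta \to 0$.
Second order: $f(x + \delta) = f(x) + \langle \nabla f(x), \delta \rangle + \frac{1}{2} \langle \delta, \nabla^2 f(x)\, \delta \rangle + o(\|\delta\|_2^2)$ as $\delta \to 0$.
Matrix version ($f: \mathbb{R}^{m \times n} \to \mathbb{R}$): $f(X + \Delta) = f(X) + \langle \nabla f(X), \Delta \rangle + o(\|\Delta\|_F)$ as $\Delta \to 0$.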
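The statement that the first-order remainder $f(x+\delta) - f(x) - \langle \nabla f(x), \delta \rangle$ is $o(\|\delta\|_2)$ as $\delta \to 0$ can be illustrated by shrinking $\delta$ along a fixed direction and watching the ratio of remainder to $\|\delta\|_2$ go to zero. The function, base point, and direction below are hypothetical choices for this sketch.

```python
import math

def f(x):
    # Hypothetical smooth test function.
    return math.exp(x[0]) + x[0] * x[1] ** 2

def grad_f(x):
    return [math.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]]

def first_order_remainder_ratio(x, delta):
    # |f(x + delta) - f(x) - <grad f(x), delta>| / ||delta||_2
    shifted = [xi + di for xi, di in zip(x, delta)]
    linear = f(x) + sum(g * d for g, d in zip(grad_f(x), delta))
    norm = math.sqrt(sum(d * d for d in delta))
    return abs(f(shifted) - linear) / norm

x0 = [0.3, 0.8]
direction = [1.0, -2.0]
ratios = []
for t in (1e-1, 1e-2, 1e-3):
    delta = [t * d for d in direction]
    ratios.append(first_order_remainder_ratio(x0, delta))
```

For a twice differentiable function the ratio shrinks roughly in proportion to $\|\delta\|_2$, which is what the entries of `ratios` should show.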
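Asymptotic uniqueness of the Taylor approximation.
Scalar case: the $k$-th order Taylor polynomial is $P_k(\delta) = f(x) + \sum_{i=1}^{k} \frac{1}{i!} f^{(i)}(x)\, \delta^{i}$.
Vector case, for $f$ twice differentiable at $x$: if $P(\delta) = f(x) + \langle v, \delta \rangle + \frac{1}{2} \langle \delta, H\delta \rangle$ with $H$ symmetric satisfies $f(x + \delta) - P(\delta) = o(\|\delta\|_2^2)$ as $\delta \to 0$, then $P(\delta) = f(x) + \langle \nabla f(x), \delta \rangle + \frac{1}{2} \langle \delta, \nabla^2 f(x)\, \delta \rangle$.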
Directional derivative: $D_v f(x) = \frac{d}{dt} f(x + tv)\big|_{t=0}$.
Directional curvature along $u$: $\frac{\langle u, \nabla^2 f(x)\, u \rangle}{\|u\|_2^2}$.
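A directional derivative $\frac{d}{dt} f(x + tv)\big|_{t=0}$ equals $\langle \nabla f(x), v \rangle$ when $f$ is differentiable, and the second derivative of $t \mapsto f(x + tu)$ at $t = 0$, divided by $\|u\|_2^2$, gives a curvature along $u$ that does not depend on the length of $u$. The sketch below checks both on a hypothetical quadratic; the function, point, and directions are assumptions for illustration.

```python
def f(x):
    # Hypothetical quadratic: f(x) = x0^2 + 3*x0*x1 + 2*x1^2.
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2

def grad_f(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]

HESSIAN = [[2.0, 3.0], [3.0, 4.0]]  # constant Hessian of the quadratic

def along(x, v, t):
    # Value of f on the line x + t*v.
    return f([xi + t * vi for xi, vi in zip(x, v)])

def directional_derivative(x, v, h=1e-5):
    # Central difference for d/dt f(x + t v) at t = 0.
    return (along(x, v, h) - along(x, v, -h)) / (2.0 * h)

def directional_curvature(x, u, h=1e-3):
    # Second difference of t -> f(x + t u), normalized by ||u||_2^2.
    second = (along(x, u, h) - 2.0 * along(x, u, 0.0) + along(x, u, -h)) / (h * h)
    return second / sum(ui * ui for ui in u)

x0 = [1.0, -1.0]
v = [2.0, 1.0]
dd_numeric = directional_derivative(x0, v)
dd_analytic = sum(g * vi for g, vi in zip(grad_f(x0), v))
curv = directional_curvature(x0, v)
curv_scaled = directional_curvature(x0, [3.0 * vi for vi in v])
curv_analytic = sum(
    v[i] * HESSIAN[i][j] * v[j] for i in range(2) for j in range(2)
) / sum(vi * vi for vi in v)
```

Rescaling the direction by 3 leaves the curvature unchanged, which is the point of normalizing by $\|u\|_2^2$.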
$\min_{x} f(x) \quad \text{s.t.} \quad x \in C.$
Credit: study.com
$\min_{x \in \mathbb{R}^n} f(x)$
Credit: Wikipedia