Calculating Hypergradient
Jingchang Liu, November 13, 2019
HKUST
Table of Contents
  Background
  Bilevel optimization
  Forward and Reverse Gradient-Based Hyperparameter Optimization
  Conclusion
  Q & A

Background: Hyperparameters
Hyperparameter optimization is a bilevel problem: the hyperparameters λ ∈ D are chosen to minimize a validation objective evaluated at the minimizer of the training objective,

    min_{λ∈D} f(λ),   f(λ) = E(x_λ, λ),   x_λ = argmin_{x∈R^p} L(x, λ),

where x ∈ R^p are the model parameters, L is the training loss and E is the validation error.
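To make the bilevel structure concrete, here is a minimal sketch (not from the slides): the inner problem is assumed to be ridge regression, so evaluating the outer objective f(λ) amounts to solving the training problem at the given λ and measuring the validation error. The data and the value of λ are made up for illustration.

```python
import numpy as np

def inner_solution(lam, X_tr, y_tr):
    """x_lam = argmin_x ||X_tr x - y_tr||^2 + lam ||x||^2 (ridge, closed form)."""
    p = X_tr.shape[1]
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(p), X_tr.T @ y_tr)

def f(lam, X_tr, y_tr, X_val, y_val):
    """Outer objective f(lam): validation error E at the inner minimizer x_lam."""
    x_lam = inner_solution(lam, X_tr, y_tr)
    return np.sum((X_val @ x_lam - y_val) ** 2)

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(50, 10)), rng.normal(size=50)
X_val, y_val = rng.normal(size=(20, 10)), rng.normal(size=20)
print(f(0.1, X_tr, y_tr, X_val, y_val))
```

The hypergradient is df/dλ; the rest of the talk is about ways to compute it without treating the inner solve as a black box.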
Example: the inner problem trains model weights w ∈ R^p by minimizing a regularized training loss L, with the regularization strength playing the role of the hyperparameter λ.
Example: the hyperparameters can also be a binary mask θ ∈ {0,1}^{P×L} applied to a weight matrix w ∈ R^{P×L}; the outer objective is the validation error C(ŵ) = ||Y′ − X′ŵ||^2 measured on a held-out set (X′, Y′).
Warm-up: implicit differentiation. Suppose y = g(x) is defined implicitly by the first-order condition ∂f(x, y)/∂y = 0. What is g′(x)?

Scalar case: write f_XY = ∂²f/∂x∂y and f_YY = ∂²f/∂y². Differentiating the optimality condition along x,

    d/dx [∂f(x, g(x))/∂y] = f_XY(x, g(x)) + f_YY(x, g(x)) g′(x) = 0,

so

    g′(x) = −f_YY(x, g(x))^{-1} f_XY(x, g(x)).

Vector case: for y ∈ R^n and scalar x the same formula holds with f_YY = ∇²_{yy} f(x, y) ∈ R^{n×n} and f_XY = (∂/∂x) ∇_y f(x, y) ∈ R^n.
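A small numerical check of this formula, with a toy f chosen by us (not from the slides): the inner minimizer g(x) has a closed form, so the implicit-differentiation result can be compared against a finite difference.

```python
import numpy as np

# Toy objective f(x, y) = (y - sin x)^2 + 0.1 y^2 (our choice, for illustration only).
# The condition df/dy = 0 defines y = g(x) implicitly; here it is solvable in closed form.
def g(x):
    return np.sin(x) / 1.1            # from 2(y - sin x) + 0.2 y = 0

def f_XY(x, y):
    return -2.0 * np.cos(x)           # d^2 f / dx dy

def f_YY(x, y):
    return 2.2                        # d^2 f / dy^2

x = 0.7
g_prime = -f_XY(x, g(x)) / f_YY(x, g(x))           # implicit-differentiation formula
g_prime_fd = (g(x + 1e-5) - g(x - 1e-5)) / 2e-5    # finite-difference check
print(g_prime, g_prime_fd)                          # both approx cos(0.7)/1.1
```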
Back to the bilevel problem, now writing g for the validation objective and h for the training objective:

    min_{λ∈D} f(λ) = g(x_λ, λ),   x_λ = argmin_{x∈R^p} h(x, λ).

Applying the same implicit differentiation to the inner optimality condition ∇_1 h(x_λ, λ) = 0 gives the hypergradient

    ∇f(λ) = ∇_2 g(x_λ, λ) − (∇²_{1,2} h(x_λ, λ))^T (∇²_1 h(x_λ, λ))^{-1} ∇_1 g(x_λ, λ).

In practice the inner problem and the linear system are solved only approximately. If the sequence of inner tolerances {ε_i}_{i=1}^∞ is positive and verifies ∑_{i=1}^∞ ε_i < ∞, i.e. it is summable, then this implies the approximate hypergradients converge to the exact one.
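A sketch of this implicit hypergradient for the ridge-regression case (our own instantiation, not from the slides): h is the ℓ2-regularized training loss, g is the validation squared error (so ∇_2 g = 0), and the result is checked against a finite difference. The data and λ are arbitrary.

```python
import numpy as np

def ridge_w(lam, X, y):
    """Inner minimizer x_lam = argmin_w ||X w - y||^2 + lam ||w||^2 (closed form)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def val_loss(lam, X_tr, y_tr, X_val, y_val):
    w = ridge_w(lam, X_tr, y_tr)
    return np.sum((X_val @ w - y_val) ** 2)

def implicit_hypergradient(lam, X_tr, y_tr, X_val, y_val):
    p = X_tr.shape[1]
    w = ridge_w(lam, X_tr, y_tr)
    H = 2 * (X_tr.T @ X_tr + lam * np.eye(p))     # Hessian of h in w
    cross = 2 * w                                 # d/dlam of grad_w h (lam is scalar here)
    g1 = 2 * X_val.T @ (X_val @ w - y_val)        # grad_w g; g has no direct lam-dependence
    return -cross @ np.linalg.solve(H, g1)

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(50, 10)), rng.normal(size=50)
X_val, y_val = rng.normal(size=(20, 10)), rng.normal(size=20)
lam = 0.5
fd = (val_loss(lam + 1e-5, X_tr, y_tr, X_val, y_val)
      - val_loss(lam - 1e-5, X_tr, y_tr, X_val, y_val)) / 2e-5
print(implicit_hypergradient(lam, X_tr, y_tr, X_val, y_val), fd)   # the two numbers agree
```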
Forward and Reverse Gradient-Based Hyperparameter Optimization

The hyperparameter objective is again

    min_{λ∈Λ} f(λ),   f(λ) = E(s_T),

where the state s_t ∈ R^d (the model parameters, possibly with optimizer state) follows the training dynamics

    s_t = Φ_t(s_{t-1}, λ),   t = 1, ..., T,

and λ ∈ R^m are the hyperparameters.
Forward mode. Let Z_t = ds_t/dλ; it is the d × m matrix of derivatives of the state with respect to the hyperparameters. To compute df/dλ, we rewrite it as

    df/dλ = ∇E(s_T)^T Z_T,

and propagate Z_t forward along the dynamics:

    Z_t = A_t Z_{t-1} + B_t,   A_t = ∂Φ_t/∂s_{t-1},   B_t = ∂Φ_t/∂λ,

with Z_0 = 0.
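A minimal sketch of the forward-mode recursion, assuming gradient descent on a ridge-regression training loss as the dynamics Φ_t (so A_t and B_t have closed forms); the data, step size η, and horizon T are made up for illustration.

```python
import numpy as np

def forward_hypergradient(lam, X_tr, y_tr, X_val, y_val, T=300, eta=0.01):
    """Forward-mode hypergradient of the validation error after T gradient-descent steps."""
    p = X_tr.shape[1]
    # Dynamics: s_t = Phi(s_{t-1}, lam) = s_{t-1} - eta * grad_s [||X_tr s - y_tr||^2 + lam ||s||^2]
    A = np.eye(p) - eta * 2 * (X_tr.T @ X_tr + lam * np.eye(p))   # A_t = dPhi/ds_{t-1} (constant here)
    s = np.zeros(p)   # s_0
    Z = np.zeros(p)   # Z_0 = ds_0/dlam; lam is a scalar, so Z_t is a p-vector
    for _ in range(T):
        B = -eta * 2 * s                                           # B_t = dPhi/dlam at s_{t-1}
        Z = A @ Z + B                                              # Z_t = A_t Z_{t-1} + B_t
        s = s - eta * (2 * X_tr.T @ (X_tr @ s - y_tr) + 2 * lam * s)   # s_t
    grad_E = 2 * X_val.T @ (X_val @ s - y_val)                     # grad E(s_T)
    return grad_E @ Z                                              # df/dlam = grad E(s_T)^T Z_T

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(30, 5)), rng.normal(size=30)
X_val, y_val = rng.normal(size=(15, 5)), rng.normal(size=15)
print(forward_hypergradient(0.5, X_tr, y_tr, X_val, y_val))
```

Forward mode carries one Z column per hyperparameter, so its cost scales with m; it is attractive when the number of hyperparameters is small.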
Reverse mode. Treat the dynamics as constraints and form the Lagrangian in the variables λ, s_1, ..., s_T with multipliers α_1, ..., α_T:

    L(λ, s, α) = E(s_T) + Σ_{t=1}^T α_t (Φ_t(s_{t-1}, λ) − s_t).
Setting ∂L/∂s_t = 0 for t < T and ∂L/∂s_T = 0 gives the backward (adjoint) recursion

    α_T = ∇E(s_T)^T,   α_t = α_{t+1} A_{t+1},   t = T−1, ..., 1,

and the hypergradient is

    ∂L/∂λ = Σ_{t=1}^T α_t B_t.
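The matching reverse-mode sketch under the same assumptions as the forward-mode example (gradient-descent dynamics on a ridge training loss, scalar λ; data and names are illustrative): the states are stored during the forward pass and the multipliers α_t are propagated backward.

```python
import numpy as np

def reverse_hypergradient(lam, X_tr, y_tr, X_val, y_val, T=300, eta=0.01):
    """Reverse-mode hypergradient: forward pass stores states, backward pass propagates multipliers."""
    p = X_tr.shape[1]
    A = np.eye(p) - eta * 2 * (X_tr.T @ X_tr + lam * np.eye(p))   # A_t = dPhi/ds_{t-1} (constant here)
    states, s = [], np.zeros(p)
    for _ in range(T):                     # forward pass; states[t-1] holds s_{t-1}
        states.append(s)
        s = s - eta * (2 * X_tr.T @ (X_tr @ s - y_tr) + 2 * lam * s)
    alpha = 2 * X_val.T @ (X_val @ s - y_val)   # alpha_T = grad E(s_T)
    hypergrad = 0.0
    for t in range(T, 0, -1):              # backward pass
        B_t = -eta * 2 * states[t - 1]     # B_t = dPhi/dlam at s_{t-1}
        hypergrad += alpha @ B_t           # accumulate alpha_t B_t
        alpha = alpha @ A                  # alpha_{t-1} = alpha_t A_t
    return hypergrad

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(30, 5)), rng.normal(size=30)
X_val, y_val = rng.normal(size=(15, 5)), rng.normal(size=15)
print(reverse_hypergradient(0.5, X_tr, y_tr, X_val, y_val))  # matches the forward-mode value
```

Reverse mode costs a single backward sweep regardless of m, but requires storing (or recomputing) the trajectory s_0, ..., s_{T-1}.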
Conclusion

Q & A