Dynamic Stochastic Optimization
Bill (Lin-Liang) Wu
University of Toronto linliang.wu@mail.utoronto.ca
July 3, 2014
1. Theory: Introduction, Definitions, Methodology
2. Applications: Production-consumption model, Portfolio allocation
3. More Theory!: Dynamic programming principle, Verification theorem, Viscosity solutions
Study of optimization problems subject to randomness: deterministic vs non-deterministic optimization.
Deterministic: no randomness in how future states develop; the same initial condition always produces the same output.
Non-deterministic: dynamical systems subject to random perturbations, or to the subjective decisions of agents.
Goal: find an optimal control that optimizes some performance criterion.
Definition (Probability space)
(Ω, F, P), where Ω is the sample space, F is a σ-field on Ω, and P is a probability measure.
Definition (Measurable)
A function f : Ω → [−∞, ∞] is measurable if {f ∈ B} ∈ F for all B ∈ B(R), where B(R) is the Borel σ-field, i.e. the smallest σ-field on R containing every open interval: B(R) = ∩{F : F is a σ-field on R with (a, b) ∈ F for every interval (a, b) ⊂ R}.
Definition (Stochastic process)
A sequence of random variables, i.e. F-measurable functions X(n) : Ω → R for n ≥ 0, denoted X = (X(n))n≥0.
Definition (Filtration)
Sequence of σ-fields Fn such that Fn ⊂ F and Fn ⊂ Fn+1. A process X is adapted if each X(n) is Fn-measurable
Definition (Brownian motion)
A mapping W : [0, ∞) × Ω → R on a probability space (Ω, F, P), measurable with respect to the product σ-field B([0, ∞)) × F = σ{B × A : B ∈ B([0, ∞)), A ∈ F}, such that
1. W(0) = 0 a.s. (P);
2. for 0 ≤ s < t < ∞, W(t) − W(s) has a normal distribution with mean zero and variance t − s;
3. for all m and all 0 = t₀ ≤ t₁ ≤ t₂ ≤ ⋯ ≤ tₘ, the increments W(tₙ₊₁) − W(tₙ), n = 0, 1, …, m − 1, are independent.
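A minimal simulation sketch (our illustration, not from the talk): properties 1–3 say a Brownian path can be built from independent N(0, Δt) increments, which is exactly what the Python snippet below does, checking the variance of one increment at the end.

import numpy as np

rng = np.random.default_rng(0)
T, n, paths = 1.0, 1000, 10000          # horizon, steps, sample paths (illustrative)
dt = T / n
# Properties 2-3: increments are independent N(0, dt); cumulative sums give the path
increments = rng.normal(0.0, np.sqrt(dt), size=(paths, n))
W = np.cumsum(increments, axis=1)       # W(0) = 0 is implicit (property 1)
# Check property 2: Var[W(1) - W(0.5)] should be 1 - 0.5 = 0.5
print(np.var(W[:, -1] - W[:, n // 2 - 1]))   # ~0.5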
Definition (Stochastic differential equation (SDE))
dX(t) = a(t, X(t)) dt + b(t, X(t)) dW(t), where a(t, x), b(t, x) : R² → R. In integral form,
X(t) = X(0) + ∫₀ᵗ a(s, X(s)) ds + ∫₀ᵗ b(s, X(s)) dW(s),
where the first integral is a Riemann or Lebesgue integral and the second is an Itô integral.
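The integral form suggests the simplest numerical scheme, Euler–Maruyama: over a small step Δt, replace the two integrals by a(t, X(t))Δt and b(t, X(t))ΔW. A minimal sketch (our illustration; the coefficients in the example are arbitrary choices):

import numpy as np

def euler_maruyama(a, b, x0, T=1.0, n=1000, rng=None):
    """Simulate dX = a(t, X) dt + b(t, X) dW on [0, T] with n Euler steps."""
    rng = rng or np.random.default_rng()
    dt = T / n
    t, x, path = 0.0, x0, [x0]
    for _ in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x = x + a(t, x) * dt + b(t, x) * dW
        t += dt
        path.append(x)
    return np.array(path)

# Example: geometric Brownian motion, a(t, x) = 0.05x, b(t, x) = 0.2x
path = euler_maruyama(lambda t, x: 0.05 * x, lambda t, x: 0.2 * x, x0=1.0)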
Theorem (Itô formula)
If F : R → R is of class C², then
F(W(t)) = F(0) + ∫₀ᵗ F′(W(s)) dW(s) + ½ ∫₀ᵗ F″(W(s)) ds.
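A standard worked example (added here for concreteness, not on the original slide): take F(x) = x², so F′(x) = 2x and F″(x) = 2, and the formula gives
W(t)² = 2 ∫₀ᵗ W(s) dW(s) + t, i.e. ∫₀ᵗ W(s) dW(s) = (W(t)² − t)/2.
Taking expectations recovers E[W(t)²] = t; the extra t term is exactly what separates the Itô integral from the classical chain rule.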
There are four components in a stochastic optimization problem.
State of the system: Start with a dynamical system evolving on a probability space (Ω, F, P). The state of the system at time t in a world scenario ω ∈ Ω is denoted Xₜ(ω); the evolution of the system is characterized by the mapping t → Xₜ(ω) through a stochastic differential equation, in particular geometric Brownian motion.
Control: The evolution t → Xₜ of the system is influenced by a control α whose value αₜ is given at time t; controls satisfying the problem's constraints are called admissible controls.
Performance criterion: The goal is to optimize, over all admissible controls, the functional
J(X, α) = E[∫₀^∞ e^(−θt) f(Xₜ, αₜ) dt],
where f is the reward function and θ > 0 is the discount factor.
Value function: V = sup_α J(X, α).
The main goal in a stochastic optimization problem is to find an optimal control attaining the value function.
Hamilton–Jacobi–Bellman: Given our state system X = {X(t) : t ≥ 0}, characterized by dX(t) = µ(X(t), α(t)) dt + σ(X(t), α(t)) dB(t) for a Brownian motion B = {B(t) : t ≥ 0}, the HJB equation for the value function V(x) is
sup_α { f(x, α) + µ(x, α) Vₓ(x) + (σ²(x, α)/2) Vₓₓ(x) − θV(x) } = 0.
Hard to solve analytically, so it is usually approximated numerically, e.g. by finite difference methods (finding suitable schemes is ongoing research in numerical analysis and PDEs); see the sketch below.
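A minimal finite-difference sketch (our illustration; the data f(x, a) = −(x² + a²), µ(x, a) = a, constant σ, and all grid choices are arbitrary stand-ins, not from the talk). It evaluates the sup over a control grid pointwise and drives V toward a root of the Hamiltonian by explicit pseudo-time stepping:

import numpy as np

theta, sigma = 1.0, 0.5
x = np.linspace(-2.0, 2.0, 81)
dx = x[1] - x[0]
controls = np.linspace(-2.0, 2.0, 41)    # the sup is truncated to this grid
V = np.zeros_like(x)
dt = 0.2 * dx**2 / sigma**2              # small pseudo-time step for stability

for _ in range(200000):
    Vx = np.gradient(V, dx)              # central first derivative
    Vxx = np.zeros_like(V)
    Vxx[1:-1] = (V[2:] - 2 * V[1:-1] + V[:-2]) / dx**2   # crude boundary handling
    # Pointwise sup of the Hamiltonian over the control grid
    H = np.max([-(x**2 + a**2) + a * Vx + 0.5 * sigma**2 * Vxx - theta * V
                for a in controls], axis=0)
    if np.max(np.abs(H)) < 1e-8:         # HJB residual ~ 0: approximate solution
        break
    V = V + dt * H                       # step V toward a root of the Hamiltonian

The argmax over the control grid at each x then gives an approximate optimal feedback control; serious schemes use upwinding and provably monotone discretizations instead.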
Model for a production unit:
Capital value Kₜ, which evolves according to the investment rate Iₜ in capital and the price Sₜ per unit of capital:
dKₜ = Kₜ (dSₜ/Sₜ) + Iₜ dt
Debt Lₜ of the production unit, which evolves in terms of the interest rate r, the consumption rate Cₜ, and the productivity rate Pₜ of capital:
dLₜ = rLₜ dt − (Kₜ/Sₜ) dPₜ + (Iₜ + Cₜ) dt
Choose a dynamic model for (Sₜ, Pₜ). SDEs:
dSₜ = µSₜ dt + σ₁Sₜ dW¹ₜ
dPₜ = b dt + σ₂ dW²ₜ
where (W¹, W²) is a 2-dimensional Brownian motion on a filtered probability space (Ω, F, F = (Fₜ)ₜ, P), and µ, b, σ₁, σ₂ are constants with σ₁, σ₂ > 0.
Dynamics:
Net value of the production unit: Xₜ = Kₜ − Lₜ. Impose the constraints Kₜ ≥ 0, Cₜ ≥ 0, Xₜ ≥ 0 for t ≥ 0.
Control variables: kₜ = Kₜ/Xₜ and cₜ = Cₜ/Xₜ for investment and consumption.
Dynamics of the controlled system:
dXₜ = Xₜ[kₜ(µ − r + b/Sₜ) + (r − cₜ)] dt + kₜXₜσ₁ dW¹ₜ + kₜ(Xₜ/Sₜ)σ₂ dW²ₜ
dSₜ = µSₜ dt + σ₁Sₜ dW¹ₜ
Given a discount factor β > 0 and a utility function U, the objective is to determine the optimal investment and consumption for the production unit:
sup_(k,c) E[∫₀^∞ e^(−βt) U(cₜ, Xₜ) dt]
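A Monte Carlo sketch of this criterion (our illustration: all constants are arbitrary, the infinite horizon is truncated at T, and U(cₜ, Xₜ) is replaced by the stand-in log(cₜXₜ), a utility of the consumption flow). It evaluates the objective for one fixed constant policy (k, c) rather than optimizing:

import numpy as np

rng = np.random.default_rng(1)
mu, r, b, s1, s2, beta = 0.08, 0.03, 0.05, 0.2, 0.1, 0.5
k, c = 0.6, 0.1                        # one fixed constant policy (k_t, c_t)
T, n, paths = 20.0, 2000, 5000         # horizon truncation, steps, sample paths
dt = T / n
X, S, J = np.ones(paths), np.ones(paths), np.zeros(paths)
for i in range(n):
    t = i * dt
    dW1 = rng.normal(0, np.sqrt(dt), paths)
    dW2 = rng.normal(0, np.sqrt(dt), paths)
    J += np.exp(-beta * t) * np.log(np.maximum(c * X, 1e-12)) * dt   # running utility
    dX = (X * (k * (mu - r + b / S) + (r - c)) * dt
          + k * X * s1 * dW1 + k * (X / S) * s2 * dW2)
    S = S + mu * S * dt + s1 * S * dW1
    X = np.maximum(X + dX, 1e-12)      # crude enforcement of X_t >= 0
print(J.mean())                        # estimate of E[int e^(-beta t) U dt]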
Model:
Financial market consisting of a riskless asset with strictly positive price process S⁰, representing the savings account, and n risky assets with price process S, representing stocks. Investing in this market at time t, the number of shares held in the savings account is (Xₜ − αₜ·Sₜ)/S⁰ₜ.
SDE: The self-financed wealth process evolves according to
dXₜ = (Xₜ − αₜ·Sₜ) (dS⁰ₜ/S⁰ₜ) + αₜ·dSₜ
Dynamics:
The control is the process α, and the portfolio allocation problem is to choose the best investment in these assets. Denoting the utility function by U, we thus need sup_α E[U(X_T)] for the investment horizon T; a brute-force sketch follows.
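A brute-force sketch (our illustration: one risky asset following geometric Brownian motion, U = log, all market constants arbitrary): estimate E[U(X_T)] by Monte Carlo for each constant number of shares α on a grid, and keep the best.

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, r, T, n, paths = 0.08, 0.2, 0.03, 1.0, 252, 20000
dt = T / n
Z = rng.normal(0, np.sqrt(dt), (paths, n))    # common noise across candidate alphas
best = None
for alpha in np.linspace(0.0, 2.0, 21):       # constant number of risky shares
    X, S = np.ones(paths), np.ones(paths)
    for i in range(n):
        dS = mu * S * dt + sigma * S * Z[:, i]
        X = X + (X - alpha * S) * r * dt + alpha * dS   # self-financing dynamics
        S = S + dS
    EU = np.mean(np.log(np.maximum(X, 1e-12)))          # E[U(X_T)], U = log
    if best is None or EU > best[1]:
        best = (alpha, EU)
print(best)   # candidate allocation with highest estimated expected utility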
Goal: The primary goal of a stochastic control problem is to find an optimal control.
Definition: Dynamic programming is an optimization technique.
Comparison with divide and conquer: both techniques split their input into parts, find subsolutions to the parts, and then synthesize larger solutions from the small ones.
Divide and conquer: splits its input at prespecified, deterministic points (e.g., always in the middle).
Dynamic programming: splits its input at every possible split point rather than at prespecified points; after trying all split points, it determines which split point is optimal (see the sketch below).
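A classic illustration of "try every split point" (our example, not from the talk): rod cutting. Divide and conquer would cut at a fixed position; dynamic programming considers every possible first cut and keeps the best, memoizing subproblems so each length is solved once.

from functools import lru_cache

price = [1, 5, 8, 9, 10, 17, 17, 20]   # price[i-1] = revenue for a piece of length i

@lru_cache(maxsize=None)
def best(n):
    """Maximum revenue for a rod of length n: try every possible first cut."""
    if n == 0:
        return 0
    return max(price[i - 1] + best(n - i) for i in range(1, n + 1))

print(best(8))   # 22: the optimum cuts the rod into lengths 2 and 6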
Idea:
Question: How can we use the dynamic programming principle to compute an optimal control?
Answer: Partition the time interval into smaller chunks and optimize over each piece; the calculation of the optimal control then becomes a pointwise minimization.
HJB equation: Letting t → 0 is how we get the HJB equation: assume that V is smooth enough and apply Itô's formula between 0 and t.
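Filling in that step (a standard derivation sketch, consistent with the DPP stated on the next slide): write the DPP as
0 = inf_α E[∫₀ᵗ e^(−θs) f(Xₛ, αₛ) ds + e^(−θt) V(Xₜ)] − V(x),
apply Itô's formula to e^(−θs)V(Xₛ) on [0, t], divide by t, and let t → 0 to obtain
inf_α { f(x, α) + µ(x, α)Vₓ(x) + (σ²(x, α)/2)Vₓₓ(x) − θV(x) } = 0.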
Model:
Consider the infinite-horizon discounted cost functional
J(x, α) = E[∫₀^∞ e^(−θs) f(Xₛ, α(Xₛ)) ds]
for a discount factor θ. The value function for this stochastic control problem is V(x) = inf_α J(x, α), where the infimum is taken over all control functions. Then for an optimal control function α*, we have, for x ∈ R and t ∈ (0, ∞):
V(x) = J(x, α*) = E[∫₀ᵗ e^(−θs) f(X*ₛ, α*(X*ₛ)) ds] + E[∫ₜ^∞ e^(−θs) f(X*ₛ, α*(X*ₛ)) ds]
⇒ V(x) = inf_α E[∫₀ᵗ e^(−θs) f(Xᵅₛ, α(Xᵅₛ)) ds + e^(−θt) V(Xᵅₜ)]
(Dynamic programming principle)
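A discrete sanity check of the DPP (our illustration: a small random finite-state model, not the talk's continuous one). Compute V by value iteration; the DPP then says that splitting the horizon after t steps changes nothing, i.e. applying the one-step Bellman operator t more times returns V itself:

import numpy as np

rng = np.random.default_rng(3)
nS, nA, gamma = 6, 3, 0.9              # states, actions, discount (plays e^(-theta dt))
f = rng.random((nS, nA))               # running cost f(x, a)
P = rng.random((nA, nS, nS))
P /= P.sum(axis=2, keepdims=True)      # transition kernels P[a, x, x']

def bellman(V):
    """(TV)(x) = min_a [ f(x, a) + gamma * E[V(X')] ]."""
    return np.min(f + gamma * np.einsum('axy,y->xa', P, V), axis=1)

V = np.zeros(nS)
for _ in range(2000):                  # value iteration to the fixed point
    V = bellman(V)

Vt = V.copy()
for _ in range(5):                     # "optimize over the first 5 steps, then use V"
    Vt = bellman(Vt)
print(np.max(np.abs(Vt - V)))          # ~0: the dynamic programming principle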
Idea:
Recap: We start with a control problem and then derive the HJB equation,
sup_α { f(x, α) + µ(x, α) Vₓ(x) + (σ²(x, α)/2) Vₓₓ(x) − θV(x) } = 0,
for the value function V(x).
Verification theorem: If there exists a smooth solution to the HJB equation, this candidate coincides with the value function, and it is unique.
Proof: via Itô's formula; the assumed smoothness of the solution is exactly what allows us to apply it. A sketch follows.
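A sketch of that argument (our summary of the standard proof, under integrability assumptions): let w be a C² solution of the HJB equation and α any admissible control. Itô's formula applied to e^(−θt)w(Xₜ) gives, after taking expectations (the stochastic integral term has zero mean),
E[e^(−θt)w(Xₜ)] = w(x) + E[∫₀ᵗ e^(−θs) (µw′ + (σ²/2)w″ − θw)(Xₛ) ds].
Since w solves the HJB equation, f(Xₛ, αₛ) + µw′ + (σ²/2)w″ − θw ≤ 0 along any admissible control, hence
w(x) ≥ E[∫₀ᵗ e^(−θs) f(Xₛ, αₛ) ds + e^(−θt)w(Xₜ)].
Letting t → ∞ (with e^(−θt)w(Xₜ) → 0) yields w(x) ≥ J(x, α), so w ≥ V; the control attaining the sup in the HJB equation turns every inequality into an equality, giving w = V.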
Problem:
In the classical approach, we must assume a priori that the value function is smooth enough, but in general the value function is not smooth.
Procedure (the direct method):
Begin with the control problem and first prove the dynamic programming principle.
Using the DPP, prove that the value function is a solution of the HJB equation, where the solution is defined in the weaker sense of viscosity solutions.
Establish that the value function is the unique such solution of the HJB equation.
From this, the HJB equation completely characterizes the value function of the control problem, and we can use this characterization to find an optimal control.