

  1. Max-plus Stochastic Processes and Control. W.H. Fleming, Brown University

  2. Outline:
     1. Introduction, historical background
     2. Max-plus expectations
     3. Max-plus SDEs and large deviations
     4. Max-plus martingales and differential rule
     5. Dynamic programming PDEs and variational inequalities
     6. Max-plus stochastic control I: terminal cost
     7. Max-plus optimal control II: max-plus additive running cost
     8. Merton optimal consumption problem

  3. Historical background: a) Optimal deterministic control: Pontryagin's maximum principle, Bellman's dynamic programming principle (1950s). b) Two-player, zero-sum differential games: Isaacs pursuit-evasion games (1950s). c) Stochastic control: deterministic control theory ignores time-varying disturbances in the dynamics; stochastic differential equations serve as models.

  4. Dynamic programming/PDE methods (1960s); changes of probability measure (Girsanov).

  5. d) Freidlin-Wentzell large deviations theory: small random perturbations, rare events (late 1960s). e) H-infinity control theory (1980s): disturbances not modeled as stochastic processes; a min-max viewpoint.

  6. Stochastic vs. deterministic views of uncertainty. v ∈ Ω is an "uncertainty" and J(v) a "criterion" or "cost". Stochastic view: J is a random variable on (Ω, F, P); evaluate E[F(J)]. Nonstochastic view: evaluate max_v J(v).

  7. A less conservative viewpoint: evaluate E^+(J) = max_v [ q(v) + J(v) ], where q(v) is the "likelihood" of v, with q(v) ≤ 0 and q(v_0) = 0.

  8. Connection between the stochastic and nonstochastic views. Take F(J) = F_θ(J) = e^{θJ}, θ a risk sensitivity parameter, and let p_θ(v), the probability of v, satisfy p_θ(v) ∼ e^{θ q(v)}. Then lim_{θ→∞} θ^{−1} log E[ e^{θJ} ] = E^+(J).

  9. 2. Max-plus expectations. Max-plus addition and multiplication, for −∞ ≤ a, b < ∞: a ⊕ b = max(a, b), a ⊗ b = a + b.

  10. Maslov idempotent probability calculus. Q(A) = sup_{v ∈ A} q(v) is the max-plus probability of A ⊂ Ω, and E^+(J) = ⊕_v [ q(v) ⊗ J(v) ] is the max-plus expectation of J. Max-plus linearity: E^+(J_1 ⊕ J_2) = E^+(J_1) ⊕ E^+(J_2) and E^+(c ⊗ J) = c ⊗ E^+(J).
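
As a quick illustration (not from the talk), the sketch below sets up ⊕, ⊗ and E^+ on a finite sample space with illustrative likelihoods q and costs J1, J2, and checks the two max-plus linearity identities numerically.

```python
# Max-plus arithmetic and expectation on a finite sample space; the
# likelihoods and costs are illustrative, not from the talk.
import numpy as np

def oplus(a, b):   # max-plus addition:       a (+) b = max(a, b)
    return np.maximum(a, b)

def otimes(a, b):  # max-plus multiplication: a (x) b = a + b
    return a + b

q  = np.array([0.0, -1.0, -2.5])   # likelihoods: q <= 0, max q = 0
J1 = np.array([1.0, 3.0, 0.5])     # two cost "random variables"
J2 = np.array([2.0, -1.0, 4.0])

def E_plus(J):                     # E+(J) = (+)_v [ q(v) (x) J(v) ]
    return np.max(otimes(q, J))

# Max-plus linearity, as on the slide:
assert np.isclose(E_plus(oplus(J1, J2)), oplus(E_plus(J1), E_plus(J2)))
c = 0.7
assert np.isclose(E_plus(otimes(c, J1)), otimes(c, E_plus(J1)))
print("E+(J1) =", E_plus(J1), " E+(J2) =", E_plus(J2))
```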

  11. 3. Max-plus stochastic differential equations and large deviations (Fleming, Applied Math. Optimiz., 2004). x(s) ∈ R^n solves the ODE dx(s) = f(x(s)) ds + g(x(s)) v(s) ds, t ≤ s ≤ T, with x(t) = x, v(s) ∈ R^d, and v(·) a disturbance control function.

  12. v(·) ∈ Ω = L^2([t, T]; R^d), q(v) = −(1/2) ∫_t^T |v(s)|^2 ds, J(v) = J(x(·)), and E^+[ J(x(·)) ] = sup_{v(·)} [ J(x(·)) − (1/2) ∫_t^T |v(s)|^2 ds ]. Example 1: J(x(·)) = ℓ(x(T)), a terminal cost. Example 2: J(x(·)) = max_{[t,T]} ℓ(x(s)), a max-plus additive running cost.
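
This max-plus expectation is a calculus-of-variations problem, so it can be approximated directly. A minimal numerical sketch, assuming the illustrative one-dimensional data f(x) = −x, g(x) = 1, ℓ(x) = x (not from the talk): discretize v(·) on a time grid, Euler-integrate the ODE, and maximize J + q with a generic optimizer.

```python
# Direct maximization for E+[l(x(T))] with the illustrative 1-d data
# f(x) = -x, g(x) = 1, l(x) = x (not from the talk).
import numpy as np
from scipy.optimize import minimize

t, T, x0, N = 0.0, 1.0, 0.5, 100
ds = (T - t) / N

def neg_value(v):                  # -( l(x(T)) + q(v) )
    x = x0
    for k in range(N):             # Euler: dx = (f(x) + g(x) v) ds
        x += (-x + v[k]) * ds
    q = -0.5 * np.sum(v**2) * ds   # likelihood of this disturbance path
    return -(x + q)

res = minimize(neg_value, np.zeros(N), method="L-BFGS-B")
print("E+[l(x(T))] ~", -res.fun)   # ~0.400 for these data
```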

  13. Assumptions: f, g, ℓ ∈ C^1, with f_x, g, g_x, ℓ, ℓ_x bounded. Connection with large deviations: X^θ(s) solves the SDE dX^θ(s) = f(X^θ(s)) ds + θ^{−1/2} g(X^θ(s)) dw(s), t ≤ s ≤ T, X^θ(t) = x, where w(s) is a d-dimensional Brownian motion.

  14. In Example 1, lim_{θ→∞} θ^{−1} log E[ e^{θ ℓ(X^θ(T))} ] = E^+[ ℓ(x(T)) ]. In Example 2, lim_{θ→∞} θ^{−1} log E[ ∫_t^T e^{θ ℓ(X^θ(s))} ds ] = E^+[ max_{[t,T]} ℓ(x(s)) ]. If L = e^ℓ, then L^θ = e^{θℓ}.
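
A Monte Carlo sketch of the Example 1 limit for the same illustrative linear data (f(x) = −x, g = 1, ℓ(x) = x; not from the talk). For this linear-Gaussian case θ^{−1} log E[e^{θℓ(X^θ(T))}] equals its limit for every θ, so only sampling noise (which grows with θ) separates the printed values from E^+ ≈ 0.400, the number the optimizer above also finds.

```python
# Monte Carlo sketch of the Example 1 limit, illustrative data
# f(x) = -x, g = 1, l(x) = x (not from the talk).
import numpy as np

rng = np.random.default_rng(0)
t, T, x0, N, paths = 0.0, 1.0, 0.5, 100, 200_000
ds = (T - t) / N

def log_moment(theta):
    x = np.full(paths, x0)
    for _ in range(N):   # Euler-Maruyama: dX = -X ds + theta^{-1/2} dw
        x += -x * ds + theta**-0.5 * rng.normal(0.0, np.sqrt(ds), paths)
    return np.log(np.mean(np.exp(theta * x))) / theta

# Closed-form sup_v [ x(T) - (1/2) int |v|^2 ds ] for this linear case:
E_plus = x0 * np.exp(-T) + (1 - np.exp(-2 * T)) / 4
for theta in (1.0, 4.0, 16.0):
    print(theta, log_moment(theta), "vs E+ =", E_plus)
```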

  15. 4. Max-plus martingales and differential rule. Conditional likelihood of v given A ⊂ Ω: q(v | A) = q(v) − sup_{ω ∈ A} q(ω) if v ∈ A, and q(v | A) = −∞ if v ∉ A. With v^τ = v|_{[t,τ]}, this gives q(v | v^τ) = −(1/2) ∫_τ^T |v(s)|^2 ds. M(s) = M(s, v^s) is a max-plus martingale if E^+[ M(s) | v^τ ] = M(τ) for t ≤ τ < s ≤ T.
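
One consequence worth making concrete (my extrapolation, not stated on the slide) is the tower property E^+[ E^+[J | v^τ] ] = E^+(J) that the conditional likelihood formula delivers. A toy two-step path space with illustrative numbers:

```python
# Toy two-step path space: the additive likelihood q(v1, v2) and the
# conditional-likelihood formula give the tower property
#   E+[ E+[J | v1] ] = E+(J).   All numbers are illustrative.
import numpy as np

q1 = np.array([0.0, -1.0])           # step-1 likelihoods, max = 0
q2 = np.array([0.0, -0.5, -2.0])     # step-2 likelihoods, max = 0
J  = np.array([[1.0, 4.0, 0.0],
               [5.0, 2.0, 3.0]])     # J(v1, v2)
q = q1[:, None] + q2[None, :]        # path likelihood q(v) = q1 + q2

E_plus = np.max(q + J)
cond   = np.max(q2[None, :] + J, axis=1)  # E+[J | v1], since q(v | v1) = q2(v2)
tower  = np.max(q1 + cond)
assert np.isclose(E_plus, tower)
print("E+(J) =", E_plus)
```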

  16. Max-plus differential rule. Let H(x, p) = f(x)·p + (1/2) |p g(x)|^2 for x, p ∈ R^n. If φ ∈ C^1_b([0, T] × R^n) and x(s) solves the ODE on [t, T] with t ≥ 0, then dφ(s, x(s)) = [ φ_t(s, x(s)) + H(x(s), φ_x(s, x(s))) ] ds + dM(s), where M(s) = ∫_t^s [ ζ(r)·v(r) − (1/2) |ζ(r)|^2 ] dr and ζ(r) = φ_x(r, x(r)) g(x(r)). M(s) is a max-plus martingale.
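
A finite-difference check of the differential rule along one disturbance path, with illustrative choices φ(s, x) = e^{−s} sin x, f(x) = −x, g = 1 and v(s) = cos 3s (none from the talk): integrating φ_t + H plus the dM increments should reproduce φ(T, x(T)) − φ(t, x).

```python
# Finite-difference check of the max-plus differential rule along one
# disturbance path; phi, f, g, v are illustrative (not from the talk).
import numpy as np

f = lambda x: -x
g = lambda x: 1.0
phi   = lambda s, x: np.exp(-s) * np.sin(x)
phi_t = lambda s, x: -np.exp(-s) * np.sin(x)
phi_x = lambda s, x: np.exp(-s) * np.cos(x)
H     = lambda x, p: f(x) * p + 0.5 * (p * g(x)) ** 2

t, T, N, x = 0.0, 1.0, 20_000, 0.5
ds = (T - t) / N
rhs, x0 = 0.0, x
for k in range(N):
    s = t + k * ds
    v = np.cos(3.0 * s)                            # arbitrary disturbance
    zeta = phi_x(s, x) * g(x)
    rhs += (phi_t(s, x) + H(x, phi_x(s, x))) * ds  # drift term
    rhs += (zeta * v - 0.5 * zeta**2) * ds         # dM(s)
    x += (f(x) + g(x) * v) * ds                    # advance the ODE
lhs = phi(T, x) - phi(t, x0)
print(lhs, rhs)                                    # agree up to O(ds)
```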

  17. Backward PDE: φ_t + H(x, φ_x) = 0. If φ satisfies the backward PDE, then M(s) = φ(s, x(s)) is a max-plus martingale. Taking τ = t and s = T, with terminal data φ(T, ·) = ℓ, φ(t, x) = E^+_{tx}[ φ(T, x(T)) ] = E^+_{tx}[ ℓ(x(T)) ].

  18. 5. Dynamic programming PDEs and variational inequalities. A) Terminal cost problem: the value function is W(t, x) = E^+_{tx}[ ℓ(x(T)) ]. The dynamic programming principle W(τ, x(τ)) = sup_{v(·)} [ −(1/2) ∫_τ^s |v(r)|^2 dr + W(s, x(s)) ] is equivalent to W(s, x(s)) being a max-plus martingale.

  19. W is Lipschitz continuous and satisfies the backward PDE almost everywhere and in the viscosity sense: 0 = W_t + H(x, W_x), 0 ≤ t ≤ T, x ∈ R^n, with W(T, x) = ℓ(x).
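
For the illustrative linear data used above (f(x) = −x, g = 1, ℓ(x) = x), W has the closed form W(t, x) = x e^{−(T−t)} + (1 − e^{−2(T−t)})/4; this closed form is a side computation, not from the talk. The sketch below checks the backward PDE and terminal condition by finite differences.

```python
# Finite-difference check that the closed-form W for the illustrative
# data f(x) = -x, g = 1, l(x) = x satisfies W_t + H(x, W_x) = 0 and
# W(T, x) = l(x).
import numpy as np

T = 1.0
W = lambda t, x: x * np.exp(-(T - t)) + (1 - np.exp(-2 * (T - t))) / 4
H = lambda x, p: -x * p + 0.5 * p**2
h = 1e-5
for (t, x) in [(0.0, 0.5), (0.3, -1.2), (0.9, 2.0)]:
    W_t = (W(t + h, x) - W(t - h, x)) / (2 * h)
    W_x = (W(t, x + h) - W(t, x - h)) / (2 * h)
    print(t, x, W_t + H(x, W_x))   # ~0 up to finite-difference error
assert np.isclose(W(T, 7.0), 7.0)  # terminal condition W(T, x) = x
```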

  20. B) Max-plus additive running cost: the value function is V(t, x) = E^+_{tx}[ ⊕∫_t^T ℓ(x(s)) ds ] = E^+_{tx}[ max_{[t,T]} ℓ(x(s)) ]. Since E^+_{tx} is max-plus linear, V(t, x) = max_{[t,T]} E^+_{tx}[ ℓ(x(s)) ]. Dynamic programming principle: V(t, x) = E^+_{tx}[ ( ⊕∫_t^s ℓ(x(r)) dr ) ⊕ V(s, x(s)) ].

  21. V is Lipschitz continuous and satisfies, almost everywhere and in the viscosity sense, the variational inequality 0 = max[ ℓ(x) − V(t, x), V_t + H(x, V_x) ], 0 ≤ t ≤ T, x ∈ R^n, with V(T, x) = ℓ(x). Idea of proof: both terms on the right are ≤ 0. Two cases: if ℓ(x) = V(t, x), done; if ℓ(x) < V(t, x), a standard control argument applies.

  22. Infinite time horizon bounds. Take t = 0 and T large. If W(x) ∈ C^1 with ℓ(x) ≤ W(x) and H(x, W_x(x)) ≤ 0, then V(0, x; T) ≤ W(x). Equivalently: for 0 ≤ s ≤ T and x = x(0), ℓ(x(s)) ≤ (1/2) ∫_0^s |v(r)|^2 dr + W(x), a nonlinear H-infinity control inequality.

  23. Example: f(0) = 0, x·f(x) ≤ −c|x|^2 with c > 0, 0 ≤ ℓ(x) ≤ M|x|^2, W(x) = K|x|^2, M ≤ K, ‖g‖^2 K ≤ c.
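
A sketch that exercises the H-infinity inequality on this quadratic example with illustrative constants c = 1, g = 0.8, M = 1, K = 1.2 (so M ≤ K and ‖g‖²K ≤ c), taking f(x) = −cx and random piecewise-constant disturbances. It checks the pathwise bound with ℓ at its envelope M|x|², so any ℓ ≤ M|x|² obeys it too.

```python
# Pathwise check of the H-infinity inequality for the quadratic example;
# constants are illustrative, and f(x) = -c x is a simple case of
# x . f(x) <= -c |x|^2.
import numpy as np

rng = np.random.default_rng(1)
c, gc, M, K = 1.0, 0.8, 1.0, 1.2
assert M <= K and gc**2 * K <= c     # hypotheses of the example

T, N = 5.0, 5000
ds = T / N
for trial in range(20):
    x = rng.normal()
    W0 = K * x**2                    # W(x(0))
    cost = 0.0
    for _ in range(N):
        v = rng.normal(scale=2.0)    # random piecewise-constant disturbance
        cost += 0.5 * v**2 * ds
        x += (-c * x + gc * v) * ds
        assert M * x**2 <= cost + W0 + 1e-6   # M x(s)^2 <= cost + W(x(0))
print("inequality held along all sample paths")
```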

  24. 6. Max-plus stochastic control I: terminal cost (Fleming-Kaise-Sheu, Applied Math. Optimiz., 2010). x(s) ∈ R^n is the state, u(s) ∈ U the control (U compact), and v(s) ∈ R^d the disturbance control: dx(s) = f(x(s), u(s)) ds + g(x(s), u(s)) v(s) ds, t ≤ s ≤ T, x(t) = x.

  25. The control u(s) is chosen "depending on the past of v(·) up to s". Terminal cost criterion: minimize E^+_{tx}[ ℓ(x(T)) ].

  26. Corresponding risk sensitive stochastic control problem: choose a progressively measurable control to minimize E_{tx}[ e^{θ ℓ(X^θ(T))} ]. As θ → ∞, one obtains a two-player differential game: the minimizing player chooses u(s), the maximizing player chooses v(s).

  27. Game payoff: P(t, x; u, v) = −(1/2) ∫_t^T |v(s)|^2 ds + ℓ(x(T)). We want the upper differential game value (not the lower value).

  28. Illustrative example (Merton terminal wealth problem): x(s) > 0 is wealth at time s, u(s) the fraction of wealth in the risky asset, and 1 − u(s) the fraction in the riskless asset.

  29. Riskless interest rate = 0: dx(s) = x(s) u(s) [ µ + ν v(s) ] ds, t ≤ s ≤ T, x(t) = x, so that f(x, u) = µxu and g(x, u) = νxu.

  30. The usual terminal wealth problem with parameter θ: choose u(s) to minimize E_{tx}[ e^{θ ℓ(X^θ(T))} ]. Take HARA utility with parameter −θ ≪ 0: ℓ(x) = −log x, so x^{−θ} = e^{−θ log x}. Then log x(s) = log x + ∫_t^s u(r) [ µ + ν v(r) ] dr, and the payoff becomes P(t, x; u, v) = −log x + ∫_t^T P̃(u(r), v(r)) dr with P̃(u, v) = −u(µ + νv) − (1/2) v^2.

  31. min_u max_v P̃(u, v) = min_u [ −µu + (1/2) ν^2 u^2 ] = −µ^2/(2ν^2), with the minimum at u = u* = µ/ν^2. The optimal control is u(s) = u* for all s, and E^+[ −log x*(T) ] = −log x − Λ(T − t), where Λ = µ^2/(2ν^2) is the max-plus optimal growth rate.
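
A brute-force grid check of the inner max and outer min, for illustrative µ = 0.3, ν = 0.5 (values not from the talk):

```python
# Grid check of min_u max_v P~(u, v) = -mu^2 / (2 nu^2) at u* = mu/nu^2.
import numpy as np

mu, nu = 0.3, 0.5
us = np.linspace(-2.0, 4.0, 801)
vs = np.linspace(-5.0, 5.0, 801)
U, V = np.meshgrid(us, vs, indexing="ij")
P = -U * (mu + nu * V) - 0.5 * V**2
inner_max = P.max(axis=1)                 # max over v for each u
print("u* ~", us[inner_max.argmin()], "vs mu/nu^2 =", mu / nu**2)
print("value ~", inner_max.min(), "vs -mu^2/(2 nu^2) =", -mu**2 / (2 * nu**2))
```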

  32. Elliott-Kalton upper and lower differential game values. An Elliott-Kalton (progressive) strategy α for the minimizer sets u(s) = α[v](s), where v(r) = ṽ(r) a.e. in [t, s] implies α[v](r) = α[ṽ](r) a.e. in [t, s]. Γ_EK = { EK strategies α }.

  33. The lower game value is inf_{α ∈ Γ_EK} E^+_{tx}[ ℓ(x(T)) ] = inf_{α ∈ Γ_EK} sup_{v(·)} P(t, x; α[v], v). We want the upper game value. Let Γ = { EK strategies α : α[v](s) is left continuous with limits on the right }. Then W(t, x) = inf_{α ∈ Γ} E^+_{tx}[ ℓ(x(T)) ] is the upper EK value. It is Lipschitz continuous and satisfies, in the viscosity sense, the Isaacs PDE:

  34. 0 = W_t + min_{u ∈ U} H^u(x, W_x), t ≤ T, with terminal condition W(T, x) = ℓ(x), where H^u(x, p) = f(x, u)·p + (1/2) |p g(x, u)|^2 = f(x, u)·p + max_{v ∈ R^d} [ p g(x, u) v − (1/2) |v|^2 ]. Recipe for an optimal control policy: u*(s, x(s)) ∈ argmin_{u ∈ U} H^u(x(s), W_x(s, x(s))).

  35. Merton terminal wealth problem with non-HARA utility: H^u(x, p) = µxup + (ν^2/2) x^2 u^2 p^2, min_u H^u(x, p) = −µ^2/(2ν^2) = −Λ, so W(t, x) = ℓ(x) − Λ(T − t) and u*(x) = −µ/(ν^2 x ℓ_x(x)). Example, exponential utility ℓ(x) = −x: x u*(x) = µ/ν^2.
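
A last check (illustrative µ, ν, x; the recipe u*(x) = −µ/(ν² x ℓ_x(x)) is as reconstructed above): for ℓ(x) = −x the numerical argmin of H^u matches the recipe, the minimum value is −Λ, and x u* = µ/ν².

```python
# Check of the policy recipe for exponential utility l(x) = -x, with
# illustrative mu, nu, x (not from the talk).
import numpy as np

mu, nu, x = 0.3, 0.5, 2.0
l_x = -1.0                                  # derivative of l(x) = -x
p = l_x                                     # W_x = l_x, since W = l - Lambda(T - t)
H = lambda u: mu * x * u * p + 0.5 * nu**2 * x**2 * u**2 * p**2
us = np.linspace(-5.0, 5.0, 100001)
u_num = us[np.argmin(H(us))]                # numerical argmin of H^u
u_star = -mu / (nu**2 * x * l_x)            # the recipe, as reconstructed
print("argmin ~", u_num, "recipe:", u_star, "x u* =", x * u_star)
assert np.isclose(H(u_star), -mu**2 / (2 * nu**2))   # min value = -Lambda
```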
