
Time Inconsistent Optimal Control and Mean Variance Optimization

Tomas Björk, Stockholm School of Economics
Agatha Murgoci, Copenhagen Business School
Xunyu Zhou, Oxford University

Conference in honour of Walter Schachermayer, Wien 2010

– Typeset by FoilTEX – 1


Contents

  • Recap of DynP.
  • Problem formulation.
  • Discrete time.
  • Continuous time.
  • Example: Dynamic mean-variance optimization.


Standard problem

We are standing at time t = 0 in state X_0 = x_0.

\[
\max_{u} \; E\left[ \int_0^T h(s, X_s, u_s)\,ds + F(X_T) \right]
\]
\[
dX_t = \mu(t, X_t, u_t)\,dt + \sigma(t, X_t, u_t)\,dW_t
\]

For simplicity we assume that

  • X is scalar.
  • The adapted control u_t is scalar with no restrictions.

We denote this problem by P. We restrict ourselves to feedback controls of the form u_t = u(t, X_t).


Dynamic Programming

We embed the problem P in a family of problems P_{t,x}:

\[
P_{t,x}: \quad \max_{u} \; E_{t,x}\left[ \int_t^T h(s, X_s, u_s)\,ds + F(X_T) \right]
\]
\[
dX_s = \mu(s, X_s, u_s)\,ds + \sigma(s, X_s, u_s)\,dW_s, \qquad X_t = x
\]

The original problem corresponds to P_{0,x_0}.


Bellman

We now have the Bellman optimality principle, which says that the family {P_{t,x}; t ≥ 0, x ∈ R} is time consistent. More precisely: If û is optimal on the time interval [t, T], then it is also optimal on the sub-interval [s, T] for every s with t ≤ s ≤ T.

We can easily derive the Hamilton-Jacobi-Bellman equation HJB:

\[
V_t(t,x) + \sup_u \left\{ h(t,x,u) + \mu(t,x,u) V_x(t,x) + \tfrac{1}{2}\sigma^2(t,x,u) V_{xx}(t,x) \right\} = 0, \qquad V(T,x) = F(x)
\]


Three Disturbing Examples

Hyperbolic discounting (Ekeland-Lazrak-Pirvu):

\[
\max_u \; E_{t,x}\left[ \int_t^T \varphi(s-t) h(c_s)\,ds + \varphi(T-t) F(X_T) \right]
\]

Mean variance utility (Basak-Chabakauri):

\[
\max_u \; E_{t,x}\left[X_T\right] - \frac{\gamma}{2}\,\mathrm{Var}_{t,x}\left(X_T\right)
\]

Endogenous habit formation:

\[
\max_u \; E_{t,x}\left[ \ln \frac{X_T}{x - \beta} \right]
\]
\[
dX_t = [rX_t + (\alpha - r)u_t]\,dt + \sigma u_t\,dW_t
\]


Moral

  • These types of problems are not time consistent.
  • We cannot use DynP.
  • In fact, in these cases it is unclear what we mean by “optimality”.

Possible ways out:

  • Easy way: Dismiss the problem as being silly.
  • Pre-commitment: Solve (somehow) the problem P_{0,x_0} and ignore the fact that later on, your “optimal” control will no longer be viewed as optimal.
  • Game theory: Take the time inconsistency seriously. View the problems as a game and look for a Nash equilibrium point.

We use the game theoretic approach.


Our Basic Problem

\[
\max_u \; E_{t,x}\left[F(x, X_T)\right] + G\left(x, E_{t,x}\left[X_T\right]\right)
\]
\[
dX_s = \mu(X_s, u_s)\,ds + \sigma(X_s, u_s)\,dW_s, \qquad X_t = x
\]

This can be extended considerably. For simplicity we will consider the easier problem

\[
\max_u \; E_{t,x}\left[F(X_T)\right] + G\left(E_{t,x}\left[X_T\right]\right)
\]


The Game Theoretic Approach

  • This is a bit delicate to formalize in continuous time.
  • Thus we turn to discrete time, and then go to the limit.


Discrete Time

Given: A controlled Markov process {X_n : n = 0, 1, …, T}. At any time n we can change the transition probabilities for X_n → X_{n+1} by choosing a control value u ∈ R.

Players:

  • For each point in time n there is a player – “player No n” or “P_n”.
  • P_n chooses the feedback control law u_n(X_n).
  • A sequence of control laws u_0, …, u_{T−1} is denoted by u.
  • Given a sequence u of control laws, the value function for P_n is defined by

\[
J_n(x, u) = E_{n,x}\left[F\left(X^u_T\right)\right] + G\left(E_{n,x}\left[X^u_T\right]\right)
\]


Subgame Perfect Nash Equilibrium

The value function for P_n was defined by

\[
J_n(x, u) = E_{n,x}\left[F\left(X^u_T\right)\right] + G\left(E_{n,x}\left[X^u_T\right]\right)
\]

We see that J_n(x, u) depends on (n, x) and u_n, u_{n+1}, …, u_{T−1}.

Definition:

  • The control law û is an equilibrium strategy if the following hold for each fixed n:
    – Assume that P_k uses û_k(·) for k = n+1, …, T−1.
    – Then it is optimal for player No n to use û_n(·).
  • The equilibrium value function is defined by

\[
V_n(x) = J_n(x, \hat u)
\]
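To make the backward-induction content of this definition concrete, here is a small numerical sketch in a toy two-period model of my own construction (a controlled walk with biased ±1 shocks and the mean-variance objective); the model and all parameter values are illustrative, not from the talk.

```python
import numpy as np

# Toy two-period model (illustrative, not from the talk):
# X_{n+1} = X_n + u_n * Z_n,  Z_n = +1 w.p. p and -1 w.p. 1-p,  T = 2,
# J_n(x, u) = E_{n,x}[X_2] - (gamma / 2) * Var_{n,x}(X_2).
p, gamma = 0.6, 1.0
mu, s2 = 2 * p - 1, 1 - (2 * p - 1) ** 2   # E[Z_n], Var(Z_n)
grid = np.linspace(-2.0, 2.0, 401)         # candidate control values

def u1_hat(x):
    # Player 1 moves last, so she faces a standard one-period problem:
    # J_1(x, u) = x + u * mu - (gamma / 2) * u**2 * s2, maximized over the grid.
    return grid[np.argmax(x + grid * mu - gamma / 2 * grid**2 * s2)]

def u0_hat(x):
    # Player 0 takes player 1's law as given (subgame perfectness) and
    # maximizes J_0 by enumerating the four scenarios (z0, z1) exactly.
    best_u, best_J = None, -np.inf
    for u0 in grid:
        outcomes = []
        for z0, pr0 in ((1, p), (-1, 1 - p)):
            x1 = x + u0 * z0
            u1 = u1_hat(x1)                 # player 1's equilibrium response
            for z1, pr1 in ((1, p), (-1, 1 - p)):
                outcomes.append((pr0 * pr1, x1 + u1 * z1))
        m = sum(pr * y for pr, y in outcomes)
        v = sum(pr * y * y for pr, y in outcomes) - m * m
        J0 = m - gamma / 2 * v
        if J0 > best_J:
            best_u, best_J = u0, J0
    return best_u
```

In this toy model both equilibrium laws come out constant and (up to grid resolution) equal to mu / (gamma * s2), which one can also verify by hand from the one-period first-order condition.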


The infinitesimal operator

Let {f_n(·)}_{n=0}^{T} be a sequence of real valued functions.

Def: For a fixed control value u ∈ R, the infinitesimal operator A^u is defined by

\[
(A^u f)_n(x) = E\left[\, f_{n+1}(X_{n+1}) - f_n(x) \,\middle|\, X_n = x,\; u_n = u \right]
\]

Def: For a fixed control law u, the operator A^{u} is defined by

\[
(A^{u} f)_n(x) = E\left[\, f_{n+1}(X_{n+1}) - f_n(x) \,\middle|\, X_n = x,\; u_n = u_n(x) \right]
\]
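As a quick sanity check of the definition, a tiny computation for a hypothetical controlled walk (my own toy example, not from the talk): for X_{n+1} = X_n + u + Z with Z = ±1 equally likely and f_n(x) = x for all n, the operator returns exactly the drift u.

```python
def discrete_generator(f_next, f_curr, x, u, transition):
    """(A^u f)_n(x) = E[f_{n+1}(X_{n+1}) | X_n = x, u_n = u] - f_n(x),
    for a finite transition law given as a list of (prob, next_state)."""
    return sum(p * f_next(y) for p, y in transition(x, u)) - f_curr(x)

# Controlled walk: X_{n+1} = x + u + Z, Z = +1 or -1 with probability 1/2.
walk = lambda x, u: [(0.5, x + u + 1.0), (0.5, x + u - 1.0)]

drift = discrete_generator(lambda y: y, lambda y: y, 3.0, 0.7, walk)
# drift equals the control u (here 0.7): the generator of the identity is the drift.
curvature = discrete_generator(lambda y: y**2, lambda y: y**2, 3.0, 0.7, walk)
# For f(y) = y^2 this is E[(x+u+Z)^2] - x^2 = (x+u)^2 + Var(Z) - x^2.
```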


Important Idea

It turns out that a fundamental role is played by the function sequence f_n defined by

\[
f_n(x) = E_{n,x}\left[ X^{\hat u}_T \right],
\]

where û is the equilibrium strategy. The process f_n(X_n) is of course a martingale under the equilibrium control û, so we have

\[
(A^{\hat u} f)_n(x) = 0, \qquad f_T(x) = x.
\]


Extending HJB

Proposition: The equilibrium value function V_n(x) and the function f_n(x) satisfy the system

\[
\sup_u \left\{ (A^u V)_n(x) - \left(A^u (G \circ f)\right)_n(x) + (H^u f)_n(x) \right\} = 0, \qquad V_T(x) = F(x) + G(x),
\]
\[
(A^{\hat u} f)_n(x) = 0, \qquad f_T(x) = x,
\]

where

\[
(H^u f)_n(x) = G\left( E_{n,x}\left[ f_{n+1}\left(X^u_{n+1}\right) \right] \right) - G\left( f_n(x) \right), \qquad
f_n(x) = E_{n,x}\left[ X^{\hat u}_T \right].
\]


Continuous Time

The discrete time results extend immediately to continuous time.

  • Now X is a controlled continuous time Markov process with controlled infinitesimal generator

\[
A^u g(t,x) = \lim_{h \to 0} \frac{1}{h} \left\{ E_{t,x}\left[ g\left(t+h, X^u_{t+h}\right) \right] - g(t,x) \right\}
\]

  • The extended HJB is now an equation with time step [t, t + h].
  • Divide the discrete time HJB equations by h and let h → 0.


Extended HJB Continuous Time

Conjecture: The equilibrium value function satisfies the system

\[
\sup_u \left\{ A^u V(t,x) - A^u (G \circ f)(t,x) + G'\left(f(t,x)\right) \cdot A^u f(t,x) \right\} = 0, \qquad V(T,x) = F(x) + G(x),
\]
\[
A^{\hat u} f(t,x) = 0, \qquad f(T,x) = x.
\]

Note the fixed point character of the extended HJB.


General Problem

\[
\max_u \; E_{t,x}\left[ \int_t^T C(t, x, X_s, u_s)\,ds + F(t, x, X_T) \right] + G\left(t, x, E_{t,x}\left[X_T\right]\right)
\]


The general case

\[
\sup_{u \in U} \Big\{ (A^u V)(t,x) + C(x,x,u) - \int_t^T (A^u c_s)_t(x,x)\,ds + \int_t^T (A^u c_{s,x})_t(x)\,ds
\]
\[
\qquad - (A^u f)(t,x,x) + (A^u f_x)(t,x) - A^u (G \diamond g)(t,x) + (H^u g)(t,x) \Big\} = 0,
\]
\[
(A^{\hat u} f_y)(t,x) = 0, \qquad (A^{\hat u} g)(t,x) = 0, \qquad (A^{\hat u} c_{s,y})_t(x) = 0, \quad 0 \le t \le s,
\]
\[
V(T,x) = F(x,x) + G(x,x), \qquad c^{s,y}_s(x) = C(x, y, \hat u_s(x)), \qquad f(T,x,y) = F(y,x), \qquad g(T,x) = x.
\]


Optimal for what?

  • In continuous time, it is not immediately clear how to define an equilibrium strategy.
  • We follow Ekeland et al and define the equilibrium using spike variations.


HJB as a Necessary Condition

Conjecture: Assume that there exists an equilibrium control û, and define V and f as above. Then V and f satisfy the extended HJB system.

Note: It is probably very hard to prove this, due to technical problems. We do however have a converse result.


Verification Theorem

Theorem: Assume that V, f and û satisfy the extended HJB system. Then V is the equilibrium value function and û is the equilibrium control.

Proof: Not very hard, but a bit harder than for standard DynP.


A useful Lemma

Consider a functional

\[
J(t,x,u) = E_{t,x}\left[F\left(x, X^u_T\right)\right] + G\left(x, E_{t,x}\left[X^u_T\right]\right),
\]

and denote the equilibrium control and value function by û and V respectively. Let ϕ(x) be a given deterministic real valued function and consider the functional

\[
J^{\varphi}(t,x,u) = \varphi(x)\left\{ E_{t,x}\left[F\left(x, X^u_T\right)\right] + G\left(x, E_{t,x}\left[X^u_T\right]\right) \right\}.
\]

Denoting the equilibrium control and value function by û^ϕ and V^ϕ respectively, we have

\[
\hat u^{\varphi}(t,x) = \hat u(t,x), \qquad V^{\varphi}(t,x) = \varphi(x) V(t,x).
\]


Practical handling of the theory

  • Make a parameterized Ansatz for V.
  • Make a parameterized Ansatz for f.
  • Plug everything into the extended HJB system and hope to obtain a system of ODEs for the parameters in the Ansatz.
  • Alternatively, compute Lie symmetry groups.


Basak’s Example (in a simple version)

\[
dS_t = \alpha S_t\,dt + \sigma S_t\,dW_t, \qquad dB_t = r B_t\,dt
\]

X_t = portfolio value process
u = amount of money invested in the risky asset

Problem:

\[
\max_u \; E_{t,x}\left[X_T\right] - \frac{\gamma}{2}\,\mathrm{Var}_{t,x}\left(X_T\right)
\]
\[
dX_t = [rX_t + (\alpha - r)u_t]\,dt + \sigma u_t\,dW_t
\]

This corresponds to our standard problem with

\[
F(x) = x - \frac{\gamma}{2}x^2, \qquad G(x) = \frac{\gamma}{2}x^2
\]


Extended HJB

\[
V_t + \sup_u \left\{ [rx + (\alpha - r)u] V_x + \tfrac{1}{2}\sigma^2 u^2 V_{xx} - \frac{\gamma}{2}\sigma^2 u^2 f_x^2 \right\} = 0, \qquad V(T,x) = x,
\]
\[
A^{\hat u} f = 0, \qquad f(T,x) = x.
\]

Ansatz:

\[
V(t,x) = g(t)x + h(t), \qquad f(t,x) = A(t)x + B(t)
\]


Result

The equilibrium value function and strategy are given by

\[
V(t,x) = e^{r(T-t)}x + \frac{1}{2\gamma}\frac{(\alpha - r)^2}{\sigma^2}(T-t),
\]
\[
\hat u(t,x) = \frac{1}{\gamma}\frac{\alpha - r}{\sigma^2}\, e^{-r(T-t)},
\]
\[
f(t,x) = e^{r(T-t)}x + \frac{1}{\gamma}\frac{(\alpha - r)^2}{\sigma^2}(T-t).
\]
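As a sanity check, one can verify symbolically that these closed forms solve the extended HJB system of the Basak problem (V_t + sup_u{[rx + (α−r)u]V_x + ½σ²u²V_xx − (γ/2)σ²u²f_x²} = 0, A^û f = 0, with V(T,x) = f(T,x) = x); a sketch of the check using sympy:

```python
import sympy as sp

t, x, T, r, alpha, sigma, gamma, u = sp.symbols('t x T r alpha sigma gamma u', real=True)
beta = alpha - r

V    = sp.exp(r*(T - t))*x + beta**2/(2*gamma*sigma**2)*(T - t)
f    = sp.exp(r*(T - t))*x + beta**2/(gamma*sigma**2)*(T - t)
uhat = beta/(gamma*sigma**2)*sp.exp(-r*(T - t))

# Inner expression of the extended HJB for a generic control value u
inner = (sp.diff(V, t) + (r*x + beta*u)*sp.diff(V, x)
         + sp.Rational(1, 2)*sigma**2*u**2*sp.diff(V, x, 2)
         - gamma/2*sigma**2*u**2*sp.diff(f, x)**2)

assert sp.simplify(inner.subs(u, uhat)) == 0                        # HJB holds at uhat
assert sp.simplify(sp.solve(sp.diff(inner, u), u)[0] - uhat) == 0   # uhat attains the sup

# f is a martingale under uhat: A^uhat f = 0
Af = (sp.diff(f, t) + (r*x + beta*uhat)*sp.diff(f, x)
      + sp.Rational(1, 2)*sigma**2*uhat**2*sp.diff(f, x, 2))
assert sp.simplify(Af) == 0

assert V.subs(t, T) == x and f.subs(t, T) == x                      # terminal conditions
```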


A Closer Look at Naive Mean Variance

We recall that for the Basak problem we have

\[
\hat u(t,x) = \frac{1}{\gamma}\frac{\alpha - r}{\sigma^2}\, e^{-r(T-t)},
\]

where û is the dollar amount invested in the risky asset. Is this economically meaningful?


NO!


A Closer Look at Naive Mean Variance

We recall that for the Basak problem we have

\[
\hat u(t,x) = \frac{1}{\gamma}\frac{\alpha - r}{\sigma^2}\, e^{-r(T-t)}.
\]

  • The control u is the number of dollars invested in the risky asset. In the Basak case this is independent of the level of wealth.
  • You thus invest the same number of dollars in the risky asset regardless of whether your wealth is 100 dollars or 10 billion dollars.
  • Not so realistic.


Realistic Mean Variance

Idea: We let the risk aversion coefficient γ depend on current wealth:

\[
\max_u \; E_{t,x}\left[X_T\right] - \frac{\gamma(x)}{2}\,\mathrm{Var}_{t,x}\left(X_T\right)
\]
\[
dX_t = [rX_t + (\alpha - r)u_t]\,dt + \sigma u_t\,dW_t
\]

This case is covered by our general theory.


General Solution

The equilibrium control is given by

\[
\hat u(t,x) = -\frac{\beta}{\sigma^2} \cdot \frac{f_x(t,x,x) + \gamma(x)\, g(t,x)\, g_x(t,x)}{f_{xx}(t,x,x) + \gamma(x)\, g(t,x)\, g_{xx}(t,x)}.
\]

The functions f and g are determined by the system

\[
f_t(t,x,y) + (rx + \beta \hat u) f_x(t,x,y) + \tfrac{1}{2}\sigma^2 \hat u^2 f_{xx}(t,x,y) = 0,
\]
\[
g_t(t,x) + (rx + \beta \hat u) g_x(t,x) + \tfrac{1}{2}\sigma^2 \hat u^2 g_{xx}(t,x) = 0,
\]

with boundary conditions

\[
f(T,x,y) = x - \frac{\gamma(y)}{2}x^2, \qquad g(T,x) = x.
\]

The equilibrium value function V is given by

\[
V(t,x) = f(t,x,x) + \frac{\gamma(x)}{2} g^2(t,x).
\]


A natural choice of γ(x)

1. Dimension analysis:

\[
J(t,x,u) = E_{t,x}\left[X^u_T\right] - \frac{\gamma(x)}{2}\,\mathrm{Var}_{t,x}\left[X^u_T\right]
\]

  • The term E_{t,x}[X^u_T] has the dimension (dollar).
  • The term Var_{t,x}[X^u_T] has the dimension (dollar)².
  • We thus have to choose γ in such a way that γ(x) has the dimension (dollar)⁻¹.
  • The most obvious way to accomplish this is of course to specify γ as γ(x) = γ/x.


A natural choice of γ(x)

2. Back to basics.

  • The original mean variance problem is posed in terms of returns:

\[
J(t,x,u) = E_{t,x}\left[\frac{X^u_T}{x}\right] - \frac{\gamma}{2}\,\mathrm{Var}_{t,x}\left[\frac{X^u_T}{x}\right]
\]

  • We can write this as

\[
J(t,x,u) = \frac{1}{x}\left\{ E_{t,x}\left[X^u_T\right] - \frac{\gamma}{2x}\,\mathrm{Var}_{t,x}\left[X^u_T\right] \right\}
\]

  • It now follows from the previous Lemma that this will lead to the same equilibrium control as

\[
J(t,x,u) = E_{t,x}\left[X^u_T\right] - \frac{\gamma(x)}{2}\,\mathrm{Var}_{t,x}\left[X^u_T\right]
\]

with γ(x) = γ/x.


The Case γ(x) = γ/x

The equilibrium control is given by û(t,x) = c(t)x, where c solves the integral equation

\[
c(t) = \frac{\beta}{\gamma\sigma^2}\left[ e^{-\int_t^T \left[r + \beta c(s) + \sigma^2 c^2(s)\right] ds} + \gamma\left( e^{-\int_t^T \sigma^2 c^2(s)\,ds} - 1 \right) \right]
\]

This equation has a unique solution and we provide a convergent numerical algorithm to solve it.
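One such scheme can be sketched as Picard (fixed point) iteration on the integral equation; this is my own minimal implementation, assuming the form of the equation as reconstructed above (it reproduces the equation in the docstring), and the market parameters are illustrative only.

```python
import numpy as np

def solve_c(r=0.03, alpha=0.08, sigma=0.2, gamma=2.0, T=1.0,
            n=2000, tol=1e-12, max_iter=500):
    """Picard iteration for
    c(t) = beta/(gamma*sigma^2) * ( exp(-I1(t)) + gamma*(exp(-I2(t)) - 1) ),
    I1(t) = int_t^T [r + beta*c(s) + sigma^2*c(s)^2] ds,
    I2(t) = int_t^T sigma^2*c(s)^2 ds,   with beta = alpha - r."""
    beta = alpha - r
    t = np.linspace(0.0, T, n + 1)
    dt = np.diff(t)

    def tail_integral(h):
        # I[i] = int_{t_i}^{T} h(s) ds via the trapezoidal rule
        inc = 0.5 * (h[1:] + h[:-1]) * dt
        I = np.zeros_like(h)
        I[:-1] = inc[::-1].cumsum()[::-1]
        return I

    c = np.full(n + 1, beta / (gamma * sigma**2))   # start from the terminal value
    for _ in range(max_iter):
        I1 = tail_integral(r + beta * c + sigma**2 * c**2)
        I2 = tail_integral(sigma**2 * c**2)
        c_new = beta / (gamma * sigma**2) * (np.exp(-I1) + gamma * (np.exp(-I2) - 1.0))
        err = np.max(np.abs(c_new - c))
        c = c_new
        if err < tol:
            break
    return t, c
```

On such a parameter set the iteration contracts quickly. Note that c is deterministic, so the equilibrium investment û(t,x) = c(t)x scales linearly with wealth, in contrast to the constant-γ case.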


Open Questions

  • Martingale approach to equilibrium control?
  • Convex duality theory?
  • More examples.
  • Existence and uniqueness for the extended HJB system.
  • Extension to optimal stopping.


Optimal for what?

In continuous time, it is not immediately clear how to define an equilibrium strategy. We follow Ekeland et al.

  • Consider a fixed control law û.
  • Fix (t, x) and a “small” time increment h.
  • Choose an arbitrary real number u.
  • Consider the control law u_h defined by

\[
u_h(s,y) =
\begin{cases}
\hat u(s,y), & t + h \le s \le T, \\
u, & t \le s < t + h.
\end{cases}
\]

Def: The control law û is an equilibrium control if

\[
\lim_{h \to 0} \frac{J(t,x,\hat u) - J(t,x,u_h)}{h} \ge 0
\]

for all choices of t, x and u.


Connection to Standard Problems

\[
\sup_u \left\{ A^u V(t,x) - A^u (G \circ f)(t,x) + G'\left(f(t,x)\right) \cdot A^u f(t,x) \right\} = 0
\]

  • Assume that we know the equilibrium strategy û.
  • Since f(t,x) = E_{t,x}[X^û_T], we can now compute f.
  • Now define the function h(t,x,u) by

\[
h(t,x,u) = (H^u f)(t,x) - A^u (G \circ f)(t,x)
\]

The extended HJB takes the form

\[
\sup_u \left\{ A^u V(t,x) + h(t,x,u) \right\} = 0, \qquad V(T,x) = F(x) + G(x)
\]


Equivalent Standard Problems

We obtained

\[
\sup_u \left\{ A^u V(t,x) + h(t,x,u) \right\} = 0, \qquad V(T,x) = F(x) + G(x).
\]

This is the HJB for the time consistent problem

\[
\max_u \; E_{t,x}\left[ \int_t^T h(s, X_s, u_s)\,ds + F(X_T) + G(X_T) \right]
\]


Equivalent Standard Problem

The Basak problem has the same optimal control as the time consistent problem

\[
\max_u \; E_{t,x}\left[ X_T - \frac{\gamma\sigma^2}{2} \int_t^T e^{2r(T-s)} u_s^2\,ds \right]
\]
\[
dX_t = [rX_t + (\alpha - r)u_t]\,dt + \sigma u_t\,dW_t
\]

We note in passing that σ²u_t²\,dt = d⟨X⟩_t.
