
Optimal investment and hedging under partial information

Michael Monoyios
Mathematical Institute, University of Oxford
www.maths.ox.ac.uk/~monoyios

Tutorial Lectures for the session on Inverse and Partial Information Problems at the Special Semester on Stochastics with Emphasis on Finance, Johann Radon Institute for Computational and Applied Mathematics, Linz, Austria, September 2008

September 2, 2008

Contents

1 Introduction
2 Filtering theory
  2.1 Observation model
  2.2 Innovations process
    2.2.1 The Innovations Conjecture
  2.3 Signal process model
  2.4 Fundamental filtering equation
    2.4.1 Linear observations
    2.4.2 Linear observations and linear signal
  2.5 Multi-dimensional Kalman-Bucy filter
3 Merton problem with uncertain drift
  3.1 Full information case
    3.1.1 Portfolio optimisation via convex duality
  3.2 Partial information case
4 Optimal hedging of basis risk with partial information
  4.1 Basis risk model: full information case
    4.1.1 Perfect correlation case
    4.1.2 Incomplete case
  4.2 Partial information case
    4.2.1 Choice of prior
    4.2.2 Two-dimensional Kalman-Bucy filter
    4.2.3 Optimal hedging with random drifts
    4.2.4 The primal problem
    4.2.5 Dual problem and optimal hedge
    4.2.6 Stochastic control representation of the indifference price
    4.2.7 Analytic approximation for the indifference price


5 Investment with inside information and drift uncertainty
  5.1 Linear filtering on an expanded filtration
  5.2 Computing the information drift
  5.3 Optimal investment for an insider with drift parameter uncertainty
    5.3.1 Anticipative Brownian information

Abstract

We first give an exposition of filtering theory, then consider the Merton optimal investment problem when the agent does not know the drift parameter of the underlying stock. This is taken to be a random variable with a Gaussian prior distribution, which is updated via a Kalman filter. The resulting problem of optimal investment with a random drift can be treated as a full information problem, and an explicit solution is possible. We then treat an incomplete market hedging problem. A claim on a non-traded asset is hedged using a correlated traded asset, and the hedger is once again uncertain of the true values of the drifts of each asset. After filtering, the resulting problem with random drifts is solved in the case that each asset's prior distribution has the same variance. Analytic approximations for the optimal hedging strategy are obtained. Finally, we examine an optimal investment problem with inside information, in which the insider does not know the true drift of the stock. Explicit solutions are possible, after first enlarging the filtration to accommodate the insider's additional knowledge, then filtering the asset price drift.

1 Introduction

These lectures examine some problems of optimal investment, and of optimal hedging of a contingent claim in an incomplete market, when the agent's information set is restricted to stock price observations, possibly augmented by some additional information related to the terminal value of a stock price.

In classical models of financial mathematics, one usually specifies a probability space (Ω, F, P) equipped with a background filtration F = (F_t)_{0≤t≤T}, and then writes down some stochastic process S = (S_t)_{0≤t≤T} for an asset price, such that S is adapted to the filtration F. A typical example would be the Black-Scholes (henceforth, BS) model of a stock price, following the geometric Brownian motion

    dS_t = σS_t(λ dt + dB_t),    (1)

where B is an F-Brownian motion, and the volatility σ > 0 and the Sharpe ratio λ are assumed to be known constants. Of course, this is a strong assumption: the agent is assumed to be able to observe the Brownian motion (henceforth, BM) process B, as well as the stock price process S. We refer to this as a full information scenario. In this case, an agent would use F-adapted trading strategies in S.

We shall relax the full information assumption. We shall assume that the agent can only observe the stock price process, and not the Brownian motion B, and that the constants σ, λ are not (heroically) assumed to be known. The agent's trading strategies must also be adapted to the observation filtration F̂ := (F̂_t)_{0≤t≤T} generated by S. We refer to this as a partial information scenario. In this case, the parameter λ would be regarded as an unknown constant whose value needs to be determined from price data. In principle, one would also have to apply this philosophy to the volatility σ, but we shall make the approximation that price observations are continuous, so that σ can be computed from the quadratic variation [S]_t of the stock price, since we have

    d[S]_t/dt = σ²S_t².    (2)
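Relation (2) can be checked numerically. The following minimal sketch (all parameter values are illustrative and not part of the lectures) simulates a discretised BS path and recovers σ from the realised quadratic variation:

```python
import numpy as np

# Illustrative sketch: recover sigma from the realised quadratic variation
# of a simulated Black-Scholes path, using [S]_T = int_0^T sigma^2 S_t^2 dt.
rng = np.random.default_rng(0)
sigma, lam, T, n = 0.3, 0.5, 1.0, 200_000
dt = T / n
dB = rng.standard_normal(n) * np.sqrt(dt)
# Exact GBM increments: S_{t+dt} = S_t * exp((sigma*lam - sigma^2/2)*dt + sigma*dB)
S = np.exp(np.cumsum((sigma * lam - 0.5 * sigma**2) * dt + sigma * dB))
S = np.concatenate(([1.0], S))          # path started at S_0 = 1
qv = np.sum(np.diff(S) ** 2)            # realised quadratic variation [S]_T
sigma_hat = np.sqrt(qv / np.sum(S[:-1] ** 2 * dt))
print(round(sigma_hat, 3))
```

With a fine time grid the estimate is very close to the true σ, which is the sense in which continuous observation reveals the volatility.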

One way to model the uncertainty in our knowledge of λ is to consider it as an F-adapted process, or as a random variable (measurable with respect to F_0) with a given initial distribution (the prior distribution), which is updated in the face of new price information, that is, as the observation filtration F̂ evolves. This is an example of a filtering problem, which is to compute the best estimate of a random variable given observations up to time t ∈ [0, T], and hence given


the sigma algebra F̂_t, t ∈ [0, T]. In the case of the BS model (1), where we model λ as a random variable, we might be interested in computing the conditional expectation λ̂_t := E[λ|F̂_t]. We shall see that the effect of filtering is that the model (1) may be replaced by a model specified on the filtered probability space (Ω, F̂_T, F̂, P) and written as

    dS_t = σS_t(λ̂_t dt + dB̂_t),    (3)

where B̂ is an F̂-BM. This model may now be treated as a full information model, since both B̂ and λ̂ are F̂-adapted processes. The price we have paid for restoring a full information scenario is that the constant parameter λ has been replaced by a random process λ̂. The process of replacing a partial information model with an effective full information model is usually only achievable in special circumstances, such as Gaussian prior distributions and certain linearity properties in the relation between the observable and unobservable processes, as we shall see in the next section.

In the rest of these lectures, we first give an exposition of filtering theory, culminating in the linear filtering case (the Kalman-Bucy filter). We then apply the results to the Merton problem

of optimal investment, which seeks a trading strategy to maximise expected utility of terminal wealth. The problem is then to choose a trading strategy π = (π_t)_{0≤t≤T}, the wealth placed in stock at each time t ∈ [0, T], to maximise the functional

    J(x; π) := E U( x + ∫_0^T (π_t/S_t) dS_t ),

where x is the initial capital at time 0. The value function is the maximum expected utility expressed as a function of x:

    u(x) := sup_{π∈A} J(x; π),    (4)

where A denotes a set of admissible trading strategies, which will depend on whether we have a full information model, a partial information model, or one with some other relevant information structure. The main distinction between a full information model and a partial information model is that:

• in a full information setting, π is F-adapted, where F is (typically) the Brownian filtration generated by B, and we take S to follow the stochastic differential equation (1), with σ, λ known constants;

• in a partial information setting, π is F̂-adapted, where F̂ is the observation filtration generated by S, and we take S to follow the stochastic differential equation (3), with σ a constant given by (2) and λ̂ an F̂-adapted process. We shall see how to characterise λ̂ using filtering theory.

We shall then move on to the hedging of a claim in an incomplete market setting under partial

information. Specifically, we shall consider a basis risk model involving the optimal hedging of a contingent claim on a non-tradeable asset Y using a traded stock S, correlated with Y, when the hedger is restricted to trading strategies in S that are adapted to the observation filtration F̂ generated by the asset prices. In the full information case the asset prices are correlated log-Brownian motions given by

    dS_t = σS_t(λ dt + dB_t),
    dY_t = βY_t(θ dt + dW_t),

where the Brownian motions B, W are correlated with correlation ρ ∈ [−1, 1]. The parameters σ > 0, λ, β > 0 and θ are assumed constant. This market is complete when the correlation is perfect, but incomplete otherwise.

A number of studies have used exponential indifference valuation methods to hedge the claim in an optimal

manner in a full information scenario, and we outline these results before moving on to the partial information case, where we assume the hedger does not know with certainty the drifts of S and Y. Finally we return to the Merton optimal investment problem (4) when the agent is assumed to have some additional information in the form of knowledge of the value of a random variable I, whose value is measurable with respect to time-T information.

2 Filtering theory

A good reference for this section is Chapter VI.8 of Rogers & Williams [21].

The setting is a probability space (Ω, F, P) equipped with a filtration F = (F_t)_{0≤t≤T}. All our processes are assumed to be F-adapted. Note that F is not the observation filtration. Let us call F the background filtration. We consider two processes, both taken to be one-dimensional (for simplicity):

• a signal process U = (U_t)_{0≤t≤T}, which is not directly observable (so unobservable);

• an observation process O = (O_t)_{0≤t≤T}, which is observable and somehow correlated with U, so that by observing O we can say something about the distribution of U.

Let F̂ := (F̂_t)_{0≤t≤T} denote the observation filtration generated by O. That is,

    F̂_t := σ(O_s; 0 ≤ s ≤ t),    0 ≤ t ≤ T.

The filtering problem is to compute the conditional distribution of the signal U_t, t ∈ [0, T], given observations up to that time, or, equivalently, to compute the conditional expectation

    E[f(U_t)|F̂_t],    0 ≤ t ≤ T,

where f : R → R is some test function. To proceed further, we need to specify some particular model for the signal and observation processes.

2.1 Observation model

Let B = (B_t)_{0≤t≤T} be an F-Brownian motion, and let H = (H_t)_{0≤t≤T} be an F-adapted process satisfying

    E ∫_0^T H_t² dt < ∞.

We shall assume the observation process O is of the form

    O_t = ∫_0^t H_u du + B_t,    0 ≤ t ≤ T.    (5)

The typical situation will be where H_t = h(t, U_t), for a deterministic function h : [0, T] × R → R of time and the current signal value. The yet more specialised situation (that we shall mainly focus on) will be the linear case when h(t, x) = G(t)x, with G(·) a deterministic function. Then H_t = G(t)U_t and the observation process stochastic differential equation (SDE) is

    dO_t = G(t)U_t dt + dB_t    (linear observation model).

2.2 Innovations process

Introduce the notation φ̂_t := E[φ_t|F̂_t], for any process φ. Define the F̂-adapted innovations process

    N_t := O_t − ∫_0^t Ĥ_u du,    0 ≤ t ≤ T.    (6)

Proposition 1 The innovations process N is an F̂-Brownian motion.
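Proposition 1 can be illustrated numerically. In the simple linear model O_t = λt + B_t with an unknown drift λ ∼ N(λ_0, v_0), Gaussian conjugacy (a special case of the Kalman-Bucy filter derived in Section 2.4) gives Ĥ_t = (λ_0 + v_0 O_t)/(1 + v_0 t). The sketch below (illustrative parameter values, not part of the lectures) checks that the innovations process has realised quadratic variation [N]_T ≈ T, consistent with N being a Brownian motion:

```python
import numpy as np

# Sketch: O_t = lam*t + B_t with Gaussian prior lam ~ N(lam0, v0).
# The filter is hat-H_t = (lam0 + v0*O_t)/(1 + v0*t) by Gaussian conjugacy.
rng = np.random.default_rng(1)
lam0, v0, T, n = 0.2, 0.5, 1.0, 100_000
dt = T / n
lam = lam0 + np.sqrt(v0) * rng.standard_normal()   # draw the unknown drift
dB = rng.standard_normal(n) * np.sqrt(dt)
O = np.cumsum(lam * dt + dB)
t = dt * np.arange(1, n + 1)
H_hat = (lam0 + v0 * O) / (1.0 + v0 * t)           # filtered drift estimate
dO = np.diff(np.concatenate(([0.0], O)))
# innovation increments dN = dO - hat-H dt, using hat-H at the left endpoint
dN = dO - np.concatenate(([lam0], H_hat[:-1])) * dt
qv_N = np.sum(dN ** 2)                             # realised [N]_T, should be ~ T
print(round(qv_N, 3))
```

The drift correction (H − Ĥ) dt is of finite variation, so it does not contribute to the quadratic variation, exactly as in the proof below.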


Proof From (5) and (6) we have

    N_t = ∫_0^t (H_u − Ĥ_u) du + B_t.

With s ≤ t, we have

    E[N_t|F̂_s] − N_s = E[ ∫_s^t (H_u − Ĥ_u) du + B_t − B_s | F̂_s ]
                     = E[ ∫_s^t ( E[H_u|F̂_u] − Ĥ_u ) du | F̂_s ] + E[ E(B_t − B_s|F_s) | F̂_s ]
                     = 0,

using the tower property of conditional expectation. So N is a continuous F̂-martingale with quadratic variation [N]_t = [B]_t = t, so N is an F̂-Brownian motion. □

2.2.1 The Innovations Conjecture

Denote by F^N := (F^N_t)_{0≤t≤T} the filtration generated by N, so that F^N_t := σ(N_s; s ≤ t). Since N is F̂-adapted, we have (F^N_t) ⊆ (F̂_t). For linear systems, we shall see that we also have

    F^N_t = F̂_t,    0 ≤ t ≤ T,    (7)

so that in this case the observations and the innovations represent the same information (because there is an invertible map that derives one from the other). The "innovations conjecture" (of Kailath) is that the identity (7) holds in general, but we now know that this is not the case. However, the following positive and very useful result is true.

Theorem 1 Every local F̂-martingale M admits a representation of the form

    M_t = M_0 + ∫_0^t Φ_s dN_s,    0 ≤ t ≤ T,

where Φ is F̂-adapted and ∫_0^T Φ_t² dt < ∞ a.s. If M happens to be a square-integrable martingale, then Φ can be chosen so that E ∫_0^T Φ_t² dt < ∞.

To prove this result, recall the following well-known result on representation of local martingales with respect to a Brownian filtration. See Theorems 3.4.2 and 3.4.15 in Karatzas and Shreve [11] for more details on this and other classical results in stochastic calculus.

Theorem 2 (Local martingale representation) Let W be a Brownian motion and let F^W denote its natural filtration. Every local martingale M with respect to F^W admits a representation of the form

    M_t = M_0 + ∫_0^t b_s dW_s,    0 ≤ t < ∞,

for an F^W-adapted process b satisfying ∫_0^t b_s² ds < ∞ almost surely for every 0 < t < ∞. In particular, every such M has continuous sample paths. If M happens to be a square-integrable martingale (EM_t² < ∞, ∀t ≥ 0), then b can be chosen so that E ∫_0^t b_s² ds < ∞ for every 0 < t < ∞.

Note that, if the innovations conjecture (7) were true in general, then Theorem 1 would follow directly from Theorem 2. As the innovations conjecture is not true in general, we shall prove the theorem by performing a measure change that turns O into a Brownian motion, then apply Theorem 2, then invert the change of measure to revert to the innovations process N.


Proof of Theorem 1 We carry this out only in the case of bounded H, and for the case when M is a martingale, to present the ideas with the minimum of technical fuss.

If H is bounded, then so is Ĥ, so the process

    Z_t := E(−Ĥ · N)_t,    0 ≤ t ≤ T,

is an F̂-martingale. By the Girsanov theorem, since N is an F̂-BM under P, the process N_t + ∫_0^t Ĥ_s ds = O_t, that is, the observation process, is a (P̃, F̂)-BM under the probability measure P̃ defined on (Ω, F̂_T) by

    dP̃/dP |_{F̂_t} = Z_t,    0 ≤ t ≤ T.

Notice that the inverse likelihood ratio is

    dP/dP̃ |_{F̂_t} = Λ_t := 1/Z_t = exp( ∫_0^t Ĥ_s dN_s + (1/2) ∫_0^t Ĥ_s² ds ) = exp( ∫_0^t Ĥ_s dO_s − (1/2) ∫_0^t Ĥ_s² ds ) = E(Ĥ · O)_t.

The stochastic differential equations for Z, Λ are therefore

    Z_t = 1 − ∫_0^t Z_s Ĥ_s dN_s,    Λ_t = 1 + ∫_0^t Λ_s Ĥ_s dO_s.    (8)

Using the so-called Bayes rule, for s < t and an F̂_t-measurable random variable X:

    Ẽ[X|F̂_s] = (1/Z_s) E[Z_t X|F̂_s],

we find that, since M is a P-martingale, ΛM is a P̃-martingale: for s < t, we have

    Ẽ[Λ_t M_t|F̂_s] = (1/Z_s) E[Z_t Λ_t M_t|F̂_s] = Λ_s M_s.

An application of Theorem 2 gives a representation of the form

    Λ_t M_t = Λ_0 M_0 + ∫_0^t Ψ_s dO_s = Λ_0 M_0 + ∫_0^t Ψ_s (dN_s + Ĥ_s ds).    (9)

Now from (9), (8) and the integration by parts formula,¹ we obtain

    M_t = (Λ_t M_t) Z_t
        = Λ_0 M_0 Z_0 + ∫_0^t (Λ_s M_s) dZ_s + ∫_0^t Z_s d(Λ_s M_s) + [ΛM, Z]_t
        = M_0 + ∫_0^t Λ_s M_s (−Z_s Ĥ_s dN_s) + ∫_0^t Z_s Ψ_s (dN_s + Ĥ_s ds) − ∫_0^t Z_s Ĥ_s Ψ_s d[N]_s
        = M_0 + ∫_0^t (Z_s Ψ_s − M_s Ĥ_s) dN_s
        = M_0 + ∫_0^t Φ_s dN_s,

for Φ = ZΨ − MĤ, on using Λ_s M_s Z_s = M_s. □

¹For any two processes X, Y,

    X_t Y_t = X_0 Y_0 + ∫_0^t X_s dY_s + ∫_0^t Y_s dX_s + [X, Y]_t.


To proceed further, we now need yet more structure, this time on the signal process U.

2.3 Signal process model

We take the signal process to be of the form

    U_t = U_0 + ∫_0^t b(s, U_s) ds + ∫_0^t σ(s, U_s) dW_s,    (10)

where W is a Brownian motion correlated with B in the observation model (5),

    (d/dt)[W, B]_t = ρ,    ρ ∈ [−1, 1],

and the functions b : [0, T] × R → R and σ : [0, T] × R → R satisfy the Lipschitz and linear growth conditions

    |b(t, x) − b(t, y)| + |σ(t, x) − σ(t, y)| ≤ K|x − y|,    ∀x, y ∈ R,
    |b(t, x)| + |σ(t, x)| ≤ K(1 + |x|),    ∀x ∈ R,

for some real K > 0. Then there exists a unique process U that satisfies (10).

Let f ∈ C²_0(R) be a twice continuously differentiable function with compact support². The generator of U is A_t, given by

    A_t f(x) = b(t, x)f′(x) + (1/2)σ²(t, x)f″(x).

For brevity we use the notation f_t ≡ f(U_t), (Af)_t ≡ A_t f(U_t), 0 ≤ t ≤ T. By Itô's formula we have that

    M^f_t := f_t − f_0 − ∫_0^t (Af)_s ds = ∫_0^t σ(s, U_s) f′(U_s) dW_s    (11)

is an F-local martingale, and in fact a square-integrable martingale if f is of compact support. We assume this is the case. The cross-variation of M^f with B is

    [M^f, B]_t = ∫_0^t σ(s, U_s) f′(U_s) d[W, B]_s = ∫_0^t ρσ(s, U_s) f′(U_s) ds =: ∫_0^t α^f_s ds,    (12)

where we have defined the process α^f by

    α^f_t := ρσ(t, U_t) f′(U_t),    0 ≤ t ≤ T.

2.4 Fundamental filtering equation

The fundamental filtering theorem is the following.

Theorem 3 For the observation and signal process models of (5) and (10) we have, for every f ∈ C²_0(R), and with f_t ≡ f(U_t), using the notation φ̂_t := E[φ_t|F̂_t] for any process φ (and writing (Âf)_s := E[(Af)_s|F̂_s]), the fundamental filtering equation

    f̂_t = f̂_0 + ∫_0^t (Âf)_s ds + ∫_0^t ( E[f_s H_s|F̂_s] − f̂_s Ĥ_s + α̂^f_s ) dN_s,    0 ≤ t ≤ T.    (13)

The proof will require the following lemma.

Lemma 1 Consider two F-adapted processes V, C with E|V_t| < ∞ for all t ∈ [0, T] and E ∫_0^T |C_t| dt < ∞. If

    J_t := V_t − ∫_0^t C_s ds

is an F-martingale, then

    Ĵ_t := V̂_t − ∫_0^t Ĉ_s ds

is an F̂-martingale.

²So f is zero outside of a compact set.


Proof For s ≤ t, writing ∫_0^t C_u du = ∫_0^s C_u du + ∫_s^t C_u du and using the fact that J is an F-martingale, we have

    E[ V_t − ∫_0^s C_u du − ∫_s^t C_u du | F_s ] = V_s − ∫_0^s C_u du
    ⇒ E[ V_t − ∫_s^t C_u du | F_s ] = V_s,    (14)

which we shall use shortly. Now consider

    E[Ĵ_t|F̂_s] = E[ V̂_t − ∫_0^t Ĉ_u du | F̂_s ]
              = E[ V̂_t − ∫_s^t Ĉ_u du | F̂_s ] − ∫_0^s Ĉ_u du
              = E[ E[V_t|F̂_t] − ∫_s^t E[C_u|F̂_u] du | F̂_s ] − ∫_0^s Ĉ_u du
              = E[V_t|F̂_s] − ∫_s^t E[C_u|F̂_s] du − ∫_0^s Ĉ_u du
              = E[ E[V_t|F_s] | F̂_s ] − ∫_s^t E[ E[C_u|F_s] | F̂_s ] du − ∫_0^s Ĉ_u du
              = E[ E[ V_t − ∫_s^t C_u du | F_s ] | F̂_s ] − ∫_0^s Ĉ_u du
              = E[V_s|F̂_s] − ∫_0^s Ĉ_u du    (using (14))
              = V̂_s − ∫_0^s Ĉ_u du
              = Ĵ_s.    □

Proof of Theorem 3 Recall from (11) that

    f_t = f_0 + ∫_0^t (Af)_s ds + M^f_t,    (15)

where M^f_t = ∫_0^t σ(s, U_s) f′(U_s) dW_s is an F-martingale. So, using Lemma 1 and Theorem 1, we have

    M̂^f_t := f̂_t − f̂_0 − ∫_0^t (Âf)_s ds = F̂-martingale =: ∫_0^t Φ_s dN_s,    (16)

for a suitable F̂-adapted process Φ such that E ∫_0^T Φ_t² dt < ∞. We want to compute Φ, to establish that

    Φ_t = E[f_t H_t|F̂_t] − f̂_t Ĥ_t + α̂^f_t,    0 ≤ t ≤ T.    (17)

This will be accomplished by computing E[f_t O_t|F̂_t] = f̂_t O_t (recall that O_t is F̂_t-measurable) in two ways and comparing the results.


On the one hand, using (15), (5), (12) and the integration by parts formula,

    f_t O_t = f_0 O_0 + ∫_0^t f_s dO_s + ∫_0^t O_s df_s + [f, O]_t
            = ∫_0^t f_s (H_s ds + dB_s) + ∫_0^t O_s ( (Af)_s ds + dM^f_s ) + [M^f, B]_t
            = ∫_0^t ( f_s H_s + O_s (Af)_s + α^f_s ) ds + F-martingale.

So by Lemma 1,

    f̂_t O_t = ∫_0^t ( E[f_s H_s|F̂_s] + O_s (Âf)_s + α̂^f_s ) ds + F̂-martingale.    (18)

On the other hand, from (16), (6) and the integration by parts formula, we obtain

    f̂_t O_t = f̂_0 O_0 + ∫_0^t f̂_s dO_s + ∫_0^t O_s df̂_s + [f̂, O]_t
            = ∫_0^t f̂_s (Ĥ_s ds + dN_s) + ∫_0^t O_s ( (Âf)_s ds + Φ_s dN_s ) + ∫_0^t Φ_s ds
            = ∫_0^t ( f̂_s Ĥ_s + O_s (Âf)_s + Φ_s ) ds + F̂-martingale.    (19)

Comparing (18) and (19), the difference between the bounded variation parts is a continuous martingale of bounded variation, so is constant, and is null at zero, so is identically zero, and therefore (17) holds. □

2.4.1 Linear observations

Take H_t = h(t, U_t) = G(t)U_t and f(x) = x^k, k = 1, 2, . . . . Then we obtain from (13):

    Û_t = Û_0 + ∫_0^t E[b(s, U_s)|F̂_s] ds + ∫_0^t ( G(s)( E[U_s²|F̂_s] − (Û_s)² ) + ρ E[σ(s, U_s)|F̂_s] ) dN_s    (k = 1),    (20)

    E[U_t^k|F̂_t] = E[U_0^k|F̂_0] + k ∫_0^t ( E[b(s, U_s)U_s^{k−1}|F̂_s] + (1/2)(k − 1) E[σ²(s, U_s)U_s^{k−2}|F̂_s] ) ds
                  + ∫_0^t ( G(s)( E[U_s^{k+1}|F̂_s] − Û_s E[U_s^k|F̂_s] ) + kρ E[σ(s, U_s)U_s^{k−1}|F̂_s] ) dN_s,    k = 2, 3, . . . .    (21)

Equations (20) and (21) convey the complexity of non-linear filtering. To solve the equation for the kth conditional moment, one needs to know the (k + 1)th conditional moment, as well as E[g(s, U_s)|F̂_s] for g(s, x) = b(s, x)x^{k−1}, g(s, x) = σ²(s, x)x^{k−2} and g(s, x) = σ(s, x)x^{k−1}. This means the computation of conditional moments cannot be achieved by induction on k, and the problem is inherently infinite-dimensional except in the linear case.

2.4.2 Linear observations and linear signal

Now take h(t, x) = G(t)x as before, and b(t, x) = A(t)x, σ(t, x) = C(t), for deterministic functions A(·), C(·), and suppose that the signal process has a Gaussian initial distribution, so that the signal and observation processes follow

    dU_t = A(t)U_t dt + C(t) dW_t,    U_0 ∼ N(µ, v),
    dO_t = G(t)U_t dt + dB_t,    O_0 = 0,


with U_0 independent of B and of W. The two-dimensional process (U, O) is then Gaussian, so the conditional distribution of U_t given F̂_t is normal (so, in particular, is completely characterised by its mean and variance), with mean Û_t := E[U_t|F̂_t] and variance

    var[U_t|F̂_t] =: V_t = E[(U_t − Û_t)²|F̂_t] = E[U_t²|F̂_t] − (Û_t)².

Notice that the initial values are Û_0 = E[U_0|F̂_0] = EU_0 = µ, and V_0 = E[(U_0 − Û_0)²|F̂_0] = E[(U_0 − µ)²] = var(U_0) = v. The problem then boils down to finding an algorithm for computing the sufficient statistics Û_t, V_t from their initial values Û_0 = µ, V_0 = v.

Now, from (20) we obtain, along with the initial condition Û_0 = µ, the SDE

    dÛ_t = A(t)Û_t dt + ( G(t)V_t + ρC(t) ) dN_t,    Û_0 = µ.    (22)

From (21) with k = 2 we obtain

    dE[U_t²|F̂_t] = ( C²(t) + 2A(t)E[U_t²|F̂_t] ) dt + ( G(t)( E[U_t³|F̂_t] − Û_t E[U_t²|F̂_t] ) + 2ρC(t)Û_t ) dN_t,    E[U_0²|F̂_0] − µ² = v.    (23)

But for a normal random variable X ∼ N(m, Σ²), we have E[X³] = m(m² + 3Σ²), whence

    E[U_t³|F̂_t] = Û_t( (Û_t)² + 3V_t ),

and therefore

    E[U_t³|F̂_t] − Û_t E[U_t²|F̂_t] = Û_t( (Û_t)² + 3V_t − E[U_t²|F̂_t] ) = 2V_t Û_t.

Using this, (22), (23) and the Itô formula, we obtain

    dV_t = d( E[U_t²|F̂_t] − (Û_t)² )
         = ( C²(t) + 2A(t)E[U_t²|F̂_t] ) dt + ( 2G(t)V_t Û_t + 2ρC(t)Û_t ) dN_t − 2Û_t dÛ_t − d[Û]_t,

which simplifies to the non-stochastic Riccati equation

    dV_t/dt = (1 − ρ²)C²(t) + 2( A(t) − ρC(t)G(t) )V_t − G²(t)V_t²,    V_0 = v.    (24)

In other words, the conditional variance V_t is a deterministic function of t, given by the solution of (24). Thus, there is in fact only one sufficient statistic, the conditional mean Û_t, which satisfies the linear SDE (22); this is the celebrated Kalman-Bucy filter. We summarise all this below.

Theorem 4 (One-dimensional Kalman-Bucy filter) On a filtered probability space (Ω, F, F, P), with F = (F_t)_{0≤t≤T}, let U be an F-adapted signal process satisfying

    dU_t = A(t)U_t dt + C(t) dW_t,    0 ≤ t ≤ T,


and let O be an F-adapted observation process satisfying

    dO_t = G(t)U_t dt + dB_t,    0 ≤ t ≤ T,

where W, B are F-Brownian motions with correlation ρ, and the coefficients A(·), C(·), G(·) are deterministic functions satisfying

    ∫_0^T ( |A(t)| + C²(t) + G²(t) ) dt < ∞.

Define the observation filtration F̂ := (F̂_t)_{0≤t≤T} by F̂_t := σ(O_s; 0 ≤ s ≤ t). Suppose U_0 is an F_0-measurable random variable whose distribution is Gaussian with mean µ and variance v, independent of W and B. Then the conditional expectation Û_t := E[U_t|F̂_t], 0 ≤ t ≤ T, satisfies

    dÛ_t = A(t)Û_t dt + [G(t)V_t + ρC(t)] dN_t,    Û_0 = µ,

where N is the innovations process, an F̂-Brownian motion satisfying

    dN_t = dO_t − G(t)Û_t dt,

and V_t = var[U_t|F̂_t] is the conditional variance, which is independent of F̂_t and satisfies the deterministic Riccati equation

    dV_t/dt = (1 − ρ²)C²(t) + 2[A(t) − ρC(t)G(t)]V_t − G²(t)V_t²,    V_0 = v.

Remark 1 (Validity of the innovations conjecture for linear systems) It is now straightforward to confirm the validity of the innovations conjecture F^N_t = F̂_t, 0 ≤ t ≤ T, for linear systems. The solution Û of the SDE (22) is adapted to the filtration F^N of the driving Brownian motion N, so (F^Û_t) ⊆ (F^N_t), where F^Û_t = σ(Û_s; 0 ≤ s ≤ t). So from the relation O_t = N_t + ∫_0^t G(s)Û_s ds we see that O is F^N-adapted, i.e. (F̂_t) ⊆ (F^N_t). Because the reverse inclusion (F^N_t) ⊆ (F̂_t) always holds, the two filtrations are the same.
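Theorem 4 translates directly into a recursive algorithm. The following Euler-scheme sketch (constant, illustrative coefficients; not code from the lectures) runs the filter and the Riccati equation along a sampled observation path, and sanity-checks the variance in the solvable case A = C = 0 (a constant unknown signal), where the Riccati equation gives V_t = v/(1 + vG²t):

```python
import numpy as np

# Minimal Euler-scheme sketch of the one-dimensional Kalman-Bucy filter
# of Theorem 4, for constant coefficients A, C, G and correlation rho.
def kalman_bucy_1d(O, dt, A, C, G, rho, mu, v):
    """Run the filter along a discretely sampled observation path O."""
    U_hat, V = mu, v
    U_path, V_path = [U_hat], [V]
    for dO in np.diff(O):
        dN = dO - G * U_hat * dt                    # innovation increment
        U_hat += A * U_hat * dt + (G * V + rho * C) * dN
        # deterministic Riccati equation for the conditional variance
        V += ((1 - rho**2) * C**2 + 2 * (A - rho * C * G) * V - G**2 * V**2) * dt
        U_path.append(U_hat)
        V_path.append(V)
    return np.array(U_path), np.array(V_path)

# Sanity check with A = C = 0: a constant signal (an unknown drift),
# where the Riccati equation solves to V_t = v/(1 + v*G^2*t).
rng = np.random.default_rng(2)
dt, n, G, v = 1e-3, 1000, 1.0, 0.5
T = n * dt
U0 = np.sqrt(v) * rng.standard_normal()             # true (constant) signal
O = np.concatenate(([0.0], np.cumsum(G * U0 * dt + np.sqrt(dt) * rng.standard_normal(n))))
U_path, V_path = kalman_bucy_1d(O, dt, A=0.0, C=0.0, G=G, rho=0.0, mu=0.0, v=v)
print(round(V_path[-1], 4), round(v / (1 + v * G**2 * T), 4))
```

Note that the variance recursion uses no data at all, mirroring the fact that V_t is deterministic; only the conditional mean is driven by the innovations.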

2.5 Multi-dimensional Kalman-Bucy filter

A multi-dimensional version of the Kalman-Bucy filter can be derived along similar lines to the one-dimensional case. We state the result below.

Theorem 5 Consider a filtered probability space (Ω, F, F, P), with F = (F_t)_{0≤t≤T}, and two F-adapted processes U, O as given below. Let U = (U_t)_{0≤t≤T} be an n-dimensional signal process satisfying

    dU_t = A(t)U_t dt + C(t) dW_t,    U_0 ∼ N(µ, v)    (linear signal),    (25)

where U_0 ∼ N(µ, v) denotes an n-dimensional F_0-measurable Gaussian vector with mean µ ∈ R^n and covariance matrix v ∈ R^{n×n}, independent of the d-dimensional Brownian motion W, and where A(t) ∈ R^{n×n}, C(t) ∈ R^{n×d}. Let O = (O_t)_{0≤t≤T} be an m-dimensional observation process satisfying

    dO_t = G(t)U_t dt + D(t) dB_t,    O_0 = 0    (linear observations),

where G(t) ∈ R^{m×n}, D(t) ∈ R^{m×k}, and B is a k-dimensional Brownian motion independent of W and U_0.

We assume that A, C, G, D are bounded on bounded intervals, that DD^T is non-singular, and that (D(t)D(t)^T)^{−1} is bounded on every bounded t-interval.

Let F̂ = (F̂_t)_{0≤t≤T} denote the observation filtration generated by O, so that F̂_t = σ(O_s; 0 ≤ s ≤ t). The conditional expectation vector Û_t := E[U_t|F̂_t], 0 ≤ t ≤ T, satisfies the SDE

    dÛ_t = A(t)Û_t dt + V_t G^T(t)( D(t)D^T(t) )^{−1}( dO_t − G(t)Û_t dt ),    Û_0 = µ,
         = A(t)Û_t dt + V_t G^T(t)( D(t)D^T(t) )^{−1} dN_t,    Û_0 = µ,    (26)

where N is the innovations process, defined by

    N_t := O_t − ∫_0^t G(s)Û_s ds,    0 ≤ t ≤ T,

and satisfying

    N_t = ∫_0^t D(s) dB̂_s,    (27)

where B̂ is a standard k-dimensional F̂-Brownian motion. The error U_t − Û_t is independent of F̂_t, and the error covariance

    V_t := E[(U_t − Û_t)(U_t − Û_t)^T|F̂_t] = var[U_t|F̂_t]

satisfies the deterministic matrix Riccati equation

    dV_t/dt = A(t)V_t + V_t A^T(t) − V_t G^T(t)( D(t)D^T(t) )^{−1} G(t)V_t + C(t)C^T(t),    V_0 = v.

We make two remarks.

• Notice that by (27) we can rewrite (26) as

    dÛ_t = A(t)Û_t dt + V_t G^T(t)( D(t)D^T(t) )^{−1} D(t) dB̂_t,    Û_0 = µ,

which is a linear SDE of the same type as (25).

• Since U, Û satisfy (25), (26) and U_0 is Gaussian, U_t, Û_t are Gaussian vectors for each t, and the error U_t − Û_t is also Gaussian: U_t − Û_t has mean 0 and covariance V_t, and Law(U_t|F̂_t) = N(Û_t, V_t).
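The matrix Riccati equation is again deterministic and can be integrated off-line, before any data arrive. A minimal Euler-integration sketch (constant, illustrative coefficient matrices; a decoupled diagonal case is used to check against the scalar solution v(T) = v(0)/(1 + v(0)T)):

```python
import numpy as np

# Sketch: Euler integration of the deterministic matrix Riccati equation
# dV/dt = A V + V A^T - V G^T (D D^T)^{-1} G V + C C^T from Theorem 5.
def riccati_path(V0, A, C, G, D, T, n):
    V = V0.copy()
    dt = T / n
    DDinv = np.linalg.inv(D @ D.T)
    for _ in range(n):
        V = V + dt * (A @ V + V @ A.T - V @ G.T @ DDinv @ G @ V + C @ C.T)
        V = 0.5 * (V + V.T)          # re-symmetrise against round-off
    return V

# Decoupled check: with A = C = 0 and G = D = I, each diagonal entry
# solves the scalar equation dv/dt = -v^2, i.e. v_i(T) = v_i(0)/(1 + v_i(0)T).
V0 = np.diag([0.5, 1.0])
Z = np.zeros((2, 2))
I2 = np.eye(2)
VT = riccati_path(V0, A=Z, C=Z, G=I2, D=I2, T=1.0, n=10_000)
print(np.round(np.diag(VT), 4))
```

In this diagonal configuration the matrix equation decouples into two independent scalar Riccati equations, which is a convenient way to test an implementation before using coupled coefficients.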

3 Merton problem with uncertain drift

A stock price process S = (S_t)_{0≤t≤T} follows

    dS_t = σS_t(λ dt + dB_t),    (28)

on a complete probability space (Ω, F, P) equipped with a filtration F := (F_t)_{0≤t≤T}, where B = (B_t)_{0≤t≤T} is an F-Brownian motion. In a full information model, σ > 0 and λ are assumed to be known constants. In a partial information model with continuous observations, we no longer assume that we know the values of the parameters in the stochastic differential equation (SDE) (28), and we restrict trading strategies to be F̂-adapted, where F̂ is the filtration generated by the stock price. Then σ is still known from the quadratic variation of S, but the value of λ is not.

In the partial information case, we shall model the uncertainty in λ by treating it as a random variable with a Gaussian prior distribution, λ ∼ N(λ_0, v_0) (the normal probability law with mean λ_0 and variance v_0), independent of B. We consider a Merton optimal investment problem where an agent with a power utility function

    U(x) = x^γ/γ,    0 < γ < 1,

may invest a portion of his wealth in shares and the remaining wealth in a cash account with zero interest rate (for simplicity).



3.1 Full information case

In this case, with σ, λ in (28) treated as known constants, the agent may use an F-adapted strategy θ = (θ_t)_{0≤t≤T}, where θ_t is the proportion of wealth invested in shares at time t, satisfying ∫_0^T θ_t² dt < ∞ almost surely. The F-adapted wealth process X = (X_t)_{0≤t≤T} then follows

    dX_t = σθ_t X_t(λ dt + dB_t).    (29)

We define the set of admissible strategies A as those strategies θ such that X_t ≥ 0 almost surely for all t ∈ [0, T]. The objective is to maximise the expected utility of terminal wealth

    E[U(X_T)|X_0 = x]

over all admissible strategies.

Let Q denote the unique martingale measure for this market. The change of measure martingale Z = (Z_t)_{0≤t≤T} is given by

    Z_t := dQ/dP |_{F_t} = E(−λ · B)_t = exp( −λB_t − (1/2)λ²t ),

and satisfies the SDE

    dZ_t = −λZ_t dB_t,    0 ≤ t ≤ T.

Under Q, the process B^Q defined by

    B^Q_t := B_t + λt

is a Brownian motion. Then we can write (29) as

    dX_t = σθ_t X_t dB^Q_t,    (30)

so X is a local Q-martingale. Using the Itô formula we write the solution to (30) given X_0 = x as

    X_t = xE(σθ · B^Q)_t,    0 ≤ t ≤ T,

so by the Novikov condition X will be a Q-martingale provided that θ satisfies the integrability condition

    E exp( (1/2)σ² ∫_0^T θ_t² dt ) < ∞.
Let us assume this is the case from now on, and we shall see that the optimal strategy does indeed satisfy this integrability condition.

3.1.1 Portfolio optimisation via convex duality

Given a continuous, increasing, concave utility function U(·), define its convex conjugate V : R_+ → R by

    V(η) := sup_{x∈dom(U)} [U(x) − xη],    η > 0.    (31)

Note that (31) is equivalent to the bidual relation

    U(x) = inf_{η>0} [V(η) + xη],    x ∈ dom(U).    (32)

We shall employ the classical dual approach to portfolio optimisation, which is summarised by the following theorem. See Karatzas [9] for a proof (though we shall justify the crucial results).


3 MERTON PROBLEM WITH UNCERTAIN DRIFT 14 Theorem 6 In a complete market model with continuous wealth process X and with unique martingale measure Q, define the primal value function by u(x) := sup

θ∈A

EU(XT ), x ∈ dom(U), (33) and define the dual value function by v(η) := EV

  • η dQ

dP

  • ,

η > 0. Then we have

  • 1. u(x) and v(η) are conjugate:

v(η) = sup

x∈dom(U)

[u(x) − xη], u(x) = inf

η>0[v(η) + xη],

so that u′(x) = η (equivalently, v′(η) = −x);

2. The optimal terminal wealth in (33) is X*_T satisfying

U′(X*_T) = η dQ/dP, equivalently, X*_T = I(η dQ/dP);

3. The following properties of u′(x) and v′(η) hold:

u′(x) = E U′(X*_T), v′(η) = E^Q V′(η dQ/dP). (34)

To get an idea of how this theorem is obtained, we consider the maximisation of the objective functional E U(XT) subject to the constraint E^Q XT = E[ZT XT] = x via the Lagrangian

L(XT, η) := E U(XT) + η(x − E[ZT XT]) = ηx + E[U(XT) − ηZT XT] ≤ ηx + E V(ηZT) = ηx + v(η),

where η > 0 is a Lagrange multiplier whose role is to enforce the constraint. The first order condition with respect to the terminal wealth gives that the optimal terminal wealth X*_T satisfies

U′(X*_T) = ηZT, or X*_T = I(ηZT),

where I is the inverse of U′. Substituting this into the constraint fixes the multiplier η as satisfying E[ZT I(ηZT)] = x. For power utility, U(x) = x^γ/γ, we have

V(η) = −η^q/q, q = −γ/(1 − γ), I(η) = η^{−(1−q)}.

Applying these results we obtain the optimal terminal wealth as X*_T = (ηZT)^{−(1−q)}.
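As a quick numerical sanity check, the closed-form conjugate V(η) = −η^q/q for power utility can be compared with a direct grid maximisation of U(x) − xη. This is a minimal sketch; the parameter values are illustrative choices, not from the text.

```python
import numpy as np

# Approximate V(eta) = sup_x [U(x) - x*eta] for power utility
# U(x) = x^gamma / gamma, and compare with the closed form
# V(eta) = -eta^q / q, with q = -gamma / (1 - gamma).
gamma = 0.5
q = -gamma / (1.0 - gamma)          # q = -1 for gamma = 0.5

def U(x):
    return x**gamma / gamma

def V_closed(eta):
    return -eta**q / q

# dom(U) = (0, infinity), truncated to a fine grid
x_grid = np.linspace(1e-6, 50.0, 2_000_000)
for eta in [0.5, 1.0, 2.0]:
    V_num = np.max(U(x_grid) - x_grid * eta)
    print(eta, V_num, V_closed(eta))
```

The maximiser is x* = η^{1/(γ−1)}, which lies well inside the grid for the η values above, so the grid maximum matches the closed form to within the grid spacing.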

The constraint gives the relation between the Lagrange multiplier and the initial wealth x as η^{−(1−q)} E[Z^q_T] = x. For t ∈ [0, T] an easy computation using the explicit expression for ZT gives

E[Z^q_T | Ft] = Z^q_t exp(−(1/2)q(1 − q)λ²(T − t)), (35)

so that η is given explicitly by

η^{−(1−q)} = x exp((1/2)q(1 − q)λ²T).

The optimal wealth process X* = (X*_t)0≤t≤T is a Q-martingale, so for t ≤ T we have

X*_t = E^Q[X*_T | Ft] = (1/Zt) E[ZT X*_T | Ft] = x exp((1/2)q(1 − q)λ²t) Z^{−(1−q)}_t, (36)

on using the explicit forms of η and X*_T along with (35).

The optimal trading strategy θ* is given by applying Itô's formula to compute dX*_t and noting that the coefficient of dBt is equal to σθ*_t X*_t. This gives that the optimal strategy is to keep a constant proportion of wealth in the stock:

θ*_t = λ/(σ(1 − γ)). (37)

The constant proportion on the right-hand side of (37) is called the Merton proportion.
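For a constant proportion θ the wealth SDE (29) solves in closed form, so E[U(XT)] can be evaluated analytically as a function of θ and maximised directly; a grid search then recovers the Merton proportion (37). A minimal sketch with illustrative parameter values:

```python
import numpy as np

# For constant theta, dX = sigma*theta*X*(lam dt + dB) gives
# X_T = x*exp(sigma*theta*lam*T - 0.5*sigma^2*theta^2*T + sigma*theta*B_T),
# so for U(x) = x^gamma/gamma,
# E[U(X_T)] = (x^gamma/gamma)*exp(gamma*(sigma*theta*lam*T
#                                  - 0.5*(1-gamma)*sigma^2*theta^2*T)).
x0, sigma, lam, gamma, T = 1.0, 0.2, 0.4, 0.5, 1.0

def expected_utility(theta):
    return (x0**gamma / gamma) * np.exp(
        gamma * (sigma * theta * lam * T
                 - 0.5 * (1.0 - gamma) * sigma**2 * theta**2 * T))

thetas = np.linspace(0.0, 10.0, 100_001)
theta_star = thetas[np.argmax(expected_utility(thetas))]
merton = lam / (sigma * (1.0 - gamma))   # Merton proportion (37)
print(theta_star, merton)
```

With these parameters the Merton proportion is 4, and the grid maximiser coincides with it up to the grid spacing.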

3.2 Partial information case

Recall the stock price process

dSt = σSt(λdt + dBt) = σSt dξt.

In the partial information case the agent is restricted to using F̂-adapted trading strategies, where F̂ is the observation filtration: F̂ := (F̂t)0≤t≤T, with

F̂t := σ{ξu, 0 ≤ u ≤ t} = σ{Su, 0 ≤ u ≤ t},

and where the observation process ξ = (ξt)0≤t≤T is defined by

ξt := (1/σ) ∫_0^t dSu/Su = λt + Bt, (38)

corresponding to noisy observations of λ, with B representing the noise. With continuous observations σ is a known constant, while λ is an unknown constant, and hence is modelled as a random variable. Let us assume that the distribution of λ is Gaussian, λ ∼ N(λ0, v0), independent of B.

We are faced with a Kalman-Bucy type filtering problem whose unobservable signal process is the market price of risk λ. The signal process SDE is

dλ = 0, (39)

and the observation process SDE is (38).
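Since dSt = σSt dξt for the log-Brownian stock, the observation process can be recovered pathwise from prices as ξt = (1/σ) log(St/S0) + (1/2)σt. A quick pathwise check of this identity, with illustrative parameter values:

```python
import numpy as np

# Simulate S from xi_t = lam*t + B_t and verify that
# (1/sigma)*log(S_t/S_0) + 0.5*sigma*t recovers xi_t exactly.
rng = np.random.default_rng(1)
sigma, lam, T, n = 0.2, 0.35, 1.0, 200_000
dt = T / n
B = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
t = np.linspace(0.0, T, n + 1)
# S_0 = 1 and dS = sigma*S*dxi gives S_t = exp(sigma*xi_t - 0.5*sigma^2*t)
S = np.exp(sigma * (lam * t + B) - 0.5 * sigma**2 * t)
xi_exact = lam * t + B
xi_from_S = np.log(S) / sigma + 0.5 * sigma * t
print(np.max(np.abs(xi_exact - xi_from_S)))   # agreement up to rounding
```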


We apply Theorem 4 to the signal process λ in (39) and observation process ξ in (38), so that A(t) = C(t) = 0 and G(t) = 1 for all t ∈ [0, T]. Then the optimal filter λ̂t := E[λ | F̂t], 0 ≤ t ≤ T, satisfies

dλ̂t = vt dB̂t, λ̂0 = λ0, (40)

where vt := E[(λ − λ̂t)² | F̂t], 0 ≤ t ≤ T, the conditional variance, satisfies the Riccati equation

dvt/dt = −v²t, (41)

with initial value v0, so that

vt = v0/(1 + v0t). (42)

The process B̂ is an F̂-Brownian motion, the innovations process, satisfying

dB̂t = dξt − λ̂t dt, 0 ≤ t ≤ T. (43)

Using this in (40), the optimal filter can also be written in terms of the observable ξ as

λ̂t = (λ0 + v0ξt)/(1 + v0t). (44)

The effect of the filtering is that the agent is now investing in a stock with dynamics given by dSt = σSt dξt which, using (43), becomes

dSt = σSt(λ̂t dt + dB̂t), 0 ≤ t ≤ T. (45)

The (F̂-adapted) wealth process X0 then follows

dX0_t = σθ0_t X0_t (λ̂t dt + dB̂t), 0 ≤ t ≤ T, (46)

where θ0_t is the proportion of wealth invested in shares at time t, an F̂-adapted process satisfying ∫_0^T (θ0_t)² dt < ∞ almost surely, and such that X0_t ≥ 0 almost surely for all t ∈ [0, T]. Denote by A0 the set of such admissible strategies. The objective is to maximise expected utility of terminal wealth E[U(X0_T) | X0_0 = x],

over all admissible strategies. This may now be treated as a full information problem, which we solve via duality techniques.

Let Q0 denote the unique martingale measure for this market. The change of measure martingale Z0 := (Z0_t)0≤t≤T is given by

Z0_t := (dQ0/dP)|F̂t = E(−λ̂ · B̂)t = exp(−∫_0^t λ̂s dB̂s − (1/2) ∫_0^t λ̂²s ds), (47)

and satisfies the SDE

dZ0_t = −λ̂t Z0_t dB̂t, 0 ≤ t ≤ T. (48)

We may write Z0_t = f(t, λ̂t), where f : [0, T] × R → R+ is a smooth function, and apply Itô's formula along with the SDE (40) for λ̂t to give

dZ0_t = (ft(t, λ̂t) + (1/2)v²t fxx(t, λ̂t)) dt + vt fx(t, λ̂t) dB̂t, 0 ≤ t ≤ T, (49)

with subscripts of f denoting partial derivatives. Equating (48) and (49) yields the partial differential equations (PDEs)

vt fx(t, x) = −x f(t, x), ft(t, x) + (1/2)v²t fxx(t, x) = 0,


with f(0, λ0) = Z0_0 = 1. The solution to these PDEs gives Z0_t in the form

Z0_t = (v0/vt)^{1/2} exp(−(1/2)(λ̂²t/vt − λ²0/v0)), 0 ≤ t ≤ T. (50)

By convex duality, the primal value function

u0(x) := sup_{θ0∈A0} E[U(X0_T) | X0_0 = x], x > 0, (51)

and dual value function

v0(y) := E[V(yZ0_T)], y > 0, (52)

are convex conjugates:

v0(y) = sup_{x>0} (u0(x) − xy), (53)

where V(y) := sup_{x>0} [U(x) − xy] is the conjugate of U. For power utility, V is given by

V(y) = −y^q/q, q = −γ/(1 − γ). (54)

Hence the dual value function is given by v0(y) = −(y^q/q)H0, where H0 := E[(Z0_T)^q], and the primal value function is given by

u0(x) = (x^γ/γ) H0^{1−γ}. (55)

The optimal terminal wealth X0,*_T, attained by adopting the strategy that achieves the supremum in (51), is given by U′(X0,*_T) = yZ0_T, or, with I ≡ −V′ denoting the inverse of U′,

X0,*_T = −V′(yZ0_T), y = u′0(x).

Hence

X0,*_T = (x/H0)(Z0_T)^{−(1−q)}. (56)

The optimal wealth process X0,* is a (Q0, F̂)-martingale, so

X0,*_t = E^{Q0}[X0,*_T | F̂t] = (1/Z0_t) E[Z0_T X0,*_T | F̂t], (57)

where E^{Q0} denotes expectation under Q0. From (40) we deduce that for t ≤ T, conditional on F̂t, λ̂T is normally distributed, λ̂T ∼ N(λ̂t, vt − vT). For a normally distributed random variable Y ∼ N(m, s²), we have

E exp(cY²) = (1 − 2cs²)^{−1/2} exp(cm²/(1 − 2cs²)), (58)

so that, given the explicit expression (50) for Z0_t, the constant H0 and the right-hand side of (57) can be computed in closed form. Then, H0 is given by

H0 = ((1 + v0T)^q/(1 + qv0T))^{1/2} exp(−(1/2)q(1 − q)λ²0T/(1 + qv0T)), (59)


so that (55) gives the value function explicitly, as

u0(x) = (x^γ/γ) ((1 − γ − γv0T)/((1 − γ)(1 + v0T)))^{γ/2} ((1 − γ)/(1 − γ − γv0T))^{1/2} exp((1/2)γλ²0T/(1 − γ − γv0T)). (60)

Note that for a well-posed problem we require the risk aversion coefficient γ to satisfy 1 − γ − γv0T > 0, or γ < 1/(1 + v0T). For the optimal wealth process, straightforward computations give the formula

X0,*_t = x (Ψ0_t/Ψ0_0)^{1/2} exp((1/2)(1 − q)(Λ0_t − Λ0_0)), (61)

where, for t ∈ [0, T],

Ψ0_t := vt/(1 + qvt(T − t)), Λ0_t := λ̂²t/(vt(1 + qvt(T − t))). (62)

To compute the optimal trading strategy θ0,*, we apply the Itô formula to (61) and compare the coefficient of dB̂t with that in (46) for the case of the optimal wealth process. This gives

θ0,*_t = (λ̂t/(σ(1 − γ))) (1/(1 + qvt(T − t))). (63)

The classical Merton formula is thus altered in two ways: the constant λ is replaced by its filtered estimate λ̂t, and the risky asset proportion is scaled by the factor (1 + qvt(T − t))^{−1}. As noted by Rogers [20], the more risk averse the investor, the less he invests in shares, and as t → T, the optimal strategy gets closer and closer to the Merton rule.
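The filtering formulas (40)–(44) and the strategy (63) can be illustrated with a short simulation: an Euler discretisation of the filter SDE is compared with the closed form (44), and the resulting partial-information proportion is set against the full-information Merton proportion (37). All parameter values are illustrative.

```python
import numpy as np

# Simulate the observation xi_t = lam*t + B_t for an unknown constant lam,
# run the filter d(lam_hat) = v_t*(d xi - lam_hat dt) by Euler stepping,
# and compare with the closed form lam_hat_t = (lam0 + v0*xi_t)/(1 + v0*t).
rng = np.random.default_rng(0)
sigma, gamma, lam, lam0, v0, T, n = 0.2, 0.5, 0.3, 0.0, 1.0, 1.0, 100_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), n)

lam_hat, xi = lam0, 0.0
for k in range(n):
    t = k * dt
    v_t = v0 / (1.0 + v0 * t)
    d_xi = lam * dt + dB[k]
    lam_hat += v_t * (d_xi - lam_hat * dt)   # Euler step of (40) with (43)
    xi += d_xi

closed = (lam0 + v0 * xi) / (1.0 + v0 * T)   # (44) evaluated at t = T
theta_partial = closed / (sigma * (1.0 - gamma))   # (63) at t = T (factor = 1)
theta_merton = lam / (sigma * (1.0 - gamma))       # Merton proportion (37)
print(lam_hat, closed, theta_partial, theta_merton)
```

At t = T the scaling factor in (63) equals 1, so the partial-information strategy is the Merton rule with λ̂T in place of λ, consistent with the remark above.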

4 Optimal hedging of basis risk with partial information

Now we analyse the hedging of a contingent claim in an example of an incomplete market, a basis risk model, first under a full information assumption, and then under a partial information scenario. See [14, 15, 16] for more details on these models.

4.1 Basis risk model: full information case

In a full information model, the setting is a filtered probability space (Ω, F, F := (Ft)0≤t≤T, P), where the filtration F is the P-augmentation of that generated by a two-dimensional Brownian motion (B, B⊥). A traded stock price S := (St)0≤t≤T follows a log-Brownian process given by

dSt = σSt(λdt + dBt) =: σSt dξt, (64)

where σ > 0 and λ are known constants. For simplicity, the interest rate is taken to be 0. The process ξ in (64), defined by dξt := λdt + dBt, will subsequently play a role as one component of an observation process in a partial information model, when λ will be treated as a random variable rather than as a known constant. A non-traded asset price Y := (Yt)0≤t≤T follows the correlated log-Brownian motion

dYt = βYt(θdt + dWt) =: βYt dζt, (65)

with β > 0 and θ known constants. The Brownian motion W is correlated with B according to

d[B, W]t = ρdt, W = ρB + √(1 − ρ²)B⊥, ρ ∈ [−1, 1],

and the process ζ, given by dζt := θdt + dWt, will act as the second component of an observation process in a partial information model, when θ will be considered a random variable. We shall


henceforth refer to the Sharpe ratios λ (respectively, θ) as the drift of S (respectively, Y), for brevity. A European contingent claim pays the non-negative random variable h(YT) at time T. In what follows we shall consider utility maximisation problems with the additional random terminal endowment nh(YT), for n ∈ R, and we make the following assumption on the random endowment.

Assumption 1 The random endowment nh(YT) is continuous and bounded below, with finite expectation under any martingale measure.

An agent may trade the stock in a self-financing fashion, leading to the portfolio wealth process X = (Xt)0≤t≤T satisfying

dXt = σπt(λdt + dBt),

where π := (πt)0≤t≤T is the wealth in the stock, representing the agent's trading strategy, satisfying ∫_0^T π²t dt < ∞ almost surely.

4.1.1 Perfect correlation case

This market is incomplete for |ρ| < 1. If the correlation is perfect, however, the market becomes complete and perfect hedging is possible, as shown below (see Monoyios [15] for more details). The minimal martingale measure QM has density process with respect to P given by

(dQM/dP)|Ft = E(−λ · B)t, 0 ≤ t ≤ T.

Under QM, (S, Y) follow

dSt = σSt dB^{QM}_t, dYt = β(θ − ρλ)Yt dt + βYt dW^{QM}_t,

where B^{QM}, W^{QM} are correlated Brownian motions under QM. The stock price S is a local QM-martingale, but this is not the case for the non-traded asset, unless we have the perfect correlation case, ρ = 1. In this case Y is effectively a traded asset (as Yt is then a function of St), so the QM-drift of Y vanishes. Therefore, given σ, β, in the ρ = 1 case the drifts are related by θ = λ. In this case the market becomes complete, and perfect hedging is possible. It is easy to show that with ρ = 1, so that W = B, we have

Yt = Y0 (St/S0)^{β/σ} e^{ct}, c := (1/2)σβ(1 − β/σ).

Let the claim price process be v(t, Yt), 0 ≤ t ≤ T, where v : [0, T] × R+ → R+ is smooth enough to apply the Itô formula, so that

dv(t, Yt) = (vt(t, Yt) + A^Y v(t, Yt)) dt + βYt vy(t, Yt) dWt,

where A^Y is the generator of the process Y in (65). The replication conditions are

Xt = v(t, Yt), 0 ≤ t ≤ T, dXt = dv(t, Yt).

Standard arguments then show that to perfectly hedge the claim one must hold ∆t shares of S at t ∈ [0, T], given by

∆t = (β/σ)(Yt/St)(∂v/∂y)(t, Yt), (66)


and the claim pricing function v(t, y) satisfies

vt(t, y) + β(θ − λ)y vy(t, y) + (1/2)β²y² vyy(t, y) = 0, v(T, y) = h(y).

But with ρ = 1 we have θ = λ, so we get the BS partial differential equation (PDE), and v(t, Yt) = BS(t, Yt), where BS(t, y) denotes the BS option pricing formula at time t, with underlying asset price y. Therefore, a position in n claims is hedged by holding ∆^{(BS)}_t units of S at t ∈ [0, T], where

∆^{(BS)}_t = −n(β/σ)(Yt/St)(∂/∂y)BS(t, Yt; β), (67)

and where BS(t, y; β) denotes the BS formula at time t for underlying asset price y and volatility β. From our perspective, the salient feature of (67) is that the perfect hedge does not require knowledge of the values of the drifts λ, θ.

4.1.2 Incomplete case

Now suppose the correlation is not perfect, so that the market is incomplete. We embed the problem in a utility maximisation framework in a manner that is by now classical. Let the agent have risk preferences expressed via the exponential utility function U(x) = −exp(−αx), x ∈ R, α > 0. The agent maximises expected utility of terminal wealth at time T, with a random endowment of n units of claim payoff:

J(t, x, y; π) = E[U(XT + nh(YT)) | Xt = x, Yt = y].

The value function is u(n)(t, x, y) ≡ u(t, x, y), defined by

u(t, x, y) := sup_{π∈A} J(t, x, y; π), (68)

u(T, x, y) = U(x + nh(y)). (69)

Denote the optimal trading strategy that achieves the supremum in (68) by π* ≡ π*,n, and denote the optimal wealth process by X* ≡ X*,n. The following definitions of utility-based price and hedging strategy are now standard (see [14, 17] for instance).

Definition 1 (Indifference price) The indifference price per claim at t ∈ [0, T], given Xt = x, Yt = y, p(t, x, y) ≡ p(n)(t, x, y), is defined by

u(n)(t, x − np(n)(t, x, y), y) = u(0)(t, x, y).

We allow for possible dependence on t, x, y of p(n) in the above definition, but with exponential preferences it turns out that there is no dependence on x.

Definition 2 (Optimal hedging strategy) The optimal hedging strategy for n units of the claim is πH := (πH_t)0≤t≤T given by

πH_t := π*,n_t − π*,0_t, 0 ≤ t ≤ T.

The solution to the optimisation problem (68) is well-known, using the so-called distortion technique [23]. See [14] for more details.


The HJB equation for the value function u is

ut + A^Y u − (λux + ρβy uxy)²/(2uxx) = 0.

The optimal trading strategy π* is given by π*_t = Π*(t, X*_t, Yt), where the function Π* : [0, T] × R × R+ → R is given by

Π*(t, x, y) := −(λux + ρβy uxy)/(σuxx). (70)

We have the following representation for the value function and indifference price.

Proposition 2 [6, 14, 17] The value function u ≡ u(n) and indifference price p ≡ p(n), given Xt = x, Yt = y for t ∈ [0, T], are given by

u(n)(t, x, y) = −e^{−αx − (1/2)λ²(T−t)} [F(t, y)]^{1/(1−ρ²)},

F(t, y) = E^{QM}[exp(−α(1 − ρ²)nh(YT)) | Yt = y], (71)

p(n)(t, y) = −(1/(α(1 − ρ²)n)) log F(t, y).

The function F(t, y) satisfies a linear PDE by virtue of the stochastic representation (71) and the Feynman-Kac theorem. The indifference pricing function p(t, y) ≡ p(n)(t, y) then satisfies

pt + β(θ − ρλ)y py + (1/2)β²y² pyy − (1/2)β²y² αn(1 − ρ²)(py)² = 0.

Given the above results, it is easy to show that the expression (70) for the optimal control loses its dependence on x and simplifies to

Π*(t, y) := (1/(ασ))(λ + (ρβy/(1 − ρ²)) Fy/F).

Then, applying Definition 2 gives the optimal hedging strategy for a position in n claims (see [14] for further details of this derivation).

Proposition 3 The optimal hedging strategy for a position in n claims is to hold ∆I_t shares at t ∈ [0, T], given by

∆I_t = −n(ρβ/σ)(Yt/St)(∂p(n)/∂y)(t, Yt), (72)

p(n)(t, y) = −(1/(αn(1 − ρ²))) log E^{QM}[exp(−α(1 − ρ²)nh(YT)) | Yt = y].

We note that if n = 1 and ρ = 1, we recover the perfect delta hedge (66), and that the claim price then satisfies the BS PDE. The measure QM is the minimal martingale measure for the model, under which (St)0≤t≤T is a local martingale and under which Y follows

dYt = βYt((θ − ρλ)dt + dW^{QM}_t), (73)

for a QM-Brownian motion W^{QM}. In [14, 15] the hedging strategy in (72) is shown to be superior to the BS-style hedge (67), in terms of the terminal hedging error distribution produced by selling the claim at the appropriate price (the indifference price or the BS price) and investing the proceeds in the corresponding hedging portfolio. But from (73) we see that the exponential hedge requires knowledge of λ, θ, which are impossible to estimate accurately (see Rogers [20] or Monoyios [15]). This can ruin the effectiveness of indifference hedging, as shown in [15]. It is therefore dubious to draw any meaningful conclusions on the effectiveness of utility-based hedging in this model without relaxing the assumption that the agent knows the true values of the drifts.
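The indifference price of Propositions 2–3 is easy to evaluate numerically, since YT is lognormal under QM with drift β(θ − ρλ). The sketch below uses Gauss-Hermite quadrature for the expectation; the call payoff h and all parameter values are illustrative choices, not from the text.

```python
import numpy as np

# Indifference price p = -log(E^QM[exp(-k*h(Y_T))])/k with
# k = alpha*(1-rho^2)*n, per the distortion representation (71).
alpha, n_claims = 1.0, 1.0
beta, theta, lam, rho = 0.25, 0.3, 0.4, 0.75
y, K, tau = 50.0, 50.0, 0.5

z, w = np.polynomial.hermite_e.hermegauss(80)
w = w / np.sqrt(2.0 * np.pi)          # weights so that E[f(Z)] = sum(w*f(z))
m = (beta * (theta - rho * lam) - 0.5 * beta**2) * tau
YT = y * np.exp(m + beta * np.sqrt(tau) * z)
payoff = np.maximum(YT - K, 0.0)      # h: call payoff, illustrative choice

def indifference_price(a):
    k = a * (1.0 - rho**2) * n_claims
    return -np.log(np.sum(w * np.exp(-k * payoff))) / k

qm_price = np.sum(w * payoff)         # E^QM[h(Y_T)], the marginal price
print(indifference_price(alpha), indifference_price(1e-6), qm_price)
```

By Jensen's inequality the buyer's indifference price lies below the QM expectation of the payoff, and it converges to that expectation as risk aversion tends to zero, consistent with the marginal price discussed later.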


4.2 Partial information case

Now we assume the hedger does not know the values of the return parameters λ, θ, so these are considered to be random variables. Equivalently, the agent cannot observe the Brownian motions B, W driving the asset prices, so is required to use strategies adapted to the observation filtration F̂ generated by asset returns.

4.2.1 Choice of prior

We take the two-dimensional random variable U := (λ, θ)^T to have a Gaussian distribution, which will be updated as the agent attempts to filter the values of the drifts from asset observations during the hedging interval [0, T].

The choice of Gaussian prior is motivated by the idea that the agent has some past observations of S, Y before time 0, uses these to obtain classical point estimates of the drifts, and the joint distribution of the estimators is used as the prior in a Bayesian framework. Ultimately, in order to obtain explicit solutions, we shall assume that the agent uses observations before time 0 of equal length for both assets. In setting the prior this way, we make the approximation that the asset price observations are continuous, so that σ, β, ρ are known from the quadratic variation and covariation of S, Y. This is because our goal here is to focus on the severest problem of drift parameter uncertainty.

So consider, for the moment, an observer with data for S over a time interval of length tS, and for Y over a window of length tY, who considers λ and θ as constants, and records the returns dSt/St and dYt/Yt in order to estimate the values of the drifts. The best estimator of λ is λ̄(tS), given by

λ̄(tS) = (1/tS) ∫_{t0}^{t0+tS} dSu/(σSu) = λ + (B_{t0+tS} − B_{t0})/tS ∼ N(λ, 1/tS),

where N(µ, Σ) denotes the normal probability law with mean µ and variance Σ. The estimator of λ is normally distributed, with a similar computation for the estimator of θ. The estimator (λ̄, θ̄) of the (supposed constant) vector (λ, θ) is bivariate normal. Defining v0 := 1/tS and w0 := 1/tY, it is easily checked that

(λ̄, θ̄)^T ∼ N(M, C0),

where the mean vector M and covariance matrix C0 are given by

M = (λ, θ)^T, C0 = [ v0, ρ min(v0, w0) ; ρ min(v0, w0), w0 ]. (74)

With this in mind, we shall suppose that (λ, θ), now considered as a random variable, is bivariate normal according to

λ ∼ N(λ0, v0), θ ∼ N(θ0, w0), cov(λ, θ) = c0 := ρ min(v0, w0),

for some chosen values λ0, θ0, typically obtained from past data prior to time 0. This distribution will be updated via subsequent observations of

ξt := (1/σ) ∫_0^t dSu/Su = λt + Bt, ζt := (1/β) ∫_0^t dYu/Yu = θt + Wt,

over the hedging interval [0, T].
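The prior covariance (74) can be instantiated directly to check that it defines a legitimate (positive semidefinite) covariance matrix. A minimal sketch; the window lengths and correlation are illustrative values.

```python
import numpy as np

# Build C0 from observation-window lengths tS, tY and correlation rho,
# as in (74), and verify positive definiteness via its eigenvalues.
tS, tY, rho = 4.0, 2.0, 0.8
v0, w0 = 1.0 / tS, 1.0 / tY
c0 = rho * min(v0, w0)
C0 = np.array([[v0, c0], [c0, w0]])
eigs = np.linalg.eigvalsh(C0)
print(C0, eigs)       # both eigenvalues nonnegative
```

Since min(v0, w0)² ≤ v0 w0 and |ρ| ≤ 1, the determinant v0w0 − ρ² min(v0, w0)² is nonnegative, so this always holds.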

4.2.2 Two-dimensional Kalman-Bucy filter

We are firmly within the realm of a two-dimensional Kalman filtering problem, which we treat as follows. Define the observation filtration by F̂ := (F̂t)0≤t≤T, F̂t = σ(ξs, ζs; 0 ≤ s ≤ t). The observation process O and unobservable signal process U are defined by

O := ((ξt, ζt)^T)0≤t≤T, U := (λ, θ)^T,

satisfying the stochastic differential equations

dOt = U dt + D dBt, dU = 0,

where

D = [ 1, 0 ; ρ, √(1 − ρ²) ], Bt = (Bt, B⊥t)^T.

The optimal filter is Ût := E[U | F̂t], 0 ≤ t ≤ T, a two-dimensional process defining the best estimates of λ and θ given observations up to time t ∈ [0, T]:

Ût ≡ (λ̂t, θ̂t)^T := (E[λ | F̂t], E[θ | F̂t])^T, (λ̂0, θ̂0)^T = (λ0, θ0)^T. (75)

The solution to this filtering problem converts the partial information model to a full information model with random drifts, given in the following proposition. To avoid a proliferation of symbols, we abuse notation and write λ̂t ≡ λ̂(t, St) and θ̂t ≡ θ̂(t, Yt) for processes λ̂, θ̂ that will turn out to be functions of time and current asset price.

Proposition 4 The partial information model is equivalent to a full information model in which the asset price dynamics in the observation filtration F̂ are

dSt = σSt(λ̂t dt + dB̂t), (76)
dYt = βYt(θ̂t dt + dŴt), (77)

where B̂, Ŵ are F̂-Brownian motions with correlation ρ, and the random drifts λ̂, θ̂ are F̂-adapted processes. If λ and θ have common initial variance v0, then λ̂, θ̂ are given by

(λ̂t, θ̂t)^T = (λ0, θ0)^T + ∫_0^t vu (dB̂u, dŴu)^T, 0 ≤ t ≤ T, (78)

where (vt)0≤t≤T is the deterministic function vt := v0/(1 + v0t). Equivalently, λ̂, θ̂ are given as functions of time and current asset price by

λ̂t = λ̂(t, St) = (λ0 + v0ξt)/(1 + v0t), θ̂t = θ̂(t, Yt) = (θ0 + v0ζt)/(1 + v0t), (79)

with

ξt = (1/σ) log(St/S0) + (1/2)σt, ζt = (1/β) log(Yt/Y0) + (1/2)βt. (80)


Proof By the Kalman-Bucy filter, Theorem 5, Û satisfies the stochastic differential equation

dÛt = Ct(DD^T)^{−1}(dOt − Ût dt) =: Ct(DD^T)^{−1} dNt, (81)

where (Nt)0≤t≤T is the innovations process, defined by

Nt := Ot − ∫_0^t Ûs ds = (ξt − ∫_0^t λ̂s ds, ζt − ∫_0^t θ̂s ds)^T =: (B̂t, Ŵt)^T, (82)

and B̂, Ŵ are F̂-Brownian motions with correlation ρ. The deterministic matrix function Ct is the conditional variance-covariance matrix defined by

Ct := E[(U − Ût)(U − Ût)^T | F̂t] = E[(U − Ût)(U − Ût)^T]

(T denoting transpose), where the last equality follows because the error U − Ût is independent of F̂t. Using (82), and writing dSt in terms of dξt, as in (64), gives the dynamics (76) of S in the observation filtration; (77) is established similarly.

The matrix C = (Ct)0≤t≤T satisfies the Riccati equation

dCt/dt = −Ct(DD^T)^{−1}Ct,

with C0 given in (74). Then Rt := C^{−1}_t satisfies the Lyapunov equation

dRt/dt = (DD^T)^{−1}.

Define the elements of the conditional covariance matrix by

Ct =: [ vt, ct ; ct, wt ].

Then the filtering equation (81) is a pair of coupled stochastic differential equations:

(dλ̂t, dθ̂t)^T = (1/(1 − ρ²)) [ vt − ρct, ct − ρvt ; ct − ρwt, wt − ρct ] (dξt − λ̂t dt, dζt − θ̂t dt)^T = (1/(1 − ρ²)) [ vt − ρct, ct − ρvt ; ct − ρwt, wt − ρct ] (dB̂t, dŴt)^T.

Solving the Lyapunov equation yields three equations for vt, wt, ct:

vt/(vtwt − c²t) − v0/(v0w0 − c²0) = t/(1 − ρ²),
wt/(vtwt − c²t) − w0/(v0w0 − c²0) = t/(1 − ρ²), (83)
ct/(vtwt − c²t) − c0/(v0w0 − c²0) = ρt/(1 − ρ²),

where we have written c0 ≡ ρ min(v0, w0) for brevity.

Now make the simplification w0 = v0. From the discussion in Section 4.2.1, we see that this corresponds to using past observations over the same length of time, tS = tY, for both S and Y in fixing the prior. Then c0 = ρv0, and the solution to the system of equations (83) gives the entries of the matrix Ct as

vt = v0/(1 + v0t), wt = vt, ct = ρvt.

With this simplification, the equation for the optimal filter simplifies to

(dλ̂t, dθ̂t)^T = vt (dξt − λ̂t dt, dζt − θ̂t dt)^T = vt (dB̂t, dŴt)^T, (84)

which, along with the initial condition in (75), yields (78) and (79). Finally, the expressions in (80) for ξt, ζt follow directly from the solutions of (64) and (65) for S and Y.
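The reduction of the filter gain Ct(DD^T)^{−1} to vt times the identity under the simplification w0 = v0 (which gives wt = vt, ct = ρvt, and hence (84)) can be verified numerically. A minimal sketch; ρ and vt are illustrative values.

```python
import numpy as np

# With w0 = v0, C_t = v_t * [[1, rho], [rho, 1]] = v_t * (D D^T),
# so the gain C_t (D D^T)^{-1} equals v_t * I exactly.
rho, v_t = 0.7, 0.4
C_t = np.array([[v_t, rho * v_t], [rho * v_t, v_t]])
DDT = np.array([[1.0, rho], [rho, 1.0]])
gain = C_t @ np.linalg.inv(DDT)
print(gain)           # v_t times the 2x2 identity
```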

Armed with Proposition 4 we may now treat the model as a full information model with random drift parameters (λ̂t, θ̂t), and this is done in the next section. We make the following observations:

• The formulae λ̂(t, St), θ̂(t, Yt) for the random drifts, in terms of current asset price, allow the model to be expressed in Markovian form with only one extra state variable (the stock price S) in the stochastic control problems for the optimal price and hedge compared with the full information case with constant drifts, as we shall see.

• The simplification w0 = v0 is made to allow for a simple analytic solution to the marginal optimal hedging problem. One can proceed without this assumption, but the formulae become more complicated.

4.2.3 Optimal hedging with random drifts

On the stochastic basis (Ω, F̂, F̂, P), the wealth process associated with trading strategy π := (πt)0≤t≤T, an F̂-adapted process satisfying the integrability condition ∫_0^T π²t dt < ∞ a.s., is X = (Xt)0≤t≤T, satisfying

dXt = σπt(λ̂t dt + dB̂t). (85)

The class M of local martingale measures for this model consists of measures Q with density processes defined by

Zt := (dQ/dP)|F̂t = E(−λ̂ · B̂ − ψ · B̂⊥)t, 0 ≤ t ≤ T, (86)

for integrands ψ satisfying ∫_0^t ψ²u du < ∞ a.s., for all t ∈ [0, T] (it is not hard to show that ∫_0^t λ̂²u du < ∞, 0 ≤ t ≤ T). For ψ = 0 we obtain the minimal martingale measure QM.

Under Q ∈ M, (B̂^Q, B̂^{⊥,Q}) is a two-dimensional Brownian motion, where

dB̂^Q_t := dB̂t + λ̂t dt, dB̂^{⊥,Q}_t := dB̂⊥t + ψt dt,

and the asset prices and random drifts satisfy

dSt = σSt dB̂^Q_t,
dYt = βYt[(θ̂t − ρλ̂t − √(1 − ρ²)ψt)dt + dŴ^Q_t],
dλ̂t = vt[−λ̂t dt + dB̂^Q_t],
dθ̂t = vt[−(ρλ̂t + √(1 − ρ²)ψt)dt + dŴ^Q_t],

where Ŵ^Q = ρB̂^Q + √(1 − ρ²)B̂^{⊥,Q}.


The relative entropy between Q ∈ M and P is defined by

H(Q, P) := E[(dQ/dP) log(dQ/dP)] = E^Q[−∫_0^T λ̂t dB̂^Q_t − ∫_0^T ψt dB̂^{⊥,Q}_t + (1/2)∫_0^T (λ̂²t + ψ²t)dt].

Using the Q-dynamics of λ̂t it is straightforward to establish that E^Q ∫_0^t λ̂²u du < ∞ for all t ∈ [0, T]. If, in addition, we have the integrability condition

E^Q ∫_0^t ψ²u du < ∞, 0 ≤ t ≤ T, (87)

then the stochastic integrals above are Q-martingales and

H(Q, P) = E^Q[(1/2)∫_0^T (λ̂²t + ψ²t)dt] < ∞. (88)

In this case we write Q ∈ Mf, where Mf denotes the set of martingale measures Q with finite relative entropy with respect to P, and we define H(Q, P) := ∞ otherwise. From (88) we note that the minimal entropy measure QE is given by

H(QE, P) = E^{QE}[(1/2)∫_0^T λ̂²t dt],

corresponding to ψ ≡ 0 in (88). This means that the minimal martingale measure and the minimal entropy measure in this model coincide: QE = QM.

For an initial time t ∈ [0, T], we define the conditional entropy between Q ∈ M and P by

Ht(Q, P) := E[(ZT/Zt) log(ZT/Zt) | F̂t], 0 ≤ t ≤ T, (89)

satisfying H0(Q, P) ≡ H(Q, P). Provided the integrability condition (87) is satisfied, then

Ht(Q, P) = E^Q[(1/2)∫_t^T (λ̂²u + ψ²u)du | F̂t],

and we define Ht(Q, P) := ∞ otherwise. In particular, therefore, recalling that λ̂t ≡ λ̂(t, St) is a smooth and Lipschitz function of time and current stock price, and that the Q-dynamics of λ̂t do not depend on ψt for any Q ∈ M, the minimal conditional entropy (Ht(QE, P))0≤t≤T will be a deterministic function of time and stock price, given by Ht(QE, P) ≡ HE(t, St) for a C^{1,2}([0, T] × R+) function HE defined by

HE(t, s) := E^{QE}[(1/2)∫_t^T λ̂²(u, Su)du | St = s]. (90)

4.2.4 The primal problem

We use an exponential utility function, U(x) = −exp(−αx), x ∈ R, α > 0. The primal value function u ≡ u(n) is defined as the maximum expected utility of wealth at T from trading S and receiving n units of the claim on Y, when starting at time t ∈ [0, T]:

u(n)(t, x, s, y) := sup_{π∈A} E[U(XT + nh(YT)) | Xt = x, St = s, Yt = y], (91)

where A denotes the set of admissible trading strategies. The dynamics of the state variables X, S, Y are given by (85) and (76), (77). For starting time 0 we write u(n)(x) ≡ u(n)(0, x, ·, ·).
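The entropy function HE in (90) can be evaluated numerically: under QE = QM the filter satisfies dλ̂t = vt(−λ̂t dt + dB̂^Q_t), so the mean m and variance s of λ̂u solve the ODEs m′ = −vm and s′ = −2vs + v², and HE at time 0 is (1/2)∫(m² + s)du. The sketch below integrates these ODEs by Euler; the closed-form comparison values m_u = λ0/(1 + v0u) and s_u = u v_u² are our own check, derived from the ODEs, not from the text, and all parameter values are illustrative.

```python
import numpy as np

# Euler integration of the moment ODEs for lam_hat under Q^E, and of
# H^E(0) = 0.5 * int_0^T (m_u^2 + s_u) du.
lam0, v0, T, n = 0.3, 0.5, 1.0, 200_000
dt = T / n
m, s, HE = lam0, 0.0, 0.0
for k in range(n):
    u = k * dt
    v_u = v0 / (1.0 + v0 * u)
    HE += 0.5 * (m * m + s) * dt
    m += -v_u * m * dt
    s += (-2.0 * v_u * s + v_u**2) * dt

vT = v0 / (1.0 + v0 * T)
# Closed-form checks (our derivation): m_T = lam0/(1+v0*T), s_T = T*vT^2
print(HE, m, lam0 / (1.0 + v0 * T), s, T * vT**2)
```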


The set of admissible strategies is defined as follows. Let ∆ := π/S denote the adapted S-integrable process for the number of shares held. We follow Becherer [2] and take the space of permitted strategies as

A = {∆ : (∆ · S) is a (Q, F̂)-martingale for all Q ∈ Mf},

where (∆ · S)t = ∫_0^t ∆u dSu is the gain from trading over [0, t], t ∈ [0, T]. Other choices for A are possible; see for example Schachermayer [22] or Delbaen et al [5]. However, these all lead to the same solution for the dual problem (see [5, 8, 22]), and hence to the same primal solution for the utility maximisation problem.

Note that we could also write the value function in (91) as a function of wealth, the non-traded asset price, and the current values of the random drifts: u(t, x, y; λ̂t, θ̂t), with the dynamics of the drifts given by (84). We do not pursue this, as the dimension of the resulting HJB equation is initially higher than in our formulation, and could be reduced by changing variables using the formulae for the random drifts in (79).

Denote the optimal trading strategy by π* ≡ π*,n, and the optimal wealth process by X* ≡ X*,n. The utility-based price and hedge for a position in n claims are defined in the now classical manner. The indifference price per claim at t ∈ [0, T], given Xt = x, St = s, Yt = y, is p(n), given by

u(n)(t, x − np(n)(t, x, s, y), s, y) = u(0)(t, x, s).

The optimal hedging strategy is to hold (∆H_t)0≤t≤T shares of stock at time t, where ∆H_t St =: πH_t, and πH := (πH_t)0≤t≤T is defined by

πH_t := π*,n_t − π*,0_t, 0 ≤ t ≤ T. (92)

It is well known that with exponential utility the indifference price is independent of the initial cash wealth x, so we shall write p(n)(t, x, s, y) ≡ p(n)(t, s, y) from now on. For small positions in the claim (or, equivalently, for small risk aversion), we shall later approximate the indifference price by the marginal utility-based price introduced by Davis [4]. This is the indifference price for infinitesimal diversions of funds into the purchase or sale of claims, and is equivalent (as is well known) to the limit of the indifference price as n → 0.

Definition 3 (Marginal price) The marginal utility-based price of the claim at t ∈ [0, T] is p̂(t, s, y), defined by

p̂(t, s, y) := lim_{n→0} p(n)(t, s, y).

It is well known that with exponential utility the marginal price is also equivalent to the limit of the indifference price as risk aversion goes to zero. Under appropriate conditions (satisfied in this model) it is given by the expectation of the payoff under the optimal measure of the dual problem without the claim. For exponential utility this measure is the minimal entropy measure QE and, as we have already seen, in our model QE = QM, giving the representation

p̂(t, s, y) = E^{QM}[h(YT) | St = s, Yt = y],

as we shall see in the next section.

4.2.5 Dual problem and optimal hedge

We attack the primal utility maximisation problem (91) using well-known duality results that are now a classical tool for incomplete market optimisation problems (see the seminal papers by Kramkov and Schachermayer [12] and Karatzas et al [10]). For a problem with the random terminal endowment of a European claim, and with exponential utility, as in this paper, Delbaen et al [5] establish the required duality relations between the primal and dual problems in a semimartingale setting. We shall use these results below to establish a simple algebraic relation (Lemma 2) between the primal value function and the indifference price, which we shall then exploit to derive the representation for the optimal hedging strategy.

The dual problem with starting time 0 has value function defined by

v(n)(η) := inf_{Q∈M} E[V(ηZT) + ηZT nh(YT)],


4 OPTIMAL HEDGING OF BASIS RISK WITH PARTIAL INFORMATION 28

where Z is the density process in (86) and V is the convex conjugate of the utility function:
\[
V(\eta) := \sup_{x \in \mathbb{R}} [U(x) - x\eta].
\]
For exponential utility V is given by
\[
V(\eta) = \frac{\eta}{\alpha}\left( \log \frac{\eta}{\alpha} - 1 \right).
\]
Hence the dual value function has the well-known entropic representation
\[
v^{(n)}(\eta) = V(\eta) + \frac{\eta}{\alpha} \inf_{Q \in \mathcal{M}} \left[ H(Q,P) + \alpha n E^{Q} h(Y_T) \right].
\]
Denoting the dual minimiser that attains the above infimum by Q^{*,n}, we observe that Q^{*,n} ∈ M_f. For a starting time t ∈ [0, T] the dual value function is defined by
\[
v^{(n)}(t,\eta,s,y) := \inf_{Q \in \mathcal{M}} E\left[ V\!\left(\eta \frac{Z_T}{Z_t}\right) + \eta \frac{Z_T}{Z_t}\, n h(Y_T) \;\Big|\; S_t = s,\, Y_t = y \right],
\]

(93)

and we write v^(n)(η) ≡ v^(n)(0, η, ·, ·).

Lemma 2 The primal value function and indifference price are related by
\[
u^{(n)}(t,x,s,y) = u^{(0)}(t,x,s) \exp\left( -\alpha n\, p^{(n)}(t,s,y) \right), \quad (94)
\]
where the value function without the claim is given by
\[
u^{(0)}(t,x,s) = -\exp\left( -\alpha x - H^{E}(t,s) \right),
\]

(95)

and H^E(t, s) is the conditional minimal entropy function defined in (90).

Proof For brevity, we give the proof for t = 0. The proof for a general starting time follows similar lines; we comment at the end on how to adapt the argument. The fundamental duality linking the primal and dual problems in Delbaen et al [5] implies that the value functions u^(n)(x) and v^(n)(η) are conjugate:
\[
v^{(n)}(\eta) = \sup_{x \in \mathbb{R}} [u^{(n)}(x) - x\eta], \qquad u^{(n)}(x) = \inf_{\eta > 0} [v^{(n)}(\eta) + x\eta].
\]
The value of η attaining the above infimum is η*, given by v^{(n)}_η(η*) = −x, so that u^(n)(x) = v^(n)(η*) + xη*, which translates to
\[
u^{(n)}(x) = -\exp\left( -\alpha x - \inf_{Q \in \mathcal{M}} \left[ H(Q,P) + \alpha n E^{Q} h(Y_T) \right] \right).
\]

(96)

So, in particular,
\[
u^{(0)}(x) = -\exp\left( -\alpha x - H(Q^{E}, P) \right), \quad (97)
\]
where Q^E is the minimal entropy measure: Q^E = Q^{*,0}. Combining the dual representations (96) and (97) for the primal problems with and without the claim, with the definition of the indifference price, gives the dual representation for the utility-based price in the form
\[
p^{(n)} = \frac{1}{\alpha n} \left( \inf_{Q \in \mathcal{M}} \left[ H(Q,P) + \alpha n E^{Q} h(Y_T) \right] - H(Q^{E}, P) \right),
\]
(98)


which is the representation found in Delbaen et al [5], modified slightly since we have a random endowment of n claims ([5] considered the case n = −1). In particular, for n → 0 or α → 0, we obtain the marginal price of Davis [4]:
\[
\hat{p} := \lim_{n \to 0} p^{(n)} = E^{Q^{E}} h(Y_T) = E^{Q^{M}} h(Y_T),
\]

(99)

the last equality following from the equality of Q^M and Q^E, as implied by (88). From (96)–(98), the relation between the primal value functions and the indifference price then follows immediately, as
\[
u^{(n)}(x) = -\exp\left( -\alpha x - H(Q^{E}, P) - \alpha n p^{(n)} \right) = u^{(0)}(x) \exp\left( -\alpha n p^{(n)} \right).
\]
Similarly, a corresponding relation for a starting time t ∈ [0, T] may be derived, using the definition (93) of the dual value function for an initial time t ∈ [0, T], the conjugacy of u^(n)(t, x, s, y) and v^(n)(t, η, s, y), and the definitions (89) and (90) of the conditional entropy and conditional minimal entropy. □

Using Lemma 2 we obtain the following representation for the optimal hedging strategy associated with the indifference price. In what follows we assume that the indifference price is a suitably smooth function of (t, s, y), so that (given Lemma 2) we may assume the primal value function is smooth enough to be a classical solution of the associated Hamilton-Jacobi-Bellman (HJB) equation. This smoothness property is confirmed in [16].

Theorem 7 The optimal hedge for a position in n claims is to hold Δ^H_t units of S at t ∈ [0, T], where
\[
\Delta^{H}_{t} = -n \left( p^{(n)}_{s}(t, S_t, Y_t) + \frac{\rho \beta}{\sigma} \frac{Y_t}{S_t}\, p^{(n)}_{y}(t, S_t, Y_t) \right).
\]
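As a numerical illustration of the hedge ratio in Theorem 7, the sketch below computes Δ^H by central finite differences from a given price function p(t, s, y). The price function used here is a toy smooth placeholder, not the model's actual indifference price, which would come from solving the semi-linear PDE of the next subsection.

```python
def hedge_ratio(p, t, s, y, n, rho, beta, sigma, h=1e-3):
    """Delta^H = -n * (p_s + (rho*beta/sigma) * (y/s) * p_y),
    with the partials p_s, p_y estimated by central differences."""
    p_s = (p(t, s + h, y) - p(t, s - h, y)) / (2 * h)
    p_y = (p(t, s, y + h) - p(t, s, y - h)) / (2 * h)
    return -n * (p_s + (rho * beta / sigma) * (y / s) * p_y)

# Toy smooth price function (placeholder, for illustration only).
p = lambda t, s, y: 0.1 * s + 0.5 * y

# For this linear p, p_s = 0.1 and p_y = 0.5 exactly, so the central
# differences recover the partials to floating-point accuracy.
delta = hedge_ratio(p, t=0.0, s=100.0, y=80.0, n=1.0,
                    rho=0.8, beta=0.3, sigma=0.25)
```

For a genuinely nonlinear price function the step h would have to be tuned, but the structure of the formula (a delta in S plus a correlation-weighted delta in Y) is unchanged.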

Remark 2 We note the extra term in the hedging formula compared with the corresponding full information result (72). The drift parameter uncertainty results in additional risk, manifested as dependence of the indifference price on the stock price, and hence the derivative with respect to the stock price appears in the theorem.

Proof The HJB equation associated with the primal value function is
\[
u^{(n)}_{t} + \max_{\pi} \mathcal{A}^{X,S,Y} u^{(n)} = 0,
\]
where A^{X,S,Y} is the generator of (X, S, Y) under P. Performing the maximisation over π yields the optimal Markov control as π^{*,n}_t = π^{*,n}(t, X^{*,n}_t, S_t, Y_t), where
\[
\pi^{*,n}(t,x,s,y) = -\left( \frac{\hat{\lambda} u^{(n)}_{x} + \sigma s u^{(n)}_{xs} + \rho \beta y u^{(n)}_{xy}}{\sigma u^{(n)}_{xx}} \right),
\]
and where the arguments of the functions on the right-hand side are omitted for brevity. For the case n = 0 there is no dependence on y in the value function u^(0), and we have π^{*,0}_t = π^{*,0}(t, X^{*,0}_t, S_t), where
\[
\pi^{*,0}(t,x,s) = -\left( \frac{\hat{\lambda} u^{(0)}_{x} + \sigma s u^{(0)}_{xs}}{\sigma u^{(0)}_{xx}} \right).
\]
Applying the definition (92) of the optimal hedging strategy along with the representations (94) and (95) from Lemma 2 for the value functions gives the result. □


4.2.6 Stochastic control representation of the indifference price

The dual representation (98) of p^(n) gives the price of the claim at time 0 as the value function of a control problem:
\[
p^{(n)} = \inf_{\psi} E^{Q} \left[ \frac{1}{2\alpha n} \int_0^T \psi_t^2\, dt + h(Y_T) \right],
\]
to be minimised over control processes (ψ_t)_{0≤t≤T} such that Q ∈ M_f, and with dynamics for S, Y given by
\[
dS_t = \sigma S_t\, d\hat{B}^{Q}_t, \qquad
dY_t = \beta Y_t \left[ \left( \hat{\theta}(t, Y_t) - \rho \hat{\lambda}(t, S_t) - \sqrt{1-\rho^2}\, \psi_t \right) dt + d\hat{W}^{Q}_t \right].
\]

For a starting time t ∈ [0, T] we have
\[
p^{(n)}(t,s,y) = \inf_{\psi} E^{Q} \left[ \frac{1}{2\alpha n} \int_t^T \psi_u^2\, du + h(Y_T) \;\Big|\; S_t = s,\, Y_t = y \right].
\]
The HJB dynamic programming PDE associated with p^(n)(t, s, y) is
\[
p^{(n)}_{t} + \mathcal{A}^{Q^M}_{S,Y} p^{(n)} + \inf_{\psi} \left[ \frac{1}{2\alpha n} \psi^2 - \beta \sqrt{1-\rho^2}\, \psi\, y\, p^{(n)}_{y} \right] = 0,
\]

with terminal condition p^(n)(T, s, y) = h(y), where A^{QM}_{S,Y} is the generator of (S, Y) under the minimal measure:
\[
\mathcal{A}^{Q^M}_{S,Y} f(t,s,y) = \beta\left( \hat{\theta}(t,y) - \rho \hat{\lambda}(t,s) \right) y f_y + \tfrac{1}{2}\sigma^2 s^2 f_{ss} + \tfrac{1}{2}\beta^2 y^2 f_{yy} + \rho \sigma \beta s y f_{sy}.
\]
The optimal Markov control is ψ^{*,n}_t ≡ ψ^{*,n}(t, S_t, Y_t), where
\[
\psi^{*,n}(t,s,y) = \alpha n \sqrt{1-\rho^2}\, \beta y\, p^{(n)}_{y}(t,s,y),
\]
and note that ψ^{*,0} = 0. Substituting back into the HJB equation, we find that p^(n) is expected to solve the semi-linear PDE
\[
p^{(n)}_{t} + \mathcal{A}^{Q^M}_{S,Y} p^{(n)} - \tfrac{1}{2}\alpha n (1-\rho^2) \beta^2 y^2 \left( p^{(n)}_{y} \right)^2 = 0, \qquad p^{(n)}(T,s,y) = h(y). \quad (100)
\]
We note that for n = 0 this becomes a linear PDE for the marginal price p̂, so that by the Feynman-Kac Theorem we have
\[
\hat{p}(t,s,y) = E^{Q^M}_{t,s,y} h(Y_T),
\]

(101)

consistent with the general result (99). We shall see that in this case the marginal price is given by a BS-type formula.

4.2.7 Analytic approximation for the indifference price

To obtain analytic results we approximate the indifference price by the marginal price in (101). The marginal price (and hence the associated trading strategy) can be computed in analytic form since, under Q^M, log Y_T is Gaussian. We have the following result.

Proposition 5 Under Q^M, conditional on S_t = s, Y_t = y, log Y_T ∼ N(m, Σ²), where m ≡ m(t, s, y) and Σ² ≡ Σ²(t) are given by
\[
m(t,s,y) = \log y + \beta \left( \hat{\theta}(t,y) - \rho \hat{\lambda}(t,s) - \tfrac{1}{2}\beta \right) (T - t),
\]
\[
\Sigma^2(t) = \left[ 1 + (1-\rho^2) v_t (T - t) \right] \beta^2 (T - t).
\]

Proof This is established by computing the SDEs for Y and for θ̂_t − ρλ̂_t under Q^M. Indeed, applying the Itô formula to log Y_t under Q^M, we obtain, for t < T,
\[
\log Y_T = \log Y_t + \beta \int_t^T \left( \hat{\theta}_u - \rho \hat{\lambda}_u \right) du - \tfrac{1}{2}\beta^2 (T-t) + \beta \int_t^T d\hat{W}^{Q^M}_u, \quad (102)
\]
where Ŵ^{QM} is a Brownian motion under Q^M. The dynamics of θ̂_t − ρλ̂_t under Q^M are
\[
d\left( \hat{\theta}_t - \rho \hat{\lambda}_t \right) = \sqrt{1-\rho^2}\, v_t\, d\hat{B}^{\perp,Q^M}_t,
\]
where B̂^{⊥,QM} is a Q^M-Brownian motion perpendicular to that driving the stock, related to Ŵ^{QM} by
\[
\hat{W}^{Q^M} = \rho \hat{B}^{Q^M} + \sqrt{1-\rho^2}\, \hat{B}^{\perp,Q^M},
\]
and where B̂^{QM} is the Brownian motion driving S. Hence, for u > t, after changing the order of integration in a double integral, we obtain
\[
\int_t^T \left( \hat{\theta}_u - \rho \hat{\lambda}_u \right) du = \left( \hat{\theta}_t - \rho \hat{\lambda}_t \right)(T-t) + \sqrt{1-\rho^2} \int_t^T v_u (T-u)\, d\hat{B}^{\perp,Q^M}_u.
\]
This can be inserted into (102) to yield the desired result. □

We are thus able to obtain BS-style formulae for the price and hedge. For a put option of strike K we easily obtain the following explicit formulae for the marginal price and the associated optimal hedging strategy, where Φ denotes the standard cumulative normal distribution function.

Corollary 1 With m and Σ as in Proposition 5, define b ≡ b(t, s, y) by m = log y + b − ½Σ². Then the marginal price at time t ∈ [0, T] of a put option with payoff (K − Y_T)^+ is p̂(t, S_t, Y_t), where
\[
\hat{p}(t,s,y) = K \Phi(-d_1 + \Sigma) - y e^{b} \Phi(-d_1), \qquad
d_1 = \frac{1}{\Sigma} \left[ \log \frac{y}{K} + b + \tfrac{1}{2}\Sigma^2 \right].
\]
The optimal hedging strategy given by Theorem 7, with p̂ as an approximation to the indifference price, is Δ̂_t ≡ Δ̂(t, S_t, Y_t), where
\[
\hat{\Delta}(t,s,y) = \frac{n \rho \beta}{\sigma} \frac{y}{s}\, e^{b} \Phi(-d_1). \quad (103)
\]
In [16] these results are used to conduct a simulation study of the effectiveness of the optimal hedge under partial information (that is, with Bayesian learning about the drift parameters of the assets), compared with the BS-style hedge and the optimal hedge without learning. The results show that optimal hedging combined with a filtering algorithm to deal with drift parameter uncertainty can indeed give improved hedging performance over methods which take S as a perfect proxy for Y, and over methods which do not incorporate learning via filtering.
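The BS-style formulae of Corollary 1 are straightforward to implement. In the sketch below the drift adjustment b and the variance Σ² are taken as plain inputs; in the model they would come from the filtered estimates of Proposition 5, which this sketch does not compute.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def marginal_put_price(y, K, b, Sigma):
    """Corollary 1: marginal price of a put (K - Y_T)^+ when, under Q^M,
    log Y_T ~ N(log y + b - Sigma^2/2, Sigma^2)."""
    d1 = (math.log(y / K) + b + 0.5 * Sigma**2) / Sigma
    return K * norm_cdf(-d1 + Sigma) - y * math.exp(b) * norm_cdf(-d1)

def marginal_hedge(y, s, K, b, Sigma, n, rho, beta, sigma):
    """Equation (103): units of S to hold against a position in n claims."""
    d1 = (math.log(y / K) + b + 0.5 * Sigma**2) / Sigma
    return (n * rho * beta / sigma) * (y / s) * math.exp(b) * norm_cdf(-d1)

# Arbitrary illustrative parameters.
price = marginal_put_price(y=90.0, K=100.0, b=0.02, Sigma=0.25)
hedge = marginal_hedge(y=90.0, s=100.0, K=100.0, b=0.02, Sigma=0.25,
                       n=1.0, rho=0.85, beta=0.3, sigma=0.25)
```

A useful internal check is put-call parity under the lognormal law: since E^{Q^M}[Y_T] = y e^b, the put price minus the corresponding call price must equal K − y e^b exactly.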

5 Investment with inside information and drift uncertainty

In the last topic of these lectures we consider the Merton optimal investment problem of Section 3, but with the added feature that the agent has some additional information at time 0, in the form of an F_T-measurable random variable I, where F = (F_t)_{0≤t≤T} is the background filtration. This work is based on [3]. We first expand the filtration F by σ(I), denoting the expanded filtration by F^I. We then write the stock price dynamics under this expanded filtration, which allows us to apply a filtering


algorithm that estimates the unknown drift parameter based on stock price observations as well as the anticipative knowledge possessed by the insider, represented by the random variable I. The filtering algorithm is an extension of the Kalman-Bucy filter presented earlier. It results in filtering equations whose dynamics are the same as the usual Kalman-Bucy equations. The main difference is the background filtration under which the filter is applied, and this leads to several modifications to the initial conditions of the linear system. The goal is to restrict the insider's admissible trading strategies to be adapted to F̂^I, the filtration generated by stock returns, but expanded by the anticipative information. In expanding a filtration, a crucial step is to identify the information drift, ν, a drift added to an F-Brownian motion that creates an F^I-Brownian motion. We give a lemma that allows us to compute ν explicitly.

The rest of this section is organised as follows. We first set up a linear system under an expanded filtration and present a (linear) filtering theorem that allows us to estimate an unobservable signal process based on an observation process plus a piece of anticipative information known at the beginning of the time horizon. The development of the theorem is along the lines of standard linear filters. We then derive a lemma using a theorem in Mansuy and Yor [13] (the result stems originally from Theorem 2.1 in Jacod [7]) that gives an explicit expression for ν in terms of the regular conditional density of I. We give an example where I is (noisy) knowledge of the terminal value B_T of a Brownian motion, and compute ν explicitly. We then consider the investment problem of maximising expected utility of terminal wealth, where I is the terminal value of an F-Brownian motion distorted by noise and the agent's trading strategy is required to be F̂^I-adapted. For an extension of these results to the case where I represents noisy knowledge of the terminal stock price S_T, see [3].

5.1 Linear filtering on an expanded filtration

We present a filtering algorithm applied to a linear system where estimation of the signal process is based on current stock price observations as well as a piece of anticipative information. The development of this filter is very much along the lines of classical linear filters, with the exception that the background filtration contains a signal process and an observation process (as in the classical case) but in addition incorporates knowledge of an F_T-measurable random variable I.

A stock price process S = (S_t)_{0≤t≤T} is assumed to follow
\[
dS_t = \sigma S_t (\lambda\, dt + dB_t) =: \sigma S_t\, d\xi_t, \quad 0 \leq t \leq T, \quad (104)
\]
where B is a Brownian motion on a complete probability space (Ω, F, P) equipped with a filtration F := (F_t)_{0≤t≤T}. Here, B, S are F-adapted, with λ an F_0-measurable random variable. We may take, for instance, F_t := σ(λ, B_s, ξ_s; 0 ≤ s ≤ t), or F_t := σ(λ, ξ_s; 0 ≤ s ≤ t), since, when we observe λ and ξ, we also observe B. It turns out to be more convenient to consider the return process ξ as our observation instead of the stock price process S itself; indeed, from (104) it is clear that observation of one is equivalent to observation of the other. Here B is an F-Brownian motion.

In a full information setting, λ is a known constant. The aim of the partial information model of Section 3 was to remove this assumption and infer the conditional distribution of λ given a smaller observation filtration F̂ := (F̂_t)_{0≤t≤T} with F̂_t := σ(ξ_s; 0 ≤ s ≤ t), so that λ is estimated using stock price observations. Here, we first expand the background filtration F with an F_T-measurable random variable I. By considering the partial information problem on an expanded filtration, we are incorporating extra information to assist in the estimation of the unknown market price of risk λ. To do this, we reformulate the linear system in terms of quantities adapted to the expanded filtration.


Since F incorporates a Brownian filtration, we have various initial expansion results at our disposal (see [1], [7], [13], [19]). Denote the expanded filtration by F^I := (F^I_t)_{0≤t≤T} with F^I_t := σ(I, λ, B_s, ξ_s; 0 ≤ s ≤ t). For our purposes it is sufficient to note that, assuming I is such that F-semimartingales remain F^I-semimartingales, there exists an F^I-adapted process ν such that
\[
B_t := B^I_t + \int_0^t \nu_s\, ds, \quad 0 \leq t \leq T, \quad (105)
\]
where B^I is an F^I-Brownian motion. The process ν is called the information drift. The assumption that F-semimartingales remain F^I-semimartingales relies on Jacod's criterion [7] (see also Protter [19], Theorem VI.10), and will be satisfied in examples where I is noisy information on B_T or S_T. For cases where noise is absent, the expanded filtration F^I is defined only up to time T− (Jacod's criterion is violated at time T). Using (105), the stock price SDE (104) can be written in terms of F^I-adapted processes as
\[
dS_t = \sigma S_t \left( \lambda^I_t\, dt + dB^I_t \right), \quad 0 \leq t \leq T, \quad (106)
\]
where λ^I_t := λ + ν_t. We interpret (106) as the insider's stock price process (in the case where λ is a known constant, which is precisely the condition we shall relax later), with a non-constant market price of risk, on the probability space (Ω, F^I_T, P) equipped with the filtration

F^I. With the stock price process appropriately written under the expanded filtration F^I, we now remove the assumption that λ is a known constant. We model λ as a Gaussian random variable with prior distribution N(λ_0, v_0) (the normal probability law of mean λ_0 and variance v_0), independent of B. The process λ^I is then an unobservable signal process satisfying a linear SDE under F^I. We then acknowledge that estimation of the signal λ^I is based on F̂^I := (F̂^I_t)_{0≤t≤T} defined by
\[
\hat{F}^I_t := \sigma(I, \xi_s;\ 0 \leq s \leq t),
\]
with the observation process ξ satisfying
\[
d\xi_t = \lambda^I_t\, dt + dB^I_t,
\]

0 ≤ t ≤ T. Our filtering algorithm will convert the partial information model (106) to a full information model of the form
\[
dS_t = \sigma S_t \left( \hat{\lambda}^I_t\, dt + d\hat{B}^I_t \right), \quad 0 \leq t \leq T, \quad (107)
\]
where B̂^I is an F̂^I-Brownian motion and λ̂^I_t is the best estimate of λ^I_t given F̂^I_t:
\[
\hat{\lambda}^I_t := E[\lambda^I_t \mid \hat{F}^I_t].
\]
It gives us the SDE satisfied by λ̂^I_t and, as it turns out, a Riccati equation satisfied by the error function
\[
R(t) := E\left[ \left( \lambda^I_t - \hat{\lambda}^I_t \right)^2 \Big| \hat{F}^I_t \right].
\]

We solve for both of these quantities explicitly. Once we have the model (107), we can treat it like a standard full information model and compute the maximum utility via duality.

Our filtering algorithm deviates from standard situations in which one seeks the best estimate of the signal given observations ξ. Here, we have these observations plus the additional knowledge of the random variable I. The following theorem gives the desired filtering equations.

Theorem 8 On a filtered probability space (Ω, F, F = (F_t)_{0≤t≤T}, P), let ζ, ξ be F-adapted processes and let I be an F_T-measurable random variable. Denote by F^I = (F^I_t)_{0≤t≤T} the filtration F expanded by σ(I). Let ζ be a signal process satisfying
\[
d\zeta_t = A(t) \zeta_t\, dt + C(t)\, dW^I_t, \quad 0 \leq t \leq T,
\]
and let ξ be an observation process satisfying
\[
d\xi_t = G(t) \zeta_t\, dt + dB^I_t, \quad 0 \leq t \leq T,
\]
where W^I, B^I are F^I-Brownian motions with correlation ρ, and the coefficients A(·), C(·), G(·) are deterministic functions satisfying
\[
\int_0^T \left[ |A(t)| + C^2(t) + G^2(t) \right] dt < \infty.
\]
Define F̂^I := (F̂^I_t)_{0≤t≤T} by F̂^I_t := σ(I, ξ_s; 0 ≤ s ≤ t). Suppose the distribution of ζ_0 given F̂^I_0 is Gaussian with mean m and variance Σ, independent of W^I and B^I. Then the conditional expectation ζ̂_t := E[ζ_t | F̂^I_t] for 0 ≤ t ≤ T satisfies
\[
d\hat{\zeta}_t = A(t) \hat{\zeta}_t\, dt + \left[ G(t) R(t) + \rho C(t) \right] dN_t, \qquad \hat{\zeta}_0 = m, \quad (108)
\]
where N is the innovations process, an F̂^I-Brownian motion satisfying
\[
dN_t = d\xi_t - G(t) \hat{\zeta}_t\, dt,
\]
and R(t) is the error function, which is independent of F̂^I_t and satisfies the Riccati equation
\[
\frac{dR(t)}{dt} = (1-\rho^2) C^2(t) + 2\left[ A(t) - \rho C(t) G(t) \right] R(t) - G^2(t) R^2(t),
\]

R(0) = Σ. (109) The outline of the proof is as follows. We first show that the innovations process N is an ˆ FI- Brownian motion. Since our system is linear, the filtration ˆ FI is equivalent to the filtration FN,I := (FN,I

t

)0≤t≤T with FN,I

t

:= σ(I, Ns; 0 ≤ s ≤ t). This is the same as standard linear filters. This allows us to express ˆ FI-local martingales as stochastic integrals in terms of the innovations N. The remainder of the proof follows that of classical filtering theory. For brevity, we present the proof that N is an ˆ FI-Brownian motion. The rest of the proof of Theorem 8 follows along the same lines as our earlier proofs. Lemma 3 The innovations process N is an ˆ FI-Brownian motion. Proof of Lemma 3 For s ≤ t E[Nt| ˆ FI

s ] − Ns

= E[Nt − Ns| ˆ FI

s ]

= E t

s G(u)(ζu − ˆ

ζu)ds

  • ˆ

FI

s

  • + E
  • E[BI

t − BI s|FI s ]

  • ˆ

FI

s

  • = 0.

Since Nt = Wt = t, N is an ˆ FI-Brownian motion.
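Because the coefficients A, C, G are deterministic, the error variance R in Theorem 8 is deterministic and can be computed offline before any observations arrive, by integrating the Riccati equation (109). A minimal numerical sketch (function and parameter names are illustrative; the test case uses the static-signal specialisation A = C = 0, G = 1, where the exact solution is R(t) = Σ/(1 + Σt)):

```python
def riccati_error_variance(A, C, G, rho, Sigma0, T, n_steps=10000):
    """Integrate the Riccati equation (109),
    dR/dt = (1 - rho^2) C(t)^2 + 2 [A(t) - rho C(t) G(t)] R - G(t)^2 R^2,
    with R(0) = Sigma0, by the classical fourth-order Runge-Kutta scheme.
    A, C, G are callables of t; returns R on the uniform time grid."""
    dt = T / n_steps
    def f(t, R):
        return ((1 - rho**2) * C(t)**2
                + 2 * (A(t) - rho * C(t) * G(t)) * R
                - G(t)**2 * R**2)
    R, t, out = Sigma0, 0.0, [Sigma0]
    for _ in range(n_steps):
        k1 = f(t, R)
        k2 = f(t + dt / 2, R + dt / 2 * k1)
        k3 = f(t + dt / 2, R + dt / 2 * k2)
        k4 = f(t + dt, R + dt * k3)
        R += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        out.append(R)
    return out

# Static signal, uncorrelated noises: dR/dt = -R^2, so R(t) = Sigma0/(1 + Sigma0*t).
path = riccati_error_variance(lambda t: 0.0, lambda t: 0.0, lambda t: 1.0,
                              rho=0.0, Sigma0=0.5, T=1.0)
```

The same integrator applies unchanged to the correlated case of Section 5.3, where ρ = 1 and C(t) = −1/(T_a − t).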

5.2 Computing the information drift

We now present expansion-of-filtration results that allow us to compute the information drift in (105) explicitly. We begin with the background filtration F = (F_t)_{0≤t≤T}. To expand F with an F_T-measurable random variable I, we first consider the process (π_t(f))_{0≤t≤T} defined for any bounded Borel function f : R → R as the continuous version of the martingale (E[f(I)|F_t])_{0≤t≤T}:
\[
\pi_t(f) := E[f(I) \mid F_t], \quad 0 \leq t \leq T.
\]


There then exists a predictable family of measures (µ_t(dx))_{0≤t≤T} such that
\[
\pi_t(f) = \int_{\mathbb{R}} f(x)\, \mu_t(dx).
\]
For fixed t ∈ [0, T], the measure µ_t(dx) is the conditional distribution of I given F_t. We assume the existence of a density function g(t, x, y) for each t ∈ [0, T] such that
\[
\pi_t(f) = \int_{\mathbb{R}} f(x)\, \mu_t(dx) = \int_{\mathbb{R}} f(x)\, g(t, x, B_t)\, dx. \quad (110)
\]
To state the enlargement decomposition formula, introduce the F-predictable process (π̇_t(f))_{0≤t≤T} such that
\[
\pi_t(f) = E f(I) + \int_0^t \dot{\pi}_s(f)\, dB_s,
\]
which exists by the representation property of Brownian martingales as stochastic integrals with respect to B. We assume the existence of a predictable family of measures (µ̇_t(dx))_{0≤t≤T} such that
\[
\dot{\pi}_t(f) = \int_{\mathbb{R}} f(x)\, \dot{\mu}_t(dx).
\]
We suppose that for each t ∈ [0, T] the measure µ̇_t(dx) is absolutely continuous with respect to µ_t(dx), and define α(t, x) by
\[
\dot{\mu}_t(dx) = \alpha(t, x)\, \mu_t(dx).
\]
Now suppose we have a continuous F-martingale M given by
\[
M_t = \int_0^t m_s\, dB_s, \quad 0 \leq t \leq T.
\]
Then Theorem 1.6 in Mansuy and Yor [13] states that there exists an F^I-local martingale M^I such that
\[
M_t = M^I_t + \int_0^t \alpha(s, I)\, d\langle M, B \rangle_s,
\]
provided that, almost surely, ∫_0^t |α(s, I)| |d⟨M, B⟩_s| < ∞. In particular, if ∫_0^t |α(s, I)| ds < ∞ a.s., then B decomposes as
\[
B_t = B^I_t + \int_0^t \alpha(s, I)\, ds,
\]
with B^I an F^I-Brownian motion.

Lemma 4 The information drift is ν_t = α(t, I), where α(t, x) is given in terms of the conditional density g(t, x, B_t) by
\[
\alpha(t, x) = \frac{g_y(t, x, B_t)}{g(t, x, B_t)}, \quad 0 \leq t \leq T.
\]
Proof From the definition of α(t, x) we have
\[
\dot{\pi}_t(f) = \int_{\mathbb{R}} f(x)\, \alpha(t, x)\, \mu_t(dx) = \int_{\mathbb{R}} f(x)\, \alpha(t, x)\, g(t, x, B_t)\, dx.
\]


Hence,
\[
d\pi_t(f) = \dot{\pi}_t(f)\, dB_t = \left[ \int_{\mathbb{R}} f(x)\, \alpha(t, x)\, g(t, x, B_t)\, dx \right] dB_t,
\]
so that
\[
d\langle \pi(f), M \rangle_t = \left[ \int_{\mathbb{R}} f(x)\, \alpha(t, x)\, g(t, x, B_t)\, dx \right] d\langle B, M \rangle_t. \quad (111)
\]
But from the defining representation (110), the right-hand side of which is a smooth function of B_t, the Itô formula gives
\[
d\langle \pi(f), M \rangle_t = \left[ \int_{\mathbb{R}} f(x)\, g_y(t, x, B_t)\, dx \right] d\langle B, M \rangle_t, \quad (112)
\]
and comparing (111) with (112) yields the result. □

Hence, given I, the information drift can be computed explicitly from the conditional density. We give an example that we shall use shortly.

Example 1 (Noisy information on B_T) The random variable I takes the form I = aB_T + (1 − a)ǫ, where 0 < a ≤ 1 and ǫ ∼ N(0, 1) is independent of B. The conditional distribution of I given F_t is
\[
N\left( aB_t,\ a^2(T - t) + (1-a)^2 \right) = N\left( aB_t,\ a^2(T_a - t) \right), \quad \text{where } T_a := T + [(1-a)/a]^2, \quad (113)
\]
and so the conditional density is
\[
g(t, x, B_t) = \frac{1}{a\sqrt{2\pi(T_a - t)}} \exp\left( -\frac{1}{2} \frac{(x - aB_t)^2}{a^2(T_a - t)} \right). \quad (114)
\]
So by Lemma 4,
\[
\alpha(t, x) = \frac{x - aB_t}{a(T_a - t)},
\]
and the information drift is
\[
\nu_t = \alpha(t, I) = \frac{I - aB_t}{a(T_a - t)} = \frac{a(B_T - B_t) + (1-a)\epsilon}{a(T_a - t)}. \quad (115)
\]
For a = 1, we get ν_t = (B_T − B_t)/(T − t), which is a well-known result.
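Lemma 4 can be checked directly on Example 1: with g the Gaussian density (114), a central finite difference of log g in its Brownian argument should reproduce α(t, x) = (x − aB_t)/(a(T_a − t)). A quick numerical sketch (parameter values arbitrary; function names are illustrative):

```python
import math

def g(t, x, bt, a, T):
    """Conditional density (114) of I = a*B_T + (1-a)*eps given B_t = bt."""
    Ta = T + ((1 - a) / a) ** 2
    var = a * a * (Ta - t)
    return math.exp(-0.5 * (x - a * bt) ** 2 / var) / (a * math.sqrt(2 * math.pi * (Ta - t)))

def alpha(t, x, bt, a, T):
    """Closed-form expression from Example 1: (x - a*B_t)/(a*(Ta - t))."""
    Ta = T + ((1 - a) / a) ** 2
    return (x - a * bt) / (a * (Ta - t))

def alpha_numeric(t, x, bt, a, T, h=1e-6):
    """g_y / g = d/dy log g, estimated by a central difference in the
    Brownian argument (exact here, since log g is quadratic in y)."""
    return (math.log(g(t, x, bt + h, a, T)) - math.log(g(t, x, bt - h, a, T))) / (2 * h)

t, x, bt, a, T = 0.3, 0.7, -0.2, 0.6, 1.0
closed = alpha(t, x, bt, a, T)
numeric = alpha_numeric(t, x, bt, a, T)
```

Because log g is quadratic in the Brownian argument, the central difference agrees with the closed form up to floating-point error, for any choice of the point (t, x, b_t).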

5.3 Optimal investment for an insider with drift parameter uncertainty

We now attack the main problem of this section, the optimal investment decisions of an insider who possesses some noisy information on the final value of the stock price or its driving Brownian motion. We do not assume the insider observes the underlying Brownian motion driving the stock price. The insider will therefore be restricted to using F̂^I-adapted strategies, where F̂^I is the observation filtration associated with stock returns, but expanded with the information carried by the r.v. I.


5.3.1 Anticipative Brownian information

We consider an example in which the insider has knowledge at time 0 of the random variable I = aB_T + (1 − a)ǫ considered in Example 1, where 0 < a ≤ 1 and ǫ ∼ N(0, 1) is independent of B. This problem is artificial in the sense that we assume the insider has knowledge of aB_T + (1 − a)ǫ, but we shall then restrict him to using F̂^B-adapted strategies, where F̂^B denotes the stock return filtration expanded by aB_T + (1 − a)ǫ. In other words, we allow the insider to have advance Brownian information, but then assume that he does not observe the Brownian filtration before time T.

We write the stock price SDE (104) in terms of F^B-adapted processes, with the background filtration F^B = (F^B_t)_{0≤t≤T} defined by
\[
F^B_t := \sigma(\lambda,\ aB_T + (1-a)\epsilon,\ \xi_s;\ 0 \leq s \leq t).
\]
Using the information drift ν_t of Example 1, we have
\[
dS_t = \sigma S_t \left( \lambda^B_t\, dt + dB^B_t \right), \quad 0 \leq t \leq T,
\]
where B^B is an F^B-Brownian motion and λ^B is an F^B-adapted process given by
\[
\lambda^B_t := \lambda + \nu_t = \lambda + \frac{I - aB_t}{a(T_a - t)} =: h(t, B_t), \quad \text{with } h(t, x) = \lambda + \frac{I - ax}{a(T_a - t)}.
\]
Applying Itô's formula to λ^B, we obtain
\[
d\lambda^B_t = -\frac{1}{T_a - t}\, dB^B_t.
\]

(116)

With ξ being the returns process in (38), we also have
\[
d\xi_t = \lambda^B_t\, dt + dB^B_t.
\]

(117)

We now regard λ as an unknown constant, with prior distribution λ ∼ N(λ_0, v_0). Then (λ^B_t)_{0≤t≤T} is an unobservable signal process following (116), and ξ is an observation process following (117). In such a filtering framework, define F̂^B := (F̂^B_t)_{0≤t≤T} by
\[
\hat{F}^B_t := \sigma(aB_T + (1-a)\epsilon,\ \xi_s;\ 0 \leq s \leq t).
\]
Then the best estimate of λ^B_t given F̂^B_t will be obtained from observations of the process ξ following (117) and the r.v. aB_T + (1 − a)ǫ, whose value is known at time 0. At time 0, since λ is assumed to be independent of B, we have
\[
\mathrm{Law}(\lambda^B_0 \mid \hat{F}^B_0) = N\left( \lambda_0 + \frac{I}{aT_a},\ v_0 \right).
\]
This defines the prior distribution of the signal process λ^B. Note that since I is F̂^B_t-measurable, it does not contribute to the initial variance. Theorem 8 yields that the optimal filter λ̂^B_t := E[λ^B_t | F̂^B_t], 0 ≤ t ≤ T, satisfies the SDE
\[
d\hat{\lambda}^B_t = \left[ R(t) - \frac{1}{T_a - t} \right] d\hat{B}^B_t, \qquad \hat{\lambda}^B_0 = \lambda_0 + \frac{I}{aT_a}, \quad (118)
\]


where B̂^B_t is the innovations process, an F̂^B-Brownian motion related to the observation process by
\[
d\hat{B}^B_t := d\xi_t - \hat{\lambda}^B_t\, dt, \quad (119)
\]
and R(t) is the conditional variance of λ^B_t:
\[
R(t) := E\left[ \left( \lambda^B_t - \hat{\lambda}^B_t \right)^2 \Big| \hat{F}^B_t \right],
\]
which satisfies
\[
\frac{dR(t)}{dt} = \frac{2}{T_a - t} R(t) - R^2(t), \qquad R(0) = v_0. \quad (120)
\]
The Riccati-type ODE (120) is solved by converting it to a linear ODE for 1/R(t), yielding
\[
R(t) = \frac{v_0}{1 + (v_0 - (1/T_a))t} \cdot \frac{T_a}{T_a - t}.
\]
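As a check on this closed form, the sketch below integrates (120) by RK4 and compares against R(t) = v0/(1 + (v0 − 1/Ta)t) · Ta/(Ta − t) on a grid. Parameter values are arbitrary, with Ta > T so the singularity at Ta is never reached.

```python
def solve_riccati_120(v0, Ta, T, n_steps=20000):
    """RK4 integration of dR/dt = 2R/(Ta - t) - R^2, R(0) = v0, on [0, T]."""
    dt = T / n_steps
    f = lambda t, R: 2.0 * R / (Ta - t) - R * R
    R, t, path = v0, 0.0, [v0]
    for _ in range(n_steps):
        k1 = f(t, R)
        k2 = f(t + dt / 2, R + dt / 2 * k1)
        k3 = f(t + dt / 2, R + dt / 2 * k2)
        k4 = f(t + dt, R + dt * k3)
        R += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
        path.append(R)
    return path

def R_closed(t, v0, Ta):
    """Closed-form solution quoted in the text."""
    return v0 / (1.0 + (v0 - 1.0 / Ta) * t) * Ta / (Ta - t)

v0, Ta, T = 0.5, 1.25, 1.0   # Ta = T + ((1-a)/a)^2 for some a < 1
numeric = solve_riccati_120(v0, Ta, T)
exact_T = R_closed(T, v0, Ta)
```

One can also verify the decomposition R(t) = v^B_t + 1/(Ta − t) used below by direct algebra: both summands are exactly recovered from the closed form.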

More pertinently, (118) becomes
\[
d\hat{\lambda}^B_t = v^B_t\, d\hat{B}^B_t, \qquad \hat{\lambda}^B_0 = \lambda_0 + \frac{I}{aT_a}, \quad (121)
\]
where we have defined
\[
v^B_t := R(t) - \frac{1}{T_a - t} = \frac{v_0 - (1/T_a)}{1 + (v_0 - (1/T_a))t}, \quad 0 \leq t \leq T. \quad (122)
\]
Note that (121) is of exactly the same form as (40), with v_t replaced by v^B_t and with B̂_t replaced by B̂^B_t. Indeed, v^B_t plays the role of an 'effective variance', and satisfies the same Riccati equation as v_t, but with a modified initial condition:
\[
\frac{dv^B_t}{dt} = -\left( v^B_t \right)^2, \qquad v^B_0 = v_0 - (1/T_a).
\]

Using (119) in the SDE for λ̂^B, the optimal filter is given explicitly in terms of the observable ξ_t by
\[
\hat{\lambda}^B_t
= \frac{\lambda_0 + (I/(aT_a)) + (v_0 - (1/T_a))\xi_t}{1 + (v_0 - (1/T_a))t} \quad (123)
= \frac{\hat{\lambda}^B_0 + (v_0 - (1/T_a))\xi_t}{1 + (v_0 - (1/T_a))t} \quad (124)
= \frac{\hat{\lambda}^B_0 + v^B_0 \xi_t}{1 + v^B_0 t}, \quad 0 \leq t \leq T. \quad (125)
\]
This is of the same form as (44), with λ_0 replaced by λ_0 + I/(aT_a) and v_0 replaced by v^B_0 = v_0 − (1/T_a).

To solve the insider's utility maximisation problem, we proceed in the same way as in Section 3. Instead of (45), the insider is now investing in a stock with dynamics

\[
dS_t = \sigma S_t \left( \hat{\lambda}^B_t\, dt + d\hat{B}^B_t \right), \quad 0 \leq t < T.
\]
The insider's (F̂^B-adapted) wealth process is X^B, following
\[
dX^B_t = \sigma \theta^B_t X^B_t \left( \hat{\lambda}^B_t\, dt + d\hat{B}^B_t \right), \quad 0 \leq t < T,
\]
where θ^B is now an F̂^B-adapted strategy satisfying $\int_0^T (\theta^B_t)^2\, dt < \infty$ almost surely, and such that X^B_t ≥ 0 almost surely for all t ∈ [0, T). Denote by A^B the set of such admissible strategies.
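The filter formula (125) can be sanity-checked pathwise: for any continuous observation path ξ, the closed form λ̂^B_t = (λ̂^B_0 + v^B_0 ξ_t)/(1 + v^B_0 t) solves dλ̂^B_t = v^B_t (dξ_t − λ̂^B_t dt) with v^B_t = v^B_0/(1 + v^B_0 t). The sketch below Euler-integrates this SDE along an arbitrary smooth test path (ξ_t = sin t, chosen only for the check; all parameter values are arbitrary) and compares with the closed form.

```python
import math

def filter_closed_form(t, xi_t, lam_hat0, vB0):
    """Equation (125): the filter in terms of the observable xi_t."""
    return (lam_hat0 + vB0 * xi_t) / (1.0 + vB0 * t)

def filter_euler(xi, lam_hat0, vB0, T, n_steps=100000):
    """Euler scheme for d(lam) = vB_t * (d xi_t - lam dt), lam(0) = lam_hat0,
    with effective variance vB_t = vB0 / (1 + vB0 * t)."""
    dt = T / n_steps
    lam, t = lam_hat0, 0.0
    for _ in range(n_steps):
        vB = vB0 / (1.0 + vB0 * t)
        lam += vB * ((xi(t + dt) - xi(t)) - lam * dt)
        t += dt
    return lam

xi = math.sin                      # smooth stand-in observation path
lam_hat0, vB0, T = 0.1, 0.4, 1.0   # arbitrary test parameters
euler = filter_euler(xi, lam_hat0, vB0, T)
closed = filter_closed_form(T, xi(T), lam_hat0, vB0)
```

The agreement holds path by path because (125) is an exact solution of the filtering SDE for any continuous ξ, not merely in distribution.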


We introduce the change of measure martingale
\[
Z^B_t := \mathcal{E}\left( -\hat{\lambda}^B \cdot \hat{B}^B \right)_t.
\]
By the same methods as in Section 3 we can establish a similar formula to (50) for Z^B_t. Ultimately, this simply amounts to making the replacements
\[
v_t \to v^B_t, \qquad v_0 \to v^B_0 = v_0 - 1/T_a, \qquad \hat{\lambda}_t \to \hat{\lambda}^B_t, \qquad \lambda_0 \to \hat{\lambda}^B_0 = \lambda_0 + I/(aT_a).
\]

(126)

Hence Z^B_t is given by
\[
Z^B_t = \left( \frac{v^B_0}{v^B_t} \right)^{1/2} \exp\left[ -\frac{1}{2} \left( \frac{(\hat{\lambda}^B_t)^2}{v^B_t} - \frac{(\hat{\lambda}^B_0)^2}{v^B_0} \right) \right].
\]

The insider’s value function is uB(x) := sup

θ∈AB

E[U(XB

T )|XB 0 = x],

(127) and we note that this is implicitly conditioned on ˆ FB

0 . The maximal expected utility is then

given by a similar formula to (55): uB(x) = xγ γ H1−γ

B

, where HB =:= E

  • ZB

T

q , and where HB is given by a similar formula to (59), with the replacements (126). This results in HB = (1 + vB

0 T)q

1 + qvB

0 T

1/2 exp

  • −1

2 q(1 − q)(ˆ λB

0 )2T

1 + qvB

0 T

  • ,

and the explicit expression uB(x) = xγ γ

  • 1 − γ − γvB

0 T

(1 − γ)(1 + vB

0 T)

γ/2 1 − γ 1 − γ − γvB

0 T

1/2 exp 1 2 γλ2

0T

1 − γ − γvB

0 T

  • .

(128)

Of course, this depends on the value of aB_T + (1 − a)ǫ, since the expectation in (127) is implicitly conditioned on F̂^B_0, and the strategies are F̂^B-adapted. The optimal wealth process X^{B,*} is given by a similar formula to (61):
\[
X^{B,*}_t = x \left( \frac{\Psi^B_t}{\Psi^B_0} \right)^{1/2} \exp\left( \frac{1}{2}(1-q)\left( \Lambda^B_t - \Lambda^B_0 \right) \right),
\]
where, for t ∈ [0, T],
\[
\Psi^B_t := \frac{v^B_t}{1 + q v^B_t (T-t)}, \qquad
\Lambda^B_t := \frac{(\hat{\lambda}^B_t)^2}{v^B_t \left( 1 + q v^B_t (T-t) \right)}.
\]
The optimal strategy θ^{B,*} is given by a similar formula to (63):
\[
\theta^{B,*}_t = \frac{\hat{\lambda}^B_t}{\sigma(1-\gamma)} \left( \frac{1}{1 + q v^B_t (T-t)} \right).
\]

It is clear that the solution of the utility maximisation problem for the insider who has knowledge of aB_T + (1 − a)ǫ, and who must optimally filter the resulting stock price drift, is obtained from the solution of the partial information problem of Section 3 by making the replacements in (126).
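As a numerical consistency check on the closed forms above, the sketch below verifies that the explicit expression (128) agrees with u_B(x) = (x^γ/γ) H_B^{1−γ}. Here q = γ/(γ−1) is taken to be the conjugate exponent of the power utility duality in Section 3; that identification, and the parameter values (chosen so that 1 − γ − γ v^B_0 T > 0), are assumptions of this sketch.

```python
import math

def u_via_HB(x, gamma, lam_hat0, vB0, T):
    """u_B(x) = (x^gamma / gamma) * H_B^(1-gamma), with
    H_B = ((1 + vB0*T)^q / (1 + q*vB0*T))^(1/2)
          * exp(-0.5 * q*(1-q) * lam_hat0^2 * T / (1 + q*vB0*T)),
    and q = gamma / (gamma - 1) (assumed conjugate exponent)."""
    q = gamma / (gamma - 1.0)
    HB = math.sqrt((1 + vB0 * T) ** q / (1 + q * vB0 * T)) \
         * math.exp(-0.5 * q * (1 - q) * lam_hat0**2 * T / (1 + q * vB0 * T))
    return x**gamma / gamma * HB ** (1 - gamma)

def u_explicit(x, gamma, lam_hat0, vB0, T):
    """The explicit expression (128)."""
    A = 1 - gamma - gamma * vB0 * T
    return (x**gamma / gamma
            * (A / ((1 - gamma) * (1 + vB0 * T))) ** (gamma / 2)
            * ((1 - gamma) / A) ** 0.5
            * math.exp(0.5 * gamma * lam_hat0**2 * T / A))

x, gamma, lam_hat0, vB0, T = 1.5, 0.3, 0.25, 0.2, 1.0
a_val = u_via_HB(x, gamma, lam_hat0, vB0, T)
b_val = u_explicit(x, gamma, lam_hat0, vB0, T)
```

The agreement rests on the algebraic identity 1 + q v^B_0 T = (1 − γ − γ v^B_0 T)/(1 − γ) when q = γ/(γ−1), which converts each factor of the H_B form into the corresponding factor of (128).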



References

[1] Amendinger J, Imkeller P and Schweizer M 1998 Additional logarithmic utility of an insider Stochastic Processes and their Applications 75 263–286
[2] Becherer D 2004 Utility-indifference hedging and valuation via reaction-diffusion systems Proc. R. Soc. Lond. A 460 27–51
[3] Danilova A, Monoyios M and Ng A 2008 Optimal investment with inside information and parameter uncertainty, preprint
[4] Davis M H A 1997 Option pricing in incomplete markets In: Mathematics of Derivative Securities (eds: Dempster M A H and Pliska S R), pp 216–226 (Cambridge University Press: Cambridge, UK)
[5] Delbaen F, Grandits P, Rheinländer T, Samperi D, Schweizer M and Stricker C 2002 Exponential hedging and entropic penalties Mathematical Finance 12 99–123
[6] Henderson V 2002 Valuation of claims on nontraded assets using utility maximization Mathematical Finance 12 351–373
[7] Jacod J 1985 Grossissement initial, hypothèse (H') et théorème de Girsanov In: Grossissements de filtrations: exemples et applications, Lecture Notes in Math. (eds: Jeulin T, Yor M), 1118 15–35 (Springer: Berlin)
[8] Kabanov Y and Stricker C 2002 On the optimal portfolio for the exponential utility maximization: remarks to the six-author paper Mathematical Finance 12 125–134
[9] Karatzas I 1996 Lectures on the Mathematics of Finance, CRM Monographs, 8, American Mathematical Society
[10] Karatzas I, Lehoczky J P, Shreve S E and Xu G-L 1991 Martingale and duality methods for utility maximization in an incomplete market SIAM Journal on Control and Optimization 29 702–730
[11] Karatzas I and Shreve S E 1991 Brownian Motion and Stochastic Calculus, Second Edition, Springer-Verlag, New York
[12] Kramkov D and Schachermayer W 1999 The asymptotic elasticity of utility functions and optimal investment in incomplete markets Annals of Applied Probability 9 904–950
[13] Mansuy R and Yor M 2006 Random Times and Enlargements of Filtrations in a Brownian Setting, Lecture Notes in Mathematics 1873, Springer
[14] Monoyios M 2004 Performance of utility-based strategies for hedging basis risk Quantitative Finance 4 245–255
[15] Monoyios M 2007 Optimal hedging and parameter uncertainty IMA Journal of Management Mathematics 18 331–351
[16] Monoyios M 2008 Marginal utility-based hedging of claims on non-traded assets with partial information, preprint
[17] Musiela M and Zariphopoulou T 2004 An example of indifference prices under exponential preferences Finance & Stochastics 8 229–239
[18] Pikovsky I and Karatzas I 1996 Anticipative portfolio optimization Adv. in App. Prob. 28 1095–1122
[19] Protter P 2004 Stochastic Integration and Differential Equations, Second Edition, Springer, Berlin
[20] Rogers L C G 2001 The relaxed investor and parameter uncertainty Finance & Stochastics 5 131–154
[21] Rogers L C G and Williams D Diffusions, Markov Processes, and Martingales, Vol. 2: Itô Calculus, Wiley, New York
[22] Schachermayer W 2001 Optimal investment in incomplete markets when wealth may become negative Annals of Applied Probability 11 694–734
[23] Zariphopoulou T 2001 A solution approach to valuation with unhedgeable risks Finance & Stochastics 5 61–82