SLIDE 1 Modeling Portfolios that Contain Risky Assets III: Stochastic Models and Optimization
University of Maryland, College Park. ICERM Lecture, 17 November 2011. Extracted from Math 420: Mathematical Modeling
© 2011 Charles David Levermore
SLIDE 2 Lecture III: Stochastic Models and Optimization Outline
- 1. Stochastic Models of One Risky Asset
- 2. Stochastic Models of Portfolios with Risky Assets
- 3. Model-Based Objective Functions
- 4. Model-Based Portfolio Management
- 5. Conclusion
SLIDE 3
- 1. Stochastic Models of One Risky Asset
Investors have long followed the old adage “don’t put all your eggs in one basket” by holding diversified portfolios. However, before MPT the value of diversification had not been quantified. Key aspects of MPT are:
- 1. it uses the return rate mean as a proxy for return;
- 2. it uses volatility as a proxy for risk;
- 3. it analyzes Markowitz portfolios;
- 4. it shows diversification reduces volatility through covariances;
- 5. it identifies the efficient frontier as the place to be.
The original form of MPT did not give guidance to investors about where to be on the efficient frontier. We will now begin to build stochastic models that can be used in conjunction with the original MPT to address this question. By doing so, we will see that maximizing the return rate mean is not the best strategy for maximizing your return.
SLIDE 4 IID Models for an Asset. We begin by building models of one risky asset with a share price history $\{s(d)\}_{d=0}^{D}$. Let $\{r(d)\}_{d=1}^{D}$ be the associated return rate history. Because each $s(d)$ is positive, each $r(d)$ lies in the interval $(-D, \infty)$. An independent, identically-distributed (IID) model for this history simply independently draws $D$ random numbers $\{R(d)\}_{d=1}^{D}$ from $(-D, \infty)$ in accord with a fixed probability density $q(R)$ over $(-D, \infty)$. Such a model is reasonable if a plot of the points $\{(d, r(d))\}_{d=1}^{D}$ in the $dr$-plane appears to be distributed in a way that is uniform in $d$.
- Exercise. Plot $\{(d, r(d))\}_{d=1}^{D}$ for each of the following assets and explain which might be good candidates to be mimicked by an IID model. (a) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2009; (b) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2007; (c) S&P 500 and Russell 1000 and 2000 index funds in 2009; (d) S&P 500 and Russell 1000 and 2000 index funds in 2007.
SLIDE 5
- Remark. We have adopted IID models because they are simple. It is not hard to develop more complicated stochastic models. For example, we could use a different probability density for each day of the week rather than treating all trading days the same way. Because there are usually five trading days per week, Monday through Friday, such a model would require calibrating five times as many means and covariances with one fifth as much data. There would then be greater uncertainty associated with the calibration. Moreover, we would then have to figure out how to treat weeks that have fewer than five trading days due to holidays. Perhaps just the first and last trading days of each week should get their own probability density, no matter on which day of the week they fall. Before increasing the complexity of a model, you should investigate whether the costs of doing so outweigh the benefits. Specifically, you should investigate whether or not there is benefit in treating any one trading day of the week differently than the others before building a more complicated model.
SLIDE 6
- Remark. IID models are also the simplest models that are consistent with the way any portfolio theory is used. Specifically, to use any portfolio theory you must first calibrate a model from historical data. This model is then used to predict how a set of ideal portfolios might behave in the future. Based on these predictions one selects the ideal portfolio that optimizes some objective. This strategy makes the implicit assumption that in the future the market will behave statistically as it did in the past. This assumption requires the market statistics to be stable relative to its dynamics. But this requires future states to decorrelate from past states. Markov models are characterized by the assumption that possible future states depend upon the present state but not upon past states, thereby maximizing this decorrelation. IID models are the simplest Markov models. All the models discussed in the previous remark are also Markov models. We will use only IID models.
SLIDE 7 Return Rate Probability Densities. Once you have decided to use an IID model for a particular asset, you might think the next goal is to pick an appropriate probability density $q(R)$. However, that is neither practical nor necessary. Rather, the goal is to identify appropriate statistical information about $q(R)$ that sheds light on the market. Ideally this information should be insensitive to details of $q(R)$ within a large class of probability densities. Recall that a probability density $q(R)$ over $(-D, \infty)$ is a nonnegative integrable function such that
$$\int_{-D}^{\infty} q(R)\, dR = 1 .$$
Because we have been collecting mean and covariance return rate data, we will assume that the probability densities also satisfy
$$\int_{-D}^{\infty} R^2 q(R)\, dR < \infty .$$
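SLIDE 8 The mean $\mu$ and variance $\xi$ of $R$ are then
$$\mu = \mathrm{Ex}(R) = \int_{-D}^{\infty} R\, q(R)\, dR , \qquad \xi = \mathrm{Var}(R) = \mathrm{Ex}\big((R - \mu)^2\big) = \int_{-D}^{\infty} (R - \mu)^2 q(R)\, dR .$$
Given $D$ samples $\{R(d)\}_{d=1}^{D}$ that are drawn from the density $q(R)$, we can construct unbiased estimators of $\mu$ and $\xi$ by
$$\hat\mu = \frac{1}{D} \sum_{d=1}^{D} R(d) , \qquad \hat\xi = \frac{1}{D-1} \sum_{d=1}^{D} \big(R(d) - \hat\mu\big)^2 .$$
Being unbiased estimators means $\mathrm{Ex}(\hat\mu) = \mu$ and $\mathrm{Ex}(\hat\xi) = \xi$. Moreover,
$$\mathrm{Var}(\hat\mu) = \mathrm{Ex}\big((\hat\mu - \mu)^2\big) = \frac{\xi}{D} .$$
This implies that $\hat\mu$ converges to $\mu$ at the rate $D^{-1/2}$ as $D \to \infty$.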
SLIDE 9 Growth Rate Probability Densities. Given $D$ samples $\{R(d)\}_{d=1}^{D}$ that are drawn from the return rate probability density $q(R)$, the associated simulated share prices satisfy
$$S(d) = \Big(1 + \tfrac{1}{D} R(d)\Big)\, S(d-1) \quad \text{for } d = 1, \dots, D .$$
If we set $S(0) = s(0)$ then you can easily see that
$$S(d) = \prod_{d'=1}^{d} \Big(1 + \tfrac{1}{D} R(d')\Big)\, s(0) .$$
The growth rate $X(d)$ is related to the return rate $R(d)$ by
$$e^{\frac{1}{D} X(d)} = 1 + \tfrac{1}{D} R(d) .$$
In other words, $X(d)$ is the growth rate that yields a return rate $R(d)$ on trading day $d$. The formula for $S(d)$ then takes the form
$$S(d) = \exp\Big(\frac{1}{D} \sum_{d'=1}^{d} X(d')\Big)\, s(0) .$$
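The two formulas for $S(d)$ above can be checked against each other numerically. The following is a minimal sketch, not part of the original lecture: the value of D, the initial price, and the return rates are made-up numbers, and the function names are illustrative.

```python
import math

D = 252                          # assumed number of trading days (illustrative)
s0 = 100.0                       # made-up initial share price s(0)
returns = [0.5, -1.2, 2.0, 0.3]  # made-up return rates R(d), each greater than -D


def simulate_prices(s0, returns, D):
    """Apply S(d) = (1 + R(d)/D) S(d-1), starting from S(0) = s0."""
    prices = [s0]
    for R in returns:
        prices.append((1.0 + R / D) * prices[-1])
    return prices


def growth_rates(returns, D):
    """Solve exp(X(d)/D) = 1 + R(d)/D for X(d)."""
    return [D * math.log(1.0 + R / D) for R in returns]


prices = simulate_prices(s0, returns, D)
X = growth_rates(returns, D)

# Rebuild the same price path from the running sum of growth rates.
prices_from_X = [s0 * math.exp(sum(X[:d]) / D) for d in range(len(X) + 1)]
```

Both lists agree up to rounding error, and each X(d) is smaller than the corresponding nonzero R(d) because log(1 + u) < u.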
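SLIDE 10 When $\{R(d)\}_{d=1}^{D}$ is an IID process drawn from the density $q(R)$ over $(-D, \infty)$, it follows that $\{X(d)\}_{d=1}^{D}$ is an IID process drawn from the density $p(X)$ over $(-\infty, \infty)$ where $p(X)\, dX = q(R)\, dR$ with $X$ and $R$ related by
$$X = D \log\Big(1 + \tfrac{1}{D} R\Big) , \qquad R = D \Big(e^{\frac{1}{D} X} - 1\Big) .$$
More explicitly, the densities $p(X)$ and $q(R)$ are related by
$$p(X) = q\Big(D \big(e^{\frac{1}{D} X} - 1\big)\Big)\, e^{\frac{1}{D} X} , \qquad q(R) = \frac{p\Big(D \log\big(1 + \tfrac{1}{D} R\big)\Big)}{1 + \tfrac{1}{D} R} .$$
Because our models will involve means and variances, we will require that
$$\int_{-\infty}^{\infty} X^2 p(X)\, dX = \int_{-D}^{\infty} D^2 \log\Big(1 + \tfrac{1}{D} R\Big)^2 q(R)\, dR < \infty ,$$
$$\int_{-\infty}^{\infty} D^2 \Big(e^{\frac{1}{D} X} - 1\Big)^2 p(X)\, dX = \int_{-D}^{\infty} R^2 q(R)\, dR < \infty .$$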
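SLIDE 11 The big advantage of working with $p(X)$ rather than $q(R)$ is the fact that
$$\log\Big(\frac{S(d)}{s(0)}\Big) = \frac{1}{D} \sum_{d'=1}^{d} X(d') .$$
In other words, $\log(S(d)/s(0))$ is a sum of an IID process. It is easy to compute the mean and variance of this quantity in terms of those of $X$. The mean $\gamma$ and variance $\theta$ of $X$ are
$$\gamma = \mathrm{Ex}(X) = \int_{-\infty}^{\infty} X\, p(X)\, dX , \qquad \theta = \mathrm{Var}(X) = \mathrm{Ex}\big((X - \gamma)^2\big) = \int_{-\infty}^{\infty} (X - \gamma)^2 p(X)\, dX .$$
For the mean of $\log(S(d)/s(0))$ we find that
$$\mathrm{Ex}\Big(\log\Big(\frac{S(d)}{s(0)}\Big)\Big) = \frac{1}{D} \sum_{d'=1}^{d} \mathrm{Ex}\big(X(d')\big) = \frac{d}{D}\, \gamma ,$$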
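SLIDE 12 For the variance of $\log(S(d)/s(0))$ we find that
$$\mathrm{Var}\Big(\log\Big(\frac{S(d)}{s(0)}\Big)\Big) = \mathrm{Ex}\Big(\Big(\frac{1}{D} \sum_{d'=1}^{d} X(d') - \frac{d}{D}\, \gamma\Big)^2\Big) = \frac{1}{D^2}\, \mathrm{Ex}\Big(\Big(\sum_{d'=1}^{d} \big(X(d') - \gamma\big)\Big)^2\Big)$$
$$= \frac{1}{D^2}\, \mathrm{Ex}\Big(\sum_{d'=1}^{d} \sum_{d''=1}^{d} \big(X(d') - \gamma\big)\big(X(d'') - \gamma\big)\Big) = \frac{1}{D^2} \sum_{d'=1}^{d} \mathrm{Ex}\Big(\big(X(d') - \gamma\big)^2\Big) = \frac{d}{D^2}\, \theta .$$
Here the contributions from cross terms in the double sum vanish because
$$\mathrm{Ex}\Big(\big(X(d') - \gamma\big)\big(X(d'') - \gamma\big)\Big) = 0 \quad \text{when } d'' \neq d' ,$$
which holds because $X(d')$ and $X(d'')$ are independent.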
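SLIDE 13 In summary, we obtained
$$\mathrm{Ex}\Big(\log\Big(\frac{S(d)}{s(0)}\Big)\Big) = \frac{d}{D}\, \gamma , \qquad \mathrm{Var}\Big(\log\Big(\frac{S(d)}{s(0)}\Big)\Big) = \frac{d}{D^2}\, \theta .$$
We see that $\gamma\, t$ is the expected growth of the IID model asset while $\frac{1}{D} \theta\, t$ is its variance at time $t = d/D$ years.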
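The mean and variance formulas for $\log(S(d)/s(0))$ can be checked with a small Monte Carlo experiment. The sketch below is only an illustration: the parameter values are hypothetical, and a normal density is used for $p(X)$ purely for convenience, since the formulas depend only on the first two moments.

```python
import math
import random

random.seed(0)   # fixed seed so the run is repeatable

D = 252          # assumed trading days per year
d = 63           # holding period in trading days, so t = d/D = 0.25 years
gamma = 0.08     # hypothetical growth rate mean
theta = 10.0     # hypothetical growth rate variance (theta/D is about 0.04)
paths = 4000     # number of simulated price paths

# Each sample is log(S(d)/s(0)) = (1/D) * (sum of d IID growth rates).
logs = [
    sum(random.gauss(gamma, math.sqrt(theta)) for _ in range(d)) / D
    for _ in range(paths)
]

sample_mean = sum(logs) / paths
sample_var = sum((v - sample_mean) ** 2 for v in logs) / (paths - 1)

predicted_mean = d * gamma / D       # (d/D) gamma
predicted_var = d * theta / D ** 2   # (d/D^2) theta
```

With these settings the sample mean and variance should land close to the predicted values, with the gap shrinking as `paths` grows.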
- Remark. The IID model suggests that the growth rate mean $\gamma$ is a good proxy for the return of an asset and that $\sqrt{\frac{1}{D} \theta}$ is a good proxy for its risk. However, these are not the proxies chosen by MPT when it is applied to a portfolio consisting of one risky asset. Those proxies can be approximated by $\hat\mu$ and $\sqrt{\frac{1}{D} \hat\xi}$ where $\hat\mu$ and $\hat\xi$ are the unbiased estimators of $\mu$ and $\xi$ given by
$$\hat\mu = \frac{1}{D} \sum_{d=1}^{D} R(d) , \qquad \hat\xi = \frac{1}{D-1} \sum_{d=1}^{D} \big(R(d) - \hat\mu\big)^2 .$$
SLIDE 14 Normal Growth Rate Model. We can illustrate what is going on with the simple IID model where $p(X)$ is the normal or Gaussian density with mean $\gamma$ and variance $\theta$, which is given by
$$p(X) = \frac{1}{\sqrt{2\pi\theta}} \exp\Big(-\frac{(X - \gamma)^2}{2\theta}\Big) .$$
Let $\{X(d)\}_{d=1}^{\infty}$ be a sequence of IID random variables drawn from $p(X)$. Let $\{Y(d)\}_{d=1}^{\infty}$ be the sequence of random variables defined by
$$Y(d) = \frac{1}{d} \sum_{d'=1}^{d} X(d') \quad \text{for every } d = 1, \dots, \infty .$$
You can easily check that
$$\mathrm{Ex}(Y(d)) = \gamma , \qquad \mathrm{Var}(Y(d)) = \frac{\theta}{d} .$$
You can also check that $\mathrm{Ex}(Y(d) \,|\, Y(d-1)) = \frac{d-1}{d}\, Y(d-1) + \frac{1}{d}\, \gamma$, so the variables $Y(d)$ are neither independent nor identically distributed.
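SLIDE 15 It can be shown (the details are not given here) that $Y(d)$ is drawn from the normal density with mean $\gamma$ and variance $\theta/d$, which is given by
$$p_d(Y) = \sqrt{\frac{d}{2\pi\theta}}\, \exp\Big(-\frac{(Y - \gamma)^2 d}{2\theta}\Big) .$$
Because $S(d)/s(0) = e^{\frac{d}{D} Y(d)}$, the mean return at day $d$ is
$$\mathrm{Ex}\Big(e^{\frac{d}{D} Y(d)}\Big) = \sqrt{\frac{d}{2\pi\theta}} \int_{-\infty}^{\infty} \exp\Big(-\frac{(Y - \gamma)^2 d}{2\theta} + \frac{d}{D}\, Y\Big)\, dY$$
$$= \sqrt{\frac{d}{2\pi\theta}} \int_{-\infty}^{\infty} \exp\Big(-\frac{(Y - \gamma - \frac{1}{D}\theta)^2 d}{2\theta} + \frac{d}{D}\Big(\gamma + \frac{1}{2D}\theta\Big)\Big)\, dY = \exp\Big(\frac{d}{D}\Big(\gamma + \frac{1}{2D}\theta\Big)\Big) .$$
This grows at rate $\gamma + \frac{1}{2D}\theta$, which is higher than the rate $\gamma$ that most investors see. Indeed, we see that $p_d(Y)$ becomes more sharply peaked around $Y = \gamma$ as $d$ increases.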
SLIDE 16 By setting $d = 1$ in the above formula, we see that the return rate mean is
$$\mu = \mathrm{Ex}(R) = D\, \mathrm{Ex}\Big(e^{\frac{1}{D} X} - 1\Big) = D \Big(\exp\Big(\frac{1}{D}\Big(\gamma + \frac{1}{2D}\theta\Big)\Big) - 1\Big) .$$
Therefore $\mu > \gamma + \frac{1}{2D}\theta$, with $\mu \approx \gamma + \frac{1}{2D}\theta$ when $\frac{1}{D}\big(\gamma + \frac{1}{2D}\theta\big) \ll 1$. This shows that most investors will see a return rate that is below the return rate mean $\mu$, and far below it in volatile markets. This is because $e^{\frac{1}{D} X}$ amplifies the tail of the normal density. For a more realistic IID model with a density $p(X)$ that decays more slowly than a normal density as $X \to \infty$, this difference can be more striking. Said another way, most investors will not see the same return as Warren Buffett, but his return will boost the mean. The normal growth rate model confirms that $\gamma$ is a better proxy for how well a risky asset might perform than $\mu$ because $p_d(Y)$ becomes more peaked around $Y = \gamma$ as $d$ increases. We will extend this result to a general class of IID models that are more realistic.
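To get a feel for the size of the gap between $\mu$ and $\gamma$ in the normal growth rate model, the formula for $\mu$ above can be evaluated directly. The sketch below uses hypothetical parameter values chosen to mimic a volatile asset (annual volatility of about 0.4); the function name is illustrative.

```python
import math


def return_rate_mean(gamma, theta, D):
    """mu = D (exp((gamma + theta/(2D))/D) - 1) for normal growth rates."""
    return D * math.expm1((gamma + theta / (2.0 * D)) / D)


D = 252              # assumed trading days per year
gamma = 0.05         # hypothetical growth rate mean
theta = 0.16 * D     # hypothetical variance, chosen so that theta/D = 0.16

mu = return_rate_mean(gamma, theta, D)
mean_growth = gamma + theta / (2.0 * D)   # growth rate of the mean return

print(f"gamma = {gamma:.4f}, gamma + theta/(2D) = {mean_growth:.4f}, mu = {mu:.4f}")
```

Here the return rate mean is more than twice the growth rate mean, which is the amplification effect described above.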
SLIDE 17
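- Exercise. Use the unbiased estimators $\hat\mu$, $\hat\xi$, $\hat\gamma$, and $\hat\theta$ given by
$$\hat\mu = \frac{1}{D} \sum_{d=1}^{D} r(d) , \qquad \hat\xi = \frac{1}{D-1} \sum_{d=1}^{D} \big(r(d) - \hat\mu\big)^2 ,$$
$$\hat\gamma = \frac{1}{D} \sum_{d=1}^{D} x(d) , \qquad \hat\theta = \frac{1}{D-1} \sum_{d=1}^{D} \big(x(d) - \hat\gamma\big)^2 ,$$
to estimate $\mu$, $\xi$, $\gamma$, and $\theta$ given the share price history $\{s(d)\}_{d=0}^{D}$ with
$$r(d) = D \Big(\frac{s(d)}{s(d-1)} - 1\Big) , \qquad x(d) = D \log\Big(\frac{s(d)}{s(d-1)}\Big) ,$$
for each of the following assets. How do $\hat\mu$ and $\hat\gamma$ compare? $\hat\xi$ and $\hat\theta$? (a) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2009; (b) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2007; (c) S&P 500 and Russell 1000 and 2000 index funds in 2009; (d) S&P 500 and Russell 1000 and 2000 index funds in 2007.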
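One way to organize the computation in this exercise is sketched below. It is not a solution for the listed assets: the price history is a made-up five-point series, and the function name `estimators` is illustrative. Real data would be a full year of daily closing prices.

```python
import math


def estimators(prices):
    """Return (mu_hat, xi_hat, gamma_hat, theta_hat) from a price history s(0..D)."""
    D = len(prices) - 1
    # Return rates r(d) and growth rates x(d) for d = 1, ..., D.
    r = [D * (prices[d] / prices[d - 1] - 1.0) for d in range(1, D + 1)]
    x = [D * math.log(prices[d] / prices[d - 1]) for d in range(1, D + 1)]
    mu_hat = sum(r) / D
    xi_hat = sum((v - mu_hat) ** 2 for v in r) / (D - 1)
    gamma_hat = sum(x) / D
    theta_hat = sum((v - gamma_hat) ** 2 for v in x) / (D - 1)
    return mu_hat, xi_hat, gamma_hat, theta_hat


prices = [100.0, 102.0, 99.0, 101.0, 104.0]   # made-up share prices s(0), ..., s(4)
mu_hat, xi_hat, gamma_hat, theta_hat = estimators(prices)
```

Because the growth rates telescope, `gamma_hat` equals log(s(D)/s(0)), and `mu_hat` is never smaller than `gamma_hat` since log(1 + u) ≤ u.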
SLIDE 18
- 2. Stochastic Models of Portfolios with Risky Assets
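We now consider a market of $N$ risky assets. Let $\{s_i(d)\}_{d=0}^{D}$ be the share price history of asset $i$ over a year. Let $\{r_i(d)\}_{d=1}^{D}$ and $\{x_i(d)\}_{d=1}^{D}$ be the associated return rate and growth rate histories, where
$$r_i(d) = D \Big(\frac{s_i(d)}{s_i(d-1)} - 1\Big) , \qquad x_i(d) = D \log\Big(\frac{s_i(d)}{s_i(d-1)}\Big) .$$
Because each $s_i(d)$ is positive, each $r_i(d)$ is in $(-D, \infty)$ while each $x_i(d)$ is in $(-\infty, \infty)$. Let $\mathbf{r}(d)$ and $\mathbf{x}(d)$ be the $N$-vectors
$$\mathbf{r}(d) = \begin{pmatrix} r_1(d) \\ \vdots \\ r_N(d) \end{pmatrix} , \qquad \mathbf{x}(d) = \begin{pmatrix} x_1(d) \\ \vdots \\ x_N(d) \end{pmatrix} .$$
The return rate and growth rate histories can then be expressed simply as $\{\mathbf{r}(d)\}_{d=1}^{D}$ and $\{\mathbf{x}(d)\}_{d=1}^{D}$ respectively.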
SLIDE 19 IID Models for Markets. An IID model for this market draws $D$ random vectors $\{\mathbf{R}(d)\}_{d=1}^{D}$ from a fixed probability density $q(\mathbf{R})$ over $(-D, \infty)^N$. Such a model is reasonable if the points $\{(d, \mathbf{r}(d))\}_{d=1}^{D}$ are distributed in a way that is uniform in $d$. This is hard to visualize when $N$ is not small. However, a necessary condition for the entire market to have an IID model is that every asset has an IID model. This can be visualized for each asset by plotting the points $\{(d, r_i(d))\}_{d=1}^{D}$ in the $dr$-plane and seeing if they appear to be distributed in a way that is uniform in $d$. Similar visual tests based on pairs of assets can be carried out by plotting the points $\{(d, r_i(d), r_j(d))\}_{d=1}^{D}$ in $\mathbb{R}^3$ with an interactive 3D graphics package.
- Remark. Such visual tests can only warn you when IID models might not be appropriate for describing the data. There are also statistical tests that can play this role. There is no visual or statistical test that can ensure the validity of using an IID model for a market. However, due to their simplicity, IID models are often used unless there is a good reason not to use them.
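SLIDE 20 After you have decided to use an IID model for the market, you must gather statistical information about the return rate probability density $q(\mathbf{R})$. The mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Xi}$ of $\mathbf{R}$ are given by
$$\boldsymbol{\mu} = \int \mathbf{R}\, q(\mathbf{R})\, d\mathbf{R} , \qquad \boldsymbol{\Xi} = \int (\mathbf{R} - \boldsymbol{\mu})(\mathbf{R} - \boldsymbol{\mu})^T q(\mathbf{R})\, d\mathbf{R} .$$
Given any sample $\{\mathbf{R}(d)\}_{d=1}^{D}$ drawn from $q(\mathbf{R})$, these have the unbiased estimators
$$\hat{\boldsymbol{\mu}} = \frac{1}{D} \sum_{d=1}^{D} \mathbf{R}(d) , \qquad \hat{\boldsymbol{\Xi}} = \frac{1}{D-1} \sum_{d=1}^{D} \big(\mathbf{R}(d) - \hat{\boldsymbol{\mu}}\big)\big(\mathbf{R}(d) - \hat{\boldsymbol{\mu}}\big)^T .$$
If we assume that such a sample is given by the return rate data $\{\mathbf{r}(d)\}_{d=1}^{D}$ then these estimators are given in terms of the vector $\mathbf{m}$ and matrix $\mathbf{V}$ by
$$\hat{\boldsymbol{\mu}} = \mathbf{m} , \qquad \hat{\boldsymbol{\Xi}} = D\, \mathbf{V} .$$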
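The calibration of $\mathbf{m}$ and $\mathbf{V}$ from return rate data can be sketched as follows. This is a minimal illustration that takes the relations $\hat{\boldsymbol{\mu}} = \mathbf{m}$ and $\hat{\boldsymbol{\Xi}} = D\,\mathbf{V}$ above at face value with uniform weights; the data are made up and the function name `calibrate` is illustrative.

```python
def calibrate(returns):
    """Return (m, V) from a list of D return-rate vectors, each of length N.

    m is the sample mean vector and V is the unbiased sample covariance
    matrix divided by D, so that Xi_hat = D * V.
    """
    D = len(returns)
    N = len(returns[0])
    m = [sum(r[i] for r in returns) / D for i in range(N)]
    V = [
        [
            sum((r[i] - m[i]) * (r[j] - m[j]) for r in returns) / (D * (D - 1))
            for j in range(N)
        ]
        for i in range(N)
    ]
    return m, V


# Made-up return rate history for N = 2 assets over D = 3 days.
returns = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
m, V = calibrate(returns)
```

For this toy history the result is m = [2, 2] and V = [[1/3, -1/3], [-1/3, 4/3]]; V is symmetric by construction.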
SLIDE 21 IID Models for Markowitz Portfolios. Recall that the value of a portfolio that holds a risk-free balance $b_{rf}(d)$ with return rate $\mu_{rf}$ and $n_i(d)$ shares of asset $i$ during trading day $d$ is
$$\Pi(d) = b_{rf}(d) \Big(1 + \tfrac{1}{D}\, \mu_{rf}\Big) + \sum_{i=1}^{N} n_i(d)\, s_i(d) .$$
We will assume that $\Pi(d) > 0$ for every $d$. Then the return rate $r(d)$ and growth rate $x(d)$ for this portfolio on trading day $d$ are given by
$$r(d) = D \Big(\frac{\Pi(d)}{\Pi(d-1)} - 1\Big) , \qquad x(d) = D \log\Big(\frac{\Pi(d)}{\Pi(d-1)}\Big) .$$
Recall that the return rate $r(d)$ for the Markowitz portfolio associated with the distribution $\mathbf{f}$ can be expressed in terms of the vector $\mathbf{r}(d)$ as
$$r(d) = (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \mathbf{f}^T \mathbf{r}(d) .$$
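SLIDE 22 This implies that if the underlying market has an IID model with return rate probability density $q(\mathbf{R})$ then the Markowitz portfolio with distribution $\mathbf{f}$ has the IID model with return rate probability density $q_f(R)$ given by
$$q_f(R) = \int \delta\big(R - (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} - \mathbf{R}^T \mathbf{f}\big)\, q(\mathbf{R})\, d\mathbf{R} .$$
Here $\delta(\,\cdot\,)$ denotes the Dirac delta distribution, which can be defined by the property that for every sufficiently nice function $\psi(R)$
$$\int \psi(R)\, \delta\big(R - (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} - \mathbf{R}^T \mathbf{f}\big)\, dR = \psi\big((1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \mathbf{R}^T \mathbf{f}\big) .$$
Hence, for every sufficiently nice function $\psi(R)$ we have the formula
$$\mathrm{Ex}\big(\psi(R)\big) = \int \psi(R)\, q_f(R)\, dR = \iint \psi(R)\, \delta\big(R - (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} - \mathbf{R}^T \mathbf{f}\big)\, q(\mathbf{R})\, d\mathbf{R}\, dR$$
$$= \int \psi\big((1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \mathbf{R}^T \mathbf{f}\big)\, q(\mathbf{R})\, d\mathbf{R} .$$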
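SLIDE 23 We can thereby compute the mean $\mu$ and variance $\xi$ of $q_f(R)$ to be
$$\mu = \mathrm{Ex}(R) = \int \big((1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \mathbf{R}^T \mathbf{f}\big)\, q(\mathbf{R})\, d\mathbf{R} = (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} \int q(\mathbf{R})\, d\mathbf{R} + \Big(\int \mathbf{R}\, q(\mathbf{R})\, d\mathbf{R}\Big)^T \mathbf{f} = (1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \boldsymbol{\mu}^T \mathbf{f} ,$$
$$\xi = \mathrm{Ex}\big((R - \mu)^2\big) = \int \big((1 - \mathbf{1}^T \mathbf{f})\, \mu_{rf} + \mathbf{R}^T \mathbf{f} - \mu\big)^2 q(\mathbf{R})\, d\mathbf{R} = \int \big(\mathbf{R}^T \mathbf{f} - \boldsymbol{\mu}^T \mathbf{f}\big)^2 q(\mathbf{R})\, d\mathbf{R}$$
$$= \int \mathbf{f}^T (\mathbf{R} - \boldsymbol{\mu})(\mathbf{R} - \boldsymbol{\mu})^T \mathbf{f}\, q(\mathbf{R})\, d\mathbf{R} = \mathbf{f}^T \Big(\int (\mathbf{R} - \boldsymbol{\mu})(\mathbf{R} - \boldsymbol{\mu})^T q(\mathbf{R})\, d\mathbf{R}\Big) \mathbf{f} = \mathbf{f}^T \boldsymbol{\Xi}\, \mathbf{f} ,$$
where we have used the facts that
$$\int q(\mathbf{R})\, d\mathbf{R} = 1 , \qquad \int \mathbf{R}\, q(\mathbf{R})\, d\mathbf{R} = \boldsymbol{\mu} , \qquad \int (\mathbf{R} - \boldsymbol{\mu})(\mathbf{R} - \boldsymbol{\mu})^T q(\mathbf{R})\, d\mathbf{R} = \boldsymbol{\Xi} .$$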
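SLIDE 24 Because $\boldsymbol{\mu}$ and $\boldsymbol{\Xi}$ have the unbiased estimators $\hat{\boldsymbol{\mu}} = \mathbf{m}$ and $\hat{\boldsymbol{\Xi}} = D \mathbf{V}$, we see from the foregoing formulas that $\mu$ and $\xi$ have the unbiased estimators
$$\hat\mu = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f} , \qquad \hat\xi = D\, \mathbf{f}^T \mathbf{V} \mathbf{f} .$$
The idea now is to treat the Markowitz portfolio as a single risky asset that can be modeled by the IID process associated with the growth rate probability density $p_f(X)$ given by
$$p_f(X) = q_f\Big(D \big(e^{\frac{1}{D} X} - 1\big)\Big)\, e^{\frac{1}{D} X} .$$
The mean $\gamma$ and variance $\theta$ of $X$ are given by
$$\gamma = \int X\, p_f(X)\, dX , \qquad \theta = \int (X - \gamma)^2 p_f(X)\, dX .$$
We know from our study of one risky asset that $\gamma$ is a good proxy for return, while $\sqrt{\frac{1}{D} \theta}$ is a good proxy for risk. We therefore would like to estimate $\gamma$ and $\theta$ in terms of $\hat\mu$ and $\hat\xi$.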
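SLIDE 25 Estimators for $\gamma$ and $\theta$. Introduce the function $K(\tau) = \log\big(\mathrm{Ex}(e^{\tau X})\big)$. Because $R = D(e^{\frac{1}{D} X} - 1)$ and $\mathrm{Ex}(e^{\frac{1}{D} X}) = e^{K(\frac{1}{D})}$, we have
$$\mu = \mathrm{Ex}(R) = D \Big(e^{K(\frac{1}{D})} - 1\Big) .$$
Because $R - \mu = D \big(e^{\frac{1}{D} X} - e^{K(\frac{1}{D})}\big)$ and $\mathrm{Ex}(e^{\frac{2}{D} X}) = e^{K(\frac{2}{D})}$, we have
$$\xi = \mathrm{Ex}\big((R - \mu)^2\big) = D^2 \Big(e^{K(\frac{2}{D})} - e^{2 K(\frac{1}{D})}\Big) .$$
Because $e^{K(\frac{1}{D})} = 1 + \frac{\mu}{D}$, we see that
$$e^{K(\frac{2}{D}) - 2 K(\frac{1}{D})} = 1 + \frac{\xi}{(D + \mu)^2} .$$
Therefore knowing $\mu$ and $\xi$ is equivalent to knowing $K(\frac{1}{D})$ and $K(\frac{2}{D})$.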
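SLIDE 26 The function $K(\tau)$ is the cumulant generating function for $X$ because it recovers the cumulants $\{\kappa_m\}_{m=1}^{\infty}$ of $X$ by the formula $\kappa_m = K^{(m)}(0)$. In particular, you can check that
$$K'(0) = \gamma , \qquad K''(0) = \theta .$$
Because $K(0) = 0$, we interpolate the values $K(0)$, $K(\frac{1}{D})$, and $K(\frac{2}{D})$ with a quadratic polynomial to construct an estimator $\hat K(\tau)$ of $K(\tau)$ as
$$\hat K(\tau) = \tau D\, K(\tfrac{1}{D}) + \tau \Big(\tau - \frac{1}{D}\Big) \frac{D^2}{2} \Big(K(\tfrac{2}{D}) - 2 K(\tfrac{1}{D})\Big) .$$
We then construct estimators $\hat\gamma$ and $\hat\theta$ by
$$\hat\gamma = \hat K'(0) = D\, K(\tfrac{1}{D}) - \tfrac{1}{2} D \Big(K(\tfrac{2}{D}) - 2 K(\tfrac{1}{D})\Big) = D \log\Big(1 + \frac{\mu}{D}\Big) - \tfrac{1}{2} D \log\Big(1 + \frac{\xi}{(D + \mu)^2}\Big) ,$$
$$\hat\theta = \hat K''(0) = D^2 \Big(K(\tfrac{2}{D}) - 2 K(\tfrac{1}{D})\Big) = D^2 \log\Big(1 + \frac{\xi}{(D + \mu)^2}\Big) .$$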
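SLIDE 27 Upon replacing the $\mu$ and $\xi$ in the foregoing estimators for $\hat\gamma$ and $\hat\theta$ with the estimators $\hat\mu = \mu_{rf}(1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f}$ and $\hat\xi = D\, \mathbf{f}^T \mathbf{V} \mathbf{f}$, we obtain the new estimators
$$\hat\gamma = D \log\Big(1 + \frac{\hat\mu}{D}\Big) - \tfrac{1}{2} D \log\Big(1 + \frac{\hat\xi}{(D + \hat\mu)^2}\Big) , \qquad \hat\theta = D^2 \log\Big(1 + \frac{\hat\xi}{(D + \hat\mu)^2}\Big) .$$
Finally, if we assume $D$ is large in the sense that
$$\Big|\frac{\hat\mu}{D}\Big| \ll 1 , \qquad \frac{\hat\xi}{D^2} \ll 1 ,$$
then, by keeping the leading order of each term, we arrive at the estimators
$$\hat\gamma = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f} - \tfrac{1}{2}\, \mathbf{f}^T \mathbf{V} \mathbf{f} , \qquad \frac{\hat\theta}{D} = \mathbf{f}^T \mathbf{V} \mathbf{f} .$$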
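The effect of the "large D" approximation can be gauged by evaluating both forms of the estimators on the same inputs. The sketch below does this for a hypothetical two-asset market; the values of m, V, f, and the risk-free rate are made up, and the helper names are illustrative.

```python
import math


def quad_form(f, V):
    """Compute f^T V f for a list f and a nested-list matrix V."""
    n = len(f)
    return sum(f[i] * V[i][j] * f[j] for i in range(n) for j in range(n))


def growth_estimators(f, m, V, mu_rf, D):
    """Return ((gamma_log, theta_log), (gamma_lead, theta_lead))."""
    mu_hat = mu_rf * (1.0 - sum(f)) + sum(mi * fi for mi, fi in zip(m, f))
    xi_hat = D * quad_form(f, V)
    ratio = xi_hat / (D + mu_hat) ** 2
    # Estimators from the quadratic interpolant of K, before taking D large.
    gamma_log = D * math.log1p(mu_hat / D) - 0.5 * D * math.log1p(ratio)
    theta_log = D * D * math.log1p(ratio)
    # Leading-order ("large D") estimators.
    gamma_lead = mu_hat - 0.5 * quad_form(f, V)
    theta_lead = xi_hat
    return (gamma_log, theta_log), (gamma_lead, theta_lead)


D = 252                               # assumed trading days per year
m = [0.10, 0.06]                      # hypothetical return rate means
V = [[0.04, 0.01], [0.01, 0.02]]      # hypothetical covariance matrix
f = [0.5, 0.3]                        # hypothetical allocation (20% risk-free)
mu_rf = 0.02                          # hypothetical risk-free rate

(gamma_log, theta_log), (gamma_lead, theta_lead) = growth_estimators(f, m, V, mu_rf, D)
```

For inputs of this size the two versions of each estimator agree to several digits, so the simpler leading-order form loses little here. A much larger f or V would widen the gap.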
SLIDE 28
- Remark. The estimators $\hat\gamma$ and $\hat\theta$ given above have at least three potential sources of error:
- the estimators $\hat\mu$ and $\hat\xi$ upon which they are based,
- the interpolant $\hat K(\tau)$ used to estimate $\gamma$ and $\theta$ from $\mu$ and $\xi$,
- the “large D” approximation made at the bottom of the previous page.
These approximations all assume that the return rate distribution for each Markowitz portfolio is described by a density $q_f(R)$ that is narrow enough for some moment beyond the second to exist. The last approximation also assumes both that $\frac{1}{D} \mathbf{m}$ and $\frac{1}{D} \mathbf{V}$ are small and that $\mathbf{f}$ is not very large. These assumptions should be examined carefully in volatile markets.
SLIDE 29
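- Remark. If the Markowitz portfolio specified by $\mathbf{f}$ has growth rates $X$ that are normally distributed with mean $\gamma$ and variance $\theta$ then
$$p_f(X) = \frac{1}{\sqrt{2\pi\theta}} \exp\Big(-\frac{(X - \gamma)^2}{2\theta}\Big) .$$
A direct calculation then shows that
$$\mathrm{Ex}\big(e^{\tau X}\big) = \frac{1}{\sqrt{2\pi\theta}} \int \exp\Big(-\frac{(X - \gamma)^2}{2\theta} + \tau X\Big)\, dX = \frac{1}{\sqrt{2\pi\theta}} \int \exp\Big(-\frac{(X - \gamma - \theta\tau)^2}{2\theta} + \gamma\tau + \tfrac{1}{2} \theta \tau^2\Big)\, dX = \exp\Big(\gamma\tau + \tfrac{1}{2} \theta \tau^2\Big) ,$$
whereby $K(\tau) = \log\big(\mathrm{Ex}(e^{\tau X})\big) = \gamma\tau + \tfrac{1}{2} \theta \tau^2$. In this case we have $\hat K(\tau) = K(\tau)$, so the estimators $\hat\gamma = \hat K'(0)$ and $\hat\theta = \hat K''(0)$ are exact. More generally, if $K(\tau)$ is thrice continuously differentiable over $[0, \frac{2}{D}]$ then the estimators $\hat\gamma$ and $\hat\theta$ make errors that are $O(\frac{1}{D^2})$ and $O(\frac{1}{D})$ respectively.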
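The exactness claim for normally distributed growth rates can be checked numerically by feeding $K(\tau) = \gamma\tau + \frac{1}{2}\theta\tau^2$ into the interpolation formulas for $\hat\gamma$ and $\hat\theta$. The sketch below uses made-up values of $\gamma$, $\theta$, and $D$; the function name is illustrative.

```python
def interpolated_estimators(K1, K2, D):
    """gamma_hat and theta_hat from K(1/D) = K1 and K(2/D) = K2, using K(0) = 0."""
    second_difference = K2 - 2.0 * K1
    gamma_hat = D * K1 - 0.5 * D * second_difference
    theta_hat = D * D * second_difference
    return gamma_hat, theta_hat


def K_normal(tau, gamma, theta):
    """Cumulant generating function of a normal density."""
    return gamma * tau + 0.5 * theta * tau * tau


D = 252          # assumed trading days per year
gamma = 0.07     # hypothetical growth rate mean
theta = 12.0     # hypothetical growth rate variance

gamma_hat, theta_hat = interpolated_estimators(
    K_normal(1.0 / D, gamma, theta), K_normal(2.0 / D, gamma, theta), D
)
```

Up to rounding error the inputs are recovered, as expected when a quadratic interpolant is applied to a quadratic function.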
SLIDE 30
- Exercise. When the final forms of the estimators $\hat\gamma$ and $\hat\theta$ are applied to a single risky asset, they reduce to
$$\hat\gamma = \hat\mu - \frac{1}{2D}\, \hat\xi , \qquad \hat\theta = \hat\xi .$$
Use these to estimate $\gamma$ and $\theta$ for each of the following assets given the share price history $\{s(d)\}_{d=0}^{D}$. How do these $\hat\gamma$ and $\hat\theta$ compare with the unbiased estimators for $\gamma$ and $\theta$ that you obtained in the previous problem? (a) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2009; (b) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2007; (c) S&P 500 and Russell 1000 and 2000 index funds in 2009; (d) S&P 500 and Russell 1000 and 2000 index funds in 2007.
- Exercise. Compute $\hat\gamma$ and $\hat\theta$ based on daily data for the Markowitz portfolio with value equally distributed among the assets in each of the groups given in the previous exercise.
SLIDE 31
- 3. Model-Based Objective Functions
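An IID model for the Markowitz portfolio with distribution $\mathbf{f}$ satisfies
$$\mathrm{Ex}\Big(\log\Big(\frac{\Pi(d)}{\Pi(0)}\Big)\Big) = \frac{d}{D}\, \gamma , \qquad \mathrm{Var}\Big(\log\Big(\frac{\Pi(d)}{\Pi(0)}\Big)\Big) = \frac{d}{D^2}\, \theta ,$$
where $\gamma$ and $\theta$ are estimated from a share price history by
$$\hat\gamma = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f} - \tfrac{1}{2}\, \mathbf{f}^T \mathbf{V} \mathbf{f} , \qquad \frac{\hat\theta}{D} = \mathbf{f}^T \mathbf{V} \mathbf{f} .$$
We see that $\hat\gamma\, t$ is then the estimated expected growth of the IID model while $\mathbf{f}^T \mathbf{V} \mathbf{f}\, t$ is its estimated variance at time $t = d/D$ years. Our approach to portfolio management will be to select a distribution $\mathbf{f}$ that maximizes some objective function. Here we develop a family of such objective functions built from $\hat\gamma$ and $\hat\theta$ with the aid of two important tools from probability, the Law of Large Numbers and the Central Limit Theorem.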
SLIDE 32 Law of Large Numbers. Let $\{X(d)\}_{d=1}^{\infty}$ be any sequence of IID random variables drawn from a probability density $p(X)$ with mean $\gamma$ and variance $\theta > 0$. Let $\{Y(d)\}_{d=1}^{\infty}$ be the sequence of random variables defined by
$$Y(d) = \frac{1}{d} \sum_{d'=1}^{d} X(d') \quad \text{for every } d = 1, \dots, \infty .$$
You can easily check that
$$\mathrm{Ex}(Y(d)) = \gamma , \qquad \mathrm{Var}(Y(d)) = \frac{\theta}{d} .$$
Given any $\delta > 0$ the Law of Large Numbers states that
$$\lim_{d \to \infty} \Pr\big\{ |Y(d) - \gamma| \ge \delta \big\} = 0 .$$
This limit is not uniform in $\delta$. Its convergence rate can be estimated by the Chebyshev Inequality, which yields the (not uniform in $\delta$) upper bound
$$\Pr\big\{ |Y(d) - \gamma| \ge \delta \big\} \le \frac{\mathrm{Var}(Y(d))}{\delta^2} = \frac{1}{\delta^2} \frac{\theta}{d} .$$
SLIDE 33 Growth Rate Mean. Because the value of the associated portfolio is
$$\Pi(d) = \Pi(0) \exp\Big(\frac{d}{D}\, Y(d)\Big) ,$$
we see that $Y(d)$ is the growth rate of the portfolio at day $d$. The Law of Large Numbers implies that $Y(d)$ is likely to approach $\gamma$ as $d \to \infty$. This suggests that investors whose goal is to maximize the value of their portfolio over an extended period should maximize $\gamma$. More precisely, it suggests that such investors should select $\mathbf{f}$ to maximize the estimator $\hat\gamma$.
- Remark. The suggestion to maximize $\hat\gamma$ rests upon the assumption that the investor will hold the portfolio for an extended period. This is a suitable assumption for most young investors, but not for many old investors. The development of objective functions that are better suited for older investors requires more information about $Y(d)$ than the Law of Large Numbers provides. However, this additional information can be estimated with the aid of the Central Limit Theorem.
SLIDE 34 Central Limit Theorem. Let $\{X(d)\}_{d=1}^{\infty}$ be any sequence of IID random variables drawn from a probability density $p(X)$ with mean $\gamma$ and variance $\theta > 0$. Let $\{Y(d)\}_{d=1}^{\infty}$ be the sequence of random variables defined by
$$Y(d) = \frac{1}{d} \sum_{d'=1}^{d} X(d') \quad \text{for every } d = 1, \dots, \infty .$$
Recall that
$$\mathrm{Ex}(Y(d)) = \gamma , \qquad \mathrm{Var}(Y(d)) = \frac{\theta}{d} .$$
Now let $\{Z(d)\}_{d=1}^{\infty}$ be the sequence of random variables defined by
$$Z(d) = \frac{Y(d) - \gamma}{\sqrt{\theta / d}} \quad \text{for every } d = 1, \dots, \infty .$$
These random variables have been normalized so that
$$\mathrm{Ex}(Z(d)) = 0 , \qquad \mathrm{Var}(Z(d)) = 1 .$$
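SLIDE 35 The Central Limit Theorem states that as $d \to \infty$ the limiting distribution of $Z(d)$ will be the mean-zero, variance-one normal distribution. Specifically, for every $\zeta \in \mathbb{R}$ it implies that
$$\lim_{d \to \infty} \Pr\big\{ Z(d) \ge -\zeta \big\} = \int_{-\zeta}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}\, dZ .$$
This can be expressed in terms of $Y(d)$ as
$$\lim_{d \to \infty} \Pr\Big\{ Y(d) \ge \gamma - \zeta \sqrt{\frac{\theta}{d}} \Big\} = \int_{-\zeta}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}\, dZ .$$
- Remark. The power of the Central Limit Theorem is that it assumes so little about the underlying probability density $p(X)$. Specifically, it assumes that
$$\int_{-\infty}^{\infty} X^2 p(X)\, dX < \infty ,$$
and that
$$0 < \theta = \int_{-\infty}^{\infty} (X - \gamma)^2 p(X)\, dX , \quad \text{where} \quad \gamma = \int_{-\infty}^{\infty} X\, p(X)\, dX .$$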
SLIDE 36
- Remark. The Central Limit Theorem does not estimate how fast this limit is approached. Any such estimate would require additional assumptions about the underlying probability density $p(X)$. It will not be uniform in $\zeta$.
- Remark. In an IID model of a portfolio $Y(d)$ is the growth rate of the portfolio when it is held for $d$ days. The Central Limit Theorem shows that as $d \to \infty$ the values of $Y(d)$ become strongly peaked around $\gamma$. This behavior seems to be consistent with the idea that a reasonable approach towards portfolio management is to select $\mathbf{f}$ to maximize the estimator $\hat\gamma$. However, by taking $\zeta = 0$ we see that the Central Limit Theorem implies
$$\lim_{d \to \infty} \Pr\big\{ Y(d) \ge \gamma \big\} = \tfrac{1}{2} .$$
This shows that in the long run the growth rate of a portfolio will exceed $\gamma$ with a probability of only $\frac{1}{2}$. A conservative investor might want the portfolio to exceed the optimized growth rate with a higher probability.
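SLIDE 37 Growth Rate Exceeded with Probability. Let $\Gamma(\lambda, T)$ be the growth rate exceeded by a portfolio with probability $\lambda$ at time $T$ in years. Here we will use the Central Limit Theorem to construct an estimator $\hat\Gamma(\lambda, T)$ of this quantity. We do this by assuming $T = d/D$ is large enough that we can use the approximation
$$\Pr\Big\{ Y(d) \ge \gamma - \zeta \sqrt{\frac{\theta}{d}} \Big\} \approx \int_{-\zeta}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}\, dZ .$$
Given any probability $\lambda \in (0, 1)$, we set
$$\lambda = \int_{-\zeta}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}\, dZ = \int_{-\infty}^{\zeta} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} Z^2}\, dZ \equiv N(\zeta) .$$
Our approximation can then be expressed as
$$\Pr\Big\{ Y(d) \ge \gamma - \frac{\zeta}{\sqrt{T}}\, \sigma \Big\} \approx \lambda , \quad \text{where} \quad \sigma = \sqrt{\frac{\theta}{D}} .$$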
SLIDE 38 Finally, we replace $\gamma$ and $\sigma$ in the above approximation by the estimators
$$\hat\gamma = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f} - \tfrac{1}{2}\, \mathbf{f}^T \mathbf{V} \mathbf{f} , \qquad \hat\sigma = \sqrt{\mathbf{f}^T \mathbf{V} \mathbf{f}} .$$
This yields the estimator
$$\hat\Gamma(\lambda, T) = \hat\gamma - \frac{\zeta}{\sqrt{T}}\, \hat\sigma = \hat\mu - \tfrac{1}{2} \hat\sigma^2 - \frac{\zeta}{\sqrt{T}}\, \hat\sigma ,$$
where $\hat\mu = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f}$ and $\zeta = N^{-1}(\lambda)$.
- Remark. The only new assumption we have made in order to construct this estimator is that $T$ is large enough for the Central Limit Theorem to yield a good approximation of the distribution of growth rates. Investors often choose $T$ to be the interval at which the portfolio will be rebalanced, regardless of whether $T$ is large enough for the approximation to be valid. If an investor plans to rebalance once a year then $T = 1$, twice a year then $T = \frac{1}{2}$, and four times a year then $T = \frac{1}{4}$. The smaller $T$, the less likely it is that the Central Limit Theorem approximation is valid.
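The estimator $\hat\Gamma(\lambda, T)$ is simple to evaluate once $\mathbf{m}$, $\mathbf{V}$, and $\mathbf{f}$ are in hand. The following sketch uses Python's `statistics.NormalDist` for $N^{-1}$; the market data and allocation are hypothetical, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def growth_rate_exceeded(f, m, V, mu_rf, lam, T):
    """Estimate the growth rate exceeded with probability lam after T years."""
    n = len(f)
    mu_hat = mu_rf * (1.0 - sum(f)) + sum(m[i] * f[i] for i in range(n))
    var_hat = sum(f[i] * V[i][j] * f[j] for i in range(n) for j in range(n))
    sigma_hat = math.sqrt(var_hat)
    zeta = NormalDist().inv_cdf(lam)   # zeta = N^{-1}(lam)
    return mu_hat - 0.5 * var_hat - zeta / math.sqrt(T) * sigma_hat


m = [0.10, 0.06]                    # hypothetical return rate means
V = [[0.04, 0.01], [0.01, 0.02]]    # hypothetical covariance matrix
f = [0.5, 0.3]                      # hypothetical allocation
mu_rf = 0.02                        # hypothetical risk-free rate

median_growth = growth_rate_exceeded(f, m, V, mu_rf, 0.50, 1.0)
cautious_year = growth_rate_exceeded(f, m, V, mu_rf, 0.84, 1.0)
cautious_quarter = growth_rate_exceeded(f, m, V, mu_rf, 0.84, 0.25)
```

With $\lambda = \frac{1}{2}$ the estimator reduces to $\hat\gamma$. Raising $\lambda$ or shortening $T$ lowers the value, reflecting the larger penalty $\zeta\hat\sigma/\sqrt{T}$.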
SLIDE 39 Risk Aversion. The idea now will be to select the admissible Markowitz portfolio that maximizes $\hat\Gamma(\lambda, T)$ given a choice of $\lambda$ and $T$ by the investor. In other words, the objective will be to maximize the growth rate that will be exceeded by the portfolio with probability $\lambda$ when it is held for $T$ years. Because $1 - \lambda$ is the fraction of times the investor is willing to experience a downside tail event, the choice of $\lambda$ measures the risk aversion of the investor. More risk averse investors will select a higher $\lambda$.
- Remark. The risk aversion of an investor generally increases with age. Retirees whose portfolio provides them with an income that covers much of their living expenses will generally be extremely risk averse. Investors within ten years of retirement will be fairly risk averse because they have less time for their nest-egg to recover from any economic downturn. In contrast, young investors can be less risk averse because they have more time to experience economic upturns and because they are typically far from their peak earning capacity.
SLIDE 40 An investor can simply select $\zeta$ such that $\lambda = N(\zeta)$ is a probability that reflects their risk aversion. For example, based on the tabulations
$$N(0) = .5000 , \quad N(\tfrac{1}{4}) \approx .5987 , \quad N(\tfrac{1}{2}) \approx .6915 , \quad N(\tfrac{3}{4}) \approx .7734 ,$$
$$N(1) \approx .8413 , \quad N(\tfrac{5}{4}) \approx .8944 , \quad N(\tfrac{3}{2}) \approx .9332 , \quad N(\tfrac{7}{4}) \approx .9599 ,$$
an investor who is willing to experience a downside tail event roughly
- once every two years might select $\zeta = 0$,
- twice every five years might select $\zeta = \frac{1}{4}$,
- thrice every ten years might select $\zeta = \frac{1}{2}$,
- twice every nine years might select $\zeta = \frac{3}{4}$,
- once every six years might select $\zeta = 1$,
- once every ten years might select $\zeta = \frac{5}{4}$,
- once every fifteen years might select $\zeta = \frac{3}{2}$,
- once every twenty years might select $\zeta = \frac{7}{4}$.
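The tabulated values of $N(\zeta)$ can be regenerated with the standard normal distribution in Python's `statistics` module. The sketch below also reports $1/(1 - N(\zeta))$, the average number of holding periods between downside tail events, which reads as years under the assumption that $T = 1$.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, variance 1

# zeta = 0, 1/4, 1/2, ..., 7/4 paired with lambda = N(zeta).
table = [(k / 4, standard_normal.cdf(k / 4)) for k in range(8)]

for zeta, lam in table:
    periods = 1.0 / (1.0 - lam)  # average periods per downside tail event
    print(f"zeta = {zeta:4.2f}   N(zeta) = {lam:.4f}   about one event per {periods:4.1f} periods")
```

The printed frequencies only roughly match the round phrases used on the slide, which is why the slide says "roughly".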
SLIDE 41
- Remark. The Central Limit Theorem approximation generally degrades
badly as ζ increases because p(X) typically decays much more slowly than a normal density as X → −∞. It is therefore a bad idea to pick ζ > 2 based on this approximation. Fortunately, ζ = 7
4 already corresponds to a
fairly conservative investor.
- Remark. You should pick a larger value of ζ whenever your analysis of the
historical data gives you less confidence either in the calibration of V and
m or in the validity of an IID model.
- Remark. Our approach is similar to something in financial management
called value at risk. The finance problem is much harder because the time horizon T considered there is much shorter, typically on the order of days. In that setting the Central Limit Theorem approximation is certainly invalid.
SLIDE 42
- 4. Model-Based Portfolio Management
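We now address the problem of how to manage a portfolio that contains $N$ risky assets along with a risk-free safe investment and possibly a risk-free credit line. Given the mean vector $\mathbf{m}$, the covariance matrix $\mathbf{V}$, and the risk-free rates $\mu_{si}$ and $\mu_{cl}$, the idea is to select the portfolio distribution $\mathbf{f}$ that maximizes an objective function of the form
$$\hat\Gamma(\mathbf{f}) = \hat\mu - \tfrac{1}{2} \hat\sigma^2 - \chi\, \hat\sigma ,$$
where
$$\hat\mu = \mu_{rf} (1 - \mathbf{1}^T \mathbf{f}) + \mathbf{m}^T \mathbf{f} , \qquad \hat\sigma = \sqrt{\mathbf{f}^T \mathbf{V} \mathbf{f}} , \qquad \mu_{rf} = \begin{cases} \mu_{si} & \text{for } \mathbf{1}^T \mathbf{f} < 1 , \\ \mu_{cl} & \text{for } \mathbf{1}^T \mathbf{f} > 1 . \end{cases}$$
Here $\chi = \zeta / \sqrt{T}$ where $\zeta \ge 0$ is the risk aversion coefficient and $T > 0$ is a time horizon that is usually the time to the next portfolio rebalancing. Both $\zeta$ and $T$ are chosen by the investor.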
SLIDE 43
Reduced Maximization Problem. Because frontier portfolios minimize $\hat\sigma$ for a given value of $\hat\mu$, the optimal $\mathbf{f}$ clearly must be a frontier portfolio. Because the optimal portfolio must also be more efficient than every other portfolio with the same volatility, it must lie on the efficient frontier. Recall that the efficient frontier is a curve $\mu = \mu_{ef}(\sigma)$ in the $\sigma\mu$-plane given by an increasing, concave, continuously differentiable function $\mu_{ef}(\sigma)$ that is defined over $[0, \infty)$ for the unconstrained One Risk-Free Rate and Two Risk-Free Rates models, and over $[0, \sigma_{mx}]$ for the long portfolio model. The problem thereby reduces to finding $\sigma$ that maximizes
$$\Gamma_{ef}(\sigma) = \mu_{ef}(\sigma) - \tfrac{1}{2} \sigma^2 - \chi\, \sigma .$$
This function has the continuous derivative
$$\Gamma'_{ef}(\sigma) = \mu'_{ef}(\sigma) - \sigma - \chi .$$
Because $\mu_{ef}(\sigma)$ is concave, $\Gamma'_{ef}(\sigma)$ is strictly decreasing.
SLIDE 44 Because $\Gamma'_{ef}(\sigma)$ is strictly decreasing, there are three possibilities.
- $\Gamma_{ef}(\sigma)$ takes its maximum at $\sigma = 0$, the left endpoint of its interval of definition. This case arises whenever $\Gamma'_{ef}(0) \le 0$.
- $\Gamma_{ef}(\sigma)$ takes its maximum in the interior of its interval of definition at the unique point $\sigma = \sigma_{opt}$ that solves the equation
$$\Gamma'_{ef}(\sigma) = \mu'_{ef}(\sigma) - \sigma - \chi = 0 .$$
This case arises for the unconstrained models whenever $\Gamma'_{ef}(0) > 0$, and for the long portfolio model whenever $\Gamma'_{ef}(\sigma_{mx}) < 0 < \Gamma'_{ef}(0)$.
- $\Gamma_{ef}(\sigma)$ takes its maximum at $\sigma = \sigma_{mx}$, the right endpoint of its interval of definition. This case arises only for the long portfolio model whenever $\Gamma'_{ef}(\sigma_{mx}) \ge 0$.
SLIDE 45 This reduced maximization problem can be visualized by considering the family of parabolas parameterized by $\Gamma$ as
$$\mu = \Gamma + \chi \sigma + \tfrac{1}{2} \sigma^2 .$$
As $\Gamma$ varies the graph of this parabola shifts up and down in the $\sigma\mu$-plane. For some values of $\Gamma$ the corresponding parabola will intersect the efficient frontier, which is given by $\mu = \mu_{ef}(\sigma)$. There is clearly a maximum such $\Gamma$. As the parabola is strictly convex while the efficient frontier is concave, for this maximum $\Gamma$ the intersection will consist of a single point $(\sigma_{opt}, \mu_{opt})$. Then $\sigma = \sigma_{opt}$ is the maximizer of $\Gamma_{ef}(\sigma)$.
This reduction is appealing because the efficient frontier only depends on general information about an investor, like whether he or she will take short positions. Once it is computed, the problem of maximizing any given $\hat\Gamma(\mathbf{f})$ over all admissible portfolios $\mathbf{f}$ reduces to the problem of maximizing the associated $\Gamma_{ef}(\sigma)$ over all admissible $\sigma$, which is a problem over one variable.
SLIDE 46 In summary, our approach to portfolio selection has three steps:
- 1. Choose a return rate history over a given period (say the past year)
and calibrate the mean vector m and the covariance matrix V with it.
- 2. Given m, V, µsi, µcl, and any portfolio constraints, compute µef(σ).
- 3. Finally, choose $\chi = \zeta / \sqrt{T}$ and maximize the associated $\Gamma_{ef}(\sigma)$; the maximizer $\sigma_{opt}$ corresponds to a unique efficient frontier portfolio.
Below we will illustrate the last step on some models we have developed.
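SLIDE 47 One Risk-Free Rate Model. This is the easiest model to analyze. You first compute $\sigma_{mv}$, $\mu_{mv}$, and $\nu_{as}$ from the return rate history. The model assumes that $\mu_{si} = \mu_{cl} < \mu_{mv}$. Then its tangency parameters are
$$\nu_{tg} = \nu_{as} \sqrt{1 + \Big(\frac{\mu_{mv} - \mu_{rf}}{\nu_{as}\, \sigma_{mv}}\Big)^2} , \qquad \sigma_{tg} = \sigma_{mv} \sqrt{1 + \Big(\frac{\nu_{as}\, \sigma_{mv}}{\mu_{mv} - \mu_{rf}}\Big)^2} ,$$
where $\mu_{rf} = \mu_{si} = \mu_{cl}$, while its efficient frontier is
$$\mu_{ef}(\sigma) = \mu_{rf} + \nu_{tg}\, \sigma \quad \text{for } \sigma \in [0, \infty) .$$
Because $\Gamma_{ef}(\sigma) = \mu_{ef}(\sigma) - \tfrac{1}{2} \sigma^2 - \chi \sigma$, we have
$$\Gamma'_{ef}(\sigma) = \nu_{tg} - \sigma - \chi .$$
When $\chi \ge \nu_{tg}$ we see that $\Gamma'_{ef}(0) = \nu_{tg} - \chi \le 0$, whereby $\sigma_{opt} = 0$, while when $\chi < \nu_{tg}$ there is a positive solution of $\Gamma'_{ef}(\sigma) = 0$. We obtain
$$\sigma_{opt} = \begin{cases} 0 & \text{if } \nu_{tg} \le \chi , \\ \nu_{tg} - \chi & \text{if } \chi < \nu_{tg} . \end{cases}$$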
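The One Risk-Free Rate recipe can be sketched in a few lines. The frontier parameters below are made-up numbers rather than values calibrated from data, and the function names are illustrative. The sketch assumes the frontier hyperbola has the form $\sigma^2 = \sigma_{mv}^2 + ((\mu - \mu_{mv})/\nu_{as})^2$.

```python
import math


def tangency_parameters(sigma_mv, mu_mv, nu_as, mu_rf):
    """Return (nu_tg, sigma_tg) for a risk-free rate mu_rf < mu_mv."""
    k = (mu_mv - mu_rf) / (nu_as * sigma_mv)
    nu_tg = nu_as * math.sqrt(1.0 + k * k)
    sigma_tg = sigma_mv * math.sqrt(1.0 + 1.0 / (k * k))
    return nu_tg, sigma_tg


def sigma_opt_one_rate(nu_tg, chi):
    """Maximizer of mu_rf + nu_tg*sigma - sigma^2/2 - chi*sigma over sigma >= 0."""
    return max(nu_tg - chi, 0.0)


sigma_mv, mu_mv, nu_as = 0.10, 0.08, 0.50   # hypothetical frontier parameters
mu_rf = 0.02                                # hypothetical risk-free rate

nu_tg, sigma_tg = tangency_parameters(sigma_mv, mu_mv, nu_as, mu_rf)
chi = 1.0 / math.sqrt(1.0)                  # zeta = 1 and T = 1 year
sigma_opt = sigma_opt_one_rate(nu_tg, chi)

# Height of the tangent line and of the frontier hyperbola at sigma_tg.
line_height = mu_rf + nu_tg * sigma_tg
frontier_height = mu_mv + nu_as * math.sqrt(sigma_tg ** 2 - sigma_mv ** 2)
```

The two heights agree, confirming that the line from $(0, \mu_{rf})$ touches the frontier at $\sigma_{tg}$. With these numbers $\chi$ exceeds $\nu_{tg}$, so the optimum is the all risk-free portfolio, $\sigma_{opt} = 0$.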
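SLIDE 48 Two Risk-Free Rates Model. This is the next easiest model to analyze. You first compute $\sigma_{mv}$, $\mu_{mv}$, and $\nu_{as}$ from the return rate history. The model assumes that $\mu_{si} < \mu_{cl} < \mu_{mv}$. Then its tangency parameters are
$$\nu_{st} = \nu_{as} \sqrt{1 + \Big(\frac{\mu_{mv} - \mu_{si}}{\nu_{as}\, \sigma_{mv}}\Big)^2} , \qquad \sigma_{st} = \sigma_{mv} \sqrt{1 + \Big(\frac{\nu_{as}\, \sigma_{mv}}{\mu_{mv} - \mu_{si}}\Big)^2} ,$$
$$\nu_{ct} = \nu_{as} \sqrt{1 + \Big(\frac{\mu_{mv} - \mu_{cl}}{\nu_{as}\, \sigma_{mv}}\Big)^2} , \qquad \sigma_{ct} = \sigma_{mv} \sqrt{1 + \Big(\frac{\nu_{as}\, \sigma_{mv}}{\mu_{mv} - \mu_{cl}}\Big)^2} ,$$
while its efficient frontier is
$$\mu_{ef}(\sigma) = \begin{cases} \mu_{si} + \nu_{st}\, \sigma & \text{for } \sigma \in [0, \sigma_{st}] , \\ \mu_{mv} + \nu_{as} \sqrt{\sigma^2 - \sigma_{mv}^2} & \text{for } \sigma \in [\sigma_{st}, \sigma_{ct}] , \\ \mu_{cl} + \nu_{ct}\, \sigma & \text{for } \sigma \in [\sigma_{ct}, \infty) . \end{cases}$$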
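SLIDE 49 Because $\Gamma_{ef}(\sigma) = \mu_{ef}(\sigma) - \tfrac{1}{2} \sigma^2 - \chi \sigma$, we have
$$\Gamma'_{ef}(\sigma) = \begin{cases} \nu_{st} - \sigma - \chi & \text{for } \sigma \in [0, \sigma_{st}] , \\ \dfrac{\nu_{as}\, \sigma}{\sqrt{\sigma^2 - \sigma_{mv}^2}} - \sigma - \chi & \text{for } \sigma \in [\sigma_{st}, \sigma_{ct}] , \\ \nu_{ct} - \sigma - \chi & \text{for } \sigma \in [\sigma_{ct}, \infty) . \end{cases}$$
When $\chi \ge \nu_{st}$ we see that $\Gamma'_{ef}(0) = \nu_{st} - \chi \le 0$, whereby $\sigma_{opt} = 0$, while when $\chi < \nu_{st}$ there is a positive solution of $\Gamma'_{ef}(\sigma) = 0$. We obtain
$$\sigma_{opt} = \begin{cases} 0 & \text{if } \nu_{st} \le \chi , \\ \nu_{st} - \chi & \text{if } \nu_{st} - \sigma_{st} \le \chi < \nu_{st} , \\ \sigma_q(\chi) & \text{if } \nu_{ct} - \sigma_{ct} \le \chi < \nu_{st} - \sigma_{st} , \\ \nu_{ct} - \chi & \text{if } \chi < \nu_{ct} - \sigma_{ct} , \end{cases}$$
where $\sigma = \sigma_q(\chi) \in [\sigma_{st}, \sigma_{ct}]$ solves the quartic equation
$$\nu_{as}^2\, \sigma^2 = \big(\sigma^2 - \sigma_{mv}^2\big) (\sigma + \chi)^2 .$$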
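The four-case formula for $\sigma_{opt}$ can be turned into a short routine. In the sketch below the quartic is not solved in closed form. Instead the root $\sigma_q(\chi)$ is located by bisection on $\Gamma'_{ef}$, which is strictly decreasing on $[\sigma_{st}, \sigma_{ct}]$. Bisection is a substitute chosen here for simplicity, not a method prescribed by the lecture, and all parameter values are hypothetical.

```python
import math


def tangency(sigma_mv, mu_mv, nu_as, mu_rf):
    """Return (slope, volatility) of the tangent line from (0, mu_rf)."""
    k = (mu_mv - mu_rf) / (nu_as * sigma_mv)
    return nu_as * math.sqrt(1.0 + k * k), sigma_mv * math.sqrt(1.0 + 1.0 / (k * k))


def sigma_opt_two_rates(sigma_mv, mu_mv, nu_as, mu_si, mu_cl, chi):
    """Maximizer of Gamma_ef for the Two Risk-Free Rates model."""
    nu_st, sigma_st = tangency(sigma_mv, mu_mv, nu_as, mu_si)
    nu_ct, sigma_ct = tangency(sigma_mv, mu_mv, nu_as, mu_cl)
    if chi >= nu_st:
        return 0.0
    if chi >= nu_st - sigma_st:
        return nu_st - chi
    if chi < nu_ct - sigma_ct:
        return nu_ct - chi
    # Middle branch: bisect on the strictly decreasing derivative of Gamma_ef.
    lo, hi = sigma_st, sigma_ct
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        slope = nu_as * mid / math.sqrt(mid * mid - sigma_mv * sigma_mv)
        if slope - mid - chi > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma_mv, mu_mv, nu_as = 0.10, 0.08, 0.50   # hypothetical frontier parameters
mu_si, mu_cl = 0.02, 0.05                   # hypothetical risk-free rates
chi = 0.5                                   # for example zeta = 1/2 and T = 1

s_opt = sigma_opt_two_rates(sigma_mv, mu_mv, nu_as, mu_si, mu_cl, chi)
quartic_residual = nu_as ** 2 * s_opt ** 2 - (s_opt ** 2 - sigma_mv ** 2) * (s_opt + chi) ** 2
```

For this choice of $\chi$ the optimum falls on the curved middle piece of the efficient frontier, and the computed $\sigma_{opt}$ satisfies the quartic equation up to rounding error.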
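SLIDE 50 Long Portfolio Model. This is the most complicated model that we will analyze. You first compute $\sigma_{\mathrm{mv}}$, $\mu_{\mathrm{mv}}$, and $\nu_{\mathrm{as}}$ from the return rate history.
You then construct the efficient branch of the long frontier. We saw how to do this by an iterative construction whenever $\mathbf{f}_{\mathrm{f}}(\mu_0) \geq \mathbf{0}$ for some $\mu_0$. Here we will assume that $\mathbf{f}_{\mathrm{mv}} \geq \mathbf{0}$ and set $\mu_0 = \mu_{\mathrm{mv}}$. In that case we found that $\sigma_{\mathrm{lf}}(\mu)$ is a continuously differentiable function over $[\mu_{\mathrm{mv}}, \mu_{\mathrm{mx}}]$ that is given by a list in the form
$$\sigma_{\mathrm{lf}}(\mu) = \sigma_{\mathrm{f}_k}(\mu) \equiv \sqrt{\sigma_{\mathrm{mv}_k}^{2} + \left( \frac{\mu - \mu_{\mathrm{mv}_k}}{\nu_{\mathrm{as}_k}} \right)^{2}} \quad \text{for } \mu \in [\mu_k, \mu_{k+1}] \,,$$
where $\sigma_{\mathrm{mv}_k}$, $\mu_{\mathrm{mv}_k}$, and $\nu_{\mathrm{as}_k}$ are the frontier parameters associated with the vector $\mathbf{m}_k$ and matrix $\mathbf{V}_k$ that determined $\sigma_{\mathrm{f}_k}(\mu)$ in the $k$th step of our construction. In particular, $\sigma_{\mathrm{mv}_0} = \sigma_{\mathrm{mv}}$, $\mu_{\mathrm{mv}_0} = \mu_{\mathrm{mv}}$, and $\nu_{\mathrm{as}_0} = \nu_{\mathrm{as}}$ because $\mathbf{m}_0 = \mathbf{m}$ and $\mathbf{V}_0 = \mathbf{V}$.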
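Once the iterative construction has produced its list, evaluating the long frontier is a table lookup followed by one hyperbola formula. The sketch below assumes a hypothetical storage layout: nodes holds the breakpoints from the minimum-volatility mean up to the maximum mean, and segments[k] holds the three frontier parameters of step k. The numbers are made up for illustration and are not a consistent frontier from real data.

```python
import bisect
import math

def sigma_lf(mu, nodes, segments):
    """Evaluate the long frontier volatility at the return rate mean mu.

    nodes    -- increasing breakpoints [mu_0, ..., mu_n] (hypothetical layout)
    segments -- segments[k] = (sigma_mvk, mu_mvk, nu_ask) on [mu_k, mu_{k+1}]
    """
    if not nodes[0] <= mu <= nodes[-1]:
        raise ValueError("mu lies outside the efficient branch")
    # Index of the segment containing mu; the right endpoint uses the last one.
    k = min(bisect.bisect_right(nodes, mu) - 1, len(segments) - 1)
    sigma_mvk, mu_mvk, nu_ask = segments[k]
    return math.sqrt(sigma_mvk ** 2 + ((mu - mu_mvk) / nu_ask) ** 2)

# Made-up two-segment list, used only to exercise the lookup.
nodes = [0.08, 0.10, 0.13]
segments = [(0.10, 0.08, 0.5), (0.11, 0.085, 0.4)]
value = sigma_lf(0.12, nodes, segments)
```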
SLIDE 51
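Next, you construct the continuously differentiable function $\mu_{\mathrm{ef}}(\sigma)$ over $[0, \sigma_{\mathrm{mx}}]$ that determines the efficient frontier given the return rate $\mu_{\mathrm{si}}$ of the safe investment. The form of this construction depends upon the tangent line to the curve $\sigma = \sigma_{\mathrm{lf}}(\mu)$ at the point $(\sigma_{\mathrm{mx}}, \mu_{\mathrm{mx}})$. This tangent line has $\mu$-intercept $\eta_{\mathrm{mx}}$ and slope $\nu_{\mathrm{mx}}$ given by
$$\eta_{\mathrm{mx}} = \mu_{\mathrm{mx}} - \frac{\sigma_{\mathrm{lf}}(\mu_{\mathrm{mx}})}{\sigma_{\mathrm{lf}}'(\mu_{\mathrm{mx}})} \,, \qquad \nu_{\mathrm{mx}} = \frac{1}{\sigma_{\mathrm{lf}}'(\mu_{\mathrm{mx}})} \,.$$
These parameters are related by
$$\nu_{\mathrm{mx}} = \frac{\mu_{\mathrm{mx}} - \eta_{\mathrm{mx}}}{\sigma_{\mathrm{mx}}} \,.$$
The cases $\mu_{\mathrm{si}} \geq \eta_{\mathrm{mx}}$ and $\mu_{\mathrm{si}} < \eta_{\mathrm{mx}}$ are considered separately.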
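SLIDE 52 Case $\mu_{\mathrm{si}} \geq \eta_{\mathrm{mx}}$. Here the efficient long frontier is simply determined by
$$\mu_{\mathrm{ef}}(\sigma) = \mu_{\mathrm{si}} + \nu_{\mathrm{ef}}\, \sigma \quad \text{for } \sigma \in [0, \sigma_{\mathrm{mx}}] \,,$$
where the slope of this linear function is given by
$$\nu_{\mathrm{ef}} = \frac{\mu_{\mathrm{mx}} - \mu_{\mathrm{si}}}{\sigma_{\mathrm{mx}}} \,.$$
Notice that $\mu_{\mathrm{si}} \geq \eta_{\mathrm{mx}}$ if and only if $\nu_{\mathrm{ef}} \leq \nu_{\mathrm{mx}}$. Because $\Gamma_{\mathrm{ef}}(\sigma) = \mu_{\mathrm{ef}}(\sigma) - \tfrac{1}{2}\sigma^{2} - \chi\sigma$,
$$\Gamma_{\mathrm{ef}}'(\sigma) = \nu_{\mathrm{ef}} - \sigma - \chi \quad \text{for } \sigma \in [0, \sigma_{\mathrm{mx}}] \,.$$
We therefore find that
$$\sigma_{\mathrm{opt}} = \begin{cases} 0 & \text{if } \nu_{\mathrm{ef}} \leq \chi \,, \\ \nu_{\mathrm{ef}} - \chi & \text{if } \nu_{\mathrm{ef}} - \sigma_{\mathrm{mx}} \leq \chi < \nu_{\mathrm{ef}} \,, \\ \sigma_{\mathrm{mx}} & \text{if } \chi < \nu_{\mathrm{ef}} - \sigma_{\mathrm{mx}} \,. \end{cases}$$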
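A minimal sketch of this case follows. It assumes the last hyperbola segment of the long frontier is known and that the maximum mean exceeds that segment's minimum-volatility mean; the function names and the numbers are hypothetical. The three-way formula for the optimal volatility amounts to clamping the unconstrained optimum to the interval from zero to the maximum volatility.

```python
import math

def endpoint_tangent(mu_mx, sigma_mvk, mu_mvk, nu_ask):
    """Return (sigma_mx, eta_mx, nu_mx) for the tangent line at the endpoint.

    Uses the last hyperbola segment; assumes mu_mx > mu_mvk.
    """
    sigma_mx = math.sqrt(sigma_mvk ** 2 + ((mu_mx - mu_mvk) / nu_ask) ** 2)
    dsigma_dmu = (mu_mx - mu_mvk) / (nu_ask ** 2 * sigma_mx)  # derivative of sigma_lf
    nu_mx = 1.0 / dsigma_dmu
    eta_mx = mu_mx - sigma_mx * nu_mx
    return sigma_mx, eta_mx, nu_mx

def sigma_opt_high_safe_rate(mu_si, mu_mx, sigma_mx, eta_mx, chi):
    """Optimal volatility when mu_si >= eta_mx (linear efficient frontier)."""
    if mu_si < eta_mx:
        raise ValueError("this formula applies only when mu_si >= eta_mx")
    nu_ef = (mu_mx - mu_si) / sigma_mx
    return min(max(nu_ef - chi, 0.0), sigma_mx)

# Made-up single-segment frontier.
sigma_mx, eta_mx, nu_mx = endpoint_tangent(0.13, 0.1, 0.08, 0.5)
```

For these made-up inputs the intercept eta_mx is about 0.03, so a safe rate of 0.05 falls in this case.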
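SLIDE 53 Case $\mu_{\mathrm{si}} < \eta_{\mathrm{mx}}$. In this case there is a tangent line with $\mu$-intercept $\mu_{\mathrm{si}}$. The tangent line to the long frontier at the point $(\sigma_k, \mu_k)$ has $\mu$-intercept $\eta_k$ and slope $\nu_k$ given by
$$\eta_k = \mu_{\mathrm{mv}_k} - \frac{\nu_{\mathrm{as}_k}^{2}\, \sigma_{\mathrm{mv}_k}^{2}}{\mu_k - \mu_{\mathrm{mv}_k}} \,, \qquad \nu_k = \frac{\nu_{\mathrm{as}_k}^{2}\, \sigma_k}{\mu_k - \mu_{\mathrm{mv}_k}} \,.$$
If we set $\eta_0 = -\infty$ then for every $\mu_{\mathrm{si}} < \eta_{\mathrm{mx}}$ there is a unique $j$ such that
$$\eta_j \leq \mu_{\mathrm{si}} < \eta_{j+1} \,.$$
For this value of $j$ we have the tangency parameters
$$\nu_{\mathrm{st}} = \nu_{\mathrm{as}_j} \sqrt{1 + \left( \frac{\mu_{\mathrm{mv}_j} - \mu_{\mathrm{si}}}{\nu_{\mathrm{as}_j}\, \sigma_{\mathrm{mv}_j}} \right)^{2}} \,, \qquad \sigma_{\mathrm{st}} = \sigma_{\mathrm{mv}_j} \sqrt{1 + \left( \frac{\nu_{\mathrm{as}_j}\, \sigma_{\mathrm{mv}_j}}{\mu_{\mathrm{mv}_j} - \mu_{\mathrm{si}}} \right)^{2}} \,.$$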
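The search for the segment index j and the resulting tangency parameters can be sketched as below. The storage layout (nodes and segments) and the function names are hypothetical, and the sketch assumes every node after the first lies above its segment's minimum-volatility mean, so that each tangent line has a finite intercept.

```python
import math

def tangent_intercept(mu_k, sigma_mvk, mu_mvk, nu_ask):
    """mu-intercept of the tangent line at the node mu_k of segment k."""
    if mu_k == mu_mvk:
        return -math.inf  # vertical tangent at the minimum-volatility point
    return mu_mvk - (nu_ask * sigma_mvk) ** 2 / (mu_k - mu_mvk)

def long_tangency(nodes, segments, mu_si):
    """Return (j, nu_st, sigma_st), or None when mu_si >= eta_mx.

    nodes    -- increasing breakpoints [mu_0, ..., mu_n], with mu_n the maximum mean
    segments -- segments[k] = (sigma_mvk, mu_mvk, nu_ask) on [mu_k, mu_{k+1}]
    """
    etas = [tangent_intercept(nodes[k], *segments[k]) for k in range(len(segments))]
    etas.append(tangent_intercept(nodes[-1], *segments[-1]))  # eta_mx
    for j, (sigma_mvj, mu_mvj, nu_asj) in enumerate(segments):
        if etas[j] <= mu_si < etas[j + 1]:
            ratio = (mu_mvj - mu_si) / (nu_asj * sigma_mvj)
            nu_st = nu_asj * math.sqrt(1.0 + ratio ** 2)
            sigma_st = sigma_mvj * math.sqrt(1.0 + 1.0 / ratio ** 2)
            return j, nu_st, sigma_st
    return None

# Made-up single-segment frontier: eta_mx is about 0.03 here.
result = long_tangency([0.08, 0.13], [(0.1, 0.08, 0.5)], mu_si=0.0)
```

Because the long frontier is continuously differentiable, the intercept at an interior node can be computed from either adjacent segment; the sketch uses the segment that starts at that node.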
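SLIDE 54 Therefore when $\mu_{\mathrm{si}} < \eta_{\mathrm{mx}}$ the efficient long frontier is given by
$$\mu_{\mathrm{ef}}(\sigma) = \begin{cases} \mu_{\mathrm{si}} + \nu_{\mathrm{st}}\, \sigma & \text{for } \sigma \in [0, \sigma_{\mathrm{st}}] \,, \\ \mu_{\mathrm{mv}_j} + \nu_{\mathrm{as}_j} \sqrt{\sigma^{2} - \sigma_{\mathrm{mv}_j}^{2}} & \text{for } \sigma \in [\sigma_{\mathrm{st}}, \sigma_{j+1}] \,, \\ \mu_{\mathrm{mv}_k} + \nu_{\mathrm{as}_k} \sqrt{\sigma^{2} - \sigma_{\mathrm{mv}_k}^{2}} & \text{for } \sigma \in [\sigma_k, \sigma_{k+1}] \text{ and } k > j \,. \end{cases}$$
Because $\Gamma_{\mathrm{ef}}(\sigma) = \mu_{\mathrm{ef}}(\sigma) - \tfrac{1}{2}\sigma^{2} - \chi\sigma$,
$$\Gamma_{\mathrm{ef}}'(\sigma) = \begin{cases} \nu_{\mathrm{st}} - \sigma - \chi & \text{for } \sigma \in [0, \sigma_{\mathrm{st}}] \,, \\ \dfrac{\nu_{\mathrm{as}_j}\, \sigma}{\sqrt{\sigma^{2} - \sigma_{\mathrm{mv}_j}^{2}}} - \sigma - \chi & \text{for } \sigma \in [\sigma_{\mathrm{st}}, \sigma_{j+1}] \,, \\ \dfrac{\nu_{\mathrm{as}_k}\, \sigma}{\sqrt{\sigma^{2} - \sigma_{\mathrm{mv}_k}^{2}}} - \sigma - \chi & \text{for } \sigma \in [\sigma_k, \sigma_{k+1}] \text{ and } k > j \,. \end{cases}$$
The last case in the above formulas can arise only when $\sigma_{j+1} < \sigma_{\mathrm{mx}}$.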
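SLIDE 55 We therefore find that
$$\sigma_{\mathrm{opt}} = \begin{cases} 0 & \text{if } \nu_{\mathrm{st}} \leq \chi \,, \\ \nu_{\mathrm{st}} - \chi & \text{if } \nu_{\mathrm{st}} - \sigma_{\mathrm{st}} \leq \chi < \nu_{\mathrm{st}} \,, \\ \sigma_{\mathrm{q}_j}(\chi) & \text{if } \nu_{j+1} - \sigma_{j+1} \leq \chi < \nu_{\mathrm{st}} - \sigma_{\mathrm{st}} \,, \\ \sigma_{\mathrm{q}_k}(\chi) & \text{if } \nu_{k+1} - \sigma_{k+1} \leq \chi < \nu_k - \sigma_k \,, \\ \sigma_{\mathrm{mx}} & \text{if } \chi < \nu_{\mathrm{mx}} - \sigma_{\mathrm{mx}} \,, \end{cases}$$
where $\sigma = \sigma_{\mathrm{q}_k}(\chi) \in [\sigma_k, \sigma_{k+1}]$ solves the quartic equation
$$\nu_{\mathrm{as}_k}^{2}\, \sigma^{2} = \left( \sigma^{2} - \sigma_{\mathrm{mv}_k}^{2} \right) (\sigma + \chi)^{2} \,.$$
The fourth case can arise only when $\sigma_{j+1} < \sigma_{\mathrm{mx}}$.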
- Remark. The task of finding expressions for $\mu_{\mathrm{opt}}$, $\gamma_{\mathrm{opt}}$, and $\mathbf{f}_{\mathrm{opt}}$ for the long portfolio model is left as an exercise.
SLIDE 56
- Remark. The foregoing solutions illustrate two basic principles of investing.
When the market is bad it is often in the regime µsi ≥ ηmx. In that case the above solution gives an optimal long portfolio that is placed largely in the safe investment, but the part of the portfolio placed in risky assets is placed in the most aggressive risky assets. Such a position allows you to catch market upturns while putting little at risk when the market goes down. When the market is good it is often in the regime µsi < ηmx. In that case the above solution gives an optimal long portfolio that is placed largely in risky assets, but much of it is not placed in the most aggressive risky assets. Such a position protects you from market downturns while giving up little in returns when the market goes up. Many investors will ignore these basic principles and become either overly conservative in a bear market or overly aggressive in a bull market.
SLIDE 57
- Exercise. Consider the following groups of assets:
(a) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2009;
(b) Google, Microsoft, Exxon-Mobil, UPS, GE, and Ford stock in 2007;
(c) S&P 500 and Russell 1000 and 2000 index funds in 2009;
(d) S&P 500 and Russell 1000 and 2000 index funds in 2007.
Assume that µsi is the US Treasury Bill rate at the end of the given year, and that µcl is three percentage points higher. Assume you are an investor who chooses χ = 0. Design the optimal portfolios with risky assets drawn from group (a), from group (c), and from groups (a) and (c) combined. Do the same for group (b), group (d), and groups (b) and (d) combined. How well did these optimal portfolios actually do over the subsequent year?
- Exercise. Repeat the above exercise for an investor who chooses χ = 1.
Compare these optimal portfolios with the corresponding ones from the previous exercise.
SLIDE 58
The MPT models we have developed illustrate three basic principles of portfolio management. (As we have just seen, there are others.)
- 1. Diversification reduces the volatility of a portfolio.
- 2. Increased volatility lowers the expected growth rate of a portfolio.
- 3. Diversification raises the expected growth rate of a portfolio.
- Remark. The last of these follows from the first two.
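A small numerical illustration of this remark, under loudly stated assumptions: it takes the χ = 0 objective, the return rate mean minus half the squared volatility, as the growth rate estimate, and it considers two hypothetical assets with the same mean, the same volatility, and correlation rho, held in equal proportions. The function names and numbers are made up.

```python
import math

def growth_rate(mu, sigma):
    """Growth rate estimate mu - sigma^2 / 2 (the objective with chi = 0)."""
    return mu - 0.5 * sigma ** 2

def equal_weight_volatility(sigma, rho):
    """Volatility of a 50/50 portfolio of two assets with equal volatility sigma."""
    return sigma * math.sqrt((1.0 + rho) / 2.0)

mu, sigma, rho = 0.10, 0.30, 0.20                     # illustrative values only
single = growth_rate(mu, sigma)                        # one asset alone
diversified = growth_rate(mu, equal_weight_volatility(sigma, rho))
```

The portfolio mean is unchanged while its volatility is smaller whenever rho is less than one, so the diversified growth rate estimate exceeds that of either asset alone.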
SLIDE 59 Recall that two kinds of people hold risky assets: traders and investors.
Traders often take positions that require constant attention. They might buy and sell assets on short time scales in an attempt to profit from market fluctuations. They might also take highly leveraged positions that expose them to enormous gains or losses depending on how the market moves. They must be ready to handle margin calls. Trading is often a full-time job.
Investors operate on longer time scales. They buy or sell an asset based on their assessment of its fundamental value over time. Investing does not have to be a full-time job. Indeed, most people who hold risky assets are investors who are saving for retirement.
Lured by the promise of high returns, sometimes investors will buy shares in funds designed for traders. At that point they have become gamblers, whether they realize it or not. It is important to realize that MPT is designed to help balance investment portfolios, not trading portfolios.
SLIDE 60 If you invest or plan to invest, I highly recommend that you read
- The Investment Answer by Daniel C. Goldie and Gordon S. Murray, Business Plus, New York, 2011.
This short, straight-talking book provides a framework for investing with a minimum of mathematics. It tells you everything you need to know to invest wisely except how to allocate your assets. But that is what these lectures have been about. The two complement each other.
An important topic discussed in the book is how to engage professional advice. Professionals will generally be more knowledgeable than you about the investment products that are available. Your interactions with them will be more productive if you engage them as an informed customer.
SLIDE 61 The MPT models we have studied can be used to manage your investment portfolio provided certain caveats are kept in mind.
- 1. Never invest money in risky assets that you are not prepared to lose.
- 2. Develop an understanding of the economic, technological, political, and natural events that might affect your investments. Invest for the long term based on this understanding.
- 3. Study each risky asset before you invest in it. Decide if its business
plan makes sense in the context of your larger understanding.
- 4. Remember that investing is not a science and models are not reality. Use models for guidance with a full awareness of their limitations.
SLIDE 62
One major limitation of the models we have studied is that they assume the validity of an underlying IID model. The truth is that all agents who buy and sell risky assets are influenced by the past. An IID model will be valid when the motives of enough agents are sufficiently diverse and uncorrelated. You can test the validity of this assumption with the historical data. But even when the historical data supports this assumption, you must be on guard for correlations that might arise due to changing circumstances. Another major limitation is that they assume the probability densities in the underlying IID model are sufficiently narrow that second moments exist. When this assumption is not valid this theory breaks down completely. Yet another major limitation is that dependencies between different assets are only captured by the covariances in historical data. Such models can lose validity when a major event occurs that has no analog in the period spanned by the historical data that you used to calibrate your model.
SLIDE 63 Recall that certain aspects of MPT are unrealistic. These generally arise from simplifications that were adopted to make the analysis easier. These include:
- the assumption that portfolios can contain fractional shares of any asset;
- the assumption that portfolios are rebalanced every trading day;
- the assumption that transaction costs and taxes are neglected;
- the assumption that dividends are neglected.
In practice, any portfolio with a distribution that is near the one for the optimal Markowitz portfolio will perform nearly as well. Consequently, most investors rebalance no more than a few times per year, and not every asset is involved each time. Transaction costs and tax issues are thereby limited. Similarly, borrowing costs can be kept to a minimum by not borrowing often. The theory can be modified to account for dividends.
SLIDE 64 Many common criticisms of MPT are simply wrong. These include the following claims:
- it assumes asset returns are normally distributed;
- it assumes markets are efficient;
- it assumes all investors are rational and risk-averse;
- it assumes all investors have access to the same information.
Some of these arose because some advocates of MPT did not understand its full generality, and stated more restrictive assumptions in their work that were later attacked by critics. The first two claims above are examples of this. We saw that MPT does not assume asset returns are normally distributed, and does not assume an efficient market hypothesis. Other such claims arose because some critics of MPT did not understand it. The last two claims above are examples of this. In fact, without investor diversity it is unlikely that the IID assumptions that underpin our models would be valid.
SLIDE 65 Many modern portfolio models are more complicated than the ones we have studied. Many of these use mathematical tools that one sees in some graduate courses on stochastic processes. Stochastic Portfolio Theory, developed by Robert Fernholz and others, is a notable example.
Finally, the simple MPT models that we have studied do not consider any derivative tools that can be used to hedge a portfolio. These tools reduce your risk by paying someone to take it on when certain contingencies are met. In other words, they are insurance policies for risky assets. They thereby transfer the risk held by individual investors to the system as a whole, the so-called securitization of risk. Traditional derivatives are put and call options, but since the 1980s there has been an explosion in derivative products such as exotic options, swaps, futures, and forwards. As we saw in 2008 and 2011, without proper regulation these tools can create ties that critically weaken the entire financial system.