The natural mathematics arising in information theory and investment


slide-1
SLIDE 1

The natural mathematics arising in information theory and investment

Thomas Cover

Stanford University

Page 1 of 40

slide-2
SLIDE 2

Felicity of mathematics

We wish to maximize the growth rate of wealth. There is a satisfactory theory. The strategy achieving this goal is controversial. (Probably because the strategy involves maximizing the expected logarithm.)

Why is $\pi$ fundamental?

$$\pi = C/D, \qquad \sum_{n} \frac{1}{n^2} = \frac{\pi^2}{6}, \qquad \phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

Recall from physics the statement that the laws of physics have a strangely felicitous relation with mathematics. We shall try to establish the reasonableness of the theory of growth optimality by presenting the richness of the mathematics that describes it and by giving a number of problems having growth optimality as the answer.

A theory is natural if it fits and has few “moving parts”. Ideally, it should “predict” other properties.

The new or unpublished statements will be identified.

Page 2 of 40

slide-3
SLIDE 3

Outline

  • Setup
  • Mean variance theory
  • Growth optimal portfolios for stochastic markets

    Properties:

      • Stability of optimal portfolio
      • Expected Ratio Optimality
      • Competitive optimality
      • $S_n/S_n^*$ Martingale
      • $S_n^* \doteq e^{nW^*}$ (AEP)

  • Growth optimal portfolios for arbitrary markets

      • Universal portfolios: $\hat{S}_n/S_n^* \ge \frac{1}{2\sqrt{n+1}}$ for all $x^n$
      • Amplification

  • Relationship of growth optimality to information theory

Page 3 of 40

slide-4
SLIDE 4

Portfolio Selection

Stock X: $X = (X_1, X_2, \ldots, X_m) \sim F(x)$, $X \ge 0$, $X_i$ = price-relative of stock $i$

Portfolio b: $b = (b_1, b_2, \ldots, b_m)$, $b_i \ge 0$, $\sum_i b_i = 1$, proportion invested

Wealth Relative S: Factor by which wealth increases

$$S = \sum_{i=1}^{m} b_i X_i = b^t X$$

Find the “largest” $S$.

Page 4 of 40

slide-5
SLIDE 5

Mean-Variance Theory.

Markowitz, Tobin, Sharpe, . . .

Choose $b$ so that $(\mathrm{Var}\, S, E S)$ is undominated, where $S = b^t X$.

Page 5 of 40

slide-6
SLIDE 6

Conflict of mean-variance theory and growth rate.

Portfolio selection: Maximize growth rate of wealth.

$$S_n(X_1, X_2, \ldots, X_n) \doteq 2^{nW}$$

Efficient portfolio is not necessarily growth optimal (E. Thorp)

Page 6 of 40

slide-7
SLIDE 7

Consider the stock market process $\{X_i\}$: $X_i \in \mathbb{R}^m$.

Portfolios $b_i(\cdot)$:

$$\sum_{j=1}^{m} b_{ij}(x^{i-1}) = 1$$

for each time $i = 1, 2, \ldots$ and for every past $x^{i-1} = (x_1, x_2, \ldots, x_{i-1})$.

Note: $b_{ij} < 0$ corresponds to shorting stock $j$ on day $i$. Shorting cash is called buying on margin.

Goal: Given a stochastic process $\{X_i\}$ with known distribution, find portfolio sequence $b_i(\cdot)$ that “maximizes”

$$S_n = \prod_{i=1}^{n} b_i^t(X^{i-1}) X_i.$$

Page 7 of 40

slide-8
SLIDE 8

Page 8 of 40

slide-9
SLIDE 9

Page 9 of 40

slide-10
SLIDE 10
1. Asymptotic Growth Rate of Wealth

$X_1, X_2, \ldots$ i.i.d. $\sim F(x)$

Wealth at time $n$:

$$S_n = \prod_{i=1}^{n} b^t X_i = 2^{\,n \cdot \frac{1}{n} \sum_{i=1}^{n} \log b^t X_i} = 2^{\,n(E \log b^t X + o(1))}, \quad \text{a.e.}$$

Definition: Growth rate

$$W(b, F) = \int \log b^t x \, dF(x), \qquad W^* = \max_{b} W(b, F)$$

$$S_n \doteq 2^{nW^*}.$$

Page 10 of 40

slide-11
SLIDE 11

Example

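Cash vs. Hot Stock

$$X = \begin{cases} (1, 2), & \text{prob } \tfrac{1}{2} \\ (1, \tfrac{1}{2}), & \text{prob } \tfrac{1}{2} \end{cases}$$

$b = (b_1, b_2)$

$$E \log S = \tfrac{1}{2} \log(b_1 + 2 b_2) + \tfrac{1}{2} \log(b_1 + \tfrac{1}{2} b_2)$$

$$b^* = (\tfrac{1}{2}, \tfrac{1}{2}), \qquad W^* = \tfrac{1}{2} \log \tfrac{9}{8}$$

$$S_n^* \doteq \left(\tfrac{9}{8}\right)^{n/2} \doteq (1.06)^n$$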
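The numbers in this example can be checked with a few lines of Python. The following is a minimal sketch, not the author's code: the helper names `growth_rate` and `argmax_on_grid` are hypothetical, logarithms are taken base 2 to match the doubling-rate convention, and the search is restricted to long-only portfolios on a grid.

```python
import math

# Cash vs. hot stock: (probability, (cash price relative, stock price relative)).
OUTCOMES = [(0.5, (1.0, 2.0)), (0.5, (1.0, 0.5))]

def growth_rate(b2, outcomes):
    """Expected log2 wealth relative with fraction b2 in the stock, 1 - b2 in cash."""
    b1 = 1.0 - b2
    return sum(p * math.log2(b1 * x1 + b2 * x2) for p, (x1, x2) in outcomes)

def argmax_on_grid(outcomes, steps=1000):
    """Brute-force search over long-only portfolios b2 in {0, 1/steps, ..., 1}."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda b2: growth_rate(b2, outcomes))

b2_star = argmax_on_grid(OUTCOMES)        # fraction in the hot stock
w_star = growth_rate(b2_star, OUTCOMES)   # doubling rate W*
per_period_factor = 2 ** w_star           # growth factor per period
print(b2_star, w_star, per_period_factor)
```

The grid search lands on the half-and-half portfolio, and the per-period factor equals $\sqrt{9/8} \approx 1.06$, in line with the slide.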

Page 11 of 40

slide-12
SLIDE 12

Live off fluctuations

[Figure: wealth $s$ versus time $n$ for Cash, Hot stock, and the log optimal wealth $S_n^*$.]

Page 12 of 40

slide-13
SLIDE 13

Calculation of optimal portfolio

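$X \sim F(x)$

Log Optimal Portfolio $b^*$:

$$\max_{b} E \log b^t X = W^*$$

Log Optimal Wealth: $S^* = b^{*t} X$

$$\frac{\partial}{\partial b_i} E \ln b^t X = E \frac{X_i}{b^t X}$$

Kuhn-Tucker conditions:

$$b^* : \quad E \frac{X_i}{b^{*t} X} \; \begin{cases} = 1, & b_i^* > 0 \\ \le 1, & b_i^* = 0 \end{cases}$$

Consequence: $E\, S/S^* \le 1$, for all $S$.

Theorem

$$E \ln \frac{S}{S^*} \le 0, \ \forall S \quad \Leftrightarrow \quad E \frac{S}{S^*} \le 1, \ \forall S$$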
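As a worked check, the Kuhn-Tucker conditions can be verified by hand for the cash vs. hot stock example from the earlier slide, where $b^* = (\tfrac12, \tfrac12)$ and $b^{*t}X$ takes the values $1.5$ and $0.75$ with probability $\tfrac12$ each. Both components of $b^*$ are positive, so both expectations should equal 1:

```latex
\begin{align*}
E\,\frac{X_1}{b^{*t}X} &= \tfrac12\cdot\frac{1}{1.5} + \tfrac12\cdot\frac{1}{0.75}
  = \tfrac13 + \tfrac23 = 1, \\
E\,\frac{X_2}{b^{*t}X} &= \tfrac12\cdot\frac{2}{1.5} + \tfrac12\cdot\frac{1/2}{0.75}
  = \tfrac23 + \tfrac13 = 1.
\end{align*}
```

Since any competing wealth $S = b^t X$ is a convex combination of $X_1$ and $X_2$, it follows that $E\, S/S^* = 1$ for every portfolio in this two-asset example, consistent with the bound $E\, S/S^* \le 1$.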

Page 13 of 40

slide-14
SLIDE 14

Properties of growth rate W(b, F)

Theorem

$W(b, F)$ is concave in $b$ and linear in $F$.

Let $b_F$ maximize $W(b, F)$ over all portfolios $b$: $\sum_{i=1}^{m} b_i = 1$.

$$W^*(F) = W(b_F, F)$$

[Figure: $W(b, F)$ plotted against $b$.]

Theorem

$W^*(F)$ is convex in $F$.

Question: Let $W(b) = \int \ln b^t x \, dF(x)$. Is $W(b)$ a transform?

Page 14 of 40

slide-15
SLIDE 15
2. Stability of $b^*$: Expected proportion remains constant

$b^*$ is a stable point

Let $b = (b_1, b_2, \ldots, b_m)$ denote the proportion of wealth in each stock. The proportions held in each stock at the end of the trading day are

$$\hat{b} = \left( \frac{b_1 X_1}{b^t X}, \frac{b_2 X_2}{b^t X}, \ldots, \frac{b_m X_m}{b^t X} \right)$$

Then $b$ is log optimal if and only if $b = E \hat{b}$, i.e.

$$b_i = E \frac{b_i X_i}{b^t X}, \qquad i = 1, 2, \ldots, m,$$

i.e. the expected proportions remain unchanged. This is the counterpart to Kelly gambling.
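The stability property can be illustrated with the cash vs. hot stock example from the earlier slide. The computation below is a worked sketch using the hat notation for end-of-day proportions; it starts from the half-and-half portfolio and averages the end-of-day proportions over the two equally likely outcomes:

```latex
\begin{align*}
X = (1, 2):\quad & \hat{b} = \left(\frac{0.5 \cdot 1}{1.5},\ \frac{0.5 \cdot 2}{1.5}\right)
  = \left(\tfrac13,\ \tfrac23\right), \\
X = (1, \tfrac12):\quad & \hat{b} = \left(\frac{0.5 \cdot 1}{0.75},\ \frac{0.5 \cdot \tfrac12}{0.75}\right)
  = \left(\tfrac23,\ \tfrac13\right), \\
E\,\hat{b} &= \tfrac12\left(\tfrac13, \tfrac23\right) + \tfrac12\left(\tfrac23, \tfrac13\right)
  = \left(\tfrac12,\ \tfrac12\right) = b^*.
\end{align*}
```

The proportions drift after each day, but on average they return to $b^*$.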

Page 15 of 40

slide-16
SLIDE 16

Generalization to arbitrary stochastic processes {Xn}

$X_n$: arbitrary stochastic process.

Wealth from $b_i(\cdot)$:

$$S_n = \prod_{i=1}^{n} b_i^t X_i, \qquad b_i = b_i(X^{i-1})$$

Let

$$S_n^* = \prod_{i=1}^{n} b_i^{*t} X_i, \qquad b_i^* = b_i^*(X^{i-1}),$$

where $b_i^*$ is conditionally log optimal. Thus

$$b_i^*(X^{i-1}) : \quad \max_{b} E\{\ln b^t X_i \mid X^{i-1}\}$$

Page 16 of 40

slide-17
SLIDE 17

Optimality for arbitrary stochastic processes {Xn}

Theorem

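For any market process $\{X_i\}$,

$$E\left\{ \frac{S_{n+1}}{S_{n+1}^*} \,\Big|\, X^n \right\} \le \frac{S_n}{S_n^*}.$$

$S_n/S_n^*$ is a nonnegative supermartingale with respect to $\{X_n\}$.

$$\frac{S_n}{S_n^*} \longrightarrow Y, \ \text{a.e.}, \qquad EY \le 1.$$

Corollary:

$$\Pr\left\{ \sup_{n} \frac{S_n}{S_n^*} \ge t \right\} \le 1/t,$$

by Kolmogorov’s inequality. So $S_n$ cannot ever exceed $S_n^*$ by factor $t$ with probability greater than $1/t$. Same as fair gambling.

Theorem

If $\{X_i\}$ is ergodic, then

$$\frac{1}{n} \log S_n^* \longrightarrow W, \ \text{a.e.}$$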

Page 17 of 40

slide-18
SLIDE 18
3. Value of Side Information

Theorem: Believe that $X \sim g$, when in fact $X \sim f$. Loss in growth rate:

$$\Delta(f \| g) = E_f \log \frac{b_f^t X}{b_g^t X} \le D(f \| g) = \int f \log \frac{f}{g}.$$

Mutual information:

$$I(X; Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x) p(y)}$$

Value of side information:

$$W(X) = \max_{b} E \ln b^t X, \qquad W(X \mid Y) = \max_{b(\cdot)} E \ln b^t(Y) X$$

$W(X) \to W(X \mid Y)$ as $b^* \to b^*(y)$.

$\Delta(X; Y)$ = Increase in growth rate for market $X$.

Theorem: (A. Barron, T.C.)

$$\Delta(X; Y) \le I(X; Y).$$

Page 18 of 40

slide-19
SLIDE 19
4. Black-Scholes option pricing

Cash: 1

Stock:

$$X_i = \begin{cases} 1 + u, & \text{w.p. } p \\ 1 - d, & \text{w.p. } q \end{cases}$$

Option: Pay $c$ dollars today for option to buy at time $n$ the stock at price $K$.

$$c \to \begin{cases} X_n - K, & X_n \ge K \\ 0, & X_n < K \end{cases}$$

Black, Scholes idea: Replicate option by buying and selling $X_i$, at times $i = 1, 2, \ldots, n$.

Example: Option expiration date $n = 1$. Strike price $K$. Initial wealth $= c$.

$$c_1 + c_2 X = (X - K)^+, \qquad c = c_1 + c_2.$$

If it takes $c$ dollars to replicate option, then $c$ is a correct price for the option.

Page 19 of 40

slide-20
SLIDE 20

Black-Scholes option pricing

Growth optimal approach: the available assets are

$$\left( 1, \ X, \ \frac{(X - K)^+}{c} \right)$$

Best portfolio without option:

$$\max_{b_1 + b_2 = 1} E \ln (b_1 + b_2 X)$$

Growth optimal wealth: $X^* = b_1^* + b_2^* X$

Add option:

$$\max_{b} E \ln \left( (1 - b) X^* + b \, \frac{(X - K)^+}{c} \right)$$

$$\frac{d}{db} E \ln \left( (1 - b) X^* + b \, \frac{(X - K)^+}{c} \right) \Bigg|_{b=0} = E \, \frac{\frac{(X - K)^+}{c} - X^*}{X^*} \ge 0,$$

or

$$E \frac{(X - K)^+}{X^*} \ge c.$$

Critical price:

$$c^* = E \frac{(X - K)^+}{X^*}.$$

But this is the same critical option price $c^*$ as the Black-Scholes theory.

Note: $c^*$ does not depend on probabilities, only on $u$ and $d$.
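The claim that the critical price coincides with the replication price, and does not depend on $p$, can be checked numerically for the one-period case. The following is a sketch under stated assumptions rather than the author's derivation: the function names are hypothetical, cash earns no interest (cash = 1 as on the slide), and the log-optimal stock fraction is taken unconstrained (shorting and margin allowed), using the closed form $b_2 = (pu - qd)/(ud)$ that solves the first-order condition.

```python
def replication_price(u, d, strike):
    """Cost c = c1 + c2 of the cash/stock mix matching (X - K)^+ in both states."""
    up, down = 1.0 + u, 1.0 - d
    pay_up, pay_down = max(up - strike, 0.0), max(down - strike, 0.0)
    c2 = (pay_up - pay_down) / (up - down)   # units of stock held
    c1 = pay_up - c2 * up                    # units of cash held
    return c1 + c2                           # both assets cost 1 today

def growth_optimal_price(u, d, p, strike):
    """Critical price c* = E[(X - K)^+ / X*], with X* the log-optimal wealth."""
    q = 1.0 - p
    up, down = 1.0 + u, 1.0 - d
    # Assumption: unconstrained maximizer of p*ln(1 + b2*u) + q*ln(1 - b2*d).
    b2 = (p * u - q * d) / (u * d)
    b1 = 1.0 - b2
    x_star_up, x_star_down = b1 + b2 * up, b1 + b2 * down
    pay_up, pay_down = max(up - strike, 0.0), max(down - strike, 0.0)
    return p * pay_up / x_star_up + q * pay_down / x_star_down

# Illustrative parameters (not from the slides).
u, d, strike = 0.2, 0.1, 1.0
for p in (0.3, 0.5, 0.7):
    print(p, growth_optimal_price(u, d, p, strike), replication_price(u, d, strike))
```

For each value of `p` the two prices agree. If the portfolio were restricted to be long-only, the first-order condition could fail to hold with equality and the agreement need not be exact.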

Page 20 of 40

slide-21
SLIDE 21
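5. Asymptotic Equipartition Principle

AEP: $X_1, X_2, \ldots, X_n$ i.i.d. $\sim p(x)$,

$$\frac{1}{n} \log \frac{1}{p(X_1, X_2, \ldots, X_n)} \to H.$$

AEP for markets: Wealth

$$S_n = \prod_{i=1}^{n} b^t X_i, \qquad \frac{1}{n} \log S_n \to W.$$

Proof:

$$\frac{1}{n} \log S_n = \frac{1}{n} \log \prod_{i=1}^{n} b^t X_i = \frac{1}{n} \sum_{i=1}^{n} \log b^t X_i \to W.$$

$$p(X_1, X_2, \ldots, X_n) \doteq 2^{-nH}, \qquad S_n(X_1, X_2, \ldots, X_n) \doteq 2^{nW}$$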
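The market AEP is just the law of large numbers applied to the log wealth relatives, which a short simulation can illustrate. This is a minimal sketch with assumptions: it reuses the cash vs. hot stock market from the earlier example, the function name `simulate_growth_exponent` is hypothetical, and the seed and horizon are arbitrary choices.

```python
import math
import random

def simulate_growth_exponent(b, outcomes, n, seed=0):
    """Return (1/n) * log2(S_n) for a constant rebalanced portfolio b on i.i.d. draws."""
    rng = random.Random(seed)
    total_log = 0.0
    for _ in range(n):
        x = rng.choice(outcomes)  # equally likely price-relative vectors
        total_log += math.log2(sum(bj * xj for bj, xj in zip(b, x)))
    return total_log / n

OUTCOMES = [(1.0, 2.0), (1.0, 0.5)]   # (cash, hot stock), each with probability 1/2
w_exact = 0.5 * math.log2(9 / 8)      # W for b = (1/2, 1/2)
w_empirical = simulate_growth_exponent((0.5, 0.5), OUTCOMES, n=200_000)
w_all_stock = simulate_growth_exponent((0.0, 1.0), OUTCOMES, n=200_000)
print(w_exact, w_empirical, w_all_stock)
```

The empirical exponent for the half-and-half portfolio settles near $\tfrac12 \log_2 \tfrac98$, while holding only the hot stock gives an exponent near zero.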

Page 21 of 40

slide-22
SLIDE 22

Asymptotic Equipartition Principle: Horse race

$b = (b_1, b_2, \ldots, b_m)$, $X = (0, 0, \ldots, 0, m, 0, \ldots, 0)$ (with $m$ in position $i$) with probability $p_i$.

$b^* = (p_1, p_2, \ldots, p_m)$: Kelly gambling

Proof:

$$W = E \log S = \sum_{i=1}^{m} p_i \log (b_i m) = \log m + \sum_{i} p_i \log \frac{b_i}{p_i} + \sum_{i} p_i \log p_i \le \log m - H(p_1, \ldots, p_m),$$

with equality if and only if $b_i = p_i$, for $i = 1, 2, \ldots, m$.

Conservation law: $W + H = \log m$
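The conservation law is easy to confirm numerically. The sketch below assumes fair $m$-for-1 odds as on the slide; the function names and the example distribution are illustrative choices, not taken from the talk.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def doubling_rate(b, p):
    """W(b) = sum_i p_i log2(b_i * m) for a horse race with m-for-1 odds."""
    m = len(p)
    return sum(pi * math.log2(bi * m) for pi, bi in zip(p, b) if pi > 0)

p = (0.5, 0.25, 0.25)                 # illustrative win probabilities
w_kelly = doubling_rate(p, p)         # proportional (Kelly) betting, b = p
w_uniform = doubling_rate((1 / 3, 1 / 3, 1 / 3), p)
print(w_kelly + entropy(p), math.log2(len(p)), w_uniform)
```

Betting proportionally to `p` attains $W + H = \log m$, while the uniform bet earns a doubling rate of zero at these odds.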

Page 22 of 40

slide-23
SLIDE 23

Comparisons

Information Theory | Investment
Entropy Rate: $H = -\sum_i p_i \log p_i$ | Doubling Rate: $W^* = \max_b E \log b^t X$
AEP: $p(X_1, X_2, \ldots, X_n) \doteq 2^{-nH}$ | $S^*(X_1, X_2, \ldots, X_n) \doteq 2^{nW^*}$
Universal Data Compression: $l^{**}(X_1, X_2, \ldots, X_n) \doteq nH$ | Universal Portfolio Selection: $S^{**}(X_1, X_2, \ldots, X_n) \doteq 2^{nW^*}$

$W^* + H \le \log m$

Page 23 of 40

slide-24
SLIDE 24
6. Competitive optimality

$X \sim F(x)$. Consider the two-person zero sum game:

Player 1: Portfolio $b_1$. Wealth $S_1 = W_1 b_1^t X$.

Player 2: Portfolio $b_2$. Wealth $S_2 = W_2 b_2^t X$.

Fair randomization: $EW_1 = EW_2 = 1$, $W_i \ge 0$.

Payoff: $\Pr\{S_1 \ge S_2\}$

$$V = \max_{b_1, W_1} \min_{b_2, W_2} \Pr\{S_1 \ge S_2\}$$

Theorem (R. Bell, T.C.) The value $V$ of the game is $1/2$. Optimal strategy for player 1 is $b_1 = b^*$, where $b^*$ is the log optimal portfolio, and $W_1 \sim \mathrm{unif}[0, 2]$.

Comment: $b^*$ is both long run and short run optimal.

Page 24 of 40

slide-25
SLIDE 25
7. Universal portfolio selection

Market sequence $x_1, x_2, \ldots, x_n$

$$S_n(b) = \prod_{i=1}^{n} b^t x_i$$

$$S_n^* = \max_{b} S_n(b) = \prod_{i=1}^{n} b^{*t} x_i.$$

Investor: $\hat{b}_i(x_1, x_2, \ldots, x_{i-1})$

$$\hat{S}_n = \prod_{i=1}^{n} \hat{b}_i^t x_i$$

Page 25 of 40

slide-26
SLIDE 26

Page 26 of 40

slide-27
SLIDE 27

Page 27 of 40

slide-28
SLIDE 28

Minimax regret universal portfolio

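Minimax regret for horizon $n$ is defined as

$$R_n^* = \min_{\hat{b}(\cdot)} \max_{x^n, b} \frac{\prod_{i=1}^{n} b^t x_i}{\prod_{i=1}^{n} \hat{b}_i^t(x^{i-1}) x_i} = \min_{\hat{b}} \max_{x^n} \frac{S_n^*}{\hat{S}_n}$$

Theorem: (Erik Ordentlich, T.C.)

$$R_n^* = \frac{1}{V_n}, \quad \text{where } V_n = \left[ \sum_{n_1 + \cdots + n_m = n} \binom{n}{n_1, \ldots, n_m} 2^{-nH\left(\frac{n_1}{n}, \ldots, \frac{n_m}{n}\right)} \right]^{-1}$$

Note: For $m = 2$ stocks,

$$V_n = \left[ \sum_{k=0}^{n} \binom{n}{k} 2^{-nH\left(\frac{k}{n}\right)} \right]^{-1} \sim \sqrt{\frac{2}{\pi n}}, \qquad V_n \le \frac{2}{\sqrt{n+1}}$$

Corollary: For $m = 2$ stocks, there exists $\hat{b}_i(x^{i-1})$ such that

$$\hat{S}_n \ge \frac{S_n^*}{2\sqrt{n+1}}, \quad \text{for every sequence } x_1, \ldots, x_n.$$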
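The size of $V_n$ for two stocks can be computed directly. The sketch below rests on an assumption about the garbled formula: it reads $V_n$ as the reciprocal of $\sum_k \binom{n}{k} 2^{-nH(k/n)}$, which is the reading consistent with $V_n \sim \sqrt{2/(\pi n)}$, and it uses the identity $2^{-nH(k/n)} = (k/n)^k ((n-k)/n)^{n-k}$. The function name `v_n` is hypothetical.

```python
import math

def v_n(n):
    """V_n for m = 2: reciprocal of sum_k C(n,k) (k/n)^k ((n-k)/n)^(n-k)."""
    total = 0.0
    for k in range(n + 1):
        # 0.0 ** 0 evaluates to 1.0 in Python, matching the convention 0 log 0 = 0.
        total += math.comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)
    return 1.0 / total

for n in (1, 2, 10, 100, 400):
    lower = 1 / (2 * math.sqrt(n + 1))     # guarantee in the corollary
    upper = 2 / math.sqrt(n + 1)           # upper bound quoted on the slide
    asymptotic = math.sqrt(2 / (math.pi * n))
    print(n, lower, v_n(n), upper, asymptotic)
```

The cost of universality decays only like $1/\sqrt{n}$, so it has no effect on the exponential growth rate. The floating-point sum is adequate for moderate horizons; very large `n` would need log-domain arithmetic.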

Page 28 of 40

slide-29
SLIDE 29

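Achieving $R_n^*$: Universal Portfolio for horizon $n$

Portfolio $\hat{b}_i(X^{i-1})$: Invest

$$\hat{b}(j^n) = V_n \left( \frac{n_1(j^n)}{n} \right)^{n_1(j^n)} \left( \frac{n_2(j^n)}{n} \right)^{n_2(j^n)} \cdots \left( \frac{n_m(j^n)}{n} \right)^{n_m(j^n)}$$

in “plunging” strategy $j^n$ and let it ride, where $j^n \in \{1, 2, \ldots, m\}^n$, $n_k(j^n)$ is the number of times stock $k$ appears in $j^n$, and $V_n$ normalizes the weights $\hat{b}(j^n)$ to sum to one.

Example: For horizon $n = 2$. For $m = 2$. $X_1 = (X_{11}, X_{12})$

$$\hat{b}_1 = \left( \tfrac{1}{2}, \tfrac{1}{2} \right)$$

$$\hat{b}_2(X_1) = \left( \frac{\frac{4}{5} X_{11} + \frac{1}{5} X_{12}}{X_{11} + X_{12}}, \ \frac{\frac{1}{5} X_{11} + \frac{4}{5} X_{12}}{X_{11} + X_{12}} \right)$$

$$\hat{b}(11) = 4/10, \quad \hat{b}(12) = 1/10, \quad \hat{b}(21) = 1/10, \quad \hat{b}(22) = 4/10$$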
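The example weights can be reproduced by enumerating the plunging strategies. The following is a sketch, not the author's implementation: the function names are hypothetical, stocks are indexed from 0 rather than 1, and the weights are normalized to sum to one, which is what the slide's values 4/10, 1/10, 1/10, 4/10 imply.

```python
from fractions import Fraction
from itertools import product

def plunging_weights(n, m):
    """Initial wealth fraction placed on each sequence j^n in {0..m-1}^n."""
    raw = {}
    for seq in product(range(m), repeat=n):
        w = Fraction(1)
        for stock in range(m):
            count = seq.count(stock)
            w *= Fraction(count, n) ** count   # (n_k/n)^(n_k); Fraction(0) ** 0 == 1
        raw[seq] = w
    total = sum(raw.values())                  # normalizing constant
    return {seq: w / total for seq, w in raw.items()}

def second_day_portfolio(weights, x1):
    """Fraction of wealth in each stock on day 2 after observing x1 (horizon n = 2)."""
    held = [Fraction(0)] * len(x1)
    for (first, second), w in weights.items():
        held[second] += w * x1[first]          # grows by x1[first], then sits in `second`
    total = sum(held)
    return [h / total for h in held]

weights = plunging_weights(2, 2)
first_day = [sum(w for seq, w in weights.items() if seq[0] == k) for k in range(2)]
second_day = second_day_portfolio(weights, (Fraction(2), Fraction(1)))
print(weights, first_day, second_day)
```

With the illustrative first-day price relatives $(2, 1)$, the second-day portfolio is $(3/5, 2/5)$, which agrees with the slide's formula $\left(\tfrac45 \cdot 2 + \tfrac15 \cdot 1\right)/3 = \tfrac35$.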

Page 29 of 40

slide-30
SLIDE 30
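8. Accelerated Performance

Stock $x \in \mathbb{R}^m_+$ requires $b \in \mathbb{R}^m_+$, so that $b^t x \ge 0$.

Let

$$X(\alpha) = \left\{ x \in \mathbb{R}^m : x_i \ge \alpha, \ \sum_{i=1}^{m} x_i = 1 \right\}$$

$$B(\alpha) = \left\{ b \in \mathbb{R}^m : \sum_{i=1}^{m} b_i = 1, \ b^t x \ge 0, \ \forall x \in X(\alpha) \right\}$$

$B(\alpha)$ is polar cone to $X(\alpha)$: $B(\alpha) = X^{\perp}(\alpha)$. $B(\alpha)$ allows short selling and buying on margin. Thus $x \in X(\alpha)$, $b \in B(\alpha)$ yields $S = b^t x \ge 0$.

Let $\Omega = \mathbb{R}^m_+$, $X(\alpha) = A\Omega$, $B(\alpha) = A^{-1}\Omega$.

$$A = \begin{pmatrix} \alpha & 1 - \alpha \\ 1 - \alpha & \alpha \end{pmatrix}, \qquad A^{-1} = \frac{1}{2\alpha - 1} \begin{pmatrix} \alpha & -(1 - \alpha) \\ -(1 - \alpha) & \alpha \end{pmatrix}$$

$b \in \Omega$, $X \in \Omega$.

$$\tilde{b} = A^{-1} b \in B(\alpha), \qquad \tilde{X} = AX \in X(\alpha).$$

$$\tilde{b}^t \tilde{X} = b^t (A^{-1})^t A X = b^t X$$

[Figure: the sets $X(\alpha)$ and $B(\alpha)$, with $\alpha$ and $1 - \alpha$ marked on the axes.]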
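The change of variables can be checked with exact arithmetic for $m = 2$. This is a small sketch with illustrative numbers (the value of $\alpha$, the portfolio and the price-relative vector are not from the slides), and the helper names are hypothetical.

```python
from fractions import Fraction

def make_a(alpha):
    """The 2x2 matrix A from the slide."""
    return [[alpha, 1 - alpha], [1 - alpha, alpha]]

def make_a_inv(alpha):
    """Closed-form inverse of A; requires alpha != 1/2."""
    det = 2 * alpha - 1
    return [[alpha / det, -(1 - alpha) / det], [-(1 - alpha) / det, alpha / det]]

def mat_vec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

alpha = Fraction(1, 4)
b = [Fraction(1), Fraction(0)]            # long-only portfolio in Omega
x = [Fraction(4, 5), Fraction(1, 5)]      # normalized price relatives in Omega

b_tilde = mat_vec(make_a_inv(alpha), b)   # may contain negative (short) positions
x_tilde = mat_vec(make_a(alpha), x)       # every component is at least alpha
print(b_tilde, x_tilde, dot(b_tilde, x_tilde), dot(b, x))
```

Here `b_tilde` is $(-\tfrac12, \tfrac32)$, a short position in the first stock, yet its wealth relative on `x_tilde` equals that of the long-only portfolio on `x`.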

Page 30 of 40

slide-31
SLIDE 31

Accelerated Performance

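Theorem (Acceleration (Erik Ordentlich, T.C., to appear)) $m = 2$ stocks. The short selling investor can come within factor $V_n(\alpha)$ of the best long-only investor given hindsight:

$$\max_{\hat{b}_i(\cdot) \in B(\alpha)} \ \min_{x^n \in X^n(\alpha), \ b \in B(0)} \frac{\prod_{i=1}^{n} \hat{b}_i^t x_i}{\prod_{i=1}^{n} b^t x_i} = V_n(\alpha),$$

$$V_n(\alpha) = \left[ \sum_{k=0}^{n} \binom{n}{k} \left[ \frac{k}{n} \right]^{k} \left[ \frac{n - k}{n} \right]^{n-k} \right]^{-1},$$

where $[x]$ = $x$ rounded off to interval $[\alpha, \bar{\alpha}]$.

Note: $V_n(\alpha) \nearrow$ in $\alpha$. $V_n(0) \sim \sqrt{\frac{2}{\pi}} \frac{1}{\sqrt{n}}$. $V_n(\tfrac{1}{2}) = 1$.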

Page 31 of 40

slide-32
SLIDE 32

Accelerated Performance

[Figure: S&P500 index versus time, 28-Sep-07 till 14-Oct-08.]

Page 32 of 40

slide-33
SLIDE 33

Accelerated Performance

[Figure: wealth $S_n$ versus portfolio $b$, with $S_n^*$ and $S_n^{**}$ marked.]

9/28/07 – 10/14/08, $n = 263$.

$S_n^*$: Wealth of best long-only constant rebalanced portfolio in hindsight.

$S_n^{**}$: Wealth of best short selling and margin constant rebalanced portfolio in hindsight.

Page 33 of 40

slide-34
SLIDE 34

Accelerated Performance

[Figure: wealth $S_n$ versus portfolio $b$ for $\alpha = 0.45$, with $S_n^*$, $S_n^{**}$ and $\hat{S}_n = 1.0475$ marked.]

9/28/07 – 10/14/08, $n = 263$.

$S_n^*$: Wealth of best long-only constant rebalanced portfolio in hindsight.

$S_n^{**}$: Wealth of best short selling and margin constant rebalanced portfolio in hindsight.

$\hat{S}_n$: Wealth of universal portfolio.

Page 34 of 40

slide-35
SLIDE 35

Comparisons with Information Theory

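General Market | Horse Race Market
$X \sim F(x)$ | $X = m e_i$ with probability $p_i$
$b^*$: $E \frac{b_i^* X_i}{b^{*t} X} = b_i^*$ | $b_i = p_i$, Kelly gambling
$W^* = E \log b^{*t} X$ | $W^* = \log m - H(p)$, $H$ = entropy
Wrong distribution $G(x)$: $\Delta(F \| G) = \int \ln \frac{b_F^t x}{b_G^t x} \, dF(x)$ | $\Delta = \sum_i p_i \ln \frac{p_i}{g_i} = D(p \| g)$, relative entropy
Side information $(X, Y) \sim f(x, y)$: $\Delta = \int \ln \frac{b_{f(x \mid y)}^t x}{b_{f(x)}^t x} \, f(x, y) \, dx \, dy$ | $\Delta = \sum_{x, y} p(x, y) \ln \frac{p(x, y)}{p(x) p(y)} = I(X; Y)$, mutual information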
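The two horse-race identities in this comparison can be checked numerically. The sketch below makes several assumptions: fair $m$-for-1 odds, proportional betting on the believed distribution, base-2 logarithms (the identities hold in any base), and illustrative distributions that are not from the talk. All function names are hypothetical.

```python
import math

def doubling_rate(b, p):
    """Growth rate sum_i p_i log2(b_i * m) when betting b and the truth is p."""
    m = len(p)
    return sum(pi * math.log2(bi * m) for pi, bi in zip(p, b) if pi > 0)

def relative_entropy(p, g):
    return sum(pi * math.log2(pi / gi) for pi, gi in zip(p, g) if pi > 0)

def marginals(joint):
    """joint[x][y] = p(x, y); returns (p(x), p(y))."""
    return [sum(row) for row in joint], [sum(col) for col in zip(*joint)]

def mutual_information(joint):
    px, py = marginals(joint)
    return sum(pxy * math.log2(pxy / (px[x] * py[y]))
               for x, row in enumerate(joint) for y, pxy in enumerate(row) if pxy > 0)

def side_information_gain(joint):
    """Growth with b(y) = p(.|y) minus growth with b = p(.), at m-for-1 odds."""
    m = len(joint)
    px, py = marginals(joint)
    with_side = sum(pxy * math.log2(m * pxy / py[y])
                    for x, row in enumerate(joint) for y, pxy in enumerate(row) if pxy > 0)
    return with_side - doubling_rate(px, px)

p, g = (0.5, 0.25, 0.25), (1 / 3, 1 / 3, 1 / 3)
loss = doubling_rate(p, p) - doubling_rate(g, p)   # cost of believing g
joint = [[0.4, 0.1], [0.1, 0.4]]
gain = side_information_gain(joint)
print(loss, relative_entropy(p, g), gain, mutual_information(joint))
```

In the horse race the loss from a wrong belief equals $D(p \| g)$ and the gain from side information equals $I(X; Y)$ exactly, whereas for a general market the earlier slide gives these quantities only as upper bounds.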

Page 35 of 40

slide-36
SLIDE 36

Comparisons

General Market | Horse Race Market

Asymptotic growth rate, $\{X_i\}$ stationary:
$W^* = \max_b E\{\ln b^t X_0 \mid X_{-\infty}^{-1}\}$ | $W^* = \log m - H(X_0 \mid X_{-\infty}^{-1}) = \log m - H(\mathcal{X})$, $H(\mathcal{X})$ = entropy rate

AEP for ergodic processes:
$\frac{1}{n} \log S_n^* \to W^*$, a.e. | $-\frac{1}{n} \log p(X^n) \to H(\mathcal{X})$, a.e.
$S_n^* \doteq 2^{nW^*}$ | $p(X^n) \doteq 2^{-nH}$

Page 36 of 40

slide-37
SLIDE 37

Comparisons

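Universal portfolio (individual sequences):

General Market | Horse Race Market
$x_1, x_2, \ldots, x_n \in \mathbb{R}^m_+$ | $x_1, x_2, \ldots, x_n \in \{e_1, \ldots, e_m\}$
$S_n(b, x^n) = \prod_{i=1}^{n} b^t x_i$ | $S_n(b, x^n) = \prod_{i=1}^{m} b_i^{n_i(x^n)}$
$\hat{S}_n(\hat{b}^n, x^n) = \prod_{i=1}^{n} \hat{b}^t(x^{i-1}) x_i$ | $\hat{S}_n(\hat{b}^n, x^n) = \hat{b}(x^n)$
$V_n$ | $V_n$

Same cost of universality for both.

$$V_n = \max_{\hat{b}(\cdot)} \min_{b, x^n} \frac{\hat{S}_n(\hat{b}^n, x^n)}{S_n(b, x^n)} = \left[ \sum_{n_1 + \cdots + n_m = n} \binom{n}{n_1, \ldots, n_m} 2^{-nH\left(\frac{n_1}{n}, \ldots, \frac{n_m}{n}\right)} \right]^{-1}$$

Page 37 of 40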

slide-38
SLIDE 38

Concluding remarks

Growth optimal portfolios have many properties:

  • Long run optimality
  • Martingale property
  • Competitive optimality
  • Asymptotic equipartition property
  • Universal achievability
  • Black-Scholes
  • Amplification
  • Relationship with information theory

Page 38 of 40

slide-39
SLIDE 39

References

Algoet, Barron, Bell, Borodin, Cover, Erkip, Gluss, Györfi, Hakansson, Iyengar, Jamshidian, Lugosi, Mathis, Merton, Ordentlich, Platen, Samuelson, Shannon, Thorp, Vajda, Warmuth, Ziemba, Markowitz, Sharpe, Duffie

Page 39 of 40

slide-40
SLIDE 40

References

  • R. Bell and T. Cover, “Game-Theoretic Optimal Portfolios,” Management Science, 34(6):724-733, June 1988.
  • T. Cover, “Universal Portfolios,” Mathematical Finance, 1(1):1-29, January 1991.
  • T. Cover and E. Ordentlich, “Universal Portfolios with Side Information,” IEEE Transactions on Information Theory, 42(2):348-363, March 1996.
  • E. Ordentlich and T. Cover, “The Cost of Achieving the Best Portfolio in Hindsight,” Mathematics of Operations Research, 23(4):960-982, November 1998.

Page 40 of 40