 
Longstaff Schwartz algorithm and Neural Network regression
Jérôme Lelong (joint work with B. Lapeyre), Univ. Grenoble Alpes
Advances in Financial Mathematics 2020, January 2020
Outline: Introduction, The Longstaff Schwartz algorithm, Numerical experiments

Introduction
◮ Pricing an American option on a large number of assets remains numerically challenging.
◮ A hope: Neural Networks (NN) may help reduce the computational burden.
◮ Some previous works using NN for optimal stopping (though not with the LS algorithm):
  ◮ Michael Kohler, Adam Krzyżak, and Nebojsa Todorovic. Pricing of high-dimensional American options by neural networks. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 20(3):383–410, 2010.
  ◮ S. Becker, P. Cheridito, and A. Jentzen. Deep optimal stopping. Journal of Machine Learning Research, 20(74):1–25, 2019.

Computing Bermudan option prices
◮ A discrete-time (discounted) payoff process $(Z_{T_k})_{0 \le k \le N}$ adapted to $(\mathcal{F}_{T_k})_{0 \le k \le N}$, with $\max_{0 \le k \le N} |Z_{T_k}| \in L^2$.
◮ The time-$T_k$ discounted value of the Bermudan option is given by
$$U_{T_k} = \operatorname{esssup}_{\tau \in \mathcal{T}_{T_k}} E[Z_\tau \mid \mathcal{F}_{T_k}]$$
where $\mathcal{T}_{T_k}$ is the set of all $\mathcal{F}$-stopping times with values in $\{T_k, T_{k+1}, \dots, T\}$.
◮ From the Snell envelope theory, we derive the standard dynamic programming algorithm (→ “Tsitsiklis-Van Roy” type algorithms):
$$\begin{cases} U_{T_N} = Z_{T_N} \\ U_{T_k} = \max\left( Z_{T_k},\, E[U_{T_{k+1}} \mid \mathcal{F}_{T_k}] \right) \end{cases} \tag{1}$$
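As an illustration of the backward induction (1), here is a minimal Python sketch on a case where the conditional expectation is explicit. The Cox-Ross-Rubinstein binomial tree, the put payoff, the parameter values and the function name `bermudan_put_binomial` are all assumptions made for the example; none of them come from the talk.

```python
import math


def bermudan_put_binomial(s0=100.0, strike=100.0, r=0.05, sigma=0.2,
                          maturity=1.0, n_steps=50):
    """Discounted value U_{T_0} of a Bermudan put via the recursion (1).

    Assumption: the asset follows a CRR binomial tree and exercise is
    allowed at every tree date T_k = k * maturity / n_steps.
    """
    dt = maturity / n_steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    q = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability

    def payoff(k, j):
        # Discounted payoff Z_{T_k} at the node with j up-moves among k steps.
        s = s0 * u**j * d**(k - j)
        return math.exp(-r * k * dt) * max(strike - s, 0.0)

    # Terminal condition: U_{T_N} = Z_{T_N}.
    values = [payoff(n_steps, j) for j in range(n_steps + 1)]
    # Backward step: U_{T_k} = max(Z_{T_k}, E[U_{T_{k+1}} | F_{T_k}]).
    # No extra discount factor appears because U and Z are already discounted.
    for k in range(n_steps - 1, -1, -1):
        values = [
            max(payoff(k, j), q * values[j + 1] + (1.0 - q) * values[j])
            for j in range(k + 1)
        ]
    return values[0]
```

On a tree the conditional expectation is a two-term average. In the general Markovian setting of the talk it has to be estimated, which is where regression comes in.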

The policy iteration approach...
Let $\tau_k$ be the smallest optimal stopping time after $T_k$.
$$\begin{cases} \tau_N = T_N \\ \tau_k = T_k \mathbf{1}_{\{Z_{T_k} \ge E[Z_{\tau_{k+1}} \mid \mathcal{F}_{T_k}]\}} + \tau_{k+1} \mathbf{1}_{\{Z_{T_k} < E[Z_{\tau_{k+1}} \mid \mathcal{F}_{T_k}]\}} \end{cases} \tag{2}$$
This is a dynamic programming principle on the policy, not on the value function (→ “Longstaff-Schwartz” algorithm). Practitioners favour this approach for its robustness.
Difficulty: how to compute the conditional expectations?

... in a Markovian context
◮ Markovian context: $(X_t)_{0 \le t \le T}$ is a Markov process and $Z_{T_k} = \phi_k(X_{T_k})$. Then
$$E[Z_{\tau_{k+1}} \mid \mathcal{F}_{T_k}] = E[Z_{\tau_{k+1}} \mid X_{T_k}] = \psi_k(X_{T_k})$$
where $\psi_k$ is a measurable function.
◮ Because of the $L^2$ assumption, $\psi_k$ can be computed by solving a least-squares problem:
$$\inf_{\psi \in L^2(\mathcal{L}(X_{T_k}))} E\left[ \left| Z_{\tau_{k+1}} - \psi(X_{T_k}) \right|^2 \right]$$
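The link between conditional expectation and least squares can be checked numerically on a toy example. The sketch below is only an illustration: the Gaussian model, the choice $Y = X^2 + \text{noise}$ and the helper name `fit_conditional_expectation` are assumptions, not taken from the talk. It fits a polynomial by least squares and compares it with the known $E[Y \mid X] = X^2$.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_conditional_expectation(x, y, degree):
    """Least-squares polynomial approximation of psi(x) = E[Y | X = x]."""
    coeffs = P.polyfit(x, y, degree)          # minimizes sum |y - poly(x)|^2
    return lambda pts: P.polyval(pts, coeffs)


rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)
y = x**2 + rng.standard_normal(x.size)        # toy model: E[Y | X] = X^2
psi_hat = fit_conditional_expectation(x, y, degree=2)

# psi_hat(1.5) should be close to 1.5**2 = 2.25, up to Monte Carlo error.
```

Here the true regression function lies in the polynomial space, so only the sampling error remains. In general there is also an approximation error coming from the choice of the function space.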

Different numerical strategies
◮ The standard numerical (LS) approach: approximate the space $L^2$ by a finite-dimensional vector space (polynomials, ...).
◮ We investigate the use of Neural Networks to approximate $\psi_k$.
◮ Kohler et al. [2010]: neural networks, but in a different context (approximation of the value function as in Tsitsiklis and Roy [2001], equation (1)) and with re-simulation of the paths at each time step.

LS: truncation step
Longstaff-Schwartz type algorithms rely on a direct approximation of the stopping times and use the same simulated paths for all time steps (obvious and large computational gains).
◮ $(g_k, k \ge 1)$ is an $L^2(\mathcal{L}(X))$ basis and $\Phi_p(X, \theta) = \sum_{k=1}^p \theta_k g_k(X)$.
◮ Backward approximation of the policy iteration using (2):
$$\begin{cases} \hat\tau^p_N = T_N \\ \hat\tau^p_n = T_n \mathbf{1}_{\{Z_{T_n} \ge \Phi_p(X_{T_n}; \hat\theta^p_n)\}} + \hat\tau^p_{n+1} \mathbf{1}_{\{Z_{T_n} < \Phi_p(X_{T_n}; \hat\theta^p_n)\}} \end{cases}$$
◮ with the conditional expectation computed through a least-squares minimization problem: $\hat\theta^p_n$ is a minimizer of
$$\inf_\theta E\left[ \left| \Phi_p(X_{T_n}; \theta) - Z_{\hat\tau^p_{n+1}} \right|^2 \right].$$
◮ Price approximation: $U^p_0 = \max\left( Z_0,\, E\left[ Z_{\hat\tau^p_1} \right] \right)$.

The LS algorithm
◮ $(g_k, k \ge 1)$ is an $L^2(\mathcal{L}(X))$ basis and $\Phi_p(X, \theta) = \sum_{k=1}^p \theta_k g_k(X)$.
◮ Paths $X^{(m)}_{T_0}, X^{(m)}_{T_1}, \dots, X^{(m)}_{T_N}$ and payoff paths $Z^{(m)}_{T_0}, Z^{(m)}_{T_1}, \dots, Z^{(m)}_{T_N}$, $m = 1, \dots, M$.
◮ Backward approximation of the policy iteration:
$$\begin{cases} \hat\tau^{p,(m)}_N = T_N \\ \hat\tau^{p,(m)}_n = T_n \mathbf{1}_{\{Z^{(m)}_{T_n} \ge \Phi_p(X^{(m)}_{T_n}; \hat\theta^{p,M}_n)\}} + \hat\tau^{p,(m)}_{n+1} \mathbf{1}_{\{Z^{(m)}_{T_n} < \Phi_p(X^{(m)}_{T_n}; \hat\theta^{p,M}_n)\}} \end{cases}$$
◮ with the conditional expectation computed using a Monte Carlo minimization problem: $\hat\theta^{p,M}_n$ is a minimizer of
$$\inf_\theta \frac{1}{M} \sum_{m=1}^M \left| \Phi_p(X^{(m)}_{T_n}; \theta) - Z^{(m)}_{\hat\tau^{p,(m)}_{n+1}} \right|^2.$$
◮ Price approximation: $U^{p,M}_0 = \max\left( Z_0,\, \frac{1}{M} \sum_{m=1}^M Z^{(m)}_{\hat\tau^{p,(m)}_1} \right)$.
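The backward policy iteration with Monte Carlo regression can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. It assumes a one-dimensional Black-Scholes model, a put payoff and a monomial basis of degree 3, and the function names `simulate_gbm` and `longstaff_schwartz_put` are hypothetical. One practical safeguard is added that is not part of the slide: a path stops only when its payoff is strictly positive.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def simulate_gbm(s0, r, sigma, maturity, n_dates, n_paths, rng):
    """Paths X^{(m)}_{T_0..T_N} of a geometric Brownian motion (assumed model)."""
    dt = maturity / n_dates
    z = rng.standard_normal((n_paths, n_dates))
    log_incr = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_paths = np.concatenate(
        [np.zeros((n_paths, 1)), np.cumsum(log_incr, axis=1)], axis=1
    )
    return s0 * np.exp(log_paths)


def longstaff_schwartz_put(paths, strike, r, maturity, degree=3):
    """Monte Carlo price approximation U_0^{p,M} for a Bermudan put."""
    n_paths, n_cols = paths.shape
    n_dates = n_cols - 1
    dt = maturity / n_dates
    discount = np.exp(-r * dt * np.arange(n_cols))
    z = discount * np.maximum(strike - paths, 0.0)  # discounted payoffs Z^{(m)}_{T_n}
    cash = z[:, -1].copy()                          # Z at tau_N = T_N
    for n in range(n_dates - 1, 0, -1):
        x = paths[:, n] / strike                    # rescaled for conditioning
        theta = P.polyfit(x, cash, degree)          # least-squares minimizer theta_n
        continuation = P.polyval(x, theta)          # Phi_p(X_{T_n}; theta_n)
        # Stopping rule of the slide, plus the (assumed) positivity safeguard.
        stop = (z[:, n] > 0.0) & (z[:, n] >= continuation)
        cash[stop] = z[stop, n]                     # tau_n = T_n on these paths
    return float(max(z[0, 0], cash.mean()))


rng = np.random.default_rng(0)
paths = simulate_gbm(100.0, 0.05, 0.2, 1.0, n_dates=10, n_paths=50_000, rng=rng)
price = longstaff_schwartz_put(paths, strike=100.0, r=0.05, maturity=1.0)
```

The regression only decides when to stop. The quantity averaged at the end is the realized payoff $Z^{(m)}$ at the estimated stopping time, not the fitted continuation value, which is the distinction between policy iteration (2) and value iteration (1).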

Reference papers
◮ Description of the algorithm: F.A. Longstaff and E.S. Schwartz. Valuing American options by simulation: A simple least-squares approach. Review of Financial Studies, 14:113–147, 2001.
◮ Rigorous approach: Emmanuelle Clément, Damien Lamberton, and Philip Protter. An analysis of a least squares regression method for American option pricing. Finance and Stochastics, 6(4):449–471, 2002.
  - $U^p_0$ converges to $U_0$ as $p \to +\infty$
  - $U^{p,M}_0$ converges to $U^p_0$ a.s. as $M \to +\infty$
  - “almost” a central limit theorem

The modified algorithm
◮ In the LS algorithm, replace the approximation on a Hilbert basis $\Phi_p(\cdot; \theta)$ by a Neural Network. This is no longer a vector space approximation (it is nonlinear in $\theta$).
◮ The optimization problem is nonlinear, non-convex, ...
◮ Aim: extend the proof of the (a.s.) convergence results.

A quick view of Neural Networks
◮ In short, a NN is a map $x \mapsto \Phi_p(x, \theta) \in \mathbb{R}$, with $\theta \in \mathbb{R}^d$, $d$ large
◮ $\Phi_p = A_L \circ \sigma_a \circ A_{L-1} \circ \dots \circ \sigma_a \circ A_1$, $L \ge 2$
◮ $A_l(x_l) = w_l x_l + \beta_l$ (affine functions)
◮ $L - 2$: “number of hidden layers”
◮ $p$: “maximum number of neurons per layer” (i.e. the sizes of the $w_l$ matrices)
◮ $\sigma_a$: a fixed nonlinear function (called the activation function) applied componentwise
◮ $\theta := (w_l, \beta_l)_{l = 1, \dots, L}$: parameters of all the layers
◮ Restriction to a compact set $\Theta_p = \{\theta : |\theta| \le \gamma_p\}$, assuming $\lim_{p \to \infty} \gamma_p = \infty$ → this allows the use of the USLLN (uniform strong law of large numbers).
◮ $NN_p = \{\Phi_p(\cdot, \theta) : \theta \in \Theta_p\}$ and $NN_\infty = \cup_{p \in \mathbb{N}} NN_p$
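The composition $\Phi_p = A_L \circ \sigma_a \circ A_{L-1} \circ \dots \circ \sigma_a \circ A_1$ translates directly into a forward pass. The NumPy sketch below is only illustrative. The function names `init_params`, `phi` and `in_theta_p`, the `tanh` activation, the random initialization and the Euclidean norm for $|\theta|$ are assumptions, since the slide does not fix them.

```python
import numpy as np


def init_params(layer_sizes, rng):
    """Build theta = (w_l, beta_l)_{l=1..L} for sizes [input_dim, ..., 1]."""
    return [
        (rng.standard_normal((n_out, n_in)) / np.sqrt(n_in), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def phi(x, theta, activation=np.tanh):
    """Evaluate Phi_p(x, theta) = A_L o sigma_a o A_{L-1} o ... o sigma_a o A_1 (x)."""
    h = np.asarray(x, dtype=float)
    for w, beta in theta[:-1]:
        h = activation(w @ h + beta)        # sigma_a o A_l, componentwise
    w_last, beta_last = theta[-1]
    return (w_last @ h + beta_last).item()  # final affine map A_L, no activation


def in_theta_p(theta, gamma_p):
    """Check |theta| <= gamma_p (Euclidean norm assumed), i.e. theta in Theta_p."""
    flat = np.concatenate([np.concatenate([w.ravel(), b]) for w, b in theta])
    return bool(np.linalg.norm(flat) <= gamma_p)


rng = np.random.default_rng(0)
theta = init_params([3, 8, 8, 1], rng)      # L = 3 affine maps
value = phi(np.ones(3), theta)
```

The output is linear in the last layer's parameters but not in the others, which is why the regression step becomes a nonlinear, non-convex optimization problem.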

Hypothesis H
◮ For every $p$, there exists $q \ge 1$ such that
$$|\Phi_p(x, \theta)| \le \kappa_q (1 + |x|^q) \quad \forall \theta \in \Theta_p,$$
and, a.s., the random functions $\theta \in \Theta_p \longmapsto \Phi_p(X_{T_n}, \theta)$ are continuous.
◮ $E[|X_{T_n}|^{2q}] < \infty$ for all $0 \le n \le N$.
◮ For all $p$ and $n < N$, $P(Z_{T_n} = \Phi_p(X_{T_n}; \hat\theta^p_n)) = 0$.
◮ If $\theta^1$ and $\theta^2$ solve
$$\inf_{\theta \in \Theta_p} E\left[ \left| \Phi_p(X_{T_n}; \theta) - Z_{\hat\tau^p_{n+1}} \right|^2 \right],$$
then $\Phi_p(x, \theta^1) = \Phi_p(x, \theta^2)$ for almost all $x$.
The minimizer need not be unique; only the function it represents must be.

The result
Theorem 1. Under hypothesis H:
◮ Convergence of the Neural Network approximation:
$$\lim_{p \to \infty} E[Z_{\hat\tau^p_n} \mid \mathcal{F}_{T_n}] = E[Z_{\tau_n} \mid \mathcal{F}_{T_n}] \ \text{in } L^2(\Omega) \quad (\text{i.e. } U^p_0 \to U_0).$$
◮ SLLN: for every $k = 1, \dots, N$,
$$\lim_{M \to \infty} \frac{1}{M} \sum_{m=1}^M Z^{(m)}_{\hat\tau^{p,(m)}_k} = E\left[ Z_{\hat\tau^p_k} \right] \ \text{a.s.} \quad (\text{i.e. } U^{p,M}_0 \to U^p_0).$$

Convergence of the NN approximation
A simple consequence of Hornik [1991].
◮ Also known as the “Universal Approximation Theorem”.
Theorem 2 (Hornik). Assume that the function $\sigma_a$ is non-constant and bounded. Let $\mu$ denote a probability measure on $\mathbb{R}^r$; then $NN_\infty$ is dense in $L^2(\mathbb{R}^r, \mu)$.
◮ Corollary: if, for every $p$, $\alpha_p \in \Theta_p$ is a minimizer of
$$\inf_{\theta \in \Theta_p} E\left[ |\Phi_p(X; \theta) - Y|^2 \right],$$
then $(\Phi_p(X; \alpha_p))_p$ converges to $E[Y \mid X]$ in $L^2(\Omega)$ as $p \to \infty$.
◮ This proves the convergence of the “non-linear approximation” $\Phi_p(X; \theta)$.
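The step from Hornik's density result to the corollary can be sketched as follows. This is a reconstruction under assumptions not stated on the slide: $Y \in L^2$, $\mu$ is the law of $X$, and the families $NN_p$ are nested ($NN_p \subset NN_{p'}$ for $p \le p'$).

```latex
% Orthogonality of Y - E[Y|X] to any square-integrable function of X gives,
% for every theta in Theta_p,
\begin{align*}
E\left[|\Phi_p(X;\theta) - Y|^2\right]
  &= E\left[|\Phi_p(X;\theta) - E[Y \mid X]|^2\right]
   + E\left[|Y - E[Y \mid X]|^2\right].
\end{align*}
% The last term does not depend on theta, so alpha_p also minimizes the
% distance to psi(X) := E[Y | X] over NN_p:
\begin{align*}
E\left[|\Phi_p(X;\alpha_p) - \psi(X)|^2\right]
  &= \inf_{\theta \in \Theta_p} E\left[|\Phi_p(X;\theta) - \psi(X)|^2\right].
\end{align*}
% By density of NN_infty in L^2(R^r, mu), for any eps > 0 there exist p_0 and
% theta_0 in Theta_{p_0} with
%   E[|Phi_{p_0}(X; theta_0) - psi(X)|^2] <= eps.
% Under the nestedness assumption, the infimum above is then <= eps for all
% p >= p_0, which is the claimed L^2(Omega) convergence.
```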