

SLIDE 1

Beyond Simple Monte-Carlo: Parallel Computing with QuantLib

Klaus Spanderen

E.ON Global Commodities

November 14, 2013

SLIDE 2

◮ Symmetric Multi-Processing
◮ Graphical Processing Units
◮ Message Passing Interface
◮ Conclusion

SLIDE 3

Symmetric Multi-Processing: Overview

◮ Moore's Law: the number of transistors doubles every two years.
◮ Leakage currents turn out to be the death of CPU clock-speed scaling.
◮ Multi-core designs help processor makers to manage power dissipation.
◮ Symmetric Multi-Processing has become a mainstream technology.

Herb Sutter: "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software."

SLIDE 4

Multi-Processing with QuantLib

Divide and Conquer: Spawn several independent OS processes

[Figure: GFlops vs. number of processes (1 to 256).]

The QuantLib benchmark on a 32-core (plus 32 HT cores) server.
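A minimal sketch of this divide-and-conquer idea (not code from the talk): a plain POSIX fork() loop in which every child process would run its own, fully independent QuantLib pricing; the pricing call itself is a placeholder.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>

int main() {
    const int nProcesses = 8;

    for (int i = 0; i < nProcesses; ++i) {
        const pid_t pid = fork();
        if (pid == 0) {
            // child process: price the i-th slice of the portfolio
            // (placeholder for an independent QuantLib pricing run)
            std::printf("process %d prices slice %d\n", int(getpid()), i);
            std::exit(0);
        }
    }

    // parent process: collect all children
    for (int i = 0; i < nProcesses; ++i)
        wait(NULL);

    return 0;
}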

SLIDE 5

Multi-Threading: Overview

◮ QuantLib is per se not thread-safe.
◮ Use case one: a really thread-safe QuantLib (see Luigi's talk).
◮ Use case two: multi-threading to speed up single pricings.
  ◮ Joseph Wang is working with Open Multi-Processing (OpenMP) to parallelize several finite difference and Monte-Carlo algorithms.
◮ Use case three: multi-threading to parallelize several pricings, e.g. parallel pricing to calibrate models.
◮ Use case four: use of QuantLib in C#, F#, Java or Scala via the SWIG layer and multi-threaded unit tests.
◮ Focus on use cases three and four:
  ◮ The situation is not too bad as long as objects are not shared between different threads (see the sketch below).
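A minimal sketch of use case three (assumption: priceInstrument is a hypothetical function that builds all of its QuantLib objects locally, so nothing is shared between threads):

#include <future>
#include <vector>

// hypothetical: constructs curves, process, engine and instrument
// locally for instrument i and returns its NPV
double priceInstrument(int i) {
    // ... local QuantLib objects only ...
    return 0.0; // placeholder result
}

int main() {
    std::vector<std::future<double> > npvs;
    for (int i = 0; i < 16; ++i)
        npvs.push_back(std::async(std::launch::async, priceInstrument, i));

    double sum = 0.0;
    for (std::size_t i = 0; i < npvs.size(); ++i)
        sum += npvs[i].get(); // collect the results

    return 0;
}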

SLIDE 6

Multi-Threading: Parallel Model Calibration

C++11 version of a parallel model calibration function

Disposable<Array> CalibrationFunction::values(const Array& params) const {
    model_->setParams(params);

    // price every calibration helper asynchronously
    std::vector<std::future<Real> > errorFcts;
    std::transform(std::begin(instruments_), std::end(instruments_),
                   std::back_inserter(errorFcts),
                   [](decltype(*begin(instruments_)) h) {
                       return std::async(std::launch::async,
                           &CalibrationHelper::calibrationError, h.get());
                   });

    // collect the results
    Array values(instruments_.size());
    std::transform(std::begin(errorFcts), std::end(errorFcts), values.begin(),
                   [](std::future<Real>& f) { return f.get(); });

    return values;
}

SLIDE 7

Multi-Threading: Singleton

◮ Riccardo's patch: all singletons are thread-local singletons.

template <class T>
T& Singleton<T>::instance() {
    static boost::thread_specific_ptr<T> tss_instance_;
    if (!tss_instance_.get()) {
        tss_instance_.reset(new T);
    }
    return *tss_instance_;
}

◮ C++11 implementation: Scott Meyers singleton

template <class T>
T& Singleton<T>::instance() {
    static thread_local T t_;
    return t_;
}
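One practical consequence (a usage sketch, not from the talk): with thread-local singletons every thread sees its own Settings instance, so each worker thread has to set its own evaluation date.

#include <ql/settings.hpp>
#include <thread>

using namespace QuantLib;

void worker() {
    // affects only this thread's Settings instance
    Settings::instance().evaluationDate() = Date(14, November, 2013);
    // ... pricing code for this thread ...
}

int main() {
    std::thread t1(worker), t2(worker);
    t1.join();
    t2.join();
    return 0;
}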

SLIDE 8

Multi-Threading: Observer-Pattern

◮ Main purpose in QuantLib: distributed event handling.
◮ The current implementation is highly optimized for single-threading performance.
◮ In a thread-local environment this would be sufficient, but ...
◮ ... the parallel garbage collector in C#/F#, Java or Scala is by definition not thread-local!
◮ Shuo Chen's article "Where Destructors Meet Threads" provides a good solution ...
◮ ... but it is not applicable to QuantLib without a major redesign of the observer pattern.

SLIDE 9

Multi-Threading: Observer-Pattern

The Scala example below fails immediately with spurious error messages:

◮ pure virtual function call
◮ segmentation fault

import org.quantlib.{Array => QArray, _}

object ObserverTest {
    def main(args: Array[String]): Unit = {
        System.loadLibrary("QuantLibJNI")

        val aSimpleQuote = new SimpleQuote(0)
        while (true) {
            (0 until 10).foreach(_ => {
                new QuoteHandle(aSimpleQuote)
                aSimpleQuote.setValue(aSimpleQuote.value + 1)
            })
            System.gc
        }
    }
}

SLIDE 10

Multi-Threading: Observer-Pattern

◮ The observer pattern itself can be solved using the thread-safe boost::signals2 library (see the sketch at the end of this slide).
◮ A problem remains: an observer must be unregistered from all observables before its destructor is called.
◮ Solution:
  ◮ QuantLib enforces that all observers are instantiated as boost shared pointers.
  ◮ The preprocessor directive BOOST_SP_ENABLE_DEBUG_HOOKS provides a hook into every destructor call of a shared object.
  ◮ If the shared object is an observer, the thread-safe version of Observer::unregisterWithAll is used to detach the observer from all observables.
  ◮ Advantage: this solution is backward compatible; e.g. the test suite can now run multi-threaded.
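The following stand-alone sketch (illustrative only, not the actual QuantLib patch) shows the boost::signals2 mechanism: tracking the observer's shared pointer lets the library skip and disconnect the slot thread-safely once the observer has been destroyed.

#include <boost/signals2.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <iostream>

class Observer {
  public:
    void update() { std::cout << "notified" << std::endl; }
};

class Observable {
  public:
    typedef boost::signals2::signal<void()> signal_type;

    void registerObserver(const boost::shared_ptr<Observer>& o) {
        // track() ties the slot's lifetime to the observer's shared_ptr
        sig_.connect(signal_type::slot_type(&Observer::update, o.get()).track(o));
    }
    void notifyObservers() { sig_(); }

  private:
    signal_type sig_;
};

int main() {
    Observable observable;
    boost::shared_ptr<Observer> observer = boost::make_shared<Observer>();

    observable.registerObserver(observer);
    observable.notifyObservers(); // prints "notified"

    observer.reset();             // observer destroyed
    observable.notifyObservers(); // slot is skipped and disconnected safely

    return 0;
}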

SLIDE 11

Finite Differences Methods on GPUs: Overview

◮ The performance of finite difference methods is mainly driven by the speed of the underlying sparse linear algebra subsystem.
◮ In QuantLib any finite difference operator can be exported as a boost::numeric::ublas::compressed_matrix<Real>.
◮ Boost sparse matrices can be exported in Compressed Sparse Row (CSR) format to high-performance libraries (see the sketch below).
◮ CUDA sparse matrix libraries:
  ◮ cuSPARSE: basic linear algebra subroutines for sparse matrices.
  ◮ cusp: a general template library for sparse iterative solvers.
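A sketch of the CSR export step (assuming the default row-major layout and zero-based indices): iterate the ublas matrix once and fill the three raw arrays that libraries like cuSPARSE expect.

#include <boost/numeric/ublas/matrix_sparse.hpp>
#include <cstddef>
#include <vector>

typedef double Real;
typedef boost::numeric::ublas::compressed_matrix<Real> SparseMatrix;

// fills rowPtr, colIdx and values with the CSR representation of m
void toCSR(const SparseMatrix& m,
           std::vector<int>& rowPtr,
           std::vector<int>& colIdx,
           std::vector<Real>& values) {
    rowPtr.assign(m.size1() + 1, 0);
    colIdx.clear();
    values.clear();

    for (SparseMatrix::const_iterator1 r = m.begin1(); r != m.end1(); ++r)
        for (SparseMatrix::const_iterator2 c = r.begin(); c != r.end(); ++c) {
            colIdx.push_back(static_cast<int>(c.index2()));
            values.push_back(*c);
            ++rowPtr[c.index1() + 1]; // count the entries of each row
        }

    // prefix sum turns the per-row counts into CSR row offsets
    for (std::size_t i = 0; i < m.size1(); ++i)
        rowPtr[i + 1] += rowPtr[i];
}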

SLIDE 12

Sparse Matrix Libraries for GPUs

[Figure: performance charts from NVIDIA (https://developer.nvidia.com/cuSPARSE).]

SLIDE 13

Sparse Matrix Libraries for GPUs

[Figure: performance charts from NVIDIA.]

Speed-ups are smaller than the reported "100x" for Monte-Carlo.

SLIDE 14

Example I: Heston-Hull-White Model on GPUs

The SDE is defined by:

$$
\begin{aligned}
dS_t &= (r_t - q_t)\,S_t\,dt + \sqrt{v_t}\,S_t\,dW_t^S \\
dv_t &= \kappa_v(\theta_v - v_t)\,dt + \sigma_v\sqrt{v_t}\,dW_t^v \\
dr_t &= \kappa_r(\theta_{r,t} - r_t)\,dt + \sigma_r\,dW_t^r \\
\rho_{Sv}\,dt &= dW_t^S\,dW_t^v \\
\rho_{Sr}\,dt &= dW_t^S\,dW_t^r \\
\rho_{vr}\,dt &= dW_t^v\,dW_t^r
\end{aligned}
$$

Feynman-Kac gives the corresponding PDE:

$$
\frac{\partial u}{\partial t} =
\frac{1}{2}S^2\nu\frac{\partial^2 u}{\partial S^2}
+ \frac{1}{2}\sigma_\nu^2\nu\frac{\partial^2 u}{\partial \nu^2}
+ \frac{1}{2}\sigma_r^2\frac{\partial^2 u}{\partial r^2}
+ \rho_{S\nu}\sigma_\nu S\nu\frac{\partial^2 u}{\partial S\,\partial \nu}
+ \rho_{Sr}\sigma_r S\sqrt{\nu}\frac{\partial^2 u}{\partial S\,\partial r}
+ \rho_{\nu r}\sigma_r\sigma_\nu\sqrt{\nu}\frac{\partial^2 u}{\partial \nu\,\partial r}
+ (r-q)S\frac{\partial u}{\partial S}
+ \kappa_v(\theta_v - \nu)\frac{\partial u}{\partial \nu}
+ \kappa_r(\theta_{r,t} - r)\frac{\partial u}{\partial r}
- ru
$$

SLIDE 15

Example I: Heston-Hull-White Model on GPUs

◮ Good news: QuantLib can build the sparse matrix.
◮ An operator splitting scheme needs to be ported to the GPU.

void HundsdorferScheme::step(array_type& a, Time t) {
    // predictor step with the full explicit operator
    Array y = a + dt_*map_->apply(a);
    Array y0 = y;

    // implicit correction, one direction at a time
    for (Size i=0; i < map_->size(); ++i) {
        Array rhs = y - theta_*dt_*map_->apply_direction(i, a);
        y = map_->solve_splitting(i, rhs, -theta_*dt_);
    }

    // second corrector sweep
    Array yt = y0 + mu_*dt_*map_->apply(y-a);
    for (Size i=0; i < map_->size(); ++i) {
        Array rhs = yt - theta_*dt_*map_->apply_direction(i, y);
        yt = map_->solve_splitting(i, rhs, -theta_*dt_);
    }
    a = yt;
}

SLIDE 16

Example I: Heston-Hull-White Model on GPUs

[Figure: Heston-Hull-White model, GTX 560 vs. Core i7; GPU-vs-CPU speed-up for grid sizes (t,x,v,r) from 20x50x20x10 up to 50x400x100x20, in single and double precision.]

Speed-ups are much smaller than for Monte-Carlo pricing.

SLIDE 17

Example II: Heston Model on GPUs

[Figure: Heston model, GTX 560 vs. Core i7; GPU-vs-CPU speed-up for grid sizes (t,x,v) from 50x200x100 up to 100x2000x1000, in single and double precision.]

Speed-ups are much smaller than for Monte-Carlo pricing.

SLIDE 18

Example III: Virtual Power Plant

The Kluge model (two OU processes plus jump diffusion) leads to a three-dimensional partial integro-differential equation:

$$
rV = \frac{\partial V}{\partial t}
+ \frac{\sigma_x^2}{2}\frac{\partial^2 V}{\partial x^2} - \alpha x\frac{\partial V}{\partial x}
- \beta y\frac{\partial V}{\partial y}
+ \frac{\sigma_u^2}{2}\frac{\partial^2 V}{\partial u^2} - \kappa u\frac{\partial V}{\partial u}
+ \rho\sigma_x\sigma_u\frac{\partial^2 V}{\partial x\,\partial u}
+ \lambda\int_{\mathbb{R}}\big(V(x, y+z, u, t) - V(x, y, u, t)\big)\,\omega(z)\,dz
$$

Due to the integro part, the discretized equation is no longer a truly sparse matrix.

SLIDE 19

Example III: Virtual Power Plant

[Figure: calculation times, GTX 560 @ 0.8/1.6 GHz vs. Core i5 @ 3.0 GHz, for grid sizes (x,y,u,s) from 10x10x10x6 up to 100x50x40x6; methods compared: GPU BiCGStab+Tridiag, GPU BiCGStab+nonsym Bridson, GPU BiCGStab, GPU BiCGStab+Diag, CPU Douglas scheme, CPU BiCGStab+Tridiag.]

SLIDE 20

Quasi Monte-Carlo on GPUs: Overview

◮ The Koksma-Hlawka bound is the basis for any QMC method:

$$
\left|\frac{1}{n}\sum_{i=1}^{n} f(x_i) - \int_{[0,1]^d} f(u)\,du\right|
\le V(f)\,D^*(x_1,\ldots,x_n),
\qquad D^*(x_1,\ldots,x_n) \ge c\,\frac{(\log n)^d}{n}
$$

◮ The real advantage of QMC shows up only after drawing $N \sim e^d$ samples, where $d$ is the dimensionality of the problem.
◮ Dimensional reduction of the problem is often the first step.
◮ The Brownian bridge is tailor-made to reduce the number of significant dimensions (see the sketch below).
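A small QuantLib sketch of this pipeline on the CPU (illustrative; the parameter values are arbitrary): Sobol draws are mapped to Gaussians and then rearranged by the Brownian bridge so that the leading, best-distributed Sobol dimensions carry most of the path's variance.

#include <ql/math/distributions/normaldistribution.hpp>
#include <ql/math/randomnumbers/inversecumulativersg.hpp>
#include <ql/math/randomnumbers/sobolrsg.hpp>
#include <ql/methods/montecarlo/brownianbridge.hpp>
#include <vector>

using namespace QuantLib;

int main() {
    const Size steps = 256;

    // Sobol sequence mapped to standard normal variates
    InverseCumulativeRsg<SobolRsg, InverseCumulativeNormal>
        gaussianRsg(SobolRsg(steps, 42, SobolRsg::JoeKuoD7));

    BrownianBridge bridge(steps);
    std::vector<Real> path(steps);

    for (Size i = 0; i < 1024; ++i) {
        const std::vector<Real>& z = gaussianRsg.nextSequence().value;
        // rearrange the Gaussian draws into Wiener path variations
        bridge.transform(z.begin(), z.end(), path.begin());
    }
    return 0;
}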

SLIDE 21

Quasi Monte-Carlo on GPUs: Arithmetic Option Example

SLIDE 22

Quasi Monte-Carlo on GPUs: Exotic Equity Options

Bernemann et al., "Accelerating Exotic Option Pricing and Model Calibration Using GPUs," in High Performance Computational Finance (WHPCF), 2010 IEEE Workshop on, pages 1-7, Nov. 2010.

SLIDE 23

Quasi Monte-Carlo on GPUs: QuantLib Implementation

◮ CUDA supports Sobol random numbers up to dimension 20,000.
◮ The direction integers are taken from the JoeKuoD7 set.
◮ On comparable hardware, CUDA Sobol generators are approx. 50 times faster than MKL.
◮ The weights and indices of the Brownian bridge are calculated by QuantLib (see the sketch below).
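A sketch of that division of labour: QuantLib precomputes the small bridge construction tables on the CPU; they can be copied to the device once, while the CUDA kernel that consumes them is omitted here.

#include <ql/methods/montecarlo/brownianbridge.hpp>
#include <vector>

using namespace QuantLib;

int main() {
    const Size steps = 1024;
    BrownianBridge bridge(steps);

    // construction tables, ready to be copied to the GPU once
    const std::vector<Size>& bridgeIndex = bridge.bridgeIndex();
    const std::vector<Size>& leftIndex   = bridge.leftIndex();
    const std::vector<Size>& rightIndex  = bridge.rightIndex();
    const std::vector<Real>& leftWeight  = bridge.leftWeight();
    const std::vector<Real>& rightWeight = bridge.rightWeight();
    const std::vector<Real>& stdDev      = bridge.stdDeviation();

    // e.g. cudaMemcpy(d_leftWeight, &leftWeight[0], ...) would go here
    return 0;
}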

SLIDE 24

Quasi Monte-Carlo on GPUs: Performance

[Figure: Sobol Brownian bridge, GPU-vs-CPU speed-up for path lengths from 10^1 to 10^4; series: single precision one factor, single precision four factors, double precision one factor. Comparison GPU (GTX 560 @ 0.8/1.6 GHz) vs. CPU (i5 @ 3.0 GHz).]

SLIDE 25

Quasi Monte-Carlo on GPUs: Scrambled Sobol Sequences

◮ In addition, CUDA supports scrambled Sobol sequences.
◮ Higher-order scrambled sequences are a variant of randomized QMC methods.
◮ They achieve better root mean square errors on smooth integrands.
◮ Error analysis is difficult: a shifted (t,m,d)-net does not need to be a (t,m,d)-net.

[Figure: RMSE for a benchmark portfolio of Asian options.]

SLIDE 26

Message Passing Interface (MPI): Overview

◮ The de-facto standard for massive parallel processing (MPP).
◮ MPI is a complementary standard to OpenMP or threading.
◮ Vendors provide high-performance/low-latency implementations.
◮ The roots of the MPI specification go back to the early 90s, and you will feel the age if you use the C API.
◮ Favour Boost.MPI over the original MPI C++ bindings!
◮ Boost.MPI can build MPI data types for user-defined types using the Boost.Serialization library (see the sketch below).
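A small Boost.MPI sketch of that last point (not from the talk): a user-defined result type is made transmittable via Boost.Serialization, computed per rank, and gathered on rank 0.

#include <boost/mpi.hpp>
#include <boost/serialization/string.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace mpi = boost::mpi;

// user-defined type made MPI-transmittable via Boost.Serialization
struct PricingResult {
    std::string instrument;
    double npv;

    template <class Archive>
    void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & instrument;
        ar & npv;
    }
};

int main(int argc, char* argv[]) {
    mpi::environment env(argc, argv);
    mpi::communicator world;

    // hypothetical per-rank workload
    PricingResult local = { "instrument", 42.0 + world.rank() };

    std::vector<PricingResult> all;
    mpi::gather(world, local, all, 0);

    if (world.rank() == 0)
        for (std::size_t i = 0; i < all.size(); ++i)
            std::cout << all[i].instrument << " on rank " << i
                      << ": " << all[i].npv << std::endl;
    return 0;
}

Built with the MPI compiler wrapper (e.g. mpic++) and started via mpirun, every rank contributes one result to the vector on rank 0.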

SLIDE 27

Message Passing Interface (MPI): Model Calibration

◮ Model calibration can be a very time-consuming task, e.g. the calibration of a Heston or a Heston-Hull-White model using American puts with discrete dividends → FDM pricing.
◮ Minimal approach: introduce an MPICalibrationHelper proxy, which "has a" CalibrationHelper.

class MPICalibrationHelper : public CalibrationHelper {
  public:
    MPICalibrationHelper(
        Integer mpiRankId,
        const Handle<Quote>& volatility,
        const Handle<YieldTermStructure>& termStructure,
        const boost::shared_ptr<CalibrationHelper>& helper);
    ....
  private:
    std::future<Real> modelValueF_;
    const boost::shared_ptr<boost::mpi::communicator> world_;
    ....
};

SLIDE 28

Message Passing Interface (MPI): Model Calibration

void MPICalibrationHelper::update() {
    if (world_->rank() == mpiRankId_) {
        // the owning rank prices its helper asynchronously
        modelValueF_ = std::async(std::launch::async,
            &CalibrationHelper::modelValue, helper_);
    }
    CalibrationHelper::update();
}

Real MPICalibrationHelper::modelValue() const {
    if (world_->rank() == mpiRankId_) {
        modelValue_ = modelValueF_.get();
    }
    // distribute the result from the owning rank to all other ranks
    boost::mpi::broadcast(*world_, modelValue_, mpiRankId_);
    return modelValue_;
}

int main(int argc, char* argv[]) {
    boost::mpi::environment env(argc, argv);
    ....
}

SLIDE 29

Message Passing Interface (MPI): Model Calibration

[Figure: parallel Heston-Hull-White calibration on 2x4 cores; speed-up (1 to 6) vs. number of processes (1 to 16).]

SLIDE 30

Conclusion

◮ Often a simple divide-and-conquer approach on the process level is sufficient to "parallelize" QuantLib.
◮ In a multi-threading environment the singleton and observer patterns need to be modified.
◮ Do not share QuantLib objects between different threads.
◮ There is a working solution for languages with a parallel garbage collector.
◮ The finite difference speed-up on GPUs is rather 10x than 100x.
◮ Scrambled Sobol sequences in conjunction with Brownian bridges improve the convergence rate on GPUs.
◮ Boost.MPI is a convenient library for utilising QuantLib on MPP systems.