 
GPU Technology Conference 2015, Silicon Valley
Big Data in Real Time: An Approach to Predictive Analytics for Alpha Generation and Risk Management
Yigal Jhirad and Blay Tarnoff
March 19, 2015
Table of Contents
I. State-Space Models: State Instantiation
— Portfolio and Risk Management: Big Data/Real Time Sensitivity
II. Parallel Approach/Predictive Analytics
— Cluster Analysis: Coherence, Correlation, Cointegration
— Evolutionary Optimization
III. Summary
IV. Author Biographies

DISCLAIMER: This presentation is for information purposes only. The presenter accepts no liability for the content of this presentation, or for the consequences of any actions taken on the basis of the information provided. Although the information in this presentation is considered to be accurate, this is not a representation that it is complete or should be relied upon as a sole resource, as the information contained herein is subject to change.
State-Space Models

Portfolio/Risk Management
— Financial markets display discrete dynamic micro states and regimes
  – Traditional valuation techniques may not be as effective, e.g. in options markets
  – Implied Volatility vs. Jump Diffusion Regime Shifts
— Big Data: Time Series Tick Data, Interday, and Intraday
— Predictive Analytics: Identifying Structural Breaks
— Alpha Generation, Risk Management, Market Impact

Parallel Processing: APL/CPU and CUDA/GPU physics-based modeling exploit hardware efficiently
— Array-based processing paradigm: a matrix/vector thought process is key
— APL is a programming language whose fundamental data object is an array, which suits parallel processing and can leverage parallelism across CPUs
— CUDA leverages GPU hardware

Application in Econometrics and Applied Mathematics
— Monte Carlo Simulations
— Fourier Analysis
— Principal Components
— Optimization
— Cluster Analysis
— Cointegration
Cluster Analysis: Correlation & Cointegration

Cluster Analysis: A multivariate technique designed to identify relationships and cohesion
— Factor Analysis, Risk Model Development

Correlation Analysis: Pairwise analysis of data across assets. Each pairwise comparison can be run in parallel.
— Use Correlation or Cointegration as primary input to cluster analysis
— Apply proprietary signal filter to remove selected data and reduce spurious correlations
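The slides do not name the clustering algorithm that consumes the correlation matrix. As one possible illustration only, the sketch below groups assets by single linkage: any two assets whose correlation reaches a cutoff land in the same cluster. The function name `cluster_by_correlation` and the `threshold` parameter are assumptions made for this example, not part of the presentation.

```python
def cluster_by_correlation(corr, threshold):
    """Group asset indices so that any pair with corr >= threshold shares a cluster.

    Hypothetical helper: single-linkage grouping via union-find, chosen only
    to show how a correlation matrix can feed a cluster analysis.
    """
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each pairwise comparison is independent of the others.
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] is not None and corr[i][j] >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


corr = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
groups = cluster_by_correlation(corr, threshold=0.8)
```

With the sample matrix, assets 0 and 1 form one cluster and asset 2 stands alone. A cointegration statistic could be substituted for the correlation input without changing the grouping step.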
Evolutionary Optimization

Asymptotic Multi-Phase Optimization
— Identify a target portfolio of stocks that is trending consistently over consecutive periods using specialized, possibly time-sensitive, optimization algorithms
— Establish a portfolio of stocks that is performing in a cohesive way

Identifying and Assessing factors driving outperformance
— Optimize a basket of factors to track target portfolio
— Look at factors such as Value vs. Growth, Large Cap vs. Small Cap

Optimized Factor Attribution of Targeted Portfolio

Factor                              Relative Ranking
Cash/Assets                         1
Capex/Assets                        2
Dividend Yield                      3
Dividend Growth (1 and 5 Year)      4
Market Cap (High - Low)             5
Dividend Payout                     6
ROIC                                7
E/P                                 8

Indicative factor attribution of target portfolio. Period: 2nd Half of 2013
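The optimization algorithm itself is not described in the slides. The sketch below is a generic stand-in, a simple evolution strategy that mutates a vector of factor weights and keeps a child only when it tracks the target portfolio more closely. All names (`tracking_error`, `evolve_weights`), the sample returns, and the parameter values are assumptions for illustration.

```python
import random


def tracking_error(weights, factor_returns, target_returns):
    """Sum of squared gaps between the weighted factor basket and the target."""
    error = 0.0
    for t, target in enumerate(target_returns):
        basket = sum(w * series[t] for w, series in zip(weights, factor_returns))
        error += (basket - target) ** 2
    return error


def evolve_weights(factor_returns, target_returns,
                   generations=500, children=20, sigma=0.1, seed=0):
    """Hypothetical evolution strategy: keep a mutated child only if it improves."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best = [0.0] * len(factor_returns)
    best_error = tracking_error(best, factor_returns, target_returns)
    for _ in range(generations):
        for _ in range(children):
            # Mutate every weight with Gaussian noise around the current best.
            child = [w + rng.gauss(0.0, sigma) for w in best]
            error = tracking_error(child, factor_returns, target_returns)
            if error < best_error:
                best, best_error = child, error
    return best, best_error


# Two made-up factor return series and a target built from them.
factors = [[0.01, -0.02, 0.03, 0.00], [0.02, 0.01, -0.01, 0.01]]
target = [0.7 * a + 0.3 * b for a, b in zip(*factors)]
start_error = tracking_error([0.0, 0.0], factors, target)
weights, final_error = evolve_weights(factors, target)
```

Ranking the factors by the magnitude of their fitted weights is one way a relative ranking like the one in the table could be produced, though the slides do not say how the ranking was derived.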
Cluster Analysis: Correlation

Application example: correlation function that removes N/A values

Correlation measures the direction and strength of a linear relationship between variables. The Pearson product moment correlation between two variables X and Y is calculated as:

$$ r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2 \, \sum_i (Y_i - \bar{Y})^2}} $$

For N assets there are $N(N-1)/2$ unique correlation pairs.

Given an N x M matrix A in which each row is a list of returns for a particular equity, return an N x N matrix R in which each element is the scalar result of correlating each row of A to every other
— Each element of A may be an N/A value
— When processing a pair of rows, the calculation must exclude every N/A value together with the corresponding element in the other row. This requires evaluating each pair separately.
— The resulting computational demand is better served by a parallel processing solution. As the matrix size increases, the benefits of parallel processing become more significant.
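As a serial reference for the problem stated above, the following sketch computes the N x N matrix with pairwise N/A removal. It is not the GPU kernel. It assumes N/A is represented by `None`, and the function names are chosen for this example.

```python
import math


def unique_pairs(n):
    """Number of distinct unordered asset pairs among n assets."""
    return n * (n - 1) // 2


def pairwise_correlation(x, y):
    """Pearson correlation of two rows, dropping positions where either is N/A."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None  # too few valid observations to correlate
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sum_xx = sum((a - mean_x) ** 2 for a, _ in pairs)
    sum_yy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sum_xx == 0.0 or sum_yy == 0.0:
        return None  # a constant row has no defined correlation
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def correlation_matrix(a):
    """N x N matrix R; every pair is evaluated separately because N/As differ."""
    n = len(a)
    r = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            r[i][j] = r[j][i] = pairwise_correlation(a[i], a[j])
    return r


A = [
    [1.0, 2.0, None, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, None, 2.0, 1.0],
]
R = correlation_matrix(A)
```

Rows 0 and 2 share only two valid positions, so their correlation is computed from those two observations alone. Because the set of valid positions changes from pair to pair, the means cannot be computed once per row and reused, which is why each pair is evaluated on its own.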
Cluster Analysis: Configuration

Tesla K20c Hardware Constraints
— Compute capability: 3.5
— Warp size: 32
— Number of shared memory banks: 32
— Coalescence capacity (4-byte words): 128 bytes = 32 contiguous words
— Max blocks / warps / threads per multiprocessor: 16 / 64 / 2048
— Max registers per multiprocessor: 64K
— Max shared memory per multiprocessor: 48K

Software Constraints/Response
— Run 32 independent correlations per block for output coalescence
— Read 32 prices/returns per correlation for input coalescence
— Registers used per thread: 33 => max 15 blocks per SM @ 4 warps per block
— Shared memory used per block: 256 + 384 per warp => max 16 blocks per SM @ 4 warps per block
— Thread block configuration: 4 x 32 => 15 blocks / 60 warps / 1920 threads per SM
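The occupancy figures above can be checked with simple integer arithmetic. The sketch below assumes the shared memory figures are in bytes and ignores register allocation granularity, so it is a rough check and not a substitute for the CUDA occupancy tools. The function name `blocks_per_sm` is an assumption for this example.

```python
def blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block,
                  max_blocks=16, max_warps=64, max_threads=2048,
                  max_regs=64 * 1024, max_smem=48 * 1024, warp_size=32):
    """Resident blocks per multiprocessor under each limit, and the binding minimum.

    Defaults are the K20c limits listed on the slide; units for shared
    memory are assumed to be bytes.
    """
    warps_per_block = threads_per_block // warp_size
    limits = {
        "blocks": max_blocks,
        "warps": max_warps // warps_per_block,
        "threads": max_threads // threads_per_block,
        "registers": max_regs // (regs_per_thread * threads_per_block),
        "shared memory": max_smem // smem_per_block,
    }
    return min(limits.values()), limits


# 4 warps x 32 threads, 33 registers per thread, 256 + 384 per warp of shared memory.
blocks, limits = blocks_per_sm(threads_per_block=128,
                               regs_per_thread=33,
                               smem_per_block=256 + 384 * 4)
warps = blocks * 4
threads = blocks * 128
```

Registers are the binding limit: 33 x 128 = 4224 registers per block, and 65536 // 4224 = 15. Shared memory alone would allow more blocks than the hardware cap of 16, which matches the "max 16 blocks per SM" entry.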
Cluster Analysis: Kernel

Operation
— Each pair of rows produces a single correlation

[Figure: Input matrix and Output matrix]
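Cluster Analysis: Kernel

Correlation computation

$$ r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$

Two passes through prices/returns in global memory required

First pass:
• Compute means of X and Y: $\bar{X}$, $\bar{Y}$

Second pass:
• Subtract means from X and Y: $X' = X - \bar{X}$, $Y' = Y - \bar{Y}$
• Sum $X' \times Y'$, $X' \times X'$ and $Y' \times Y'$: $\sum X'Y'$, $\sum X'^2$, $\sum Y'^2$
• Multiply resultant $\sum X'Y'$ by the reciprocal square roots of $\sum X'^2$ and $\sum Y'^2$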
Cluster Analysis: Kernel

First Pass: Computation of means (row 33, col 0 of block grid)
— Repeat: read next section of Prices/Returns into registers
  – Ns necessary for N/A handling: "0" where either Xs or Ys is N/A; "1" where Xs and Ys are both valid
— Repeat: sum values in registers into shared memory
— Sum rows to obtain totals for each row of Prices/Returns
— Save totals to shared memory
— Divide totals by N
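The first-pass steps can be mimicked serially. The sketch below is plain Python, not the CUDA kernel: it walks two rows in sections of 32 values, keeps one running partial sum per lane, counts valid pairs in N, and then reduces across lanes. Using `None` for N/A and the name `first_pass_means` are assumptions for this example.

```python
WARP = 32  # section width, matching the 32-value reads described in the slides


def first_pass_means(x, y):
    """Means of x and y over positions where both are valid, plus the count N."""
    sum_x = [0.0] * WARP  # one running partial sum per lane
    sum_y = [0.0] * WARP
    count = [0] * WARP    # running N per lane: adds 1 for valid pairs, 0 otherwise

    # Repeat: read the next section and accumulate it into the per-lane sums.
    for start in range(0, len(x), WARP):
        section = zip(x[start:start + WARP], y[start:start + WARP])
        for lane, (a, b) in enumerate(section):
            valid = 1 if (a is not None and b is not None) else 0
            if valid:
                sum_x[lane] += a
                sum_y[lane] += b
            count[lane] += valid

    # Reduce across lanes to get the row totals, then divide by N.
    n = sum(count)
    if n == 0:
        return None
    return sum(sum_x) / n, sum(sum_y) / n, n


x = [float(i) for i in range(40)]
y = [2.0 * i for i in range(40)]
x[5] = None   # N/A in x removes position 5 from both rows
y[35] = None  # N/A in y removes position 35 from both rows
mean_x, mean_y, n = first_pass_means(x, y)
```

Because N counts only positions where both rows are valid, the means belong to the pair and not to either row on its own.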
Cluster Analysis: Kernel

Second Pass: Computation of correlations
— Read section of Prices/Returns into registers (reuse registers from first pass)
  – Ns necessary for N/A handling: "0" where either Xs or Ys is N/A; "1" where Xs and Ys are both valid
— Subtract means of X and Y, giving $X' = X - \bar{X}$ and $Y' = Y - \bar{Y}$
— Multiply and sum (reuse X, Y, N from first pass)
— Sum rows to obtain totals of the terms for each row of Prices/Returns: $\sum X'Y'$, $\sum X'^2$, $\sum Y'^2$
— Multiply $\sum X'Y'$ by rsqrt($\sum X'^2$) and rsqrt($\sum Y'^2$) to obtain the correlation
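A serial sketch of the second pass follows, again in plain Python and not as the CUDA kernel. It takes the pair's means as inputs, accumulates the three sums over valid positions, and combines them with reciprocal square roots. `None` as the N/A marker and the name `second_pass_correlation` are assumptions for this example.

```python
import math


def second_pass_correlation(x, y, mean_x, mean_y):
    """Correlation of x and y given means computed over their jointly valid positions."""
    sum_xy = 0.0
    sum_xx = 0.0
    sum_yy = 0.0
    for a, b in zip(x, y):
        if a is None or b is None:
            continue  # an N/A in either row drops the position from both
        dx = a - mean_x  # subtract means
        dy = b - mean_y
        sum_xy += dx * dy  # multiply and sum
        sum_xx += dx * dx
        sum_yy += dy * dy
    if sum_xx == 0.0 or sum_yy == 0.0:
        return None  # a constant row has no defined correlation
    # Multiply by reciprocal square roots, as a GPU would with rsqrt.
    return sum_xy * (1.0 / math.sqrt(sum_xx)) * (1.0 / math.sqrt(sum_yy))


x = [1.0, 2.0, None, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
# Means over the three jointly valid positions (0, 1 and 3).
r = second_pass_correlation(x, y, mean_x=7.0 / 3.0, mean_y=14.0 / 3.0)
r_neg = second_pass_correlation([1.0, 2.0, 3.0], [3.0, 2.0, 1.0], 2.0, 2.0)
```

Multiplying by two reciprocal square roots avoids a division and a separate square root of the product.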
Cluster Analysis: Kernel

Output to global memory: row 33, col 0 of block grid

[Figure: Assets x Assets output matrix]
Cluster Analysis: Big Data, Real Time

Host/Global Memory
— Pinned-mapped memory eliminates transfer complexity, reduces overhead
— Kernel processing time approximately 0.1 seconds for 586 assets × 120 periods

[Figure: Input and Output matrices in host/global memory]
Cluster Analysis: Big Data, Real Time

Overview
— Data source constantly streaming tick data
— Simple routine periodically aggregates ticks into VWAPs or returns, producing an array of prices, one per ticker
— VWAPs loaded into one column of input Prices/Returns matrix in host memory at every interval
— Aggregation by period enables arbitrary real time speed, limited by speed of GPU kernel algorithm

[Figure: Streaming data source → Routine to periodically aggregate ticks into VWAPs/returns → GPU → To target application]
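The aggregation step can be sketched as follows. The tick layout `(ticker, price, volume)`, the function names, and the use of `None` for a ticker with no trades in the interval are assumptions for this example. The slides do not describe the routine beyond calling it simple.

```python
def vwap_column(ticks, tickers):
    """Volume-weighted average price per ticker for one interval.

    ticks: iterable of (ticker, price, volume) tuples (hypothetical layout).
    Returns one value per ticker in `tickers` order; None if nothing traded.
    """
    notional = {t: 0.0 for t in tickers}
    volume = {t: 0.0 for t in tickers}
    for ticker, price, size in ticks:
        if ticker in notional:
            notional[ticker] += price * size
            volume[ticker] += size
    return [notional[t] / volume[t] if volume[t] > 0 else None for t in tickers]


def append_column(matrix, column):
    """Load one interval's values as a new column of the Prices/Returns matrix."""
    for row, value in zip(matrix, column):
        row.append(value)


tickers = ["AAA", "BBB", "CCC"]
ticks = [("AAA", 10.0, 100), ("AAA", 11.0, 300), ("BBB", 20.0, 50)]
prices = [[] for _ in tickers]  # one row per ticker
append_column(prices, vwap_column(ticks, tickers))
```

A ticker with no trades in an interval yields an N/A entry, which is the case the correlation kernel's N/A handling is built for.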
Summary

State-Space model Instantiation using Intraday Time Series data for Predictive Analytics

Our approach is based on Time Series Cluster Analysis and Evolutionary Optimization
— Identifying Structural Breaks, Patterns, and Signals
— Coherency and Group membership

Applications in Portfolio Management for Alpha Generation and Risk Management