



slide-1
SLIDE 1

Big Data in Real Time: An Approach to Predictive Analytics for Alpha Generation and Risk Management

Yigal Jhirad and Blay Tarnoff
March 19, 2015

GPU Technology Conference 2015 Silicon Valley

slide-2
SLIDE 2

Table of Contents

  • I. State-Space Models: State Instantiation
    - Portfolio and Risk Management: Big Data/Real Time Sensitivity
  • II. Parallel Approach/Predictive Analytics
    - Cluster Analysis: Coherence, Correlation, Cointegration
    - Evolutionary Optimization
  • III. Summary
  • IV. Author Biographies

DISCLAIMER: This presentation is for information purposes only. The presenter accepts no liability for the content of this presentation, or for the consequences of any actions taken on the basis of the information provided. Although the information in this presentation is considered to be accurate, this is not a representation that it is complete or should be relied upon as a sole resource, as the information contained herein is subject to change.
slide-3
SLIDE 3

State-Space Models

  • Portfolio/Risk Management
    - Financial markets display discrete dynamic micro states and regimes
      - Traditional valuation techniques may not be as effective, e.g. options markets
      - Implied Volatility vs. Jump Diffusion Regime Shifts
    - Big Data: Time Series Tick Data, Interday, and Intraday
    - Predictive Analytics: Identifying Structural Breaks
    - Alpha Generation, Risk Management, Market Impact
  • Parallel Processing: APL/CPU and CUDA/GPU physics-based modeling exploit hardware efficiently
    - Array-based processing paradigm: a matrix/vector thought process is key
    - APL is a programming language whose basic data object is an array, which is fundamental to parallel processing and lets it exploit parallelism across CPUs
    - CUDA leverages GPU hardware
  • Applications in Econometrics and Applied Mathematics
    - Monte Carlo Simulations
    - Fourier Analysis
    - Principal Components
    - Optimization
    - Cluster Analysis
    - Cointegration
slide-4
SLIDE 4

Cluster Analysis: Correlation & Cointegration

  • Cluster Analysis: a multivariate technique designed to identify relationships and cohesion
    - Factor Analysis, Risk Model Development
  • Correlation Analysis: pairwise analysis of data across assets. Each pairwise comparison can be run in parallel.
    - Use Correlation or Cointegration as primary input to cluster analysis
    - Apply proprietary signal filter to remove selected data and reduce spurious correlations
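
One way to turn a correlation matrix into clusters is sketched below. The slides do not name the clustering algorithm, so the single-linkage threshold rule, the `threshold` value, and the name `cluster_by_correlation` are assumptions chosen only for illustration.

```python
def cluster_by_correlation(corr, threshold=0.7):
    """Group assets whose pairwise correlation is at least `threshold`.

    Single-linkage grouping via union-find (an assumed rule, not the
    presenters' method). `corr` is a symmetric N x N list of lists.
    """
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Visit each unique pair (i, j) with i < j exactly once.
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical correlation matrix for four assets.
corr = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
clusters = cluster_by_correlation(corr)
```

Here assets 0 and 1 form one group and assets 2 and 3 another. A cointegration statistic could be substituted for the correlation values without changing the grouping logic.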
slide-5
SLIDE 5

Evolutionary Optimization

  • Asymptotic Multi-Phase Optimization
    - Identify a target portfolio of stocks that is trending consistently over consecutive periods using specialized, possibly time-sensitive, optimization algorithms
    - Establish a portfolio of stocks that is performing in a cohesive way
  • Identifying and assessing factors driving outperformance
    - Optimize a basket of factors to track target portfolio
    - Look at factors such as Value vs. Growth, Large Cap vs. Small Cap

Optimized Factor Attribution of Targeted Portfolio    Relative Ranking
Cash/Assets                                           1
Capex/Assets                                          2
Dividend Yield                                        3
Dividend Growth (1 and 5 Year)                        4
Market Cap (High - Low)                               5
Dividend Payout                                       6
ROIC                                                  7
E/P                                                   8

Indicative factor attribution of target portfolio. Period: 2nd Half of 2013
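
The slides do not disclose the optimization algorithm itself. As a rough illustration of "optimize a basket of factors to track target portfolio", the sketch below uses a generic (1 + lambda) evolution strategy, which is a stand-in and not the presenters' method. The data, the squared-error objective, and all names and parameters (`tracking_error`, `evolve_weights`, `step`, `offspring`) are hypothetical.

```python
import random


def tracking_error(weights, factor_returns, target_returns):
    """Mean squared gap between the weighted factor basket and the target."""
    total = 0.0
    for period, target in enumerate(target_returns):
        basket = sum(w * f[period] for w, f in zip(weights, factor_returns))
        total += (basket - target) ** 2
    return total / len(target_returns)


def evolve_weights(factor_returns, target_returns, generations=200,
                   offspring=20, step=0.05, seed=0):
    """(1 + lambda) evolution strategy: keep the parent unless a child beats it."""
    rng = random.Random(seed)
    best = [0.0] * len(factor_returns)
    best_err = tracking_error(best, factor_returns, target_returns)
    for _ in range(generations):
        # Mutate the current parent with Gaussian noise to create children.
        children = [[w + rng.gauss(0.0, step) for w in best]
                    for _ in range(offspring)]
        errors = [tracking_error(c, factor_returns, target_returns)
                  for c in children]
        i = min(range(offspring), key=errors.__getitem__)
        if errors[i] < best_err:
            best, best_err = children[i], errors[i]
    return best, best_err


# Hypothetical data: two factor return series and a target built from them.
factors = [[0.01, -0.02, 0.03, 0.00], [0.02, 0.01, -0.01, 0.04]]
target = [0.7 * a + 0.3 * b for a, b in zip(*factors)]
start_err = tracking_error([0.0, 0.0], factors, target)
weights, final_err = evolve_weights(factors, target)
```

Because the parent is replaced only by a strictly better child, the tracking error never increases from one generation to the next. Each child's error can be evaluated independently, which is what makes this kind of search a candidate for parallel hardware.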
slide-6
SLIDE 6

Cluster Analysis: Correlation

Application example: correlation function that removes N/A values

  • Correlation measures the direction and strength of a linear relationship between variables. The Pearson product-moment correlation between two variables X and Y is calculated as:

    $$ r_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$

  • For N assets there are $N(N-1)/2$ unique correlation pairs
  • Given an N x M matrix A in which each row is a list of returns for a particular equity, return an N x N matrix R in which each element is the scalar result of correlating each row of A to every other
    - Each element of A may be an N/A value
    - When processing a pair of rows, the calculation must exclude each N/A value and the corresponding element in the other row. This requires evaluating each pair separately.
    - The resulting increase in computational demand is handled more effectively by a parallel processing solution, and the benefit grows as the matrix size increases.
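
A plain reference version of this specification, written serially, might look like the sketch below. It assumes N/A is represented by `None` and that a pair with fewer than two shared observations, or with a constant series, yields N/A; neither choice is stated in the slides, and the function names are illustrative.

```python
import math

NA = None  # assumed marker for a missing return


def pair_correlation(x, y):
    """Pearson correlation over positions where both x and y are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not NA and b is not NA]
    n = len(pairs)
    if n < 2:
        return NA
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0.0 or syy == 0.0:
        return NA  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(a):
    """Correlate every row of `a` with every other row (N x N result)."""
    n = len(a)
    r = [[NA] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Each pair gets its own mask, so means cannot be shared across pairs.
            r[i][j] = r[j][i] = pair_correlation(a[i], a[j])
    return r


returns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, NA, 8.0],
    [4.0, NA, 2.0, 1.0],
]
R = correlation_matrix(returns)
unique_pairs = len(returns) * (len(returns) - 1) // 2
```

Row 0 is correlated with row 1 over positions 0, 1 and 3, but with row 2 over positions 0, 2 and 3, so its mean differs from pair to pair. That is why the per-row statistics cannot be computed once and reused.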
slide-7
SLIDE 7

Cluster Analysis: Configuration

  • Hardware Constraints (Tesla K20c)
    - Compute capability: 3.5
    - Warp size: 32
    - Number of shared memory banks: 32
    - Coalescence capacity (4-byte words): 128 bytes = 32 contiguous words
    - Max blocks / warps / threads per multiprocessor: 16 / 64 / 2048
    - Max registers per multiprocessor: 64K
    - Max shared memory per multiprocessor: 48K
  • Software Constraints/Response
    - Run 32 independent correlations per block for output coalescence
    - Read 32 prices/returns per correlation for input coalescence
    - Registers used per thread: 33 => max 15 blocks per SM @ 4 warps per block
    - Shared memory used per block: 256 + 384 per warp => max 16 blocks per SM @ 4 warps per block
    - Thread block configuration: 4 x 32 => 15 blocks / 60 warps / 1920 threads per SM
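
The occupancy figures above can be checked with a few lines of arithmetic. This is a simplified sketch: it assumes the shared-memory figures are in bytes and ignores the hardware's register allocation granularity, so it is not a substitute for an occupancy calculator.

```python
# Limits quoted on the slide for compute capability 3.5.
WARP_SIZE = 32
MAX_BLOCKS_PER_SM = 16
MAX_WARPS_PER_SM = 64
MAX_REGISTERS_PER_SM = 64 * 1024
MAX_SHARED_BYTES_PER_SM = 48 * 1024

# Kernel figures quoted on the slide.
WARPS_PER_BLOCK = 4
REGISTERS_PER_THREAD = 33
SHARED_BYTES_PER_BLOCK = 256 + 384 * WARPS_PER_BLOCK  # assumed to be bytes

threads_per_block = WARPS_PER_BLOCK * WARP_SIZE

# Blocks that fit under each resource limit, taken one at a time.
blocks_by_registers = MAX_REGISTERS_PER_SM // (REGISTERS_PER_THREAD * threads_per_block)
blocks_by_shared = min(MAX_BLOCKS_PER_SM,
                       MAX_SHARED_BYTES_PER_SM // SHARED_BYTES_PER_BLOCK)
blocks_by_warps = MAX_WARPS_PER_SM // WARPS_PER_BLOCK

# The tightest limit decides how many blocks are resident per multiprocessor.
blocks_per_sm = min(MAX_BLOCKS_PER_SM, blocks_by_registers,
                    blocks_by_shared, blocks_by_warps)
warps_per_sm = blocks_per_sm * WARPS_PER_BLOCK
threads_per_sm = warps_per_sm * WARP_SIZE
```

Registers are the binding constraint: 33 registers across 128 threads is 4,224 per block, and 15 such blocks fit in 64K registers while 16 do not.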
slide-8
SLIDE 8

Cluster Analysis: Kernel Operation

Each pair of rows produces a single correlation

[Figure: kernel operation, panels labeled Input and Output]
slide-9
SLIDE 9

Cluster Analysis: Kernel

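Correlation computation

$$ \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$

Two passes through prices/returns in global memory required

First pass:
  • Compute means of X and Y: $\bar{X}$, $\bar{Y}$
Second pass:
  • Subtract means from X and Y: $X' = X - \bar{X}$, $Y' = Y - \bar{Y}$
  • Sum $X' \times Y'$, $X' \times X'$ and $Y' \times Y'$: $\sum X'Y'$, $\sum X'^2$, $\sum Y'^2$
  • Multiply resultant $\sum X'Y'$ by reciprocal square roots of $\sum X'^2$ and $\sum Y'^2$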
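
With N/A handling, every sum runs only over positions where both series are valid. Writing $n_i$ for the 0/1 validity indicator (an assumed notation, intended to match the "Ns" mask used in the kernel slides) and assuming the means are divided by the count of valid positions, the two passes can be written as:

```latex
\begin{aligned}
\text{First pass:}\quad
  & \bar{X} = \frac{\sum_i n_i X_i}{\sum_i n_i}, \qquad
    \bar{Y} = \frac{\sum_i n_i Y_i}{\sum_i n_i} \\
\text{Second pass:}\quad
  & S_{XY} = \sum_i n_i (X_i - \bar{X})(Y_i - \bar{Y}), \quad
    S_{XX} = \sum_i n_i (X_i - \bar{X})^2, \quad
    S_{YY} = \sum_i n_i (Y_i - \bar{Y})^2 \\
\text{Result:}\quad
  & r_{XY} = S_{XY} \cdot \frac{1}{\sqrt{S_{XX}}} \cdot \frac{1}{\sqrt{S_{YY}}}
\end{aligned}
```

The second pass cannot start until the first has finished, because every term in it depends on the means.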
slide-10
SLIDE 10

Cluster Analysis: Kernel First Pass

Computation of means (row 33, col 0 of block grid)
  • Repeat: read next section of Prices/Returns into registers
  • Ns necessary for NA handling: "0" where either Xs or Ys is NA; "1" where Xs and Ys are both valid
slide-11
SLIDE 11

Cluster Analysis: Kernel First Pass

Computation of means (row 33, col 0 of block grid)
  • Repeat: sum values in registers into shared memory

[Figure: register values added (+) into shared-memory totals]

slide-12
SLIDE 12

Cluster Analysis: Kernel First Pass

Computation of means (row 33, col 0 of block grid)
  • Sum rows to obtain totals for each row of Prices/Returns
  • Save totals to shared memory
  • Divide totals by N

[Figure: row totals divided (÷) by N; axis labeled Assets]
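
The first pass can be emulated serially as below. This is a sketch under assumptions: `None` stands for N/A, the per-lane lists stand in for the shared-memory accumulators, and the divisor is taken to be the count of valid positions. The function name and the small `section` size in the example are illustrative only.

```python
SECTION = 32  # one warp-wide read, matching the 32-value coalesced load


def first_pass_means(x, y, section=SECTION):
    """Return (mean_x, mean_y, count) over positions valid in both rows."""
    # Per-lane partial sums stand in for the shared-memory accumulators.
    sum_x = [0.0] * section
    sum_y = [0.0] * section
    sum_n = [0] * section
    for start in range(0, len(x), section):
        xs = x[start:start + section]
        ys = y[start:start + section]
        for lane, (a, b) in enumerate(zip(xs, ys)):
            valid = a is not None and b is not None
            # N mask: 1 where both values are present, 0 otherwise.
            sum_n[lane] += 1 if valid else 0
            sum_x[lane] += a if valid else 0.0
            sum_y[lane] += b if valid else 0.0
    # Reduce the lanes to row totals, then divide by the valid count.
    total_n = sum(sum_n)
    if total_n == 0:
        return None, None, 0
    return sum(sum_x) / total_n, sum(sum_y) / total_n, total_n


x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.0, None, 6.0, 8.0, 10.0]
mean_x, mean_y, count = first_pass_means(x, y, section=2)
```

Only positions 0, 3 and 4 are valid in both rows, so the means are taken over three values each.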

slide-13
SLIDE 13

Cluster Analysis: Kernel Second Pass

Computation of correlations
  • Read section of Prices/Returns into registers (reuse registers from first pass)
  • Ns necessary for NA handling: "0" where either Xs or Ys is NA; "1" where Xs and Ys are both valid
slide-14
SLIDE 14

Cluster Analysis: Kernel Second Pass

Computation of correlations
  • Subtract means of X and Y

[Figure: means subtracted (−) from X and Y]

slide-15
SLIDE 15

Cluster Analysis: Kernel Second Pass

Computation of correlations
  • Multiply and sum (reuse X, Y, N from first pass)

[Figure: products (×) accumulated (+) into running sums]

slide-16
SLIDE 16

Cluster Analysis: Kernel Second Pass

Computation of correlations
  • Multiply and sum (reuse X, Y, N from first pass)

[Figure: product (×) accumulated (+) into running sum]

slide-17
SLIDE 17

Cluster Analysis: Kernel Second Pass

Computation of correlations
  • Sum rows to obtain totals of terms for each row of Prices/Returns:
    $\sum (X - \bar{X})(Y - \bar{Y})$, $\sum (X - \bar{X})^2$, $\sum (Y - \bar{Y})^2$
slide-18
SLIDE 18

Cluster Analysis: Kernel Second Pass

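Computation of correlations

rsqrt(X2) × rsqrt(Y2) × XY, where XY = $\sum (X - \bar{X})(Y - \bar{Y})$, X2 = $\sum (X - \bar{X})^2$ and Y2 = $\sum (Y - \bar{Y})^2$

[Figure: axis labeled Assets]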
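
The second pass for a single pair of rows can be emulated with mask arithmetic, as in the sketch below. It assumes `None` stands for N/A and that invalid positions are zeroed by multiplying with the 0/1 mask; the function name is illustrative and the serial loop only mirrors the order of operations, not the GPU layout.

```python
import math


def second_pass_correlation(x, y, mean_x, mean_y):
    """Mask-multiply form of the second pass for one pair of rows."""
    sum_xy = sum_xx = sum_yy = 0.0
    for a, b in zip(x, y):
        valid = a is not None and b is not None
        n = 1.0 if valid else 0.0
        # Subtract the means, then zero out invalid positions with the mask.
        dx = ((a if valid else 0.0) - mean_x) * n
        dy = ((b if valid else 0.0) - mean_y) * n
        sum_xy += dx * dy
        sum_xx += dx * dx
        sum_yy += dy * dy
    if sum_xx == 0.0 or sum_yy == 0.0:
        return None  # undefined for a constant series
    # rsqrt(X2) * rsqrt(Y2) * XY
    return (1.0 / math.sqrt(sum_xx)) * (1.0 / math.sqrt(sum_yy)) * sum_xy


x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.0, None, 6.0, 8.0, 10.0]
# Means over the positions valid in both rows (the first-pass result).
valid = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
mean_x = sum(a for a, _ in valid) / len(valid)
mean_y = sum(b for _, b in valid) / len(valid)
r = second_pass_correlation(x, y, mean_x, mean_y)
```

On the valid positions y is exactly twice x, so the result is a correlation of 1 up to floating-point rounding.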
slide-19
SLIDE 19

Cluster Analysis: Kernel

Output to global memory: row 33, col 0 of block grid

[Figure: output matrix, axes labeled Assets]
slide-20
SLIDE 20

Cluster Analysis: Big Data, Real Time

Host/Global Memory
  • Pinned-mapped memory eliminates transfer complexity, reduces overhead
  • Kernel processing speed approximately 0.1 seconds for 586 assets × 120 periods

[Figure: panels labeled Input and Output]
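
To put the quoted timing in perspective, the arithmetic below converts it into work per second. It assumes only the unique pairs are counted; the slides do not say whether the kernel computes the full N x N matrix or just one triangle, so the rate is indicative at best.

```python
assets = 586
periods = 120
kernel_seconds = 0.1  # approximate figure quoted on the slide

# Unique pairs among the assets: N(N-1)/2.
unique_pairs = assets * (assets - 1) // 2
# Element pairs visited in each of the two passes.
element_pairs_per_pass = unique_pairs * periods
# Rough throughput, assuming one correlation per unique pair.
pairs_per_second = unique_pairs / kernel_seconds
```

This gives 171,405 unique pairs and about 20.6 million element pairs per pass, or roughly 1.7 million correlations per second at the quoted speed.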
slide-21
SLIDE 21

Cluster Analysis: Big Data, Real Time

Overview
  • Data source constantly streaming tick data
  • Simple routine periodically aggregates ticks into VWAPs or returns, producing an array of prices, one per ticker
  • VWAPs/returns loaded into one column of input Prices/Returns matrix in host memory at every interval
  • Aggregation by period enables arbitrary real time speed, limited by speed of GPU kernel algorithm

[Figure: Streaming data source → Routine to periodically aggregate ticks into VWAPs → GPU → To target application]
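
The aggregation routine could be as simple as the sketch below. The tick layout `(ticker, price, volume)`, the ticker symbols, and the choice to record N/A (`None`) when a ticker has no trades in the interval are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_vwaps(ticks, tickers):
    """Collapse one interval of (ticker, price, volume) ticks into VWAPs.

    Returns one value per ticker, or None when no volume traded.
    """
    notional = defaultdict(float)
    volume = defaultdict(float)
    for ticker, price, size in ticks:
        notional[ticker] += price * size
        volume[ticker] += size
    return [notional[t] / volume[t] if volume[t] > 0 else None
            for t in tickers]


def append_column(matrix, column):
    """Append one interval's values as a new column of the row-per-ticker matrix."""
    for row, value in zip(matrix, column):
        row.append(value)


tickers = ["AAA", "BBB", "CCC"]   # hypothetical symbols
prices = [[] for _ in tickers]    # Prices matrix, one row per ticker
interval_ticks = [("AAA", 10.0, 100), ("AAA", 11.0, 300), ("BBB", 20.0, 50)]
append_column(prices, aggregate_vwaps(interval_ticks, tickers))
```

After one interval the matrix holds a VWAP of 10.75 for AAA, 20.0 for BBB, and N/A for CCC. The N/A entries are what the correlation kernel's mask then has to exclude.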
slide-22
SLIDE 22

Summary

  • State-Space model instantiation using intraday time series data for Predictive Analytics
  • Our approach is based on Time Series Cluster Analysis and Evolutionary Optimization
    - Identifying Structural Breaks, Patterns, and Signals
    - Coherency and Group membership
  • Applications in Portfolio Management for Alpha Generation and Risk Management
slide-23
SLIDE 23
slide-24
SLIDE 24
slide-25
SLIDE 25
slide-26
SLIDE 26
slide-27
SLIDE 27

Author Biographies

  • Yigal D. Jhirad, Senior Vice President, is Director of Quantitative and Derivatives Strategies and a Portfolio Manager for Cohen & Steers' options and real assets strategies. Mr. Jhirad heads the firm's Investment Risk Committee. Prior to joining the firm in 2007, Mr. Jhirad was an Executive Director in the institutional equities division of Morgan Stanley, where he headed the company's portfolio and derivatives strategies effort. He was responsible for developing, implementing, and marketing quantitative and derivatives products to a broad array of institutional clients, including hedge funds, active and passive funds, pension funds and endowments. Mr. Jhirad holds a BS in Economics from the Wharton School of the University of Pennsylvania. He is a Financial Risk Manager (FRM), certified by the Global Association of Risk Professionals.
  • Blay A. Tarnoff is a senior applications developer and database architect. He specializes in array programming and database design and development. He has developed equity and derivatives applications for program trading, proprietary trading, quantitative strategy, and risk management. Mr. Tarnoff holds an AB in Mathematics from Brown University. He is currently a consultant at Cohen & Steers and was previously at Morgan Stanley.

Contact: yjhirad@yahoo.com, blay@eblay.com