

SLIDE 1

Dive Deeper in Finance

GTC 2017 – San José – California

Daniel Egloff

  • Dr. sc. math.

Managing Director, QuantAlea

May 7, 2017

SLIDE 2

Today

▪ Generative models for financial time series

– Sequential latent Gaussian Variational Autoencoder

▪ Implementation in TensorFlow

– Recurrent variational inference using TF control flow operations

▪ Applications to FX data

– 1s to 10s OHLC aggregated data
– Event-based models for tick data are work in progress

SLIDE 3

Generative Models and GPUs

▪ What I cannot create, I do not understand (Richard Feynman)
▪ Generative models are a recent innovation in Deep Learning

– GANs – Generative adversarial networks
– VAEs – Variational autoencoders

▪ Training is computationally demanding

– Exploratory modelling is not possible without GPUs

SLIDE 4

Deep Learning

▪ Deep Learning in finance is complementary to existing models, not a replacement

▪ Deep Learning benefits

– Richer functional relationships between explanatory and response variables
– Models complicated interactions
– Automatic feature discovery
– Capable of handling large amounts of data
– Standard training procedures with backpropagation and SGD
– Frameworks and tooling

SLIDE 5

Latent Variable – Encoding/Decoding

▪ The latent variable $z$ can be thought of as an encoded representation of $x$
▪ The likelihood $p(x \mid z)$ serves as the decoder
▪ The posterior $p(z \mid x)$ provides the encoder

[Diagram: $x$ → Encoder $p(z \mid x)$ → $z$ → Decoder $p(x \mid z)$ → $x$, with prior $p(z)$]

SLIDE 6

Intractable Maximum Likelihood

▪ Maximum likelihood is the standard model-fitting approach

$p(x) = \int p(x \mid z)\, p(z)\, dz \;\to\; \max$

▪ Problem: the marginal $p(x)$ and the posterior

$p(z \mid x) = \dfrac{p(x \mid z)\, p(z)}{p(x)}$

are intractable and their calculation suffers from exponential complexity

▪ Solutions

– Markov chain Monte Carlo, Hamiltonian Monte Carlo
– Approximation and variational inference

SLIDE 7

Variational Autoencoders

▪ Assume a latent space with prior $p(z)$

[Diagram: observable $x$ and latent $z$, connected by the encoder $p(z \mid x)$ and the decoder $p(x \mid z)$, with prior $p(z)$]

SLIDE 8

Variational Autoencoders

▪ Parameterize the likelihood $p_\theta(x \mid z)$ with a deep neural network

[Diagram: decoder network with parameters $\theta$ maps $z \sim p(z)$ to the moments $(\mu, \sigma)$ of $p_\theta(x \mid z)$]

SLIDE 9

Variational Autoencoders

▪ Parameterize the likelihood $p_\theta(x \mid z)$ with a deep neural network
▪ Approximate the intractable posterior $p(z \mid x)$ with a deep neural network $q_\varphi(z \mid x)$

[Diagram: encoder network with parameters $\varphi$ maps $x$ to the moments $(\mu, \sigma)$ of $q_\varphi(z \mid x)$; decoder as before]

SLIDE 10

Variational Autoencoders

▪ Parameterize the likelihood $p_\theta(x \mid z)$ with a deep neural network
▪ Approximate the intractable posterior $p(z \mid x)$ with a deep neural network $q_\varphi(z \mid x)$
▪ Learn the parameters $\varphi$ and $\theta$ with backpropagation

[Diagram: encoder $q_\varphi(z \mid x)$ and decoder $p_\theta(x \mid z)$ networks with moments $(\mu, \sigma)$ and prior $p(z)$]
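As a concrete illustration, here is a minimal TensorFlow 1.x sketch of this parameterization with diagonal Gaussians for both networks; the sizes and names (`x_dim`, `z_dim`, `h_dim`) are hypothetical and not from the talk. The reparameterization $z = \mu + \sigma \epsilon$ keeps the sampling step differentiable, so $\varphi$ and $\theta$ can indeed be learned with backpropagation:

```python
import tensorflow as tf

x_dim, z_dim, h_dim = 60, 2, 128          # hypothetical sizes
x = tf.placeholder(tf.float32, [None, x_dim])

# Encoder q_phi(z|x): outputs mean and log-variance of a diagonal Gaussian
h_enc = tf.layers.dense(x, h_dim, activation=tf.nn.relu)
mu_phi = tf.layers.dense(h_enc, z_dim)
logvar_phi = tf.layers.dense(h_enc, z_dim)

# Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
eps = tf.random_normal(tf.shape(mu_phi))
z = mu_phi + tf.exp(0.5 * logvar_phi) * eps

# Decoder p_theta(x|z): outputs mean and log-variance of the likelihood
h_dec = tf.layers.dense(z, h_dim, activation=tf.nn.relu)
mu_theta = tf.layers.dense(h_dec, x_dim)
logvar_theta = tf.layers.dense(h_dec, x_dim)
```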

SLIDE 11

Variational Inference

▪ Which loss should we optimize?
▪ Can we choose the posterior from a flexible family of distributions $Q$ by minimizing a distance to the real posterior?

$q^*(z \mid x) = \operatorname{argmin}_{q_\varphi \in Q} \mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p_\theta(z \mid x)\big)$

$\mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p_\theta(z \mid x)\big) = \mathbb{E}_{q_\varphi(z \mid x)}\big[\log q_\varphi(z \mid x)\big] - \mathbb{E}_{q_\varphi(z \mid x)}\big[\log p_\theta(x, z)\big] + \log p_\theta(x) \;\ge\; 0$

Can be made small if $Q$ is flexible enough

▪ Problem: not computable because it involves the marginal $p_\theta(x)$

SLIDE 12

Variational Inference

▪ Which loss should we optimize?
▪ Can we choose the posterior from a flexible family of distributions $Q$ by minimizing a distance to the real posterior?

$q^*(z \mid x) = \operatorname{argmin}_{q_\varphi \in Q} \mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p_\theta(z \mid x)\big)$

$0 \;\le\; \underbrace{\mathbb{E}_{q_\varphi(z \mid x)}\big[\log q_\varphi(z \mid x)\big] - \mathbb{E}_{q_\varphi(z \mid x)}\big[\log p_\theta(x, z)\big]}_{-\mathrm{ELBO}(\varphi,\,\theta)} \;+\; \log p_\theta(x)$

▪ Drop the left-hand side because it is non-negative

SLIDE 13

Variational Inference

▪ Which loss should we optimize?
▪ Can we choose the posterior from a flexible family of distributions $Q$ by minimizing a distance to the real posterior?

$q^*(z \mid x) = \operatorname{argmin}_{q_\varphi \in Q} \mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p_\theta(z \mid x)\big)$

$\mathrm{ELBO}(\varphi, \theta) \;\le\; \log p_\theta(x)$

▪ We obtain a tractable lower bound for the marginal
▪ Training criterion: maximize the evidence lower bound

SLIDE 14

Variational Inference

▪ To interpret the lower bound, write it as

$\log p_\theta(x) \;\ge\; \mathrm{ELBO}(\varphi, \theta) = \underbrace{\mathbb{E}_{q_\varphi(z \mid x)}\big[\log p_\theta(x \mid z)\big]}_{\text{Reconstruction score}} \;-\; \underbrace{\mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p(z)\big)}_{\text{Penalty of deviation from prior}}$

with $z \sim q_\varphi(z \mid x)$

▪ The smaller $\mathrm{KL}\big(q_\varphi(z \mid x)\,\big\|\,p_\theta(z \mid x)\big)$, the tighter the lower bound
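Both terms are cheap to evaluate for diagonal Gaussians: the KL penalty against a standard normal prior has a closed form, and the reconstruction score is a Gaussian log-likelihood at a reparameterized sample. A minimal sketch, continuing the hypothetical encoder/decoder tensors from the earlier sketch:

```python
import math
import tensorflow as tf

# Assumes x, mu_phi, logvar_phi, mu_theta, logvar_theta from the earlier sketch.
# Reconstruction score: log p_theta(x|z) for a diagonal Gaussian likelihood
recon = -0.5 * tf.reduce_sum(
    logvar_theta
    + tf.square(x - mu_theta) / tf.exp(logvar_theta)
    + math.log(2.0 * math.pi), axis=1)

# Penalty of deviation from prior: analytic KL(q_phi(z|x) || N(0, I))
kl = 0.5 * tf.reduce_sum(
    tf.exp(logvar_phi) + tf.square(mu_phi) - 1.0 - logvar_phi, axis=1)

# Maximizing the ELBO = minimizing the negative ELBO, averaged over the batch
neg_elbo = tf.reduce_mean(kl - recon)
```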

SLIDE 15

Applications to Time Series

▪ Sequence structure for the observable and the latent factor
▪ Model setup

– Gaussian distributions with parameters calculated from a deep recurrent neural network
– Standard Gaussian prior
– Model training with variational inference

SLIDE 16

Inference and Training

[Diagram: inference and generator RNNs unrolled over time; at each step $t$, hidden states $h_t$ produce the moments $(\mu_t, \sigma_t)$, the latent $z_t$ is sampled from the approximate posterior $q_\varphi(z \mid x)$, and the observation $x_t$ is consumed]

SLIDE 17

Implied Factorization

▪ Probability distributions factorize

$p_\theta(x_{\le T} \mid z_{\le T}) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t}, z_{\le t}) = \prod_{t=1}^{T} \mathcal{N}\big(x_t \,\big|\, \mu_\theta(x_{<t}, z_{\le t}),\, \Sigma_\theta(x_{<t}, z_{\le t})\big)$

$q_\varphi(z_{\le T} \mid x_{\le T}) = \prod_{t=1}^{T} q_\varphi(z_t \mid x_{<t}, z_{<t}) = \prod_{t=1}^{T} \mathcal{N}\big(z_t \,\big|\, \mu_\varphi(x_{<t}, z_{<t}),\, \Sigma_\varphi(x_{<t}, z_{<t})\big)$

▪ Loss calculation

– The distributions can easily be simulated to calculate the expectation term
– The Kullback–Leibler term can be calculated analytically

SLIDE 18

Calculating ELBO

▪ Loss calculation

– The Kullback–Leibler term can be calculated analytically
– For fixed $t$ the quantities $\mu_\theta, \mu_\varphi, \Sigma_\theta, \Sigma_\varphi$ depend on $z_t \sim \mathcal{N}\big(z_t \,\big|\, \mu_\varphi(x_{<t}, z_{<t}),\, \Sigma_\varphi(x_{<t}, z_{<t})\big)$
– Simulate from this distribution to estimate the expectation with a sample mean

$\mathrm{ELBO}(\varphi, \theta) = -\,\mathbb{E}_{q}\Big[\sum_t \big\{ (x_t - \mu_\theta)^\top \Sigma_\theta^{-1} (x_t - \mu_\theta) + \log\det \Sigma_\theta + \mu_\varphi^\top \mu_\varphi + \operatorname{tr} \Sigma_\varphi - \log\det \Sigma_\varphi \big\}\Big]$

(up to additive constants and an overall factor of $\tfrac{1}{2}$); approximate the expectation with Monte Carlo sampling from $q_\varphi(z_{\le T} \mid x_{\le T})$
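For diagonal covariances the summand above reduces to a few vector operations per time step. A minimal sketch; the function and argument names are hypothetical, and `var_th`, `var_ph` hold the diagonals of $\Sigma_\theta$, $\Sigma_\varphi$:

```python
import tensorflow as tf

def elbo_step_terms(x_t, mu_th, var_th, mu_ph, var_ph):
    # Reconstruction part: (x - mu)^T Sigma^-1 (x - mu) + log det Sigma
    recon = tf.reduce_sum(
        tf.square(x_t - mu_th) / var_th + tf.log(var_th), axis=-1)
    # Analytic KL to a standard normal prior, up to constants:
    # mu^T mu + tr Sigma - log det Sigma
    kl = tf.reduce_sum(
        tf.square(mu_ph) + var_ph - tf.log(var_ph), axis=-1)
    return recon + kl   # one summand of the (negative) ELBO sum over t
```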

SLIDE 19

Generation

[Diagram: generation network; at each step the latent $z_t$ is sampled from the prior $p(z)$, the generator state $h_t$ produces the moments $(\mu_t, \sigma_t)$ of $p_\theta(x \mid z)$, and $x_t$ is generated]
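Once trained, the generator can be run on its own: sample $z_t$ from the prior at every step and feed the generated $x_t$ back into the RNN. A minimal sketch, assuming TF 1.x with layer-style cell reuse; all sizes and weight names are hypothetical:

```python
import tensorflow as tf

z_dim, x_dim, h_dim, T = 2, 60, 128, 100   # hypothetical sizes

cell = tf.nn.rnn_cell.GRUCell(h_dim)       # generator RNN
W_mu = tf.get_variable('W_mu', [h_dim + z_dim, x_dim])
b_mu = tf.get_variable('b_mu', [x_dim], initializer=tf.zeros_initializer())
W_lv = tf.get_variable('W_lv', [h_dim + z_dim, x_dim])
b_lv = tf.get_variable('b_lv', [x_dim], initializer=tf.zeros_initializer())

state = cell.zero_state(1, tf.float32)
x_t = tf.zeros([1, x_dim])
xs = []
for t in range(T):
    z_t = tf.random_normal([1, z_dim])     # z_t ~ p(z), the standard Gaussian prior
    f = tf.concat([state, z_t], axis=1)    # combine the history summary with z_t
    mu_t = tf.matmul(f, W_mu) + b_mu
    lv_t = tf.matmul(f, W_lv) + b_lv
    x_t = mu_t + tf.exp(0.5 * lv_t) * tf.random_normal([1, x_dim])
    _, state = cell(x_t, state)            # advance the RNN with the generated x_t
    xs.append(x_t)
path = tf.concat(xs, axis=0)               # [T, x_dim] simulated series
```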

SLIDE 20

Time Series Embedding

▪ A single historical value is not predictive enough
▪ Embedding

– Use lag of ~20 historical observations at every time step

[Diagram: a batch of overlapping embedding windows sliding along the time axis at steps t, t+1, t+2]
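A minimal NumPy sketch of such an embedding; the lag of 20 is the value mentioned above, the series itself is a placeholder:

```python
import numpy as np

def embed(series, lag=20):
    """One input vector of `lag` consecutive observations per time step."""
    return np.stack([series[t:t + lag] for t in range(len(series) - lag + 1)])

series = np.random.randn(1000)     # placeholder for a normalized price series
windows = embed(series)            # shape (981, 20); windows[t] = series[t:t+20]
```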

SLIDE 21

Implementation

▪ Implementation in TensorFlow
▪ Running on P100 GPUs for model training
▪ Long time series and large batch sizes require substantial GPU memory

SLIDE 22

TensorFlow Dynamic RNN

▪ Unrolling an RNN with tf.nn.dynamic_rnn

– Simple to use
– Can handle variable sequence lengths

▪ Not flexible enough for generative networks
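For reference, a minimal tf.nn.dynamic_rnn example in TF 1.x, with hypothetical sizes; it unrolls the cell over the time axis inside the graph and masks steps beyond each sequence's length:

```python
import tensorflow as tf

T, batch, in_dim, h_dim = 600, 32, 60, 128   # hypothetical sizes

inputs = tf.placeholder(tf.float32, [batch, T, in_dim])
lengths = tf.placeholder(tf.int32, [batch])  # per-example sequence lengths

cell = tf.nn.rnn_cell.LSTMCell(h_dim)
outputs, final_state = tf.nn.dynamic_rnn(
    cell, inputs, sequence_length=lengths, dtype=tf.float32)
# outputs: [batch, T, h_dim]; steps past `lengths` are zeroed
```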

SLIDE 23

TensorFlow Control Structures

▪ Using tf.while_loop

– More to program; need to understand the control structures in more detail
– Much more flexible
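The basic shape of tf.while_loop: a condition and a body over a tuple of loop variables, both traced once into the graph. A trivial sketch:

```python
import tensorflow as tf

T = 600

def cond(t, acc):
    return t < T

def body(t, acc):
    # placeholder computation; a real body updates RNN states and TensorArrays
    return t + 1, acc + tf.cast(t, tf.float32)

_, total = tf.while_loop(cond, body, [tf.constant(0), tf.constant(0.0)])
```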

SLIDE 24

Implementation

▪ Notations

SLIDE 25

Implementation

▪ Variable and Weight Setup

– Recurrent neural network definition
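A sketch of what such a setup could look like in TF 1.x; all sizes and names (`infer_cell`, `gen_cell`, the `W_*`/`b_*` output weights) are hypothetical stand-ins for the code shown on the slide:

```python
import tensorflow as tf

x_dim, z_dim, h_dim = 60, 2, 128   # hypothetical sizes

with tf.variable_scope('inference'):
    infer_cell = tf.nn.rnn_cell.LSTMCell(h_dim)
    # output weights mapping [state, z] to the moments of q_phi(z_t | .)
    W_mu_q = tf.get_variable('W_mu', [h_dim + z_dim, z_dim])
    b_mu_q = tf.get_variable('b_mu', [z_dim], initializer=tf.zeros_initializer())
    W_lv_q = tf.get_variable('W_lv', [h_dim + z_dim, z_dim])
    b_lv_q = tf.get_variable('b_lv', [z_dim], initializer=tf.zeros_initializer())

with tf.variable_scope('generator'):
    gen_cell = tf.nn.rnn_cell.LSTMCell(h_dim)
    # output weights mapping [state, z] to the moments of p_theta(x_t | .)
    W_mu_p = tf.get_variable('W_mu', [h_dim + z_dim, x_dim])
    b_mu_p = tf.get_variable('b_mu', [x_dim], initializer=tf.zeros_initializer())
    W_lv_p = tf.get_variable('W_lv', [h_dim + z_dim, x_dim])
    b_lv_p = tf.get_variable('b_lv', [x_dim], initializer=tf.zeros_initializer())
```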

SLIDE 26

Implementation

▪ Allocate TensorArray objects
▪ Fill the input TensorArray objects with data
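A sketch of the allocation step, continuing the hypothetical names above; the input array is filled once via unstack, the output arrays are written inside the loop body:

```python
import tensorflow as tf

T, batch, x_dim = 600, 32, 60   # hypothetical sizes

x = tf.placeholder(tf.float32, [T, batch, x_dim])   # time-major input

# input TensorArray: one [batch, x_dim] slice per time step
x_ta = tf.TensorArray(tf.float32, size=T).unstack(x)

# output TensorArrays for the per-step moments, written in the loop body
mu_q_ta = tf.TensorArray(tf.float32, size=T)
lv_q_ta = tf.TensorArray(tf.float32, size=T)
mu_p_ta = tf.TensorArray(tf.float32, size=T)
lv_p_ta = tf.TensorArray(tf.float32, size=T)
```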

SLIDE 27

Implementation

▪ While loop body, inference part

– Update the inference RNN state (see the combined sketch after the next slide)

SLIDE 28

Implementation

▪ While loop body, generation part

– Update the generator RNN state (sketched below)
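A combined sketch of the two body parts, under the same hypothetical names as the earlier sketches; following the factorization of slide 17, the moments at step $t$ are computed from the states summarizing $x_{<t}, z_{<t}$ before either RNN consumes $x_t$:

```python
import tensorflow as tf

def body(t, infer_state, gen_state, z_prev,
         mu_q_ta, lv_q_ta, mu_p_ta, lv_p_ta):
    x_t = x_ta.read(t)

    # inference part: moments of q_phi(z_t | x_<t, z_<t), then sample z_t
    f_q = tf.concat([infer_state.h, z_prev], 1)
    mu_q = tf.matmul(f_q, W_mu_q) + b_mu_q
    lv_q = tf.matmul(f_q, W_lv_q) + b_lv_q
    z_t = mu_q + tf.exp(0.5 * lv_q) * tf.random_normal(tf.shape(mu_q))

    # generation part: moments of p_theta(x_t | x_<t, z_<=t)
    f_p = tf.concat([gen_state.h, z_t], 1)
    mu_p = tf.matmul(f_p, W_mu_p) + b_mu_p
    lv_p = tf.matmul(f_p, W_lv_p) + b_lv_p

    # update both RNN states with the observed x_t for the next step
    _, infer_state = infer_cell(x_t, infer_state)
    _, gen_state = gen_cell(x_t, gen_state)

    return (t + 1, infer_state, gen_state, z_t,
            mu_q_ta.write(t, mu_q), lv_q_ta.write(t, lv_q),
            mu_p_ta.write(t, mu_p), lv_p_ta.write(t, lv_p))
```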

SLIDE 29

Implementation

▪ Call the while loop
▪ Stack the TensorArray objects
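Calling the loop and recovering dense tensors, continuing the hypothetical sketch:

```python
import tensorflow as tf

loop_vars = (tf.constant(0),
             infer_cell.zero_state(batch, tf.float32),
             gen_cell.zero_state(batch, tf.float32),
             tf.zeros([batch, z_dim]),
             mu_q_ta, lv_q_ta, mu_p_ta, lv_p_ta)

res = tf.while_loop(lambda t, *rest: t < T, body, loop_vars)

# stack the written TensorArrays back into dense [T, batch, dim] tensors
mu_q, lv_q = res[4].stack(), res[5].stack()
mu_p, lv_p = res[6].stack(), res[7].stack()
```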

SLIDE 30

Implementation

▪ Loss Calculation
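The stacked moments feed directly into the ELBO of slide 18. A sketch with diagonal covariances and constants dropped, names as in the sketches above:

```python
import tensorflow as tf

# negative ELBO: Gaussian reconstruction term plus analytic KL, per step
recon = tf.reduce_sum(tf.square(x - mu_p) / tf.exp(lv_p) + lv_p, axis=-1)
kl = tf.reduce_sum(tf.square(mu_q) + tf.exp(lv_q) - lv_q, axis=-1)

loss = tf.reduce_mean(tf.reduce_sum(recon + kl, axis=0))  # sum over t, mean over batch
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```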

SLIDE 31

FX Market

▪ The FX market is the largest and most liquid market in the world
▪ Decentralized over-the-counter market

– Not necessary to go through a centralized exchange
– No single price for a currency at a given point in time

▪ Fierce competition between market participants
▪ 24 hours a day, 5½ days a week

– As one major forex market closes, another one opens

SLIDE 32

FX Data

▪ Collect tick data from a major liquidity provider, e.g. LMAX
▪ Aggregate to OHLC bars (1s, 10s, …)
▪ Focus on the US trading session

– London session: 3am – 12pm EST
– US session: 8am – 5pm EST
– Asian session: 5pm – 2am EST (Sydney), 7pm – 4am EST (Tokyo)

[Chart: the three sessions laid out on a 24-hour clock]

SLIDE 33

EURUSD 2016

SLIDE 34

Single Day

SLIDE 35

One Hour

SLIDE 36

10 Min Sampled at 1s

▪ At high frequency, FX prices fluctuate in a range of a few deci-pips (1/10 pip = 1 deci-pip)
▪ Larger jumps are in the order of multiple pips and more

[Chart: 10 minutes of prices sampled at 1s, with 5-pip and 1-deci-pip scale bars]

SLIDE 37

Setup

▪ Normalize the data with the standard deviation $\hat{\sigma}$ estimated over the training interval
▪ 260 trading days in 2016, one model per day
▪ 60-dimensional embedding, 2-dimensional latent space

[Diagram: timeline split into a training interval, over which $\hat{\sigma}$ is estimated, and an out-of-sample test interval]
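A minimal sketch of the normalization with placeholder data; the point is that $\hat{\sigma}$ is estimated on the training interval only and then applied to both intervals, avoiding look-ahead:

```python
import numpy as np

day = np.random.randn(30000)            # placeholder for one day of 1s observations
train, test = day[:20000], day[20000:]  # hypothetical train / out-of-sample split

sigma_hat = train.std()                 # std deviation over the training interval
train_n = train / sigma_hat
test_n = test / sigma_hat               # same sigma_hat, no look-ahead
```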

SLIDE 38

Results

Training

SLIDE 39

Out of Sample

SLIDE 40

Volatility of Prediction

SLIDE 41

Latent Variables

SLIDE 42

Pricing in E-Commerce

▪ Attend our talk on our latest work on AI and GPU-accelerated genetic algorithms with Jet.com

SLIDE 43

Daniel Egloff

  • Dr. sc. math.

Contact details:

Phone: +41 79 430 03 61
Email: daniel.egloff@quantalea.net