TRADING USING DEEP LEARNING MAN VS MACHINE Orders By Algorithms - - PowerPoint PPT Presentation
MAN VS MACHINE
TRADING USING DEEP LEARNING
84%
Orders By Algorithms
16%
Orders By Human
Artificial Neural Networks
Neural networks are a family of models inspired by biological brain structure and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown.
Recent breakthroughs in artificial neural networks led to a modern renaissance in AI.
DEEP LEARNING SUPERIORITY
96.92%
Deep Learning
94.9%
Human
DEEP META LEARNING
ref: http://www.image-net.org/challenges/LSVRC/
GRADIENT DESCENT
$E$ = Error of the network
$w_t = w_{t-1} - \gamma \frac{\partial E}{\partial w}$
$W$ = Weight matrix representing the filters
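The update rule on this slide can be sketched in a few lines. This is a minimal illustration, not the presenter's code: the toy error surface E(w) = (w - 3)^2, the step size `gamma=0.1` and the number of steps are all assumptions chosen for the example.

```python
def gradient_descent(grad, w0, gamma=0.1, steps=100):
    """Iterate w_t = w_{t-1} - gamma * dE/dw, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - gamma * grad(w)
    return w


# Assumed toy error surface E(w) = (w - 3)^2, so dE/dw = 2 * (w - 3).
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_min)
```

On this surface each step shrinks the distance to the minimum by a constant factor, so `w_min` ends up very close to 3.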
GRADIENT BASED MODELS
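Legend
$x_0$ - Features vector
$x_i$ - Output of layer $i$
$w_i$ - Weights of layer $i$
$f_i$ - Activation function of layer $i$
$y$ - Ground truth
$\hat{y}$ - Model output
$l(\hat{y}, y)$ - Loss function
$E$ - Loss surface

1: Forward Propagation
$x_i = f_i(x_{i-1}, w_i)$ for $i = 1, \dots, n$, with $\hat{y} = x_n$

2: Loss Calculation
$E = l(\hat{y}, y)$

3: Optimization (back propagation)
$\frac{\partial E}{\partial x_n} = \frac{\partial l(\hat{y}, y)}{\partial x_n}$
$\frac{\partial E}{\partial w_n} = \frac{\partial E}{\partial x_n} \frac{\partial f_n(x_{n-1}, w_n)}{\partial w_n}$
$\frac{\partial E}{\partial x_{n-1}} = \frac{\partial E}{\partial x_n} \frac{\partial f_n(x_{n-1}, w_n)}{\partial x_{n-1}}$
$\frac{\partial E}{\partial w_{n-1}} = \frac{\partial E}{\partial x_{n-1}} \frac{\partial f_{n-1}(x_{n-2}, w_{n-1})}{\partial w_{n-1}}$
$\frac{\partial E}{\partial x_{n-2}} = \frac{\partial E}{\partial x_{n-1}} \frac{\partial f_{n-1}(x_{n-2}, w_{n-1})}{\partial x_{n-2}}$
...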
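The three steps (forward propagation, loss calculation, optimization) can be sketched on a toy chain of scalar layers. Everything concrete here is an assumption made for illustration: each layer is tanh(w * x), the loss is squared error, and the weights, step size `gamma` and target value are arbitrary.

```python
import math


def forward(x0, weights):
    """Forward propagation: x_i = tanh(w_i * x_{i-1}) for each layer."""
    xs = [x0]
    for w in weights:
        xs.append(math.tanh(w * xs[-1]))
    return xs  # xs[-1] is the model output


def loss(y_hat, y):
    """Loss calculation: squared error between output and ground truth."""
    return 0.5 * (y_hat - y) ** 2


def backward(xs, weights, y):
    """Back propagation: chain rule applied from the last layer to the first."""
    grads = [0.0] * len(weights)
    dE_dx = xs[-1] - y                      # dE/dx_n for squared error
    for i in range(len(weights) - 1, -1, -1):
        local = 1.0 - xs[i + 1] ** 2        # tanh derivative at this layer's output
        grads[i] = dE_dx * local * xs[i]    # dE/dw = dE/dx_out * df/dw
        dE_dx = dE_dx * local * weights[i]  # dE/dx_in = dE/dx_out * df/dx_in
    return grads


def train(x0, y, weights, gamma=0.2, steps=500):
    """Optimization: plain gradient descent on every weight."""
    for _ in range(steps):
        xs = forward(x0, weights)
        grads = backward(xs, weights, y)
        weights = [w - gamma * g for w, g in zip(weights, grads)]
    return weights


initial = [0.5, -0.3, 0.8]
before = loss(forward(1.0, initial)[-1], 0.4)
trained = train(1.0, 0.4, initial)
after = loss(forward(1.0, trained)[-1], 0.4)
print(before, after)
```

Real networks use vectors and matrices rather than scalars, but the bookkeeping is the same: store every layer output on the way forward, then reuse it on the way back.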
Supervised Learning in a nutshell.
CAT Learning From Examples.
FINANCIAL PREDICTION PITFALLS
Much Data
The amount of possibly relevant data from many markets is incredibly large.
No Theory
Complex non-linear interactions in the data are not well specified by financial theory.
Noisy Data
Noise in financial data is very common, and distinguishing noise from behavior is sometimes hard.
Importance
The importance of any given data is questionable, and determining which data is meaningful is hard.
Overfitting
Models overfit easily; most models have poor predictive capabilities on financial data.
Behavior
The behavior of financial markets changes all the time and can be really unpredictable.
WHY DEEP LEARNING?
Much Data
The amount of possibly relevant data from many markets is incredibly large.
No Theory
Complex non-linear interactions in the data are not well specified by financial theory.
Noisy Data
Noise in financial data is very common, and distinguishing noise from behavior is sometimes hard.
Importance
The importance of any given data is questionable, and determining which data is meaningful is hard.
Overfitting
Models overfit easily; most models have poor predictive capabilities on financial data.
Behavior
The behavior of financial markets changes all the time and can be really unpredictable.
[Figure: two intraday price charts plotted against time of day]
BACK TO FINANCE
?
Strategy Universe
Strategy Configurations Configuration
DEEP REINFORCEMENT LEARNING
Trading Decision Utility
1 - Buy
0 - Hold
-1 - Sell
P&L / Drawdown
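One way to turn a sequence of trading decisions into the P&L / drawdown utility named on this slide is sketched below. It is an illustration under stated assumptions, not the presenter's utility function: a decision of 1 is read as holding one unit long over the next price step, -1 as one unit short, and 0 as being flat; the function name and the price series are hypothetical.

```python
def pnl_and_drawdown(prices, decisions):
    """Return (cumulative P&L, maximum drawdown) for decisions in {1, 0, -1}.

    decisions[t] is the position assumed to be held from prices[t]
    to prices[t + 1]: 1 = long, 0 = flat, -1 = short.
    """
    equity, peak, max_drawdown = 0.0, 0.0, 0.0
    for t, position in enumerate(decisions):
        equity += position * (prices[t + 1] - prices[t])
        peak = max(peak, equity)                          # best equity so far
        max_drawdown = max(max_drawdown, peak - equity)   # worst fall from a peak
    return equity, max_drawdown


prices = [100, 102, 101, 99, 103]
decisions = [1, 1, -1, 0]
print(pnl_and_drawdown(prices, decisions))  # (3.0, 1.0)
```

A reinforcement learning reward could then combine the two numbers, for example P&L divided by drawdown, but how the deck weighs them is not stated on the slide.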
DEEP LEARNING IN FINANCE
Technical analysis might or might not work. One thing is for sure: it is very hard to generalize.
Technical Analysis
talib.SMA(…
talib.MOM(…
TA-Lib : Technical Analysis Library
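For readers without TA-Lib installed, the two indicators named on the slide can be approximated in plain Python. This is a sketch of the standard definitions (simple moving average and momentum as a price difference), not TA-Lib's implementation; the function names `sma` and `momentum` are hypothetical, and the warm-up values are `None` here where TA-Lib returns NaN.

```python
def sma(values, period):
    """Simple moving average; None until `period` observations are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period:i + 1]
            out.append(sum(window) / period)
    return out


def momentum(values, period):
    """Momentum: current value minus the value `period` steps earlier."""
    return [None if i < period else values[i] - values[i - period]
            for i in range(len(values))]


closes = [10.0, 11.0, 12.0, 13.0, 14.0]
print(sma(closes, 3))       # [None, None, 11.0, 12.0, 13.0]
print(momentum(closes, 2))  # [None, None, 2.0, 2.0, 2.0]
```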
Successful Technical Trading Agents Using Genetic Programming, Farnsworth, 2004
Surprisingly, genetic programming can be very successful when it comes to financial strategies.
gp = SymbolicRegressor(...
gp.fit(X_train, y_train)
APPLYING DEEP LEARNING TO ENHANCE MOMENTUM TRADING STRATEGIES IN STOCKS
L Takeuchi, 2013

FEATURE ENGINEERING
- 12 monthly returns
- For every month: daily cumulative returns
- Z-score against the cumulative returns of other stocks
- Flag if January
Feature Vector
$X_i: \bigcup_{t-12}^{t+i} \frac{P_t - P_{t-12}}{P_{t-12}}, \; i \in (1, 11), \; \cup \; \{1 \text{ if } t \text{ is January, else } 0\}$
Ground Truth
$Y$: whether the cumulative return $\frac{P_{t+1} - P_{t-12}}{P_{t-12}}$ exceeds the cumulative returns $\frac{P_t - P_{t-12}}{P_{t-12}}$ taken over the window from $t-12$ to $t+i$

MODEL
Layer sizes: 40, 33, 4, 50, 2, 40, 33
Layers structure: stacked RBMs; layer sizes found by grid search
Hyperparameters: K-fold; not written

RESULTS
Confusion Matrix
True False
Predicted True Predicted False 22.38% 19.19% 30.97% 27.45%
Precision: 61.224% Recall: 53.659% Accuracy: 53.061%
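For reference, precision, recall and accuracy follow from the four cells of a confusion matrix as sketched below. The counts in the example are hypothetical and are not the slide's percentages, because the extracted text does not say which percentage belongs to which cell.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall and accuracy from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp),   # share of predicted positives that are right
        "recall": tp / (tp + fn),      # share of actual positives that are found
        "accuracy": (tp + tn) / total, # share of all predictions that are right
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=30, fp=20, fn=10, tn=40)
print(metrics)
```

With these counts the precision is 30/50, the recall is 30/40 and the accuracy is 70/100.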
DEEP MODELING COMPLEX COUPLINGS WITHIN FINANCIAL MARKETS

FEATURE ENGINEERING
- Used a Deep Belief Network to find hidden couplings between markets
- Used past prices of stocks and forex as features
- Unsupervised learning model
Feature Vector
$X_i: S_i \cup F_i$ (STOCK and FOREX inputs)

MODEL
Layers structure: DBN of stacked RBMs
Hyperparameters:
- Optimizer: SGD
- Loss: negative log likelihood
Note: no cross-validation in the paper
DEEP LEARNING FOR MULTIVARIATE FINANCIAL TIME SERIES

FEATURE ENGINEERING
- Matrix of log returns over all the stocks
- Z-score against the log returns of other stocks
- Flag if January
Feature Vector
$X_i: \log \frac{P_t}{P_{t-1}}, \; i \in (-33, -2), \; \cup \; \{1 \text{ if } t \text{ is January, else } 0\}$
Ground Truth
$Y$: 1 if price is over median at $t + 1$

MODEL
Layers structure: DBN connected to MLP ("Deep" Belief Net of stacked RBMs, followed by a fully connected dense layer)
Hyperparameters:
- Optimizer: AdaGrad
- Cross-validation: Training 70%, Validation 15%, Test 15%
- Loss: negative log likelihood
- Activation: tanh
- Regularization: L1
IMPLEMENTING DEEP NEURAL NETWORKS FOR FINANCIAL MARKET PREDICTION

FEATURE ENGINEERING
- All moving averages from 5 to 100
- List of 100 lagged prices
- Pearson correlation between the returns (all 100) of the stock and all the other stocks (45)
Feature Vector
$X: \bigcup_{n-100}^{n} \frac{P_t - P_{t-1}}{P_{t-1}} \; \cup \; \bigcup_{i=5}^{100} MA(P, i) \; \cup \; \bigcup_{i=1}^{n} \rho(P, P_i)$
Ground Truth
$Y \in \{1, 0, -1\}$ for buy, sell, hold

MODEL
Layer sizes: 9895, 1000, 100, 135
Layers structure: simple fully connected
Hyperparameters:
- Optimizer: SGD
- Cross-validation: Training 80%, Test 20%
- Loss: categorical cross entropy
- Activation: ReLU, softmax
- Training algorithm: walk forward

RESULTS
Accuracy: 73%
F1 Score: 0.4
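The slide names walk-forward training without describing it. A common reading is a fixed-size training window that rolls forward in time, each followed by the test window that comes right after it. The sketch below assumes that reading; the function name, window sizes and step (one test window at a time) are hypothetical and may differ from what the paper does.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) windows that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide both windows forward by one test window


splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Every test window lies strictly after its training window, so the model is never evaluated on data older than what it was trained on.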
DEEP LEARNING IN FINANCE
Heaton et al, 2016

FEATURE ENGINEERING
- Trained an autoencoder
- Used it to find stocks close to the encoded market
- Used those with a deep architecture to find the S&P 500
Feature Vector
$X_{ij}, \; i \in (0, 365), \; j \in (1, 500)$
Ground Truth
s&p500, $j \in (1, 500)$

MODEL
Layer sizes: 50, 4, 4, 2, 50
Layers structure: autoencoder
Hyperparameters:
- DFP policy
- Sparsity indicator: 0.1
Stationarity
Definition
Let $x_t$ be a stochastic process and let $F_x(x_{t_1+\tau}, \dots, x_{t_k+\tau})$ represent the cumulative distribution function of the joint distribution of $x_t$ at times $t_1+\tau, \dots, t_k+\tau$. $x_t$ is said to be strictly stationary if, for all $k$, for all $\tau$ and for all $t_1, \dots, t_k$,
$F_x(x_{t_1+\tau}, \dots, x_{t_k+\tau}) = F_x(x_{t_1}, \dots, x_{t_k})$
In other words
Shifting the time origin by an amount $\tau$ has no effect on the joint distribution, which depends only on the intervals between $t_1, \dots, t_k$.
"Stationarizing" A Time Series
VALUE FORECASTING
Nonstationary series: $y_i = y_{price}$
Stationary series: $y_i = [\ln(y_{price})]'$
Inverse transform: $y_{price} = e^{\int \hat{y}_i}$

Algorithm Structure
1: Nonstationary stochastic series
2: Logarithm & differentiation -> stationary stochastic series
3: Segmentation -> segments (features)
4: Learning model (regression) -> prediction
5: Concatenation -> stationary stochastic prediction
6: Integration & exponentiation -> nonstationary final stochastic prediction
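The two transforms that bracket the learning model, logarithm and differentiation on the way in and integration and exponentiation on the way out, can be sketched for a discrete price series as follows. Differentiation is approximated by first differences and integration by a running sum, which is an assumption about how the deck discretizes them; the function names are hypothetical.

```python
import math


def stationarize(prices):
    """Logarithm & differentiation: log returns ln(p_t) - ln(p_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def destationarize(log_returns, first_price):
    """Integration & exponentiation: rebuild prices from log returns."""
    prices = [first_price]
    log_price = math.log(first_price)
    for r in log_returns:
        log_price += r                      # running sum plays the role of the integral
        prices.append(math.exp(log_price))
    return prices


prices = [100.0, 101.0, 99.5, 102.0]
log_returns = stationarize(prices)
rebuilt = destationarize(log_returns, prices[0])
print(log_returns)
print(rebuilt)
```

In the pipeline, the model would predict future log returns, and `destationarize` would map those predictions back to price levels starting from the last known price.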
4D Financial Graph Data
Assets
Time Features
DEEP LEARNING IN PRODUCTION
ALGOTRADING PITFALLS
Slippage
Slippage is the difference between where the computer signaled the entry and exit for a trade and where actual clients, with actual money, entered and exited.
Commission
Commission is a service charge assessed by a broker or investment advisor in return for providing investment advice and/or handling the purchase or sale of a security.
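A small worked example shows how both costs eat into a backtested result. It is a sketch under assumptions not stated on the slide: a single long trade, a flat commission charged once on entry and once on exit, and hypothetical prices and quantities.

```python
def net_pnl(signal_entry, signal_exit, fill_entry, fill_exit,
            quantity, commission_per_trade):
    """Return (net P&L, slippage cost) for one long round trip."""
    gross = (signal_exit - signal_entry) * quantity  # what the backtest assumed
    realized = (fill_exit - fill_entry) * quantity   # what the fills delivered
    slippage = gross - realized
    commission = 2 * commission_per_trade            # entry plus exit
    return realized - commission, slippage


# Hypothetical trade: signals at 100.00 and 101.00, fills 5 cents worse each way.
net, slip = net_pnl(100.0, 101.0, 100.05, 100.95,
                    quantity=100, commission_per_trade=1.0)
print(net, slip)
```

Here the backtest shows a profit of 100, slippage removes about 10 of it and commission another 2, leaving roughly 88.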
Theoretical Motivations
Smoothness Assumption
Points that are close to each other are more likely to share a label.
- Allows easy interpolation between examples.
- The root of the curse of dimensionality.
Depth Assumption
Depth is a double-edged sword.
- Depth can add exponentially more predictive power to the network (compared to width), but only if done right.
Distributed Representations
Localist models are very inefficient whenever the data has componential structure.
- Distributed representations are useful for efficiently combining features to learn the underlying mechanism that the labels are derived from.
Kitchen sink approach
The set of possibly relevant features from many markets can be incredibly large.
UNDERFITTING
Finance models often have poor results
Solve this by smart risk management
"It works if you remove this situation"
Save yourself some money, it just doesn't work.
Feature Engineering VS Raw Data
A good question.
Image
Data: Unlimited
Overtraining: Hard
Finance
Data: Limited
Overtraining: Easy
$
"You are basically risking millions on something no one in the world fully understands?"
"Why Should I Trust You?": Explaining the Predictions of Any Classifier - Ribeiro, et al 2016
MARKOV PROPERTY
$P(X_n = x_n \mid X_{n-1} = x_{n-1}, X_{n-2} = x_{n-2}, \dots, X_0 = x_0) = P(X_n = x_n \mid X_{n-t} = x_{n-t})$
RNN PITFALLS
Example: Jane walked into the room. John walked in too. It was late at night. Jane said hi to _____
The difficulty in training recurrent neural nets - Bengio, et al 2013
Tricks:
- Initialize the weight matrices to the identity matrix
- Set the activation function to ReLU
- Clip the norm of the exploding gradient:
$\hat{g} \leftarrow \frac{\partial E}{\partial w}$
if $\|\hat{g}\| \geq threshold$ then $\hat{g} \leftarrow \frac{threshold}{\|\hat{g}\|} \hat{g}$
A Simple Way to Initialize Recurrent Networks of Rectified Linear Units - Hinton, et al 2015
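The norm clipping trick can be sketched in plain Python: if the gradient's Euclidean norm reaches the threshold, the gradient is rescaled so its norm equals the threshold, and its direction is kept. The function name and the example values are assumptions for illustration, and the threshold is assumed to be positive.

```python
import math


def clip_by_norm(gradient, threshold):
    """Rescale the gradient to norm `threshold` if its norm is at least that large."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm >= threshold:
        return [g * threshold / norm for g in gradient]
    return list(gradient)


clipped = clip_by_norm([3.0, 4.0], threshold=1.0)    # norm 5 is scaled down to 1
unchanged = clip_by_norm([0.3, 0.4], threshold=1.0)  # norm 0.5 is left alone
print(clipped, unchanged)
```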
DEBUGGING DEEP NETS
Understanding the difficulty in training deep feedforward neural nets - Bengio, et al 2010
Mean and standard deviation of the activation (output of the sigmoid) during learning, for 4 hidden layers. The top hidden layer quickly saturates at 0 (slowing down all learning), but then slowly desaturates around epoch 100.
Other Uses of Deep Learning
Extracting price impact of news
Stress signal on Individual Asset from text (Word2Vec) and fundamental data
Non-Linear Portfolio Replication
Using Fewer Instruments
Reinforcement Learning
For continuous algo optimization
Space Embedding
For Correlation between Assets
GOOG AAPL MMM
... ...
Deep Learning Orders
Trading / Training On Site
Research In House
Data Storage GPU CLUSTER Testing Data
Market Market
Underfitting
Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. Intuitively, underfitting occurs when the model or the algorithm does not fit the data well enough.
Underfitting is caused either by trying to fit a model that is too simple or by not training the model enough.
"Calculate 1+1"
"What is 1? I've only seen equations with 4"
USING THE GPU
Model Training Algorithm -> Represented As Tensor Multiplications -> Computed On NVIDIA GPUs
X
THE IMPORTANCE OF POWER
Performance
Computational speed allows us to train our model further, yielding better accuracy in the finite amount of training time we have.
Memory Capacity
High memory capacity is crucial to our systems, allowing us to construct wider and deeper models and integrate more data into the models.
Interface Speed
The speed of the card interface defines how many model "shifts" we can perform in one trading cycle.
CPU vs GPU
                                   CPU               GPU
Compiled with                      gcc               nvcc
Memory                             DDR4 - 2400 MHz   GDDR5 - 7.0 Gbps
Matrix multiplication complexity   $O(n^3)$          $O(n)$ [1]
Epoch takes (avg)                  7500 sec          500 sec
[1] - Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication - Fatahalian, et al, 2004
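The cubic cost quoted for the CPU comes from the textbook triple loop, sketched below with a counter for the multiply-add operations. This is an illustration of the sequential algorithm only; the function name is hypothetical, and it says nothing about how a GPU kernel is written, beyond the fact that a GPU can execute many of these independent multiply-adds in parallel.

```python
def matmul(a, b):
    """Naive matrix product; also returns the number of multiply-adds performed."""
    n, m, p = len(a), len(b), len(b[0])
    out = [[0.0] * p for _ in range(n)]
    multiply_adds = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                out[i][j] += a[i][k] * b[k][j]
                multiply_adds += 1
    return out, multiply_adds


product, ops = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, ops)  # [[19.0, 22.0], [43.0, 50.0]] 8
```

For two n x n matrices the counter reaches n * n * n, which is 8 for the 2 x 2 example.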