Factor Investing Using Deep Learning



SLIDE 1

Factor Investing Using Deep Learning

Steven E. Thornton – Data Scientist, RN Financial Corporation
Rafael Nicolas Fermin Cota – CEO, RN Financial Corporation

SLIDE 2

About RN Financial Research

RN Financial Research Centre is a financial research company focused on developing high-performance software for generating clean data.

We use data science for quantitative analysis of equities by leveraging fundamental company data.

SLIDE 3

Part One - Data

“Data is the new Oil”

SLIDE 4

Data quality

Quality of data is paramount

  • Free of biases
  • Data needs to be clean
  • Right format
  • Homogenized
  • Trustworthy source
  • Implementable
  • Up-to-date

SLIDE 5

Data we use

  • Descriptive
  • Pricing
  • Fundamentals
  • Corporate Action

SLIDE 6

Research process

  1. Obtain data
  2. Clean data
  3. Link with other datasets
  4. Generate aggregations or new features
  5. Store in a database

SLIDE 7

Issues we overcame through our experience

  1. Extraction is slow
  2. Messy
  3. Inflexible
  4. Difficult to compare among analysts
  5. Cleaning is slow

SLIDE 8

Domain expertise

  • We’ve experienced the headaches of data cleaning
  • We’re not just a data vendor
  • Constantly refining our process
  • We use our data

SLIDE 9

Software and hardware

Software

  • Retrieve data from vendors
  • Distributed analytical database
  • High-performance C++ for data cleaning
  • Easy to add new factors
  • Scalable

Hardware

  • High-performance hardware managed by experts
  • Storage systems

SLIDE 10

Skill management

  • Developer
  • Financial Expert

SLIDE 11

Part Two - Case Study

Applying Deep Learning for Fundamental-Based Signal Generation

SLIDE 12

Objective

Build a model that forecasts whether stock ABC will outperform stock XYZ over the next year.

SLIDE 13

Optimized using GPUs

  1. Model Training
  2. Portfolio Construction

SLIDE 14

Data

1,233 Factors

  • Momentum
  • Value
  • Growth
  • Technical
  • Sentiment

Investable Universe

  • Market cap of at least $1B USD
  • Price of at least $5.00
  • Sector filters
  • Corporate action filters
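
The two numeric screens in the list above can be sketched as a simple filter. This is only an illustration: the `Stock` record, its field names, and the sample values are hypothetical, and the sector and corporate action filters are left out because the slides do not specify them.

```python
from dataclasses import dataclass

MIN_MARKET_CAP_USD = 1_000_000_000  # "at least $1B USD" from the slide
MIN_PRICE_USD = 5.00                # "at least $5.00" from the slide


@dataclass(frozen=True)
class Stock:
    # Hypothetical record layout, for illustration only.
    ticker: str
    market_cap_usd: float
    price_usd: float


def in_universe(stock: Stock) -> bool:
    """Apply only the two numeric screens listed on the slide."""
    return (stock.market_cap_usd >= MIN_MARKET_CAP_USD
            and stock.price_usd >= MIN_PRICE_USD)


# Made-up candidates to exercise the filter.
candidates = [
    Stock("AAA", 2.5e9, 41.20),
    Stock("BBB", 0.8e9, 12.00),   # fails the market cap screen
    Stock("CCC", 5.0e9, 3.75),    # fails the price screen
]
universe = [s.ticker for s in candidates if in_universe(s)]
```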

SLIDE 15

Investable universe

SLIDE 16

Model overview

  1. Rank factor values on each day
  2. Generate all pairs of stocks on each day
  3. Predict if stock ABC will outperform stock XYZ over the next year
  4. Generate a probability matrix
  5. Compute the probability of each stock outperforming/underperforming all other stocks
  6. Long the top 50 stocks each day, short the bottom 50
  7. Apply a portfolio construction algorithm each day to determine stock weights
  8. Layer portfolios each day

SLIDE 17

Step 1: Rank factor values on each day
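
One way to picture this step is a cross-sectional rank: on a given day, each factor's raw values across all stocks are replaced by their ranks. The sketch below is a minimal illustration, not the presenters' implementation. Scaling ranks into [0, 1] and averaging ranks for ties are assumptions, and the factor values are made up.

```python
def rank_cross_section(values):
    """Map one day's raw factor values to ranks scaled into [0, 1].

    Tied values receive the average of the ranks they span (an assumption;
    the slides do not say how ties are handled).
    """
    n = len(values)
    if n == 1:
        return [0.5]
    order = sorted(range(n), key=lambda k: values[k])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    return [r / (n - 1) for r in ranks]


# Hypothetical values of a single factor for four stocks on one day.
factor_values = [0.8, 0.2, 0.5, 0.5]
ranked = rank_cross_section(factor_values)
```

Ranking makes factors with very different scales comparable and limits the influence of outliers, which is a common reason for this kind of preprocessing.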

SLIDE 18

Step 2: Generate all pairs of stocks on each day
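
A minimal sketch of pair generation for one day's universe follows. It assumes ordered pairs (so both (ABC, XYZ) and (XYZ, ABC) appear), which the slides do not state, and the tickers are placeholders. With N stocks this yields N(N − 1) pairs per day, which helps explain why the workload is a candidate for GPU acceleration.

```python
from itertools import permutations


def ordered_pairs(tickers):
    """All ordered pairs (i, j) with i != j for one day's universe."""
    return list(permutations(tickers, 2))


# Placeholder tickers; a real universe would hold many more names.
pairs = ordered_pairs(["AAA", "BBB", "CCC"])
```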

SLIDE 19

Step 3: Predict if stock ABC will outperform stock XYZ over the next year

Classification Model

  • Training set contains 1 day every 3 months (expanding)
  • If we don’t use enough data, our model may “memorize” the best/worst stocks
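
The sampling scheme in the first bullet can be sketched as follows. This is an illustration under stated assumptions: taking the first trading day of each calendar quarter is one possible reading of "1 day every 3 months", and the 365-day cutoff (so that every training label's one-year outcome is already known) is an added assumption, not something the slides state. The function names and dates are hypothetical.

```python
from datetime import date, timedelta


def quarterly_sample(trading_days):
    """Keep the first trading day seen in each calendar quarter."""
    sampled, seen = [], set()
    for d in sorted(trading_days):
        key = (d.year, (d.month - 1) // 3)
        if key not in seen:
            seen.add(key)
            sampled.append(d)
    return sampled


def expanding_training_days(sampled_days, as_of):
    """Expanding window: all sampled days at least 365 days before `as_of`."""
    cutoff = as_of - timedelta(days=365)
    return [d for d in sampled_days if d <= cutoff]


# Made-up trading calendar.
trading_days = [date(2015, 1, 2), date(2015, 1, 5), date(2015, 4, 1),
                date(2015, 7, 1), date(2015, 10, 1), date(2016, 1, 4)]
sampled = quarterly_sample(trading_days)
train_2016 = expanding_training_days(sampled, date(2016, 8, 1))
train_2017 = expanding_training_days(sampled, date(2017, 2, 1))
```

The training set only grows as the forecast date moves forward, which is what "expanding" refers to.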

SLIDE 20

Step 4: Generate a probability matrix

P is a matrix that contains the predicted probability of stock j outperforming stock i over the next year:

$$P_{i,j} = \Pr(X_i \le X_j)$$
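
A minimal sketch of filling such a matrix from a pairwise model is shown below. The function `toy_model` is a stand-in for the trained classifier and is purely hypothetical: it applies a logistic curve to made-up scores. Leaving the diagonal as `None` is also an assumption, since a stock is never compared with itself.

```python
import math


def build_probability_matrix(n, prob_i_not_above_j):
    """Return P with P[i][j] = Pr(X_i <= X_j); the diagonal is left as None."""
    return [[None if i == j else prob_i_not_above_j(i, j) for j in range(n)]
            for i in range(n)]


# Stand-in for the trained classifier: a logistic curve on made-up scores.
toy_scores = [0.9, 0.1, 0.5]


def toy_model(i, j):
    """Hypothetical estimate of Pr(X_i <= X_j) from the score difference."""
    return 1.0 / (1.0 + math.exp(-(toy_scores[j] - toy_scores[i])))


P = build_probability_matrix(len(toy_scores), toy_model)
```

For this toy model P[i][j] + P[j][i] = 1 by construction. A real classifier evaluated on each ordered pair need not satisfy that identity exactly.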

SLIDE 21

Step 5: Probability of each stock outperforming/underperforming all other stocks

Probability of stock i outperforming all other stocks:

$$\Pr(X_i > X_1 \,\&\, X_i > X_2 \,\&\, \cdots \,\&\, X_i > X_N) = \prod_{\substack{j=1 \\ j \ne i}}^{N} \bigl(1 - \Pr(X_i \le X_j)\bigr)$$

Probability of stock i underperforming all other stocks:

$$\Pr(X_i \le X_1 \,\&\, X_i \le X_2 \,\&\, \cdots \,\&\, X_i \le X_N) = \prod_{\substack{j=1 \\ j \ne i}}^{N} \Pr(X_i \le X_j)$$

Stock     Log Probability of Best   Log Probability of Worst
Stock 1   -478.86101                -1311.6859
Stock 2   -1033.0009                -605.94667
Stock 3   -740.48224                -841.6922
Stock 4   -868.96107                -713.01704
Stock 5   -666.08427                -911.40809
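
The two products can be evaluated as sums of logarithms, which is presumably why the slide reports log probabilities: a product of hundreds of probabilities quickly underflows double precision. The sketch below assumes a matrix `P` with `P[i][j]` = Pr(X_i ≤ X_j) and every off-diagonal entry strictly between 0 and 1. The 3×3 values are made up, and the diagonal is ignored.

```python
import math


def log_prob_best_and_worst(P):
    """For each stock i, sum logs over all j != i.

    log_best[i]  = sum_j log(1 - P[i][j])   (i outperforms every other stock)
    log_worst[i] = sum_j log(P[i][j])       (i underperforms every other stock)
    """
    n = len(P)
    log_best, log_worst = [], []
    for i in range(n):
        others = [P[i][j] for j in range(n) if j != i]
        log_best.append(sum(math.log1p(-p) for p in others))
        log_worst.append(sum(math.log(p) for p in others))
    return log_best, log_worst


# Made-up matrix; diagonal entries are placeholders and never read.
P = [
    [0.0, 0.2, 0.4],
    [0.8, 0.0, 0.7],
    [0.6, 0.3, 0.0],
]
log_best, log_worst = log_prob_best_and_worst(P)
```

Note that multiplying the pairwise probabilities treats the comparisons as independent, which is a simplification implied by the product form.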

SLIDE 22

Step 6: Long the top 50 stocks each day, short the bottom 50
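
A minimal sketch of the selection follows. It assumes the long side is chosen by the highest log probability of outperforming all other stocks and the short side by the highest log probability of underperforming all other stocks. The slides do not state the ranking criterion, so treat this as one plausible reading. The five-stock example reuses the values from the Step 5 table, with signs taken as negative because a log probability cannot be positive, and uses k=2 instead of 50.

```python
def select_long_short(log_best, log_worst, k=50):
    """Return (long, short) lists of stock indices, k names per side."""
    by_best = sorted(range(len(log_best)),
                     key=lambda i: log_best[i], reverse=True)
    by_worst = sorted(range(len(log_worst)),
                      key=lambda i: log_worst[i], reverse=True)
    return by_best[:k], by_worst[:k]


# Values from the Step 5 table (index 0 is Stock 1), signs assumed negative.
log_best = [-478.86101, -1033.0009, -740.48224, -868.96107, -666.08427]
log_worst = [-1311.6859, -605.94667, -841.6922, -713.01704, -911.40809]
long_side, short_side = select_long_short(log_best, log_worst, k=2)
```

This sketch does not guard against the same stock landing on both sides, which a fuller implementation would need to handle.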

SLIDE 24

Step 7: Portfolio construction algorithm each day to determine stock weights

López de Prado, Marcos, Building Diversified Portfolios that Outperform Out-of-Sample (May 23, 2016). Journal of Portfolio Management, 2016, Forthcoming.

Overview of Algorithm

  1. Cluster the portfolios using hierarchical clustering (single linkage)
  2. Sort based on the clustering
  3. Split portfolios in half, weighting both halves by their inverse portfolio volatility
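
The third step of the overview can be sketched as a recursive bisection. This is a simplified illustration rather than the presenters' code: it assumes the clustering and sorting steps have already produced `sorted_items`, it uses the inverse-variance form of the split described in the cited paper (the slide's wording says "volatility"), and the covariance matrix is made up.

```python
def inverse_variance_weights(cov, items):
    """Weights proportional to 1 / variance, normalized to sum to 1."""
    inv = [1.0 / cov[i][i] for i in items]
    total = sum(inv)
    return [x / total for x in inv]


def cluster_variance(cov, items):
    """Variance of a cluster held at inverse-variance weights."""
    w = inverse_variance_weights(cov, items)
    return sum(w[a] * w[b] * cov[i][j]
               for a, i in enumerate(items)
               for b, j in enumerate(items))


def recursive_bisection(cov, sorted_items):
    """Split the sorted list in half repeatedly, scaling each half's weight."""
    weights = {i: 1.0 for i in sorted_items}
    clusters = [list(sorted_items)]
    while clusters:
        next_clusters = []
        for c in clusters:
            if len(c) < 2:
                continue
            left, right = c[:len(c) // 2], c[len(c) // 2:]
            var_l = cluster_variance(cov, left)
            var_r = cluster_variance(cov, right)
            alpha = 1.0 - var_l / (var_l + var_r)  # lower variance, more weight
            for i in left:
                weights[i] *= alpha
            for i in right:
                weights[i] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights


# Made-up diagonal covariance matrix for four items.
variances = [0.04, 0.01, 0.09, 0.04]
cov = [[variances[i] if i == j else 0.0 for j in range(4)] for i in range(4)]
weights = recursive_bisection(cov, [0, 1, 2, 3])
```

With a diagonal covariance matrix, as in this example, the result coincides with plain inverse-variance weighting. The two differ once correlations are present.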

SLIDE 29

Step 8: Layer portfolios each day

SLIDE 30

Questions?