Forecasting High Frequency Volatility: A study of the Bitcoin Market - - PowerPoint PPT Presentation



SLIDE 1

Forecasting High Frequency Volatility: A study of the Bitcoin Market using Support Vector Regression

Yaohao Peng, Mariana Rosa Montenegro, Ana Julia Akaishi Padula, Jader Martins Camboim de Sá

University of Brasilia Laboratory of Machine Learning in Finance and Organizations

SLIDE 2

Main goals

◮ Evaluate the predictive performance of machine learning techniques for Bitcoin volatility in comparison to GARCH models
◮ Error metrics: Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
◮ Diebold-Mariano Test
◮ Analyze the Bitcoin volatility on low (daily) and high (hourly) frequency data sets

SLIDE 3

Motivation: The evolution of wealth

“Wealth” is a key concept in finance, and the idea behind it has changed radically throughout history (Ferguson, 2008)

◮ Wealth as a consequence of power: having the means to conquer and pillage
◮ Wealth as the cause of power: possession of precious metals; production and trade
◮ Wealth as possessing money: money can be converted to any other asset
◮ Wealth as possessing financial assets: money’s role as a store of value keeps weakening
◮ Can cryptocurrencies be the next step?

SLIDE 4

Cryptocurrencies

SLIDE 5

Why Bitcoin?

Satoshi Nakamoto

◮ One of the richest “people” in the history of mankind

SLIDE 6

Volatility forecasting

Volatility forecasting is of great importance in financial time series analysis

◮ Decisive impacts on risk management and derivatives pricing
◮ Financial series’ conditional variance is typically non-constant
◮ Classic models: ARCH (Engle & Bollerslev, 1986), GARCH (Bollerslev, 1986), EGARCH (Nelson, 1991), GJR-GARCH (Glosten, Jagannathan & Runkle, 1993)
◮ GARCH(1,1) can be written as an ARCH(∞) with geometrically decaying coefficients, and performs well for financial data (Hansen & Lunde, 2005; Orhan & Köksal, 2012)

SLIDE 7

High frequency volatility forecasting

The growth of financial transaction flows motivates a “high-frequency trading paradigm” (Easley, López de Prado & O’Hara, 2012)

◮ The intraday volatility of exchange rates and cryptocurrencies tends to be very high (Li & Wang, 2016)

SLIDE 8

Machine learning in volatility forecasting

Support Vector Regression (SVR) is a kernel-based learning algorithm that can fit highly nonlinear models while using few parameters

◮ Applications in volatility forecasting: (Chen, Härdle & Jeong, 2010; Premanode & Toumazou, 2013; Santamaría-Bonfil, Frausto-Solís & Vázquez-Rodarte, 2015)
◮ SVR’s efficiency and superiority over other machine learning techniques are discussed in Gavrishchaka & Banerjee (2006) and Baruník & Křehlík (2016)

SLIDE 9

Bitcoin volatility forecasting

Analyses of Bitcoin volatility are still scarce and focus mainly on traditional GARCH models and their extensions (Li & Wang, 2016)

◮ Bitcoin’s reaction to news is quicker than that of gold and the US dollar (Dyhrberg, 2016a; 2016b)
◮ Fundamental value vs speculative bubbles (Dowd, 2014)
◮ Informational inefficiency (Urquhart, 2016)

SLIDE 10

GARCH(1,1)

$r_t = \mu_t + \epsilon_t$
$\mu_t = \gamma_0 + \gamma_1 r_{t-1}$
$h_t = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \beta_1 h_{t-1}$

◮ Proxy volatility: $\tilde{h}_t = (r_t - \bar{r})^2$ (Chen, Härdle & Jeong, 2010)

For this paper, we used the Gaussian, Student’s t and Skewed Student’s t distributions for $\epsilon_t$
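
The GARCH(1,1) recursion on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter values are made up, the function name `garch11_filter` is hypothetical, and the recursion is started with the sample variance and a zero first residual, which the slides do not specify. Parameter estimation by maximum likelihood is not shown.

```python
def garch11_filter(returns, gamma0, gamma1, alpha0, alpha1, beta1):
    """Run the AR(1) mean and GARCH(1,1) variance recursions for fixed parameters."""
    n = len(returns)
    mean_r = sum(returns) / n
    # Proxy volatility from the slide: squared deviation from the sample mean.
    proxy = [(r - mean_r) ** 2 for r in returns]
    # Assumption: initialise h_0 with the sample variance and eps_0 with zero.
    h = [sum(proxy) / n]
    eps = [0.0]
    for t in range(1, n):
        mu_t = gamma0 + gamma1 * returns[t - 1]                    # mean equation
        h_t = alpha0 + alpha1 * eps[t - 1] ** 2 + beta1 * h[t - 1]  # variance equation
        eps.append(returns[t] - mu_t)                              # residual eps_t
        h.append(h_t)
    return h, eps, proxy


# Illustrative inputs only; these are not estimates from the Bitcoin data.
returns = [0.01, -0.02, 0.015, -0.005, 0.03]
h, eps, proxy = garch11_filter(returns, 0.0, 0.1, 1e-5, 0.1, 0.85)
```

Comparing `h` with `proxy` on a held-out sample is what the error metrics later in the deck measure.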

SLIDE 11

Support Vector Regression

Support Vector Regression computes nonlinear decision functions by means of a kernel function $\kappa(x_i, x_j) = \varphi(x_i)^T \varphi(x_j) \in \mathbb{R}$, where $\varphi$ maps the original data to a much higher-dimensional space

◮ This paper used the Gaussian kernel $\kappa(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$, $\sigma > 0$, the most widely used in the machine learning literature
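
As a small illustration of the Gaussian kernel above, the sketch below evaluates it for pairs of points and builds the resulting kernel (Gram) matrix. The function names and the sample points are hypothetical and chosen only for the example.

```python
import math


def gaussian_kernel(x_i, x_j, sigma):
    """Gaussian kernel exp(-||x_i - x_j||^2 / (2 * sigma^2)) for sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, sigma):
    """Kernel values for every pair of points; symmetric with ones on the diagonal."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


# Illustrative points only.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(points, sigma=1.0)
```

A smaller `sigma` makes the kernel decay faster with distance, so each training point influences a narrower neighbourhood.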

SLIDE 12

Support Vector Regression

The SVR decision function has the form

$f(x_i) = w^T \varphi(x_i) - w_0 = \sum_{j=1}^{n} \kappa(x_i, x_j)(\lambda_j^* - \lambda_j) - w_0$

Given the bias-variance dilemma, two parameters are introduced:

◮ To avoid overfitting, a tolerance band $\varepsilon$ is allowed for the deviation between observed and predicted values
◮ For deviations that exceed $\varepsilon$ by an amount $\xi > 0$, a penalty $C$ is added to SVR’s objective function
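
The decision function and the tolerance band can be illustrated as follows. This is a sketch, not a solver: the dual coefficients `lam_star` and `lam`, the offset `w0` and the support points are made-up values that a real SVR fit would obtain by optimisation, and the function names are hypothetical. The second function is the standard ε-insensitive loss, which is zero inside the band and grows linearly outside it.

```python
import math


def gaussian_kernel(x_i, x_j, sigma):
    """Gaussian kernel exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def svr_predict(x, support_points, lam_star, lam, w0, sigma):
    """f(x) = sum_j kappa(x, x_j) * (lam_star_j - lam_j) - w0."""
    total = 0.0
    for x_j, ls_j, l_j in zip(support_points, lam_star, lam):
        total += gaussian_kernel(x, x_j, sigma) * (ls_j - l_j)
    return total - w0


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Deviation beyond the tolerance band (the slack xi); zero inside the band."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Made-up coefficients for illustration; a fitted model would supply these.
support_points = [[0.0], [1.0]]
lam_star = [0.5, 0.0]
lam = [0.0, 0.2]
prediction = svr_predict([0.0], support_points, lam_star, lam, w0=0.1, sigma=1.0)

inside_band = epsilon_insensitive_loss(1.0, 0.95, epsilon=0.1)
outside_band = epsilon_insensitive_loss(1.0, 0.7, epsilon=0.1)
```

In the objective function, each nonzero slack is weighted by `C`, so a larger `C` tolerates fewer points outside the band.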

SLIDE 13

Support Vector Regression

SLIDE 14

SVR-GARCH(1,1)

The SVR-GARCH(1,1) follows the same structure as the GARCH(1,1), with the mean and volatility equations estimated via SVR:

$r_t = f_m(r_{t-1}) + \epsilon_t$
$h_t = f_v(h_{t-1}, \epsilon_{t-1}^2)$    (1)

◮ Santamaría-Bonfil, Frausto-Solís & Vázquez-Rodarte (2015) presented empirical evidence that the SVR-GARCH outperformed standard GARCH predictions, showing a better ability to approximate the nonlinear behavior of financial data and stylized facts such as heavy tails and volatility clusters
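
One way to read equation (1) is as two supervised regression problems with lagged inputs. The sketch below only builds the input/target pairs that the two SVR fits would consume; it does not fit anything. It assumes the proxy volatility (squared deviation from the mean return) stands in for $h_t$, and the residuals shown are hypothetical placeholders for $r_t - f_m(r_{t-1})$. The function names are illustrative.

```python
def mean_equation_samples(returns):
    """Pairs (r_{t-1}) -> r_t for the mean function f_m."""
    inputs = [[returns[t - 1]] for t in range(1, len(returns))]
    targets = [returns[t] for t in range(1, len(returns))]
    return inputs, targets


def variance_equation_samples(volatility, residuals):
    """Pairs (h_{t-1}, eps_{t-1}^2) -> h_t for the volatility function f_v."""
    if len(volatility) != len(residuals):
        raise ValueError("volatility and residuals must have the same length")
    inputs = [[volatility[t - 1], residuals[t - 1] ** 2]
              for t in range(1, len(volatility))]
    targets = [volatility[t] for t in range(1, len(volatility))]
    return inputs, targets


# Illustrative returns only.
returns = [0.01, -0.02, 0.015, -0.005]
X_m, y_m = mean_equation_samples(returns)

# Hypothetical residuals, standing in for r_t - f_m(r_{t-1}) after the mean fit.
residuals = [0.0, -0.021, 0.017, -0.0065]
mean_r = sum(returns) / len(returns)
proxy = [(r - mean_r) ** 2 for r in returns]  # assumed stand-in for h_t
X_v, y_v = variance_equation_samples(proxy, residuals)
```

The pairs `(X_m, y_m)` and `(X_v, y_v)` could then be passed to any SVR implementation with a Gaussian kernel, fitting the mean equation first so that its residuals feed the volatility equation.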

SLIDE 15

Empirical analysis

◮ Data collected from January 5th 2015 to December 31st 2016.
◮ Both low and high frequency databases were split into three mutually exclusive subsets: training set (50%), validation set (20%) and test set (30%).
◮ The parameter search was performed by grid search
◮ Prediction performance was evaluated by the error metrics RMSE and MAE and by the Diebold-Mariano test for predictive accuracy
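
The split and the grid search described above could look roughly like this. The sketch assumes the split is chronological (the slides only say the subsets are mutually exclusive), and the candidate values for `C`, `epsilon` and `sigma` as well as `toy_validation_error` are invented stand-ins; in practice the error function would fit an SVR on the training set and score it on the validation set.

```python
from itertools import product


def chronological_split(series, train_pct=50, valid_pct=20):
    """Split a series in time order into training, validation and test parts."""
    n = len(series)
    n_train = n * train_pct // 100
    n_valid = n * valid_pct // 100
    train = series[:n_train]
    valid = series[n_train:n_train + n_valid]
    test = series[n_train + n_valid:]
    return train, valid, test


def grid_search(grid, validation_error):
    """Try every parameter combination and keep the one with the lowest error."""
    names = list(grid)
    best_params, best_error = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = validation_error(**params)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


def toy_validation_error(C, epsilon, sigma):
    """Stand-in for 'fit on training data, return error on validation data'."""
    return abs(C - 10.0) + epsilon + abs(sigma - 1.0)


train, valid, test = chronological_split(list(range(10)))

# Hypothetical candidate values; the slides do not list the grid that was used.
grid = {"C": [1.0, 10.0, 100.0], "epsilon": [0.01, 0.1], "sigma": [0.5, 1.0]}
best_params, best_error = grid_search(grid, toy_validation_error)
```

Keeping the test set out of the search means the reported errors are not biased by the parameter selection.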

SLIDE 16

Forecasting performance: Error metrics

◮ Both error metrics were significantly lower for SVR-GARCH(1,1) in comparison to the GARCH models
◮ The overall volatility was higher in low frequency data than in high frequency data (as also seen in Xie & Li (2010))
◮ The GARCH with Gaussian distribution performed slightly worse than with the Student’s t and Skewed Student’s t distributions
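
For reference, the two error metrics compared on this slide can be computed as in the sketch below. The numbers are illustrative and are not results from the study.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more heavily."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Illustrative values only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.0]
rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same forecasts, and the gap widens when a few errors are much larger than the rest.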

SLIDE 17

Forecasting performance: Diebold-Mariano Test

◮ For the majority of the tested models, the null hypothesis is rejected at a confidence level above 99%, providing strong statistical evidence of the predictive superiority of SVR-GARCH(1,1) over GARCH models
◮ In both data frequencies, the p-value for the Gaussian GARCH model was the lowest
◮ In high frequency data, the test showed that SVR-GARCH(1,1) is “less emphatically” better than the other models, especially the Skewed Student’s t GARCH(1,1)
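
A simplified version of the Diebold-Mariano test can be sketched as follows. It assumes a squared-error loss, one-step-ahead forecasts (so no autocovariance correction is applied) and the asymptotic standard normal distribution for the statistic; the slides do not state which variant was used. The null hypothesis is that both models have equal expected loss, and the forecast errors below are made up.

```python
import math


def diebold_mariano(errors_a, errors_b):
    """DM statistic and two-sided p-value for squared-error loss, horizon 1."""
    if len(errors_a) != len(errors_b):
        raise ValueError("both error series must have the same length")
    n = len(errors_a)
    # Loss differential d_t = e_a,t^2 - e_b,t^2.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    if var_d == 0.0:
        raise ValueError("loss differential has zero variance")
    stat = d_bar / math.sqrt(var_d / n)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|stat|)).
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Made-up forecast errors: model A is consistently closer than model B.
errors_a = [0.1, -0.2, 0.1, 0.3, -0.1, 0.2]
errors_b = [0.5, -0.6, 0.4, 0.7, -0.5, 0.6]
stat, p_value = diebold_mariano(errors_a, errors_b)
```

A negative statistic means the first model has the lower average loss; a small p-value indicates the difference is unlikely under the null of equal accuracy.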

SLIDE 18

Limitations and future developments

◮ Analyze other markets (derivatives, commodities,...) and cryptocurrencies (Ethereum, Litecoin, Dash,...)
◮ Replication for different time periods and data frequencies
◮ Comparison with other machine learning methods
◮ Test other GARCH extensions, distributions for $\epsilon_t$ and kernel functions

SLIDE 19

Thank you!

peng.yaohao@gmail.com lamfo.unb.br lamfo-unb.github.io