
This is the first post in a two-part series about the Hurst Exponent. Tom and I worked on this series together and I drew on some of his previously published work as well as other sources like the very useful Quantstart.com.
The remainder of this post is devoted to presenting and discussing some Python code for calculating Hurst. In the next post, we are going to delve more deeply into the calculation and work out what’s going on. Our ultimate goal is to demystify the Hurst Exponent and show how to take it beyond some nice theory to something of practical value to algo traders.
Without further ado, here is the code for calculating the Hurst Exponent in Python. We determine Hurst by first calculating the standard deviation of the difference between a series and a lagged copy of itself. We then repeat this calculation for a range of lags and plot the result as a function of the lag. On a log-log scale, we end up with a straight line, the slope of which provides an estimate of the Hurst exponent. I found this article, which describes this approach to calculating Hurst, as does this one.
from numpy import *
from pylab import plot, show

# first, create an arbitrary time series, ts
ts = [0]
for i in range(1, 100000):
    ts.append(ts[i-1]*1.0 + random.randn())

# calculate standard deviation of differenced series using various lags
lags = range(2, 20)
tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]

# plot on log-log scale
plot(log(lags), log(tau)); show()

# calculate Hurst as slope of log-log plot
m = polyfit(log(lags), log(tau), 1)
hurst = m[0]*2.0
print('hurst =', hurst)
You can see in the code that we used lags 2 through 20 for calculating H. These lags are somewhat arbitrary, but were chosen based on the best results obtained using synthetic data with known behaviour. Specifically, we found that if we set the maximum lag too high, the results became quite inaccurate. These values are also the defaults used in some other implementations, such as the standard Hurst function in MATLAB.
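If you want to experiment with the lag range yourself, here is a minimal sketch that wraps the calculation above in a function. The hurst_estimate name and the min_lag/max_lag parameters are our own additions, not part of the original code; the sketch also checks the estimate against synthetic series with known behaviour (a random walk should come out near 0.5, and white noise, an extreme case of mean reversion, near 0).

import numpy as np

def hurst_estimate(ts, min_lag=2, max_lag=20):
    """Estimate the Hurst exponent using the lagged-differences approach above."""
    lags = list(range(min_lag, max_lag))
    # standard deviation of the lag-differenced series (with the extra sqrt,
    # hence the factor of 2 applied to the slope below)
    tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]
    # slope of the log-log fit
    m = np.polyfit(np.log(lags), np.log(tau), 1)
    return m[0] * 2.0

np.random.seed(0)
walk = np.cumsum(np.random.randn(100000))   # random walk: expect H near 0.5
noise = np.random.randn(100000)             # white noise: expect H near 0

print('random walk:', hurst_estimate(walk))
print('white noise:', hurst_estimate(noise))

# vary max_lag to experiment with the lag range discussed above
print('random walk, max_lag=300:', hurst_estimate(walk, max_lag=300))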
In the next post, we will look at these lags in more detail and show how they are crucial for calculating Hurst in a way that is useful and meaningful. We will tweak this part of the calculation to uncover a practical application of Hurst in developing algo trading systems. Check out the follow-up post, Demystifying the Hurst Exponent – Part 2, here.
If you have used the Hurst Exponent, or indeed any of the other tests for mean reversion that we mentioned in this post, please share your experiences in the comments. Thanks!
Comments
Hi Kris,
I’ve had a brief look at Kaufman’s Efficiency Ratio. The results weren’t great, but I didn’t dig into it thoroughly enough to fully rule it out as a filter for the mean reversion systems I’m trading.
Happy to write it up with a proper analysis.
Nick
Hey Nick
I’d love to see a proper analysis of Kaufman’s Efficiency Ratio! You are welcome to share it here!
Cheers
Thanks Kris – will do – I’ll write it up and submit. Many thanks
Interesting post, thanks… I’m currently investigating applying the Hurst exponent in machine learning to improve my trade selection. Try as I might, however, I can’t find a default Hurst Matlab function!
Hi David,
I guess it’s not really a default function, but try this implementation of the generalized Hurst exponent:
https://au.mathworks.com/matlabcentral/fileexchange/30076-generalized-hurst-exponent?requestedDomain=www.mathworks.com
Thanks Kris, I’ll give that a try… I also stumbled across this new Matlab function, which gives 3 separate (!) estimates of the Hurst exponent:
https://uk.mathworks.com/help/wavelet/ref/wfbmesti.html
Yes! This is a problem and one that we are going to attempt to get to the bottom of in the next post. Thanks for sharing that implementation.
Hi Kris,
Thanks for the article, incredibly interesting. I was just wondering if you could shed some light on where the sqrt of std comes from? I’ve been through both articles you linked to and find the RS method of calculating the Hurst exponent (the one originally derived by Hurst for the work on Nile levels) to be intuitively easier to understand. Looking at your update note, my guess would be that you were initially using that algorithm for calculating Hurst, but then moved on to the Generalized Hurst Exponent calculation that Mike from QuantStart uses on his page?
I’ve been through Mike’s explanation, but I still can’t seem to rationalize why we are using the square root of the standard deviation in this algorithm. Any light you could shed on that would be much appreciated.
Thanks again for the great article.
Hi Kris,
thanks for sharing.
I think we can skip the sqrt() step here: tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags],
and then there is no need to multiply by 2 here: hurst = m[0]*2.0.
So the computation simplifies to:
…
tau = [std(subtract(ts[lag:], ts[:-lag])) for lag in lags],
…
hurst = m[0]
which saves a little computation and is easier to understand from the definition of the Hurst exponent.
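Since log(sqrt(x)) = 0.5*log(x), the extra square root just halves the slope of the log-log fit, which the *2.0 then undoes, so the two versions give the same estimate. Putting the two changes into the code from the post, a minimal sketch of the simplified version:

import numpy as np

# same arbitrary random-walk series as in the post
ts = [0]
for i in range(1, 100000):
    ts.append(ts[i-1] + np.random.randn())

lags = range(2, 20)
# standard deviation of the differenced series, without the extra sqrt
tau = [np.std(np.subtract(ts[lag:], ts[:-lag])) for lag in lags]

# the slope of the log-log fit is now the Hurst exponent directly
m = np.polyfit(np.log(list(lags)), np.log(tau), 1)
hurst = m[0]
print('hurst =', hurst)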
But still thanks a lot for sharing this, learned a lot. 🙂
Regards
Andy
Nice one! Thanks Andy, that makes a lot of sense.