What if you had a tool that could help you decide when to apply mean reversion strategies and when to apply momentum to a particular time series? That’s the promise of the Hurst exponent, which helps characterise a time series as mean reverting, trending, or a random walk. For a brief introduction to Hurst, including some Python code for its calculation, check out our previous post. Even if you have read this post previously, it is worth checking out again as we have updated our method for calculating Hurst and believe this new implementation is more accurate.
It would be great if we could plug some historical time series data into the Hurst algorithm and know whether we expect the time series to mean revert or trend. But as is usually the case when we apply such tools to the financial domain, it isn’t quite that straightforward. In the last post, we noted that Hurst gives different results depending on how it is calculated; this raises the question of how to choose a calculation method intelligently so that we avoid choosing arbitrary parameters. The purpose of this post is to delve into the algorithm behind the calculation of Hurst in an attempt to answer that question. Hopefully we will draw some conclusions about whether, when and how we might apply the very attractive theory of Hurst in a manner that is practical for systematic traders.
In this post, we perform the analysis in Python, which is something of a departure from tradition for Robot Wealth. We are currently building our skills in both Python and the Microsoft .NET framework to complement our skills in R, C and MATLAB, so expect to see more from us using these tools.
Introduction to Hurst
The Hurst exponent, H, measures the long-term memory of a time series, characterising it as either mean-reverting, trending or a random walk. H is a number between 0 and 1, with H < 0.5 indicating mean reversion, H > 0.5 indicating a trending time series and H = 0.5 indicating a random walk. Smaller and larger values of H indicate stronger mean-reversion and trending, respectively.
Here, we apply the algorithm for calculating Hurst from the previous post to an artificial mean-reverting time series created from a discrete Ornstein-Uhlenbeck process parameterised arbitrarily:
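# imports needed to run this snippet on its own
from numpy import zeros, sqrt, std, subtract, log, polyfit
from numpy.random import randn
from pylab import plot, show

# create OU process
N = 100000
ts = zeros(N)
mu = 0.75      # long-run mean
theta = 0.04   # speed of mean reversion
sigma = 0.05   # noise scale
for i in range(1, N):
    dts = (mu - ts[i-1])*theta + randn()*sigma
    ts[i] = ts[i-1] + dts

# calculate Hurst
lag1 = 2
lags = range(lag1, 20)  # lags 2..19 (range excludes the endpoint)
tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]
plot(log(lags), log(tau)); show()
m = polyfit(log(lags), log(tau), 1)  # m[0] is the slope, m[1] the intercept
hurst = m[0]*2
print('hurst = ', hurst)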
This returns a Hurst exponent of around 0.43, indicating that the series is moderately mean reverting, as expected.
As the algorithm shows, calculation of Hurst is related to the autocorrelations of the time series. Autocorrelation (also known as serial correlation) refers to the correlation between a time series and lagged values of itself. In particular, Hurst is related to the rate at which these autocorrelations decrease as the lag increases. We know that we get different values of Hurst depending on which lags we use in its calculation. So which lags should we focus on?
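To make the link between the code and H explicit, here is a sketch of the scaling argument the calculation relies on. The notation ($x_t$ for the series, $\tau$ for the lag, $c$ for the intercept of the fit) is our own shorthand for this post.

```latex
\begin{align}
\operatorname{Std}\left(x_{t+\tau} - x_t\right) &\propto \tau^{H} \\
\sqrt{\operatorname{Std}\left(x_{t+\tau} - x_t\right)} &\propto \tau^{H/2} \\
\log \sqrt{\operatorname{Std}\left(x_{t+\tau} - x_t\right)} &= \tfrac{H}{2}\,\log \tau + c
\end{align}
```

The code regresses the log of the square-rooted standard deviation on the log of the lag, so the fitted slope estimates H/2, which is why the slope is doubled. Because the fit only sees the lags we feed it, the result is really an estimate of the scaling over that particular range of lags.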
Analysing SPY with Hurst
In the example above, we used lags 2-20 (strictly, range(2, 20) stops at lag 19). This choice is completely arbitrary and is the same value used by default in the MATLAB genhurst function for calculating the Hurst exponent. Let’s look at some real financial data, namely the price history of the SPY ETF, and investigate the effect of varying this range of lags and the subset of data analysed. Here’s the code for obtaining and plotting the data:
# pandas.io.data has moved to the separate pandas-datareader package
import pandas_datareader.data as web
import datetime
from numpy import *
from pylab import plot, show

start = datetime.datetime(1993, 1, 1)
end = datetime.datetime(2016, 12, 31)
spy = web.DataReader("SPY", 'yahoo', start, end)
closes = spy['Adj Close'][:]
plot(closes); show()
And a plot of the data:
When we calculate Hurst we get a value of roughly 0.436, which tells us that our series is moderately mean reverting.
# calculate Hurst
# work on the raw NumPy array so the two slices are subtracted
# position by position rather than matched up by date
prices = closes.values
lag1 = 2
lags = range(lag1, 20)
tau = [sqrt(std(subtract(prices[lag:], prices[:-lag]))) for lag in lags]
plot(log(lags), log(tau)); show()
m = polyfit(log(lags), log(tau), 1)  # m[0] is the slope
hurst = m[0]*2
print('hurst = ', hurst)
Now, let’s use the last 2000 values only.
# recent prices
closes_recent = spy['Adj Close'][-2000:]
plot(closes_recent); show()

# calculate Hurst of recent prices
# raw array again, so slices are subtracted by position, not by date
prices_recent = closes_recent.values
lag1 = 2
lags = range(lag1, 20)
tau = [sqrt(std(subtract(prices_recent[lag:], prices_recent[:-lag]))) for lag in lags]
plot(log(lags), log(tau)); show()
m = polyfit(log(lags), log(tau), 1)  # m[0] is the slope
hurst = m[0]*2
print('hurst = ', hurst)
Here’s a plot of the recent SPY data:
We can see a clear trend, so we would expect a Hurst value greater than 0.5. However, our algorithm returns H = 0.428, slightly less than the value calculated for the longer period. This is clearly not in line with what we would expect, so let’s try looking at another range of lags. After trying a few different ranges, we found that using lags 300-400 resulted in a Hurst exponent of 0.668, which is more in line with what we would expect. However, if we plug those lags into the Hurst calculation for the entire series, we get H = 0.641, which may seem a little odd, at first glance anyway.
We can draw a significant conclusion from these results: that the lags used to calculate Hurst have a much greater impact on the calculation than the particular segment of the time series analysed (for SPY, anyway). Further, when we look at the entire time series, we see that SPY is moderately mean-reverting for shorter lags. Lags up to about 20 are most mean reverting, and then H increases as we increase the lags used in its calculation. If we continue to increase the lags used to calculate H up to the range 300-400, we find that H indicates a moderately trending time series. We also find that for moderate lag values, H tends to approach 0.5, suggesting that SPY is also a random walk at some time scales.
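One way to explore this dependence is to compute H over several lag windows and compare the results. The sketch below does that on synthetic data rather than SPY, so it runs without a data download: a pure random walk and an OU process with the same parameters used earlier. The helper name hurst_exponent, the seed and the three lag windows are illustrative assumptions, not part of the analysis above.

```python
import numpy as np


def hurst_exponent(series, lags):
    """Estimate H from the slope of log(sqrt(std of lagged differences)) vs log(lag)."""
    x = np.asarray(series, dtype=float)
    lags = np.asarray(list(lags))
    tau = [np.sqrt(np.std(x[lag:] - x[:-lag])) for lag in lags]
    slope = np.polyfit(np.log(lags), np.log(tau), 1)[0]
    return 2.0 * slope


rng = np.random.default_rng(0)  # fixed seed (arbitrary) for repeatability
n = 100000
noise = rng.standard_normal(n)

# Pure random walk: H should be close to 0.5, up to sampling noise.
rw = np.cumsum(noise)

# Discrete OU process with the parameters from the earlier example.
mu, theta, sigma = 0.75, 0.04, 0.05
ou = np.zeros(n)
for i in range(1, n):
    ou[i] = ou[i - 1] + (mu - ou[i - 1]) * theta + sigma * noise[i]

# Hypothetical lag windows: short, medium and long.
windows = [(2, 20), (20, 100), (300, 400)]
results = {}
for name, series in (("random walk", rw), ("OU", ou)):
    for lo, hi in windows:
        h = hurst_exponent(series, range(lo, hi))
        results[(name, lo, hi)] = h
        print(f"{name:12s} lags {lo:3d}-{hi - 1:3d}: H = {h:.3f}")
```

On the OU series the estimate should fall as the lags get longer, because a stationary series stops spreading out once the lag exceeds its reversion time. That is the opposite direction to what we saw for SPY, but it makes the same point: H describes a time scale, not the series as a whole.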
This leads to the conclusion that SPY is neither absolutely mean reverting nor absolutely trending. Instead, Hurst indicates that it is moderately mean reverting over short time periods and tends to exhibit momentum over the longer term. I find this to be a pleasing result, one that is in line with the results of Jegadeesh and Titman (1993), who investigated momentum and found that multiple-month relative returns predict future returns. The result is also in line with what we tend to see over shorter time horizons in equities markets.
What I find most extraordinary about these results is that they are consistent regardless of whether the time series appears to the naked eye to be trending (as in the sub-period from 2009 to the end of 2016) or mean reverting. Who could conclude, through observation of the price series alone, that the time series in Figure 2 was mean reverting over a short time horizon? I suppose if one looks closely, regular pullbacks in the uptrend are visible, but I don’t think it is obviously mean reverting. I believe that Hurst is therefore actually giving us some incredibly valuable insight into the behaviour of SPY.
In order to see this effect more clearly, consider the following random walk, which by construction has no memory effect but does have a definite uptrend:
# artificial random walk with trend
from numpy.random import randn
rw = cumsum(randn(10000)+0.025)
plot(rw); show()
Calculating Hurst for this series gives H = 0.502, which indeed corresponds to a random walk, despite the obvious uptrend. This is a great example of how our eyes can easily fail to detect the underlying dynamics of a time series.
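The reason the drift makes no difference can be checked directly. A constant drift adds the same amount to every lagged difference, and the standard deviation ignores a constant shift, so the estimate should be the same with or without the trend. Here is a minimal sketch of that check; the helper name hurst_exponent and the seed are assumptions for illustration, and the lags default to the 2-19 range used above.

```python
import numpy as np


def hurst_exponent(series, lags=range(2, 20)):
    """Estimate H from the slope of log(sqrt(std of lagged differences)) vs log(lag)."""
    x = np.asarray(series, dtype=float)
    lags = np.asarray(list(lags))
    tau = [np.sqrt(np.std(x[lag:] - x[:-lag])) for lag in lags]
    slope = np.polyfit(np.log(lags), np.log(tau), 1)[0]
    return 2.0 * slope


rng = np.random.default_rng(42)  # fixed seed (arbitrary) for repeatability
steps = rng.standard_normal(10000)

rw_flat = np.cumsum(steps)           # random walk without drift
rw_trend = np.cumsum(steps + 0.025)  # same shocks plus a constant drift

h_flat = hurst_exponent(rw_flat)
h_trend = hurst_exponent(rw_trend)
print(f"H without drift: {h_flat:.4f}")
print(f"H with drift:    {h_trend:.4f}")
```

Both estimates are built from the same random shocks, so any gap between them is floating-point noise only. The visible uptrend lives entirely in the mean of the differences, which this calculation never looks at.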
In summary, I hope I have demonstrated that despite the attractive theory, utilising the Hurst calculation requires some deeper thought and analysis than simply plugging some numbers into an algorithm. However, if we take the time to investigate, we can obtain some potentially very useful insights. Hurst is essentially a measure of the memory in a particular time series, and this memory can be both mean-reverting and trending at the same time, depending on the time scale.
In the case of the SPY ETF, the Hurst calculation showed that we would be more likely to trade successfully on short time frames if we utilised a mean-reversion trading model. Conversely, Hurst suggests that we would be more likely to be successful on longer time frames with momentum-style trading models. We also saw that Hurst approaches 0.5 – the random walk – for medium time frames, suggesting that we should perhaps avoid models that rely on that time horizon. Of course, this assumes that the future will be like the past, which may or may not be the case. However, we did see some consistency in the H values calculated for the entire time series and a subset of that time series, which gives me some degree of confidence in this approach.
What do you think? How would you interpret the results I obtained for the SPY ETF? I would love to hear from you in the comments.