Hurst Exponent

Originally taken from this thread, the Hurst exponent tells you whether a series is:

  1. A geometric random walk (H = 0.5)
  2. Mean-reverting (H < 0.5)
  3. Trending (H > 0.5)

As H decreases towards zero, the price series becomes more mean-reverting; as it increases towards one, the series becomes more trending. As for testing whether H really equals 0.5, there's something called the variance-ratio test (more to update later).
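The estimator behind this idea can be sketched in a few lines of Python. This is a common implementation (the slope of a log-log regression of lagged-difference dispersion against the lag), not necessarily the exact code from the attached backtest, and the max_lag default is an arbitrary choice:

```python
import numpy as np

def hurst(ts, max_lag=20):
    """Estimate the Hurst exponent from the scaling of lagged differences.

    Twice the slope of log(sqrt(std of lag-tau differences)) against
    log(tau) gives H: ~0.5 for a random walk, <0.5 for mean reversion,
    >0.5 for trending.
    """
    ts = np.asarray(ts, dtype=float)
    lags = range(2, max_lag)
    # Dispersion of the series differenced at each lag
    tau = [np.sqrt(np.std(ts[lag:] - ts[:-lag])) for lag in lags]
    # Fit a line in log-log space; the slope is H/2
    return 2.0 * np.polyfit(np.log(list(lags)), np.log(tau), 1)[0]
```

On a simulated random walk this lands near 0.5, and on plain white noise (strongly mean-reverting) it lands near 0.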

Much of the credit goes to Tom Starke, as the mathematics behind the code is beyond my reach.

Enjoy!

-Seong

Clone Algorithm
# Backtest ID: 523767f96f6c990720665805

25 responses

Hi Seong,

I am not a specialist in statistical models, but I found that while the Hurst exponent has some relevance in determining the nature of autocorrelation in a time series (i.e. trend: future returns are positively correlated with past returns), it is by itself not very reliable. I built this little neural network tool, which uses the Hurst exponent and the Sharpe ratio to identify the different domains more effectively (https://www.leinenbock.com/training-a-simple-neural-network-to-recognise-different-regimes-in-financial-time-series/). I had much better results with it than with the Hurst alone. The tricky bit is training the network. In my example I produce randomly generated time series for which I know whether they are mean-reverting, trending or random. Initially I train with very 'obvious' data sets and then move to more subtle ones. It does a good job of picking them up, and I found it especially useful for micro-trends.
My choice of the Hurst and the Sharpe for this job is that the Hurst seems better with mean reversion than trends, while the Sharpe obviously does trends well. Since putting this on my site I've much improved it by using a third indicator, but that is not ready yet.
Play around with it some time. It's good fun, actually.

Tom,

Thanks for your reply. I've been playing around with the neural network and am wondering why, after a certain point, the output remains at [-1]:

8379 0 [-1.] TRUE 1.6839304718 90 %  
8380 1 [-1.] FALSE 1.68333333333 90 %  
8381 1 [-1.] FALSE 1.68273661822 90 %  
8382 1 [-1.] FALSE 1.68214032601 90 %  
8383 1 [-1.] FALSE 1.68154445625 90 %  
8384 0 [-1.] TRUE 1.68189868934 90 %  

It seems to happen with both the Quantopian version and one that I've compiled for Python on my desktop:

2002-12-24handle_data:179DEBUG[-1.]  
2002-12-24handle_data:179DEBUG[-1.]  
2002-12-24handle_data:179DEBUG[-1.]  
2002-12-24handle_data:179DEBUG[-1.]  
2002-12-24handle_data:179DEBUG[-1.]  

Very interesting topic though am having a lot of fun with it

Clone Algorithm
# Backtest ID: 5243435d92cd0106d0976d1d

I did see that too and I'm not quite sure why that is. Will have to look into it. It is probably something quite trivial related to how those simple NNs are implemented. Proper understanding of NNs is an art in itself; that's the reason I'm cautious in using them and (so far) only apply them to relatively simple things. What I did in some of my programs was to introduce a termination condition, for example: if 90% of the last N events are recognised correctly, stop learning and refine the training conditions.
Even though those NNs are relatively simple in their essential functionality, the routines that you have to build around them to make them work for particular problems can be quite complex. My philosophy is to keep the basic algos as simple as possible, so I'm not sure it's always worth the effort. It's easy to introduce unmanageable complexity when you write new algorithms for optimisation. Even though there are a number of very fancy toolboxes in Python, I like to use basic code, which I can see, change and understand, rather than relying on black-box magic. As you can see, even the simple code above shows some properties which aren't immediately obvious if you are not in the field.

Thanks for the reply, Tom.

I went ahead and tried using the neural network in an actual algorithm, albeit very simply done. You can find it here: https://quantopian.com/posts/neural-network-that-tests-for-mean-reversion-or-momentum-trending

I am new to the Hurst exponent and algorithmic trading in general. While the mathematics is above me, I am intrigued by the idea behind the Hurst exponent. Would there be a way to use the Hurst exponent to identify a trending market, and then combine it with a simple EMA or SMA crossover trading system? As noted here, it seems that the Hurst exponent in and of itself is not enough for a trading system. Thanks.

Josh,

I think that's a very plausible idea, and as Tom mentioned, it'd be wise to pair it with something else (in the neural network's case he used the Sharpe ratio).

Are you thinking something like

if hurst(context, data, context.stock) > 0.5 and price > moving_average:  
    order(context.stock, order_amount)  

where, if the Hurst exponent is greater than 0.5 (which would indicate a trending time series) and the price is greater than the X-day moving average (e.g. the 30-day moving average), we go long on the stock.
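As a rough standalone sketch of that idea outside the Quantopian API (so the function names here are hypothetical, and the 0.5 threshold and 30-day window are just the values from the discussion, not tuned parameters):

```python
import numpy as np

def hurst(ts, max_lag=20):
    # Std-of-lagged-differences Hurst estimator, as discussed in the thread
    ts = np.asarray(ts, dtype=float)
    lags = range(2, max_lag)
    tau = [np.sqrt(np.std(ts[lag:] - ts[:-lag])) for lag in lags]
    return 2.0 * np.polyfit(np.log(list(lags)), np.log(tau), 1)[0]

def trend_signal(prices, ma_window=30, h_threshold=0.5):
    """True when the series looks trending (H > threshold) AND the last
    price sits above its moving average -- the combined filter sketched
    in the discussion, not a tested trading strategy."""
    prices = np.asarray(prices, dtype=float)
    ma = prices[-ma_window:].mean()
    return bool(hurst(prices) > h_threshold and prices[-1] > ma)
```

On a synthetic series with positively autocorrelated, upward-drifting increments the filter fires; on white noise the Hurst condition alone rejects it.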

My feeling is that most findings of values of H in stock returns that differ from H = 0.5 can be traced back to methodology problems. Here is my article on the subject; I would be very grateful for your comments: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2564916

Thanks, I'll take a look tomorrow!

What ever happened to Dr. Tom Starke's blog? Leinenbock is gone, and drtomstarke.com has a password...

This is an interesting concept...

Hi, Peters mentioned in the book that using log returns instead of the plain price change is more appropriate. Could you modify the code?

Derek,
Nice! Thanks for sticking with this!
Given that you can now compute this for large numbers of assets over periods of time, do you have any guidance on what the Hurst exponent characteristics of an asset look like over a period of time?
Also, with this increased computing power, are there any new ways to use this exponent with co-integrated pairs/baskets of stocks?
alan

Derek,
Thanks so much for the notebook and algo!
Lots to chew over here, so let me get back to you with some questions and comments.
alan

Apologies if this is the wrong forum to ask a question such as this.
I was wondering if someone could help me understand where the sqrt and std come from in the following line of code?
tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]

I believe it comes from the following principles, but I just can't reconcile how the formulae (copied below) lead to the code given, with its std and sqrt. I am also struggling with the 〈| … |〉 symbols: am I right in thinking that | | means the absolute value (or the norm of a vector) and that, as stated, 〈…〉 means an average over all the data points?

I have seen this question pop up in 2 different forums, so I'm not the only person that is lost, but I've never seen it answered.

Thanks in advance and apologies if this question is not clear, I'm happy to provide further clarification, just let me know.

Formulae
Var(τ) = 〈|z(t + τ) − z(t)|^2〉

where

z = log(price)  
τ is an arbitrary time lag  
〈…〉 is an average over all t’s

For a geometric random walk:

〈|z(t + τ) − z(t)|^2〉 ∼ τ

But if the series is either mean reverting or trending, this relationship will not hold, and instead we get

〈|z(t + τ) − z(t)|^2〉 ∼ τ^2H
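The scaling relation can be checked numerically: for a simulated random walk, the slope of log〈|z(t + τ) − z(t)|²〉 against log τ should come out close to 1, i.e. H ≈ 0.5. A small sketch on synthetic data (not code from the thread):

```python
import numpy as np

np.random.seed(42)
# z = log(price) of a geometric random walk is an ordinary random walk
z = np.cumsum(0.01 * np.random.randn(50_000))

lags = np.arange(2, 50)
# Mean squared displacement <|z(t+tau) - z(t)|^2> for each lag
msd = np.array([np.mean((z[lag:] - z[:-lag]) ** 2) for lag in lags])

# The log-log slope is 2H; for a random walk it should be close to 1
slope = np.polyfit(np.log(lags), np.log(msd), 1)[0]
H = slope / 2
```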

I remember wondering how hurst was calculated once, and I remember figuring it out. The trick was that it was a slope of a linear regression of a variety of point estimates in log space, or something like that. The confusion comes from there being two sources of time: there's the lag between time points, and there's the number of points of each lag that you use to estimate the value of that lag. IIRC. Sorry I can't be of more help.

Thanks for the reply. I've actually managed to go back to the source, which I believe is Dr Ernie Chan's blog, and, using the principles he outlines there, put together my own code. It appears that if I use variance instead of standard deviation, get rid of the sqrt (whose purpose I couldn't understand in the first place), and use a 2.0 divisor instead of a 2.0 multiplier (because of the 2H), it gives the same answer, which is one I can understand from first principles.

Dr Chan doesn't give any code on this page (I believe he works in MATLAB not Python anyway). Hence I needed to put together my own code from the notes he gives in his blog and answers he gives to questions posed on his blog.

  1. Dr Chan states that if z is the log price, then volatility, sampled at intervals of τ, is volatility(τ)=√(Var(z(t)-z(t-τ))). To me another way of describing volatility is standard deviation, so std(τ)=√(Var(z(t)-z(t-τ)))

  2. std is just the root of variance so var(τ)=(Var(z(t)-z(t-τ)))

  3. Dr Chan then states: In general, we can write Var(τ) ∝ τ^(2H) where H is the Hurst exponent

  4. Hence (Var(z(t)-z(t-τ))) ∝ τ^(2H)

  5. Taking the log of each side we get log(Var(z(t)-z(t-τ))) = 2H log τ + const

  6. Hence H = [ log(Var(z(t)-z(t-τ))) / log τ ] / 2, where the term in square brackets is the slope of a log-log plot of τ against the corresponding set of variances.

# Let's try it the Ernie Chan way  
# http://epchan.blogspot.fr/2016/04/mean-reversion-momentum-and-volatility.html

import numpy as np

def hurst_ernie_chan(p, lags=range(2, 20)):

    variancetau = []
    tau = []

    for lag in lags:  
        # Record each lag so we end up with a vector of taus  
        tau.append(lag)

        # Difference between the series (log prices) and itself shifted  
        # by `lag`, then the variance of that difference  
        pp = np.subtract(p[lag:], p[:-lag])  
        variancetau.append(np.var(pp))

    # We now have a set of taus (lags) and a corresponding set of variances.  
    # The slope of the log-log plot of variance against tau is 2H  
    m = np.polyfit(np.log10(tau), np.log10(variancetau), 1)

    hurst = m[0] / 2

    return hurst  

Good stuff. I was actually trying to re-create this using R; I posted a few comments on Ernie Chan's blog.
For the different lags, what is your procedure here? Can you break it down for me in a short example? Perhaps, if you do not want to pollute the community here, you could drop me an email? (Sent you a msg!)

Hi, I just saw the post on the Hurst. Sorry, my blog has disappeared because my employer wasn't happy with it. Just a quick note: when you use it, please pay careful attention to the lag values. I ran a wide range of lags over the SPY, for example, and found (unsurprisingly) that for shorter lags the data are strongly mean-reverting, while for longer lags they are trending. So all of this is really a matter of perspective. Please keep that in mind when you use this tool.
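Tom's point can be illustrated with a synthetic series that mean-reverts at short horizons but diffuses at long ones: a slow random walk plus fast white noise. The series and the lag windows below are made up purely for illustration, not fitted to SPY:

```python
import numpy as np

np.random.seed(7)
n = 50_000
# Slow random walk plus fast white noise: the noise dominates short
# lags (mean-reverting look), the walk dominates long lags
z = np.cumsum(0.2 * np.random.randn(n)) + np.random.randn(n)

def hurst_over(series, lags):
    # H from the log-log slope of mean squared displacement vs lag
    msd = [np.mean((series[lag:] - series[:-lag]) ** 2) for lag in lags]
    return np.polyfit(np.log(list(lags)), np.log(msd), 1)[0] / 2

h_short = hurst_over(z, range(2, 10))        # noise-dominated: low H
h_long = hurst_over(z, range(100, 300, 20))  # walk-dominated: higher H
```

The same series produces a much lower H over short lags than over long ones, which is exactly why the chosen lag range matters.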

@Andrew Bannerman, I saw your posts on Ernie Chan's blog, and it was actually his answers to your questions that helped me put my code together. I'm not sure I'm the right person to help you, as I have no idea about R, and I'm far from an expert on the Hurst exponent: until three days ago I didn't even know what it was. I've pieced together what I know from various blogs I've found online. But the last three days of fighting with this Hurst exponent have shown me that misguided and uneducated help (what I can probably offer) is probably better than no help at all, so I'm willing to give it a crack. I'm going to email you an IPython notebook where I've broken down my reasoning step by step, which will probably be a lot clearer than what I've written here. All the reasoning comes from what Dr Chan has written on his blog, plus the clarifications he wrote to your posts. Hence, I'm not going to be able to explain anything that Dr Chan hasn't already explained to you, but perhaps I can explain it in a different way, and maybe that is all you need.

@Tom Starke, thanks for the time to reply and the warning. I'm aware of the implications with the different lag periods and it's one of the things that interests me about the Hurst, as your description would tend to indicate why short term MR systems can be profitable and long term TF systems can also be profitable on the SPY. Thanks for the warning, and all the work you did originally on the Hurst exponent.

Too bad about the blog Tom, it was a good one! Practical and realistic.

In an effort to understand how the Hurst exponent is calculated, can anyone help me with the code? When setting the lag range, let's say: for lag in range(2,20). When we input the SPY price into this, how do the lags 2-20 work? Does it subtract today's SPY close minus the SPY close 2 days ago... and again today's close minus the close 20 days ago? Does it step back one day through the entire data set, meaning that for every day there is a new 2-20 lag subtraction? Is this thinking correct?

@Simon, I just changed my working arrangements, so there is a good chance that the blog will start again but it's probably more focussed on machine learning for financial applications.

The SPY data is input as an array (vector). We then iterate through each lag, so yes, starting with 2. It creates another array that is the SPY minus the SPY of 2 days ago, takes the standard deviation of that array, and then calculates the square root to give you a scalar number. Then it does the same for 3 days ago, then 4 days ago, all the way to 20 days behind. Now you have 19 different scalar numbers in an array called tau. You also have an array from earlier called lags, which is just [2, 3, 4, ....., 19, 20]. You plot the log-log plot of both, find the slope of the line from the polyfit function, and then multiply it by 2.0 to get the Hurst exponent. Note that this is using the code higher above from Derek Tishler; the code I provided uses var instead of std, doesn't use sqrt, and divides by 2.0 instead of multiplying by 2.0, but the theory and the answer are the same.
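Concretely, the slicing for a single lag looks like this on toy numbers (the array values here are made up):

```python
import numpy as np

prices = np.array([10.0, 11.0, 13.0, 12.0, 15.0, 16.0])

lag = 2
# Pair each price with the price `lag` steps earlier
later = prices[lag:]     # [13., 12., 15., 16.]
earlier = prices[:-lag]  # [10., 11., 13., 12.]
diff = later - earlier   # [ 3.,  1.,  2.,  4.]
```

Repeating this for every lag in the range yields the per-lag differences whose dispersions feed the log-log regression.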

How would this look for the SP500 universe?

if hurst(context, data, context.stock) > 0.5 and price > moving_average:  
    order(context.stock, order_amount)  

Thank you!