Hurst Exponent

Originally taken from this thread, the Hurst exponent tells you whether a series is:

1. A geometric random walk (H = 0.5)
2. Mean-reverting (H < 0.5)
3. Trending (H > 0.5)

If H decreases towards zero, the price series may be more mean-reverting, and if it increases towards one, the price series may be more trending. As for testing whether H really equals 0.5, there's something called the variance-ratio test (more to come later).
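The variance-ratio test mentioned above can be sketched in a few lines. This is a simplified homoscedastic version of the Lo-MacKinlay ratio, not code from the original post; the function name and the synthetic random walk are my own illustration:

```python
import numpy as np

def variance_ratio(prices, q=2):
    """Lo-MacKinlay style variance ratio: VR(q) ~ 1 under a random walk,
    below 1 for mean reversion, above 1 for trending."""
    log_p = np.log(np.asarray(prices, dtype=float))
    rets = np.diff(log_p)                # 1-period log returns
    var_1 = np.var(rets, ddof=1)         # variance of 1-period returns
    rets_q = log_p[q:] - log_p[:-q]      # q-period log returns
    var_q = np.var(rets_q, ddof=1)
    # under a random walk, the variance of q-period returns grows linearly with q
    return var_q / (q * var_1)

# sanity check: a geometric random walk should give VR close to 1
rng = np.random.default_rng(0)
walk = np.exp(np.cumsum(rng.normal(0.0, 0.01, 5000)))
print(variance_ratio(walk, q=2))
```

A full test would also compute the asymptotic standard error of VR(q) and form a z-statistic; the bare ratio is still a useful quick check of the H = 0.5 hypothesis.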

Much of the credit goes to Tom Starke, as the mathematics behind the code is beyond my reach.

Enjoy!

-Seong

'''
Hurst Exponent implemented from Ernie Chan's 'Algorithmic Trading: Winning Strategies
and Their Rationale'. One of several tools to test whether a strategy is mean-reverting
'''
import numpy

def initialize(context):
    context.past_prices = []
    context.spy = sid(8554)

def handle_data(context, data):
    log.info(hurst(context, data, context.spy))

@batch_transform(window_length=40)
def gather_data(data):
    # Gathers data for an arbitrary number of days and refreshes
    # Also serves to make sure that data exists
    return data

'''
Trims a list of past prices to the number of periods you want
So if you want the prices of the last forty days, set period = 40
'''
def gather_prices(context, data, sid, period):
    context.past_prices.append(data[sid].price)
    if len(context.past_prices) > period:
        context.past_prices.pop(0)
    return

'''
The Hurst exponent helps test whether the time series is:
(1) A random walk (H ~ 0.5)
(2) Trending (H > 0.5)
(3) Mean-reverting (H < 0.5)
'''
def hurst(context, data, sid):
    # Gather all the prices that you need
    gather_prices(context, data, sid, 40)
    # Check whether data exists
    data_gathered = gather_data(data)
    if data_gathered is None:
        return

    tau, lagvec = [], []
    # Step through the different lags
    for lag in range(2, 20):
        # Produce the price differences at this lag
        pp = numpy.subtract(context.past_prices[lag:], context.past_prices[:-lag])
        # Write the different lags into a vector
        lagvec.append(lag)
        # Append the square root of the standard deviation of the differences
        tau.append(numpy.sqrt(numpy.std(pp)))
    # Linear fit to a double-log graph to get the power
    m = numpy.polyfit(numpy.log10(lagvec), numpy.log10(tau), 1)
    # Calculate Hurst
    hurst = m[0] * 2

    return hurst


52 responses

Hi Seong,

I am not a specialist in statistical models, but I found that, while the Hurst exponent has some relevance in determining the nature of self-correlation in a time series (i.e. trend: future returns are positively correlated with past returns), by itself it is not very reliable. I built this little neural network tool which uses the Hurst exponent and the Sharpe ratio to identify the different domains more effectively (https://www.leinenbock.com/training-a-simple-neural-network-to-recognise-different-regimes-in-financial-time-series/). I had much better results with it than with the Hurst alone. The tricky bit is the training of the network. In my example I produce randomly generated time series for which I know whether they are mean-reverting, trending or random. Initially I train with very 'obvious' data sets and then move to more subtle ones. It does a good job of picking them up, and I found it especially useful for micro-trends.
My choice of Hurst and Sharpe for this job is because Hurst seems better with mean reversion than trends, while Sharpe obviously does trends well. Since putting this on my site I've much improved it by using a third indicator, but that is not ready yet.
Play around with it some time. It's good fun actually.

Tom,

Thanks for your reply. I've been playing around with the neural network and am wondering why, after a certain point, the output begins to remain at [-1]:

8379 0 [-1.] TRUE 1.6839304718 90 %
8380 1 [-1.] FALSE 1.68333333333 90 %
8381 1 [-1.] FALSE 1.68273661822 90 %
8382 1 [-1.] FALSE 1.68214032601 90 %
8383 1 [-1.] FALSE 1.68154445625 90 %
8384 0 [-1.] TRUE 1.68189868934 90 %


It seems to happen with both the Quantopian version and the one that I've run in Python on my desktop:

2002-12-24handle_data:179DEBUG[-1.]
2002-12-24handle_data:179DEBUG[-1.]
2002-12-24handle_data:179DEBUG[-1.]
2002-12-24handle_data:179DEBUG[-1.]
2002-12-24handle_data:179DEBUG[-1.]


Very interesting topic, though; I'm having a lot of fun with it.

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 25 12:43:16 2013

@author: seongboii
"""

import numpy
import random

'''
This part is exclusively for the data simulation/learning part
'''

class FeedForwardNetwork:

    def __init__(self, nIn, nHidden, nOut, hw=[], ow=[]):
        # learning rate
        self.alpha = 0.1

        # number of neurons in each layer
        self.nIn = nIn
        self.nHidden = nHidden
        self.nOut = nOut

        if hw != [] and ow != []:
            # initialize weights with previous data
            self.hWeights = hw
            self.oWeights = ow
        else:
            # initialize weights randomly (+1 for bias)
            self.hWeights = numpy.random.random((self.nHidden, self.nIn+1))
            self.oWeights = numpy.random.random((self.nOut, self.nHidden+1))

        # activations of neurons (sum of inputs)
        self.hActivation = numpy.zeros((self.nHidden, 1), dtype=float)
        self.oActivation = numpy.zeros((self.nOut, 1), dtype=float)

        # outputs of neurons (after activation function)
        self.iOutput = numpy.zeros((self.nIn+1, nOut), dtype=float)      # +1 for bias
        self.hOutput = numpy.zeros((self.nHidden+1, nOut), dtype=float)  # +1 for bias
        self.oOutput = numpy.zeros((self.nOut), dtype=float)

        # deltas for hidden and output layer
        self.hDelta = numpy.zeros((self.nHidden), dtype=float)
        self.oDelta = numpy.zeros((self.nOut), dtype=float)

    def forward(self, input_node):
        # set input as output of first layer (bias neuron = 1.0)
        self.iOutput[:-1, 0] = input_node
        self.iOutput[-1:, 0] = 1.0

        # hidden layer
        self.hActivation = numpy.dot(self.hWeights, self.iOutput)
        self.hOutput[:-1, :] = numpy.tanh(self.hActivation)

        # set bias neuron in hidden layer to 1.0
        self.hOutput[-1:, :] = 1.0

        # output layer
        self.oActivation = numpy.dot(self.oWeights, self.hOutput)
        self.oOutput = numpy.tanh(self.oActivation)

    def backward(self, teach):
        error = self.oOutput - numpy.array(teach, dtype=float)

        # deltas of output neurons
        self.oDelta = (1 - numpy.tanh(self.oActivation)) * numpy.tanh(self.oActivation) * error

        # deltas of hidden neurons
        self.hDelta = (1 - numpy.tanh(self.hActivation)) * numpy.tanh(self.hActivation) * \
            numpy.dot(numpy.transpose(self.oWeights[:, :-1]), self.oDelta)

        # apply weight changes
        self.hWeights = self.hWeights - self.alpha * numpy.dot(self.hDelta, numpy.transpose(self.iOutput))
        self.oWeights = self.oWeights - self.alpha * numpy.dot(self.oDelta, numpy.transpose(self.hOutput))

    def getOutput(self):
        return self.oOutput

def hurst(p):
    tau = []; lagvec = []
    # Step through the different lags
    for lag in range(2, 20):
        # produce the price differences at this lag
        pp = numpy.subtract(p[lag:], p[:-lag])
        # write the different lags into a vector
        lagvec.append(lag)
        # append the square root of the std of the difference vector
        tau.append(numpy.sqrt(numpy.std(pp)))
    # linear fit to double-log graph (gives power)
    m = numpy.polyfit(numpy.log10(lagvec), numpy.log10(tau), 1)
    # calculate hurst
    hurst = m[0] * 2
    return hurst

def sharpe(series):
    ret = numpy.divide(numpy.diff(series), series[:-1])
    return numpy.mean(ret) / numpy.std(ret)

def simulate_coint(d, n, mu, sigma, start_point_X, start_point_Y):
    # This becomes a random walk if d = 0
    X = numpy.zeros(n)
    Y = numpy.zeros(n)
    # These are the starting points of the random walk in y
    # Be aware that X and Y are NOT coordinates but different series
    X[0] = start_point_X
    Y[0] = start_point_Y
    for t in range(1, n):
        # "Drunk and his dog" cointegration equations
        X[t] = X[t-1] + random.gauss(mu, sigma)
        Y[t] = d * (X[t-1] - Y[t-1]) + Y[t-1] + random.gauss(mu, sigma)
    return X, Y, X - Y

def simulate_momentum_data(n, offset, sigma):
    # This becomes a random walk if offset is 0
    # produce the trending time series
    return numpy.cumsum([random.gauss(offset, sigma) for i in range(n)])

def teach():
    k = random.randint(0, 2)
    if k == 0:
        dummy, dummy, F = simulate_coint(0.3, 1000, 0, 0.5, 0.0, 0.0)
    elif k == 1:
        F = simulate_momentum_data(1000, 0.1, 0.9)
    elif k == 2:
        F = simulate_momentum_data(1000, 0, 0.9)
    return k, sharpe(F[1:]), hurst(F[1:])

'''
Data simulation ends here
'''

def initialize(context):
    context.spy = sid(8554)

    context.hw = []
    context.ow = []
    context.ffn = FeedForwardNetwork(2, 8, 1, context.hw, context.ow)
    context.days_traded = 0  # counter for the learning phase (reconstructed; not shown in the original post)

def handle_data(context, data):
    # Over time, the score is decreasing, which should be the opposite
    # What I'm thinking is that handle_data acts like the while loop

    '''
    Network Learning Phase for context.days_traded < 100
    '''
    if context.days_traded < 100:  # guard reconstructed from the docstring above
        context.ffn = FeedForwardNetwork(2, 8, 1, context.hw, context.ow)
        true_count = 1
        untrue_count = 1
        uncertain_count = 1
        count = 0
        while count < 100:
            regime_desc, sharpe, hurst = teach()
            context.ffn.forward([sharpe, hurst])
            context.ffn.backward(regime_desc)

            f_output = context.ffn.getOutput()[0]

            if f_output >= 0.8 and regime_desc == 1:
                true_count += 1
            elif f_output <= 0.2 and regime_desc == 0:
                true_count += 1
            elif (f_output >= 0.8 and regime_desc == 0) or (f_output < 0.2 and regime_desc == 1):
                untrue_count += 1
            else:
                uncertain_count += 1
            total = float(uncertain_count) + float(true_count) + float(untrue_count)
            context.hw = context.ffn.hWeights
            context.ow = context.ffn.oWeights
            count += 1
            log.debug(f_output)

        context.days_traded += 1

        # The max I've seen is around 76%, with results hovering around 60%
        # Yet in the non-Quantopian version, the numbers are able to go up and up
        # I think it's because certain_answer is computed every time, while before it was
        # calculated as a running sum of all digits
        # The statistical noise of 100% return at short term is 50%...
    else:
        log.debug("This is when the training wheels come off!")


I did see that too and I'm not quite sure why that is. Will have to look into that. It is probably something quite trivial related to how those simple NNs are implemented. Proper understanding of NNs is an art in itself. That's the reason why I'm cautious in using them and (so far) only apply them to relatively simple things. What I did in some of my programs was to introduce a termination condition; for example, if 90% of the last N events are recognised correctly: stop learning and refine the training conditions.
Even though those NNs are relatively simple in their essential functionality, the routines that you have to build around them to make them work for particular problems can be quite complex. My philosophy is to keep the basic algos as simple as possible, so I'm not sure it's always worth the effort. It's so easy to introduce unmanageable complexity when you write new algorithms for optimisation. Even though there are a number of very fancy toolboxes in Python, I like to use basic code, which I can see, change and understand, rather than relying on black-box magic. As you can see, even the simple code above shows some properties which aren't immediately obvious if you are not in the field.
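Tom's termination-condition idea (stop once, say, 90% of the last N training events are classified correctly) can be sketched with a small rolling window. The class and all names here are my own illustration, not code from his programs:

```python
from collections import deque

class EarlyStop:
    """Stop training once at least `threshold` of the last `n` predictions were correct."""
    def __init__(self, n=100, threshold=0.9):
        self.window = deque(maxlen=n)
        self.n = n
        self.threshold = threshold

    def update(self, correct):
        # record whether the latest training event was classified correctly
        self.window.append(bool(correct))

    def should_stop(self):
        # only meaningful once the window is full
        return (len(self.window) == self.n and
                sum(self.window) / self.n >= self.threshold)

stop = EarlyStop(n=10, threshold=0.9)
for _ in range(9):
    stop.update(True)
stop.update(False)
print(stop.should_stop())  # 9/10 correct meets the 0.9 threshold -> True
```

In the training loop above, you would call `update(...)` once per `teach()` event and break out of the `while` loop when `should_stop()` returns True.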

I went ahead and tried using the neural network in an actual algorithm albeit very simply done. You can find it here : https://quantopian.com/posts/neural-network-that-tests-for-mean-reversion-or-momentum-trending

I am new to the Hurst exponent and algorithmic trading in general. While the mathematics is above me, I am intrigued by the idea behind the Hurst exponent. Would there be a way to use the Hurst exponent to identify a trending market, and then combine it with a simple EMA or SMA crossover trading system? As noted here, it seems that the Hurst exponent in and of itself is not enough for a trading system. Thanks.

Josh,

I think that's a very plausible idea, and as Tom mentioned, it'd be wise to pair it with something else (in the neural network's case he used the Sharpe ratio).

Are you thinking something like

if hurst(context, data, context.stock) > 0.5 and price > moving_average:
    order(context.stock, order_amount)

where, if the Hurst exponent is greater than 0.5 (which would indicate a trending time series) and the price is greater than the X-day moving average (e.g. the 30-day moving average), you go long on the stock?

My feeling is that most findings of H values in stock returns that differ from H = 0.5 can be traced back to methodology problems. Here is my article on this subject; I would be very grateful for your comments. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2564916

Thanks, I'll take a look tomorrow!

What ever happened to Dr. Tom Starke's blog? Leinenbock is gone, and drtomstarke.com has a password...

This is an interesting concept...

Hi, Peters mentions in the book that using log returns instead of the plain price change is more appropriate. Could you modify the code?
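One hedged way to apply Peters' suggestion is to feed log prices into the same lag/std/polyfit procedure used above; `hurst_log` is my name for this variant, not code from the thread:

```python
import numpy as np

def hurst_log(p):
    """Same procedure as the hurst() code above, but on log prices."""
    logp = np.log(np.asarray(p, dtype=float))
    lags = list(range(2, 20))
    # sqrt of the std of the lagged log-price differences, one scalar per lag
    tau = [np.sqrt(np.std(logp[lag:] - logp[:-lag])) for lag in lags]
    m = np.polyfit(np.log10(lags), np.log10(tau), 1)
    return m[0] * 2

# sanity check: a geometric random walk should come out near H = 0.5
rng = np.random.default_rng(1)
gbm = np.exp(np.cumsum(rng.normal(0.0, 0.01, 2000)))
print(hurst_log(gbm))
```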

Derek,
Nice! Thanks for sticking with this!
Given that you can now compute this for large numbers of assets over periods of time,
do you have any guidance on what the hurst exponent characteristics for an asset looks like over a period of time?
Also, with this increased computing power, any new ways to use this exponent with co-integrated pairs/baskets of stocks?
alan

Derek,
Thanks so much for the notebook and algo!
Lots to chew over here, so let me get back to you with some questions and comments.
alan

Apologies if this is the wrong forum to ask a question such as this.
I was wondering if someone could help me understand where the sqrt and std come from in the following line of code?
tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]

I believe it comes from the following principles, but I just can't reconcile how the formulae (copied below) lead to being able to use the code given, with the std and sqrt. I am also struggling to reconcile what the <| ..... |> symbols mean as well, am I right in thinking that the | | means absolute value or norm of a vector and as stated < ... > means as average of all the data points?

I have seen this question pop up in 2 different forums, so I'm not the only person that is lost, but I've never seen it answered.

Thanks in advance and apologies if this question is not clear, I'm happy to provide further clarification, just let me know.

Formulae
Var(τ) = 〈|z(t + τ) − z(t)|^2〉

where

z = log(price)
τ is an arbitrary time lag
〈…〉 is an average over all t’s


For a geometric random walk:

〈|z(t + τ) − z(t)|^2〉 ∼ τ

But if the series is either mean reverting or trending, this relationship will not hold, and instead we get

〈|z(t + τ) − z(t)|^2〉 ∼ τ^2H

I remember wondering how hurst was calculated once, and I remember figuring it out. The trick was that it was a slope of a linear regression of a variety of point estimates in log space, or something like that. The confusion comes from there being two sources of time: there's the lag between time points, and there's the number of points of each lag that you use to estimate the value of that lag. IIRC. Sorry I can't be of more help.

Thanks for the reply. I've actually managed to go back to the source which I believe is Dr Ernie Chan's blog and using the principles he outlines there, put together my own code. It appears that if I use variance instead of standard deviation, get rid of the sqrt (which I couldn't understand the purpose of in the first place), and use a 2.0 divisor instead of a 2.0 multiplier (because of the 2H), it actually gives the same answer, which is one that I can understand from first principles.

Dr Chan doesn't give any code on this page (I believe he works in MATLAB not Python anyway). Hence I needed to put together my own code from the notes he gives in his blog and answers he gives to questions posed on his blog.

1. Dr Chan states that if z is the log price, then volatility, sampled at intervals of τ, is volatility(τ)=√(Var(z(t)-z(t-τ))). To me another way of describing volatility is standard deviation, so std(τ)=√(Var(z(t)-z(t-τ)))

2. std is just the root of variance so var(τ)=(Var(z(t)-z(t-τ)))

3. Dr Chan then states: In general, we can write Var(τ) ∝ τ^(2H) where H is the Hurst exponent

4. Hence (Var(z(t)-z(t-τ))) ∝ τ^(2H)

5. Taking the log of each side we get log (Var(z(t)-z(t-τ))) ∝ 2H log τ

6. [ log (Var(z(t)-z(t-τ))) / log τ ] / 2 ∝ H (gives the Hurst exponent) where we know the term in square brackets on far left is the slope of a log-log plot of tau and a corresponding set of variances.

# Let's try it the Ernie Chan way
# http://epchan.blogspot.fr/2016/04/mean-reversion-momentum-and-volatility.html
import numpy as np

def hurst_ernie_chan(p, lags=range(2, 20)):
    # the default lag range follows the earlier code in this thread

    variancetau = []; tau = []

    for lag in lags:
        # Write the different lags into a vector to compute a set of tau or lags
        tau.append(lag)

        # Compute the price difference at this lag, then take the variance
        # of those differences; call this pp, the price difference
        pp = np.subtract(p[lag:], p[:-lag])
        variancetau.append(np.var(pp))

    # we now have a set of tau or lags and a corresponding set of variances

    # plot the log of those variances against the log of tau and get the slope
    m = np.polyfit(np.log10(tau), np.log10(variancetau), 1)

    hurst = m[0] / 2

    return hurst


Good stuff. I was actually trying to re-create this using R; I posted a few comments on Ernie Chan's blog.
For the different lags, what is your procedure here? Can you break it down for me in a short example? Or, if you don't want to pollute the community here, maybe you can drop me an email? (Sent you a msg!)

Hi, I just saw the post on the Hurst exponent. Sorry, my blog has disappeared because my employer wasn't happy with it. Just a quick note: when you use it, please pay careful attention to the lag values. I ran a wide range of lags over the SPY, for example, and found (unsurprisingly) that for shorter lags the data are strongly mean-reverting, while for longer lags they are trending. So all of this is really a matter of perspective. Please keep that in mind when you use this tool.

@Andrew Bannerman, I saw your posts on Ernie Chan's blog and it was actually his answers to your questions that helped me put my code together. I'm not sure I'm the right person to help you as I have no idea about R, and I'm far from an expert on the Hurst exponent as until 3 days ago I didn't even know what it was. I've pieced together what I know from various blogs I've found online. But the last 3 days of fighting with this Hurst exponent have shown me that misguided and uneducated help (what I can probably offer) is probably better than no help at all, so I'm willing to give it a crack. I'm going to email you an iPython notebook, where I've broken down my reasoning step by step, which will probably be a lot clearer than what I've written here. All the reasoning comes from what Dr Chan has written on his blog, plus the clarifications he wrote to your posts. Hence, I'm not going to be able to explain anything that Dr Chan hasn't already explained to you, but perhaps I can explain it in a different way. And maybe that is all you need.

@Tom Starke, thanks for the time to reply and the warning. I'm aware of the implications with the different lag periods and it's one of the things that interests me about the Hurst, as your description would tend to indicate why short term MR systems can be profitable and long term TF systems can also be profitable on the SPY. Thanks for the warning, and all the work you did originally on the Hurst exponent.

Too bad about the blog Tom, it was a good one! Practical and realistic.

In an effort to understand how the Hurst exponent is calculated: can anyone help me work through the code? When setting the lag range, let's say for lag in range(2,20), and we input the SPY price series, how do the lags 2 to 20 work? Does it do a subtraction of today's SPY close minus the SPY close 2 days ago, and again today's SPY close minus the SPY close 20 days previous? Does it step back one day at a time through the entire data set, so that for every day there is a new 2-to-20-lag subtraction? Is this thinking correct?

@Simon, I just changed my working arrangements, so there is a good chance that the blog will start again but it's probably more focussed on machine learning for financial applications.

The SPY data is input as an array (vector). We then iterate through each lag, starting with 2. It creates another array that is the SPY minus the SPY 2 days ago. It takes the standard deviation of that array, then calculates the square root, to give you a scalar number. Then it does the same for 3 days ago, then 4 days ago, all the way to 19 days behind (range(2,20) stops at 19). Now you have 18 different scalar numbers in an array called tau. You also have an array from earlier, called lagvec, which is just [2, 3, 4, ..., 18, 19]. You take the log-log plot of both, find the slope of the line from the polyfit function, and then multiply it by 2.0 to get the Hurst exponent. Note that this describes the code higher above from Derek Tishler; the code I provided uses var instead of std, doesn't use sqrt, and divides by 2.0 instead of multiplying by 2.0, but the theory and the answer are the same.
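The walk-through above maps almost line for line onto code. Here is a condensed sketch of the std-and-sqrt variant just described (the function name and the synthetic series are mine), with two sanity checks:

```python
import numpy as np

def hurst_std(p, max_lag=20):
    """Hurst estimate: sqrt of the std of lagged differences,
    slope of the log-log fit times 2 (the variant described above)."""
    p = np.asarray(p, dtype=float)
    lags = list(range(2, max_lag))
    # one scalar per lag: sqrt of the std of the lag-day differences
    tau = [np.sqrt(np.std(p[lag:] - p[:-lag])) for lag in lags]
    return 2.0 * np.polyfit(np.log10(lags), np.log10(tau), 1)[0]

rng = np.random.default_rng(3)
mean_rev = 10 + rng.normal(0, 0.5, 2000)    # noise around a level: H near 0
walk = np.cumsum(rng.normal(0, 0.5, 2000))  # random walk: H near 0.5
print(hurst_std(mean_rev))
print(hurst_std(walk))
```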

How would this look like for the SP500 Universe?

if hurst(context, data, context.stock) > 0.5 and price > moving_average:
    order(context.stock, order_amount)

Thank you!

Hurst Exponent & Practical Trading Considerations: Estimation of the classical Hurst exponent (H) requires a long look-back period, in fact considerably longer than that of most of the "technical analysis indicators" commonly used by traders. The reason for the long look-back requirement for H is that Hurst's calculation methodology involves finding the best fit line on a set of data points from windows of length 256, 128, 64, 32, 16, 8. This then gives six data points for fitting to obtain the resultant value of H. However, in the case of financial markets, the 256-bar data-point (which is from a year ago using EOD data), and even the 128-bar (6 month ago EOD) data points may or may not be at all representative of current market conditions, so probably only 4 out of the 6 data points are likely to be genuinely representative of what the market is actually doing recently. Trying to include the next data point down (based on 4 bars) in calculating H generally just increases the noise in the calculated output value. So estimation of H in a trading environment suffers from two main problems: 1) It is noisy, as it involves calculating the slope of a line based on only a small number of points, some of which are of questionable reliability anyway, and 2) It requires a long data window to obtain those points. As anyone with practical trading experience knows, good "indicators" have short windows and small lags. The Hurst exponent, at least if calculated based on Hurst's original methodology, just doesn't fit that bill very well at all. So H, although conceptually an interesting tool for investigating long-memory processes and evaluating market regime (random / mean reverting / trending) was never designed for financial markets and doesn't really work very well in practice. HOWEVER, if we can find some proxy for H which does not require the long window and incur the inherent lag that Hurst's calculation does, then we have something really useful! 
Based on my work in the past, I would suggest to anyone looking to use the Hurst exponent that instead they try to find something that approximates H in the specific context of financial market data behavior, but requires a shorter data window and has a smaller lag, and then use that as a proxy in preference to H itself. Best wishes.

It is possible to estimate local Hurst exponents H(t), but it is an indicator about past returns, i.e. whether they were in trend or not (in persistent or anti-persistent behavior).
Check:
Ihlen, E.A.F. (2012). Introduction to Multifractal Detrended Fluctuation Analysis in Matlab. Frontiers in Physiology, 3(141). doi:10.3389/fphys.2012.00141

Thank you for your comment. In fact there are quite a few papers dealing with multi-fractal analysis, not only for financial markets, for example see:
https://www.aut.ac.nz/__data/assets/pdf_file/0003/322950/Economics-WP-2012-08.pdf, or
http://people.physics.anu.edu.au/~tas110/Teaching/Lectures/L14/Material/PhysA_Scaling03.pdf

but also for other areas of knowledge where time series are involved, for example:
https://arxiv.org/abs/1208.6174 and https://arxiv.org/pdf/1208.6174.pdf for language processing,
in addition to the example you cite from the field of physiology.

Your implied suggestion to look for ideas outside the field of finance is a good one, and in fact inter-disciplinary searches (particularly in the areas of quantitative medical research & biophysics) often turn up some great ideas with potential applicability to trading!

Your comment that H is an indicator of PAST returns is of course true, but in fact most conventional technical indicators are inherently lagging in nature. The trick is usually to find ones that minimize the lag. Various people such as John F. Ehlers (see Amazon for his books) have endeavored to do this by applying DSP methods borrowed from Electrical Engineering theory and applying them to financial market data. Understanding the inherent differences between typical engineering type "signals" and financial market data explains why this is only partially successful, but nevertheless the idea of lag minimization is definitely a useful one.

When cloning this algorithm it pulls up an error:
Line 15 Error Runtime exception: NameError: name 'batch_transform' is not defined. batch_transform is no longer supported. Please use history instead.
What would be the correct way to fix this issue?

@Tommy,
Needed to use the new data.history methods and get rid of a function that is not needed anymore.
(See https://www.quantopian.com/tutorials/getting-started#lesson5 )
I've included a running backtest, with the Hurst coeff also recorded on the Overview plot screen.
alan

'''
Hurst Exponent implemented from Ernie Chan's 'Algorithmic Trading: Winning Strategies
and Their Rationale'. One of several tools to test whether a strategy is mean-reverting
'''
import numpy

def initialize(context):
    context.past_prices = []
    context.spy = sid(8554)

def handle_data(context, data):
    hurst_val = hurst(context, data, context.spy)
    log.info(hurst_val)
    record(Hurst=hurst_val)

'''
Fetches the past prices for the number of periods you want
So if you want the prices of the last forty days, set period = 40
'''
def gather_prices(context, data, sid, period):
    # data.history already returns exactly `period` bars, so no manual trimming is needed
    context.past_prices = data.history(sid, fields="price", bar_count=period, frequency="1d")
    return

'''
The Hurst exponent helps test whether the time series is:
(1) A random walk (H ~ 0.5)
(2) Trending (H > 0.5)
(3) Mean-reverting (H < 0.5)
'''
def hurst(context, data, sid):
    # Gather all the prices that you need
    gather_prices(context, data, sid, 40)
    # Work on a plain array so the pandas index doesn't interfere with the lagged subtraction
    p = numpy.asarray(context.past_prices)

    tau, lagvec = [], []
    # Step through the different lags
    for lag in range(2, 20):
        # Produce the price differences at this lag
        pp = numpy.subtract(p[lag:], p[:-lag])
        # Write the different lags into a vector
        lagvec.append(lag)
        # Append the square root of the standard deviation of the differences
        tau.append(numpy.sqrt(numpy.std(pp)))
    # Linear fit to a double-log graph to get the power
    m = numpy.polyfit(numpy.log10(lagvec), numpy.log10(tau), 1)
    # Calculate Hurst
    hurst = m[0] * 2

    return hurst

As the Hurst exponent is somehow an appealing concept, I made some tests on the number of lags and the length of the look-back period, based on the code of Tom Starke:

• Geometric Brownian motion seems to be a good benchmark, as its H should always be around 0.5
• With lags above 22, and better above 25, the H value seems to stabilize towards a certain value
• If the look-back period is too small, the H value tends to decrease well below 0.5, leading to wrong conclusions
• This is also in line with visual comparison of synthetic mean-reverting and trending datasets with real stocks

So the values I am now working with are the following. I hope it is not too far away from the theory.

gather_prices(context, data, sid, 128)  # in line with the comments of Tony Morland
for lag in range(2, 30):

On daily data, this is a time window of 6 months, so probably more helpful as an input for a market regime indicator. Good for Mr. Hurst that he could draw on some 847 years of Nile overflow records when he came up with the idea...

Hi @Alan, thanks for your algo code.

I had wondered if you might be working with the Hurst exponent, and i would be most interested to know if you are having practical success with it.

For me personally, the question "What is the current market regime?" is actually THE BIGGEST single question in trading because, if you get it right, then you have solved the problem of all those trades that would be in the wrong direction at the worst possible time. I keep coming back to the Hurst exponent because it SEEMS like it should be capable of providing a good answer, but in practice it keeps disappointing me. The large number of bars required by the classical calculation from Hurst's original work is not practical for most trading purposes. Methods that use a small number of bars tend to produce results that are either unstable, or highly dependent on the actual number of bars used, or simply don't seem to be in accordance with reality in the markets. Sure, almost everyone says that they know that stocks and stock indices are mean-reverting, and the calculated values of H, which are usually in the range 0 to 0.5, strongly support that notion. However, the reality of the last 6+ years is that most of the time the S&P500 has been in a strong uptrend, and even the most skeptical person could hardly make a case that the S&P500 has been in anything but "trend mode" for the past year or so since Nov 2016. Yet despite the clearly evident trending behavior, 0 < H < 0.5 says "no, it's MR, not trending"!

I continue to have a lot of hope for a robust, short calculation period (i.e. short lag) proxy for H as a market regime indicator, but H itself seems to be a disappointing choice when it keeps signaling a condition (MR) that is not supported by the visually evident reality of an amazingly long trend. Any comments most welcome.

Hi @Frank, although it might seem that Mr. H was lucky to have so much more data than we do, actually each year was only 1 data point for him so, in terms of data availability, he was really no better off than a trader with about 3 years of EOD data. To my mind, the biggest difference between him & us is that, unlike the markets, the River Nile is (presumably) not a semi-sentient complex-adaptive entity evolving in clever ways to continually outsmart the people trying to study it! ;-))

@Alan, Frank, Tony,

I believe this line in the above code is incorrect:

    # Calculate the variance of the difference
    tau.append(numpy.sqrt(numpy.std(pp)))


I don't think you have to take the square root of the standard deviation (which is itself the square root of the variance). Prof. Hurst did a rescaled range analysis: ranges over different lags divided by their corresponding standard deviations. Here's my corrected version of the Hurst Exponent using a period of 256 days and lags of 8, 16, 32, 64, 128:

'''
Hurst Exponent implemented from Ernie Chan's 'Algorithmic Trading: Winning Strategies
and Their Rationale'. One of several tools to test whether a strategy is mean-reverting
'''
import numpy as np

def initialize(context):
    context.past_prices = []
    context.spy = sid(8554)

def handle_data(context, data):
    hurst_val = hurst(context, data, context.spy)
    log.info(hurst_val)
    record(Hurst=hurst_val)

'''
Adjusts a list of past prices to the number of periods you want
So if you want the number of prices in the last forty days, set period = 40
'''
def gather_prices(context, data, sid, period):
    context.past_prices = data.history(sid, fields="price", bar_count=period, frequency="1d")
    if len(context.past_prices) > period:
        context.past_prices.pop(0)
    return

'''
Hurst exponent helps test whether the time series is:
(1) A Random Walk (H ~ 0.5)
(2) Trending (H > 0.5)
(3) Mean reverting (H < 0.5)
'''
def hurst(context, data, sid):
    # Gathers all the prices that you need
    gather_prices(context, data, sid, 256)

    tau, lagvec = [], []
    lags = [8, 16, 32, 64, 128]
    # Step through the different lags
    for lag in lags:  # range(2,20):
        # Produce price difference with lag
        pp = np.subtract(context.past_prices[lag:], context.past_prices[:-lag])
        # Write the different lags into a vector
        lagvec.append(lag)
        # Standard deviation of the difference
        tau.append(np.std(pp))  # tau.append(np.sqrt(np.std(pp)))
    # Linear fit to a double-log graph to get power
    m = np.polyfit(np.log10(lagvec), np.log10(tau), 1)
    # Calculate hurst
    hurst = m[0]*2

    return hurst

@James
Thanks for the interest! I updated the code to reflect your changes (powers of 2) and info I found, and also ran the two methods side by side (hurst versus hurst_new (Villa)).
They differ by a factor of two, which I believe is captured in the argument at the end of the post:

https://stackoverflow.com/questions/39488806/hurst-exponent-in-python

So I got rid of the factor of 2 on the return for your method (hurst_new), to reflect the fact that you noticed the actual definition of the Hurst coeff was different from what the code computed.
If this doesn't resolve what is the actual Hurst Estimated value, we'll have to go to Matlab or R implementations...would rather not...
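The factor-of-two question can actually be settled numerically without leaving Python. The standalone check below (plain numpy, outside the Quantopian API; the variable names `slope_sqrt`/`slope_std` are mine) demonstrates the StackOverflow argument: since log10(sqrt(x)) = 0.5·log10(x) and polyfit is linear in its y input, Chan's sqrt-of-std variant with the ×2 and the plain-std variant without it give the same number:

```python
import numpy as np

rng = np.random.RandomState(1)
prices = np.cumsum(rng.randn(5000)) + 1000.0
lags = [8, 16, 32, 64, 128]

# Two tau definitions from the thread's two code versions
tau_sqrt = [np.sqrt(np.std(prices[lag:] - prices[:-lag])) for lag in lags]
tau_std  = [np.std(prices[lag:] - prices[:-lag]) for lag in lags]

slope_sqrt = np.polyfit(np.log10(lags), np.log10(tau_sqrt), 1)[0]
slope_std  = np.polyfit(np.log10(lags), np.log10(tau_std), 1)[0]

# log10(sqrt(x)) == 0.5*log10(x), so the fitted slopes differ by exactly
# a factor of two, and 2*slope_sqrt recovers the std-based slope:
print(2 * slope_sqrt, slope_std)  # equal up to float rounding
```

So the two versions only disagree if the ×2 is applied to the std-based variant as well, which is the bug being discussed here.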

@Frank, @Tony,
I changed the code so that it doesn't eval every minute, but only at the end of the day.
I also put in a simple threshold test for MeanRev and Trending that plots on the Q ide, so you can see the regions where these signals fire off...in this case for AAPL.

I have been using a faster version of Hurst from Derek Tishler as a factor in my code, but with no compelling, or for that matter any, results.
As a factor, you compute Hurst coeff for every stock in your Universe, over a window...obviously, this can get computationally expensive.
If you wish, I could publish that factor here.

In researching the problem @James pointed out, I found the link:

http://www.bearcave.com/misl/misl_tech/wavelets/hurst/

which I'm in the middle of reading...it's fascinating, and points out, perhaps, some problems in using the Hurst coeff...
e.g.,
How many and which lags?
Prices or Returns or ???
Does it work at all for financial data?

'''
Hurst Exponent implemented from Ernie Chan's 'Algorithmic Trading: Winning Strategies
and Their Rationale'. One of several tools to test whether a strategy is mean-reverting
'''
# Analysis of difference between hurst=slope*2 and hurst_new(Villa)=slope
# From: https://stackoverflow.com/questions/39488806/hurst-exponent-in-python
# By:   https://stackoverflow.com/users/4013571/alexander-mcfarlane
#   All that is going on here is a variation on math notation
#   I'll define
#     d = subtract(ts[lag:], ts[:-lag])
#   Then it is clear that
#     np.log(np.std(d)**2) == np.log(np.var(d))
#     np.log(np.std(d)) == .5*np.log(np.var(d))
#   Then you have the equivalence
#     2*np.log(np.sqrt(np.std(d))) == np.log(np.std(d)) == .5*np.log(np.var(d))
#   The functional output of polyfit scales proportionally to its input

# Deeper analysis of Hurst: http://www.bearcave.com/misl/misl_tech/wavelets/hurst/

import numpy as np

LAGS = [8, 16, 32, 64, 128, 256]
WIN = LAGS[-1] * 2  # power of 2 larger than LAGS[-1]
STK = sid(24)  # AAPL

def initialize(context):
    context.past_prices = []
    context.spy = sid(8554)
    context.stk = STK
    log.info('Stk for Hurst={}'.format(context.stk.symbol))

    schedule_function(func=rebalance,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_close(minutes=15))

def handle_data(context, data):
    pass

def rebalance(context, data):
    # Gathers all the prices that you need
    gather_prices(context, data, context.stk, WIN)

    hurst_val = hurst(context, data, context.stk)
    hurst_new = hurst_v2(context, data, context.stk)
    # For finite values, isclose uses the following equation:
    #   absolute(a - b) <= (atol + rtol * absolute(b))
    Trend_sig = np.isclose(hurst_val, 0.5, atol=0.1, rtol=0.05) * (hurst_val - 0.5)
    Mnrev_sig = np.isclose(hurst_val, 0.5, atol=0.1, rtol=0.05) * (0.5 - hurst_val)
    # hurst_fast_val = hurst_fast(context, data, context.spy)

    # log.info(hurst_val)
    record(Hurst_old=hurst_val, Hurst_new=hurst_new, Hurst_diff=hurst_val - hurst_new)
    record(Trend_sig=Trend_sig > 0, Mnrev_sig=Mnrev_sig)
    # record(Hurst=hurst_val, Hurst_fast=hurst_fast_val, Diff=hurst_val-hurst_fast_val)

'''
Adjusts a list of past prices to the number of periods you want
So if you want the number of prices in the last forty days, set period = 40
'''
def gather_prices(context, data, sid, period):
    context.past_prices = data.history(sid, fields="price", bar_count=period, frequency="1d")
    if len(context.past_prices) > period:
        context.past_prices.pop(0)
    return

'''
Hurst exponent helps test whether the time series is:
(1) A Random Walk (H ~ 0.5)
(2) Trending (H > 0.5)
(3) Mean reverting (H < 0.5)
'''
def hurst(context, data, sid):
    # Original (Chan) version: tau uses sqrt(std), so the log-log slope is H/2
    # gather_prices(context, data, sid, WIN)

    tau, lagvec = [], []
    # Step through the different lags
    for lag in LAGS:
        # Produce price difference with lag
        d = np.subtract(context.past_prices[lag:], context.past_prices[:-lag])
        # Write the different lags into a vector
        lagvec.append(lag)
        # sqrt of the standard deviation of the difference
        tau.append(np.sqrt(np.std(d)))
    # Linear fit to a double-log graph to get power
    m = np.polyfit(np.log10(lagvec), np.log10(tau), 1)
    # Calculate hurst
    hurst = m[0] * 2

    return hurst

def hurst_v2(context, data, sid):
    # @Villa hurst
    # gather_prices(context, data, sid, 256)

    tau, lagvec = [], []
    lags = LAGS
    # Step through the different lags
    for lag in lags:  # range(2,20):
        # Produce price difference with lag
        pp = np.subtract(context.past_prices[lag:], context.past_prices[:-lag])
        # Write the different lags into a vector
        lagvec.append(lag)
        # Standard deviation of the difference
        tau.append(np.std(pp))  # tau.append(np.sqrt(np.std(pp)))
    # Linear fit to a double-log graph to get power
    m = np.polyfit(np.log10(lagvec), np.log10(tau), 1)
    # Calculate hurst == slope of line
    hurst = m[0]

    return hurst

@Alan, I think your computation is now correct (almost!), you just need to use log prices so that when you subtract lag differences you get log returns.
Man, that link to bearcave brings back memories of my early years of research into Chaos Theory. That was a good read back then and still is.

There is also another version of Hurst coefficient by John Ehlers, it's sort of an abbreviated one. He uses the coefficient as a parameter to an adaptive moving average known as Fractal Moving Average. I have the code in a different language, I'll try to code it in python and post it later.
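Villa's log-price suggestion can be sketched in standalone numpy (outside the Quantopian API; the function name, the GBM parameters, and the lag set are my illustrative choices). Using log prices means each lag difference is a log return, which also makes the estimate invariant to price scale (e.g. a split):

```python
import numpy as np

def hurst_from_logprices(prices, lags=(8, 16, 32, 64, 128)):
    lp = np.log(prices)  # lag differences of log prices are log returns
    tau = [np.std(lp[lag:] - lp[:-lag]) for lag in lags]
    # Slope of the double-log fit is H directly (std-based variant)
    return np.polyfit(np.log10(lags), np.log10(tau), 1)[0]

rng = np.random.RandomState(2)
# Geometric random walk: log prices follow an ordinary random walk, so H ~ 0.5
gbm = 100.0 * np.exp(np.cumsum(0.01 * rng.randn(8000)))

h = hurst_from_logprices(gbm)
# Scale invariance: multiplying all prices by a constant leaves H unchanged
h_scaled = hurst_from_logprices(10.0 * gbm)
print(h, h_scaled)
```

With raw prices the lag differences depend on the price level, so a stock trading at $1000 and one at $10 are not comparable; the log version removes that artifact.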

Hi @Alan, thanks for your comments and, yes, i would be pleased if you would publish your fast H factor here. I will certainly enjoy taking a look at it.

I assume that your main interest in H is probably as a tool to assist in discriminating between Ranging & Trending market regimes for switching between MR & Trend Following trading styles. I think this is a great topic to work on, but my own experience in using H for it has also been lacking in useful results and i'm not surprised to read that you haven't found anything compelling using H. My conclusion was that the tool (H) is just not very well-suited for the job. I keep looking for short-period, minimal-lag alternatives that might be better suited to the specifics of financial markets. I have had at least some limited partial success with the idea in my own personal trading research using a different language & platform, but have not tried it in Q algos.

Was literally just about to search for this; thank you for sharing!

@Tony,
Enclosed is a Hurst notebook, modified from a D. Tishler one, that computes a fast Hurst Exponent factor using log(prices), as @Villa recommends above. I haven't changed the code above, which is in the Q-IDE, to reflect that log(prices) change.
The notebook then applies that factor over a year's time frame, across a simple culled universe of ~10 stocks.
Finally, we plot the Hurst Exponent traces all together to get an idea of what they look like across multiple stocks.

To me, the short take-away by glancing at the chart is that there is not much action above the 0.5 level.
I've tried to use this factor to compute the Hurst Exponent on a massive level and use it as a factor, yet without compelling results.
I am now convinced that doing so requires some thesis wrt usage (sector-specific) and meta-parameter optimization (e.g. lags, window size, etc.),
along with a fair amount of research to get to a useful place. Also, one runs out of computation time if trying to use the Q-IDE for live trading.
alan


Hi @Alan, many thanks for the Hurst notebook.

As you say, there isn't much H > 0.5 and, at least from a casual visual inspection, what there is doesn't seem to have a unique relationship with the price trends (or lack of them) that can be seen on the corresponding stock charts. Although there seem to be some general areas of agreement, e.g. AAPL with highest H is mostly trending during the period shown, and MSFT with lowest H in early 2017 is flat during that period, overall it is a less than convincing relationship. I continue to assume that your main interest in H is as a trend vs MR indicator and, regarding your comments about getting from this to something useful, here are my thoughts.

• Any indicator of "trending vs non-trending" will invariably give results that may be ambiguous for several reasons. Not only are the responses of many indicators (and especially H) highly dependent on window length (which then gives rise to associated lag effects as you say) but also, in the case of "trend or not trend" which i assume is what you are using H for, it is not only the INDICATOR but also the underlying PRICE SERIES itself that might be trending, or not trending, or even trending in the opposite direction, depending on the period over which the price is being considered, even in the absence of indicator artifacts!

• If, as i suspect, H is generally "too slow" (in terms of impulse response) to be really useful as a trend vs. MR indicator, then one might perhaps argue that during genuine trends H follows too slowly and therefore often does not manage to complete its buildup to values > 0.5, in which case it is not only the value of H that needs to be considered, but also the slope of H, i.e. if H is < 0.5 but is rising steadily towards 0.5, then this might also be sufficient to indicate a trending market.... perhaps?

My thoughts are that, if one's intention is to discriminate between trending & MR market regimes, then one can devise other indicators that appear to be more accurate, more responsive, and more intuitive than H. Although I did "want H to work" as an indicator, it doesn't seem to, and your final comment about "running out of computation time" reinforces my conclusion that efforts can be more productively focused elsewhere.

Cheers, thanks & best regards,
Tony

@Alan -

just pointing out that if you like to perform a more thorough analysis you can always use Alphalens. You can filter the assets which have hurst < 0.5 and check if they are mean reverting (and the same with assets with hurst > 0.5 and check if they are trending)


Hi @Luca, i'm not skilled with using Alphalens and so, although you say that "one can perform a more thorough analysis with it", for my part i'm not sure what to conclude from the graphs you have presented. Perhaps you would care to make some comment on your interpretation, for the sake of dummies like me ;-)

Hi @Alan & @James, i remembered the EasyLanguage code you mentioned from Ehlers' book "Cycle Analytics for Traders" (Wiley, 2013). I don't know what you think of Ehlers but, as my own background in Physics, Electrical Engineering & Signal Processing is somewhat similar to his, i think i'm reasonably competent to comment on his work. Basically as Engineering / Digital Signal Processing it is conceptually sound, although sometimes i think he tries to force good DSP methods on to market situations that are outside their domain of applicability, and he usually tends to be a little more enthusiastic about his indicators than is justified. However it is interesting to note his comments on the Hurst exponent (which he calls the Hurst coefficient). In particular, in Chapter 6 of his book he states the following: "I would like to make it perfectly clear that the Hurst coefficient or the fractal dimension has no direct practical application to trading .... it has no predictive value". Then in Key Points to Remember at the end of the chapter, Ehlers concludes: "It [H] is related to the power spectral density of the noise", and "... has no predictive value and therefore no direct usefulness in trading", but then he appears to contradict himself by stating (as we would expect) that H < 0.5 indicates a cycle mode and H > 0.5 indicates a trend mode. So it appears that Mr. Ehlers himself is a little confused about the utility of H. If his assertion that H has no direct usefulness in trading is correct, then one can only wonder why he spent ten pages describing it and his version of code for it!

@Tony, I totally agree with your assessment.

It's like having the Hurst exponent for a randomly generated price series. You could randomly have an H>0.50 or an H<0.50. As a matter of fact, statistically, you would rarely get exactly: H = 0.50.

We accept that a randomly generated price series ends up with no predictive value, and then want to make a huge distinction for price series that are almost randomly generated. In the sense that they deviate little from randomness.

At the right edge of any price chart, your expected H will still be about: H = 0.50, even if the past data series had something else to show.
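Guy's point about the spread of H on random data can be checked directly: even on pure random walks, where the true value is 0.5, the estimator scatters from sample to sample. A standalone Monte Carlo sketch (plain numpy; the window length and lag range are my illustrative choices):

```python
import numpy as np

def hurst_estimate(series, lags=range(2, 30)):
    # Log-log slope of std of lagged differences vs lag (std-based variant)
    tau = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    return np.polyfit(np.log10(list(lags)), np.log10(tau), 1)[0]

rng = np.random.RandomState(3)
# 200 independent random walks of 1000 bars each, true H = 0.5 for all
estimates = np.array([hurst_estimate(np.cumsum(rng.randn(1000)))
                      for _ in range(200)])

# The estimates are centred near 0.5 but scattered around it, so an
# individual reading above or below 0.5 carries little evidence by itself.
print(estimates.mean(), estimates.std())
```

The width of this sampling distribution is the practical reason a single H reading near 0.5 cannot reliably distinguish "slightly MR" from "slightly trending".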

@Tony, Mr. Ehlers is quite an enigma! As I am not an engineer, I could not fairly assess his work in applying DSP concepts to the financial markets. But having said that, I was fascinated by some of the filtering algorithms and trading systems he has come up with, especially with regard to identifying whether the market is trending or cycling, which is what both you and I think is the crux of formulating a successful trading system. His R-MESA (Maximum Entropy Spectrum Analysis, Burg algorithm) was continuously rated by Futures Truth as one of the top 10 S&P trading systems for over 10 years.

Let me just quote him on something that really intrigues me to do some further digging:

The spectral shape of market data is pink noise, as described by Mandelbrot and the slope of the spectral dilation is measured by the Hurst Coefficient. Therefore, accurate measurements of the dominant cycle must include compensating filters to remove the imbalance of cycle amplitudes across the spectrum. While this can be done, I discovered that an autocorrelation periodogram automatically removes the effects of spectral dilation because the correlation function is normalized to swing between -1 and +1, regardless of the cycle period. Therefore, the autocorrelation periodogram is currently the preferred method to measure market cycles.

@Tony, having similar backgrounds and expertise in Engineering/DSP, what do you think of this view? Seems like he now prefers the autocorrelation periodogram over MESA or the Hilbert Transform, what an enigma!
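For anyone wondering what an autocorrelation periodogram looks like in code: the sketch below is NOT Ehlers' actual algorithm (his EasyLanguage version includes smoothing and normalization steps omitted here), just a minimal standalone illustration of the idea in the quote, with normalized autocorrelations projected onto candidate cycle periods. All names and parameters are my own choices:

```python
import numpy as np

def autocorr_periodogram(x, max_lag=48, periods=range(8, 49)):
    # Autocorrelation at each lag, normalised to [-1, 1] by construction,
    # which is what removes the spectral-dilation amplitude imbalance.
    r = np.array([np.corrcoef(x[lag:], x[:-lag])[0, 1]
                  for lag in range(1, max_lag + 1)])
    lags_idx = np.arange(1, max_lag + 1)
    power = {}
    for p in periods:
        # Project the autocorrelation function onto a cycle of period p
        c = np.sum(r * np.cos(2 * np.pi * lags_idx / p))
        s = np.sum(r * np.sin(2 * np.pi * lags_idx / p))
        power[p] = c * c + s * s
    return power

t = np.arange(2000)
x = np.sin(2 * np.pi * t / 20.0)  # pure 20-bar cycle
power = autocorr_periodogram(x)
dominant = max(power, key=power.get)
print(dominant)  # near 20 bars
```

With only ~2.5 cycles of lags the period resolution is coarse, so on real data the peak wanders; Ehlers' published versions smooth the autocorrelations and the spectrum to stabilise it.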

Hi @James,
Firstly. thanks for your acknowledgement regarding " .... what both you and I think is the crux of formulating a successful trading system" Basically with that piece of info, if known reliably, everything else gets remarkably easy, as long as one has the discipline to adhere to the caveat: "... and if in doubt then just stay out" :-)

I have read all of John Ehlers' books and experimented quite a bit with all of the ideas in them. His first work on MESA was adapted from a technique used in geophysical processing for oil exploration (also an area that I worked in). Most of Ehlers' filtering ideas and things like Hilbert transforms come directly from Electrical Engineering & DSP. All of these techniques are valid, widely used and well documented in Engineering literature & textbooks. Ehlers' innovation was to apply these ideas to trading, based on the notion that bars of trading data are like digitally sampled signals. In that area Ehlers has certainly created a niche and a reputation for himself. I like a lot of his ideas. The ones related to low-pass filtering are particularly good. Butterworth filters are well known within Electrical Engineering, but Ehlers' modified 2nd order Butterworth filter is the best minimum distortion, maximally flat within its passband, compact, minimum lag smoothing filter that I know of (even better than his "Super-Smoother").

I understand Ehlers ideas very well, and in particular his considerations of signal-to-noise. However I do have two mild criticisms of his work. The first is a simple practical one. In the signals that Electrical Engineers deal with there are often many cycles (for example of a carrier wave with some fixed frequency) with an amplitude or frequency that is modulated at a slower rate (for example by an audio signal in AM or FM radio). All the DSP ideas that Ehlers uses work very well in those situations. However in trading the difference is that we do not have a large number (e.g. hundreds or thousands) of underlying cycles to be able to process. In fact the most we usually ever see is about two cycles with a decaying envelope in the case of a classical triangle pattern. In fact often we don't even get one full cycle of price data before traders figure out what is happening, trade in anticipation of the cycle completion, and thereby destroy it. This represents a major limitation and, at least as i see it, the first breakdown in the assumption that securities price data can be adequately treated with conventional Engineering DSP methods.

The second problem (criticism) that i have is a little more subtle. In conventional DSP we usually have a data stream that is a mixture of a signal which is either made by humans (e.g. voice, music, the trajectory of a vehicle or projectile, etc) or by nature (e.g. the geology of a sedimentary basin containing oil, etc) in some reasonably uniform way, and in either case there are some inherent regularities because of the process that generated the signal. Then, contaminating the wanted signal, there is inevitably added some unwanted noise, so what we observe is a mixture of signal + noise. Usually we can easily tell what is signal and what is noise, and the job of filtering or DSP is to separate them. In the context of trading this is NOT so easy; What really is "signal" and what really is "noise"? This is not just a philosophical question. Some people say that in trading there is no such thing as "noise". Experts in Price Action Trading, such as Al Brooks for example, contend that ALL price bars contain meaningful information for trading and none of them are "noise".

Personally i also have another philosophical issue with the idea of "signal". In trading data, we have (mostly reasonably) sentient beings or their algos, continually trying to outguess what the market is about to do next and responding as fast as they can within their own individual time-frames. This degree of responsiveness of the target in trading is very different to the usual "signals" that Engineering DSP methods have to deal with. Just imagine if the Earth's geology, or a piece of music was self-aware and was continually trying to "trick" the Engineering-types into mis-predicting it!! Traders need to be very careful about what they call "signal" and what they call "noise".

With regard to Ehlers quote that you provide, @James, the words that stand out most to me are: " ... accurate measurements of the dominant cycle ..."
A simple and often quite useful conceptual model of market data is that it consists of a trend component, a cycle component, and noise. It is often quite a good model and one i have experimented with a lot. I have read Ehler's comments that he believes that usually there is only ONE dominant cycle. I think this is his preferred conceptual model rather than a statement of fact. Even if it were true, then the period (and/or phase) of the "dominant cycle" are non-stationary and keep drifting. Anyone who has seriously tried to use the trend+cycle+noise model for trading has found that determining the varying cycle period is difficult. Anyone who has done careful spectral analysis of market data has seen that there is usually more than one significant cycle period in play, and these are not always just the harmonics that give rise to all the usual Fourier synthesis effects like double tops, classical H&S patterns, etc.

My own experience with Ehlers Hilbert transform (which I coded from one of his books) is that for trading data it just does not work as well as it does in its more usual domain of applicability in Electrical Engineering DSP, and again the reasons for that are as mentioned earlier.

Although Ehlers may seem enigmatic in some ways, i think there is an underlying explanation. Although Ehlers may (or may not, i don't know) be a trader, first and foremost he is an author, and presumably also a consultant and still a marketer of his software. As with any kind of marketing, ideas that have been around for a long time often benefit from some rejuvenation or change. MESA has now been around for a long time and in the trading area MESA is very much Ehlers software. Maybe he just figures it is time for something new.

My conclusion as to why Ehlers wrote more than 10 pages about H in his book and then most ambiguously concluded that " it is useless" / "it is useful" / "it is not", was probably he thought that SOMEONE would be interested in it and it might help to sell his book (which it did .... to me at least ;-))

Autocorrelation periodograms are interesting. In one of his books Ehlers has some EasyLanguage code, not very difficult to re-write in other languages, that produces interesting 2-D visual display plots. I tried to improve on it using python with some of its scientific libraries. Although conceptually easy, i found in practice that it was difficult to get from a frequency vs time display to a period vs time display in python, but that was probably just a reflection of my own very limited skills with python library tools. However that's probably getting a bit off topic now. If you want to take up the topic of periodograms further offline, then most welcome to email me at [email protected] or alternatively let's break from this post about H and start a new one.

Cheers, all the best, Tony

@Tony, again, your assessment and conclusions are right to the point. DSP, in any field, can have some value if, and only if, the signal is not almost totally buried in the ambient noise. When a long-term forecast is about 2-10% signal and 90-98% noise, there is no way to really extract the signal with some kind of reasonable confidence level.

From my own limited research in the late 70's I had to come to the same conclusion you did: there is no value in the H coefficient, at least not in the trading world. There is no value in the Fourier transforms as predictors of what is to come. You can make nice pictures of it all over past data, but that does not make them predictive in any way.

We all have to try all that stuff, and ascertain for ourselves their value.

@Tony, thank you for your in-depth commentary and insights on John Ehlers' Engineering/DSP concepts being applied to the financial markets. It has given me a better understanding of the limitations and possible pitfalls of its application to financial trading. Empirical studies of market price structure conclude that they are nonlinear, nonstationary with Pareto-like distribution, yet most analytical/modelling tools out there predominantly still use linear analysis with a Gaussian distribution assumption. Ehlers' approach is another take on modelling using DSP concepts. Nonlinear Dynamics/Chaos Theory/Deep Learning, which I'm a disciple of, is yet another approach. Having said that, all these approaches have the capacity to make profits in trading and to me personally, the holy grail is the approach that can give me sustained profits with manageable risks that adapts to different market conditions over time. This has been a very elusive quest but the search goes on.

Tony, when I gather my thoughts I want to further explore the concept autocorrelation periodograms with you. There is something that Ehlers said that strikes me and seem to be consistent with finding order in low dimensions from Chaos Theory and I quote Ehlers:

Although I knew about Spectral Dilation I was shocked to find that the phenomenon exists all the way down to shorter cycle periods – even below a 10 bar cycle period. At this point Spectral Dilation intercepts the aliasing noise resulting from using sampled data. If we just want to eliminate Aliasing noise, we do not need a smoothing filter longer than a 10 bar cycle period. In fact, using a smoothing filter longer than 10 bars attenuates the data we are trying to use in trading.

Have a good weekend!

Hi @Guy,
Your comment that:" ... there is no value in the Fourier transforms as predictors... " , [with the words "in trading" being implied even if not explicitly stated], actually goes right to the heart of my "philosophical issue with the idea of signal" in a trading context. Of course we know that Fourier transforms (FT) are extremely useful as predictive tools in contexts such as acoustics, astrophysics, geophysics, and many others where there are regular repetitive process at work with extended durations. No doubt it seemed like a reasonable assumption to many people that FT or FFT might also be useful in trading. Unfortunately in trading many apparently "reasonable assumptions" turn out to be either wrong or useless or both and, as Guy points out, FT only gives a picture of the past, as it takes all past information in the time domain and uses it to extract the corresponding frequency domain information, after which all of the time domain info is lost, and there is no reason why the frequency spectrum of market time series data in future would be the same as it was in the past over the extended period required to generate a valid FT. The only exceptions that i can think of might be due to market effects that are driven by genuine, continuously repeating drivers, such as annual or quarterly reporting. If these are in fact potentially useful inputs, then they might show up on a Fourier spectrum as peaks at the corresponding values of N = 252, etc on EOD bars.

My own philosophical contention is that the whole meaning of "signal" is not necessarily the same in market data as it is in a DSP context, and therefore traders have to be very cautious about what can and what can't be taken from a general DSP context into a trading context. I feel that Ehlers is often a little over-enthusiastic in embracing DSP ideas into trading. In some cases they are applicable, but in some cases they are not.

@James, as you say, our goal as traders is to make profit, and to do this in a sustained way comes from understanding and responding appropriately to market behavior. Understanding doesn't have to be perfect to be profitable and, as the famous statistician George Box wrote: "All models are wrong, but some are useful". As Guy implies, H and FT's are just not useful enough! But that doesn't mean giving up on frequency-domain or mixed time-domain & frequency-domain analysis methods altogether. The problem with FT or FFT is that it gives away ALL of the time domain info.

However other newer methods such as Wavelet Transforms and Periodograms manage to keep some time-domain resolution while also extracting some frequency-domain (i.e. periodicity) info. These sort of techniques are potentially a lot more promising. Why? Because we know that market behavior generally does have some degree of persistence over time. Trends continue over time, and even mean-reverting behavior may continue for a while, especially if there is some overshoot and another reversion from the opposite direction. So not only Trends but also Trading Ranges as examples of MR behavior do tend to persist. This then leads to the idea that if we can accurately describe what the market has been doing in the immediate past, then maybe, just maybe, we can actually use that info for a short time into the future. I think this is probably a valid reason why Ehlers now has more enthusiasm for tools like periodograms, and is certainly the basis for my own enthusiasm for "Market Regimes" that may have at least some persistence.

The topic of aliasing that Ehlers refers to relates to the problem that arises when a data series contains components at frequencies higher than (i.e. periodicities shorter than) the limit set by the Nyquist criterion of two samples per cycle, which means that DSP methods are likely to produce erroneous results if used on less than 4 bars. Ehlers does not show any apparent interest in these short timeframes, which require using {OHLC} data and Price Action methods unrelated to DSP.
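The aliasing effect can be shown with a toy example (standalone numpy, my own construction): a cycle shorter than two bars, sampled once per bar, masquerades as a longer cycle in the sampled data.

```python
import numpy as np

# A 1.5-bar cycle sampled once per bar violates the two-samples-per-cycle
# Nyquist limit, so the sampled series is indistinguishable from a 3-bar cycle.
t = np.arange(300)
x = np.sin(2 * np.pi * t / 1.5)  # true period: 1.5 bars

spec = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x))  # cycles per bar
apparent_period = 1.0 / freqs[np.argmax(spec[1:]) + 1]  # skip the DC bin
print(apparent_period)  # ~3 bars, the alias of the 1.5-bar cycle
```

Nothing in the sampled data reveals the true 1.5-bar period, which is why sub-Nyquist cycles simply cannot be recovered from bar data, whatever the method.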

Let me know if/when you want to explore these topics further, but again this is getting away from the Hurst theme of this thread, so let's pursue those topics elsewhere. For now, i feel like we have just about done all we can with H as such, although good luck to anyone who might still be able to squeeze a bit more out of it. Cheers, best regards, Tony.

@Tony, I totally agree with your analysis. It is really well said.

Be it DSP, spectral analysis, density functions, Hurst coefficients, FT or FFT, wavelets, periodograms, ML, DL, or patterns (and I will skip indicators), they all try to extract some information from some lookback period over past data, hoping that some of it might survive beyond the right edge of a price chart. It's understandable: that is all they have to work with, past price series. The surprise is that, at times, coincidentally, they will profit from it.

However, they all suffer from the same sigma assumptions: that there is some meaning buried in all that noise. People stop calling it noise just because they are looking at it from up close, seeing patterns and regression lines that could be extended forward. This ignores that what they see is part of a larger (long-term) picture, and that we might have to navigate this tumultuous storm, this ocean of variance, for a long time, with little in the way of real short-term predictability.

It reminds me of this old Artemis video.

@Luca,
Thanks for the notebook! I've used your other similar one on ML to good effect.
Seems like the mean_rev factor is capturing something, but the trending one is not.
I'll keep trying to use it, and will look at using your framework to test out:

@James, @Tony, @Guy,
I appreciate your thoughts here, and have started looking at Ehlers' work, especially the periodogram work, as I have recently looked at some similar things involving finding signals (radio) in the wild. I can see the Hurst work being frustrating to use as a threshold signal.
I still have hopes for this and other measures, as any measure that can produce information about financial time series can be leveraged.
Yet perhaps my Pollyanna hopes are because I haven't worked with financial time series that long!

alan

Hi @Alan, I don't at all consider your hopes as being "Pollyanna", and in fact I also keep coming back to some ideas (not only H, but even Gann) again & again from time to time, each time hoping to find something that I might have missed the previous times! However, eventually it comes down to time as a precious resource, and there just isn't enough of it to look at everything, so I have to prioritize towards where the most bang-for-the-buck (or for the hours) is likely to be. I think periodograms are likely to be considerably more rewarding, as are various proxies for the info that you wanted to get from H. There are a few of them if you think carefully about it, and some are computationally quite simple. Cheers, Tony

Thanks @Tony...and I'll keep you updated on any progress I make.
alan

Hi @Alan, clearly we all have our own different research methods and different skills & backgrounds. Your mention of "radio signals in the wild" makes me curious about your own background & other interests outside of (but perhaps in some ways related to) trading. I would be interested to discuss with you some time. Meanwhile, I have kept on dropping hints about "proxies for H", but no one seems to care to pick up on it. Email me at [email protected] if interested. Cheers, all the best, Tony.