Trading algo: how to interpret results

Hi guys,

The "getting started with futures" lesson has a complete implementation of a pair trading algo. I'm attaching the algo here for convenience.

Let me give you some context in case you don't remember this specific example.

The idea is that some commodities' prices should be related. In this case they picked crude oil (CL) and gasoline (XB). If they are indeed related, then we expect the price difference between them to mean-revert: if the short moving average of the difference is larger than the long moving average we should short the difference, and vice versa.

To implement the above idea they define variables short_ma = 5 and long_ma = 65 and then compute a zscore. When the zscore is larger (or smaller) than 1.0 (or -1.0) that's a signal to short (or long) the difference between the CL and XB futures. They also define exit signals, which are triggered in the opposite direction when the zscore is 0.0.
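To make the signal concrete, here is a minimal sketch of how such a zscore and the entry/exit rules could be computed. The exact formula in the lesson's algo may differ; the function names, the use of the raw price difference as the spread, and the "exit once the zscore is back at or past zero" rule are my assumptions for illustration.

```python
import numpy as np

def spread_zscore(cl_prices, xb_prices, short_ma=5, long_ma=65):
    """Z-score of the short-window mean spread against the long window.

    Assumption: the spread is the plain price difference CL - XB.
    """
    cl = np.asarray(cl_prices, dtype=float)
    xb = np.asarray(xb_prices, dtype=float)
    spread = (cl - xb)[-long_ma:]           # most recent long_ma observations
    short_mean = spread[-short_ma:].mean()  # short moving average
    long_mean = spread.mean()               # long moving average
    long_std = spread.std()
    if long_std == 0:
        return 0.0                          # flat spread: no signal
    return (short_mean - long_mean) / long_std

def next_position(zscore, current_position, entry=1.0):
    """Return -1 (short the spread), +1 (long the spread) or 0 (flat)."""
    if zscore > entry:
        return -1
    if zscore < -entry:
        return 1
    # Assumed exit rule: close once the z-score has reverted to or past zero.
    if current_position == -1 and zscore <= 0.0:
        return 0
    if current_position == 1 and zscore >= 0.0:
        return 0
    return current_position
```

Note that with short_ma this small, each extra day in the short window changes the short mean noticeably, which is one reason the signal can flip on different days for short_ma = 4, 5 or 6.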

Now let's discuss results and how to interpret them.

Let's pick a time period to discuss specific numbers: from 2016-01-01 to 2017-09-29.

  • If you run the algo on that time period as is, you get total return 34.27% and sharpe ratio 2.16.

Let's play a bit with the entry signal.

  • If you set short_ma = 2 you get total return -7.71% and sharpe -0.47.

  • If you set short_ma = 3 you get total return 6.02% and sharpe 0.46.

  • If you set short_ma = 4 you get total return 20.5% and sharpe 1.45.

  • If you set short_ma = 6 you get total return 7.44% and sharpe 0.56.

  • If you set short_ma = 7 you get total return 16.92% and sharpe 1.20.

  • If you set short_ma = 8 you get total return 2.52% and sharpe 0.22.

Now, I understand that there are other things we can play with besides short_ma. We could change long_ma, we could try a different signal than zscore, or we could change the exit signal, etc. There are many different ways to implement a specific idea.

But how to optimise isn't the point of this post. What I'd like to discuss is: How much optimisation is too much optimisation? Here for example the idea is sound, but the results seem way too sensitive to tiny changes.

This is important if you want to go and find other viable pairs. I've tried dozens of other highly correlated pairs and couldn't get good results with short_ma = 5. Is it that for some pairs the idea doesn't work at all (even if the pair is correlated), or could it be that short_ma = 6 or 7 would yield better results? And if it does, how can you detect if a good result for a specific free parameter is a fluke? If you try enough signals with enough assets, it will happen sooner or later.
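To illustrate the "it will happen sooner or later" point, here is a small sketch that backtests nothing at all: it just draws random daily returns with zero true edge and reports the best Sharpe among many tries. The 440 trading days (roughly the length of the period above), the 1% daily volatility, and the function names are all assumptions for illustration.

```python
import numpy as np

def annualized_sharpe(daily_returns):
    """Annualized Sharpe ratio, assuming 252 trading days and zero risk-free rate."""
    r = np.asarray(daily_returns, dtype=float)
    sd = r.std(ddof=1)
    return 0.0 if sd == 0 else float(np.sqrt(252) * r.mean() / sd)

def best_sharpe_on_noise(n_trials, n_days=440, seed=0):
    """Best Sharpe among n_trials 'strategies' whose returns are pure noise."""
    rng = np.random.default_rng(seed)
    # Each row is one hypothetical pair/parameter combo with no real edge.
    returns = rng.normal(0.0, 0.01, size=(n_trials, n_days))
    sharpes = [annualized_sharpe(row) for row in returns]
    return max(sharpes)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, "trials -> best Sharpe on noise:", round(best_sharpe_on_noise(n), 2))
```

Over a window this short the sampling error of an annualized Sharpe is roughly 0.75 even when the true Sharpe is zero, so the best of a few dozen pair/parameter combinations can look respectable by chance alone. Comparing your best result against this kind of noise baseline is one rough way to judge whether it could be a fluke.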

2 responses

We have this problem in machine learning as well. I will share our solution for it:
You have 3 sets of data:

  1. Data you fit/train your model on, called train data in ML,
  2. Data you use to test and optimize your model's free parameters (also called hyperparameters), called validation data in ML,
  3. Data you evaluate your final model on, called the test set in ML.

If the parameters perform well on both the validation set and the test set, then they are good. If they perform well only on the validation set, then you have overfitted to the validation set.
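Applied to a backtest, the three sets above could look like the sketch below. It is only an outline under my own assumptions: the data is split in time order (so later data never leaks into earlier choices), `score` is a hypothetical function you supply that backtests one parameter value on one period and returns, say, its Sharpe ratio, and the split fractions are arbitrary.

```python
def chronological_split(n_obs, train_frac=0.5, val_frac=0.25):
    """Split n_obs time-ordered observations into train/validation/test slices."""
    train_end = int(n_obs * train_frac)
    val_end = train_end + int(n_obs * val_frac)
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n_obs)

def select_and_confirm(candidates, score, val, test):
    """Pick the best candidate on the validation period, then score it once on test.

    `score(param, period)` is a user-supplied (hypothetical) backtest function.
    """
    best = max(candidates, key=lambda p: score(p, val))
    return best, score(best, val), score(best, test)

# Usage with a toy score table standing in for real backtests:
if __name__ == "__main__":
    train, val, test = chronological_split(440)
    toy = {("val", 5): 2.0, ("val", 6): 0.5, ("test", 5): 0.1, ("test", 6): 0.4}
    score = lambda p, period: toy[("val" if period is val else "test", p)]
    print(select_and_confirm([5, 6], score, val, test))
```

For a rule like the moving-average crossover there is little to "train", so the train slice may go unused; the important part is that the test period is looked at only once, after short_ma has been fixed on the validation period.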

Hey @Illjia, thanks for coming back about this. You're right, of course. However, in this particular case there was a bigger issue: that notebook had a critical bug. Have a look here.