Predictron 1.0: a machine learning attempt

Hello everybody,

I want to share my first algo, which is an attempt at implementing machine learning. This algo isn't optimised, so its results aren't reliable. However, my intention is to improve it with your help.

This algo is called "Predictron 1.0" because it uses the Perceptron, a machine learning classifier (if you want to know more about it, see: Perceptron on Wikipedia). The Perceptron classifies each trade into one of two classes, long or short: if it outputs 1, the trade is classified as long; if it outputs -1, the trade is short. As inputs, the Perceptron takes the results of 3 different strategies, each of which gives a buy/sell signal on its own; the Perceptron then "mixes" these 3 results into the final signal. Obviously, the more efficient the 3 input signals are, the more reliable the final result of the Perceptron will be.
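To make the "mixing" step concrete, here is a minimal sketch of a perceptron that combines three buy/sell signals into one long/short decision. It is plain Python written for illustration, not the code of the attached algo; the function names, the training history and the labels are all hypothetical.

```python
def predict(weights, bias, signals):
    """Return 1 (long) or -1 (short) for one row of strategy signals."""
    activation = bias + sum(w * s for w, s in zip(weights, signals))
    return 1 if activation >= 0 else -1


def train_perceptron(rows, labels, epochs=10, learning_rate=1.0):
    """Classic perceptron rule: weights change only on misclassified rows."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for signals, label in zip(rows, labels):
            if predict(weights, bias, signals) != label:
                weights = [w + learning_rate * label * s
                           for w, s in zip(weights, signals)]
                bias += learning_rate * label
    return weights, bias


# Hypothetical history: each row holds the 3 strategy signals for one day
# (1 = buy, -1 = sell); each label is the direction that would have been right.
history = [(1, 1, -1), (1, -1, -1), (-1, 1, 1),
           (-1, -1, 1), (1, 1, 1), (-1, -1, -1)]
labels = [1, 1, -1, -1, 1, -1]

weights, bias = train_perceptron(history, labels)

# Combine today's three (hypothetical) signals into one final signal.
final_signal = predict(weights, bias, (1, -1, 1))
print(weights, bias, final_signal)
```

In this made-up history the correct direction always matches the first strategy, so the trained perceptron ends up putting all its weight on that input. With real signals the weights would reflect how useful each strategy has been over the training window.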

My initial goal was to make this algo work. And now it does.

The next step is to find 3 strategies (or more, if necessary) that are more efficient than the ones I'm already using, remembering that the new strategies have to produce only 2 kinds of signals: buy or sell.

Another feature to improve is the "make_pipeline" function. In short, we have to answer the following question: given 3 strategies, which are the best stocks to use? So far, as my 3 strategies aren't very efficient, I have only written a draft of the "make_pipeline" function.

After these improvements, the "my_assign_weights" function should also be improved, in my opinion. Right now it assigns equal weights to all the stocks, without considering any parameter.

I hope this algo will be useful to everyone who reads my post.

[Attached backtest, ID: 588dd4b4220bd9620ac2daed]
21 responses

Hi all again,

I have a newer version of this algo with a lot of bugs fixed. I strongly suggest using this new one instead of the previous one.

[Attached backtest, ID: 5890c14076e6155e37beed5b]

Really impressive Python coding!

I think it is a good general rule to always test first with max leverage=1, and to include the 2008 crash. From 2010 on there is a strong bull market, so "almost anything" works. Here is the same code with leverage=1 starting from 2006.

[Attached backtest, ID: 5890dbef5943a95e398519b0]

Thanks for your advice Greg!

Actually this second version is affected by the same problems as the first version (3 strategies, make_pipeline and my_assign_weights).
I decided to publish it so that we can try to improve it all together.

I'm also working on another kind of machine learning algo, based on Adaline. But I haven't finished it yet.

I am not an advanced coder; the only coding I can do is VBA, and sparingly. But I think I can add to the prediction side of this: I have a model/indicator that I've coded in VBA/Excel that gives buy and sell signals for stocks, currencies, commodities and a random indicator on a site called binary.com. If you wish, I can give you a live demo of this wherever you are. This particular site offers an API, which I have already set up: I got an API token for a demo account and I can trade on it from where I am.

Hi Don,

Ok. If you want to share the code you can do it here, or if you prefer you can send me an email at [email protected]

I will try to translate it from VBA (a language that I know) to Python. Then I will publish the code here, if you agree. Otherwise I will send the code back to you.

Hi Gregorio,

I would add one more general rule for backtesting:
use the default slippage and commission models.
They are not perfect, but they protect you from self-deception.
If you comment out lines 97 and 98, you will get...
I think the "liquidizer" is draining money.

Overall a nice idea and knowledgeable coding.
Good luck making it profitable.

[Attached backtest, ID: 5897f2e5a386bc62151696c0]

Hi Vladimir,

Thanks for your suggestions.

Actually this algo is intended only as a "structure": I mean, it doesn't have reliable strategies, so I wasn't expecting it to work well.

Moreover, as the baskets of stocks traded aren't optimised (the make_pipeline function is a draft), I didn't care about their liquidity. That's why I set the slippage to 0. Also, in your last version the maximum leverage goes up to 6.79: this means that buying happens before selling. That is another thing to improve that I still haven't considered.
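The leverage effect described here can be illustrated with a small sketch. It assumes gross leverage is measured as the sum of absolute position values divided by account equity; the tickers and amounts are hypothetical and the code is not taken from the attached algo.

```python
def gross_leverage(position_values, equity):
    """Gross leverage = sum of absolute position values / account equity."""
    return sum(abs(value) for value in position_values.values()) / equity


equity = 100000.0

# Hypothetical rebalance from stock AAA into stock BBB.
before = {"AAA": 100000.0}                    # fully invested in AAA
during = {"AAA": 100000.0, "BBB": 100000.0}   # BBB bought, AAA not yet sold
after = {"BBB": 100000.0}                     # AAA sold

for name, positions in (("before", before), ("during", during), ("after", after)):
    print(name, gross_leverage(positions, equity))
```

While the buy has filled and the sell has not, the account briefly holds both positions, so leverage doubles. Placing the sell orders first, or waiting for them to fill before buying, is one way to keep the peak closer to 1.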

About commissions, actually I could leave them at the default value.

@Loic: Hi, I will certainly consider your suggestion about SPY. Feel free to modify this algo as you wish :).

Hi Gregorio,

Here is a version without the "liquidizer", plus some other changes.

[Attached backtest, ID: 589849a673a762623b1b1a27]

Hi all,

Thanks Vladimir for your contribution.

I've updated all the code. I think that now we have a better universe, but the strategies used are still not efficient.

Here's the code with the backtest.

[Attached backtest, ID: 589bffa6e330285e04de9d49]

To define the new universe I used the algo from this article: Link

Hi Gregorio,
Very impressive work! I am rather new to ML and am studying your newest code, posted on Feb 9, and there is something I find confusing. I hope that you can give me a little explanation.

Regarding the prediction part, here is how I understand it.
Say I had a "long" signal for this stock, so we go long for a week. Within this week, every day, I get 3 signals from the 3 strategies as the input and the gain/loss as the output, which I use to train the model. When the new week comes, based on the trained model and the signal of the day, I buy/sell that stock.

Here is my confusion. Suppose I receive a (1,1,1) signal and the model tells me to short (-1), and I have a high enough accuracy score to take that order. Now, suppose that in the coming week I receive (1,1,1) as the signal every day and the stock actually drops every day. We will be getting a (1) for y on a daily basis, but the model will still predict (-1), giving it a 0% accuracy score, even though the strategy has been performing correctly.

What I am getting at here is that we would like to map 3 signal inputs to a buy/sell recommendation, but if the output data we use for training is gain/loss, which flips sign when we switch between buy and sell, we may not be training the model correctly, right?

Is there something wrong with how I interpreted this code?
Sunny
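The scenario in this question can be reproduced in a few lines. The sketch below follows the description in the post rather than the algo's actual code: it compares a label based on the gain/loss of the position taken with a label based on the direction of the stock itself. All names and return values are hypothetical.

```python
def pnl_label(position, stock_return):
    """1 if the position made money, -1 otherwise (depends on the side taken)."""
    return 1 if position * stock_return > 0 else -1


def direction_label(stock_return):
    """1 if the stock rose, -1 otherwise (independent of the side taken)."""
    return 1 if stock_return > 0 else -1


def accuracy(prediction, labels):
    """Fraction of labels that match a constant prediction."""
    return sum(prediction == label for label in labels) / len(labels)


prediction = -1                          # the model says "short" for (1, 1, 1)
daily_returns = [-0.01, -0.02, -0.005]   # hypothetical: the stock drops every day

pnl_labels = [pnl_label(prediction, r) for r in daily_returns]
direction_labels = [direction_label(r) for r in daily_returns]

print(pnl_labels, accuracy(prediction, pnl_labels))
print(direction_labels, accuracy(prediction, direction_labels))
```

With gain/loss labels the short that made money every day is scored as 0% accurate, because y is 1 while the prediction is -1. With direction labels the same prediction scores 100%, which suggests training on the direction of the price move rather than on the profit of the position.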

Hello Sunny,

You are right. Following your suggestions, I decided to completely rewrite this algo.

Main features of the upgrade:

  • Now it's possible to add strategies easily
  • Now the predictions are made every day, and the portfolio is rebalanced every day
  • The positions are closed every evening
  • It's easier to change the parameters in the pipeline
  • This new algo is hedged, thanks to the new my_assign_weights function
  • It's possible to switch to other machine learning classifiers more easily
  • You can easily choose the number of traded stocks and the training period
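As an illustration of what a hedged weighting scheme can look like, here is a minimal sketch that puts half of the capital into the predicted longs and half into the predicted shorts, equally weighted within each side. This is an assumption about one possible design, not the actual my_assign_weights function; the name hedged_weights and the tickers are hypothetical.

```python
def hedged_weights(predictions):
    """Equal-weight longs to +50% and shorts to -50% of capital.

    predictions maps each stock to 1 (long) or -1 (short).
    """
    longs = [stock for stock, p in predictions.items() if p == 1]
    shorts = [stock for stock, p in predictions.items() if p == -1]
    weights = {}
    for stock in longs:
        weights[stock] = 0.5 / len(longs)
    for stock in shorts:
        weights[stock] = -0.5 / len(shorts)
    return weights


# Hypothetical classifier output for four stocks.
predictions = {"AAA": 1, "BBB": 1, "CCC": -1, "DDD": -1}
weights = hedged_weights(predictions)

net_exposure = sum(weights.values())
gross_exposure = sum(abs(w) for w in weights.values())
print(weights, net_exposure, gross_exposure)
```

When both sides are populated, net exposure is zero and gross exposure is 1. If the classifier predicts only longs or only shorts on a given day, this sketch invests just 50% of capital on that side and the portfolio is no longer hedged, so a real implementation would need to decide how to handle that case.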

This new algo is an upgrade so it looks a bit different from the previous ones, but the logic is the same.

Here we go with the backtest :)

[Attached backtest, ID: 58cb8afedac11f1c00134b21]

It is quite difficult to get these types of classifiers to work on time series data. If Quantopian provided at least scikit-learn 0.18, you'd have a built-in MLPClassifier and wouldn't have to code it yourself, but that still won't help you much for time-based predictions.

What you may find them useful for is automated trade management, meaning you want to optimize a portfolio subject to some known set of conditions. I've found Support Vector Machines and Random Forests to be somewhat useful here. You'll likely have to run a grid search optimization outside of Quantopian to determine the best values (SVC: C, gamma, etc.). The usage is similar to Quantopian's Optimize API.
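For readers unfamiliar with grid search, the idea is simply to score every combination of candidate parameter values and keep the best one. The sketch below shows that loop in plain Python. The scoring function is a hypothetical stand-in with a known peak; in practice it would train an SVC with the given C and gamma and return an out-of-sample score.

```python
import itertools
import math


def grid_search(score_fn, grid):
    """Evaluate score_fn on every parameter combination and return the best."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(C, gamma):
    """Hypothetical stand-in for a validation score; peaks at C=10, gamma=0.01."""
    return -((math.log10(C) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2)


# Candidate values on a log scale, as is common for C and gamma.
grid = {"C": [0.1, 1.0, 10.0, 100.0], "gamma": [0.001, 0.01, 0.1]}
best_params, best_score = grid_search(toy_validation_score, grid)
print(best_params, best_score)
```

The cost grows with the product of the grid sizes (12 evaluations here), which is why a search over a real model and a long price history may not fit inside a backtest's time limits.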

Really, we need support for tensorflow/tflearn/keras with offline-trained models. Deep multilayer networks with LSTMs and dense fully connected layers cannot possibly be trained under the current execution/memory restrictions. My neural networks often take day(s) to train on high-end GPUs. Once trained, they can execute efficiently.

There are some really great things about this "Quant" platform but support for machine learning isn't very strong yet. I really hope that gets improved soon.

Hi Gregorio,

Thanks for the reply! For the perceptron that you used, may I ask if there is a reason that you use L2 over L1 and elastic net? I have seen arguments for all 3, but never in the context of quant trading. I would like to hear your reasoning if possible.

Thank you,
Sunny

Hi John,

I agree, but considering what we can do now, in my opinion we should at least try to make this thing work.

Hi Sunny,

Actually penalty='L2' is a kind of regularisation of the machine learning classifier: it helps to reduce overfitting and to keep extreme values under control (by adding some bias). As far as I know, L2 is the most commonly used penalty.

I'm still working when I can on this algo. I hope to release a new updated version soon.
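To make the difference between the three penalties concrete, here is a small sketch using common textbook definitions. The 0.5 factor on the L2 term and the l1_ratio mixing convention are assumptions; a given library may scale these terms differently. The last function shows how an L2 penalty appears in a perceptron-style update as a shrink factor on the weights.

```python
def l1_penalty(weights, alpha):
    """alpha * sum of absolute weights; tends to push weights to exactly zero."""
    return alpha * sum(abs(w) for w in weights)


def l2_penalty(weights, alpha):
    """0.5 * alpha * sum of squared weights; shrinks large weights the most."""
    return 0.5 * alpha * sum(w * w for w in weights)


def elastic_net_penalty(weights, alpha, l1_ratio):
    """Weighted mix of the L1 and L2 penalties."""
    return (l1_ratio * l1_penalty(weights, alpha)
            + (1.0 - l1_ratio) * l2_penalty(weights, alpha))


def l2_regularised_update(weights, signals, label, learning_rate, alpha):
    """One update on a misclassified row, with L2 weight decay applied first."""
    shrink = 1.0 - learning_rate * alpha
    return [shrink * w + learning_rate * label * s
            for w, s in zip(weights, signals)]


# Hypothetical weights for 3 strategy signals.
weights = [2.0, 0.0, -1.0]
print(l1_penalty(weights, alpha=0.1))
print(l2_penalty(weights, alpha=0.1))
print(elastic_net_penalty(weights, alpha=0.1, l1_ratio=0.5))
print(l2_regularised_update(weights, (1, -1, 1), label=-1,
                            learning_rate=1.0, alpha=0.1))
```

With only 3 inputs there is little for L1 to prune, so the practical difference between the penalties is likely to be small here; it matters more once many strategies are fed to the classifier.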

I'm sorry if this is obvious, but does the horizon refer to how far the prediction is made into the future? If so, how would you go about optimizing that?
And I looked, but I didn't see you mention what the algorithm is using as buy/sell signals; did I miss something?
Thanks

It is quite difficult to get these types of classifiers to work on time series data.

John, the big question is whether ML has any place in making predictions in the financial markets. You mention "automated trade management", but that assumes you have some sort of system to choose the stocks in the first place.

As to "optimisation", all of ML seems to have that objective at its heart. But is the prediction of financial instrument pricing a suitable topic for ML? If not, Keras et al. are a waste of time.

As to "time series data", it presumably boils down to the independence or otherwise of the price movements of financial instruments: determinism or randomness. Some claim they are a random walk, others that patterns such as trends or mean reversion are to be found.

As a layman in the world of ML (but an interested one), it seems to me so far that ML's great successes have come in deterministic systems: the rules of chess, Go, cards, poker, where the probabilities can be accurately assessed and the odds forecast.

Although some improvement seems to have been made in predicting chaotic systems such as the weather over the last 50 years, would it be accurate to say that ML is better at telling us what is than what may be? Recommender systems on Netflix and Amazon rely on the fact that, while people's taste can change, in general it doesn't, and the same people will carry on liking the same books and films.

Reinforcement learning to fly a toy helicopter seems to have been a great success but again I assume that there may be at least a partly deterministic environment involved with the exception of the wind; Newtonian laws presumably dominate.

Out of the vast plethora of machine learning algos out there, from Bayesian classification to neural networks to genetic algorithms, most of them actually, broadly, aim to do the same thing. And so the question remains: while ML may be of use in image recognition and astronomical classification, is it actually of much use in predicting the behaviour of an apparently chaotic system such as the stock market?

Or is the stock market perhaps NOT random and chaotic? Certainly not in the long term - it is merely a reflection of increased wealth from 300 years of technological revolution.

But in particular have you personally had any success in generating alpha from using ML? And if so, what approach and class of ML algo do you prefer?

It has deterministic and random components. The strongest deterministic one is inflation, then the general market trend (a sine wave with a period of 5 to 10 years, driven by momentum until limited by some saturation factor??). There may be hidden deterministic periodic events, such as a large fund that buys stocks at a certain time every month; ML may be good for these. Then there is a large random component which can never be modeled or predicted. Like all noise, the random component becomes larger on shorter time scales.
If a pattern isn't apparent when looking at a chart, it probably does not exist; humans are superb at identifying patterns. ML may be useful since it can look at thousands of stocks on many different time scales. However, since overfitting has been proven to be a problem, and ML is just fitting with hundreds of coefficients, I don't see how overfitting can be avoided with ML.

Plus I have yet to see someone post a backtest that works.
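The remark above that noise dominates on shorter time scales can be put into numbers under a simple assumption: daily returns are independent, with a constant drift (the deterministic part) and a constant volatility (the random part). The expected move then grows with the horizon, while the noise grows only with its square root. The drift and volatility values below are hypothetical.

```python
import math


def signal_to_noise(daily_drift, daily_vol, horizon_days):
    """Expected move over the horizon divided by its standard deviation."""
    expected_move = daily_drift * horizon_days
    noise = daily_vol * math.sqrt(horizon_days)
    return expected_move / noise


daily_drift = 0.0003   # hypothetical deterministic component per day
daily_vol = 0.01       # hypothetical random component per day

# Horizons of roughly one day, one month, one year and ten years.
ratios = {h: signal_to_noise(daily_drift, daily_vol, h)
          for h in (1, 21, 252, 2520)}
for horizon, ratio in sorted(ratios.items()):
    print(horizon, round(ratio, 3))
```

Under these assumptions the ratio scales with the square root of the horizon, so a one-day prediction has to work with a far weaker signal relative to noise than a one-year prediction.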

This is interesting. It is important to remember, though, that machine learning, in relatively elementary implementations, is best used as "insurance". Here is an example using mean reversion and stocks that have just entered the Q500 or are falling out of the Q500.
I will post the same algorithm, but with zero prediction aspect.

[Attached backtest, ID: 58d59e14cf56ff1bdac17ce5]

I said I'd post it, so here it is: the same methods, just without prediction.

[Attached backtest, ID: 58d612bcad5fd91c7b33f70c]

@Anthony FJ Garner - I can't say what I do, but for some insight into how certain algorithms can be used you can see Tucker Balch's QuantCon 2016 presentation here