Porting a scikit-learn model from the Research environment to the Algorithms environment


I've successfully used Quantopian's Research environment to build a random forest model that predicts forward price movements of stocks.

Now, I'd like to port the model to the Algorithms environment to run full back tests on the model.

The pickle module is not supported in Quantopian's environments. What's the best way to port my model to the Algorithms environment?

7 responses

The requests for either pickle or joblib have been ongoing for years now. Just ridiculous, especially if you want to deploy a predictor.

I understand the risk of being able to export data from Quantopian with pickle or joblib. However, implementing a load-only function from either pickle or joblib would mitigate that risk and give everybody what they want.

However, it is 2016 and scikit-learn is becoming more popular. Having support for using models trained outside of Quantopian would be awesome.

I gave up on building these sorts of models since you cannot use pickle. It's a mystery to me why Q can't make this work within the data licensing requirements.

We need pickle to work in Q!

Yes, I'm running into the same problem. I want to build a model externally and use it in the strategy, or build a model in Research and use it in a live strategy.

Unless you're building something that is overly complex with a metric crapload of features, as a workaround, check out Jason Brownlee's book on implementing many of the scikit-learn algos in raw Python. He breaks down how many of the algos work and breaks open the black box of scikit-learn. If you substitute pandas where necessary, you can streamline his code even more.

Really, the amount of data you would be pickling from the scikit-learn object on a trained model is fairly minimal. Of course, that changes depending on the number of features and the algo type. It's minimal enough that you could follow Jason's examples, load in the trained data you need via CSV, and be ready to do some damage.
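To make the CSV idea concrete, here is a minimal sketch of what it could look like for a small random forest classifier. It assumes you have dumped each fitted tree's node arrays from Research (the columns mirror the children_left, children_right, feature, threshold and value arrays on scikit-learn's fitted tree_ object, where -1 marks a leaf). The CSV layout, the column names, the tiny two-tree forest and the function names are all made up for illustration, and how the CSV text reaches your algorithm is left out.

```python
import csv
import io

# Hypothetical export: one row per tree node. The numbers are invented for
# illustration; they are not from a real trained model.
TREE_CSV = """tree,node,left,right,feature,threshold,n_class0,n_class1
0,0,1,2,0,0.5,6,4
0,1,-1,-1,-2,-2.0,5,1
0,2,-1,-1,-2,-2.0,1,3
1,0,1,2,1,-0.2,5,5
1,1,-1,-1,-2,-2.0,1,4
1,2,-1,-1,-2,-2.0,4,1
"""


def load_forest(csv_text):
    """Parse the CSV text into a list of trees, each a dict of node id -> node."""
    forest = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        nodes = forest.setdefault(int(row["tree"]), {})
        nodes[int(row["node"])] = {
            "left": int(row["left"]),
            "right": int(row["right"]),
            "feature": int(row["feature"]),
            "threshold": float(row["threshold"]),
            "counts": [float(row["n_class0"]), float(row["n_class1"])],
        }
    return [forest[key] for key in sorted(forest)]


def tree_proba(nodes, x):
    """Walk one tree from the root (node 0) to a leaf; return class probabilities."""
    node = nodes[0]
    while node["left"] != -1:  # -1 marks a leaf
        # Samples with feature value <= threshold go to the left child.
        if x[node["feature"]] <= node["threshold"]:
            node = nodes[node["left"]]
        else:
            node = nodes[node["right"]]
    total = sum(node["counts"])
    # Normalizing works whether the leaf values are counts or fractions.
    return [count / total for count in node["counts"]]


def forest_proba(forest, x):
    """Average the per-tree class probabilities across the forest."""
    per_tree = [tree_proba(nodes, x) for nodes in forest]
    n_classes = len(per_tree[0])
    return [sum(p[k] for p in per_tree) / len(per_tree) for k in range(n_classes)]


forest = load_forest(TREE_CSV)
probs = forest_proba(forest, [0.9, -0.5])
predicted_class = probs.index(max(probs))
```

Averaging the per-tree probabilities (rather than taking a majority vote) follows how scikit-learn's RandomForestClassifier combines its trees. A real forest has far more nodes and trees, but the traversal stays the same.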

As more food for thought, there is a recent discussion on Quora from Quant Jesus Ernie Chan about machine learning algos, worth reading many times over. Unless, of course, it's a mirage and he's secretly converting his bank account into an extension of the Federal Reserve banking system with an XGBoost algo he's put together.

I just wanted to add my voice here: the fact that we can't load a pre-trained model is utterly ridiculous. As of December 2017 this feature is not on any timeline and Q cannot offer even an idea of when it might be available.

Such a shame, as the infrastructure is great; this is a major oversight that the competitors (e.g. Quantiacs) have already addressed. Thus far, this is preventing me (and I'm sure plenty of others) from using Quantopian.

Yes please. What everyone else has already said...