Machine Learning

For most machine learning algorithms it is necessary to use separate training and validation sets. Can this be realized with Quantopian? Ideally the training data would be available before handle_data is called the first time.

6 responses

Hi Manuel,

I've been doing a lot of experimenting with machine learning on Quantopian. I'm not sure if what you're describing is possible. However, you can definitely think of creative ways to split training/testing data. I've gotten around it using a "moving" training set with @batch_transform. You could also just build your training set as handle_data is called: once the set reaches a certain length, do your training, and only start trading afterwards for validation.

Check out my code: https://www.quantopian.com/posts/can-naivebayes-tell-us-anything-about-momentum-trading#51265f83f9d6c327a00001ff
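The second workaround (collect data inside handle_data, train once, then trade) can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from the thread: it runs outside Quantopian, passes a bare price to handle_data instead of the platform's data object, and uses a trivial "mean price" rule as a stand-in for a real learner. TRAINING_LENGTH, train, and run_backtest are hypothetical names.

```python
from types import SimpleNamespace

TRAINING_LENGTH = 5  # hypothetical: number of bars to collect before training


def initialize(context):
    context.training_prices = []  # grows until TRAINING_LENGTH is reached
    context.model = None          # stays None until training has run
    context.signals = []          # out-of-sample decisions, kept for inspection


def train(prices):
    """Stand-in for a real learner: remember the mean training price."""
    return {"mean_price": sum(prices) / len(prices)}


def handle_data(context, price):
    if context.model is None:
        # Training phase: only collect data, never trade.
        context.training_prices.append(price)
        if len(context.training_prices) == TRAINING_LENGTH:
            context.model = train(context.training_prices)
        return
    # Validation phase: the model is frozen and applied to unseen bars.
    signal = "buy" if price < context.model["mean_price"] else "sell"
    context.signals.append(signal)


def run_backtest(prices):
    """Tiny driver that mimics a backtester calling handle_data per bar."""
    context = SimpleNamespace()
    initialize(context)
    for price in prices:
        handle_data(context, price)
    return context
```

Because the model is fitted only once and never sees the later bars, every signal it produces is out of sample with respect to the training set.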

Alex

Hi Alex,
Thanks for your answer. I'll try your workarounds for now. It would be great if the Quantopian developers read this and implemented a more elegant solution.
Manuel

Alex, great workarounds, thank you for sharing!

Hi Manuel,

Thanks for raising this point. I wanted to explore the design a bit with you to make sure I understand what you would like to see. What popped to mind first was another decorator, but for functions to be exclusively called from the initialize function. Here's what I was thinking:

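@train(training_window_length=30)
def my_training_method(datapanel):
    # your training code goes here; datapanel holds the reserved training window
    parameters = {}  # placeholder for whatever your training produces
    return parameters

def initialize(context):
    # the proposed decorator would supply datapanel, so no argument is passed here
    context.parameters = my_training_method()

def handle_data(context, data):
    # use context.parameters in your logic
    pass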

The behavior would be that the first training_window_length days of data would be reserved as training, and sent as a datapanel (same format as batch) to your training method. You'd operate on it, and return whatever results you want to use in your algorithm. Then, handle_data would start receiving data starting with the first trading day after your training period.
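To make the behavior described above concrete, here is a rough plain-Python emulation. It is only a sketch of the proposed semantics, not Quantopian's implementation: the train decorator here merely tags the function with its window length, and a hypothetical run driver reserves the first bars for training and feeds the rest to handle_data. A list of prices stands in for the datapanel.

```python
from types import SimpleNamespace


def train(training_window_length):
    """Hypothetical decorator: tag a function with its training window."""
    def decorator(func):
        func.training_window_length = training_window_length
        return func
    return decorator


@train(training_window_length=3)
def my_training_method(window):
    # Stand-in for real training: average of the reserved bars.
    return {"mean_price": sum(window) / len(window)}


def handle_data(context, price):
    # Record each unseen bar and whether it is above the trained mean.
    context.seen.append((price, price > context.parameters["mean_price"]))


def run(prices, training_method):
    """Hypothetical driver: reserve the first n bars, then replay the rest."""
    n = training_method.training_window_length
    context = SimpleNamespace(seen=[])
    context.parameters = training_method(prices[:n])
    for price in prices[n:]:
        handle_data(context, price)
    return context
```

With prices [1.0, 2.0, 3.0, 4.0, 1.0], the first three bars are consumed by training and handle_data only ever sees the last two.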

What do you think?

thanks,
fawce

If I can jump in here, I think that sort of decorator would make this very easy.

Thanks
Alex

Thanks, fawce, for your reply. The behavior you're describing would be perfect.

One more thing we'd need is serialization of the model parameters. You don't want to start with an untrained model in a live environment.
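As an illustration of what such serialization could look like in plain Python, the sketch below round-trips a parameter dictionary through the standard pickle module. It assumes the parameters are ordinary picklable objects and says nothing about how or where Quantopian would store them; serialize_parameters and deserialize_parameters are hypothetical helper names.

```python
import pickle


def serialize_parameters(parameters):
    """Turn trained parameters into bytes that can be stored between runs."""
    return pickle.dumps(parameters, protocol=pickle.HIGHEST_PROTOCOL)


def deserialize_parameters(blob):
    """Restore parameters; only unpickle data from a source you trust."""
    return pickle.loads(blob)
```

A live algorithm could then load the stored bytes at startup instead of retraining from scratch.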

Is this decorator working right now?