yet another feature request

A few ideas: correlation checks on returns, histograms/plots of returns, sample statistics of returns, and bootstrapped statistics of returns (e.g. maximum drawdown). Let me know if you think these would work, wouldn't work, or have already been requested.

7 responses

Hello Taylor,

Thanks for the request. I'm not quite following yet. Are you thinking about a Monte Carlo-type simulation where some possible future is played forward repeatedly? Or are you thinking about having some parameters and looking at how the results change as the parameters vary? Or are you talking about comparing many different algos entirely?

Thanks

Dan

The first one. In the case of the maximum drawdown, you just take your returns, assume they're iid (I think), sample with replacement, compute the maximum shitty run (i.e. the drawdown) for that resample, and do it a bunch of times to approximate the sampling distribution. I know this resampling stuff can get hairy, so you might want to talk to an expert first to see what's feasible.
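For anyone following along, here is a minimal sketch of the resampling idea described above, assuming the returns really are iid (which daily returns generally are not, so a block bootstrap may be more appropriate). The function names `max_drawdown` and `bootstrap_max_drawdown`, the `n_boot` and `seed` parameters, and the synthetic return series are all illustrative assumptions, not anything provided by Quantopian.

```python
import numpy as np

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    # Prepend the starting equity of 1.0 so a loss on the first day counts.
    equity = np.concatenate(([1.0], equity))
    running_peak = np.maximum.accumulate(equity)
    return float(np.max(1.0 - equity / running_peak))

def bootstrap_max_drawdown(returns, n_boot=1000, seed=0):
    """Approximate the sampling distribution of max drawdown by iid resampling."""
    rng = np.random.default_rng(seed)
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        # Sample n returns with replacement (this is where iid is assumed).
        resample = rng.choice(returns, size=n, replace=True)
        draws[i] = max_drawdown(resample)
    return draws

# Synthetic daily returns, purely for illustration.
demo_rng = np.random.default_rng(123)
demo_returns = demo_rng.normal(0.0005, 0.01, size=252)
demo_draws = bootstrap_max_drawdown(demo_returns, n_boot=500)
print("observed max drawdown:", max_drawdown(demo_returns))
print("bootstrap 5th/50th/95th percentiles:", np.percentile(demo_draws, [5, 50, 95]))
```

Because the resampling shuffles away any serial dependence, the bootstrapped drawdowns can understate what a strategy with clustered losses would actually experience.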

Understanding what you mean by the second one is difficult for me since I don't know what sort of models we're talking about. But I would say generally after you fit a model you get the parameter estimates, and you don't really change them. If you're changing them, you never really fit anything. I might be missing something, but this seems like it might be going the way of data snooping, which leads us astray on how well we think our algos perform. With the third thing, yeah, model selection is important but this wasn't what I had in mind.

Maybe I'll put together a batch transform example that periodically selects the best ARIMA model, then uses it to trade the next "x" days out. There are more elegant ways to capture this quasi-periodicity (not totally sure if I'm using this word correctly :) ), but the periodic refitting of an ARIMA is probably easiest to follow. Still learning Python, but I'm guessing that the functions that choose the models for us return model objects with model selection things built in as data members... this takes care of model/algo comparison for you, I think.

Edit: looks like someone is already headed that way: https://www.quantopian.com/posts/arma-timing-out-and-r2py. I know much less about computer stuff than you guys, but doesn't that suck for you if everyone is fitting stuff all day on your computers? Maybe there's a way to offload some of the work onto the client machine? I'm way out of my depth on this one, though.

OK, thanks for all of that explanation.

We don't have any immediate plans for a Monte Carlo simulator, but I understand the appeal of it. Something for us to do at some point down the road, for sure.

I think there are a couple of ways to do parameter optimization that aren't data snooping. The obvious one is to do your fitting on your in-sample data, and then see if the fit holds up on your out-of-sample data. But the point is well taken - you have to be smart. I think that your ARIMA example is the other way - to me that's a form of parameter optimization. We need to make that easier, for sure.
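As a rough illustration of the in-sample/out-of-sample idea above, the sketch below picks a parameter on the first 70% of a price series and then checks it on the remaining 30%. Everything here is an assumption made for the example: the toy moving-average rule in `strategy_returns`, the `fit_lookback` helper, the candidate lookbacks, the 70/30 split, and the synthetic prices.

```python
import numpy as np

def strategy_returns(prices, lookback):
    """Toy rule (hypothetical): long when price is above its trailing mean, else flat."""
    prices = np.asarray(prices, dtype=float)
    rets = np.diff(prices) / prices[:-1]  # rets[t] is the return from day t to day t+1
    signal = np.zeros(len(rets))
    for t in range(lookback - 1, len(rets)):
        # The decision on day t uses prices up to and including day t only.
        trailing_mean = prices[t - lookback + 1 : t + 1].mean()
        signal[t] = 1.0 if prices[t] > trailing_mean else 0.0
    return signal * rets

def fit_lookback(prices, candidates):
    """Pick the lookback with the highest mean strategy return on the given data."""
    scores = {lb: strategy_returns(prices, lb).mean() for lb in candidates}
    return max(scores, key=scores.get)

# Synthetic prices, purely for illustration.
rng = np.random.default_rng(42)
prices = 100 * np.cumprod(1 + rng.normal(0.0005, 0.01, size=500))

split = int(0.7 * len(prices))  # chronological split, no shuffling
in_sample, out_sample = prices[:split], prices[split:]

candidates = [5, 10, 20, 50]
best = fit_lookback(in_sample, candidates)    # fitted on in-sample data only
oos = strategy_returns(out_sample, best)      # evaluated on untouched data
print("chosen lookback:", best)
print("in-sample mean return:", strategy_returns(in_sample, best).mean())
print("out-of-sample mean return:", oos.mean())
```

The out-of-sample number only stays honest if it is looked at once; re-tuning the candidates after seeing it brings the data snooping problem straight back.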

Computation isn't a big cost for us at this point. At some point it may become more of an issue, and we may have to start charging if you exceed some threshold. We really want backtesting to be easy, so we'd only do that if we really had to.

A proper walk-forward optimiser would do the trick. You could have a batch-transform-like function that invokes the optimiser at different intervals without looking ahead.

You can already do this currently, but the performance will be very bad. One area that could be looked into is how you guys can implement walk-forward optimisation in the most effective and efficient way. Perhaps you can invoke the different optimisers in parallel for each window if there are no dependencies between the optimisations.
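The walk-forward scheme described above can be sketched roughly as follows. This is not Quantopian's batch transform API; `walk_forward_windows`, `walk_forward`, and the toy `fit_sign`/`apply_sign` pair are hypothetical names introduced only for illustration.

```python
import numpy as np

def walk_forward_windows(n_obs, train_size, test_size):
    """Yield (train, test) slices; each test window starts where its train window ends."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = slice(start, start + train_size)
        test = slice(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # roll forward by one test window

def walk_forward(returns, train_size, test_size, fit, apply):
    """Refit on each train window, then apply the fitted params to the next test window."""
    returns = np.asarray(returns, dtype=float)
    pieces = []
    for train, test in walk_forward_windows(len(returns), train_size, test_size):
        params = fit(returns[train])                  # sees past data only
        pieces.append(apply(returns[test], params))   # scored on unseen data
    return np.concatenate(pieces) if pieces else np.array([])

# Toy optimiser (hypothetical): go long if the training mean is positive, else stay flat.
def fit_sign(train_returns):
    return 1.0 if train_returns.mean() > 0 else 0.0

def apply_sign(test_returns, position):
    return position * test_returns

# Synthetic returns, purely for illustration.
rng = np.random.default_rng(7)
demo = rng.normal(0.0005, 0.01, size=300)
oos_returns = walk_forward(demo, train_size=60, test_size=20, fit=fit_sign, apply=apply_sign)
print("out-of-sample days:", len(oos_returns), "mean:", oos_returns.mean())
```

Since each window's fit depends only on its own training slice, the loop body could in principle be handed to a process pool (for example via `concurrent.futures`), which is the parallelism the post suggests.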

A few items Taylor asked about: "histograms/plots of returns, sample statistics of returns"

I agree, seeing a distribution/histogram of the daily returns would be helpful in evaluating an algo. (But I'm happy to hear pushback as to why that might be either misleading or worthless!) Or, alternatively, the ability to export the full position details, and maybe transaction details too, into a CSV for offline analysis. Not sure if that still runs into the data issues, since it is just the results.
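To make the request concrete, here is a minimal sketch of the kind of summary being asked for, computed offline from a plain array of daily returns. The `return_summary` function, its output keys, and the synthetic data are assumptions for illustration; the histogram is returned as bin counts and edges rather than drawn, so any plotting library could render it.

```python
import numpy as np

def return_summary(returns, bins=20):
    """Sample statistics plus histogram bin counts for a series of daily returns."""
    r = np.asarray(returns, dtype=float)
    mean = r.mean()
    # Standardise with the population std for the moment-based skew/kurtosis below.
    z = (r - mean) / r.std()
    counts, edges = np.histogram(r, bins=bins)
    return {
        "n": r.size,
        "mean": float(mean),
        "std": float(r.std(ddof=1)),                     # sample standard deviation
        "skew": float(np.mean(z ** 3)),
        "excess_kurtosis": float(np.mean(z ** 4) - 3.0),
        "min": float(r.min()),
        "max": float(r.max()),
        "hist_counts": counts,
        "hist_edges": edges,
    }

# Synthetic daily returns, purely for illustration.
rng = np.random.default_rng(0)
summary = return_summary(rng.normal(0.0005, 0.01, size=252))
for key in ("n", "mean", "std", "skew", "excess_kurtosis", "min", "max"):
    print(key, summary[key])
```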

My guess is that they don't like this because you could pick a system that effectively lets you download the data, e.g. go long 100 shares on the first day and stay long until the last day, and then in R: mySeries <- cumsum(rets/100)
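A Python equivalent of that R one-liner, for readers who want to see why exported results leak the data. It assumes `rets` in the post means the daily dollar P&L of the constant 100-share position; the function name `recover_price_changes` and the made-up prices are illustrative only.

```python
import numpy as np

def recover_price_changes(daily_pnl, shares=100):
    """Cumulative per-share price change implied by the P&L of a constant position."""
    per_share = np.asarray(daily_pnl, dtype=float) / shares
    return np.cumsum(per_share)

# Made-up prices, purely for illustration.
true_prices = np.array([10.0, 10.5, 10.2, 11.0])
pnl = np.diff(true_prices) * 100   # daily dollar P&L of holding 100 shares

recovered = true_prices[0] + recover_price_changes(pnl)
print(recovered)  # matches true_prices[1:] up to floating-point rounding
```

Only the first price has to be known (or guessed) for the whole series to fall out of the exported results.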

I feared that, Taylor, but you came up with the hack quicker than I did :).

But I'm still OK with a daily returns histogram plot. That would seem benign as far as data hounding goes, and I think it would be a helpful metric.