Skip, avoid, bypass, ignore handle_data -- using schedule_function only?

If schedule_function is used exclusively to handle specific market-time events, is there a way (or could one be added) to avoid routing every minutely event through handle_data?

Sure,
def handle_data(context, data):
    pass

works, but if even that function call could be avoided, perhaps a minutely backtest could be accelerated (in any way at all). No doubt the backend still processes all of the data for all of the minute bars (yes?), but skipping handle_data might speed it up a little...
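For concreteness, here is a minimal sketch of the pattern I mean (using the standard schedule_function / date_rules / time_rules API; the SPY rebalance is just a placeholder):

def initialize(context):
    context.spy = symbol('SPY')
    # All trading logic is attached to scheduled functions,
    # here a single daily rebalance 30 minutes after the open.
    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(minutes=30))

def rebalance(context, data):
    # The only place orders are placed.
    order_target_percent(context.spy, 1.0)

def handle_data(context, data):
    # Still called on every minute bar; this no-op is what I'd like to skip.
    pass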

I'm guessing that constructing a temporary time series limited to the given begin/end dates and the specific scheduled events is still not an option or capability. (And that this has been discussed here previously.)

Of course, if all of this is moot and a technique for skipping the 388 minute-bar events I don't care about is already available, I'd welcome a refresher.

If your strategy doesn't involve trading on margin or shorting stock, where you'd need to monitor minute-by-minute data to check for intraday margin issues, and you're not trading intraday, a daily backtest should suffice.

@Adam, yes, one would think so. But to actually get a strat into production mode one must use minutely bar data. Unless I'm missing something here, all production strats are run in one way only: on minutely bars. Backtesting is obviously quicker on daily data, and modeling entries on the open (+/- slippage) can help achieve more accurate EOD/MOO orders, but nonetheless, if you want to launch production you have to handle minute data. Tell me I'm wrong, someone, please, and I'll be grateful. (About to do some research on this now.)

Hello Market Tech,

I'm not aware of any way to speed up minutely backtests by skipping calls to handle_data. My sense is that adding securities (even if they are not traded) will tend to slow down a backtest. So, just list the ones you need. Also, you can run as many backtests as you want in parallel, by using separate browser tabs. So, you could fire off a bunch, go to sleep/work/whatever, and check back for the results.

To adjust the execution price for daily backtests, have a look at https://www.quantopian.com/posts/trade-at-the-open-slippage-model. There's no efficiency penalty for using a custom slippage model, so you can get the benefit of quick daily backtests but also have some control over the execution price.
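For anyone who doesn't want to click through, the idea is roughly this. A sketch only, following the older Quantopian custom-slippage convention (a SlippageModel subclass whose process_order builds the transaction); the linked post is the authoritative version:

class TradeAtTheOpenSlippage(slippage.SlippageModel):
    # Fill orders near the bar's open rather than its close, so a daily
    # backtest better approximates MOO-style entries.
    def __init__(self, spread=0.0):
        self.spread = spread

    def process_order(self, trade_bar, order):
        # Buys pay half the assumed spread above the open,
        # sells receive half the spread below it.
        half = self.spread / 2.0
        price = trade_bar.open_price + (half if order.amount > 0 else -half)
        return slippage.create_transaction(trade_bar, order, price, order.amount)

def initialize(context):
    set_slippage(TradeAtTheOpenSlippage(spread=0.01))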

Grant

Market Tech,

You are correct. Minutely is the only option for live/production trading.

Grant

Thanks for the replies, Grant. I've got additional questions in to Jessica and Alisa about other "live trading" concerns, namely restarts.

If you're going to risk real money here, read the IB API docs and connect to it yourself directly. You could run your whole trading operation on a car battery with a Raspberry Pi and a broadband USB dongle. Why add the unnecessary risk of your algorithm going dark if Quantopian has some server snafu? The terms of use are quite explicit that this risk falls on you.

This interface is an awesome proof of concept machine, but wouldn't you want to be in total control with real money on the line?

Ah @Adam, you make me chuckle. Tell you what, when you get your solar-powered raspi spinning Java threads and dealing with all the TWS/API foibles, the disconnects, the exceedingly useful IB help desk, and the dozens of other issues one meets using IB's API, take a picture of it and post it trading. Many here may not realize it, but 95% of the code required for trading is not about trading at all. If Quantopian wants to build and maintain and support that VERY tall stack, then so be it. Godspeed. If you yourself have built something that requires the many dozens of software pieces necessary to support a real-time trading system, then I'm not sure why anyone would recommend striking out on one's own. That task alone is a multi-year, 5000-hour project that, even in the end, you may never get to use because your strategy(s) fail to make money. The Q is offering some real value here, and that can only get better over the years.

Regarding the implication that the Quantopian system is "real time": what I've been able to sort out is that they use the timestamps on their Nanex NxCore-derived minute-bar feed for synchronization. To my knowledge, they have not published details on the degree to which the timing of the data feed, algo execution, order execution at IB, etc. is maintained. It's probably not a big deal for most/all Quantopian algos, but I thought I'd point out that there is some murkiness in the Nanex NxCore-Quantopian-IB timing.

Grant, I'm really not sure there's much "murkiness." We've made it clear from the start that algorithms that live or die based on a few seconds' delay in the data feed don't belong on Quantopian. That should be manifestly obvious from the fact that we only provide minutely data and have stated clearly and repeatedly that we don't intend to change that.

We strive to call each algorithm's handle_data within a fraction of a second after the close of the minute, i.e., after all the ticks from the prior minute are rolled up into a minute bar. Sometimes we fail to do that: because there is some sort of unexpected anomaly in our platform (when those occur, we do our best to figure out what caused them and fix our platform to prevent them from recurring); because our data ingestor falls behind during a spike in trading volume (this happens occasionally, and improving our processing speed to make it rarer is on our roadmap, but we can't guarantee it will never happen); or because NxCore delays our data feed (a second, redundant data feed is also on our roadmap, though even then there may be delays caused by the exchanges themselves).

I don't have data at my fingertips about how long after the close of the minute we call handle_data, but I do have second-granularity data about when handle_data finishes. Keep in mind that a user's handle_data's completion can be delayed by CPU-intensive work within the function. Having said that, looking at last Friday's data, the average completion time for handle_data was 1.06 seconds after the close of the minute, with 69% of the calls finishing less than 1 second after the close of the minute, 92% in less than 4 seconds, and 99% in less than 8 seconds.
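If you want to collect similar numbers for your own algorithm, here is a rough sketch, assuming get_datetime() returns the (UTC) timestamp of the current minute bar and that in live trading the wall clock is a fair proxy for when the call actually runs (in a backtest the wall clock tells you nothing about this):

import pandas as pd

def initialize(context):
    context.lags = []

def handle_data(context, data):
    # How long after the close of the minute is this call running?
    lag = (pd.Timestamp.utcnow() - get_datetime()).total_seconds()
    context.lags.append(lag)
    record(handle_data_lag=lag)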

Like I said... you're not going to go hi-freq on Q's backbone. I have no delusions of grandeur here about developing something that belongs on a co-located blade server; I'm here to playtest with a $10,000 historical data subscription. That is the really sweet service being provided for free.

Challenge accepted, I'll get that picture when I can.

Thanks Jonathan,

You've explained a few things. At some point, I'll follow up with a new thread, or directly to you.

Cheers,

Grant