Hi All,

We've just implemented a fix that we believe will solve some of the before_trading_start (BTS) timeout issues that have been reported in the community. From this point forward, all pipeline computation will be done before BTS runs, so all pipeline output for a given chunk will be cached before BTS is invoked. There is now a separate 10 minute timeout specifically for pipeline computation, which should allow you not only to complete longer pipeline computations, but also to make fuller use of the existing 5 minute timeout in BTS, which remains as is.

It's important to note that this fix made no changes to the API. All changes were on the back end, so no changes are needed in your algorithm code. Calling pipeline_output will retrieve the output of your pipeline, which was computed in the 10 minute window prior to before_trading_start. The following example points out the parts of an algorithm affected by this change:

from quantopian.algorithm import (
    attach_pipeline,
    pipeline_output,
)
from quantopian.pipeline import CustomFactor, Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing

class TestFactor(CustomFactor):
    window_length = 1
    inputs = []

    def compute(self, today, assets, out):
        # previously, this computation would be included in BTS's 5 minute timeout,
        # but now, it has its own separate 10 minute timeout
        some_computation()  # placeholder for your own computation

def initialize(context):
    p = Pipeline()
    test = TestFactor()
    # add the factor as a column so that its compute() is actually run
    p.add(test, 'test_factor')
    attach_pipeline(p, "test")

def before_trading_start(context, data):
    # this still has a 5 minute timeout, but pipeline_output has already been computed
    # in its own 10 minute timeout
    results = pipeline_output('test')


We hope this fix will make it easier for those with more time-intensive contest entries, and are looking forward to hearing your feedback! Feel free to reach out to us with any questions you may have.


Hi Jacob -

Is the limit 10 minutes for all pipelines, or 10 minutes per pipeline? For example, I have:

def before_trading_start(context, data):
    context.pipeline_data = pipeline_output('long_short_equity_template')


So, presumably, the two pipelines have 10 minutes to execute, not 20 minutes.

Hi Grant,

The limit is 10 minutes for all attached pipelines combined.
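As a rough illustration of what a combined limit means, the sketch below adds up per-pipeline run times in attach order and checks the running total against a single 600-second budget. The function name and the timing figures are made up for the example, and this is only a toy model of the rule described above, not Quantopian's implementation.

```python
PIPELINE_BUDGET_SECONDS = 10 * 60  # one budget shared by all attached pipelines


def first_pipeline_over_budget(run_times, budget=PIPELINE_BUDGET_SECONDS):
    """Return the name of the pipeline that pushes the running total past
    the shared budget, or None if all pipelines fit.

    run_times: list of (name, seconds) pairs in attach order (hypothetical values).
    """
    elapsed = 0.0
    for name, seconds in run_times:
        elapsed += seconds
        if elapsed > budget:
            return name
    return None


# Two pipelines at 6 minutes each (made-up timings): each would fit a
# per-pipeline limit, but together they exceed a combined 10-minute limit.
timings = [("long_short_equity_template", 360), ("risk_loading_pipeline", 360)]
print(first_pipeline_over_budget(timings))  # risk_loading_pipeline
```

Under a per-pipeline reading, both pipelines in this example would pass; under the combined reading confirmed above, the second one is where the budget runs out.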

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thanks Jacob -

I don't understand the execution flow. From your description, it sounds like pipelines will execute even if pipeline_output is not called; pipeline_output simply retrieves the results. Presumably, the logic is that the pipelines will be executed in the order they are encountered in initialize. For example, I have:

    attach_pipeline(make_pipeline(), 'long_short_equity_template')


So, these two pipelines would run just prior to before_trading_start, with the long_short_equity_template run first, followed by the risk_loading_pipeline?

So is it still necessary to retrieve the pipeline output in before_trading_start or could a call to pipeline_output be elsewhere in the code? If the latter, where can one use pipeline_output? And if the latter, is before_trading_start required at all?

Also, how long can a backtest run before it will be timed out?

Hi Jacob,

Are pipeline calculations run in batches?
I have a pipeline that computes in under 2 minutes.
However, in backtest mode it gets a timeout error on the second day of the test.

Is there any way to run pipeline for 1 day at a time?

Are pipeline calculations run in batches?

Yes.

Is there any way to run pipeline for 1 day at a time?

No.

If you move your computation outside of Pipeline, you can take advantage of the 5 minutes per trading day in before_trading_start. However, there will still be a backtest time-out of TBD hours, as I understand.
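To make "batches" a little more concrete, here is a toy model of chunked computation: the first request for a day triggers computation for the whole chunk containing it, and later requests in that chunk are plain lookups. The class name, the per-day function, and the chunk size of 5 are all assumptions for illustration; the thread does not state the real chunk length or how the backend is built.

```python
class ChunkedOutputCache:
    """Toy model: compute a whole chunk of days at once, then serve lookups."""

    def __init__(self, compute_day, chunk_size):
        self.compute_day = compute_day  # hypothetical per-day computation
        self.chunk_size = chunk_size    # hypothetical number of days per chunk
        self.cache = {}
        self.chunks_computed = 0

    def output(self, day_index):
        if day_index not in self.cache:
            # Compute every day in the chunk that contains day_index.
            start = (day_index // self.chunk_size) * self.chunk_size
            for d in range(start, start + self.chunk_size):
                self.cache[d] = self.compute_day(d)
            self.chunks_computed += 1
        return self.cache[day_index]


calls = []


def compute_day(d):
    calls.append(d)  # record which days were actually computed
    return d * d


cache = ChunkedOutputCache(compute_day, chunk_size=5)
cache.output(0)
print(len(calls))             # 5: asking for one day computed the whole chunk
cache.output(3)
print(cache.chunks_computed)  # 1: day 3 was already cached
```

If the time limit applies to a chunk as a whole, which is an assumption here, the cost that matters is roughly the chunk length times the per-day cost, so a pipeline that is quick for a single day could still run out of time.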

Hey,

Top! Works fine for me. I'm now able to run some old algos that I'd left aside because of the timeout issue.

Thanks!

Great, thank you! Much appreciated!

Thank you!

Hi Grant,

I have some responses for your earlier post:

From your description, it sounds like pipelines will execute even if pipeline_output is not called; pipeline_output simply retrieves the results. Presumably, the logic is that the pipelines will be executed in the order they are encountered in initialize.

Yep, that's correct. Pipelines will be calculated in the order in which they're attached in initialize, and pipeline_output is just a way to retrieve the results.

So is it still necessary to retrieve the pipeline output in before_trading_start or could a call to pipeline_output be elsewhere in the code? If the latter, where can one use pipeline_output? And if the latter, is before_trading_start required at all?

pipeline_output can be used anywhere in your code after pipelines are attached. before_trading_start is no longer necessary for the particular purpose of retrieving pipeline output, but is still useful for running extra computations on your pipeline output with a separate 5-minute timeout (rather than the 50-second timeout of schedule_function).
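The flow described in these two answers can be mimicked with a small stand-in, shown below. ToyRunner and its methods are hypothetical and only borrow the names attach_pipeline and pipeline_output for readability; the ticker and numbers are made up. It is a sketch of the described behaviour, not the platform's code.

```python
class ToyRunner:
    """Toy stand-in for the backtester; not Quantopian's implementation."""

    def __init__(self):
        self._pipelines = {}     # dicts keep insertion order, i.e. attach order
        self._outputs = {}
        self.compute_order = []

    def attach_pipeline(self, compute, name):
        self._pipelines[name] = compute

    def run_attached_pipelines(self):
        # Runs ahead of the user's daily code, in the order pipelines were attached.
        for name, compute in self._pipelines.items():
            self.compute_order.append(name)
            self._outputs[name] = compute()

    def pipeline_output(self, name):
        # Pure lookup: no computation happens here.
        return self._outputs[name]


runner = ToyRunner()
runner.attach_pipeline(lambda: {"AAPL": 1.0}, "long_short_equity_template")
runner.attach_pipeline(lambda: {"AAPL": 0.2}, "risk_loading_pipeline")
runner.run_attached_pipelines()


def rebalance(runner):
    # A scheduled function can read the cached result directly;
    # no before_trading_start is involved in this toy flow.
    return runner.pipeline_output("risk_loading_pipeline")


print(runner.compute_order)  # ['long_short_equity_template', 'risk_loading_pipeline']
print(rebalance(runner))     # {'AAPL': 0.2}
```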

Hi Jacob -

Is there a backtest duration limit? And will a backtest be slowed down, once it exceeds a certain time limit (I'd heard comments along these lines, but maybe folks are just seeing varying load on your system)?

Hi Grant,

In general, there's no set backtest duration limit, but there are a few nuances to that answer:

• When we replace backtest servers, which usually happens daily, every backtest on the old server will have 2 hours to complete before that server shuts down and is replaced by a new one. Within that 2 hour window, the server won't accept new jobs, so you wouldn't see a case where your backtest lands on a server that's just about to shut down. Effectively, your backtest has until the next server replacement, plus 2 hours, of time to run. There's no fixed single time for server replacement, but like we saw above, the lower bound on max backtest duration is 2 hours. Keep in mind though that hitting the lower bound is relatively rare - the backtest would have to be one of the last jobs accepted by a server before its 2 hour window begins.
• Backtests being run as part of the contest evaluation have 5 hours to complete.

In most cases, backtest timeouts will be from the time limits imposed on specific steps like pipeline execution or scheduled functions, but the above examples (rare in practice) are cases where you could bump into an overall duration limit.
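The server-replacement arithmetic above can be written out as a small helper. The function name and the example timestamps are hypothetical (as noted, there is no fixed replacement time); the only figure taken from this post is the 2-hour window.

```python
from datetime import datetime, timedelta

DRAIN_WINDOW = timedelta(hours=2)  # time old servers get to finish running jobs


def available_run_time(started_at, replacement_at):
    """Time a backtest has if it starts at `started_at` on a server whose
    replacement begins at `replacement_at` (both values are hypothetical)."""
    if started_at > replacement_at:
        # Once replacement begins, the old server accepts no new jobs.
        raise ValueError("backtest cannot start on a server being replaced")
    return replacement_at + DRAIN_WINDOW - started_at


replacement = datetime(2018, 1, 2, 6, 0)  # made-up replacement time

# Started 10 hours before replacement: 10 + 2 = 12 hours available.
print(available_run_time(datetime(2018, 1, 1, 20, 0), replacement))  # 12:00:00

# Started at the last possible moment: only the 2-hour window remains.
print(available_run_time(replacement, replacement))  # 2:00:00
```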

Hope this helps,
Abhi

In general, there's no set backtest duration limit

Well, it sounds like most days, it will be between 2 and ~26 hours, unless it is a contest entry, which you presumably run within 3 hours of replacing backtest servers (giving a total of at least 5 hours for the backtest to run). However, even for contest entries, it is not clear, since as I recall, a backtest is run when the entry is submitted. So, if the submission is at the time you replace backtest servers, it would be limited to 2 hours, right?

This is an odd, inconsistent mix of time-outs. For example, I'd think that if you support 5-hour backtests for the contest, you'd increase the 2-hour lower bound to 5 hours. And if one has a backtest that takes a long time to run, the only way to ensure that it won't hit the 2-hour limit would be to enter it into the contest (although as noted above, this would only be for re-running the backtest).

Keep in mind though that hitting the lower bound is relatively rare - the backtest would have to be one of the last jobs accepted by a server before its 2 hour window begins.

Yes, but from a user standpoint, one day, a backtest might run just fine, and the next, it'd be terminated. You are basically saying that backtests shouldn't run more than 2 hours if one wants a fully reliable Quantopian SaaS. For your paying Quantopian Enterprise customers, I'd be sure to make this clear.

For SaaS predictability, it seems like you should just pick a number for the max backtest duration and specify and impose it. Or at least provide guidance on the last time you replaced backtest servers (this information could be posted on https://status.quantopian.com/). The latter approach would allow users interested in long-running backtests to be able to optimize their use of your service, without requiring any changes to your current system.

@ Jacob -

It turns out one now doesn't even need to call before_trading_start; if you haven't already, I recommend making this clear in the documentation.

Also, would it make sense to be able to schedule before_trading_start now that you are running pipeline independent of it?

@ Grant -

It turns out one now doesn't even need to call before_trading_start; if you haven't already, I recommend making this clear in the documentation.

We will definitely take that fact into account when updating our docs.

Also, would it make sense to be able to schedule before_trading_start now that you are running pipeline independent of it?

Customizable scheduling of before_trading_start hasn’t been on our roadmap so far - changing this would have implications on our contest and other systems.

Well one could kludge scheduling before_trading_start, so why not make it part of the API now that it doesn’t need to be called every day?

thank you!

Is it just me, or do the backtests take significantly longer to complete now after this fix was implemented? Was there any 'performance analysis' done before and after the implementation? Then again it could be just me as I'm not basing this on anything other than it 'feels' slower.

Note, I'm not complaining, just curious really, though it would be nice to 'have the cake and eat it too.' :)

It's definitely slower!

It'd be nice to have configurable timeouts.