It is painful to run a meaningful algorithm on Quantopian. I spent weeks on research and have to run my backtests overnight, losing a lot of time.

If Quantopian were to fail, it won't be for lack of ideas from the community but because the platform only allows you to do certain things within its computational constraints.

I would love to see the following improvements:

1. Faster backtests.
2. A 10-minute timeout for before_trading_start (5 minutes is too short)

I hope Quantopian listens to the community.

Best regards,
Aqua

24 responses

You're doing enough computation to be hitting the 5 minute timeout in before_trading_start?

[Edited because there was an email in my inbox coincidentally that countered what I was saying]

Yes, I agree. They mostly answer queries on usage of the platform, but when it comes to feature requests or anything else that requires someone to take leadership and drive the product, there is no direction. We don't even know what their plans are regarding infrastructure in the coming months. I think what is lacking is a leader who can take ownership and drive things while engaging the community.

You're doing enough computation to be hitting the 5 minute timeout in before_trading_start?

Yes. Try doing something like fitting or calibrating a model on 1000 securities. In fact, I wanted to add clustering to improve performance, but with the current speed that does not seem possible.

Same here. I have the same problem with any non-trivial strategy: I always hit a timeout because of the excessive computation time.

They mostly answer queries on usage of the platform, but when it comes to feature requests or anything else that requires someone to take leadership and drive the product, there is no direction. We don't even know what their plans are regarding infrastructure in the coming months. I think what is lacking is a leader who can take ownership and drive things while engaging the community.

Not sure if you guys heard, but Dan Dunn's last day with Quantopian was last Weds., 2/7. Historically, he was the guy doing this kind of thing.

One thing to note is that before_trading_start has to accommodate the Pipeline chunking for backtesting (as I understand, for live trading, there is no chunking). So, ironically, if you don't use Pipeline (which is supposed to help with computations), you should get more time for computations (and your backtests will be one-to-one with live trading).

Overall, my read is that they have a plan to implement the workflow and are on the hook to complete it before they get distracted with building out platform performance. The other issue, I think, is that it is a make/buy decision. To do actual ML, they'll need to do a lot more than just increase the time-out of before_trading_start; they might be better off partnering with someone who already has the infrastructure in place, versus building it from the ground up. It would also incur support costs, which would have to have a payout in terms of better algos. Lots to consider before adding infrastructure costs (given that I kinda think they aren't rolling in the dough yet as a hedge fund).

Hi Grant,

I'm sorry to hear Dan Dunn is no longer with Q. He is one of the nicest guys I've conversed with at Q, a man of good reason. I wish him luck in his new endeavors.

they might be better off partnering with someone who already has the infrastructure in place, versus building it from the ground up.

This is key to fast-tracking the transition to ML/AI algos and other computationally intensive algos. Q can host via AWS or a similar service, where users can buy compute time at reasonable prices. Better yet, Q can incentivize users by awarding them compute-time tokens. This way, virtually no capital outlay is needed by Q; only operational and support costs have to be budgeted.

Hi all,

First, thanks to all of you. You all participate and help make the Quantopian community an invigorating, helpful place.

With respect to performance, I want to let you know that earlier today we deployed 3 changes to production that should help increase the speed of backtests. One of the changes was specific to order_optimal_portfolio and should improve the speed of backtests using that function; the others were broader in nature and should help most any backtest.

On the topic of the timeout duration for before_trading_start, this post spurred on a nice debate internally with respect to that time out. Many of the assumptions that influenced the 5 minute duration originally are no longer valid and we're giving consideration to changing this value in the near future.

More broadly, I want to let you know that we are always trying our best to listen. There are a handful of us at Q who read practically every forum post each day. I certainly understand the frustration around not understanding the direction of our platform, given the time and energy you commit to working here in the community. I count it as a shortcoming for Quantopian that you have this level of frustration regarding the direction of the platform. I think we can do a better job of communicating our product plans. That said, there are definite downsides to disclosing our product roadmap publicly (it changes; our competitors would also like to know our plans; etc.). I don't have an answer for you right now, but I'll be giving more thought during this long weekend to how we can better address this problem.

Thanks again and happy coding,
Josh

PS, Be sure to enter the new daily contest! It's going to be a lot of fun.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Thanks Josh -

One consideration is that before_trading_start has to handle the Pipeline chunking during backtesting (as I understand, there is no chunking under live trading). It would seem that the chunking is the real killer. Perhaps you could explain the benefit of the chunking? I gather that it has something to do with your I/O (e.g. latency in grabbing data from disk and moving it into working memory), such that for backtesting, there is an overall benefit to Pipeline chunking.

Another thought would be to support backtesting in the research platform, where potentially you would have more flexibility (e.g. I see that Pipeline chunking can be controlled there). This would have the added advantage of not needing to copy-and-paste backtest IDs into notebooks for analysis. Everything could be done in one shot. Is there a reason why backtests can't be run in a research notebook?

The other consideration here is that increasing the before_trading_start time-out is really lipstick on a pig relative to the commodity computing power that you could unleash. If you are spending time debating if you should increase the time-out of before_trading_start I submit that you are thinking too small. You could put your effort into getting the platform ready for the kind of machine learning/deep learning algos that users have stopped asking for, because there has been no substantive response.

Regarding the concept of a Quantopian "product" I'd love to hear how you guys think about it. Your product, as I understand, is the 1337 Street Fund--it is your sole vehicle to make money. Everything else is just a means to an end--a tool. Is this correct? Or is there another angle? I guess as a start-up, it is really all about the net valuation of the enterprise--if you were to sell the company, how much would you get. The valuation is the product, right?

We already have the “pay to play” paradigm of pricey data sets, which, now that real money trading has been discontinued, could only be used for trying to get an allocation. Why not un-level things a bit more, with paying for computing resources?

Seriously I don’t see the point. Q should be able to offer a lot more for free if they noodle on it a bit. RAM/multi-processor/GPU/parallel processing/etc. if shared across users should be dirt cheap. Perhaps part of the problem is the current paradigm of supporting an infinite number of backtests/notebooks by an infinite number of users 24/7/365. It is not obvious that this limitless on-demand approach is best (but then there may be very little demand for more horsepower).

@ Enigma -

All sorts of pay-for-performance arrangements could be considered, but personally, this ends up being a bit like Tom Sawyer getting his buddies to pay him for the privilege of painting a fence for him.

Perhaps Josh can articulate it, but I'd think the long-term game plan would be to recruit a cadre of contracted quants, and then provide them with more computing resources as part of their employment, versus asking the entire crowd to pay (which again would be kinda ridiculous...I don't have to pay for tools at my job), but my read may be wrong.

Well, we’ll see if Josh sheds any light on things. In the end, it is a business so adding cost without clear benefit won’t make sense. My intuition is that an alternative configuration might be feasible that would move the platform forward technically, without adding operational cost (of course, it would take some up-front engineering and support cost could go up).

As I mentioned in my previous post:

This is key to fast-tracking the transition to ML/AI algos and other computationally intensive algos. Q can host via AWS or a similar service, where users can buy compute time at reasonable prices. Better yet, Q can incentivize users by awarding them compute-time tokens. This way, virtually no capital outlay is needed by Q; only operational and support costs have to be budgeted.

There are different ways for Q to incentivize its users with compute time. Take the case of Numerai: every week they pay out, in both US dollars and their own crypto tokens, about 300-350 users who control the allocation of their hedge fund. These users either use the tokens to buy credit on AWS or buy their own GPU hardware with their winnings.

So a model that would fit Q's restrictions on sharing data is for them to host their framework on AWS. Users would then log into the AWS host site and either buy compute time or use their dollars and/or the incentive tokens to get compute time in a stacked GPU environment that is suitable for ML/AI-based algos. Numerai is way ahead of Q in terms of amount of payout, transparency, innovation and hedge fund implementation.

Well first Q would have to sort out the high-performance computing (HPC) platform. If they are having water-cooler debates over increasing a time-out from 5 minutes to 10 minutes, I’d agree that there is a larger problem relative to their competition.

Last I checked, Numerai had the advantage of their special data encryption, which lets users download the data and apply their own hardware. But it just becomes a pure ML problem; no hypothesis about the market is needed.

@Josh Payne

On the topic of the timeout duration for before_trading_start, this
post spurred on a nice debate internally with respect to that time
out. Many of the assumptions that influenced the 5 minute duration
originally are no longer valid and we're giving consideration to
changing this value in the near future.

Would it be difficult to increase the 5-minute timeout? At the very least it should be 5 minutes times the number of days the pipeline is computed for (the chunk size), or something like that. Currently it is simply not possible to run backtests for many complex strategies.
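To put numbers on that scaling suggestion, here is a minimal arithmetic sketch. The 5-minute figure comes from this thread; the chunk lengths and the function name are purely hypothetical, since the actual Pipeline chunk sizes are not stated here.

```python
def scaled_timeout_minutes(chunk_days, per_day_minutes=5):
    """Timeout budget proportional to the number of days in a Pipeline chunk.

    per_day_minutes defaults to the 5-minute limit discussed in this thread.
    """
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    return chunk_days * per_day_minutes


# Hypothetical chunk lengths in trading days (assumed values, not Quantopian's).
for chunk_days in (1, 5, 126):
    minutes = scaled_timeout_minutes(chunk_days)
    print(f"{chunk_days:>3} days -> {minutes} min ({minutes / 60:.1f} h)")
```

Under this rule a one-day computation keeps the current limit, while a multi-day chunk gets a budget that grows linearly with the number of days it covers.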

My current (non-ML) algorithm with 40 securities per sector (11 sectors x 40 = 440 securities) takes 48 hours to run from Jan 2016 to Feb 2018. I ported this algorithm to QuantConnect to see how their platform does. It runs within 1 hour. Seriously, there is something flawed with Q's backtester. I guess it has to do with two things:

1. Pipeline
2. More importantly, the trades: if I remove trading, it runs much, much faster.

Best regards,
Pravin

Hi Pravin:

I wonder if there are any optimizations we could make in that algorithm, because that runtime doesn't sound right to me.

Is there any way for you to build a version of that algorithm that hides the secret sauce, in a way that we can take a look and see what's going on in it? If you are willing, you can reach me at [email protected] or send it into our support line and we will be able to profile and see what's going on.

thanks,
Jean

Jean,

Thank you very much. I sent you a note on your personal email with details.

Best regards,
Pravin

Any progress?

An update? It seemed like we had some good engagement from the Q team here, and then radio silence. I'm interested because I am noticing limitations in Pipeline related to the backtest chunking of computations. I am trying to run a series of optimizations within Pipeline. They are reasonably efficient, but with the chunking they go over the 5-minute limit if I try to run them over the entire standard QTradableStocksUS. I could move the computations out of Pipeline, and then decide whether to put them in before_trading_start or within my call to run the Optimize API and place orders (or in a dedicated function).

More speed please. And consistency too.
The work my algos perform on a daily basis must be broken down into many (>> 45) sub-routines of 50 seconds or less in order to run in the backtest/contest.
In addition to that, it appears that algos sometimes take more time to run in the contest than in the backtest environment. I say that, because my algos crash in the contest due to timeout exceptions, while they always execute flawlessly in backtests.
I think the time limitations are holding developers back from producing better systems. I know that is the case for me.
The bar is set pretty high already when it comes to our algos' performance benchmarks. Surely some time constraints are necessary, but they also make our jobs more difficult.
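For readers unfamiliar with the workaround described above, here is a minimal, platform-independent sketch of splitting work into time-bounded slices. It uses only the Python standard library; `run_within_budget` and the toy tasks are hypothetical names for illustration and are not part of the Quantopian API.

```python
import time


def run_within_budget(tasks, budget_seconds, clock=time.monotonic):
    """Run callables in order until the time budget is used up.

    Returns the tasks that did not get to start, so the caller can
    carry them over to the next scheduled slice.
    """
    start = clock()
    pending = list(tasks)
    # The budget is checked before each task starts, not while it runs.
    while pending and clock() - start < budget_seconds:
        task = pending.pop(0)
        task()
    return pending


# Toy workload: five cheap tasks that each record a result.
results = []
tasks = [lambda i=i: results.append(i * i) for i in range(5)]

# 50 seconds mirrors the per-slice figure mentioned in the post above.
leftover = run_within_budget(tasks, budget_seconds=50)
print(len(results), "tasks done,", len(leftover), "carried over")
```

Because the budget is only checked between tasks, a single long-running task can still overrun it, so each unit of work has to be small relative to the limit.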

Here is a link to a help request I made regarding timeout exceptions.

Thank you

TimeoutException - Contest Entries Failing

I say that, because my algos crash in the contest due to timeout
exceptions, while they always execute flawlessly in backtests.

Same for me.

Hi All,

We've recently implemented a fix that separates pipeline computation from before_trading_start, and adds a separate 10 minute timeout for this computation -- please see the original post here for more information.
