Best way to have thousands of stocks in your pipeline

Is there an efficient way to have thousands of stocks in your pipeline without causing backtests and your algorithm to lag? I am looking to create an algorithm that needs to poll as many stocks as possible every minute.

When I created an algorithm with minimal screening on the pipeline so that it returned 5000+ stocks, the backtest lagged so much that it could not complete.

Note: This is necessary because I do not know which stocks to trade at the start of the day; I need a particular trigger to occur during the day. This trigger can happen to any security in the stock universe and can be effective regardless of whether the security is in the Q500 or a thinly traded stock, so I need to cast a very wide net.


You could try running another screen first, such as a volume screen. If you're looking at 5000 stocks, that probably includes lots of thinly traded ones.
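If it helps, a generous volume pre-screen in Pipeline might look something like this rough sketch (the 21-day window and the 100,000-share floor are just placeholder numbers, not a recommendation):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import USEquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage

def make_pipeline():
    # Average daily share volume over roughly the last month of trading days.
    avg_volume = SimpleMovingAverage(
        inputs=[USEquityPricing.volume],
        window_length=21,
    )
    # Keep the screen loose: only drop names averaging under ~100k shares/day.
    return Pipeline(
        columns={'avg_volume': avg_volume},
        screen=(avg_volume > 100000),
    )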

My intention is to have minimal screens. Trading thinly traded stocks is part of the algorithm's strategy. I have a volume screen, but it is intentionally very generous on the minimum number of average daily shares.

Sofyan -- the consensus seems to be that Quantopian's slippage model is not accurate with low-volume stocks, so you can easily get inflated backtest results. Keep that in mind and be at least somewhat skeptical of those results.

Otherwise, I've run into the same problem with fewer than a couple thousand stocks... I couldn't apply to the Q Open because of the 2-hour backtest limit, and backtests randomly fail with no specific error message.

To help speed things up, instead of using handle_data I use schedule_function to run my checks only once every 5 to 15 minutes. In some cases, for whatever reason, this actually improved my returns/stats.
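In case it's useful, here's a minimal sketch of that scheduling approach (check_triggers is a placeholder name for whatever your trigger-scanning logic is, and the 15-minute spacing is arbitrary):

def initialize(context):
    # Run the scan every 15 minutes of the 390-minute trading day
    # instead of every minute via handle_data.
    for minutes_offset in range(15, 390, 15):
        schedule_function(
            check_triggers,
            date_rules.every_day(),
            time_rules.market_open(minutes=minutes_offset),
        )

def check_triggers(context, data):
    # Placeholder: scan your universe for the trigger here.
    pass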

Other than that, I've just used simple optimization tricks. When you're looking for your trigger there is probably a set of criteria, so start by weeding stocks out with whichever criterion requires the least computation. Don't compute anything you don't need to. You can even add a redundant criterion at the front if it weeds out a significant percentage of the stocks with minimal computation.
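As a rough illustration of that ordering idea (the $1 price floor and the 20-bar condition are made-up criteria, and context.universe is assumed to hold your pipeline output):

def check_triggers(context, data):
    candidates = []
    for stock in context.universe:
        # Cheapest checks first: tradability and a single current-price lookup.
        if not data.can_trade(stock):
            continue
        if data.current(stock, 'price') < 1.00:
            continue
        candidates.append(stock)

    if not candidates:
        return

    # Expensive work only on the survivors, in one batched history call.
    prices = data.history(candidates, 'price', 20, '1m')
    for stock in candidates:
        series = prices[stock]
        if series.iloc[-1] > series.mean() * 1.02:  # illustrative trigger
            order_target_percent(stock, 0.01)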

Hi Viridian, yes, the slippage model is absolutely, 100% not accurate with low-volume stocks, and I am quite aware of this, having created prior algorithms trading fast-moving, thinly traded stocks :D. Fortunately, I was able to adapt a prior post into a custom slippage model which helps remediate this issue (link:
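For anyone reading along without the link: a custom slippage model on Quantopian subclasses slippage.SlippageModel and implements process_order, which returns a (fill price, fill amount) tuple. The sketch below only illustrates that pattern with a made-up fixed price penalty; it is not the model from the post I'm referring to, and attribute names may differ slightly across API versions:

class PricePenaltySlippage(slippage.SlippageModel):
    # Fill the whole order, but at the current price nudged against you
    # by a fixed fraction. A real model for thin stocks should also cap
    # the fill size by the bar's traded volume.
    penalty = 0.001  # made-up flat price penalty per fill

    def process_order(self, data, order):
        price = data.current(order.asset, 'price')
        direction = 1 if order.amount > 0 else -1
        fill_price = price * (1 + self.penalty * direction)
        return (fill_price, order.amount)

def initialize(context):
    set_slippage(PricePenaltySlippage())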

I will try optimizing the algorithm a little bit further, but I have a couple of questions (which may need a Q employee familiar with the processing limits of the platform):
- Is the slowness happening whenever I am iterating through the pipeline or is it happening when I am creating the 5000+ stock pipeline?
- If the algorithm is slow in the backtest, will there be problems when the algorithm is actually running live? Will I face memory issues or will the algo fail to enter certain trades if it can't iterate through the entire pipeline of stocks within a single minute?

If you're worried about not being able to process all the stocks within a minute, perhaps the only solution would be to split them up so that every minute you only run your checks on 1/10th of the full list, or something along those lines. Or, and I haven't tried this, you could keep a timer and, when you're nearing the 1-minute mark, save your current position and break out of the loop. An algo that runs that slowly, though, would be excruciating to backtest.
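To make the round-robin idea concrete, something like this sketch could work ('my_pipe' is assumed to have been attached with attach_pipeline in initialize, and check_trigger is a placeholder for the per-stock check):

def initialize(context):
    context.minute_counter = 0
    context.num_chunks = 10

def before_trading_start(context, data):
    # Refresh the full candidate list once a day from the pipeline.
    context.universe = pipeline_output('my_pipe').index.tolist()

def handle_data(context, data):
    n = len(context.universe)
    if n == 0:
        return
    # Each minute, scan only one of the num_chunks slices of the universe.
    chunk_size = -(-n // context.num_chunks)  # ceiling division
    i = context.minute_counter % context.num_chunks
    context.minute_counter += 1
    for stock in context.universe[i * chunk_size:(i + 1) * chunk_size]:
        check_trigger(context, data, stock)  # placeholder per-stock check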