Overnight processing of large data sets

Hi guys,

I'm new here. I'm looking to do the following:

  • At market close, load adjusted close and volume for 6,000-8,000 securities over the last 2-3 years (so two roughly 700 x 7,000 DataFrames).
  • Run my algorithms over many hours and generate an orders report.
  • Execute the orders at the next day's open at the desired price (if it's available at that price).

Is this possible? The last step should be easy, but so far it looks to me like Quantopian is systematically constructed to prevent you from loading large data sets and doing long-running processing.
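Roughly, the offline version of the job I have in mind looks like the sketch below. The file names are placeholders, and the liquidity/momentum signal is just a stand-in for the real multi-hour computation:

    import pandas as pd

    # Placeholder file names; assume each CSV is dates x tickers, roughly 700 x 7,000.
    adj_close = pd.read_csv('adj_close.csv', index_col=0, parse_dates=True)
    volume = pd.read_csv('volume.csv', index_col=0, parse_dates=True)

    # Stand-in for the real multi-hour signal computation:
    # 6-month momentum on names with decent dollar volume.
    liquid = (adj_close * volume).rolling(20).mean().iloc[-1] > 1e6
    momentum = adj_close.iloc[-1] / adj_close.iloc[-126] - 1
    picks = momentum[liquid].nlargest(200)

    # Orders report for the next open: buy the picks at (or better than) the last close.
    orders = pd.DataFrame({'side': 'buy',
                           'limit_price': adj_close.iloc[-1][picks.index]})
    orders.to_csv('orders.csv')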


2 responses

There is a limit of 5 minutes of processing time for before_trading_start(), and you can only update the universe with a maximum of 500 securities per day (although you can hold positions in more than 500). Pipeline, on the other hand, can operate over the entire database.
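Here's a rough sketch of how those pieces fit together, going from memory of the Quantopian API, so double-check names against the docs; the dollar-volume screen and lookback window are just examples:

    from quantopian.algorithm import attach_pipeline, pipeline_output
    from quantopian.pipeline import Pipeline
    from quantopian.pipeline.factors import AverageDollarVolume

    def initialize(context):
        # Screen down to 500 liquid names, matching the daily universe limit.
        adv = AverageDollarVolume(window_length=30)
        attach_pipeline(Pipeline(columns={'adv': adv}, screen=adv.top(500)),
                        'universe')

    def before_trading_start(context, data):
        # Everything in here has to finish inside the 5-minute budget.
        universe = pipeline_output('universe').index
        prices = data.history(universe, 'price', 504, '1d')  # ~2 years of daily closes
        momentum = prices.iloc[-1] / prices.iloc[0] - 1      # vectorized across all names
        context.longs = momentum.nlargest(50).index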

You might be able to do it, so long as your computations can be completed within 5 minutes. Can your computations be written in vectorized form?
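For example, a per-security Python loop over the price matrix can usually be replaced with whole-array NumPy operations, which is typically what makes the 5-minute window feasible. A toy version, with random data standing in for your ~700 x 7,000 matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    # days x securities, roughly the shape you described
    prices = 100 * np.cumprod(1 + 0.01 * rng.standard_normal((700, 7000)), axis=0)

    # Loop version (slow):
    #   for j in range(prices.shape[1]):
    #       momentum[j] = prices[-1, j] / prices[-126, j] - 1
    # Vectorized version: one pass over the whole matrix.
    momentum = prices[-1] / prices[-126] - 1
    zscore = (momentum - momentum.mean()) / momentum.std()
    longs = np.argsort(zscore)[-500:]   # indices of the top 500 names by signal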

Grant, thanks for the reply.

The code is about as close to optimized as possible and already runs on NumPy. I guess I have three options: write my own backtester, use the open-sourced Zipline, or port a stripped-down version to the website and see how it performs. Interestingly, I think I could finish my own backtester in a few hours, whereas getting something running on Q's website would take a few days because I don't know the Q interface at all.
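For what it's worth, the homegrown version doesn't need to be elaborate. A minimal vectorized daily backtest over the two DataFrames might look something like this; the weight construction is just an example and the transaction cost is a flat assumption:

    import numpy as np
    import pandas as pd

    def backtest(prices, weights, cost_bps=10.0):
        # prices and weights are dates x tickers DataFrames; weights are shifted
        # one day so yesterday's signal trades at today's prices (no lookahead).
        rets = prices.pct_change()
        held = weights.shift(1).fillna(0.0)
        gross = (held * rets).sum(axis=1)
        turnover = held.diff().abs().sum(axis=1).fillna(0.0)
        return gross - turnover * cost_bps / 1e4

    # Toy inputs standing in for the real ~700 x 7,000 data.
    idx = pd.date_range('2013-01-02', periods=700, freq='B')
    prices = pd.DataFrame(100 * np.exp(0.01 * np.random.randn(700, 50).cumsum(axis=0)),
                          index=idx)
    signal = prices.rank(axis=1, pct=True).ge(0.9).astype(float)   # top decile each day
    weights = signal.div(signal.sum(axis=1), axis=0)
    daily_pnl = backtest(prices, weights)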