The Q500US and Q1500US

Say hello to the Q500US and Q1500US: our two brand-new universes. These universes are implemented as regular pipeline Filters, so they can be added to any algorithm in a single line of code. To import the Q1500US into your algo or notebook, all you need is:

from quantopian.pipeline.filters import Q1500US
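And a minimal pipeline that uses it as a screen looks like this (a sketch; the attach_pipeline wiring in initialize is omitted):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.filters import Q1500US

def make_pipeline():
    # Use the Q1500US as the base universe by passing it as the pipeline screen.
    return Pipeline(screen=Q1500US())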

There are two main benefits to the Q500US and Q1500US. First, they greatly reduce the risk of an order not being filled. Second, they allow for more meaningful comparisons between strategies, since they will now serve as the standard universes for algorithms.

It was clear that the community would also like a parameterized function that can be used to generate any universe. We heard this and have also added make_us_equity_universe to the Quantopian product.

In the workflow of producing a strong alpha factor, selecting a base universe is a quant's first job. Different strategies will perform better or worse based on the pool of stocks from which they can choose. We hope that the Q500US and Q1500US provide good bases for any strategy. We also hope make_us_equity_universe gives you the flexibility and power to create custom universes that can express subtle and incisive opinions on the market.
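As a rough sketch of what that flexibility might look like, a custom universe could be built along the lines below. The parameter names and import path here are illustrative assumptions from memory, not the definitive API; please check the help docs for the authoritative signature.

# Illustrative only: parameter names and import path are assumptions, not the
# definitive API; see the help docs for the authoritative signature.
from quantopian.pipeline.filters import make_us_equity_universe
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector

liquidity = AverageDollarVolume(window_length=200)

my_universe = make_us_equity_universe(
    target_size=800,                  # hypothetical universe size
    rankby=liquidity,                 # rank candidates by trailing dollar volume
    mask=liquidity.top(3000),         # only consider a broad liquid pool
    groupby=Sector(),                 # balance across sectors
    max_group_weight=0.3,             # no sector takes more than 30% of the universe
    smoothing_func=lambda f: f.downsample('month_start'),  # refresh membership monthly
)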

Check out the notebook below for an in-depth look at the Q500US, Q1500US and make_us_equity_universe. For a look at the original post outlining the methodology, have a look at this previous thread. Also, see the filtering criteria in the help docs.

Hope you enjoy!

[Notebook attached; preview currently unavailable.]

55 responses

Wow, awesome work. Looking forward to digging into it.

Would you be willing to provide a Q500US predefined universe also for non-pipeline use?

Thanks Gil! Looks like y'all put a lot of work into it. Looking forward to taking it for a spin.

Awesome Gil. Thanks for this.

I want to pick only profitable stocks from Q1500. Is this the right way to do it? Please advise.

from quantopian.pipeline import Pipeline  # imports added for completeness
from quantopian.pipeline.data import morningstar as mstar
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.filters import Q1500US

def make_pipeline():
    # .latest turns the EV/EBITDA column into a Filter when compared to 0
    profitable = mstar.valuation_ratios.ev_to_ebitda.latest > 0
    pipe = Pipeline(screen=Q1500US() & profitable)
    sectors = Sector()
    pipe.add(sectors, 'sector')
    return pipe

Nice! I like the way you answered the community's calls for tinkering with the universe. Back of the net!

Hi Gil -

I haven't dug into it yet, but how did you end up handling issues with bad data (e.g. missed stock splits)? Or did it not get rolled in, as I had suggested. I just saw another one reported on the forum that mucked up and confused a backtest, until the author caught the problem.

Grant

@Pravin, that should do the trick. If you need any help getting to grips with Pipeline please feel free to reach out.

@Grant, we did not add this functionality. Any automatic screening of stocks with known bad data will lead to lookahead bias in the backtester. While this may make the backtest more accurate, it does this at the expense of not being able to port code to live trading (it is impossible to screen out bad data that has yet to come in).

Thanks for this Gil. Any word on Futures compatibility? I know you guys have been working on that for a while, just curious what kind of outlook you have on ETA.

Thanks for your work Gil and for the interesting NB. I'd like to ask for some more details regarding the "Limiting Turnover: Smoothing" part. A lot of effort has been invested in that detail, and I am disappointed that I don't clearly see the bad consequences that high turnover can have on an algorithm. Could you elaborate on that please?

@Nicholas, No updates on that I'm afraid, sorry about that.

@Luca Here's a scenario that might elucidate the situation. Let us say I have a portfolio of 50 stocks in a universe of 100 stocks. Every day this universe changes completely (so at t there are 100 stocks, and at t + 1 there are 100 entirely different stocks). This forces my algorithm to sell its whole portfolio at t and buy 50 new stocks at t + 1. These transactions cost money, and those costs subtract from the returns that the algorithm itself generates. Ensuring a low turnover reduces these transaction costs.
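A back-of-the-envelope calculation (all numbers here are made up) shows how quickly that drag adds up:

# Hypothetical: a $1,000,000 portfolio fully turned over every trading day,
# paying 10 basis points of cost per side (commission plus slippage, assumed).
portfolio_value = 1000000
cost_per_side = 0.0010                               # 10 bps, an assumption
daily_cost = 2 * portfolio_value * cost_per_side     # sell everything, then buy everything
annual_cost = daily_cost * 252                       # trading days per year
print(annual_cost / portfolio_value)                 # ~0.50, i.e. roughly half the portfolio per year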

A better example might be a universe of 500 stocks. Of these, the top 480 are pretty steady, hardly moving in and out of the screen, just reranking a bit. The bottom 20 are constantly changing, falling just in and just out of the top 500. My strategy might be short mid-cap growth stocks, in which case it would be short many of these constantly changing bottom 20, increasing transaction costs. The hard cutoff at 500 is pretty arbitrary, so this would be unnecessarily punishing.

Another way of smoothing would be to keep the top 500 black and white, and instead smooth entries and exits. So, if something falls out of the top 500, I might take a few days to exit, meaning it's less of an issue if it pops back in the next day. This would be fine for a system that holds for weeks.

Another way would be to remove the size cut off altogether, and just trade all stocks, being careful to keep orders small for low volume small cap stocks.

@Dan, all the examples you mentioned are definitely things we considered, and your first example describes a behaviour known as "boundary thrashing" that proved quite tricky to reduce. Our downsampling smoothing method does a good job of reducing this type of turnover. I think you will find that the Q1500US behaves like your second example.
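To make the downsampling idea concrete, here is a rough sketch (not the actual Q500US construction) of a liquidity screen whose membership is only refreshed at the start of each month:

from quantopian.pipeline.factors import AverageDollarVolume

# Rank by trailing 200-day average dollar volume, then freeze the resulting
# top-500 membership between the first trading days of consecutive months.
adv = AverageDollarVolume(window_length=200)
monthly_top_500 = adv.top(500).downsample('month_start')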

Gil,

This is awesome, thank you. As for the bad data/stock split issue that Grant brought up, this was brought to light in James Christopher's Long-Short Multifactor algo.

Essentially KGC went through two splits in 2005 and the pipeline doesn't import the split correctly. Splits should be neutral wrt price and the backtest registered it as a multi-million dollar loss. Correcting that issue isn't look ahead bias.

Also, what is the distribution of the tradeable 1500 stocks across market capitalizations?

All in all, I plan on using this to do some backtests. Thank you for the hard work! :-)

@Cory I take your point completely, and that's definitely something we will seek to fix. My point is more that those splits should be dealt with in the data itself, as opposed to in these universes. TL;DR: we strive to clean up all incorrect data, but that should not be addressed in the universe selection process.

With regard to your second question, I do not have exact figures at the moment but if you come across anything interesting while looking into the universes, please share it on the thread!

Hi Gil,

I'm trying to digest the rationale for not including some tools for filtering out bad data. I guess the argument is that the data errors would have been there historically, and would have affected trading real money. But then why does Quantopian fix them? By fixing them, they are creating look-ahead bias, per your thinking, right? I guess I don't follow. It seems that for split errors, in particular, a broker would eventually fix the problem, right? Money doesn't just appear in or disappear from one's account; the accounting error would be fixed. So, there is no look-ahead bias in fixing historical data (at least for missed splits).

Yes, you would not do the clean-up within the universe selection process itself; rather, you would not admit erroneous data into the universe selection process in the first place.

Or am I missing something?

Grant

@Grant. The bad data should be cleaned up in the data source itself, not in this methodology. Adding checks for this would lead to arbitrary code and longer run-times for the universe selection process.

As for look-ahead bias, this would only occur if there was a "no trade list" based on present evaluation of past information.

You're definitely right that bad data is an issue, but the solution would be to write more defensive code rather than encapsulate that process in universe selection.
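For what "defensive code" might look like in practice, one pattern (a sketch, with a hypothetical exclusion list) is to keep your own do-not-trade set and drop it from the pipeline output before ordering:

def initialize(context):
    # Hypothetical: assets you have personally flagged as having bad data.
    context.exclude = set([symbol('KGC')])
    # pipeline attached as usual (make_pipeline defined elsewhere)
    attach_pipeline(make_pipeline(), 'my_pipeline')

def before_trading_start(context, data):
    output = pipeline_output('my_pipeline')
    # Drop flagged assets from today's tradeable list.
    context.security_list = [s for s in output.index if s not in context.exclude]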

Allow me to repeat my question, please.

For those of us who do not use pipeline, will there be a way of using the Q500US universe as well?

That would be greatly appreciated.

Hi Gil -

I don't want to detract from the primary effort here on the Q500US and Q1500US, and I can see that you would not want to filter out specific stocks that have data problems from within the tool itself; the solution to the garbage-in-garbage-out problem should be to eliminate the garbage in the first place, theoretically. However, I gather that you'd have to point your universe selection tool to entirely different databases? You'd have the garbage databases, that contain errors, and the clean databases, that have been purged. You'd have to give users the option of picking which databases to use. Am I understanding correctly? But then it starts to get complicated and expensive, since you actually need separate databases? Is this the problem?

There are two problems as I see it:

  1. Users don't have access to the list of data problems presumably maintained by Quantopian. At one point recently, your support listed all the stocks with (split?) problems in a forum posting, but I can't seem to find it. I can't understand why there isn't a public database, ideally that could be queried from within an algo and the research platform. I guess the idea is that data errors are not pervasive, and you will eventually converge to a state of instantaneous cleanliness; as problems are found, they will be fixed immediately.
  2. Assuming that a user can pull together a list of stocks to be avoided, due to erroneous data, then it seems that they would need to be removed after running your Q500US or Q1500US universe selection process. This is not ideal, either, although since we are talking about hundreds of stocks in a long-short portfolio, as long as the "no trade list" is short, it'll work (although if critical benchmarks, e.g. SPY, have errors, it could be a problem).

You point toward some solutions:

The bad data should be cleaned up in the data source itself, not in this methodology.
the solution would be to write more defensive code rather than encapsulate that process in universe selection.

Do they have any traction at Quantopian? In the limit of lots of data errors, your Q500US or Q1500US universe selection process would be kinda pointless, so I'd think it would be something you'd care about.

I guess I continue to see a basic philosophical difference in my thinking and the Q. What am I missing?

Cheers,

Grant

@Gil

Thanks for the explanation. I'll see what I can do to figure out market caps for the Q500/Q1500. I suspect they'll be large-cap companies, mostly.

Thanks again,
Cory

@Tim Unfortunately, we are not going to be implementing these universes for non-Pipeline algorithms. If you need any help getting started with Pipeline, feel free to reach out, or have a look at this tutorial. The learning curve is a little steep but I can promise that Pipeline really is an incredible tool for developing algorithms.

@Grant I'm not sure I follow your separate databases point. I think the core of your point is that bad data is bad for algorithms. We agree with this. I think we disagree in the way that the issue should be solved. We believe the database itself should have its bad values corrected. If you want to talk more about data issues I suggest creating a thread on the forum. I think that would definitely encourage some good discussion and be a good place for people to flag up possible data errors.

@Cory Here's a small notebook I wrote up to show the capitalization breakdown of the universes. Your intuition that they are large-cap companies is definitely right.

[Notebook attached; preview currently unavailable.]
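If you want to reproduce a quick version of that breakdown in a research notebook, a rough sketch (the date below is arbitrary) would be:

from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.filters import Q1500US

# Market cap for every Q1500US member on a single day.
pipe = Pipeline(
    columns={'market_cap': morningstar.valuation.market_cap.latest},
    screen=Q1500US(),
)
caps = run_pipeline(pipe, '2016-09-01', '2016-09-01')
print(caps['market_cap'].describe())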

Hello Gil,

The separate database idea was in response to your:

Adding checks for this would lead to arbitrary code and longer run-times for the universe selection process

I figured, just create another set of databases that have stocks with erroneous data removed (not fixed). Then, you could avoid the longer run-times associated with filtering the culprits out within your universe selection process code. Users could select which databases to use--with or without errors.

I'll stop discussing the topic here, and consider starting another thread (which would likely go nowhere, since there seems to be little interest in making it straightforward to avoid errors prior to their being fixed in the databases, which typically seems to take a long time).

Hi Gil,

I started trying to get the Q500 running, and see that there appears to be a limitation on the algo start date. Below is a simple modification to the template that is provided for writing an algo (automatically populates the editor). For a start date of 01/03/2002, I get the error:

KeyError: Timestamp('2001-03-13 00:00:00+0000', tz='UTC')
There was a runtime error on line 48.

If I move the start date to 11/01/2002, the error is resolved; the backtest runs; it will not run for earlier dates. Presumably, this is because you are using a trailing window of data in the universe selection process, and there are no data prior to 01/03/2002 (or it is an incomplete data set).

Normally, the API generates an error message that tells the user the earliest possible date on which the backtest can start. Could you do that for the Q500 & Q1500 universes?

Grant

"""
This is a template algorithm on Quantopian for you to adapt and fill in.  
"""
from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline.factors import AverageDollarVolume  
from quantopian.pipeline.filters import Q500US  
def initialize(context):  
    """  
    Called once at the start of the algorithm.  
    """  
    # Rebalance every day, 1 hour after market open.  
    schedule_function(my_rebalance, date_rules.every_day(), time_rules.market_open(hours=1))  
    # Record tracking variables at the end of each day.  
    schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close())  
    # Create our dynamic stock selector.  
    attach_pipeline(make_pipeline(), 'my_pipeline')  
def make_pipeline():  
    """  
    A function to create our dynamic stock selector (pipeline). Documentation on  
    pipeline can be found here: https://www.quantopian.com/help#pipeline-title  
    """  

    # Create a dollar volume factor.  
    dollar_volume = AverageDollarVolume(window_length=1)  
    # Pick the top 1% of stocks ranked by dollar volume.  
    high_dollar_volume = dollar_volume.percentile_between(99, 100)  
    pipe = Pipeline(  
        screen = (high_dollar_volume & Q500US()),  
        columns = {  
            'dollar_volume': dollar_volume  
        }  
    )  
    return pipe  
def before_trading_start(context, data):  
    """  
    Called every day before market open.  
    """  
    context.output = pipeline_output('my_pipeline')  
    # These are the securities that we are interested in trading each day.  
    context.security_list = context.output.index  
def my_assign_weights(context, data):  
    """  
    Assign weights to securities that we want to order.  
    """  
    pass  
def my_rebalance(context,data):  
    """  
    Execute orders according to our schedule_function() timing.  
    """  
    pass  
def my_record_vars(context, data):  
    """  
    Plot variables at the end of each day.  
    """  
    pass  
def handle_data(context,data):  
    """  
    Called every minute.  
    """  
    pass

Hi Gil,

I'm getting:

There was a runtime error.
ZeroDivisionError: float division by zero
USER ALGORITHM:69, in my_rebalance
weight = 1.0/len(context.security_list)

The error occurs after about 80% of the backtest has run. Why would the number of securities drop to zero out-of-the-blue?

Grant

[Backtest attached (ID: 57ca9d5bb4efaa1018edfb5d); the run ended with a runtime error and preview metrics are unavailable.]

@Grant, you will see from the notebook that the key metric for the Q500US is AverageDollarVolume(window_length=200). As our database stretches back to 2002, it will only work 200 trading days after the start of our data. This still gives 13+ years of backtesting potential.

As for the second point, you are using two screens, high_dollar_volume & Q500US(). It is possible that there is no overlap between these two screens, giving you len(context.security_list) equal to 0 and causing your error.
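A simple guard in the rebalance function (a sketch, mirroring the code above) avoids the crash when the combined screen comes back empty:

def my_rebalance(context, data):
    # If nothing passed the screen today, skip the rebalance instead of dividing by zero.
    if len(context.security_list) == 0:
        return
    weight = 1.0 / len(context.security_list)
    for stock in context.security_list:
        if data.can_trade(stock):
            order_target_percent(stock, weight)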

Gil,

Thanks for the quick feedback. I wasn't so concerned about whether it is 13 or 14 years of backtesting, but that the error is not obvious. I suggest that you make it consistent with what I've seen: the error report includes the earliest possible date that the backtest can be started when the history look-back window requires data prior to 1/3/2002. Is this possible to include? I can provide an example if you are not familiar with this behavior.

I'll re-run without the high_dollar_volume criterion to see if that clears things up.

By the way, I suggest adding the Q500US() (or Q1500US()) to the auto-template that is provided when one initiates an algo from scratch.

Now I'm getting:

There was a runtime error.
MemoryError
Algorithm used too much memory. Need to optimize your code for better performance. Learn More

Also, in mid-2015 there is a huge spike in turnover.

[Backtest attached (ID: 57caba6e3acde112d0361cc2); the run ended with a runtime error and preview metrics are unavailable.]

@Grant The memory error is something we are aware of, and it is a result of two factors: (1) the long backtest length that you selected, and (2) the amount of data used to store transaction details for 500 orders placed every day. We are looking into ways to solve this, but the bottom line is that the number of transactions your algorithm is generating is very high. If you want to run this "Equal Weighted Q500US" algorithm, I suggest running it from 2006 onwards. TL;DR: the Q500US is not the issue; the number of transactions that your algorithm places is.

It is great that you bring this up, as this is something we are actively trying to fix. However, this algorithm presents an upper bound for Q500US usage, and we have ensured that it is good to go for alpha factor algorithms, where fewer transactions need to be stored.

As for turnover, I do not see where you are observing these metrics, but if you want a sense of the turnover in 2015, this is shown in the notebook above.

Gil,

If you run the backtest, you'll see:

There is a big spike in the transactions in mid-2015. I'll also send a screen shot to Q support.

Gil,

Here's an example that trades weekly, and it also shows the spike in turnover in mid-2015, relative to the transaction activity at other times.

Grant

[Backtest attached (ID: 57cb1b815825b70ffc5d6b37); the run ended with a runtime error and preview metrics are unavailable.]

TL;DR the Q500US is not the issue, the number of transactions that your algorithm places is.

I would agree the Q500US is not the issue, but the issue is still on your side, and it is something you have to address to make Q more viable and easier to use for a wider set of scenarios.

If a simple algo like this is running into memory issues, you cannot say it is the fault of the algo. I would expect a user to be able to run more complicated algos without memory issues. Have a system which offloads past calculations into a DB, or serialises them to disk, or uses memory-mapped files, so that only the needed data is in memory.

A similar memory limit issue exists for the research platform. Loading a large backtest data set via get_backtest will consume RAM, and eventually crash at the 4 GB limit. I'm finding that anything that trades hundreds of stocks daily starts to bump up against the limit, for backtests back to 2002.

A topic for a different thread, but it is perplexing, since Q must have infinite disk space. Maybe the read/write is not as efficient as one would have in a desktop computing environment? Maybe the disk storage is distributed/cloud-based and relatively slow, so everything needs to be in RAM local to the processor?

Hi Gil,

Here's a run on the Q1500US. If I'm reading things correctly, mid-2015, it lands on a single stock (PFG). Is this due to the pipeline universe selection or something in my implementation of it? Is it coming from the Q1500US only picking PFG?

Grant

"""
This is a template algorithm on Quantopian for you to adapt and fill in.  
"""
from quantopian.algorithm import attach_pipeline, pipeline_output  
from quantopian.pipeline import Pipeline  
from quantopian.pipeline.data.builtin import USEquityPricing  
from quantopian.pipeline.factors import AverageDollarVolume  
from quantopian.pipeline.filters import Q1500US  
def initialize(context):  
    set_commission(commission.PerTrade(cost=0))  
    set_slippage(slippage.FixedSlippage(spread=0.00))  
    """  
    Called once at the start of the algorithm.  
    """  
    # Rebalance every day, 1 hour after market open.  
    schedule_function(my_rebalance, date_rules.week_start(days_offset=1), time_rules.market_open(hours=1))  
    # Record tracking variables at the end of each day.  
    # schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close())  
    # Create our dynamic stock selector.  
    attach_pipeline(make_pipeline(), 'my_pipeline')  
def make_pipeline():  
    """  
    A function to create our dynamic stock selector (pipeline). Documentation on  
    pipeline can be found here: https://www.quantopian.com/help#pipeline-title  
    """  

    # Create a dollar volume factor.  
    dollar_volume = AverageDollarVolume(window_length=1)  
    # Pick the top 1% of stocks ranked by dollar volume.  
    high_dollar_volume = dollar_volume.percentile_between(99, 100)  
    pipe = Pipeline(  
        # screen = (high_dollar_volume & Q500US()),  
        screen = Q1500US(),  
        columns = {  
            'dollar_volume': dollar_volume  
        }  
    )  
    return pipe  
def before_trading_start(context, data):  
    """  
    Called every day before market open.  
    """  
    context.output = pipeline_output('my_pipeline')  
    # These are the securities that we are interested in trading each day.  
    context.security_list = context.output.index  
def my_assign_weights(context, data):  
    """  
    Assign weights to securities that we want to order.  
    """  
    pass  
def my_rebalance(context,data):  
    """  
    Execute orders according to our schedule_function() timing.  
    """  
    # pass  
    weight = 1.0/len(context.security_list)  
    for stock in context.security_list:  
        if data.can_trade(stock):  
            order_target_percent(stock,weight)  
    for stock in context.portfolio.positions.keys():  
        if stock not in context.security_list:  
            if data.can_trade(stock):  
                order_target_percent(stock,0)  
def my_record_vars(context, data):  
    """  
    Plot variables at the end of each day.  
    """  
    pass  
[Notebook attached; preview currently unavailable.]

@Suminda To address your point "I would expect a user should be able to run more complicated algos without memory issues": that is the case. Many members of the community are coming up with interesting, statistically creative algorithms. The one here just trades an equal-weighted basket of Q500US stocks. The issue here is not complexity, but memory space (taken up by transactions). Also, with regard to your point on memory management, the team here at Q has highly optimized the way algorithms use memory, using advanced techniques to make your memory allocation on our servers go as far as possible.

@Grant Q definitely does not have "infinite disk space". We have a large amount of computing power, but we need to ensure that it is available to our entire community. Again, the issue you are talking about here is due to the vast number of transactions in these algorithms that you are writing. 500 transactions per day, over 252 business days per year for 13 years, gives 1,638,000 transactions, all of which can be inspected and queried in get_backtest. I encourage you to turn on commission and slippage models and you will see how much of the algorithm's return is eroded by transaction costs. This is the same issue that the Q500US tries to solve by reducing turnover.

As for the Q1500US behavior you spotted, this is something we will look into. We subjected this product to extensive testing and this never came up. We will get back to you when we have an idea of what caused it, but there is certainly something odd going on here.

Just testing one factor with 500 stocks in Alphan seems to hit the memory limit with anything more than 2 to 3 years of data, so there is in fact a memory issue which ideally should be overcome. It is frustrating to have a tool like Alphan but not really be able to use it due to memory issues.

Thanks Gil -

I would back up Suminda's comment. I am working on an algo that trades a portfolio of 200 stocks daily (1 set of trades per day), with the commission & slippage models set as standard--it is profitable, and realistic at $10M. If I run a backtest back to 2002, I can't load all of the backtest data into the research platform w/ get_backtest. I have a support request in, but there's been no response.

It is my contention that you are over-constrained on computing resources, relative to those available to developers at hedge funds. I think you need to find a way to break through this limitation, while still providing a base level of support to all 80,000+ users. It'll eventually come, I expect. It is a different paradigm. For me, if I were running out of resources, I'd just get a used workstation-class pc, load it up with memory and drive space, maybe a GPU, and the problem would be solved. I understand you have to work the solution from a different angle.

Regarding the apparent glitch/bug, glad to help. Note that I might not have found it without running an extensive backtest, so sometimes it helps to go to extremes to really "kick the tires."

Hi Gil,

Do the Q500US and Q1500US automatically remove anything in security_lists.leveraged_etf_list, or would I need to purge the universe post-pipeline?

Grant

EDIT - As I understand, the Q500US and Q1500US will be, by default, completely devoid of ETFs. Correct?

Hi Gil,

Another quick question -

I don't understand why the attached backtest takes so long to run, in light of my understanding of your implementation of the Q1500US. I had the impression that you would be pre-computing the universe, point-in-time. There would be no computations associated with determining the Q1500US universe, for backtesting. However, the backtest is painfully slow. Is this expected? Note that there are no transactions, so I have to think that all of the overhead is somehow associated with pipeline.

Grant

[Backtest attached (ID: 57de92d632953b10440072bf); the run ended with a runtime error and preview metrics are unavailable.]

Hi Grant,

The universes do remove ETFs, so no need to do this.

The Q1500US is not precomputed; it is a series of Pipeline function calls that can be used out of the box. Your algorithm is doing most of the heavy lifting that one would associate with an algo, with the exception of ordering logic.

I think it makes sense to precalculate this so more resources are available to users. The Q500 and Q1500 will be the same across all users.

Hi Gil,

I thought I'd re-run the code above (https://www.quantopian.com/posts/the-q500us-and-q1500us#57cc0a2ead60fa742e0008e6) to confirm that you'd fixed the reported problem, and I got this error message:

Something went wrong. Sorry for the inconvenience. Try using the built-in debugger to analyze your code. If you would like help, send us an email.  

Perhaps unrelated to application of the Q1500US, but I thought I'd bring it to your attention, just in case. Was the problem fixed (apparently associated with the stock PFG), by the way, since I was unable to verify?

Any thoughts on the execution speed for my code posted above (https://www.quantopian.com/posts/the-q500us-and-q1500us#57dea554f4b69fa9b1000aa8)? There's an awful lot of overhead just to spit out a list of stocks (that could be determined ahead of time). What's going on?

[Backtest attached (ID: 57e068131840671036853830); the run ended with a runtime error and preview metrics are unavailable.]

Hi Suminda and Grant,

Your idea of precalculating the universe is definitely something we considered in developing the Q500US. Although this would be incredibly useful in backtesting, live trading using the Q500US would be far harder.

I believe that the problem has been fixed. It was based on the way we were dealing with fundamental data, as opposed to an issue with the universe itself.

Although I have not delved into your algorithm in great detail, I would say that the slow speed is due to the fact that you are rebalancing your algorithm once every week. The main time cost here will be the ordering logic used in this rebalance. I would recommend using the Q1500US on an alpha factor to see how it performs for its intended purpose.

Hi Gil -

I just ran the attached backtest, and it took ~30 minutes to run. Yet it does nothing other than keep the universe up to date (no transactions, no recording of variables, no logging, etc.). So, running a full backtest over the maximum lookback period incurs an overhead compute time of ~30 minutes. That seems too long, no?

I don't understand your point regarding precalculating the universe for live trading. Why would this be problematic? If it takes me 30 minutes to run the attached algo, wouldn't it take about the same time on your end to write out the new point-in-time universe? I don't follow. It doesn't sound difficult. Or maybe it just doesn't make sense, because if users are doing other things in pipeline, their backtests will be slow, anyway? Or maybe some of your databases don't get updated until just before market open (you wouldn't have 30 minutes to run the script, do QC, etc.).? How did you decide not to do the precalculating, given that it would seem to be a natural approach?

Would it be feasible for individual users to do the precalculating in the research platform, store the point-in-time universe in memory, and then run backtests using the list? Would this speed things up dramatically? I'm just wondering, if all I want is the point-in-time universe, could I just spit it out, and then use it in a backtest within the research platform, versus chug-chug-chugging along in the backtester, waiting for it to do whatever it is doing?

Grant

[Backtest attached (ID: 57e4e0d889543112e2e65b13); the run ended with a runtime error and preview metrics are unavailable.]

I think what Grant is trying to get at is that the Q500US and Q1500US (and perhaps a Q1000 and Q100, to my thinking) should be a dataset. This would solve the time issue for users.

I'm also seeing that backtest performance is very negatively affected once I started integrating the Q500US and Q1500US into my algos. In fact, the ones I really care about are timing out now.

It is not clear that the slowness is related to the Q500US & Q1500US. It may just be under the hood in pipeline. Somewhere I'd heard that the fundamentals are slow. Maybe I should try a price-only query? See if it is a lot faster?

@Grant - I had given up running this particular algo a couple of days ago because of time outs. I just tried running it again and they are no longer timing out and are running a little faster. Just catching up on the thread, Gil mentioned they fixed an issue a couple of days ago so that probably was the culprit. Sorry for raising an alarm before catching up on the whole thread.

Same here. It self-corrected or was fixed by a human.

Hi,

Can you add these universe filters to zipline?

Best,

Hi Gil,

I use the Q1500US in my algo but found that many of the stocks in the Q1500US are penny stocks.

Cheers

Thomas

Hey, I think the name Q500US can be a bit misleading. It's the top 500 by liquidity, not by market cap. Therefore, you may want to do this, to ensure you have liquid large caps:

universe        = Q500US() & MarketCap().top(500)  

Or even:

universe        = Q1500US() & MarketCap().top(500)  

Brilliant.

Hello,
if many of the Q1500US stocks are penny stocks with a small market cap, is it still OK to trade these in my algo? Is there a problem with trading penny stocks?
Wouldn't it be better if these stocks were excluded directly from the Q1500US?
Thanks

The definition, I believe, is a share price below $5. Not that this means a huge amount, but it can be evidence that the company has tanked in value, or that the trading costs will be high as a percentage. It's generally applied to illiquid stocks that have little published information. Given that the Q1500US is the top 1500 by volume, I suspect illiquidity isn't an issue.
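If you do want to screen them out anyway, one way (a sketch, stacking a last-close price filter on top of the Q1500US) is:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import Q1500US

# Keep only Q1500US members whose most recent close is above $5.
above_five = USEquityPricing.close.latest > 5.0
pipe = Pipeline(screen=Q1500US() & above_five)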

Do you have some examples?

@Gil, does the universe also include REITs? I ask because I found WRI (US equity), which is a REIT, in the Q1500US.