Simulating S&P 500, Russell 1000, Russell 3000 in Research

It is fairly common in Research to want to run analyses on well-known indices. Here are code snippets that approximate the S&P 500, Russell 1000, and Russell 3000 indices. If you want to know more about each index's construction, check out the links below.



Thank you for the welcome proxy of the indices!!!!

I have a few questions:

1)
In Research, in the help page for get_fundamentals, it is stated that:

Before calling get_fundamentals, you first need to call the
fundamentals initializer once with
fundamentals = init_fundamentals()

Is the requirement valid for Research only?

In the IDE, get_fundamentals seems to work without initializer

2)
It seems that the parameter
base_date: str in the format "YYYY-MM-DD"
is currently not used by the get_fundamentals.
Can you confirm it?

In the IDE I tried providing a date as a string in the required format, as a datetime.date, and not providing it at all.
The outcome of get_fundamentals seems to depend only on the backtesting date at which it is called.

3)
How often is the "universe" in get_fundamentals updated?

In my backtest I queried get_fundamentals on each of the first 16 days of April 2014 and apparently always obtained the same output (including only GOOG_L).
Starting on April 17, I began to see both GOOG_L and GOOG in the first positions of the results.
But according to https://en.wikipedia.org/wiki/List_of_S%26P_500_companies the GOOGLE stock split happened on April 3, 2014.

Is this split (and all the other S&P500/capitalization table changes) taken into account with the correct date in Q fundamental data?

Hey Nicola!

Let me address your questions

1) Yes. You only need to init fundamentals in Research. We take care of that in the IDE so you can focus on writing your algorithm.

2) You can't change the base_date parameter in the IDE: when you backtest an algo, it must get the data as of the current time in the backtest, which prevents you from getting fundamental data from a different time period, e.g. the future. When researching a strategy, however, it might be useful to know, for example, earnings growth, and for that you need earnings data for multiple dates. That is why you can use base_date in Research but not in the IDE.

3) When you say universe in get_fundamentals, are you talking about the IDE or Research?

James Christopher

Many thanks for the clarifications, James!

3) IDE
Is there any difference between the results you can obtain from get_fundamentals in the IDE and in Research?

What about the fundamental data, in particular the market caps used to build a proxy of the S&P500?
How often are they updated?
Are special events (e.g new stocks, like with the GOOG split, or stocks removed) immediately reflected?

There is no difference between fundamentals in the IDE and in Research; they both pull from the same data source. With regard to your question about market cap calculations, we use the most current market cap, which is the most recent closing price x the most recent reported shares outstanding. The calculation is slightly different if the security is an ADR.
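As a rough illustration of the calculation described above, here is a minimal sketch in plain Python. The function name and the figures are hypothetical, and the ADR adjustment is left out because its details are not given in this thread.

```python
def market_cap(close_price, shares_outstanding):
    """Most recent closing price times most recent reported shares outstanding.

    Hypothetical helper for illustration; it does not cover the ADR case.
    """
    if close_price < 0 or shares_outstanding < 0:
        raise ValueError("price and share count must be non-negative")
    return close_price * shares_outstanding


# Made-up figures, for illustration only.
example_cap = market_cap(close_price=50.0, shares_outstanding=200_000_000)
print(example_cap)
```

Because the share count only changes when a company reports, most day-to-day movement in this number comes from the closing price.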

*EDIT: @Josh Payne just sent me some more details on the market cap data...

The market cap is updated daily. Morningstar executes this
calculation. Prior to May 2014, we only have monthly updates from
Morningstar, our data provider for fundamental data. So although the
data is now updated daily, your algorithm will only see
changes on a monthly basis from 2002 to May 2014.
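To illustrate why a query run every day can return the same result for weeks when the underlying data only arrives monthly, here is a minimal "as-of" lookup sketch in plain Python. The function name and the snapshot dates are hypothetical; the actual delivery dates of the monthly updates are not stated in this thread.

```python
import bisect
from datetime import date


def latest_snapshot(snapshot_dates, query_date):
    """Return the most recent snapshot on or before query_date, or None.

    snapshot_dates must be sorted in ascending order.
    """
    idx = bisect.bisect_right(snapshot_dates, query_date)
    return snapshot_dates[idx - 1] if idx else None


# Hypothetical monthly snapshot dates, chosen only for illustration.
snapshots = [date(2014, 2, 17), date(2014, 3, 17), date(2014, 4, 17)]

for day in (date(2014, 4, 1), date(2014, 4, 16), date(2014, 4, 17)):
    print(day, "->", latest_snapshot(snapshots, day))
```

Under these assumed dates, every query from mid-March through April 16 sees the March snapshot, and the result only changes once the April snapshot becomes available.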

This explains my observations in the backtesting.

Many thanks again!

How could I use this to pull the S&P 500 and run it with this algo?

"Look Back For Max High/Low Within A Certain # Of Days" I tried to stick it in at the front, but it would hang on today.

Chad,

All you have to do is copy and paste this code into the IDE. However, the IDE does some things automatically, so take note of the following nuances...

1. You don't need the init_fundamentals() line.
2. You don't need the from sqlalchemy import or_ statement
3. You don't need the exchange filter in the query

and finally, as with any use of get_fundamentals, don't forget to update your universe with

update_universe(sp_500)


I think you do need the

from sqlalchemy import or_


@Peter

The sqlalchemy or_ function is used for the exchange filtering. This is really just a safeguard to guarantee you only get equities traded in the US while in Research, and it is not necessary in the IDE. Because the IDE handles this for you, you can omit the exchange filter and, consequently, the sqlalchemy import statement.

J

I agree with Peter.
Without the import the or_ function is not visible and you get an error message in the IDE.

---------------------------------------------------------------------------
NameError Traceback (most recent call last)
in ()
12 .order_by(fundamentals.valuation.market_cap.desc())
13 .limit(500),
---> 14 today ) # S&P 500 rebalances on an as needed basis, so we'll just use today
NameError: name 'today' is not defined

Hey John,

I purposely left today undefined so someone can define it themselves with either a datetime.today() style method or with a string of their choosing, like in the examples for the Russell 1000, 3000.
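For anyone hitting the NameError above, here is a minimal sketch of two ways to define today before running the query. It assumes only what this thread says about base_date, namely that it is a string in "YYYY-MM-DD" format; the variable name fixed_day and its value are just an example.

```python
from datetime import date, datetime

# Option 1: the current calendar date as a "YYYY-MM-DD" string.
today = date.today().strftime("%Y-%m-%d")

# Option 2: pin a fixed date so the notebook gives repeatable results.
# (Arbitrary example date; use whichever date you want to analyse.)
fixed_day = "2015-04-01"

# Sanity check: strptime raises ValueError if a string is not "YYYY-MM-DD".
datetime.strptime(today, "%Y-%m-%d")
datetime.strptime(fixed_day, "%Y-%m-%d")
```

Pinning a fixed date is usually the safer choice in Research, since a notebook that uses the current date will return a different list each time it is rerun.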

You can get an updated list of the S&P 500 from here: https://www.quandl.com/resources/useful-lists

However, I believe you have to add start dates before it can be used in fetch_csv.

It would be better if Q provided a utility to fetch official index holdings/constituents online, since there is other info, like weights and shares, which might be important for your strategies. For example, for SPY:

Do you have a list of symbols for the Russell 3000, or a sid list? Thanks.

I'm getting very variable results for the S&P500 analog.

On 2009-01-02 it returns 74 stocks, but on 2015-04-01 it returns 498 stocks.

Why is this?


Could it be that the check "fundamentals.valuation.market_cap > 4e9" should be year-dependent, so that if you run the same query for a date long ago you have to lower that threshold?
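To make the effect of a fixed cutoff concrete, here is a small plain-Python sketch. It is not the Research query itself: the function name, the five-stock universe, and the market caps are all hypothetical, and it only mimics a "filter by market cap, order descending, limit N" selection.

```python
def top_n_by_cap(caps, n, min_cap=None):
    """Return up to n symbols ordered by market cap, largest first.

    caps: dict mapping symbol -> market cap.
    min_cap: optional absolute cutoff applied before ranking.
    """
    eligible = {s: c for s, c in caps.items() if min_cap is None or c > min_cap}
    ranked = sorted(eligible, key=eligible.get, reverse=True)
    return ranked[:n]


# Hypothetical universe: the same five names in a depressed market
# and in a later market where every cap is three times larger.
caps_early = {"AAA": 9e9, "BBB": 5e9, "CCC": 3e9, "DDD": 2e9, "EEE": 1e9}
caps_late = {s: c * 3 for s, c in caps_early.items()}

print(len(top_n_by_cap(caps_early, n=4, min_cap=4e9)))  # the cutoff binds
print(len(top_n_by_cap(caps_late, n=4, min_cap=4e9)))   # the limit binds
print(len(top_n_by_cap(caps_early, n=4)))               # ranking only
```

With these made-up numbers, the fixed cutoff shrinks the early-date result well below the limit, while ranking alone always returns the requested number of names. That is consistent with seeing far fewer stocks on an early date than on a recent one.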

@JamesC, I've been trying to recreate the S&P500 index above in pipeline, but I get an error message saying that company_reference.primary_exchange_id and company_reference.country_id are not currently available within the pipeline API. Do you have any idea when they will become available?

Thanks.

Yeah, I totally get your frustration (I experience it too). Unfortunately, I can't really give you a good timetable on that. If you're familiar with the software industry, you know that bugs can crop up and delay things or cause other projects to be prioritized. That being said, we are aware of the restrictions this places on algo creation and have a few things in the works.

@Luca
Thank you that was the answer -- I wonder if there are historical market_cap values somewhere online? On second thoughts, it may be easier just to use a lower value and work with it.

I added a filter on (fundamentals.share_class_reference.is_primary_share == True) to remove some of the untradable oddball class B shares that the current rules pick up. Still probably not quite accurate but it improves the selection a little bit.

What's the rationale for the market cap filter cutoff? All that's needed should be captured by the order_by + limit. I thought it was there to speed up the query or something.