New Dataset: FactSet Fundamentals

New Dataset

Today, we added a new dataset to the platform: FactSet Fundamentals. The FactSet Fundamentals dataset provides access to more than 800 corporate fundamental data fields. The full list of available fields can be found on this reference page. FactSet Fundamentals data is available in Research and Backtesting via the Pipeline API. You can access FactSet Fundamentals data in Pipeline like this:

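# Import the FactSet datasets available to Pipeline.
from quantopian.pipeline.data import factset

# Latest known value of quarterly sales (the _qf suffix marks a quarterly field).
quarterly_sales = factset.Fundamentals.sales_qf.latest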

Usable in the Contest

Since FactSet Fundamentals are available in backtesting, you can use the dataset in the contest. As requested by a community member in another thread, we are increasing the limit on the number of contest entries per person to 4 so that you can make a new entry with FactSet Fundamentals without having to pull one of your existing entries. Algorithms that use FactSet Fundamentals are eligible to be considered for an allocation.

Attached is an example notebook that uses FactSet Fundamentals in a pipeline to help get you started.


New Features

There are some new features that are unique to FactSet Fundamentals (these features do not apply to the Morningstar Fundamentals integration):

Reporting Lag Simulated Based On Fiscal Quarter

Most fundamentals data comes from public reports that companies are required to publish either quarterly or annually. Companies file these reports after the close of each quarter/year, but the exact amount of time between the period end and the filing is different from company to company and even from period to period. In the US, for example, companies have 45 days to file their quarterly reports for Q1, Q2, and Q3, but they have 60 days to file for Q4.

Assuming that quarterly reports were available immediately after the close of a company's fiscal period is an easy way to introduce lookahead bias into a model. To prevent this form of lookahead bias, Quantopian timestamps each data point as it is downloaded from the vendor on a nightly basis. We use the timestamps to inform Pipeline when each data point can be introduced into the simulation.

Of course, the approach of timestamping data as it is downloaded doesn’t work with historical data that existed before Quantopian started collecting it. We model timestamps of historical data points by using the report dates provided by the vendor. When a report date isn’t provided, we lag the fiscal period end date. In our Morningstar integration, we lag the quarter/year end date by 45 days - the maximum allowed time to file the report for companies in the US in Q1-Q3. For our new FactSet integration, records without a file date are lagged by 45 days for Q1-Q3, and by 60 days for Q4, which is the maximum allowed time to file the year-end report for companies in the US.

The new lag approach will be applied to FactSet Fundamentals in other markets as well. As we roll out new markets, we will be adding fundamentals data with a similar lag algorithm. The number of days to lag will vary based on the reporting laws in each country.
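The US lag rule described above can be sketched in a few lines of plain Python. This is only an illustration of the rule as stated in this post, not Quantopian's actual implementation; the function name `estimated_available_date` and the `US_LAG_DAYS` table are hypothetical names introduced for the example.

```python
from datetime import date, timedelta
from typing import Optional

# Lag lengths as stated in this post for US companies:
# 45 days for Q1-Q3 and 60 days for Q4. (Hypothetical table name.)
US_LAG_DAYS = {1: 45, 2: 45, 3: 45, 4: 60}


def estimated_available_date(
    period_end: date,
    fiscal_quarter: int,
    report_date: Optional[date] = None,
) -> date:
    """Estimate when a historical data point could enter a simulation.

    If the vendor supplied a report date, use it. Otherwise lag the
    fiscal period end date by the quarter-specific number of days.
    """
    if report_date is not None:
        return report_date
    if fiscal_quarter not in US_LAG_DAYS:
        raise ValueError("fiscal_quarter must be 1, 2, 3, or 4")
    return period_end + timedelta(days=US_LAG_DAYS[fiscal_quarter])


# A Q1 record with no report date is lagged by 45 days,
# while a Q4 record with no report date is lagged by 60 days.
q1_available = estimated_available_date(date(2010, 3, 31), 1)
q4_available = estimated_available_date(date(2010, 12, 31), 4)
```

For other markets, the same shape would apply with a different lag table, since the number of days depends on each country's reporting laws.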

Data Holdout Period

Quantopian Community access to the FactSet Fundamentals data is subject to a trailing 1-year holdout period. For example, because today is October 11th, 2018, you can access FactSet Fundamentals data through October 11th, 2017. Unlike our subscription-based datasets, the free version of the FactSet data will be usable in contest algorithms without a subscription. Similarly, algorithms that use FactSet Fundamentals will be evaluated by Quantopian using up-to-date data. Submitting an algorithm to the contest that uses FactSet data will follow the usual submission process. The only difference is that your backtest prior to submission can’t run through the holdout period.

We are hopeful that the holdout will help to reduce the risk of overfitting. Since Quantopian will be using up-to-date data to evaluate and score contest entries, this is a good opportunity to build factors on in-sample data and test them in the contest using out-of-sample data.
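To pick a backtest end date that stays outside the holdout period, the cutoff can be computed from today's date. The sketch below is a minimal illustration in plain Python; `holdout_cutoff` is a hypothetical helper, and the February 29 fallback is an assumption, since the post does not say how that edge case is handled.

```python
from datetime import date


def holdout_cutoff(today: date, years: int = 1) -> date:
    """Return the last date of data accessible under a trailing holdout."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # Assumption: February 29 with no matching day in the target
        # year falls back to February 28.
        return today.replace(year=today.year - years, day=28)


# The example from the post: on October 11th, 2018, data is
# accessible through October 11th, 2017.
cutoff = holdout_cutoff(date(2018, 10, 11))
```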

Coming Soon: Global Coverage

FactSet Fundamentals data has global coverage. As we expand the Quantopian platform to support global equity research, we will be adding FactSet Fundamentals data for each new market. Our global expansion is starting in Research. Backtesting will not support global equities right away. We will be making another announcement soon with details about the global integration.


Other Notes

  • This data is now usable in the contest. We encourage you to play around with it and see if you can incorporate it into your research! And as always, please let us know if you discover any issues or if you have any questions.
  • Pipeline allows you to access FactSet Fundamentals data back to 2004.
  • The update frequency of each field is denoted with a suffix. For example, fields ending in _qf are updated at a quarterly frequency, _af are updated at an annual frequency, and _saf are updated at a semi-annual frequency. See the FactSet Fundamentals reference for more information.
  • Currently, our FactSet Fundamentals integration doesn’t include their Last Twelve Month (LTM) data, nor any per-share fields. We plan to add both of these types of fields at a later date.
  • We are expecting to apply a minor update to this dataset in the next couple of weeks. Specifically, there are some data points in September/October 2018 that are surfaced a day late. When the update is applied, some of your simulation results might change, but it should only be for the September/October 2018 period. Since this data is in the holdout period, you would only notice this if you use FactSet Fundamentals data in a contest algorithm.
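The suffix convention mentioned in the notes above can be used to group field names by update frequency. The following is a small plain-Python sketch; `field_frequency` and `SUFFIX_TO_FREQUENCY` are hypothetical names, and only the three suffixes described in this post are mapped. Any other suffix returns None rather than a guess.

```python
from typing import Optional

# Suffixes and frequencies as described in this post.
SUFFIX_TO_FREQUENCY = {
    "_qf": "quarterly",
    "_af": "annual",
    "_saf": "semi-annual",
}


def field_frequency(field_name: str) -> Optional[str]:
    """Return the update frequency implied by a field name's suffix."""
    # Check longer suffixes first so the most specific match wins.
    for suffix in sorted(SUFFIX_TO_FREQUENCY, key=len, reverse=True):
        if field_name.endswith(suffix):
            return SUFFIX_TO_FREQUENCY[suffix]
    return None


# Group a few field names by their implied frequency.
fields = ["sales_qf", "pe_af", "pe_saf"]
by_frequency = {name: field_frequency(name) for name in fields}
```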

Happy coding!


28 responses

Wow Jamie! Thanks. This will be a game changer to get access to a FactSet kind of dataset on Quantopian.

Very much looking forward to spending a good amount of time on it and putting it to good use. Great news!

Hi, this is great - thank you! Will we ultimately have access to consensus estimates? E.g., instead of using TTM earnings for a P/E ratio, would be great to be able to use estimated NTM earnings.

Really great post, thank you! And congrats to everyone involved for a BIG achievement!

Thanks also for the thorough and detailed documentation/description of the new data fields. Very helpful and important in my opinion.

FactSet is better than I thought it would be. The new dataset makes it easier to tell the difference between annual and quarterly data. Also the data makes it easier for people to try different fundamental combinations. Not having to loop through a year's worth of data is a time saver. Not happy about some of the new data fields because they are giving away some of my edge. Oh well! Anyhow, the new dataset should help make better algorithms for everyone :)

great news!

will Morningstar be deprecated?

I'm glad everyone's excited to use the new dataset.

@Tom: Yes, we are planning to add consensus estimates data. We don't have a timeline yet, but it's in the cards.

@Maester: We don't currently have any plans to deprecate the Morningstar integration.

Jamie, How can we subscribe to up to date data?

"Quantopian Community access to the FactSet Fundamental data is subject to a trailing 1-year holdout period. For example, because today is October 11th, 2018, you can access FactSet Fundamentals data through October 11th, 2017. Unlike our subscription-based datasets, the free version of the FactSet data will be usable in contest algorithms without a subscription."

Thanks

@Kamran: The up-to-date FactSet Fundamentals integration is available as part of Quantopian Enterprise. The Quantopian Enterprise page gives a good overview of the product. If you want to inquire about the details, click through the 'Request Demo' --> 'Contact Us' links.

Is Factset only released for US equities right now?

If not, how do you filter between global markets?

Only US equities right now. They're working on rolling out other global EQ markets.

@Evan: Joakim is correct. The FactSet Fundamentals implementation referenced in this thread covers the US. However, the FactSet Fundamentals product has global coverage, and we are working on adding data for global equity markets, including FactSet Fundamentals. We're expecting to share an update on that project in the coming weeks.

Will the new integration with FactSet allow us to backtest further back in the past? I mean starting earlier than 2002?

From the initial post above:

Pipeline allows you to access FactSet Fundamentals data back to 2004.

In the future they may look at providing earlier data, but don't think it's currently a priority.

In a few cases there is a variant with suffix _cf (see below).

What does it mean? Or is it an error in the documentation?

dps_ddate_cf,dps_ddate_qf,dps_ddate_saf
pbk_cf,pbk_af,pbk_qf,pbk_saf
pcf_cf,pcf_af,pcf_qf,pcf_saf
pe_cf,pe_af,pe_qf,pe_saf
psales_cf,psales_af,psales_qf,psales_saf

I’ve wondered about this too. _cf = “current field” maybe? Could be confused with “cash flow” though.

The dps_ddate is particularly strange, because apart from _cf only _qf and _saf are listed (_af is missing, I've not checked if it's actually available in the system). The others are all ratios; it may have some meaning there.

I have only just started to look at what is available via Morningstar and via FactSet.
There seems to be a great deal of overlap. Why did you introduce this on top of and as well as Morningstar?

I am assuming both, or either, can be used in the contest?

Joakim, I think you're right that the "c" stands for "current". Maybe the ratios are calculated as the estimate for the current fiscal year divided by the current price. But as far as I can see these fields are never available (at least in the historical data) so I have no way to check.

By the way, I missed a few _cf fields in my previous list:

div_rate_cf
shs_closely_held_cf
shs_float_cf
div_yld_cf
div_yld_secs_cf
earn_yld_cf

Zenothestoic, you're right that some valuation ratios and yields are included in the dataset. And given that price_close_fp is available, it's possible to calculate the corresponding "per share" fundamentals. But it seems that DPS, EPS, book value per share, etc. are not directly available. I don't know why, if they can be calculated easily anyway.
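As a rough sketch of that calculation in plain Python: if a valuation ratio is defined as price divided by the per-share value (an assumption here, not something confirmed by the FactSet reference), the per-share value can be backed out by dividing the price by the ratio. The helper name `implied_per_share` is hypothetical, and the inputs stand in for values such as price_close_fp and pe_qf taken at the same point in time.

```python
import math
from typing import Optional


def implied_per_share(price: Optional[float], price_ratio: Optional[float]) -> Optional[float]:
    """Back out a per-share value from a price-based ratio.

    Assumes ratio = price / per_share_value, so
    per_share_value = price / ratio. Returns None when either
    input is missing or the ratio is zero.
    """
    if price is None or price_ratio is None:
        return None
    if math.isnan(price) or math.isnan(price_ratio) or price_ratio == 0:
        return None
    return price / price_ratio


# Illustrative numbers only: a price of 50.0 with a P/E of 20.0
# implies earnings per share of 2.5.
eps = implied_per_share(50.0, 20.0)
```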

Many thanks. Yes, I think the ratios are only calculated using the price as of when the relevant data is published. But clearly one could do the calculations for historic "div_rate" and apply it to the current price. Can you use Morningstar in the competition?

It's very important to keep both data sets. The FactSet data is delayed and one has no way to look at the recent data.

Agreed. Morningstar fundamentals have the current valuation ratios, so it's important to keep both. Besides, a good trading strategy should work in both data sets IMO.

@kamran @stanley which fields did you notice are delayed with Factset? Thanks!! Trying to do all the mapping and gauge update frequency...

Kay, I mean they are held back by 12 months based on the agreement between Q and FactSet. FactSet sells the up-to-date information separately, and hence it's not possible to look at the current data in that dataset.

Hi @Jamie,

Regarding LTM:

Currently, our FactSet Fundamentals integration doesn’t include their
Last Twelve Month (LTM) data, nor any per-share fields. We plan to add
both of these types of fields at a later date.

It looks like these are available now? Or are there more LTM fields that you're in the process of adding?

Update: Looks like many have been added, but that there are still a few that need to be added?

com_shs_out (Common Shares Outstanding) is referenced a number of times in the documentation, but doesn't appear to be accessible by itself (only the com_shs_out_date field). Should this field (com_shs_out) be available directly?

There appear to be others as well. 'bps' for example (book value per share) is referenced throughout the documentation for other fields, but not accessible directly. Shouldn't it be?