Contest Ready but I can do better

I have a version that performs better, but it uses fetch_csv to get PMI, so it's obviously ineligible for contest entry. With that limitation in mind, I ask: why doesn't Quantopian expose PMI data? It's free on Quandl, and you would think the biggest leading indicator (besides the market itself) would be of great importance to any quantitative strategy, especially ones that must conform to turnover constraints. My understanding is that GDP, CPI and unemployment claims (lagging indicators) are largely relevant only as confirmations of the economic trend that PMI signals. Please correct me if I am mistaken, but I'm really hoping someone can address why Quantopian doesn't offer this otherwise easily obtainable dataset. Also, any criticism of the attached tear sheet would be greatly appreciated.

7 responses

Stephan - You should use Self-Serve Data to upload your own dataset. Custom datasets uploaded via Self-Serve Data are allowed in the contest. Given that you have some ideas on how to incorporate PMI in your strategy, I'd recommend uploading it via Self-Serve and adding it to your algorithm's pipeline.

fetch_csv is not allowed in the contest because it doesn't maintain a historical record of updated data, meaning it can be manipulated to make a backtest score highly after it was submitted (introduces lookahead bias). At some point, we will look to remove this function from the API. Self-Serve is currently the best way to upload custom datasets.


Does Quantopian have any plans on bringing in Quandl's PMI dataset?

Not at this time, no. We are currently focused on adding datasets from FactSet's catalog. If you want to use PMI data in your algorithm, creating a custom dataset via Self-Serve is your best option.

ok thanks

I created a Self-Serve dataset for PMI as you suggested. I added a dummy symbol (SPY) to the CSV that I import, since I just want to extract PMI for a given period. However, I'm having difficulty understanding how to look back in my Self-Serve dataset. Typically I would implement a custom factor like this:

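from quantopian.pipeline import CustomFactor

# test_dataset is my Self-Serve dataset, imported separately
class getpmi(CustomFactor):
    inputs = [test_dataset.pmi]
    window_length = 20

    def compute(self, today, assets, out, pmi):
        # difference between the last two rows of the lookback window
        out[:] = pmi[-1] - pmi[-2]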

...but it just returns:
getpmi([test_dataset<US>.pmi], 20)

I just want the difference between the last two periods. Sorry for my ignorance, but how do I iterate through this Self-Serve dataset? I'm confused.
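
A note on the snippet above: the line `getpmi([test_dataset<US>.pmi], 20)` looks like the printed representation of the factor object rather than its values, since a pipeline term is only computed when the pipeline containing it is run. Separately, if the monthly PMI reading is forward-filled onto daily rows (an assumption about how the uploaded data appears in the window), `pmi[-1] - pmi[-2]` would be zero on most days. The NumPy sketch below imitates the 2-D window (days by assets) that `compute` receives and takes the difference between the two most recent distinct values instead. The function name `last_change` and the sample numbers are made up for illustration; they are not real PMI readings and this is not Quantopian API code.

```python
import numpy as np

def last_change(window):
    """Per asset column, return the latest value minus the most recent
    value that differs from it; NaN if the window never changes."""
    n_days, n_assets = window.shape
    out = np.full(n_assets, np.nan)
    for col in range(n_assets):
        series = window[:, col]
        series = series[~np.isnan(series)]   # drop missing rows
        if series.size == 0:
            continue
        latest = series[-1]
        earlier = series[series != latest]   # readings before the current run
        if earlier.size:
            out[col] = latest - earlier[-1]
    return out

# Hypothetical forward-filled window: 5 daily rows, 1 asset column.
window = np.array([[50.0], [50.0], [50.0], [52.0], [52.0]])

naive = window[-1] - window[-2]   # what pmi[-1] - pmi[-2] would compute
robust = last_change(window)      # difference of the last two distinct readings
```

With roughly 21 trading days in a month, a 20-day window may contain only one reading, so a longer `window_length` would probably be needed for this approach. Two consecutive months with identical readings would also be treated as a single run.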

I hacked a workaround by holding last month in memory....
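
For readers wondering what such a workaround might look like, here is a minimal sketch in plain Python. It is not necessarily what was done here: the class name `LastValueTracker` is hypothetical, the readings are made-up numbers, and in an actual algorithm the state would presumably live on the `context` object.

```python
class LastValueTracker:
    """Remembers the previous distinct reading and reports the change."""

    def __init__(self):
        self.current = None
        self.previous = None

    def update(self, value):
        # Roll the stored values forward only when the reading changes.
        if self.current is None:
            self.current = value
        elif value != self.current:
            self.previous = self.current
            self.current = value
        return self.change()

    def change(self):
        # None until two distinct readings have been seen.
        if self.previous is None:
            return None
        return self.current - self.previous

tracker = LastValueTracker()
readings = [50.0, 50.0, 52.0, 52.0, 51.0]  # made-up daily values
changes = [tracker.update(r) for r in readings]
```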

So how do I remove a Self-Serve dataset from my list of datasets? And/or how do I update an existing Self-Serve dataset? I understand that setting up live data is probably the best way to maintain a dataset, but in the meantime I would just like to update it, and that doesn't seem possible.