Is it possible to change the privacy setting for a personal self-serve dataset to "public"?

I put together a script that pulls from my broker a flag for whether stocks are "easy to borrow" (ETB) and outputs it in CSV format.
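For anyone curious, the script is roughly along these lines. The broker query itself is omitted here (that part is specific to my broker's API); self-serve just needs a CSV with a date column, a symbol column, and the signal:

```python
import csv
from datetime import date

def write_etb_csv(etb_flags, path):
    """Write easy-to-borrow flags in a self-serve-friendly CSV layout:
    one row per (date, symbol) with a 0/1 flag column.

    etb_flags: dict mapping symbol -> bool (True if easy to borrow),
    as returned by whatever broker query you have available.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "symbol", "etb"])
        for symbol, is_etb in sorted(etb_flags.items()):
            writer.writerow([date.today().isoformat(), symbol, int(is_etb)])

# The broker call is hypothetical; plug in your own source of flags:
# write_etb_csv({"AAPL": True, "XYZ": False}, "etb.csv")
```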

This could be useful to people for a number of reasons. ETB stocks carry a very low (~0.3%) borrow fee. Not all stocks in QTradableStocksUS, for example, will have borrow availability or such a low fee; fees for stocks in QTSUS could theoretically reach 100% or more in some cases. So if a stock is hard to borrow, an apparent source of alpha might not actually survive the borrow fees, or the short might not even get a fill. In addition, a stock's ETB status signals how heavily it is shorted, which could be a useful alpha signal in combination with other technicals, sentiment, or what have you.

So I was thinking this dataset would be useful to others, and I wanted to share it, but I thought it would be a bit redundant if a bunch of people were all importing duplicates of the same data into their Quantopian accounts. Would it be possible for me to set this dataset to "public" and share it?


That would be juicy.

Hi Viridian,

Currently, you don't have the ability to share the dataset and make it public. That said, it's the sort of feature we'd like to add at some point. We're not working on it now, so I don't have a timeline for when you'll be able to do it, but feedback like this is helpful as we organize and maintain our plans going forward.

Out of curiosity, did you set up the dataset as a live-updating dataset, or was it historical only?

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

I guess in the interim then I can just share the download link for the CSV files with whoever is interested.

I have it set up as live updating. For this data source I wasn't able to obtain any historical data, so it's just collecting data as it goes along... maybe in a couple of years it'll be useful.

@Jamie -- here are a couple feature requests:

  • I'd like to be able to delete data feeds I've uploaded in case I lose interest in them (say I do the research and it turns out I can't find any alpha), so they don't count against my 30-dataset limit.
  • I'd like to be able to update/replace the historical data in case I find data errors I would like to fix or discover additions I'd like to make to it.

@Viridian: Your interim solution sounds good. That's what we've been doing in posts that we make that use self-serve.

Regarding the ability to delete datasets: we hear you. The feature isn't available on the platform right now, but if you're close to your limit and want to add new datasets, you can email [email protected] and we can manually delete the ones you no longer want on our side to make some room.

For the update/replace request, have you tried issuing corrections through a live-updating dataset? You should be able to provide them in a nightly update file by adding a record with the date and asset of the historical record you'd like to change. Note that this will create an 'update' record on our end, timestamped when the update was issued, and pipeline will only learn about the update based on that timestamp (i.e. if you posted the update at 1am ET on day N, a simulation won't see it processed until day N).
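For example, suppose you discovered that the ETB flag for a symbol on 2018-09-12 was wrong. A later nightly file could simply re-state that date/asset pair with the corrected value (the column names here are just illustrative, matching whatever schema you uploaded originally):

```
date,symbol,etb
2018-09-12,XYZ,0
```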

The above behavior is one of the most critical pieces of self-serve. Having a point-in-time history of updates to self-serve datasets means that algorithms using self-serve data can be evaluated for an allocation. If it were possible for data to be overwritten, we wouldn't be able to tell whether or not the dataset has lookahead bias. By timestamping the live updates, the dataset builds up an out-of-sample period, which is critical to our evaluation process.
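To make the point-in-time behavior concrete, here's a toy sketch (not our actual implementation) of why a timestamped correction can't introduce lookahead bias: a simulation only sees records whose update timestamp has already passed, and a later correction to the same date/asset pair only wins once its own timestamp is reached.

```python
def visible_records(records, sim_dt):
    """Point-in-time filter: a record is visible on simulation date
    sim_dt only if its update was issued (timestamped) at or before
    sim_dt. Among visible records for the same (asof_date, symbol),
    the most recently timestamped one wins.

    Dates are ISO strings, so lexicographic comparison is safe.
    """
    latest = {}
    for rec in records:
        if rec["timestamp"] > sim_dt:
            continue  # update not yet known as of sim_dt
        key = (rec["asof_date"], rec["symbol"])
        if key not in latest or rec["timestamp"] > latest[key]["timestamp"]:
            latest[key] = rec
    return latest

# Original record loaded on 2018-09-12, correction issued 2018-10-05:
records = [
    {"asof_date": "2018-09-12", "symbol": "XYZ", "etb": 1,
     "timestamp": "2018-09-12"},
    {"asof_date": "2018-09-12", "symbol": "XYZ", "etb": 0,
     "timestamp": "2018-10-05"},
]
# A simulation day before the correction still sees the old value:
# visible_records(records, "2018-09-20")[("2018-09-12", "XYZ")]["etb"] == 1
# After the correction's timestamp, the fixed value applies:
# visible_records(records, "2018-10-06")[("2018-09-12", "XYZ")]["etb"] == 0
```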

That said, I recognize you might just want to update a historical dataset where you discovered a data bug or other issue. Right now, the best way to make that update is to email us to get the old version deleted, then upload a new version with the fix. That further underscores your first request, the ability to delete a dataset, but again, emailing in is the best option right now. I've passed your requests along to the team that works on self-serve. Thanks for the feedback.

P.S. I think it's great that you want to share your dataset with the rest of the community!

Hey, thanks. Sounds great.