Linking Research output to fetch_csv input

So, I know, you just released the modelling/pipeline API, and it's really great, but :) has anyone thought of a way to link the output of research to the input of an algo, apart from writing Research code which produces cut-and-paste output? I'm thinking along the lines of misshapen data like the current list of eligible pairs to trade, that sort of thing.

6 responses

Purely a hypothetical, but . . . the Quantopian Store allows us to store arbitrary data sets from vendors. Right now, we integrate those sets ourselves. Those data sets can be raw data, but they could also be the output of models, like 'which pairs should we trade today'. I can imagine a world in which that arbitrary data storage (with sid mapping and point-in-timedness enforced) is open to being written to via an API in research.
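To make "point-in-timedness" concrete, here is a minimal sketch of the idea in plain Python. It is not an existing Quantopian API: `PointInTimeStore`, `write`, and `read` are hypothetical names, and the tickers are made up. The only point is that a read for a given day can see what was written on or before that day, never anything later.

```python
import bisect
from datetime import date


class PointInTimeStore:
    """Hypothetical store: each value is stamped with the date it became known."""

    def __init__(self):
        self._dates = []   # kept sorted ascending
        self._values = []  # values aligned with self._dates

    def write(self, as_of, value):
        # Insert in date order so that reads can binary-search.
        i = bisect.bisect_right(self._dates, as_of)
        self._dates.insert(i, as_of)
        self._values.insert(i, value)

    def read(self, today):
        # Return the latest value written on or before `today`, else None.
        i = bisect.bisect_right(self._dates, today)
        return self._values[i - 1] if i else None


store = PointInTimeStore()
store.write(date(2015, 10, 1), [("AAA", "BBB")])   # made-up pairs
store.write(date(2015, 11, 2), [("AAA", "CCC")])

before_any = store.read(date(2015, 9, 30))    # nothing known yet
mid_october = store.read(date(2015, 10, 15))  # sees only the October write
november = store.read(date(2015, 11, 2))      # sees the November write
```

A backtest reading through something like this could not accidentally use November's pair list in October, which is the look-ahead problem the enforcement is meant to prevent.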

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

This is one I've spent a bunch of time thinking about, and I'm not sure of the right answer.

On the one hand, I understand that you might have models you want to train in research or misshapen outputs that you want access to. On the other hand, I worry that if we head down this path we end up in the place where you need to be able to schedule notebooks to run, in order to update your data. There is something about that which worries me.

I also wonder if the problem is that getting code from research to the IDE is a bit of a pain. If we had a code library, where you could "store" your functions written in research, so that they are callable from the IDE, would that make this less of an issue?

Well, I guess the real problem is that some things might take an hour to calculate, and they produce data which is not the same shape as the prices, so they don't fit into the factor/pipeline API. One solution is simply to do these calculations offline, then paste the results in, and stop, edit and restart the algo every month, but it's not ideal, and it prevents the building up of a legitimate track record.
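For what it's worth, the "misshapen" part can at least be flattened into something CSV-shaped on the research side. Below is a minimal sketch, assuming the algo side ingests a CSV with one row per record and a date column (the general shape a fetch_csv-style import works with). `pairs_to_csv`, the column names, and the tickers are all hypothetical, and getting the resulting file to a URL the algo can read is still a manual step.

```python
import csv
import io
from datetime import date


def pairs_to_csv(pairs_by_date):
    """Flatten {date: [(sym_a, sym_b), ...]} into date-stamped CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "symbol", "partner"])  # hypothetical column names
    for day in sorted(pairs_by_date):
        for sym_a, sym_b in pairs_by_date[day]:
            # One row per eligible pair, stamped with the day it was computed.
            writer.writerow([day.isoformat(), sym_a, sym_b])
    return buf.getvalue()


# Made-up output of two monthly research runs.
example = {
    date(2015, 10, 1): [("AAA", "BBB"), ("CCC", "DDD")],
    date(2015, 11, 2): [("AAA", "CCC")],
}
csv_text = pairs_to_csv(example)
print(csv_text)
```

Because every row carries its own date, appending a new month's pairs does not require editing the algo itself, only refreshing the file it reads.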

That is helpful. I think we do need to figure out a way to solve the problem of things that take too long to run. Obviously we want the pipeline API to be part of the solution. I definitely don't think we are done optimizing it to be faster, but there are limitations to how far that can get us.

I do think there might be a place for what Josh mentioned up above. I can see that being a nice solution too.

We'll keep chewing on it.

I'm just getting a feel for the factor/pipeline thingy, but I think it's apples to oranges. Besides being able to run code for a long time in any ol' fashion, the research platform offers versatile visualization and a standard interactive IPython environment. Basically, the research platform is a decent general-purpose computing environment, whereas the backtester/trading platform is a kind of application-specific tool.

I don't understand the worry of ending up "in the place where you need to be able to schedule notebooks to run, in order to update your data." Presumably, folks are doing this every day with fetcher, so why not make it easy to do the same thing with the research environment? Is it a matter of the cost of guaranteeing up-time and reliability? It seems that this could be addressed with a clear policy (e.g. without notice it could be down for up to 24 hours). Besides, wouldn't you want people scheduling notebooks to provide useful data to an algo (you could add a simple scheduler for this purpose)?

Another angle on this is, what is the path to high-performance computing at Quantopian? The research platform seems like the perfect sandbox for this sort of thing, but it won't be as useful if it does not have better integration with the trading platform.