Fundamental data for securities already in portfolio

There must be an efficient way to do this, but I can't figure it out.

Suppose for the sake of example I have a Pipeline that uses a custom factor with Fundamentals to restrict the universe of securities to only those whose trailing 12-month EBIT is greater than the previous year's EBIT, and that I generate orders for some subset of those securities.

Then on subsequent trading days - say four quarters later - I might want to evaluate the securities in my portfolio based on some other fundamental data, e.g., sell the ones that now have current assets less than the previous year's current assets. How can I get the fundamental data for the securities in my portfolio? The pipeline will only fetch data for securities whose TTM EBIT is greater than the previous year's, but some of the securities in my portfolio may no longer satisfy that condition.

The only way I can think of doing this is to have the pipeline return fundamental data for an entire universe such as QTradableStocksUS and then filter it in before_trading_start, but that is no doubt going to be tremendously inefficient.
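
A minimal sketch of that workaround, written in plain Python so it runs without the platform. It assumes the pipeline drops its screen and instead emits the EBIT condition as a boolean column. The output is modeled here as a dict of rows keyed by symbol, where the real pipeline_output would return a DataFrame. The column names (`ebit_growing`, `current_assets`, `current_assets_prev`), the sample figures, and the helper `rows_to_evaluate` are all hypothetical.

```python
def rows_to_evaluate(output, held):
    """Keep rows that pass the screen condition or belong to a held security."""
    return {
        symbol: row
        for symbol, row in output.items()
        if row["ebit_growing"] or symbol in held
    }


# Hypothetical one-day pipeline output with no screen applied.
output = {
    "AAA": {"ebit_growing": True, "current_assets": 120.0, "current_assets_prev": 100.0},
    "BBB": {"ebit_growing": False, "current_assets": 80.0, "current_assets_prev": 95.0},
    "CCC": {"ebit_growing": False, "current_assets": 60.0, "current_assets_prev": 50.0},
}
held = {"BBB"}  # symbols currently in the portfolio (assumed)

kept = rows_to_evaluate(output, held)

# New purchases come only from securities that pass the EBIT condition.
buy_candidates = sorted(
    s for s, row in kept.items() if row["ebit_growing"] and s not in held
)

# Held securities are judged on current assets, whether or not they still pass.
to_sell = sorted(
    s for s in held
    if s in kept and kept[s]["current_assets"] < kept[s]["current_assets_prev"]
)
```

This only trims the work done after the pipeline returns. The pipeline itself still computes every column for the whole universe, which is the cost the question is worried about.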

I had thought of passing in the context to the make_pipeline function, but as a different thread pointed out, everything in that function is eagerly evaluated when the Pipeline object is created, not when pipeline_output gets called for some particular date -- and in fact the pipeline might asynchronously fetch multiple days of data at once, so it wouldn't even be in a position to know what securities are in the portfolio for days that haven't been traded yet.
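
To make the timing problem concrete, here is a toy illustration in plain Python. It is not the Pipeline API: `Context`, `make_pipeline`, and `run` are hypothetical stand-ins. The point is only that anything read from the context inside the builder is fixed when the pipeline is built.

```python
class Context:
    """Stand-in for the algorithm's context object (hypothetical)."""

    def __init__(self):
        self.held = set()


def make_pipeline(context):
    # The holdings are read here, once, when the pipeline is constructed.
    held_at_build = frozenset(context.held)

    def run(universe):
        # Later runs only ever see the snapshot taken at construction time.
        return [s for s in universe if s in held_at_build]

    return run


context = Context()
pipeline = make_pipeline(context)  # the portfolio is empty at this point
context.held.add("AAA")            # a position opened on a later day
result = pipeline(["AAA", "BBB"])  # "AAA" is held now, but the snapshot predates it
```

Here `result` comes back empty even though "AAA" is in the portfolio by the time the pipeline runs.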

Is there any other way of doing a lookup on fundamental data besides using the Pipeline API?

2 responses

Yep. What I hear is "we query for data in 6 month batches", and you can sometimes see those pauses happening during a backtest.
Possibly they might entertain a new function, a pipeline recalc-on-demand, with the caveat that we would have to be well aware of how intensive that could be and use it sparingly, such as in the manner you have described. Perhaps it could even be automatically limited to current positions not in pipeline_output() or something (from the thinking-out-loud department).

I'm not familiar with the innards of the architecture, but it seems to me there are really two different phases at work here. One is the screening phase, where you determine which securities you want in your universe, and one is the attributes phase, where you fetch the various data you need to make calculations to determine which of the securities in the universe to order, liquidate, or retain. Between these two phases you ought to be able to include the assets in your portfolio, since you have to make second-phase decisions about them as well. Right now there's a sort of halfway implementation of this in that you can use data.history to get historical price information about assets that might not be in the pipeline, but it might be a better architecture to have a real pipeline that can be constructed in stages.
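
The two-phase idea could look roughly like this. It is a plain-Python sketch of the proposed design, not something the platform is known to offer: `screen_stage`, `attribute_stage`, `run_staged`, `fetch`, and the sample data are all hypothetical.

```python
def screen_stage(universe, ebit_ttm, ebit_prev):
    """Stage 1: securities whose trailing EBIT exceeds the prior year's."""
    return {s for s in universe if ebit_ttm[s] > ebit_prev[s]}


def attribute_stage(symbols, fetch):
    """Stage 2: fetch attributes only for the requested securities."""
    return {s: fetch(s) for s in sorted(symbols)}


def run_staged(universe, held, ebit_ttm, ebit_prev, fetch):
    screened = screen_stage(universe, ebit_ttm, ebit_prev)
    # Held assets join between the stages, so they get attributes
    # even when they no longer pass the screen.
    return screened, attribute_stage(screened | held, fetch)


universe = ["AAA", "BBB", "CCC"]
ebit_ttm = {"AAA": 12.0, "BBB": 7.0, "CCC": 3.0}
ebit_prev = {"AAA": 10.0, "BBB": 9.0, "CCC": 4.0}
current_assets = {"AAA": 120.0, "BBB": 80.0, "CCC": 60.0}

fetched = []  # records which symbols the second stage actually touched


def fetch(symbol):
    fetched.append(symbol)
    return {"current_assets": current_assets[symbol]}


screened, attributes = run_staged(universe, {"BBB"}, ebit_ttm, ebit_prev, fetch)
```

In this sketch "BBB" fails the screen but is held, so it still receives attributes, while "CCC" is never fetched at all. That is the efficiency gain over returning data for the whole universe.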