Filter is no longer created from factor comparison

A few days ago, I was able to do the following:

sector_code = morningstar.asset_classification.morningstar_sector_code.latest  
pipe.add(sector_code, 'sector_code')  
pipe.set_screen(sector_code != -1)  

But now I get an error that "sector_code != -1" evaluates to a bool instead of a Filter, so the call to set_screen() fails. Did something change? Clone the attached algo and uncomment the set_screen() line to reproduce the problem.

Thanks.

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import morningstar

def initialize(context):
    context.sec = sid(8554)
    pipe = Pipeline()
    attach_pipeline(pipe, 'example')
    sector_code = morningstar.asset_classification.morningstar_sector_code.latest
    pipe.add(sector_code, 'sector_code')
    # Uncomment the set_screen() line, run a backtest and it will hit this Runtime Error:
    # TypeError: zipline.pipeline.pipeline.set_screen() expected a value of type
    # zipline.pipeline.filters.filter.Filter for argument 'screen', but got bool instead.
#    pipe.set_screen(sector_code != -1)

def handle_data(context, data):
    pass
5 responses

This is the fault of a recent Zipline change that added support for grouping expressions to the Pipeline API. I'm currently working on a full announcement post with examples, but the gist of the change is that .latest on dataset columns like morningstar_sector_code no longer returns a Factor instance. Instead, it returns a Classifier, a type that semantically represents labels rather than numerical values.

One of the consequences of that change is that Classifier doesn't provide most of the numerical operations that Factor provides. For most of the Factor operators, that's probably what you want: what does it mean, for example, to multiply or divide by a sector code? On the other hand, being able to produce a Filter by doing an equality comparison is a totally reasonable thing to want to do. The fact that Classifier isn't overriding that operator is an oversight. I've opened an issue about this in the Zipline repo here: https://github.com/quantopian/zipline/issues/1081.
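To illustrate why the comparison now comes back as a bool, here is a minimal plain-Python sketch. It is not Zipline code: SketchFilter, PlainTerm and ComparableTerm are hypothetical names made up for this example. A class that does not override the comparison operators falls back to Python's default identity-based comparison, which returns a bool immediately, while a class that does override them can return a deferred expression object instead.

```python
class SketchFilter:
    """Hypothetical stand-in for a pipeline Filter: records a deferred comparison."""

    def __init__(self, term, op, value):
        self.term = term
        self.op = op
        self.value = value


class PlainTerm:
    """Defines no comparison operators, so Python's object defaults apply."""


class ComparableTerm:
    """Overrides == and != to build a SketchFilter instead of returning a bool."""

    def __eq__(self, other):
        return SketchFilter(self, "==", other)

    def __ne__(self, other):
        return SketchFilter(self, "!=", other)

    # Defining __eq__ sets __hash__ to None unless it is restored explicitly.
    __hash__ = object.__hash__


# Default comparison: evaluated eagerly, yields a plain bool.
plain_result = PlainTerm() != -1

# Overridden comparison: yields an object describing the comparison.
deferred_result = ComparableTerm() != -1

print(type(plain_result).__name__)     # bool
print(type(deferred_result).__name__)  # SketchFilter
```

The first case mirrors the error message in the original post: set_screen() receives a bool because nothing intercepted the != operator.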

For the particular use-case here of excluding null values, I have a change currently in review that adds isnull and notnull methods directly to Classifier. Barring unforeseen developments, I'd expect it to be out on the Q platform early next week.

Sorry for the breakage,
- Scott

Thanks for the update. BTW, morningstar_sector_code does not seem to have any null values, but it does have values of -1.

As a work-around to this issue, I just wrote a small CustomFactor.

-1 is the value we insert for morningstar_sector_code when the real value is null or otherwise missing. Since pipeline inputs are represented as numpy arrays, which don't have a native concept of missing data, we have to choose some real value to represent missing data. In general, that value is NaN for any data representing numerical quantities, and it's some value that never appears in the actual data for enum labels like sector_code. (So far it's -1 for all the classifiers that are backed by integers, but that could change in the future as we add more data.)

The isnull and notnull methods are designed specifically to abstract away the fact that different columns might represent "this data is missing" in different ways. For sector code, isnull on morningstar sector data will check if a value is -1, but for pricing data, it will use np.isnan.
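A rough sketch of that idea, in plain Python rather than the actual Zipline implementation: the isnull and notnull functions below are hypothetical helpers written for this example, they take the column's missing marker as an explicit argument, and math.isnan stands in for np.isnan so the snippet needs no third-party packages.

```python
import math


def isnull(values, missing_value):
    """Return a list of bools marking entries that match the column's missing marker."""
    # NaN never compares equal to itself, so it needs a dedicated check.
    if isinstance(missing_value, float) and math.isnan(missing_value):
        return [math.isnan(v) for v in values]
    # Sentinel markers such as -1 can be matched with ordinary equality.
    return [v == missing_value for v in values]


def notnull(values, missing_value):
    """Return the element-wise negation of isnull."""
    return [not flag for flag in isnull(values, missing_value)]


# Made-up sample data: integer labels use -1 as the marker, prices use NaN.
sector_codes = [101, -1, 309, -1]
close_prices = [10.5, float("nan"), 12.0, 11.25]

print(isnull(sector_codes, -1))             # [False, True, False, True]
print(isnull(close_prices, float("nan")))   # [False, True, False, False]
print(notnull(sector_codes, -1))            # [True, False, True, False]
```

The caller asks the same question of both columns and never has to know which marker a given column uses.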

Thanks for the explanation.

These changes have been merged. See https://www.quantopian.com/posts/pipeline-classifiers-are-here for details.