Pipeline API Reference

Quick Reference

Pipeline

quantopian.pipeline.Pipeline([columns, ...]) A Pipeline object represents a collection of named expressions to be compiled and executed by a PipelineEngine.

Pipeline Methods

quantopian.pipeline.Pipeline.add(self, term, ...) Add a column.
quantopian.pipeline.Pipeline.remove(self, name) Remove a column.
quantopian.pipeline.Pipeline.set_screen(...) Set a screen on this Pipeline.
quantopian.pipeline.Pipeline.show_graph(self) Render this Pipeline as a DAG.

Pipeline Attributes

quantopian.pipeline.Pipeline.columns The output columns of this pipeline.
quantopian.pipeline.Pipeline.screen The screen of this pipeline.

Base Classes

quantopian.pipeline.CustomFactor Base class for user-defined Factors.
quantopian.pipeline.CustomFilter Base class for user-defined Filters.
zipline.pipeline.Term Base class for objects that can appear in the compute graph of a zipline.pipeline.Pipeline.
zipline.pipeline.LoadableTerm A Term that should be loaded from an external resource by a PipelineLoader.
zipline.pipeline.ComputableTerm A Term that should be computed from a tuple of inputs.
zipline.pipeline.Factor Pipeline API expression producing a numerical or date-valued output.
zipline.pipeline.Filter Pipeline expression computing a boolean output.
zipline.pipeline.Classifier A Pipeline expression computing a categorical output.
zipline.pipeline.data.DataSet Base class for Pipeline datasets.
zipline.pipeline.data.DataSetFamily Base class for Pipeline dataset families.
zipline.pipeline.data.BoundColumn A column of data that's been concretely bound to a particular dataset.
zipline.pipeline.data.Column An abstract column of data, not yet associated with a dataset.
zipline.pipeline.domain.Domain A domain represents a set of labels for the arrays computed by a Pipeline.

Factor Methods

Methods That Create Factors

zipline.pipeline.Factor.rank([method, ...]) Construct a new Factor representing the sorted rank of each column within each row.
zipline.pipeline.Factor.demean(self[, mask, ...]) Construct a Factor that computes self and subtracts the mean from each row of the result.
zipline.pipeline.Factor.zscore(self[, mask, ...]) Construct a Factor that Z-Scores each day's results.
zipline.pipeline.Factor.pearsonr(self, ...) Construct a new Factor that computes rolling pearson correlation coefficients between target and the columns of self.
zipline.pipeline.Factor.spearmanr(self, ...) Construct a new Factor that computes rolling spearman rank correlation coefficients between target and the columns of self.
zipline.pipeline.Factor.linear_regression(...) Construct a new Factor that performs an ordinary least-squares regression predicting the columns of self from target.
zipline.pipeline.Factor.winsorize(self, ...) Construct a new factor that winsorizes the result of this factor.
zipline.pipeline.Factor.downsample(self, ...) Make a term that computes from self at lower-than-daily frequency.
zipline.pipeline.Factor.sin() Construct a Factor that computes sin() on each output of self.
zipline.pipeline.Factor.cos() Construct a Factor that computes cos() on each output of self.
zipline.pipeline.Factor.tan() Construct a Factor that computes tan() on each output of self.
zipline.pipeline.Factor.arcsin() Construct a Factor that computes arcsin() on each output of self.
zipline.pipeline.Factor.arccos() Construct a Factor that computes arccos() on each output of self.
zipline.pipeline.Factor.arctan() Construct a Factor that computes arctan() on each output of self.
zipline.pipeline.Factor.sinh() Construct a Factor that computes sinh() on each output of self.
zipline.pipeline.Factor.cosh() Construct a Factor that computes cosh() on each output of self.
zipline.pipeline.Factor.tanh() Construct a Factor that computes tanh() on each output of self.
zipline.pipeline.Factor.arcsinh() Construct a Factor that computes arcsinh() on each output of self.
zipline.pipeline.Factor.arccosh() Construct a Factor that computes arccosh() on each output of self.
zipline.pipeline.Factor.arctanh() Construct a Factor that computes arctanh() on each output of self.
zipline.pipeline.Factor.log() Construct a Factor that computes log() on each output of self.
zipline.pipeline.Factor.log10() Construct a Factor that computes log10() on each output of self.
zipline.pipeline.Factor.log1p() Construct a Factor that computes log1p() on each output of self.
zipline.pipeline.Factor.exp() Construct a Factor that computes exp() on each output of self.
zipline.pipeline.Factor.expm1() Construct a Factor that computes expm1() on each output of self.
zipline.pipeline.Factor.sqrt() Construct a Factor that computes sqrt() on each output of self.
zipline.pipeline.Factor.abs() Construct a Factor that computes abs() on each output of self.
zipline.pipeline.Factor.__add__(self, other) Construct a Factor computing self + other.
zipline.pipeline.Factor.__sub__(self, other) Construct a Factor computing self - other.
zipline.pipeline.Factor.__mul__(self, other) Construct a Factor computing self * other.
zipline.pipeline.Factor.__div__(self, other) Construct a Factor computing self / other.
zipline.pipeline.Factor.__mod__(self, other) Construct a Factor computing self % other.
zipline.pipeline.Factor.__pow__(self, other) Construct a Factor computing self ** other.

Methods That Create Filters

zipline.pipeline.Factor.eq(self, other) Construct a Filter computing self == other.
zipline.pipeline.Factor.top(N[, mask, groupby]) Construct a Filter matching the top N asset values of self each day.
zipline.pipeline.Factor.bottom(N[, mask, ...]) Construct a Filter matching the bottom N asset values of self each day.
zipline.pipeline.Factor.isnull() A Filter producing True for values where this Factor has missing data.
zipline.pipeline.Factor.notnull() A Filter producing True for values where this Factor has complete data.
zipline.pipeline.Factor.isnan(self) A Filter producing True for all values where this Factor is NaN.
zipline.pipeline.Factor.notnan(self) A Filter producing True for values where this Factor is not NaN.
zipline.pipeline.Factor.isfinite(self) A Filter producing True for values where this Factor is anything but NaN, inf, or -inf.
zipline.pipeline.Factor.percentile_between(...) Construct a Filter matching values of self that fall within the range defined by min_percentile and max_percentile.
zipline.pipeline.Factor.__lt__(self, other) Construct a Filter computing self < other.
zipline.pipeline.Factor.__le__(self, other) Construct a Filter computing self <= other.
zipline.pipeline.Factor.__ne__(self, other) Construct a Filter computing self != other.
zipline.pipeline.Factor.__ge__(self, other) Construct a Filter computing self >= other.
zipline.pipeline.Factor.__gt__(self, other) Construct a Filter computing self > other.

Methods That Create Classifiers

zipline.pipeline.Factor.quartiles(self[, mask]) Construct a Classifier computing quartiles over the output of self.
zipline.pipeline.Factor.quintiles(self[, mask]) Construct a Classifier computing quintile labels on self.
zipline.pipeline.Factor.deciles(self[, mask]) Construct a Classifier computing decile labels on self.
zipline.pipeline.Factor.quantiles(self, bins) Construct a Classifier computing quantiles of the output of self.

Filter Methods

Methods that Create Filters

zipline.pipeline.Filter.__and__(other) Binary Operator: '&'
zipline.pipeline.Filter.__or__(other) Binary Operator: '|'
zipline.pipeline.Filter.__invert__() Unary Operator: '~'

Classifier Methods

Methods That Create Factors

zipline.pipeline.Classifier.peer_count() Construct a factor that gives the number of occurrences of each distinct category in a classifier.

Methods That Create Filters

zipline.pipeline.Classifier.isnull() A Filter producing True for values where this term has missing data.
zipline.pipeline.Classifier.notnull() A Filter producing True for values where this term has complete data.
zipline.pipeline.Classifier.eq(other) Construct a Filter returning True for asset/date pairs where the output of self matches other.
zipline.pipeline.Classifier.startswith(self, ...) Construct a Filter matching values starting with prefix.
zipline.pipeline.Classifier.endswith(self, ...) Construct a Filter matching values ending with suffix.
zipline.pipeline.Classifier.has_substring(...) Construct a Filter matching values containing substring.
zipline.pipeline.Classifier.matches(self, ...) Construct a Filter that checks regex matches against pattern.

Methods That Create Classifiers

zipline.pipeline.Classifier.relabel(self, ...) Convert self into a new classifier by mapping a function over each element produced by self.

Data

quantopian.pipeline.data.EquityPricing DataSet containing daily trading prices and volumes.
quantopian.pipeline.data.factset.Fundamentals DataSet containing fundamental data sourced from FactSet.
quantopian.pipeline.data.factset.EquityMetadata DataSet containing metadata about assets.
quantopian.pipeline.data.factset.RBICSFocus DataSet providing information about companies' areas of business focus.
quantopian.pipeline.data.factset.GeoRev DataSetFamily containing company revenue, broken down by source country or region.
quantopian.pipeline.data.factset.estimates.PeriodicConsensus DataSetFamily for quarterly, semi-annual, and annual consensus estimates.
quantopian.pipeline.data.factset.estimates.Actuals DataSetFamily for "actual" reports of estimated values.
quantopian.pipeline.data.factset.estimates.ConsensusRecommendations DataSet containing consensus broker recommendations.
quantopian.pipeline.data.factset.estimates.LongTermConsensus DataSetFamily for long term consensus estimates.
quantopian.pipeline.data.morningstar.Fundamentals DataSet containing fundamental data sourced from Morningstar.

Built-in Factors

quantopian.pipeline.factors.DailyReturns Calculates daily percent change in close price.
quantopian.pipeline.factors.Returns Calculates the percent change in close price over the given window_length.
quantopian.pipeline.factors.PercentChange Calculates the percent change over the given window_length.
quantopian.pipeline.factors.VWAP Volume Weighted Average Price
quantopian.pipeline.factors.AverageDollarVolume Average Daily Dollar Volume
quantopian.pipeline.factors.AnnualizedVolatility Annualized volatility.
quantopian.pipeline.factors.SimpleBeta Factor producing the slope of a regression line of each asset's daily returns against the daily returns of a single "target" asset.
quantopian.pipeline.factors.SimpleMovingAverage Average Value of an arbitrary column
quantopian.pipeline.factors.Latest Factor producing the most recently-known value of inputs[0] on each day.
quantopian.pipeline.factors.MaxDrawdown Max Drawdown
quantopian.pipeline.factors.RSI Relative Strength Index
quantopian.pipeline.factors.ExponentialWeightedMovingAverage Exponentially Weighted Moving Average
quantopian.pipeline.factors.ExponentialWeightedMovingStdDev Exponentially Weighted Moving Standard Deviation
quantopian.pipeline.factors.WeightedAverageValue Helper for VWAP-like computations.
quantopian.pipeline.factors.MovingAverageConvergenceDivergenceSignal Moving Average Convergence/Divergence (MACD) Signal line https://en.wikipedia.org/wiki/MACD
quantopian.pipeline.factors.RollingPearsonOfReturns Calculates the Pearson product-moment correlation coefficient of the returns of the given asset with the returns of all other assets.
quantopian.pipeline.factors.RollingSpearmanOfReturns Calculates the Spearman rank correlation coefficient of the returns of the given asset with the returns of all other assets.
quantopian.pipeline.factors.RollingLinearRegressionOfReturns Perform an ordinary least-squares regression predicting the returns of all other assets on the given asset.

Built-in Filters

quantopian.pipeline.filters.QTradableStocksUS() Create the trading universe used in the Quantopian Contest.
quantopian.pipeline.filters.Q500US([...]) A default universe containing approximately 500 US equities each day.
quantopian.pipeline.filters.Q1500US([...]) A default universe containing approximately 1500 US equities each day.
quantopian.pipeline.filters.Q3000US([...]) A default universe containing approximately 3000 US equities each day.
quantopian.pipeline.filters.make_us_equity_universe(...) Create a QUS-style universe filter.
quantopian.pipeline.filters.default_us_equity_universe_mask([...]) Create the base filter used to filter assets from the QUS filters.

Built-in Classifiers

quantopian.pipeline.classifiers.morningstar.Sector Classifier that groups assets by Morningstar Sector Code.
quantopian.pipeline.classifiers.morningstar.SuperSector Classifier that groups assets by Morningstar Super Sector.

Risk Model

Sector Loadings

quantopian.pipeline.experimental.BasicMaterials Quantopian Risk Model loadings for the basic materials sector.
quantopian.pipeline.experimental.ConsumerCyclical Quantopian Risk Model loadings for the consumer cyclical sector.
quantopian.pipeline.experimental.FinancialServices Quantopian Risk Model loadings for the financial services sector.
quantopian.pipeline.experimental.RealEstate Quantopian Risk Model loadings for the real estate sector.
quantopian.pipeline.experimental.ConsumerDefensive Quantopian Risk Model loadings for the consumer defensive sector.
quantopian.pipeline.experimental.HealthCare Quantopian Risk Model loadings for the health care sector.
quantopian.pipeline.experimental.Utilities Quantopian Risk Model loadings for the utilities sector.
quantopian.pipeline.experimental.CommunicationServices Quantopian Risk Model loadings for the communication services sector.
quantopian.pipeline.experimental.Energy Quantopian Risk Model loadings for the energy sector.
quantopian.pipeline.experimental.Industrials Quantopian Risk Model loadings for the industrials sector.
quantopian.pipeline.experimental.Technology Quantopian Risk Model loadings for the technology sector.

Style Loadings

quantopian.pipeline.experimental.Momentum Quantopian Risk Model loadings for the "momentum" style factor.
quantopian.pipeline.experimental.ShortTermReversal Quantopian Risk Model loadings for the "short term reversal" style factor.
quantopian.pipeline.experimental.Size Quantopian Risk Model loadings for the "size" style factor.
quantopian.pipeline.experimental.Value Quantopian Risk Model loadings for the "value" style factor.
quantopian.pipeline.experimental.Volatility Quantopian Risk Model loadings for the "volatility" style factor.

Domains

Country | Country Code | Pipeline Domain | Supported Exchanges
Austria | AT | AT_EQUITIES | Vienna Stock Exchange
Australia | AU | AU_EQUITIES | Australian Securities Exchange, National Stock Exchange of Australia
Belgium | BE | BE_EQUITIES | Euronext Brussels
Brazil | BR | BR_EQUITIES | Sao Paulo Stock Exchange
Canada | CA | CA_EQUITIES | Toronto Stock Exchange, TSX Venture Exchange, Canadian Securities Exchange
Chile | CL | CL_EQUITIES | Santiago Stock Exchange
China | CN | CN_EQUITIES | Shenzhen Stock Exchange, Shanghai Stock Exchange
Colombia | CO | CO_EQUITIES | Colombia Stock Exchange
Czech Republic | CZ | CZ_EQUITIES | Prague Stock Exchange
Denmark | DK | DK_EQUITIES | NASDAQ OMX Copenhagen
Finland | FI | FI_EQUITIES | NASDAQ OMX Helsinki
France | FR | FR_EQUITIES | Euronext Paris
Germany | DE | DE_EQUITIES | Berlin Stock Exchange, Dusseldorf Stock Exchange, XETRA, Frankfurt Stock Exchange, Hamburg Stock Exchange, Hannover Stock Exchange, Munich Stock Exchange, Stuttgart Stock Exchange, Xetra Indices
Great Britain | GB | GB_EQUITIES | London Stock Exchange, ICAP Securities & Derivatives Exchange, Cboe Europe Equities CXE
Greece | GR | GR_EQUITIES | Athens Exchange
Hong Kong | HK | HK_EQUITIES | Hong Kong Stock Exchange
Hungary | HU | HU_EQUITIES | Budapest Stock Exchange
India | IN | IN_EQUITIES | Bombay Stock Exchange, National Stock Exchange of India
Ireland | IE | IE_EQUITIES | Irish Stock Exchange, Irish Stock Exchange Bonds & Funds
Italy | IT | IT_EQUITIES | Milan Stock Exchange
Japan | JP | JP_EQUITIES | Tokyo Stock Exchange, JASDAQ, Osaka Exchange, Nagoya Stock Exchange, Fukuoka Stock Exchange, Sapporo Securities Exchange
Mexico | MX | MX_EQUITIES | Mexican Stock Exchange
Netherlands | NL | NL_EQUITIES | Euronext Amsterdam
New Zealand | NZ | NZ_EQUITIES | New Zealand Stock Exchange
Norway | NO | NO_EQUITIES | Oslo Exchange
Peru | PE | PE_EQUITIES | Lima Stock Exchange
Poland | PL | PL_EQUITIES | Warsaw Stock Exchange
Portugal | PT | PT_EQUITIES | Euronext Lisbon
Singapore | SG | SG_EQUITIES | Singapore Exchange
South Africa | ZA | ZA_EQUITIES | Johannesburg Securities Exchange
South Korea | KR | KR_EQUITIES | Korea Exchange, Korea KONEX
Spain | ES | ES_EQUITIES | Madrid Stock Exchange/Spanish Markets
Sweden | SE | SE_EQUITIES | NASDAQ OMX Stockholm, AktieTorget, Nordic Growth Market
Switzerland | CH | CH_EQUITIES | SIX Swiss Exchange, BX Swiss AG, Swiss Fund Data
Turkey | TR | TR_EQUITIES | Istanbul Stock Exchange
United States | US | US_EQUITIES | NYSE, NASDAQ, AMEX

Detailed Reference

Pipeline

class quantopian.pipeline.Pipeline(columns=None, screen=None, domain=GENERIC)

A Pipeline object represents a collection of named expressions to be compiled and executed by a PipelineEngine.

A Pipeline has two important attributes: 'columns', a dictionary of named Term instances, and 'screen', a Filter representing criteria for including an asset in the results of a Pipeline.

To compute a pipeline in the context of a TradingAlgorithm, users must call attach_pipeline in their initialize function to register that the pipeline should be computed each trading day. The most recent outputs of an attached pipeline can be retrieved by calling pipeline_output from handle_data, before_trading_start, or a scheduled function.

Parameters:
  • columns (dict[str, zipline.pipeline.ComputableTerm], optional) -- Initial columns for this pipeline, as a map from column name to expression.
  • screen (zipline.pipeline.Filter, optional) -- Initial screen for this pipeline.
  • domain (zipline.pipeline.domain.Domain, optional) -- Domain on which the pipeline should be computed. Defaults to GENERIC.

Note

The Pipeline class is defined in zipline.pipeline. It is re-exported on quantopian.pipeline to reduce the number of modules that need to be imported by users when working on Quantopian. Most code written on Quantopian should access Pipeline via quantopian.pipeline.
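
A minimal sketch of the attach_pipeline/pipeline_output workflow described above, assuming the standard quantopian.algorithm entry points (the pipeline name and column are illustrative):

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import EquityPricing

def initialize(context):
    # Register the pipeline so it is computed before each trading day.
    pipe = Pipeline(columns={'close': EquityPricing.close.latest})
    attach_pipeline(pipe, 'my_pipeline')

def before_trading_start(context, data):
    # Retrieve the most recent output as a DataFrame indexed by asset.
    context.pipeline_data = pipeline_output('my_pipeline')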

add(self, term, name, overwrite=False)

Add a column.

The results of computing term will show up as a column in the DataFrame produced by running this pipeline.

Parameters:
  • term (zipline.pipeline.Term) -- A Filter, Factor, or Classifier to add to the pipeline.
  • name (str) -- Name of the column to add.
  • overwrite (bool) -- Whether to overwrite the existing entry if we already have a column named name.
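
For example, a column could be added to an existing pipeline as follows (the factor and column name are illustrative):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage

pipe = Pipeline()
sma_10 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=10)
pipe.add(sma_10, 'sma_10')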
remove(self, name)

Remove a column.

Parameters:name (str) -- The name of the column to remove.
Raises:KeyError -- If name is not in self.columns.
Returns:removed -- The removed term.
Return type:zipline.pipeline.Term
set_screen(self, screen, overwrite=False)

Set a screen on this Pipeline.

Parameters:
  • filter (zipline.pipeline.Filter) -- The filter to apply as a screen.
  • overwrite (bool) -- Whether to overwrite any existing screen. If overwrite is False and self.screen is not None, we raise an error.
show_graph(self, format='svg')

Render this Pipeline as a DAG.

Parameters:format ({'svg', 'png', 'jpeg'}) -- Image format to render with. Default is 'svg'.
columns

The output columns of this pipeline.

Returns:columns -- Map from column name to expression computing that column's output.
Return type:dict[str, zipline.pipeline.ComputableTerm]
screen

The screen of this pipeline.

Returns:screen -- Term defining the screen for this pipeline. If screen is a filter, rows that do not pass the filter (i.e., rows for which the filter computed False) will be dropped from the output of this pipeline before returning results.
Return type:zipline.pipeline.Filter or None

Notes

Setting a screen on a Pipeline does not change the values produced for any rows: it only affects whether a given row is returned. Computing a pipeline with a screen is logically equivalent to computing the pipeline without the screen and then, as a post-processing-step, filtering out any rows for which the screen computed False.
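
For example, a rough sketch of a pipeline that only returns rows for the 500 highest dollar-volume assets each day (the factor and threshold are illustrative):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import AverageDollarVolume

dollar_volume = AverageDollarVolume(window_length=30)
pipe = Pipeline(
    columns={'dollar_volume': dollar_volume},
    screen=dollar_volume.top(500),
)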

Base Classes

class quantopian.pipeline.CustomFactor

Base class for user-defined Factors.

Parameters:
  • inputs (iterable, optional) -- An iterable of BoundColumn instances (e.g. USEquityPricing.close), describing the data to load and pass to self.compute. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named inputs.
  • outputs (iterable[str], optional) -- An iterable of strings which represent the names of each output this factor should compute and return. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named outputs.
  • window_length (int, optional) -- Number of rows to pass for each input. If this argument is not passed to the CustomFactor constructor, we look for a class-level attribute named window_length.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing the assets on which we should compute each day. Each call to CustomFactor.compute will only receive assets for which mask produced True on the day for which compute is being called.

Notes

Users implementing their own Factors should subclass CustomFactor and implement a method named compute with the following signature:

def compute(self, today, assets, out, *inputs):
   ...

On each simulation date, compute will be called with the current date, an array of sids, an output array, and an input array for each expression passed as inputs to the CustomFactor constructor.

The specific types of the values passed to compute are as follows:

today : np.datetime64[ns]
    Row label for the last row of all arrays passed as `inputs`.
assets : np.array[int64, ndim=1]
    Column labels for `out` and `inputs`.
out : np.array[self.dtype, ndim=1]
    Output array of the same shape as `assets`.  `compute` should write
    its desired return values into `out`. If multiple outputs are
    specified, `compute` should write its desired return values into
    `out.<output_name>` for each output name in `self.outputs`.
*inputs : tuple of np.array
    Raw data arrays corresponding to the values of `self.inputs`.

compute functions should expect to be passed NaN values for dates on which no data was available for an asset. This may include dates on which an asset did not yet exist.

For example, if a CustomFactor requires 10 rows of close price data, and asset A started trading on Monday June 2nd, 2014, then on Tuesday, June 3rd, 2014, the column of input data for asset A will have 9 leading NaNs for the preceding days on which data was not yet available.

Examples

A CustomFactor with pre-declared defaults:

class TenDayRange(CustomFactor):
    """
    Computes the difference between the highest high in the last 10
    days and the lowest low.

    Pre-declares high and low as default inputs and `window_length` as
    10.
    """

    inputs = [USEquityPricing.high, USEquityPricing.low]
    window_length = 10

    def compute(self, today, assets, out, highs, lows):
        from numpy import nanmin, nanmax

        highest_highs = nanmax(highs, axis=0)
        lowest_lows = nanmin(lows, axis=0)
        out[:] = highest_highs - lowest_lows

# Doesn't require passing inputs or window_length because they're
# pre-declared as defaults for the TenDayRange class.
ten_day_range = TenDayRange()

A CustomFactor without defaults:

class MedianValue(CustomFactor):
    """
    Computes the median value of an arbitrary single input over an
    arbitrary window.

    Does not declare any defaults, so values for `window_length` and
    `inputs` must be passed explicitly on every construction.
    """

    def compute(self, today, assets, out, data):
        from numpy import nanmedian
        out[:] = nanmedian(data, axis=0)

# Values for `inputs` and `window_length` must be passed explicitly to
# MedianValue.
median_close10 = MedianValue([USEquityPricing.close], window_length=10)
median_low15 = MedianValue([USEquityPricing.low], window_length=15)

A CustomFactor with multiple outputs:

class MultipleOutputs(CustomFactor):
    inputs = [USEquityPricing.close]
    outputs = ['alpha', 'beta']
    window_length = N

    def compute(self, today, assets, out, close):
        computed_alpha, computed_beta = some_function(close)
        out.alpha[:] = computed_alpha
        out.beta[:] = computed_beta

# Each output is returned as its own Factor upon instantiation.
alpha, beta = MultipleOutputs()

# Equivalently, we can create a single factor instance and access each
# output as an attribute of that instance.
multiple_outputs = MultipleOutputs()
alpha = multiple_outputs.alpha
beta = multiple_outputs.beta

Note: If a CustomFactor has multiple outputs, all outputs must have the same dtype. For instance, in the example above, if alpha is a float then beta must also be a float.

Note

The CustomFactor class is defined in zipline.pipeline. It is re-exported on quantopian.pipeline to reduce the number of modules that need to be imported by users when working on Quantopian. Most code written on Quantopian should access CustomFactor via quantopian.pipeline.

class quantopian.pipeline.CustomFilter

Base class for user-defined Filters.

Parameters:
  • inputs (iterable, optional) -- An iterable of BoundColumn instances (e.g. USEquityPricing.close), describing the data to load and pass to self.compute. If this argument is not passed to the CustomFilter constructor, we look for a class-level attribute named inputs.
  • window_length (int, optional) -- Number of rows to pass for each input. If this argument is not passed to the CustomFilter constructor, we look for a class-level attribute named window_length.

Notes

Users implementing their own Filters should subclass CustomFilter and implement a method named compute with the following signature:

def compute(self, today, assets, out, *inputs):
   ...

On each simulation date, compute will be called with the current date, an array of sids, an output array, and an input array for each expression passed as inputs to the CustomFilter constructor.

The specific types of the values passed to compute are as follows:

today : np.datetime64[ns]
    Row label for the last row of all arrays passed as `inputs`.
assets : np.array[int64, ndim=1]
    Column labels for `out` and `inputs`.
out : np.array[bool, ndim=1]
    Output array of the same shape as `assets`.  `compute` should write
    its desired return values into `out`.
*inputs : tuple of np.array
    Raw data arrays corresponding to the values of `self.inputs`.

See the documentation for CustomFactor for more details on implementing a custom compute method.
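
As a rough sketch (the class name and logic are illustrative, not part of the API), a CustomFilter that passes assets whose latest close is above the close from the prior day might look like:

from quantopian.pipeline import CustomFilter
from quantopian.pipeline.data import USEquityPricing

class PriceIncreased(CustomFilter):
    """Passes assets whose most recent close is above the prior close."""
    inputs = [USEquityPricing.close]
    window_length = 2

    def compute(self, today, assets, out, close):
        # close has shape (window_length, len(assets)); compare last row to first.
        out[:] = close[-1] > close[0]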

Note

The CustomFilter class is defined in zipline.pipeline. It is re-exported on quantopian.pipeline to reduce the number of modules that need to be imported by users when working on Quantopian. Most code written on Quantopian should access CustomFilter via quantopian.pipeline.

class zipline.pipeline.Term

Base class for objects that can appear in the compute graph of a zipline.pipeline.Pipeline.

Notes

Most Pipeline API users only interact with Term via its subclasses, such as Factor, Filter, Classifier, and BoundColumn.

Instances of Term are memoized. If you call a Term's constructor with the same arguments twice, the same object will be returned from both calls:

Example:

>>> from zipline.pipeline.data import EquityPricing
>>> from zipline.pipeline.factors import SimpleMovingAverage
>>> x = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=5)
>>> y = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=5)
>>> x is y
True

Warning

Memoization of terms means that it's generally unsafe to modify attributes of a term after construction.

inputs

A tuple of other Terms needed as inputs for self.

class zipline.pipeline.ComputableTerm

A Term that should be computed from a tuple of inputs.

This is the base class for zipline.pipeline.Factor, zipline.pipeline.Filter, and zipline.pipeline.Classifier.

class zipline.pipeline.LoadableTerm

A Term that should be loaded from an external resource by a PipelineLoader.

This is the base class for zipline.pipeline.data.BoundColumn.

class zipline.pipeline.Factor

Pipeline API expression producing a numerical or date-valued output.

Factors are the most commonly-used Pipeline term, representing the result of any computation producing a numerical result.

Factors can be combined, both with other Factors and with scalar values, via any of the built-in mathematical operators (+, -, *, etc.).

This makes it easy to write complex expressions that combine multiple Factors. For example, constructing a Factor that computes the average of two other Factors is simply:

>>> f1 = SomeFactor(...)  
>>> f2 = SomeOtherFactor(...)  
>>> average = (f1 + f2) / 2.0  

Factors can also be converted into zipline.pipeline.Filter objects via comparison operators: (<, <=, !=, eq, >, >=).

There are many natural operators defined on Factors besides the basic numerical operators. These include methods for identifying missing or extreme-valued outputs (isnull(), notnull(), isnan(), notnan()), methods for normalizing outputs (rank(), demean(), zscore()), and methods for constructing Filters based on rank-order properties of results (top(), bottom(), percentile_between()).
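
For instance, a simple momentum-style expression can be normalized and turned into a Filter in a few lines (the factor and window choices are illustrative):

from quantopian.pipeline.factors import Returns

returns_1m = Returns(window_length=21)
returns_z = returns_1m.zscore()    # normalize each day's row
top_momentum = returns_z.top(100)  # Filter: 100 highest z-scores each day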

rank(method='ordinal', ascending=True, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a new Factor representing the sorted rank of each column within each row.

Parameters:
  • method (str, {'ordinal', 'min', 'max', 'dense', 'average'}) -- The method used to assign ranks to tied elements. See scipy.stats.rankdata for a full description of the semantics for each ranking method. Default is 'ordinal'.
  • ascending (bool, optional) -- Whether to return sorted rank in ascending or descending order. Default is True.
  • mask (zipline.pipeline.Filter, optional) -- A Filter representing assets to consider when computing ranks. If mask is supplied, ranks are computed ignoring any asset/date pairs for which mask produces a value of False.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to perform ranking.
Returns:

ranks -- A new factor that will compute the ranking of the data produced by self.

Return type:

zipline.pipeline.Factor

Notes

The default value for method is different from the default for scipy.stats.rankdata. See that function's documentation for a full description of the valid inputs to method.

Missing or non-existent data on a given day will cause an asset to be given a rank of NaN for that day.
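
A brief illustrative example (the window lengths and liquidity mask are arbitrary choices):

from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import AverageDollarVolume, SimpleMovingAverage

liquid = AverageDollarVolume(window_length=30).top(1000)
sma_20 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=20)
# Rank highest values first, considering only assets that pass the liquidity filter.
sma_rank = sma_20.rank(ascending=False, mask=liquid)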

demean(self, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a Factor that computes self and subtracts the mean from each row of the result.

If mask is supplied, ignore values where mask returns False when computing row means, and output NaN anywhere the mask is False.

If groupby is supplied, compute by partitioning each row based on the values produced by groupby, de-meaning the partitioned arrays, and stitching the sub-results back together.

Parameters:
  • mask (zipline.pipeline.Filter, optional) -- A Filter defining values to ignore when computing row means.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to compute means.

Examples

Let f be a Factor which would produce the following output:

             AAPL   MSFT    MCD     BK
2017-03-13    1.0    2.0    3.0    4.0
2017-03-14    1.5    2.5    3.5    1.0
2017-03-15    2.0    3.0    4.0    1.5
2017-03-16    2.5    3.5    1.0    2.0

Let c be a Classifier producing the following output:

             AAPL   MSFT    MCD     BK
2017-03-13      1      1      2      2
2017-03-14      1      1      2      2
2017-03-15      1      1      2      2
2017-03-16      1      1      2      2

Let m be a Filter producing the following output:

             AAPL   MSFT    MCD     BK
2017-03-13  False   True   True   True
2017-03-14   True  False   True   True
2017-03-15   True   True  False   True
2017-03-16   True   True   True  False

Then f.demean() will subtract the mean from each row produced by f.

             AAPL   MSFT    MCD     BK
2017-03-13 -1.500 -0.500  0.500  1.500
2017-03-14 -0.625  0.375  1.375 -1.125
2017-03-15 -0.625  0.375  1.375 -1.125
2017-03-16  0.250  1.250 -1.250 -0.250

f.demean(mask=m) will subtract the mean from each row, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output. Diagonal values are ignored because they are the locations where the mask m produced False.

             AAPL   MSFT    MCD     BK
2017-03-13    NaN -1.000  0.000  1.000
2017-03-14 -0.500    NaN  1.500 -1.000
2017-03-15 -0.166  0.833    NaN -0.666
2017-03-16  0.166  1.166 -1.333    NaN

f.demean(groupby=c) will subtract the group-mean of AAPL/MSFT and MCD/BK from their respective entries. The AAPL/MSFT are grouped together because both assets always produce 1 in the output of the classifier c. Similarly, MCD/BK are grouped together because they always produce 2.

             AAPL   MSFT    MCD     BK
2017-03-13 -0.500  0.500 -0.500  0.500
2017-03-14 -0.500  0.500  1.250 -1.250
2017-03-15 -0.500  0.500  1.250 -1.250
2017-03-16 -0.500  0.500 -0.500  0.500

f.demean(mask=m, groupby=c) will also subtract the group-mean of AAPL/MSFT and MCD/BK, but means will be calculated ignoring values on the diagonal, and NaNs will be written to the diagonal in the output.

             AAPL   MSFT    MCD     BK
2017-03-13    NaN  0.000 -0.500  0.500
2017-03-14  0.000    NaN  1.250 -1.250
2017-03-15 -0.500  0.500    NaN  0.000
2017-03-16 -0.500  0.500  0.000    NaN

Notes

Mean is sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:

>>> base = MyFactor(...)  
>>> normalized = base.demean(
...     mask=base.percentile_between(1, 99),
... )  

demean() is only supported on Factors of dtype float64.

zscore(self, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a Factor that Z-Scores each day's results.

The Z-Score of a row is defined as:

(row - row.mean()) / row.stddev()

If mask is supplied, ignore values where mask returns False when computing row means and standard deviations, and output NaN anywhere the mask is False.

If groupby is supplied, compute by partitioning each row based on the values produced by groupby, z-scoring the partitioned arrays, and stitching the sub-results back together.

Parameters:
  • mask (zipline.pipeline.Filter, optional) -- A Filter defining values to ignore when computing Z-Scores.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to compute Z-Scores.
Returns:

zscored -- A Factor producing the Z-Scored output of self.

Return type:

zipline.pipeline.Factor

Notes

Mean and standard deviation are sensitive to the magnitudes of outliers. When working with a factor that can potentially produce large outliers, it is often useful to use the mask parameter to discard values at the extremes of the distribution:

>>> base = MyFactor(...)  
>>> normalized = base.zscore(
...    mask=base.percentile_between(1, 99),
... )  

zscore() is only supported on Factors of dtype float64.

Examples

See demean() for an in-depth example of the semantics for mask and groupby.

pearsonr(self, target, correlation_length, mask=sentinel('NotSpecified'))

Construct a new Factor that computes rolling pearson correlation coefficients between target and the columns of self.

Parameters:
  • target (zipline.pipeline.Term) -- The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
  • correlation_length (int) -- Length of the lookback window over which to compute each correlation coefficient.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should have their correlation with the target slice computed each day.
Returns:

correlations -- A new Factor that will compute correlations between target and the columns of self.

Return type:

zipline.pipeline.Factor

Notes

This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn instances, Returns, and any factors created from rank() or zscore().

Examples

Suppose we want to create a factor that computes the correlation between AAPL's 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_correlations = returns.pearsonr(
    target=returns_slice, correlation_length=30,
)

This is equivalent to doing:

aapl_correlations = RollingPearsonOfReturns(
    target=sid(24), returns_length=10, correlation_length=30,
)
spearmanr(self, target, correlation_length, mask=sentinel('NotSpecified'))

Construct a new Factor that computes rolling spearman rank correlation coefficients between target and the columns of self.

Parameters:
  • target (zipline.pipeline.Term) -- The term used to compute correlations against each column of data produced by self. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, correlations are computed asset-wise.
  • correlation_length (int) -- Length of the lookback window over which to compute each correlation coefficient.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should have their correlation with the target slice computed each day.
Returns:

correlations -- A new Factor that will compute correlations between target and the columns of self.

Return type:

zipline.pipeline.Factor

Notes

This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn instances, Returns, and any factors created from rank() or zscore().

Examples

Suppose we want to create a factor that computes the correlation between AAPL's 10-day returns and the 10-day returns of all other assets, computing each correlation over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_correlations = returns.spearmanr(
    target=returns_slice, correlation_length=30,
)

This is equivalent to doing:

aapl_correlations = RollingSpearmanOfReturns(
    target=sid(24), returns_length=10, correlation_length=30,
)
linear_regression(self, target, regression_length, mask=sentinel('NotSpecified'))

Construct a new Factor that performs an ordinary least-squares regression predicting the columns of self from target.

Parameters:
  • target (zipline.pipeline.Term) -- The term to use as the predictor/independent variable in each regression. This may be a Factor, a BoundColumn or a Slice. If target is two-dimensional, regressions are computed asset-wise.
  • regression_length (int) -- Length of the lookback window over which to compute each regression.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should be regressed with the target slice each day.
Returns:

regressions -- A new Factor that will compute linear regressions of target against the columns of self.

Return type:

zipline.pipeline.Factor

Notes

This method can only be called on expressions which are deemed safe for use as inputs to windowed Factor objects. Examples of such expressions include BoundColumn instances, Returns, and any factors created from rank() or zscore().

Examples

Suppose we want to create a factor that regresses AAPL's 10-day returns against the 10-day returns of all other assets, computing each regression over 30 days. This can be achieved by doing the following:

returns = Returns(window_length=10)
returns_slice = returns[sid(24)]
aapl_regressions = returns.linear_regression(
    target=returns_slice, regression_length=30,
)

This is equivalent to doing:

aapl_regressions = RollingLinearRegressionOfReturns(
    target=sid(24), returns_length=10, regression_length=30,
)
winsorize(self, min_percentile, max_percentile, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a new factor that winsorizes the result of this factor.

Winsorizing changes values ranked less than the minimum percentile to the value at the minimum percentile. Similarly, values ranking above the maximum percentile are changed to the value at the maximum percentile.

Winsorizing is useful for limiting the impact of extreme data points without completely removing those points.

If mask is supplied, ignore values where mask returns False when computing percentile cutoffs, and output NaN anywhere the mask is False.

If groupby is supplied, winsorization is applied separately to each group defined by groupby.

Parameters:
  • min_percentile (float, int) -- Entries with values at or below this percentile will be replaced with the (len(input) * min_percentile)th lowest value. If low values should not be clipped, use 0.
  • max_percentile (float, int) -- Entries with values at or above this percentile will be replaced with the (len(input) * max_percentile)th lowest value. If high values should not be clipped, use 1.
  • mask (zipline.pipeline.Filter, optional) -- A Filter defining values to ignore when winsorizing.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to winsorize.
Returns:

winsorized -- A Factor producing a winsorized version of self.

Return type:

zipline.pipeline.Factor

Examples

price = USEquityPricing.close.latest
columns={
    'PRICE': price,
    'WINSOR_1': price.winsorize(
        min_percentile=0.25, max_percentile=0.75
    ),
    'WINSOR_2': price.winsorize(
        min_percentile=0.50, max_percentile=1.0
    ),
    'WINSOR_3': price.winsorize(
        min_percentile=0.0, max_percentile=0.5
    ),

}

Given a pipeline with the columns defined above, the result for a given day could look like:

        'PRICE' 'WINSOR_1' 'WINSOR_2' 'WINSOR_3'
Asset_1    1        2          4          3
Asset_2    2        2          4          3
Asset_3    3        3          4          3
Asset_4    4        4          4          4
Asset_5    5        5          5          4
Asset_6    6        5          5          4
downsample(self, frequency)

Make a term that computes from self at lower-than-daily frequency.

Parameters:frequency ({'year_start', 'quarter_start', 'month_start', 'week_start'}) --

A string indicating desired sampling dates:

  • 'year_start' -> first trading day of each year
  • 'quarter_start' -> first trading day of January, April, July, October
  • 'month_start' -> first trading day of each month
  • 'week_start' -> first trading_day of each week
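
For example (the wrapped factor is illustrative), an expensive factor can be recomputed only on the first trading day of each month, with its value held constant in between:

from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage

sma_200 = SimpleMovingAverage(inputs=[EquityPricing.close], window_length=200)
monthly_sma_200 = sma_200.downsample('month_start')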
sin()

Construct a Factor that computes sin() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
cos()

Construct a Factor that computes cos() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
tan()

Construct a Factor that computes tan() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arcsin()

Construct a Factor that computes arcsin() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arccos()

Construct a Factor that computes arccos() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arctan()

Construct a Factor that computes arctan() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
sinh()

Construct a Factor that computes sinh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
cosh()

Construct a Factor that computes cosh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
tanh()

Construct a Factor that computes tanh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arcsinh()

Construct a Factor that computes arcsinh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arccosh()

Construct a Factor that computes arccosh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
arctanh()

Construct a Factor that computes arctanh() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
log()

Construct a Factor that computes log() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
log10()

Construct a Factor that computes log10() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
log1p()

Construct a Factor that computes log1p() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
exp()

Construct a Factor that computes exp() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
expm1()

Construct a Factor that computes expm1() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
sqrt()

Construct a Factor that computes sqrt() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
abs()

Construct a Factor that computes abs() on each output of self.

Returns:factor
Return type:zipline.pipeline.Factor
eq(self, other)

Construct a Filter computing self == other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self == other with the outputs of self and other.
Return type:zipline.pipeline.Filter
top(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a Filter matching the top N asset values of self each day.

If groupby is supplied, returns a Filter matching the top N asset values for each group.

Parameters:
  • N (int) -- Number of assets passing the returned filter each day.
  • mask (zipline.pipeline.Filter, optional) -- A Filter representing assets to consider when computing ranks. If mask is supplied, top values are computed ignoring any asset/date pairs for which mask produces a value of False.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to perform ranking.
Returns:

filter

Return type:

zipline.pipeline.Filter

bottom(N, mask=sentinel('NotSpecified'), groupby=sentinel('NotSpecified'))

Construct a Filter matching the bottom N asset values of self each day.

If groupby is supplied, returns a Filter matching the bottom N asset values for each group defined by groupby.

Parameters:
  • N (int) -- Number of assets passing the returned filter each day.
  • mask (zipline.pipeline.Filter, optional) -- A Filter representing assets to consider when computing ranks. If mask is supplied, bottom values are computed ignoring any asset/date pairs for which mask produces a value of False.
  • groupby (zipline.pipeline.Classifier, optional) -- A classifier defining partitions over which to perform ranking.
Returns:

filter

Return type:

zipline.pipeline.Filter
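
A brief sketch combining top() and bottom() (the factor, counts, and grouping are illustrative):

from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.factors import Returns

returns_1m = Returns(window_length=21)
winners = returns_1m.top(50)    # 50 best performers each day
losers = returns_1m.bottom(50)  # 50 worst performers each day
# Top 5 performers within each Morningstar sector.
sector_winners = returns_1m.top(5, groupby=Sector())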

isnull()

A Filter producing True for values where this Factor has missing data.

Equivalent to self.isnan() when self.dtype is float64. Otherwise equivalent to self.eq(self.missing_value).

Returns:filter
Return type:zipline.pipeline.Filter
notnull()

A Filter producing True for values where this Factor has complete data.

Equivalent to ~self.isnan() when self.dtype is float64. Otherwise equivalent to (self != self.missing_value).

isnan(self)

A Filter producing True for all values where this Factor is NaN.

Returns:nanfilter
Return type:zipline.pipeline.Filter
notnan(self)

A Filter producing True for values where this Factor is not NaN.

Returns:nanfilter
Return type:zipline.pipeline.Filter
isfinite(self)

A Filter producing True for values where this Factor is anything but NaN, inf, or -inf.

percentile_between(min_percentile, max_percentile, mask=sentinel('NotSpecified'))

Construct a Filter matching values of self that fall within the range defined by min_percentile and max_percentile.

Parameters:
  • min_percentile (float [0.0, 100.0]) -- Return True for assets falling above this percentile in the data.
  • max_percentile (float [0.0, 100.0]) -- Return True for assets falling below this percentile in the data.
  • mask (zipline.pipeline.Filter, optional) -- A Filter representing assets to consider when calculating percentile thresholds. If mask is supplied, percentile cutoffs are computed each day using only assets for which mask returns True. Assets for which mask produces False will produce False in the output of this Factor as well.
Returns:

out -- A new filter that will compute the specified percentile-range mask.

Return type:

zipline.pipeline.Filter
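
For example (the window length is arbitrary), a Filter for assets whose 10-day returns fall in the middle half of each day's distribution:

from quantopian.pipeline.factors import Returns

returns_10 = Returns(window_length=10)
middle_half = returns_10.percentile_between(25, 75)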

quartiles(self, mask=sentinel('NotSpecified'))

Construct a Classifier computing quartiles over the output of self.

Every non-NaN data point in the output is labelled with a value of either 0, 1, 2, or 3, corresponding to the first, second, third, or fourth quartile over each row. NaN data points are labelled with -1.

If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.

Parameters:mask (zipline.pipeline.Filter, optional) -- Mask of values to ignore when computing quartiles.
Returns:quartiles -- A classifier producing integer labels ranging from 0 to 3.
Return type:zipline.pipeline.Classifier
quintiles(self, mask=sentinel('NotSpecified'))

Construct a Classifier computing quintile labels on self.

Every non-NaN data point in the output is labelled with a value of 0, 1, 2, 3, or 4, corresponding to quintiles over each row. NaN data points are labelled with -1.

If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.

Parameters:mask (zipline.pipeline.Filter, optional) -- Mask of values to ignore when computing quintiles.
Returns:quintiles -- A classifier producing integer labels ranging from 0 to 4.
Return type:zipline.pipeline.Classifier
deciles(self, mask=sentinel('NotSpecified'))

Construct a Classifier computing decile labels on self.

Every non-NaN data point in the output is labelled with a value from 0 to 9, corresponding to deciles over each row. NaN data points are labelled with -1.

If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.

Parameters:mask (zipline.pipeline.Filter, optional) -- Mask of values to ignore when computing deciles.
Returns:deciles -- A classifier producing integer labels ranging from 0 to 9.
Return type:zipline.pipeline.Classifier
quantiles(self, bins, mask=sentinel('NotSpecified'))

Construct a Classifier computing quantiles of the output of self.

Every non-NaN data point in the output is labelled with an integer value from 0 to (bins - 1). NaNs are labelled with -1.

If mask is supplied, ignore data points in locations for which mask produces False, and emit a label of -1 at those locations.

Parameters:
  • bins (int) -- Number of bin labels to compute.
  • mask (zipline.pipeline.Filter, optional) -- Mask of values to ignore when computing quantiles.
Returns:

quantiles -- A classifier producing integer labels ranging from 0 to (bins - 1).

Return type:

zipline.pipeline.Classifier
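
A short sketch of the quantile-style classifiers (the factor choice is illustrative); the integer labels they produce can be compared against a specific bucket or used as a groupby key:

from quantopian.pipeline.factors import AverageDollarVolume

adv = AverageDollarVolume(window_length=30)
adv_deciles = adv.deciles()           # labels 0-9, -1 for missing data
adv_buckets = adv.quantiles(bins=20)  # labels 0-19
top_decile = adv_deciles.eq(9)        # Filter matching the top decile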

__add__(self, other)

Construct a Factor computing self + other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self + other with outputs of self and other.
Return type:zipline.pipeline.Factor
__sub__(self, other)

Construct a Factor computing self - other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self - other with outputs of self and other.
Return type:zipline.pipeline.Factor
__mul__(self, other)

Construct a Factor computing self * other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self * other with outputs of self and other.
Return type:zipline.pipeline.Factor
__div__(self, other)

Construct a Factor computing self / other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self / other with outputs of self and other.
Return type:zipline.pipeline.Factor
__mod__(self, other)

Construct a Factor computing self % other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self % other with outputs of self and other.
Return type:zipline.pipeline.Factor
__pow__(self, other)

Construct a Factor computing self ** other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:factor -- Factor computing self ** other with outputs of self and other.
Return type:zipline.pipeline.Factor
__lt__(self, other)

Construct a Filter computing self < other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self < other with the outputs of self and other.
Return type:zipline.pipeline.Filter
__le__(self, other)

Construct a Filter computing self <= other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self <= other with the outputs of self and other.
Return type:zipline.pipeline.Filter
__ne__(self, other)

Construct a Filter computing self != other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self != other with the outputs of self and other.
Return type:zipline.pipeline.Filter
__ge__(self, other)

Construct a Filter computing self >= other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self >= other with the outputs of self and other.
Return type:zipline.pipeline.Filter
__gt__(self, other)

Construct a Filter computing self > other.

Parameters:other (zipline.pipeline.Factor, float) -- Right-hand side of the expression.
Returns:filter -- Filter computing self > other with the outputs of self and other.
Return type:zipline.pipeline.Filter
class zipline.pipeline.Filter

Pipeline expression computing a boolean output.

Filters are most commonly useful for describing sets of assets to include or exclude for some particular purpose. Many Pipeline API functions accept a mask argument, which can be supplied a Filter indicating that only values passing the Filter should be considered when performing the requested computation. For example, zipline.pipeline.Factor.top() accepts a mask indicating that ranks should be computed only on assets that passed the specified Filter.

The most common way to construct a Filter is via one of the comparison operators (<, <=, !=, eq, >, >=) of Factor. For example, a natural way to construct a Filter for stocks with a 10-day VWAP less than $20.0 is to first construct a Factor computing 10-day VWAP and compare it to the scalar value 20.0:

>>> from zipline.pipeline.factors import VWAP
>>> vwap_10 = VWAP(window_length=10)
>>> vwaps_under_20 = (vwap_10 <= 20)

Filters can also be constructed via comparisons between two Factors. For example, to construct a Filter producing True for asset/date pairs where the asset's 10-day VWAP was greater than its 30-day VWAP:

>>> short_vwap = VWAP(window_length=10)
>>> long_vwap = VWAP(window_length=30)
>>> higher_short_vwap = (short_vwap > long_vwap)

Filters can be combined via the & (and) and | (or) operators.

&-ing together two filters produces a new Filter that produces True if both of the inputs produced True.

|-ing together two filters produces a new Filter that produces True if either of its inputs produced True.

The ~ operator can be used to invert a Filter, swapping all True values with Falses and vice-versa.

Filters may be set as the screen attribute of a Pipeline, indicating that asset/date pairs for which the filter produces False should be excluded from the Pipeline's output. This is useful both for reducing noise in the output of a Pipeline and for reducing memory consumption of Pipeline results.
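
Continuing the VWAP example above, the boolean operators and the screen attribute compose naturally (the thresholds are arbitrary):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import VWAP, AverageDollarVolume

vwap_10 = VWAP(window_length=10)
liquid = AverageDollarVolume(window_length=30) > 1e7
cheap_and_liquid = (vwap_10 <= 20) & liquid
pipe = Pipeline(
    columns={'vwap_10': vwap_10},
    screen=cheap_and_liquid,
)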

__and__(other)

Binary Operator: '&'

__or__(other)

Binary Operator: '|'

__invert__()

Unary Operator: '~'

class zipline.pipeline.Classifier

A Pipeline expression computing a categorical output.

Classifiers are most commonly useful for describing grouping keys for complex transformations on Factor outputs. For example, Factor.demean() and Factor.zscore() can be passed a Classifier in their groupby argument, indicating that means/standard deviations should be computed on assets for which the classifier produced the same label.
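
For example (the classifier and window are illustrative), sector-neutral returns can be computed by demeaning within each sector group:

from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.factors import Returns

returns_1m = Returns(window_length=21)
sector_neutral_returns = returns_1m.demean(groupby=Sector())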

element_of(choices)

Construct a Filter indicating whether values are in choices.

Parameters:choices (iterable[str or int]) -- An iterable of choices.
Returns:matches -- Filter returning True for all sid/date pairs for which self produces an entry in choices.
Return type:Filter
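
For example, a minimal sketch using the Sector classifier documented below (101 and 309 are the Basic Materials and Energy sector codes):

>>> from quantopian.pipeline.classifiers.morningstar import Sector
>>> in_materials_or_energy = Sector().element_of([101, 309])
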
endswith(self, suffix)

Construct a Filter matching values ending with suffix.

Parameters:suffix (str) -- String suffix against which to compare values produced by self.
Returns:matches -- Filter returning True for all sid/date pairs for which self produces a string ending with suffix.
Return type:Filter
eq(other)

Construct a Filter returning True for asset/date pairs where the output of self matches other.

has_substring(self, substring)

Construct a Filter matching values containing substring.

Parameters:substring (str) -- Sub-string against which to compare values produced by self.
Returns:matches -- Filter returning True for all sid/date pairs for which self produces a string containing substring.
Return type:Filter
isnull()

A Filter producing True for values where this term has missing data.

matches(self, pattern)

Construct a Filter that checks regex matches against pattern.

Parameters:pattern (str) -- Regex pattern against which to compare values produced by self.
Returns:matches -- Filter returning True for all sid/date pairs for which self produces a string matched by pattern.
Return type:Filter
notnull()

A Filter producing True for values where this term has complete data.

peer_count()

Construct a factor that gives the number of occurrences of each distinct category in a classifier.

Examples

Let c be a Classifier which would produce the following output:

             AAPL   MSFT    MCD     BK   AMZN     FB
2015-05-05    'a'    'a'   None    'b'    'a'   None
2015-05-06    'b'    'a'    'c'    'b'    'b'    'b'
2015-05-07   None    'a'   'aa'   'aa'   'aa'   None
2015-05-08    'c'    'c'    'c'    'c'    'c'    'c'

Then c.peer_count() will count, for each row, the total number of assets in each classifier category produced by c. Missing data will be evaluated to NaN.

             AAPL   MSFT    MCD     BK   AMZN     FB
2015-05-05    3.0    3.0    NaN    1.0    3.0    NaN
2015-05-06    4.0    1.0    1.0    4.0    4.0    4.0
2015-05-07    NaN    1.0    3.0    3.0    3.0    NaN
2015-05-08    6.0    6.0    6.0    6.0    6.0    6.0
Returns:factor -- A CustomFactor that counts, for each asset, the total number of assets with the same classifier category label.
Return type:CustomFactor
relabel(self, relabeler)

Convert self into a new classifier by mapping a function over each element produced by self.

Parameters:relabeler (function[str -> str or None]) -- A function to apply to each unique value produced by self.
Returns:relabeled -- A classifier produced by applying relabeler to each unique value produced by self.
Return type:Classifier
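
For example, a minimal sketch (reusing the hypothetical classifier c from the peer_count example above) that keeps only the first character of each label:

>>> first_letters = c.relabel(lambda label: label[0])
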
startswith(self, prefix)

Construct a Filter matching values starting with prefix.

Parameters:prefix (str) -- String prefix against which to compare values produced by self.
Returns:matches -- Filter returning True for all sid/date pairs for which self produces a string starting with prefix.
Return type:Filter
class zipline.pipeline.data.DataSet

Base class for Pipeline datasets.

A DataSet is defined by two parts:

  1. A collection of Column objects that describe the queryable attributes of the dataset.
  2. A Domain describing the assets and calendar of the data represented by the DataSet.

To create a new Pipeline dataset, define a subclass of DataSet and set one or more Column objects as class-level attributes. Each column requires a np.dtype that describes the type of data that should be produced by a loader for the dataset. Integer columns must also provide a "missing value" to be used when no value is available for a given asset/date combination.

By default, the domain of a dataset is the special singleton value, GENERIC, which means that the dataset can be used in a Pipeline running on any domain.

In some cases, it may be preferable to restrict a dataset to support only a single domain. For example, a DataSet may describe data from a vendor that only covers the US. To restrict a dataset to a specific domain, define a domain attribute at class scope.

You can also define a domain-specific version of a generic DataSet by calling its specialize method with the domain of interest.

Examples

The built-in EquityPricing dataset is defined as follows:

class EquityPricing(DataSet):
    open = Column(float)
    high = Column(float)
    low = Column(float)
    close = Column(float)
    volume = Column(float)

The built-in USEquityPricing dataset is a specialization of EquityPricing. It is defined as:

from zipline.pipeline.domain import US_EQUITIES
USEquityPricing = EquityPricing.specialize(US_EQUITIES)

Columns can have types other than float. A dataset containing assorted company metadata might be defined like this:

class CompanyMetadata(DataSet):
    # Use float for semantically-numeric data, even if it's always
    # integral valued (see Notes section below). The default missing
    # value for floats is NaN.
    shares_outstanding = Column(float)

    # Use object for string columns. The default missing value for
    # object-dtype columns is None.
    ticker = Column(object)

    # Use integers for integer-valued categorical data like sector or
    # industry codes. Integer-dtype columns require an explicit missing
    # value.
    sector_code = Column(int, missing_value=-1)

    # Use bool for boolean-valued flags. Note that the default missing
    # value for bool-dtype columns is False.
    is_primary_share = Column(bool)

Notes

Because numpy has no native support for integers with missing values, users are strongly encouraged to use floats for any data that's semantically numeric. Doing so enables the use of NaN as a natural missing value, which has useful propagation semantics.

columns

Get all the columns of this dataset.

Return type:frozenset[zipline.pipeline.data.BoundColumn]

classmethod get_column(name)

Look up a column by name.

Parameters:name (str) -- Name of the column to look up.
Returns:column -- Column with the given name.
Return type:zipline.pipeline.data.BoundColumn
Raises:AttributeError -- If no column with the given name exists.
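
For example, a minimal sketch looking up a column on the built-in EquityPricing dataset:

from quantopian.pipeline.data import EquityPricing

close_column = EquityPricing.get_column('close')
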
class zipline.pipeline.data.DataSetFamily

Base class for Pipeline dataset families.

Dataset families are used to represent data where the unique identifier for a row requires more than just asset and date coordinates. A DataSetFamily can also be thought of as a collection of DataSet objects, each of which has the same columns, domain, and ndim.

DataSetFamily objects are defined by one or more Column objects, plus one additional field: extra_dims.

The extra_dims field defines coordinates other than asset and date that must be fixed to produce a logical timeseries. The column objects determine columns that will be shared by slices of the family.

extra_dims are represented as an ordered dictionary where the keys are the dimension names, and the values are sets of unique values along that dimension.

To work with a DataSetFamily in a pipeline expression, one must choose a specific value for each of the extra dimensions using the slice() method. For example, given a DataSetFamily:

class SomeDataSet(DataSetFamily):
    extra_dims = [
        ('dimension_0', {'a', 'b', 'c'}),
        ('dimension_1', {'d', 'e', 'f'}),
    ]

    column_0 = Column(float)
    column_1 = Column(bool)

This dataset might represent a table with the following columns:

sid :: int64
asof_date :: datetime64[ns]
timestamp :: datetime64[ns]
dimension_0 :: str
dimension_1 :: str
column_0 :: float64
column_1 :: bool

Here we see the implicit sid, asof_date and timestamp columns as well as the extra dimensions columns.

This DataSetFamily can be converted to a regular DataSet with:

DataSetSlice = SomeDataSet.slice(dimension_0='a', dimension_1='e')

This sliced dataset represents the rows from the higher dimensional dataset where (dimension_0 == 'a') & (dimension_1 == 'e').
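
A minimal sketch of using the sliced dataset: its columns are ordinary BoundColumn objects, so they can be used like any other dataset column.

column_0_latest = DataSetSlice.column_0.latest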

classmethod slice(*args, **kwargs)

Take a slice of a DataSetFamily to produce a dataset indexed by asset and date.

Parameters:*args, **kwargs -- The coordinates to fix along each extra dimension, passed either positionally or by name.
Returns:dataset -- A regular pipeline dataset indexed by asset and date.
Return type:DataSet

Notes

The extra dimensions coords used to produce the result are available under the extra_coords attribute.

class zipline.pipeline.data.BoundColumn

A column of data that's been concretely bound to a particular dataset.

dtype

The dtype of data produced when this column is loaded.

Type:numpy.dtype
latest

A Filter, Factor, or Classifier computing the most recently known value of this column on each date. See zipline.pipeline.mixins.LatestMixin for more details.

Type:zipline.pipeline.LoadableTerm
dataset

The dataset to which this column is bound.

Type:zipline.pipeline.data.DataSet
name

The name of this column.

Type:str
metadata

Extra metadata associated with this column.

Type:dict

Notes

Instances of this class are dynamically created upon access to attributes of DataSet. For example, EquityPricing.close is an instance of this class. Pipeline API users should never construct instances of this class directly.

class zipline.pipeline.data.Column

An abstract column of data, not yet associated with a dataset.

class zipline.pipeline.domain.Domain

A domain represents a set of labels for the arrays computed by a Pipeline.

A domain defines two things:

  1. A calendar defining the dates to which the pipeline's inputs and outputs should be aligned. The calendar is represented concretely by a pandas DatetimeIndex.
  2. The set of assets that the pipeline should compute over. Right now, the only supported way of representing this set is with a two-character country code describing the country of assets over which the pipeline should compute. In the future, we expect to expand this functionality to include more general concepts.
zipline.pipeline.domain.GENERIC

Special sentinel domain used for pipeline terms that can be computed on any domain.

Pricing Data

class quantopian.pipeline.data.EquityPricing

DataSet containing daily trading prices and volumes.

close = EquityPricing.close::float64
high = EquityPricing.high::float64
low = EquityPricing.low::float64
open = EquityPricing.open::float64
volume = EquityPricing.volume::float64
class quantopian.pipeline.data.USEquityPricing

Backwards-compat alias for EquityPricing.specialize(US_EQUITIES).

FactSet Data

class quantopian.pipeline.data.factset.Fundamentals

DataSet containing fundamental data sourced from FactSet.

Notes

See the Data Reference for more info.

class quantopian.pipeline.data.factset.EquityMetadata

DataSet containing metadata about assets.

class quantopian.pipeline.data.factset.GeoRev

DataSetFamily containing company revenue, broken down by source country or region.

Slices of this family allow users to query for revenue sourced from a particular country or collection of countries. GeoRev.slice('US'), for example, produces data on each asset's revenue from the United States, while GeoRev.slice('CN') produces data on each asset's revenue from China.

Notes

See the Data Reference for more info.

class quantopian.pipeline.data.factset.RBICSFocus

DataSet providing information about companies' areas of business focus.

class quantopian.pipeline.data.factset.estimates.PeriodicConsensus

DataSetFamily for quarterly, semi-annual, and annual consensus estimates.

Slices of this family allow users to query for consensus estimates of quarterly, semi-annual, and annual financial items.

Examples

# Earnings estimates for next fiscal quarter.
fq1_eps = PeriodicConsensus.slice('EPS', 'qf', 1)

# Earnings estimates for most recently announced quarter.
fq0_eps = PeriodicConsensus.slice('EPS', 'qf', 0)

# Earnings estimates for two quarters out.
fq2_eps = PeriodicConsensus.slice('EPS', 'qf', 2)

# Earnings estimates for next fiscal year.
fy1_eps = PeriodicConsensus.slice('EPS', 'af', 1)

# Cash flow estimates for next quarter.
fq1_cfps = PeriodicConsensus.slice('CFPS', 'qf', 1)

Notes

See the Data Reference for more info.

class quantopian.pipeline.data.factset.estimates.Actuals

DataSetFamily for "actual" reports of estimated values.

Slices of this family allow users to query for actual results of estimated quarterly, semi-annual, and annual financial items.

Examples

# Most recently reported quarterly earnings.
fq0_eps = Actuals.slice('EPS', 'qf', 0)

# EPS reported two quarters ago.
fqm1_eps = Actuals.slice('EPS', 'qf', -1)

# Most recently reported annual earnings.
fy0_eps = Actuals.slice('EPS', 'af', 0)

# Most recently reported quarterly cash flow.
fq0_cfps = Actuals.slice('CFPS', 'qf', 0)

Notes

See the Data Reference for more info.

class quantopian.pipeline.data.factset.estimates.ConsensusRecommendations

DataSet containing consensus broker recommendations.

Notes

See the Data Reference for more info.

class quantopian.pipeline.data.factset.estimates.LongTermConsensus

DataSetFamily for long term consensus estimates.

Examples

# Long term estimates for EPS growth.
lt_eps_growth = LongTermConsensus.slice('EPS_LTG')

# Long term estimates for price target.
lt_price_target = LongTermConsensus.slice('PRICE_TGT')

Notes

See the Data Reference for more info.

Morningstar Data

class quantopian.pipeline.data.morningstar.Fundamentals

DataSet containing fundamental data sourced from Morningstar.

Notes

See the Data Reference for more info.

Built-in Factors

class quantopian.pipeline.factors.DailyReturns

Calculates daily percent change in close price.

Default Inputs: [EquityPricing.close]

class quantopian.pipeline.factors.Returns

Calculates the percent change in close price over the given window_length.

Default Inputs: [EquityPricing.close]

class quantopian.pipeline.factors.PercentChange

Calculates the percent change over the given window_length.

Default Inputs: None

Default Window Length: None

Notes

Percent change is calculated as (new - old) / abs(old).
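
A minimal sketch (both inputs and window_length must be supplied, since there are no defaults):

from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import PercentChange

five_day_change = PercentChange(
    inputs=[EquityPricing.close],
    window_length=5,
)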

class quantopian.pipeline.factors.VWAP

Volume Weighted Average Price

Default Inputs: [EquityPricing.close, EquityPricing.volume]

Default Window Length: None

class quantopian.pipeline.factors.AverageDollarVolume

Average Daily Dollar Volume

Default Inputs: [EquityPricing.close, EquityPricing.volume]

Default Window Length: None

class quantopian.pipeline.factors.AnnualizedVolatility

Volatility. The degree of variation of a series over time as measured by the standard deviation of daily returns. https://en.wikipedia.org/wiki/Volatility_(finance)

Default Inputs: [Returns(window_length=2)]

Parameters:annualization_factor (float, optional) -- The number of time units per year. Default is 252, the number of NYSE trading days in a normal year.
class quantopian.pipeline.factors.SimpleBeta

Factor producing the slope of a regression line between each asset's daily returns to the daily returns of a single "target" asset.

Parameters:
  • target (zipline.Asset) -- Asset against which other assets should be regressed.
  • regression_length (int) -- Number of days of daily returns to use for the regression.
  • allowed_missing_percentage (float, optional) -- Percentage of returns observations (between 0 and 1) that are allowed to be missing when calculating betas. Assets with more than this percentage of returns observations missing will produce values of NaN. Default behavior is that 25% of inputs can be missing.
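
A minimal sketch (sid 8554 is assumed to be SPY, as in the examples further below):

beta_to_spy = SimpleBeta(
    target=sid(8554),
    regression_length=252,
)
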
class quantopian.pipeline.factors.SimpleMovingAverage

Average Value of an arbitrary column

Default Inputs: None

Default Window Length: None
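
A minimal sketch: a 30-day moving average of close price.

from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage

mean_close_30 = SimpleMovingAverage(
    inputs=[EquityPricing.close],
    window_length=30,
)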

class quantopian.pipeline.factors.Latest

Factor producing the most recently-known value of inputs[0] on each day.

The .latest attribute of DataSet columns returns an instance of this Factor.

class quantopian.pipeline.factors.MaxDrawdown

Max Drawdown

Default Inputs: None

Default Window Length: None

class quantopian.pipeline.factors.RSI

Relative Strength Index

Default Inputs: [EquityPricing.close]

Default Window Length: 15

class quantopian.pipeline.factors.ExponentialWeightedMovingAverage

Exponentially Weighted Moving Average

Default Inputs: None

Default Window Length: None

Parameters:
  • inputs (length-1 list/tuple of BoundColumn) -- The expression over which to compute the average.
  • window_length (int > 0) -- Length of the lookback window over which to compute the average.
  • decay_rate (float, 0 < decay_rate <= 1) --

    Weighting factor by which to discount past observations.

    When calculating historical averages, rows are multiplied by the sequence:

    decay_rate, decay_rate ** 2, decay_rate ** 3, ...
    

Notes

  • This class can also be imported under the name EWMA.

Alternate Constructors

classmethod from_span(cls, inputs, window_length, span, **kwargs)

Convenience constructor for passing decay_rate in terms of span.

Forwards decay_rate as 1 - (2.0 / (1 + span)). This provides the behavior equivalent to passing span to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=(1 - (2.0 / (1 + 15.0))),
# )
my_ewma = EWMA.from_span(
    inputs=[EquityPricing.close],
    window_length=30,
    span=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

classmethod from_center_of_mass(inputs, window_length, center_of_mass, **kwargs)

Convenience constructor for passing decay_rate in terms of center of mass.

Forwards decay_rate as 1 - (1 / (1 + center_of_mass)). This provides behavior equivalent to passing center_of_mass to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=(1 - (1 / (1 + 15.0))),
# )
my_ewma = EWMA.from_center_of_mass(
    inputs=[EquityPricing.close],
    window_length=30,
    center_of_mass=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

classmethod from_halflife(cls, inputs, window_length, halflife, **kwargs)

Convenience constructor for passing decay_rate in terms of half life.

Forwards decay_rate as exp(log(.5) / halflife). This provides the behavior equivalent to passing halflife to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=np.exp(np.log(0.5) / 15),
# )
my_ewma = EWMA.from_halflife(
    inputs=[EquityPricing.close],
    window_length=30,
    halflife=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

class quantopian.pipeline.factors.ExponentialWeightedMovingStdDev

Exponentially Weighted Moving Standard Deviation

Default Inputs: None

Default Window Length: None

Parameters:
  • inputs (length-1 list/tuple of BoundColumn) -- The expression over which to compute the average.
  • window_length (int > 0) -- Length of the lookback window over which to compute the average.
  • decay_rate (float, 0 < decay_rate <= 1) --

    Weighting factor by which to discount past observations.

    When calculating historical averages, rows are multiplied by the sequence:

    decay_rate, decay_rate ** 2, decay_rate ** 3, ...
    

Notes

  • This class can also be imported under the name EWMSTD.

See also

pandas.DataFrame.ewm()

Alternate Constructors

classmethod from_span(cls, inputs, window_length, span, **kwargs)

Convenience constructor for passing decay_rate in terms of span.

Forwards decay_rate as 1 - (2.0 / (1 + span)). This provides the behavior equivalent to passing span to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=(1 - (2.0 / (1 + 15.0))),
# )
my_ewma = EWMA.from_span(
    inputs=[EquityPricing.close],
    window_length=30,
    span=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

classmethod from_center_of_mass(inputs, window_length, center_of_mass, **kwargs)

Convenience constructor for passing decay_rate in terms of center of mass.

Forwards decay_rate as 1 - (1 / (1 + center_of_mass)). This provides behavior equivalent to passing center_of_mass to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=(1 - (1 / (1 + 15.0))),
# )
my_ewma = EWMA.from_center_of_mass(
    inputs=[EquityPricing.close],
    window_length=30,
    center_of_mass=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

classmethod from_halflife(cls, inputs, window_length, halflife, **kwargs)

Convenience constructor for passing decay_rate in terms of half life.

Forwards decay_rate as exp(log(.5) / halflife). This provides the behavior equivalent to passing halflife to pandas.ewma.

Examples

# Equivalent to:
# my_ewma = EWMA(
#    inputs=[EquityPricing.close],
#    window_length=30,
#    decay_rate=np.exp(np.log(0.5) / 15),
# )
my_ewma = EWMA.from_halflife(
    inputs=[EquityPricing.close],
    window_length=30,
    halflife=15,
)

Notes

This classmethod is provided by both ExponentialWeightedMovingAverage and ExponentialWeightedMovingStdDev.

class quantopian.pipeline.factors.WeightedAverageValue

Helper for VWAP-like computations.

Default Inputs: None

Default Window Length: None

compute(today, assets, out, base, weight)

Override this method with a function that writes a value into out.

class quantopian.pipeline.factors.BollingerBands

Bollinger Bands technical indicator. https://en.wikipedia.org/wiki/Bollinger_Bands

Default Inputs: zipline.pipeline.data.EquityPricing.close

Parameters:
  • inputs (length-1 iterable[BoundColumn]) -- The expression over which to compute bollinger bands.
  • window_length (int > 0) -- Length of the lookback window over which to compute the bollinger bands.
  • k (float) -- The number of standard deviations to add or subtract to create the upper and lower bands.
compute(today, assets, out, close, k)

Override this method with a function that writes a value into out.
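
A minimal sketch (the lower/middle/upper output attribute names are assumed from the standard multi-output pattern; they are not stated above):

bbands = BollingerBands(window_length=20, k=2)
lower, middle, upper = bbands.lower, bbands.middle, bbands.upper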

class quantopian.pipeline.factors.MovingAverageConvergenceDivergenceSignal(*args, **kwargs)

Moving Average Convergence/Divergence (MACD) Signal line https://en.wikipedia.org/wiki/MACD

A technical indicator originally developed by Gerald Appel in the late 1970's. MACD shows the relationship between two moving averages and reveals changes in the strength, direction, momentum, and duration of a trend in a stock's price.

Default Inputs: zipline.pipeline.data.EquityPricing.close

Parameters:
  • fast_period (int > 0, optional) -- The window length for the "fast" EWMA. Default is 12.
  • slow_period (int > 0, > fast_period, optional) -- The window length for the "slow" EWMA. Default is 26.
  • signal_period (int > 0, < fast_period, optional) -- The window length for the signal line. Default is 9.

Notes

Unlike most pipeline expressions, this factor does not accept a window_length parameter. window_length is inferred from slow_period and signal_period.
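
A minimal sketch spelling out the default periods (note that no window_length is passed):

macd_signal = MovingAverageConvergenceDivergenceSignal(
    fast_period=12,
    slow_period=26,
    signal_period=9,
)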

compute(today, assets, out, close, fast_period, slow_period, signal_period)

Override this method with a function that writes a value into out.

class quantopian.pipeline.factors.RollingPearsonOfReturns(*args, **kwargs)

Calculates the Pearson product-moment correlation coefficient of the returns of the given asset with the returns of all other assets.

Pearson correlation is what most people mean when they say "correlation coefficient" or "R-value".

Parameters:
  • target (zipline.assets.Asset) -- The asset to correlate with all other assets.
  • returns_length (int >= 2) -- Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
  • correlation_length (int >= 1) -- Length of the lookback window over which to compute each correlation coefficient.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should have their correlation with the target asset computed each day.

Notes

Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.
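
A minimal sketch of the masking suggested above, restricting correlations to a liquid universe:

base_universe = AverageDollarVolume(window_length=30).top(500)
masked_correlations = RollingPearsonOfReturns(
    target=sid(8554),
    returns_length=10,
    correlation_length=5,
    mask=base_universe,
)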

Examples

Let the following be example 10-day returns for three different assets:

               SPY    MSFT     FB
2017-03-13    -.03     .03    .04
2017-03-14    -.02    -.03    .02
2017-03-15    -.01     .02    .01
2017-03-16       0    -.02    .01
2017-03-17     .01     .04   -.01
2017-03-20     .02    -.03   -.02
2017-03-21     .03     .01   -.02
2017-03-22     .04    -.02   -.02

Suppose we are interested in SPY's rolling returns correlation with each stock from 2017-03-17 to 2017-03-22, using a 5-day look back window (that is, we calculate each correlation coefficient over 5 days of data). We can achieve this by doing:

rolling_correlations = RollingPearsonOfReturns(
    target=sid(8554),
    returns_length=10,
    correlation_length=5,
)

The result of computing rolling_correlations from 2017-03-17 to 2017-03-22 gives:

               SPY   MSFT     FB
2017-03-17       1    .15   -.96
2017-03-20       1    .10   -.96
2017-03-21       1   -.16   -.94
2017-03-22       1   -.16   -.85

Note that the column for SPY is all 1's, as the correlation of any data series with itself is always 1. To understand how each of the other values were calculated, take for example the .15 in MSFT's column. This is the correlation coefficient between SPY's returns looking back from 2017-03-17 (-.03, -.02, -.01, 0, .01) and MSFT's returns (.03, -.03, .02, -.02, .04).

class quantopian.pipeline.factors.RollingSpearmanOfReturns(*args, **kwargs)

Calculates the Spearman rank correlation coefficient of the returns of the given asset with the returns of all other assets.

Parameters:
  • target (zipline.assets.Asset) -- The asset to correlate with all other assets.
  • returns_length (int >= 2) -- Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
  • correlation_length (int >= 1) -- Length of the lookback window over which to compute each correlation coefficient.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should have their correlation with the target asset computed each day.

Notes

Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which correlations are computed.

class quantopian.pipeline.factors.RollingLinearRegressionOfReturns(*args, **kwargs)

Perform an ordinary least-squares regression predicting the returns of all other assets on the given asset.

Parameters:
  • target (zipline.assets.Asset) -- The asset to regress against all other assets.
  • returns_length (int >= 2) -- Length of the lookback window over which to compute returns. Daily returns require a window length of 2.
  • regression_length (int >= 1) -- Length of the lookback window over which to compute each regression.
  • mask (zipline.pipeline.Filter, optional) -- A Filter describing which assets should be regressed against the target asset each day.

Notes

Computing this factor over many assets can be time consuming. It is recommended that a mask be used in order to limit the number of assets over which regressions are computed.

This factor is designed to return five outputs:

  • alpha, a factor that computes the intercepts of each regression.
  • beta, a factor that computes the slopes of each regression.
  • r_value, a factor that computes the correlation coefficient of each regression.
  • p_value, a factor that computes, for each regression, the two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero.
  • stderr, a factor that computes the standard error of the estimate of each regression.

For more help on factors with multiple outputs, see zipline.pipeline.CustomFactor.

Examples

Let the following be example 10-day returns for three different assets:

               SPY    MSFT     FB
2017-03-13    -.03     .03    .04
2017-03-14    -.02    -.03    .02
2017-03-15    -.01     .02    .01
2017-03-16       0    -.02    .01
2017-03-17     .01     .04   -.01
2017-03-20     .02    -.03   -.02
2017-03-21     .03     .01   -.02
2017-03-22     .04    -.02   -.02

Suppose we are interested in predicting each stock's returns from SPY's over rolling 5-day look back windows. We can compute rolling regression coefficients (alpha and beta) from 2017-03-17 to 2017-03-22 by doing:

regression_factor = RollingLinearRegressionOfReturns(
    target=sid(8554),
    returns_length=10,
    regression_length=5,
)
alpha = regression_factor.alpha
beta = regression_factor.beta

The result of computing alpha from 2017-03-17 to 2017-03-22 gives:

               SPY    MSFT     FB
2017-03-17       0    .011   .003
2017-03-20       0   -.004   .004
2017-03-21       0    .007   .006
2017-03-22       0    .002   .008

And the result of computing beta from 2017-03-17 to 2017-03-22 gives:

               SPY    MSFT     FB
2017-03-17       1      .3   -1.1
2017-03-20       1      .2     -1
2017-03-21       1     -.3     -1
2017-03-22       1     -.3    -.9

Note that SPY's column for alpha is all 0's and for beta is all 1's, as the regression line of SPY with itself is simply the function y = x.

To understand how each of the other values were calculated, take for example MSFT's alpha and beta values on 2017-03-17 (.011 and .3, respectively). These values are the result of running a linear regression predicting MSFT's returns from SPY's returns, using values starting at 2017-03-17 and looking back 5 days. That is, the regression was run with x = [-.03, -.02, -.01, 0, .01] and y = [.03, -.03, .02, -.02, .04], and it produced a slope of .3 and an intercept of .011.

Built-in Filters

class quantopian.pipeline.filters.StaticAssets(assets)

A Filter that computes True for a specific set of predetermined assets.

StaticAssets is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of assets that are known ahead of time.

Parameters:assets (iterable[Asset]) -- An iterable of assets for which to filter.
class quantopian.pipeline.filters.StaticSids(sids)

A Filter that computes True for a specific set of predetermined sids.

StaticSids is mostly useful for debugging or for interactively computing pipeline terms for a fixed set of sids that are known ahead of time.

Parameters:sids (iterable[int]) -- An iterable of sids for which to filter.
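
A minimal sketch (sid 8554 is assumed to be SPY, as in the examples above):

from quantopian.pipeline.filters import StaticSids

spy_only = StaticSids([8554])
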
quantopian.pipeline.filters.QTradableStocksUS()

Create the trading universe used in the Quantopian Contest.

Returns:universe -- Filter defining the contest universe.

Equities are filtered in three passes. Each pass operates only on equities that survived the previous pass.

First Pass

Filter based on infrequently-changing attributes using the following rules:

  1. The stock must be a common (i.e. not preferred) stock.
  2. The stock must not be a depository receipt.
  3. The stock must not be for a limited partnership.
  4. The stock must not be traded over the counter (OTC).

Second Pass

For companies with more than one share class, choose the most liquid share class. Share classes belonging to the same company are indicated by a common primary_share_class_id. Liquidity is measured using the 200-day median daily dollar volume. Equities without a primary_share_class_id are automatically excluded.

Third Pass

Filter based on dynamic attributes using the following rules:

  1. The stock must have a 200-day median daily dollar volume exceeding 2.5 Million USD.
  2. The stock must have a moving average market capitalization of at least 350 Million USD over the last 20 days.
  3. The stock must not have more than 20 days of missing close price in the last 200 days and must not have any missing close price in the last 20 days.
  4. The stock must not be an active M&A target; equities that pass the filter IsAnnouncedAcquisitionTarget() are screened out.

Notes

  • ETFs are not included in this universe.
  • Unlike the Q500US() and Q1500US(), this universe has no size cutoff. All equities that match the required criteria are included.
  • If the most liquid share class of a company passes the static pass but fails the dynamic pass, then no share class for that company is included.
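
A minimal sketch using the contest universe as a pipeline screen:

from quantopian.pipeline import Pipeline
from quantopian.pipeline.filters import QTradableStocksUS

pipe = Pipeline(screen=QTradableStocksUS())
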
quantopian.pipeline.filters.Q500US(minimum_market_cap=500000000)

A default universe containing approximately 500 US equities each day.

Constituents are chosen at the start of each calendar month by selecting the top 500 "tradeable" stocks by 200-day average dollar volume, capped at 30% of equities allocated to any single sector.

A stock is considered "tradeable" if it meets the following criteria:

  1. The stock must be the primary share class for its company.
  2. The company issuing the stock must have known market capitalization.
  3. The stock must not be a depository receipt.
  4. The stock must not be traded over the counter (OTC).
  5. The stock must not be for a limited partnership.
  6. The stock must have a known previous-day close price.
  7. The stock must have had nonzero volume on the previous trading day.
quantopian.pipeline.filters.Q1500US(minimum_market_cap=500000000)

A default universe containing approximately 1500 US equities each day.

Constituents are chosen at the start of each month by selecting the top 1500 "tradeable" stocks by 200-day average dollar volume, capped at 30% of equities allocated to any single sector.

A stock is considered "tradeable" if it meets the following criteria:

  1. The stock must be the primary share class for its company.
  2. The company issuing the stock must have known market capitalization.
  3. The stock must not be a depository receipt.
  4. The stock must not be traded over the counter (OTC).
  5. The stock must not be for a limited partnership.
  6. The stock must have a known previous-day close price.
  7. The stock must have had nonzero volume on the previous trading day.
quantopian.pipeline.filters.Q3000US(minimum_market_cap=500000000)

A default universe containing approximately 3000 US equities each day. Used for generating a universe of tradeable stocks at the start of each trading day.

Constituents are chosen at the start of each month by selecting the top 3000 "tradeable" stocks by 200-day average dollar volume, capped at 30% of equities allocated to any single sector.

A stock is considered "tradeable" if it meets the following criteria:

  1. The stock must be the primary share class for its company.
  2. The company issuing the stock must have known market capitalization.
  3. The stock must not be a depository receipt.
  4. The stock must not be traded over the counter (OTC).
  5. The stock must not be for a limited partnership.
  6. The stock must have a known previous-day close price.
  7. The stock must have had nonzero volume on the previous trading day.
quantopian.pipeline.filters.make_us_equity_universe(target_size, rankby, groupby, max_group_weight, mask, smoothing_func=<function downsample_monthly>, exclude_ipos=False)

Create a QUS-style universe filter.

The constructed Filter accepts approximately the top target_size assets ranked by rankby, subject to tradeability, weighting, and turnover constraints.

The selection algorithm implemented by the generated Filter is as follows:

  1. Look at all known stocks and eliminate stocks for which mask returns False.

  2. Partition the remaining stocks into buckets based on the labels computed by groupby.

  3. Choose the top target_size stocks, sorted by rankby, subject to the constraint that the percentage of stocks accepted in any single group in (2) is less than or equal to max_group_weight.

  4. Pass the resulting "naive" filter to smoothing_func, which must return a new Filter.

    Smoothing is most often useful for applying transformations that reduce turnover at the boundary of the universe's rank-inclusion criterion. For example, a smoothing function might require that an asset pass the naive filter for 5 consecutive days before acceptance, reducing the number of assets that too-regularly enter and exit the universe.

    Another common smoothing technique is to reduce the frequency at which we recalculate using Filter.downsample. The default smoothing behavior is to downsample to monthly frequency.

  5. & the result of smoothing with mask, ensuring that smoothing does not re-introduce masked-out assets.

Parameters:
  • target_size (int > 0) -- The target number of securities to accept each day. Exactly target_size assets will be accepted by the Filter supplied to smoothing_func, but more or fewer may be accepted in the final output depending on the smoothing function applied.
  • rankby (zipline.pipeline.Factor) -- The Factor by which to rank all assets each day. The top target_size assets that pass mask will be accepted, subject to the constraint that no single group receives greater than max_group_weight as a percentage of the total number of accepted assets.
  • mask (zipline.pipeline.Filter) -- An initial filter used to ignore securities deemed "untradeable". Assets for which mask returns False on a given day will always be rejected by the final output filter, and will be ignored when calculating ranks.
  • groupby (zipline.pipeline.Classifier) -- A classifier that groups assets into buckets. Each bucket will receive at most max_group_weight as a percentage of the total number of accepted assets.
  • max_group_weight (float) -- A float between 0.0 and 1.0 indicating the maximum percentage of assets that should be accepted in any single bucket returned by groupby.
  • smoothing_func (callable[Filter -> Filter], optional) --

    A function accepting a Filter and returning a new Filter.

    This is generally used to apply 'stickiness' to the output of the "naive" filter. Adding stickiness helps reduce turnover of the final output by preventing assets from entering or exiting the final universe too frequently.

    The default smoothing behavior is to downsample at monthly frequency. This means that the naive universe is recalculated at the start of each month, rather than continuously every day, reducing the impact of spurious turnover.

Example

The algorithm for the built-in Q500US universe is defined as follows:

At the start of each month, choose the top 500 assets by average dollar volume over the last year, ignoring hard-to-trade assets, and choosing no more than 30% of the assets from any single market sector.

The Q500US is implemented as:

from quantopian.pipeline import factors, filters, classifiers

def Q500US():
    return filters.make_us_equity_universe(
        target_size=500,
        rankby=factors.AverageDollarVolume(window_length=200),
        mask=filters.default_us_equity_universe_mask(),
        groupby=classifiers.fundamentals.Sector(),
        max_group_weight=0.3,
        smoothing_func=lambda f: f.downsample('month_start'),
    )
Returns:universe -- A Filter representing the final universe
Return type:zipline.pipeline.Filter
quantopian.pipeline.filters.default_us_equity_universe_mask(minimum_market_cap=500000000)

Create the base filter used to filter assets from the QUS filters.

The criteria required to pass the resulting filter are as follows:

  1. The stock must be the primary share class for its company.
  2. The company issuing the stock must have a minimum market capitalization of 'minimum_market_cap', defaulting to 500 Million.
  3. The stock must not be a depository receipt.
  4. The stock must not be traded over the counter (OTC).
  5. The stock must not be for a limited partnership.
  6. The stock must have a known previous-day close price.
  7. The stock must have had nonzero volume on the previous trading day.

Notes

We previously had an additional limited partnership check using Fundamentals.limited_partnership, but this provided only false positives beyond those already captured by not_lp_by_name, so it has been removed.

class quantopian.pipeline.filters.morningstar.IsDepositaryReceipt

A Filter indicating whether a given asset is a depositary receipt.

inputs = (Fundamentals<US>.is_depositary_receipt::bool,)
class quantopian.pipeline.filters.morningstar.IsPrimaryShare

A Filter indicating whether a given asset is a primary share.

inputs = (Fundamentals<US>.is_primary_share::bool,)
quantopian.pipeline.filters.morningstar.is_common_stock()

Construct a Filter indicating whether an asset is common (as opposed to preferred) stock.

Built-in Classifiers

class quantopian.pipeline.classifiers.morningstar.SuperSector

Classifier that groups assets by Morningstar Super Sector.

There are three possible classifications:

  • 1 - Cyclical
  • 2 - Defensive
  • 3 - Sensitive

These values are provided as integer constants on the class.

For more information on morningstar classification codes, see: https://www.quantopian.com/help/fundamentals#industry-sector.

inputs = (Fundamentals<US>.morningstar_economy_sphere_code::int64,)
dtype = dtype('int64')
missing_value = -1
CYCLICAL = 1
DEFENSIVE = 2
SENSITIVE = 3
SUPER_SECTOR_NAMES = {1: 'CYCLICAL', 2: 'DEFENSIVE', 3: 'SENSITIVE'}
class quantopian.pipeline.classifiers.morningstar.Sector

Classifier that groups assets by Morningstar Sector Code.

There are 11 possible classifications:

  • 101 - Basic Materials
  • 102 - Consumer Cyclical
  • 103 - Financial Services
  • 104 - Real Estate
  • 205 - Consumer Defensive
  • 206 - Healthcare
  • 207 - Utilities
  • 308 - Communication Services
  • 309 - Energy
  • 310 - Industrials
  • 311 - Technology

These values are provided as integer constants on the class.

For more information on morningstar classification codes, see: https://www.quantopian.com/help/fundamentals#industry-sector.

inputs = (Fundamentals<US>.morningstar_sector_code::int64,)
dtype = dtype('int64')
missing_value = -1
BASIC_MATERIALS = 101
CONSUMER_CYCLICAL = 102
FINANCIAL_SERVICES = 103
REAL_ESTATE = 104
CONSUMER_DEFENSIVE = 205
HEALTHCARE = 206
UTILITIES = 207
COMMUNICATION_SERVICES = 308
ENERGY = 309
INDUSTRIALS = 310
TECHNOLOGY = 311
SECTOR_NAMES = {101: 'BASIC_MATERIALS', 102: 'CONSUMER_CYCLICAL', 103: 'FINANCIAL_SERVICES', 104: 'REAL_ESTATE', 205: 'CONSUMER_DEFENSIVE', 206: 'HEALTHCARE', 207: 'UTILITIES', 308: 'COMMUNICATION_SERVICES', 309: 'ENERGY', 310: 'INDUSTRIALS', 311: 'TECHNOLOGY'}

Risk Model Factors (Experimental)

Functions and classes listed here provide access to the outputs of the Quantopian Risk Model via the Pipeline API. They are currently importable from quantopian.pipeline.experimental.

We expect to eventually stabilize and move these features to quantopian.pipeline.

quantopian.pipeline.experimental.risk_loading_pipeline()

Create a pipeline with all risk loadings for the Quantopian Risk Model.

Returns:pipeline -- A Pipeline containing risk loadings for each factor in the Quantopian Risk Model.
Return type:quantopian.pipeline.Pipeline

Sector Loadings

These classes provide access to sector loadings computed by the Quantopian Risk Model.

class quantopian.pipeline.experimental.BasicMaterials

Quantopian Risk Model loadings for the basic materials sector.

class quantopian.pipeline.experimental.ConsumerCyclical

Quantopian Risk Model loadings for the consumer cyclical sector.

class quantopian.pipeline.experimental.FinancialServices

Quantopian Risk Model loadings for the financial services sector.

class quantopian.pipeline.experimental.RealEstate

Quantopian Risk Model loadings for the real estate sector.

class quantopian.pipeline.experimental.ConsumerDefensive

Quantopian Risk Model loadings for the consumer defensive sector.

class quantopian.pipeline.experimental.HealthCare

Quantopian Risk Model loadings for the health care sector.

class quantopian.pipeline.experimental.Utilities

Quantopian Risk Model loadings for the utilities sector.

class quantopian.pipeline.experimental.CommunicationServices

Quantopian Risk Model loadings for the communication services sector.

class quantopian.pipeline.experimental.Energy

Quantopian Risk Model loadings for the energy sector.

class quantopian.pipeline.experimental.Industrials

Quantopian Risk Model loadings for the industrials sector.

class quantopian.pipeline.experimental.Technology

Quantopian Risk Model loadings for the technology sector.

Style Loadings

These classes provide access to style loadings computed by the Quantopian Risk Model.

class quantopian.pipeline.experimental.Momentum

Quantopian Risk Model loadings for the "momentum" style factor.

This factor captures differences in returns between stocks that have had large gains in the last 11 months and stocks that have had large losses in the last 11 months.

class quantopian.pipeline.experimental.ShortTermReversal

Quantopian Risk Model loadings for the "short term reversal" style factor.

This factor captures differences in returns between stocks that have experienced short term losses and stocks that have experienced short term gains.

class quantopian.pipeline.experimental.Size

Quantopian Risk Model loadings for the "size" style factor.

This factor captures differences in returns between stocks with high market capitalizations and stocks with low market capitalizations.

class quantopian.pipeline.experimental.Value

Quantopian Risk Model loadings for the "value" style factor.

This factor captures differences in returns between "expensive" stocks and "inexpensive" stocks, measured by the ratio between each stock's book value and its market cap.

class quantopian.pipeline.experimental.Volatility

Quantopian Risk Model loadings for the "volatility" style factor.

This factor captures differences in returns between stocks that experience large price fluctuations and stocks that have relatively stable prices.

Domains

quantopian.pipeline.domain.US_EQUITIES

zipline.pipeline.domain.Domain for equities traded in the United States.

quantopian.pipeline.domain.AU_EQUITIES

Domain for equities traded in Australia.

quantopian.pipeline.domain.AT_EQUITIES

Domain for equities traded in Austria.

quantopian.pipeline.domain.BE_EQUITIES

Domain for equities traded in Belgium.

quantopian.pipeline.domain.BR_EQUITIES

Domain for equities traded in Brazil.

quantopian.pipeline.domain.CA_EQUITIES

Domain for equities traded in Canada.

quantopian.pipeline.domain.CH_EQUITIES

Domain for equities traded in Switzerland.

quantopian.pipeline.domain.CL_EQUITIES

Domain for equities traded in Chile.

quantopian.pipeline.domain.CN_EQUITIES

Domain for equities traded in China.

quantopian.pipeline.domain.CO_EQUITIES

Domain for equities traded in Colombia.

quantopian.pipeline.domain.CZ_EQUITIES

Domain for equities traded in the Czech Republic.

quantopian.pipeline.domain.DE_EQUITIES

Domain for equities traded in Germany.

quantopian.pipeline.domain.DK_EQUITIES

Domain for equities traded in Denmark.

quantopian.pipeline.domain.ES_EQUITIES

Domain for equities traded in Spain.

quantopian.pipeline.domain.FI_EQUITIES

Domain for equities traded in Finland.

quantopian.pipeline.domain.FR_EQUITIES

Domain for equities traded in France.

quantopian.pipeline.domain.GB_EQUITIES

Domain for equities traded in the United Kingdom.

quantopian.pipeline.domain.GR_EQUITIES

Domain for equities traded in Greece.

quantopian.pipeline.domain.HK_EQUITIES

Domain for equities traded in Hong Kong.

quantopian.pipeline.domain.HU_EQUITIES

Domain for equities traded in Hungary.

quantopian.pipeline.domain.IE_EQUITIES

Domain for equities traded in Ireland.

quantopian.pipeline.domain.IN_EQUITIES

Domain for equities traded in India.

quantopian.pipeline.domain.IT_EQUITIES

Domain for equities traded in Italy.

quantopian.pipeline.domain.JP_EQUITIES

Domain for equities traded in Japan.

quantopian.pipeline.domain.KR_EQUITIES

Domain for equities traded in South Korea.

quantopian.pipeline.domain.MX_EQUITIES

Domain for equities traded in Mexico.

quantopian.pipeline.domain.NL_EQUITIES

Domain for equities traded in the Netherlands.

quantopian.pipeline.domain.NZ_EQUITIES

Domain for equities traded in New Zealand.

quantopian.pipeline.domain.NO_EQUITIES

Domain for equities traded in Norway.

quantopian.pipeline.domain.PE_EQUITIES

Domain for equities traded in Peru.

quantopian.pipeline.domain.PL_EQUITIES

Domain for equities traded in Poland.

quantopian.pipeline.domain.PT_EQUITIES

Domain for equities traded in Portugal.

quantopian.pipeline.domain.SE_EQUITIES

Domain for equities traded in Sweden.

quantopian.pipeline.domain.SG_EQUITIES

Domain for equities traded in Singapore.

quantopian.pipeline.domain.TR_EQUITIES

Domain for equities traded in Turkey.

quantopian.pipeline.domain.ZA_EQUITIES

Domain for equities traded in South Africa.
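
A minimal sketch selecting one of these domains (the Pipeline constructor is assumed to accept a domain argument, as used for international pipelines):

from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import EquityPricing
from quantopian.pipeline.domain import DE_EQUITIES

pipe = Pipeline(
    columns={'close': EquityPricing.close.latest},
    domain=DE_EQUITIES,
)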

Miscellaneous

class zipline.pipeline.mixins.LatestMixin

Common behavior for zipline.pipeline.data.BoundColumn.latest.

Given a DataSet named MyData with a column col of numeric dtype, the following expression:

factor = MyData.col.latest

is equivalent to:

class Latest(CustomFactor):
    inputs = [MyData.col]
    window_length = 1

    def compute(self, today, assets, out, data):
        out[:] = data[-1]

factor = Latest()

The behavior is the same for columns of boolean or string dtype, except the resulting expression will be a CustomFilter for boolean columns, and the resulting object will be a CustomClassifier for string or integer columns.

class zipline.pipeline.CustomClassifier

Base class for user-defined Classifiers.

Does not support multiple outputs.

Note

Custom classifiers are rarely used. Almost all user-defined classifiers are created via latest on a BoundColumn with string/int dtype, or via zipline.pipeline.Factor.quantiles().
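
A minimal sketch of the quantiles approach mentioned above:

from quantopian.pipeline.factors import AverageDollarVolume

# Classifier labeling each asset's dollar-volume decile (0-9).
adv_deciles = AverageDollarVolume(window_length=30).quantiles(10)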