New Pipeline Features: Slicing and Correlation/Regression Methods

In a previous post, we introduced three built-in factors that compute correlations and regressions between the returns of stocks. Three new methods have now been added to Factors to allow for more flexible computations: pearsonr, spearmanr, and linear_regression. In principle these methods work the same way as the built-in factors, but they also support things like correlations between two different factors, or between a factor and a dataset. For example, you could compute the correlation between each column of a Returns factor and VIX (VIX is now treated as a single column of values).
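
To make the "each column against a single column" idea concrete, here is a minimal plain-Python sketch of the underlying computation. It is not the Pipeline API: the helpers `pearson`, `ranks`, and `spearman`, the asset names, and the numbers are all hypothetical and only meant to illustrate what a column-wise Pearson or Spearman correlation against one target column produces.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: one column of values per asset, plus one target column
# (standing in for something like VIX).
factor_columns = {
    "A": [0.01, 0.02, 0.03, 0.04],  # moves linearly with the target
    "B": [0.04, 0.03, 0.02, 0.01],  # moves linearly against the target
    "C": [0.01, 0.04, 0.09, 0.16],  # monotonic in the target, but not linear
}
target = [1.0, 2.0, 3.0, 4.0]

# One correlation value per asset column.
pearson_by_asset = {name: pearson(col, target) for name, col in factor_columns.items()}
spearman_by_asset = {name: spearman(col, target) for name, col in factor_columns.items()}
```

Column C shows why you might pick one method over the other: its Spearman correlation with the target is 1 because the ordering matches exactly, while its Pearson correlation is slightly below 1 because the relationship is not linear.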

One important distinction between these methods and the built-in factors is that they no longer accept an Asset object as their target parameter, but instead accept a Slice object. A Slice can simply be thought of as a single column of a Factor corresponding to a particular stock. Slices can be created by indexing into a Factor, keyed by Asset:

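# Import the built-in Returns factor (sid() is provided by the Quantopian environment).  
from quantopian.pipeline.factors import Returns  

# Create a Returns factor and extract just AAPL's returns.  
returns = Returns(window_length=30)  
aapl_returns_slice = returns[sid(24)]  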
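
Conceptually, the Slice then plays the role of the single independent-variable column that every other column is regressed against. The following is a rough plain-Python sketch of that idea, assuming an ordinary least-squares fit with an intercept. The helper `ols`, the tickers other than AAPL, and the return values are made up for illustration; this is not the Pipeline linear_regression method itself.

```python
def ols(xs, ys):
    """Least-squares fit of ys = alpha + beta * xs; returns (alpha, beta)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Hypothetical returns table: one column per asset.
returns_by_asset = {
    "AAPL": [0.010, -0.020, 0.030, 0.000],
    "XYZ": [0.025, -0.035, 0.065, 0.005],   # constructed as 0.005 + 2.0 * AAPL
    "ABC": [-0.005, 0.010, -0.015, 0.000],  # constructed as -0.5 * AAPL
}

# "Slicing" here just means picking out one column by its key.
target_slice = returns_by_asset["AAPL"]

# Regress every column against the single target column.
fits = {name: ols(target_slice, col) for name, col in returns_by_asset.items()}
```

Each entry of `fits` holds an (alpha, beta) pair for one asset, so the result has the same one-value-per-column shape as any other factor output.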

There are various restrictions as to how these methods and Slices can be used, so check out the attached notebook for detailed examples and explanations. For an overview of the new methods, check out the docs here.


6 responses

"most factors still cannot be used as inputs, including any custom factors". Custom factors are disallowed? While I agree with disallowing that by default, I believe there should be a workaround for users who are well aware of the risk. Just because a feature might be used in the wrong way doesn't mean it has to be forbidden entirely.

All these posts are great! In the regression code here, though, since you can't add columns, it looks like you're not including alpha in the regression. Won't the slopes be biased?

This is cool. I'm working on calculating the beta of each stock to its relevant sector ETF. The best way seems to be to run the regression for each sector, but mask it to only the subset of stocks from the universe in that sector. This will produce multiple output columns.

Does this functionality exist in zipline? I tried to implement it, but I'm getting key errors.

Thanks
Adam

Stat arb anyone?

@Burrito Dan: I'm just working on calculating the beta of each stock
to its relevant sector ETF. The best way seems to be to run the
regression for each sector, but mask it to only the subset of stocks
from the universe in that sector. This will produce multiple output
columns.

Do you have an example of computing the linear_regression for each security against its sector?
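
As a rough illustration of the per-sector idea quoted above, here is a plain-Python sketch: one regression pass per sector, restricted ("masked") to that sector's stocks, then merged into a single beta per stock. It does not use Pipeline or zipline, and the helper `beta`, the tickers, sector labels, and return values are all hypothetical.

```python
def beta(xs, ys):
    """Slope of the least-squares line of ys on xs (fit includes an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical universe: sector label per stock, returns per sector ETF and per stock.
sector_of = {"S1": "tech", "S2": "tech", "S3": "energy"}
etf_returns = {
    "tech": [0.01, -0.02, 0.03],
    "energy": [0.02, 0.01, -0.01],
}
stock_returns = {
    "S1": [0.02, -0.04, 0.06],     # constructed as 2.0 * tech ETF
    "S2": [0.005, -0.01, 0.015],   # constructed as 0.5 * tech ETF
    "S3": [0.03, 0.015, -0.015],   # constructed as 1.5 * energy ETF
}

# One regression pass per sector, limited to the stocks in that sector.
beta_by_sector = {}
for sector, etf in etf_returns.items():
    members = [s for s, sec in sector_of.items() if sec == sector]  # the "mask"
    beta_by_sector[sector] = {s: beta(etf, stock_returns[s]) for s in members}

# Collapse the per-sector outputs into one beta per stock.
sector_beta = {s: b for column in beta_by_sector.values() for s, b in column.items()}
```

Because every stock belongs to exactly one sector in this sketch, the per-sector outputs never overlap and can be merged into one column without conflicts.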