New Correlation and Linear Regression Factors

Three new built-in factors have just been added to pipeline: RollingPearsonOfReturns, RollingSpearmanOfReturns, and RollingLinearRegressionOfReturns. These factors are a first pass at offering correlation and regression capabilities that are fast enough to use via pipeline. We have plans to offer a more generic implementation of these factors in the future, but would love your feedback on them. For more detailed information on how to use these factors, check out the documentation.

The example algorithm below utilizes the "beta" output of RollingLinearRegressionOfReturns. RollingLinearRegressionOfReturns is a factor that computes multiple outputs, a feature that was released for custom factors a couple of weeks ago. This algorithm looks at the beta of high dollar-volume stocks with SPY, then longs the low-beta stocks and shorts the high-beta stocks. It also records/plots the rolling alpha, beta and correlation of AAPL with SPY to help better visualize what each output might look like.
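For readers unfamiliar with the three outputs recorded here, the following is a minimal standalone sketch of a simple least-squares fit of one return series on another. It is plain Python, not the factor's actual implementation; the function name `linear_regression` and the sample returns are made up for illustration, and it assumes the target asset (SPY) plays the role of the independent variable.

```python
import math

def linear_regression(x, y):
    """Ordinary least squares fit of y = alpha + beta * x.

    Returns (alpha, beta, r_value) for two equal-length sequences.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx                       # slope
    alpha = mean_y - beta * mean_x         # intercept
    r_value = sxy / math.sqrt(sxx * syy)   # Pearson correlation
    return alpha, beta, r_value

# Made-up returns, for illustration only.
spy_returns = [0.01, -0.02, 0.015, 0.005, -0.01]
stock_returns = [0.001 + 1.5 * r for r in spy_returns]

alpha, beta, r_value = linear_regression(spy_returns, stock_returns)
print(alpha, beta, r_value)
```

Because the sample stock returns are an exact linear function of the SPY returns, the fit recovers the intercept and slope used to build them, and the correlation is 1 up to floating-point error.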

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import (
    AverageDollarVolume, RollingLinearRegressionOfReturns,
)

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # This is the cutoff for determining high-beta vs. low-beta stocks.
    context.beta_threshold = 1

    # Arguments to pass to our linear regression factor.
    context.returns_length = 30
    context.regression_length = 100
    context.target_asset = sid(8554)  # SPY

    schedule_function(
        func=rebalance,
        date_rule=date_rules.month_start(days_offset=0),
        time_rule=time_rules.market_open(hours=0, minutes=30),
    )
    schedule_function(
        func=record_vars,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_close(hours=1),
    )

    attach_pipeline(my_pipeline(context), 'my_pipeline')

def my_pipeline(context):
    """
    A function to create our dynamic stock selector (pipeline).
    """
    pipe = Pipeline()

    # Only consider the top 1% of stocks ranked by dollar volume.
    dollar_volume = AverageDollarVolume(window_length=1)
    high_dollar_volume = dollar_volume.percentile_between(99, 100)

    # Create a regression factor with our asset of interest.
    regression = RollingLinearRegressionOfReturns(
        target=context.target_asset,
        returns_length=context.returns_length,
        regression_length=context.regression_length,
    )
    alpha = regression.alpha
    beta = regression.beta
    correlation = regression.r_value

    low_beta = (beta < context.beta_threshold) & \
               (beta > -context.beta_threshold)
    high_beta = ~low_beta

    # Add every term that is read from the pipeline output further down.
    pipe.add(alpha, 'alpha')
    pipe.add(beta, 'beta')
    pipe.add(correlation, 'correlation')
    pipe.add(low_beta, 'low_beta')
    pipe.add(high_beta, 'high_beta')
    pipe.set_screen(high_dollar_volume)

    return pipe

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = output = pipeline_output('my_pipeline')

    context.longs = longs = output[output['low_beta']].index
    context.shorts = shorts = output[output['high_beta']].index
    context.long_weights, context.short_weights = assign_weights(longs, shorts)

def assign_weights(longs, shorts):
    """
    Assign weights to securities that we want to order.
    """
    # Equally weight all of our securities.
    num_longs = len(longs)
    num_shorts = len(shorts)
    equal_weight = 1.0 / (num_longs + num_shorts)
    long_weights = [equal_weight] * num_longs
    short_weights = [-equal_weight] * num_shorts
    return long_weights, short_weights

def rebalance(context, data):
    """
    Long low-beta stocks and short high-beta stocks, rebalancing monthly.
    """
    log.info('Long leverage this month: %f' % sum(context.long_weights))
    log.info('Short leverage this month: %f' % sum(context.short_weights))

    for asset, weight in zip(context.longs, context.long_weights):
        order_target_percent(asset, weight)

    for asset, weight in zip(context.shorts, context.short_weights):
        order_target_percent(asset, weight)

def record_vars(context, data):
    """
    Plot AAPL's rolling correlation, alpha and beta with SPY.
    """
    AAPL_data = context.output.loc[sid(24), :]
    record(
        AAPL_corr=AAPL_data['correlation'],
        AAPL_alpha=AAPL_data['alpha'],
        AAPL_beta=AAPL_data['beta'],
    )


21 responses

This post should be marked as interesting.

This is incredibly good progress in the last few weeks on Q!

Can these be used to compute the correlation to the cross-sectional mean return? So, rather than including SPY in the universe, you just compare each stock's individual returns to the cross-sectional mean return. This could be market-cap weighted, but equal weighting seems to be used often in papers.
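To make the question concrete, here is a minimal sketch of the computation being asked about, written in plain Python outside of pipeline: each stock's returns are correlated against the equal-weighted mean return across all stocks in each period. The helper names and the sample returns are hypothetical, and this says nothing about whether the built-in factors can do it.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def correlation_to_cross_sectional_mean(returns_by_asset):
    """Correlate each asset's returns with the equal-weighted mean return.

    returns_by_asset maps an asset name to a list of per-period returns;
    all lists are assumed to have the same length.
    """
    names = list(returns_by_asset)
    n_periods = len(returns_by_asset[names[0]])
    # Equal-weighted mean return across all assets, one value per period.
    mean_returns = [
        sum(returns_by_asset[name][t] for name in names) / len(names)
        for t in range(n_periods)
    ]
    return {name: pearson(returns_by_asset[name], mean_returns)
            for name in names}

# Made-up returns, for illustration only.
returns = {
    'A': [0.01, -0.02, 0.03, 0.00],
    'B': [0.02, -0.01, 0.01, 0.01],
    'C': [-0.01, 0.02, -0.03, 0.00],
}
correlations = correlation_to_cross_sectional_mean(returns)
print(correlations)
```

A market-cap-weighted variant would replace the simple average with a weighted one; the weights would have to come from a separate data source.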

I can see on GitHub that RollingLinearRegressionOfReturns uses Returns(window_length=returns_length) as its input. Does this mean we can pass factors as inputs to other factors at last?

The "rebalance" function in the example doesn't seem to ever close orphan positions from previous months, so not quite rebalancing.

Here's a version of the above with a fixed rebalancing function, producing a more realistic backtest.

Quantopian: what about having a default maximum leverage (e.g. 3x, or the same as the contest condition) to catch gross bugs like this earlier, with an API routine to change the default for people who really want extreme leverage?

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import (
    AverageDollarVolume, RollingLinearRegressionOfReturns,
)

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # This is the cutoff for determining high-beta vs. low-beta stocks.
    context.beta_threshold = 1

    # Arguments to pass to our linear regression factor.
    context.returns_length = 30
    context.regression_length = 100
    context.target_asset = symbol('SPY')

    schedule_function(
        func=rebalance,
        date_rule=date_rules.month_start(days_offset=0),
        time_rule=time_rules.market_open(hours=0, minutes=30),
    )
    schedule_function(
        func=record_vars,
        date_rule=date_rules.every_day(),
        time_rule=time_rules.market_close(hours=1),
    )

    attach_pipeline(my_pipeline(context), 'my_pipeline')

def my_pipeline(context):
    """
    A function to create our dynamic stock selector (pipeline).
    """
    pipe = Pipeline()

    # Only consider the top 1% of stocks ranked by dollar volume.
    dollar_volume = AverageDollarVolume(window_length=1)
    high_dollar_volume = dollar_volume.percentile_between(99, 100)

    # Create a regression factor with our asset of interest.
    regression = RollingLinearRegressionOfReturns(
        target=context.target_asset,
        returns_length=context.returns_length,
        regression_length=context.regression_length,
    )
    alpha = regression.alpha
    beta = regression.beta
    correlation = regression.r_value

    low_beta = (beta < context.beta_threshold) & \
               (beta > -context.beta_threshold)
    high_beta = ~low_beta

    # Add every term that is read from the pipeline output further down.
    pipe.add(alpha, 'alpha')
    pipe.add(beta, 'beta')
    pipe.add(correlation, 'correlation')
    pipe.add(low_beta, 'low_beta')
    pipe.add(high_beta, 'high_beta')
    pipe.set_screen(high_dollar_volume)

    return pipe

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = output = pipeline_output('my_pipeline')

    context.longs = longs = output[output['low_beta']].index
    context.shorts = shorts = output[output['high_beta']].index
    context.long_weights, context.short_weights = assign_weights(longs, shorts)

def assign_weights(longs, shorts):
    """
    Assign weights to securities that we want to order.
    """
    # Equally weight all of our securities.
    num_longs = len(longs)
    num_shorts = len(shorts)
    equal_weight = 1.0 / (num_longs + num_shorts)
    long_weights = [equal_weight] * num_longs
    short_weights = [-equal_weight] * num_shorts
    return long_weights, short_weights

def rebalance(context, data):
    """
    Long low-beta stocks and short high-beta stocks, rebalancing monthly.
    """
    log.info('Long leverage this month: %f' % sum(context.long_weights))
    log.info('Short leverage this month: %f' % sum(context.short_weights))

    # Close any position that is no longer in the long or the short basket.
    sells = set(context.portfolio.positions) - set(context.longs) - set(context.shorts)
    for asset in sells:
        order_target(asset, 0)

    trades = (list(zip(context.longs, context.long_weights)) +
              list(zip(context.shorts, context.short_weights)))
    for asset, weight in trades:
        order_target_percent(asset, weight)

def record_vars(context, data):
    """
    Plot AAPL's rolling correlation, alpha and beta with SPY.
    """
    AAPL_data = context.output.loc[sid(24), :]
    record(
        AAPL_corr=AAPL_data['correlation'],
        AAPL_alpha=AAPL_data['alpha'],
        AAPL_beta=AAPL_data['beta'],
    )
    record(leverage=context.account.leverage)
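
The orphan-position fix can also be looked at in isolation. The sketch below is plain Python with no Quantopian calls: it builds a target weight for every asset, with a weight of zero for anything currently held that is in neither basket. The function name `target_weights` and the ticker strings are hypothetical, and it assumes at least one asset is wanted and that no asset appears in both baskets.

```python
def target_weights(current_positions, longs, shorts):
    """Map every relevant asset to a target weight.

    Assets held now but absent from both baskets get a target of 0.0,
    which is what closes the orphan positions.
    """
    wanted = set(longs) | set(shorts)
    weight = 1.0 / len(wanted)  # equal weight across both baskets
    targets = {asset: 0.0 for asset in current_positions
               if asset not in wanted}
    targets.update({asset: weight for asset in longs})
    targets.update({asset: -weight for asset in shorts})
    return targets

# Made-up holdings and baskets, for illustration only.
held = ['AAA', 'BBB', 'CCC']
targets = target_weights(held, longs=['AAA', 'DDD'], shorts=['EEE', 'FFF'])
print(targets)
```

Without the zero-weight entries, 'BBB' and 'CCC' would stay in the portfolio indefinitely and gross leverage would grow with each monthly rebalance.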

Awesome! Thank You Q. This makes Pipeline much more appealing.

@ Norbert, thanks for the mod as well. Saves a little time!

@Dan Unfortunately, what you are suggesting is not currently possible with these factors; they are specifically limited to computing correlations between the returns of individual stocks. As I mentioned above, however, we intend to implement a more generic version of these factors that would allow for more interesting computations such as what you have in mind, so stay tuned and thanks for the feedback!

@Luca In general we still do not support the use of factors as inputs, but with the release of these new correlation/regression factors we have deemed that a select few factors can safely be used as inputs (such as Returns). The main difficulty of using other factors as inputs here is accounting for splits. For example, if we try to use SimpleMovingAverage as the input to our regression factor, and our target asset undergoes a split, the regression output would be distorted for any regressions computed over a window containing the date of the split. Factors that are comparable across splits, such as zscore and rank, are other examples (besides Returns) of factors that could be used as inputs.

@Norbert Thanks for catching that. We have actually implemented a set_max_leverage method, but have yet to release it for use in the IDE. I made an internal feature request to try to get that out.
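Until something like that is available, a guard of the kind Norbert describes can be sketched by hand. The following is a hypothetical helper in plain Python, not the set_max_leverage method mentioned above (whose behavior is not shown in this thread): it computes gross leverage as the sum of absolute target weights and raises if a cap is exceeded. The 3x default simply mirrors the figure suggested earlier in the thread.

```python
def gross_leverage(weights):
    """Gross leverage is the sum of absolute portfolio weights."""
    return sum(abs(w) for w in weights)

def check_leverage(weights, max_leverage=3.0):
    """Return the gross leverage, or raise if it exceeds the cap."""
    gross = gross_leverage(weights)
    if gross > max_leverage:
        raise ValueError(
            'gross leverage %.2f exceeds cap %.2f' % (gross, max_leverage)
        )
    return gross

# A balanced long/short book passes the check.
ok = check_leverage([0.5, -0.5])

# A book that has accumulated stale positions does not.
try:
    check_leverage([2.0, -1.5, 0.5])
    exceeded = False
except ValueError:
    exceeded = True
print(ok, exceeded)
```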

Thank you David, your explanation makes sense. I appreciate that

@ David,

I am trying to pinpoint the time of the pricing that is used to calculate daily returns in this factor. This is probably obvious, but just want to confirm that daily returns are calculated on the open and close price that would correspond to the open and close price of the data I make use of in the backtester?

Thanks

Frank,

These built-in factors use close price to calculate Returns; this is consistent in both research and the backtester. For example, this can be seen here, where the Returns factor is created. Note that Returns uses a default input of close price. The built-ins are somewhat limited in scope as they restrict you from using, say, open price for Returns. In order to facilitate more generic computations, we implemented new correlation/regression methods for Factors, which will allow you to customize your Returns factor and even compute against different factors besides Returns. The post for that can be found here.

Thanks David.

close and close[-1] from USEquityPricing.close are what are used...that is what I was trying to understand.
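A small sketch of what that means for different returns_length values, assuming (as the discussion here suggests) that a Returns window of length n compares the first and last close in an n-bar window. The function name `window_return` and the prices are made up for illustration.

```python
def window_return(closes, returns_length):
    """Percent change from the first to the last close of the trailing window."""
    window = closes[-returns_length:]
    return (window[-1] - window[0]) / window[0]

# Made-up closing prices, oldest first.
closes = [100.0, 102.0, 101.0, 105.0]

daily = window_return(closes, returns_length=2)      # (105 - 101) / 101
three_bar = window_return(closes, returns_length=3)  # (105 - 102) / 102
print(daily, three_bar)
```

Under that assumption, returns_length=2 gives ordinary day-over-day returns, while larger values give overlapping multi-day returns.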

Is it possible to perform a multiple regression?

Is it possible to use these factors in a research Notebook? I can't seem to get them to work in the research environment. Is there a trick for passing the target= variable in research?

This worked for me

    regression_factor = RollingLinearRegressionOfReturns(
        target=symbols('SPY'),
        returns_length=10,
        regression_length=90,
    )


Emerging from my shell to let all of you know that this function is only accurate when returns_length = 2. regression_length can be anything you wish, except that it must be 1 fewer than the number of trading days you are targeting.

A returns_length > 2 can make sense, but it depends on what you are looking for. Why do you say it's not accurate? Is there a bug in the factor code?

Code to clarify the apparent requirement of returns_length=2 for stock Beta values.

'''
For comparison, produces a table for each security of:
o Pandas Beta calculation.
o RollingLinearRegressionOfReturns "beta" value.
o statsmodels.api slope calculation (slope of regression, i.e. trend line)

RollingLinearRegressionOfReturns output does not match statsmodels.api slope nor Pandas Beta
except when returns_length = 2.
Every source on the web says Regression "beta" just means slope of the trend line.

Stocks     Beta means volatility compared to some other volatility.
Regression beta means simply slope.
Regression is just a fancy word for trend line coined by Charles Darwin's cousin in the Victorian era.

Here, modifying returns_length and regression_length, you can find instances of positive pandas values and negative RollingLinearRegressionOfReturns values and vice versa.

RollingLinearRegressionOfReturns() claims to produce volatility Beta values with any returns_length and I haven't been able to find a correlation with anything except 2.
'''

from quantopian.algorithm        import attach_pipeline, pipeline_output
from quantopian.pipeline         import Pipeline, CustomFilter
from quantopian.pipeline.factors import AverageDollarVolume, RollingLinearRegressionOfReturns
import statsmodels.api as sm
import numpy as np

def initialize(context):
    c = context
    c.target_sec        = sid(8554)  # SPY
    c.num               = 2          # Number each of high and low to return
    c.regression_length = 30
    c.returns_length    = 3
    print(".\n\n\tChange c.returns_length to 2 if you want the numbers to match\n.")
    schedule_function(calcs, date_rules.every_day(), time_rules.market_open())

    # Modify to inject any sids that have presumably known Beta from some other source ...
    sids = SidInList( sid_list = ( c.target_sec , sid(33431), sid(30877), sid(49072), sid(1374) ))
    dv   = AverageDollarVolume(window_length=10).percentile_between(94, 95)  # not used in the screen below

    regression = RollingLinearRegressionOfReturns(
        target            = c.target_sec,
        returns_length    = c.returns_length,
        regression_length = c.regression_length,
    )
    pipe  = Pipeline()
    ''' Add these columns if you wish ...
    yval  = regression.alpha   ; pipe.add(yval,  'yval')
    corr  = regression.r_value ; pipe.add(corr,  'corr')
    pval  = regression.p_value ; pipe.add(pval,  'pval')
    stder = regression.stderr  ; pipe.add(stder, 'stder')
    '''
    # This is what statisticians mean by regression "beta", i.e. slope of trend line.
    slope = regression.beta    ; pipe.add(slope, 'slope')

    pipe.set_screen( sids | slope.top(c.num) | slope.bottom(c.num) )
    attach_pipeline(pipe, 'zoo')

def calcs(context, data):
    c = context

    prices = c.prices
    prices['SPY_chg'] = prices[sid(8554)].pct_change()
    spy_var = prices['SPY_chg'].var()                             # variance of SPY returns

    log.info('.')
    log.info('           Pandas      Regression         OLS ')
    log.info('  Sym       Beta      "beta" Slope       Slope')

    for sec in prices.columns:
        if sec == 'SPY_chg': continue

        # pandas beta calc
        beta = prices[sec].pct_change().cov(prices.SPY_chg) / spy_var

        # Log all
        log.info('{} {} {} {}'.format(
            sec.symbol.rjust(5),
            ('%.2f' % beta).rjust(10),                        # pandas beta
            ('%.2f' % c.output['slope'][sec]).rjust(14),      # regression slope
            ('%.2f' % slope(prices[sec].values)).rjust(14),   # statsmodels slope
        ))

def before_trading_start(context, data):
    context.output = pipeline_output('zoo')

    # pandas beta calculation prep
    context.prices  = data.history(
        context.output.index,
        'price',
        context.regression_length + 1, '1d'        # +1 due to the one-off discrepancy.
    )

def slope(in_list):    # Return beta (slope of trend line) portion of OLS regression.
    # Regress the values on a simple bar counter ending at 0.
    return sm.OLS(in_list, sm.add_constant(np.arange(-len(in_list) + 1, 1))).fit().params[-1]

class SidInList(CustomFilter):
    inputs = []
    window_length = 1
    params = ('sid_list',)

    def compute(self, today, assets, out, sid_list):
        out[:] = np.in1d(assets, sid_list)

'''
c.output.sort('slope')
DataFrame:
                          corr           pval      slope     stder      yval
Equity(37736 [UCO])  -0.932672   7.420639e-10  -8.918334  0.791325  0.140035
Equity(4664 [SM])    -0.858707   6.303240e-07  -7.274278  0.995943  0.472035
Equity(30877 [CAPR]) -0.100903   6.634221e-01  -0.402654  0.910818 -0.086735
Equity(33431 [ROSG])  0.469518   3.175997e-02   0.901243  0.388808 -0.068106
Equity(8554 [SPY])    1.000000  1.308086e-188   1.000000  0.000000  0.000000
Equity(1374 [CDE])    0.880160   1.438864e-07  10.636107  1.315960 -0.054483
Equity(18522 [ARMH])  0.840600   1.839885e-06  13.218204  1.954028  0.000017

           Pandas      Regression         OLS        FinViz
  Sym       Beta      "beta" Slope       Slope        Beta
 CAPR       0.13          -3.45          -0.00       -2.25
  CDE       7.32          21.56          -0.16        1.11
 HAIN       7.49          21.37          -0.96        0.92
 LABD     -10.06         -23.21           0.08        none
 NTAP       2.87         -11.83           0.47        1.58
 ROSG      -2.50           4.43          -0.01       -4.49
  SPY       1.00           1.00          -0.02        none

Note: Since the finviz lookback window and method are unknown and values unverified,
there is no reason to trust it, nevertheless just in case, some are added
here for what it might be worth, however they only apply to the date of this code originally,
long gone now.
'''


@blue, I didn't read the code carefully, but there are at least two things I noticed:
- You should replace .pct_change() with .pct_change(context.returns_length-1) to get the same returns as the pipeline factor.
- data.history should fetch enough days of data to calculate .pct_change on context.regression_length days, that is: context.regression_length + context.returns_length - 1 (then you have to drop the NaNs).

This should work with any returns_length value
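The arithmetic behind those two points can be checked with a short standalone sketch. The `pct_change` function below is a plain-Python stand-in for the pandas method of the same name (it marks undefined values with None instead of NaN), and the prices are made up.

```python
def pct_change(prices, periods):
    """Percent change over `periods` bars; the first `periods` values are undefined."""
    return [
        None if i < periods
        else (prices[i] - prices[i - periods]) / prices[i - periods]
        for i in range(len(prices))
    ]

regression_length = 5
returns_length = 3

# Number of price bars to fetch, as suggested above.
n_bars = regression_length + returns_length - 1
prices = [100.0 + i for i in range(n_bars)]

changes = pct_change(prices, returns_length - 1)
valid = [c for c in changes if c is not None]  # drop the undefined leading values
print(n_bars, len(valid))
```

Fetching regression_length + returns_length - 1 bars leaves exactly regression_length usable returns once the undefined leading values are dropped.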

'''
For comparison, produces a table for each security of:
o Pandas Beta calculation.
o RollingLinearRegressionOfReturns "beta" value.
o statsmodels.api slope calculation (slope of regression, i.e. trend line)

RollingLinearRegressionOfReturns output does not match statsmodels.api slope nor Pandas Beta
except when returns_length = 2.
Every source on the web says Regression "beta" just means slope of the trend line.

Stocks     Beta means volatility compared to some other volatility.
Regression beta means simply slope.
Regression is just a fancy word for trend line coined by Charles Darwin's cousin in the Victorian era.

Here, modifying returns_length and regression_length, you can find instances of positive pandas values and negative RollingLinearRegressionOfReturns values and vice versa.

RollingLinearRegressionOfReturns() claims to produce volatility Beta values with any returns_length and I haven't been able to find a correlation with anything except 2.
'''

from quantopian.algorithm        import attach_pipeline, pipeline_output
from quantopian.pipeline         import Pipeline, CustomFilter
from quantopian.pipeline.factors import AverageDollarVolume, RollingLinearRegressionOfReturns
import statsmodels.api as sm
import numpy as np

def initialize(context):
    c = context
    c.target_sec        = sid(8554)  # SPY
    c.num               = 2          # Number each of high and low to return
    c.regression_length = 30
    c.returns_length    = 3
    print(".\n\n\tChange c.returns_length to 2 if you want the numbers to match\n.")
    schedule_function(calcs, date_rules.every_day(), time_rules.market_open())

    # Modify to inject any sids that have presumably known Beta from some other source ...
    sids = SidInList( sid_list = ( c.target_sec , sid(33431), sid(30877), sid(49072), sid(1374) ))
    dv   = AverageDollarVolume(window_length=10).percentile_between(94, 95)  # not used in the screen below

    regression = RollingLinearRegressionOfReturns(
        target            = c.target_sec,
        returns_length    = c.returns_length,
        regression_length = c.regression_length,
    )
    pipe  = Pipeline()
    ''' Add these columns if you wish ...
    yval  = regression.alpha   ; pipe.add(yval,  'yval')
    corr  = regression.r_value ; pipe.add(corr,  'corr')
    pval  = regression.p_value ; pipe.add(pval,  'pval')
    stder = regression.stderr  ; pipe.add(stder, 'stder')
    '''
    # This is what statisticians mean by regression "beta", i.e. slope of trend line.
    slope = regression.beta    ; pipe.add(slope, 'slope')

    pipe.set_screen( sids | slope.top(c.num) | slope.bottom(c.num) )
    attach_pipeline(pipe, 'zoo')

def calcs(context, data):
    c = context

    prices = c.prices
    prices['SPY_chg'] = prices[sid(8554)].pct_change(c.returns_length-1)
    spy_var = prices['SPY_chg'].var()                             # variance of SPY returns

    log.info('.')
    log.info('           Pandas      Regression         OLS ')
    log.info('  Sym       Beta      "beta" Slope       Slope')

    for sec in prices.columns:
        if sec == 'SPY_chg': continue

        # pandas beta calc
        beta = prices[sec].pct_change(c.returns_length-1).cov(prices.SPY_chg) / spy_var

        # Log all
        log.info('{} {} {} {}'.format(
            sec.symbol.rjust(5),
            ('%.2f' % beta).rjust(10),                        # pandas beta
            ('%.2f' % c.output['slope'][sec]).rjust(14),      # regression slope
            ('%.2f' % slope(prices[sec].values)).rjust(14),   # statsmodels slope
        ))

def before_trading_start(context, data):
    context.output = pipeline_output('zoo')

    # pandas beta calculation prep
    context.prices  = data.history(
        context.output.index,
        'price',
        context.regression_length + context.returns_length - 1, '1d'
    )

def slope(in_list):    # Return beta (slope of trend line) portion of OLS regression.
    # Regress the values on a simple bar counter ending at 0.
    return sm.OLS(in_list, sm.add_constant(np.arange(-len(in_list) + 1, 1))).fit().params[-1]

class SidInList(CustomFilter):
    inputs = []
    window_length = 1
    params = ('sid_list',)

    def compute(self, today, assets, out, sid_list):
        out[:] = np.in1d(assets, sid_list)

'''
c.output.sort('slope')
DataFrame:
                          corr           pval      slope     stder      yval
Equity(37736 [UCO])  -0.932672   7.420639e-10  -8.918334  0.791325  0.140035
Equity(4664 [SM])    -0.858707   6.303240e-07  -7.274278  0.995943  0.472035
Equity(30877 [CAPR]) -0.100903   6.634221e-01  -0.402654  0.910818 -0.086735
Equity(33431 [ROSG])  0.469518   3.175997e-02   0.901243  0.388808 -0.068106
Equity(8554 [SPY])    1.000000  1.308086e-188   1.000000  0.000000  0.000000
Equity(1374 [CDE])    0.880160   1.438864e-07  10.636107  1.315960 -0.054483
Equity(18522 [ARMH])  0.840600   1.839885e-06  13.218204  1.954028  0.000017

           Pandas      Regression         OLS        FinViz
  Sym       Beta      "beta" Slope       Slope        Beta
 CAPR       0.13          -3.45          -0.00       -2.25
  CDE       7.32          21.56          -0.16        1.11
 HAIN       7.49          21.37          -0.96        0.92
 LABD     -10.06         -23.21           0.08        none
 NTAP       2.87         -11.83           0.47        1.58
 ROSG      -2.50           4.43          -0.01       -4.49
  SPY       1.00           1.00          -0.02        none

Note: Since the finviz lookback window and method are unknown and values unverified,
there is no reason to trust it, nevertheless just in case, some are added
here for what it might be worth, however they only apply to the date of this code originally,
long gone now.
'''


Good. Now, since this is a linear regression, what can be done to make the algo's OLS slope match as well?

Beta using pandas:

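import pandas as pd

def beta_sids(context, data):
    spy = sid(8554)
    # One year of daily closes for the stocks plus SPY, converted to daily returns.
    changes = data.history(list(context.stocks) + [spy], 'close', 252, '1d').ffill().bfill().pct_change()
    # Beta of each column: covariance with SPY returns over the variance of SPY returns.
    return pd.Series({s: changes[s].cov(changes[spy]) / changes[spy].var() for s in changes})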


Edit 2017-06-18
It has dawned on me since then that my point above, about linear regression beta being purely a slope, is just one case: the one where one of the two sets being compared changes at a constant rate, like time or a counter.

When two sets of varying data are used as inputs to a linear regression (the implementation provided here by Quantopian), the beta value is precisely the beta we are familiar with in the environment around here, the stock market's beta.
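Both cases can be seen with one small function. The sketch below is plain Python with made-up numbers; the helper name `cov_beta` is hypothetical. It computes the least-squares slope as covariance over variance, first with a time counter as the independent variable and then with market returns.

```python
def cov_beta(x, y):
    """Slope of the least-squares line of y on x, i.e. cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n-1) factors of covariance and variance cancel in the ratio.
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x

# Case 1: x is a time counter, so the slope is a trend line through prices.
prices = [10.0, 12.0, 14.0, 16.0]
trend_slope = cov_beta(range(len(prices)), prices)

# Case 2: x is the market's returns, so the slope is the stock's market beta.
market_returns = [0.01, -0.02, 0.03, 0.00]
stock_returns = [0.5 * r for r in market_returns]
market_beta = cov_beta(market_returns, stock_returns)

print(trend_slope, market_beta)
```

The same formula yields a price trend of 2 per bar in the first case and a beta of 0.5 in the second; only the choice of independent variable differs.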