101 Alphas Project: Alpha #41

Here is alpha #41 from the 101 alphas project. I would love to hear ideas about combining alphas!

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline.factors import VWAP
import numpy as np
import scipy
import math

class ROE(CustomFactor):
    # Latest return on equity from Morningstar; used below only as a screen.
    inputs = [morningstar.operation_ratios.roe]
    window_length = 1
    def compute(self, today, assets, out, roe):
        out[:] = roe[-1]

class Alpha41(CustomFactor):   
    inputs = [USEquityPricing.low, USEquityPricing.high] 
    window_length = 1
    def compute(self, today, assets, out, low, high):        
        out[:] = high[0]*low[0]

def initialize(context):
    #set_commission(commission.PerShare(cost=0, min_trade_cost=None))

    pipe = Pipeline()
    attach_pipeline(pipe, 'ranked')
    dollar_volume = AverageDollarVolume(window_length=1)
    high_dollar_volume = dollar_volume.percentile_between(95, 100)
    alpha41 = Alpha41(mask=high_dollar_volume)
    vwap = VWAP(window_length=1)
    alpha41 = alpha41**.5 - vwap
    alpha41_rank = alpha41.rank(mask=high_dollar_volume)
    roe =  ROE(mask=high_dollar_volume)

    combo_raw = (alpha41_rank)
    pipe.add(combo_raw, 'combo_raw') 
    pipe.set_screen(roe > .005)

    context.long_leverage =  .5
    context.short_leverage = -.5
    context.short_num = 20
    context.long_num = 20

def before_trading_start(context, data):
    context.output = pipeline_output('ranked')
    context.long_list = context.output.sort_values(['combo_raw'], ascending=False).iloc[:context.long_num]
    context.short_list = context.output.sort_values(['combo_raw'], ascending=False).iloc[-context.short_num:]

def rebalance(context,data):
    # Equal-weight each side; fall back to a zero weight if a list is empty.
    if len(context.long_list) != 0:
        long_weight = context.long_leverage / float(len(context.long_list))
    else:
        long_weight = 0
    if len(context.short_list) != 0:
        short_weight = context.short_leverage / float(len(context.short_list))
    else:
        short_weight = 0
    for long_stock in context.long_list.index:
        if data.can_trade(long_stock):
            if long_stock not in security_lists.leveraged_etf_list:
                order_target_percent(long_stock, long_weight)
    for short_stock in context.short_list.index:
        if data.can_trade(short_stock):
            if short_stock not in security_lists.leveraged_etf_list:
                order_target_percent(short_stock, short_weight)
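    # Close any position that is no longer in either list.
    for stock in context.portfolio.positions:
        if stock not in context.long_list.index and stock not in context.short_list.index:
            if data.can_trade(stock):
                # The posted code is cut off here; closing the position is an assumed completion.
                order_target_percent(stock, 0)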

28 responses

Georges, have you seen Alphalens, our open-source tool for analyzing alpha factors? It is a much better tool for learning how good a given alpha is. Below is your Alpha 41 run through an Alphalens tear sheet.



start = pd.Timestamp("2015-01-01") # Can't choose a too long time-period or we run out of RAM  
end = pd.Timestamp("2016-03-01")  

Seems like we need some figure-of-merit over a long time scale. If you can only do ~ 1 year, how's this gonna work? Can't you write results to disk, as the computation is carried out? Why is RAM a limitation?

@Grant, here is a ten-year analysis. It's just a matter of splitting the pipeline in chunks
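For anyone wondering what "splitting in chunks" could look like, here is a minimal sketch. It is not the code from the notebook: `run_in_chunks` and `fake_pipeline` are hypothetical names, and `fake_pipeline` is only a stand-in for whatever function actually runs the pipeline over a date range. The idea is to run short windows one after another so that only one window's intermediate data is in memory at a time, then stack the outputs.

```python
import pandas as pd

def run_in_chunks(run_fn, start, end, chunk_days=180):
    """Call run_fn(chunk_start, chunk_end) over consecutive windows and stack the results."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    one_day = pd.Timedelta(days=1)
    frames = []
    chunk_start = start
    while chunk_start <= end:
        # Each window covers chunk_days calendar days, clipped to the overall end date.
        chunk_end = min(chunk_start + pd.Timedelta(days=chunk_days) - one_day, end)
        frames.append(run_fn(chunk_start, chunk_end))
        chunk_start = chunk_end + one_day
    return pd.concat(frames)

def fake_pipeline(chunk_start, chunk_end):
    # Hypothetical stand-in for a real pipeline run: one row per calendar day.
    days = pd.date_range(chunk_start, chunk_end, freq="D")
    return pd.DataFrame({"factor": list(range(len(days)))}, index=days)

result = run_in_chunks(fake_pipeline, "2006-01-01", "2016-01-01")
print(result.index[0], result.index[-1], len(result))
```

The windows do not overlap, so the stacked result has exactly one row per date. A factor with a long lookback would still need each chunk to load its own warm-up history, which this sketch leaves to the function being called.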


@Luca -

"It's just a matter of splitting the pipeline in chunks"

Is this a new feature?

It'd be interesting to hear from the Q team on how to interpret the result. At first glance, it just looks like a bunch of noise.

I am using Quantopian to learn how to program in Python, so forgive me if I am totally off base, but I don't understand why we want to create a class called Alpha41 that does half the calculation and do the remainder in the factor function. I was not sure how to access the pricing data within the function, so I created a class for high, low, and close and then modified factor to my factor41 as below:

def factor41(mask):
    #alpha41 = Alpha41(mask=mask)
    high = p_high(mask=mask)
    low = p_low(mask=mask)
    close = p_close(mask=mask)
    vwap = VWAP(window_length=1, mask=mask)
    #alpha41 = alpha41**.5 - vwap
    alpha41 = (low * high) ** .5 - vwap
    return alpha41

then I created factor 42:

def factor42(mask):
    close = p_close(mask=mask)
    vwap = VWAP(window_length=1, mask=mask)
    alpha42 = (vwap - close).rank() / (vwap + close).rank()
    return alpha42

You can see the full code in the attached notebook. Does this seem like an okay approach? Or is there a better way?
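To check the arithmetic outside of pipeline, the two formulas can be sketched on a small table with plain pandas. The tickers and prices below are made up for illustration, and `alpha41` / `alpha42` here are ordinary functions, not pipeline factors. Note that pandas `rank()` gives ordinal ranks (1, 2, 3, ...) across the rows, which is assumed to be an acceptable stand-in for a cross-sectional rank.

```python
import numpy as np
import pandas as pd

def alpha41(high, low, vwap):
    # Geometric mean of the day's high and low, minus the day's VWAP.
    return np.sqrt(high * low) - vwap

def alpha42(close, vwap):
    # Cross-sectional rank of (vwap - close) divided by rank of (vwap + close).
    return (vwap - close).rank() / (vwap + close).rank()

# Made-up one-day snapshot for three hypothetical tickers.
bars = pd.DataFrame(
    {"high": [10.0, 25.0, 8.0], "low": [9.0, 16.0, 8.0],
     "close": [9.5, 18.0, 8.0], "vwap": [9.4, 21.0, 8.0]},
    index=["AAA", "BBB", "CCC"],
)
bars["alpha41"] = alpha41(bars["high"], bars["low"], bars["vwap"])
bars["alpha42"] = alpha42(bars["close"], bars["vwap"])
print(bars[["alpha41", "alpha42"]])
```

For BBB, sqrt(25 * 16) = 20 and the VWAP is 21, so alpha 41 is -1. For CCC, high, low and VWAP are all 8, so it is 0.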



In the original algorithm there are two factors because VWAP is already part of pipeline and the author didn't want to reinvent the wheel. So he wrote only the missing code, Alpha41, which is just half of the calculation, and combined it with VWAP to get the actual alpha factor described in the paper. The author couldn't use VWAP inside Alpha41 because of a restriction on which factors can be used as inputs to other factors; that's why we have two distinct factors.

Regarding the factor function in the notebook, please note that the run_tear_sheet function accepts only one factor as input. As we have two factors, I had to write a function that combines them and returns a single one (and pass this function to run_tear_sheet instead).

Is it possible to add the chunking behaviour internally if the range is too long, so the end user does not have to deal with this on their side?

@Suminda, have a look here

A couple of observations:

  1. All of the alpha is from 2008. I haven't checked, but I assume that the model includes short positions in the approximately 1000 financial names that were not allowed to be shorted during the Sep-Oct short sale ban. Ideally, the platform would automatically disallow those trades.

  2. The "VWAP" pipeline factor is not really VWAP. It's just average daily close weighted by daily volume. With window_length=1, VWAP=close. What's needed is to calculate daily VWAP from the minutely bars (ideally from ticks, but minutely can approximate it if you use V*(H+L+C)/3).
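The second point can be illustrated with a short numpy sketch. This is written from the description above (a volume-weighted average of daily closes), not copied from the pipeline source, and `volume_weighted_close` is a hypothetical helper name. With a one-row window the volume cancels and the result is just the close.

```python
import numpy as np

def volume_weighted_close(close, volume):
    """Volume-weighted average of daily closes over a (days x assets) window."""
    close = np.asarray(close, dtype=float)
    volume = np.asarray(volume, dtype=float)
    return (close * volume).sum(axis=0) / volume.sum(axis=0)

# Made-up data: three days (rows) for two assets (columns).
close = np.array([[10.0, 50.0], [11.0, 48.0], [12.0, 49.0]])
volume = np.array([[100.0, 10.0], [300.0, 30.0], [100.0, 10.0]])

three_day = volume_weighted_close(close, volume)
one_day = volume_weighted_close(close[-1:], volume[-1:])  # window_length=1

print(three_day)  # weighted toward the heavy-volume middle day
print(one_day)    # identical to the last close
```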

I don't really understand the alphalens tool yet. Does it tell us, point-in-time, when the IC is non-zero at a specified level of statistical significance? For the 101 Alphas Project: Alpha #41, in 2008, is it non-zero, but zero at other times? Or is it always non-zero, but the market conditions in 2008 resulted in it being effective?
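As background for the IC question: the information coefficient in Alphalens is a Spearman rank correlation between factor values and forward returns, computed across assets for each date. Below is a minimal sketch of that idea on synthetic data. It is not Alphalens code; `daily_ic` is a hypothetical helper, and the per-date p-value comes straight from `scipy.stats.spearmanr`.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr

def daily_ic(factor, forward_returns):
    """Spearman rank correlation (and p-value) between factor and forward returns, per date."""
    rows = []
    for date in factor.index:
        rho, pvalue = spearmanr(factor.loc[date], forward_returns.loc[date])
        rows.append({"date": date, "ic": rho, "pvalue": pvalue})
    return pd.DataFrame(rows).set_index("date")

# Synthetic data: 3 dates x 50 hypothetical assets.
rng = np.random.default_rng(0)
dates = pd.date_range("2008-09-01", periods=3, freq="D")
assets = ["A%d" % i for i in range(50)]
factor = pd.DataFrame(rng.normal(size=(3, 50)), index=dates, columns=assets)
forward_returns = pd.DataFrame(rng.normal(size=(3, 50)), index=dates, columns=assets)
# Make the first date's returns follow the factor exactly; the other dates stay pure noise.
forward_returns.iloc[0] = factor.iloc[0] * 0.01

ic = daily_ic(factor, forward_returns)
print(ic)
```

On the first date the IC is 1 by construction; on the noise dates it is whatever chance produces. A single day's IC is very noisy, so it is the distribution of the series over time that says something about significance.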

It seems that the Q backtester should include the 2008 short ban, no? Kind of important, I would think.

Regardless of the short ban, 2008 causes distortion in much of this research. (As does 1998-2001, in which it is ridiculously easy to find alpha, but that's not in this database.) Market structure has changed significantly since then, both on the micro level (dark pools, HFT, etc.) and the macro level (indexing, fee-based advisory, etc.). In my own work, I mostly look at results since 2010.

"Market structure has changed significantly since then, both on the micro level (dark pools, HFT, etc.) and the macro level (indexing, fee-based advisory, etc.)."

This is at the heart of one of my basic questions about factors. The idea that one has a large number of factors that apply over long spans of time, back to colonial times doesn't sound workable. Yet, I'd still like to run backtests back to 2002, to test the validity of my system (if I think I can pick good factors now, and exclude bad ones, I should be able to make this assessment at any point in time).

I agree with @mhp more for high-frequency algos, but for daily and weekly ones you need more history.

Does anyone know how alphalens works? Is there a paper about it?

@Luca I just found out about run_pipeline_chunks. Thank you very much for this!

I am new to alpha strategies. Is alpha #41 a momentum strategy or a mean reversion strategy? Is it possible to reduce the trading frequency from minutely to daily and still be profitable? Thanks a lot!

Hi Georges,

I wonder why you use the ROE factor? In the original algo there is only VWAP.


To mhp:

You said: "What's needed is to calculate daily VWAP from the minutely bars". Why should we calculate the daily VWAP from the minutely bars? Q has the VWAP factor. Why not use it directly?


Thomas, the pipeline "VWAP" factor is simply computing a volume-weighted average of daily closes (if you don't believe me, look up the source code). This is not the definition of VWAP that most people use, including the author of the paper which is the topic of this thread. True VWAP is calculated intraday from tick data. A calculation using sum(V*(H+L+C)/3,n)/sum(V,n) of minutely bars can be a good approximation. See also

Hi mhp,

I read the original article again and find: "vwap = daily volume-weighted average price". I think this is what you mean, right? :-)


Yes, and "daily volume-weighted average price" means the volume-weighted average price of all the trades that occurred within that day.

Otherwise, a 1-day VWAP would be the same as the daily close, which it is if you use the pipeline factor.
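The minute-bar approximation mentioned above, sum(V*(H+L+C)/3)/sum(V) within each day, can be sketched in pandas as follows. The bars are made up, `approx_daily_vwap` is a hypothetical helper, and the column names (`high`, `low`, `close`, `volume`) are an assumption about how the minute data is laid out.

```python
import pandas as pd

def approx_daily_vwap(minute_bars):
    """Approximate daily VWAP as sum(V * (H + L + C) / 3) / sum(V) per trading day."""
    typical = (minute_bars["high"] + minute_bars["low"] + minute_bars["close"]) / 3.0
    dollar = typical * minute_bars["volume"]
    day = minute_bars.index.normalize()  # drop the time of day, keep the date
    return dollar.groupby(day).sum() / minute_bars["volume"].groupby(day).sum()

# Made-up minute bars: two on the first day, one on the second.
idx = pd.to_datetime(["2016-03-01 09:31", "2016-03-01 09:32", "2016-03-02 09:31"])
minute_bars = pd.DataFrame(
    {"high": [10.2, 10.4, 11.0], "low": [10.0, 10.2, 10.0],
     "close": [10.1, 10.3, 10.5], "volume": [100, 300, 50]},
    index=idx,
)
vwap = approx_daily_vwap(minute_bars)
print(vwap)
```

On the first day the typical prices are 10.1 and 10.3 with volumes 100 and 300, giving (1010 + 3090) / 400 = 10.25, which differs from that day's last close of 10.3.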

You are right!

Hi - sorry that I have not been active in this thread.

Thomas Chang - the ROE filter is to ensure there are no ETFs included in the security basket. I am sure there are better ways to do this.

Hi Georges,

I saw in the samples from Q another way to exclude ETFs: one can use the following filter:
... have_market_cap = morningstar.valuation.market_cap.latest.notnull()

This works since ETFs have no market_cap from Morningstar.

Besides, I've cloned your algo and run the backtest. I found that if I use less capital, such as 50000, the performance draws down rapidly. Any explanation?

And if I set the start date to 2015, the performance is negative. It seems this algo is not "valid" any more? :-)


Hi mhp,

On the VWAP point: if VWAP should be calculated as you described, does this mean one can place an order in the last minute, or even the last second, before the market closes, and in this way get the "best" daily VWAP? Or can one trade based on yesterday's VWAP?


A further question for mhp:

You wrote above "All of the alpha is from 2008...." and "... In my own work, I mostly look at results since 2010." Do you mean that all, or at least most, of the 101 alphas are less "valid" now, or even not "valid" at all? :-)


Re. VWAP calculation, please see:

Re. "are these alphas valid?" -- I suppose that's for each of us to decide based on our own objectives and timeframes.