Question about "periods" parameter and how it works in alphalens

When I define these periods (say 1, 3, and 5) in alphalens, am I asking alphalens to calculate what the period return would be for the securities in the pipeline for each day? For example, for securities in day 1, it calculates 1, 3, and 5 period returns, and then for securities in day 2 it calculates 1, 3, and 5 day returns (based on factor weights, if I'm not mistaken). What if securities are in both day 1 and day 2 and the factors change from day 1 to day 2? Isn't there overlap? Is the periods parameter there to help us identify holding periods? For example, a 5 day period looks better than a 1 day period, so go with a strategy using 5 days (but then again, pipeline runs every day before trading start, so I am confused as to how this can work out efficiently). Am I looking at this all wrong? Or is this just the return of the portfolio based on different periods (i.e. p1/p0, p3/p0, p5/p0) and not the return of the portfolio if you held the securities for 1, 3, or 5 days?

As always, thanks for the help.

Zayd

9 responses

When I'm defining these periods (i.e. say 1, 3, and 5) in alphalens,
am I asking alphalens to calculate what the period return would be for
the securities in the pipeline for each day? For example, for securities
in day 1, it calculates a 1, 3 and 5 period return and then for securities
in day 2 it calculates 1, 3, and 5 day returns

Yes, exactly.
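
To make the "p1/p0, p3/p0, p5/p0" idea concrete, here is a minimal plain-Python sketch of forward returns for one security. It is not Alphalens code; the function name `forward_returns` and the prices are made up for illustration. Every day is treated as a new starting point, and each period gets its own return measured from that day.

```python
def forward_returns(prices, periods):
    """For each start day t, compute prices[t + n] / prices[t] - 1 for every n in periods.

    Hypothetical helper for illustration. An entry is None when there is
    not enough future data to look n days ahead.
    """
    rows = []
    for t, p0 in enumerate(prices):
        row = {}
        for n in periods:
            if t + n < len(prices):
                row[n] = prices[t + n] / p0 - 1.0
            else:
                row[n] = None
        rows.append(row)
    return rows


# Made-up daily prices for a single security.
prices = [100.0, 102.0, 101.0, 105.0, 110.0, 108.0, 111.0]
fwd = forward_returns(prices, periods=(1, 3, 5))

# fwd[0] holds the 1, 3 and 5 day returns starting on day 0,
# fwd[1] holds the same three returns starting on day 1, and so on.
print(fwd[0])
print(fwd[1])
```

Note that the windows starting on day 0 and day 1 overlap, which is exactly the overlap asked about in the question.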

(based on factor weights if I'm not mistaken).

'Factor Weighted Long/Short Portfolio Cumulative Return' is the only plot that weights securities by factor value. All the other plots weight securities equally, i.e. if a quantile contains 100 securities then each security has a 1/100 weight.
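
As a rough illustration of the two weighting schemes, here is a plain-Python sketch. The factor-weighted version assumes weights proportional to the demeaned factor value, scaled so the absolute weights sum to 1; that normalization is my assumption for the example, and the function names and tickers are hypothetical.

```python
def equal_weights(names):
    """Each security in the group gets the same weight, 1 / N."""
    w = 1.0 / len(names)
    return {name: w for name in names}


def factor_weights(factor):
    """Weights proportional to the demeaned factor value (assumed scheme).

    Positive weights are longs, negative weights are shorts, and the
    absolute weights sum to 1.
    """
    mean = sum(factor.values()) / len(factor)
    demeaned = {name: value - mean for name, value in factor.items()}
    gross = sum(abs(v) for v in demeaned.values())
    return {name: v / gross for name, v in demeaned.items()}


# Made-up factor values for four hypothetical tickers.
factor = {"AAA": 3.0, "BBB": 1.0, "CCC": -1.0, "DDD": -3.0}

eq = equal_weights(list(factor))
fw = factor_weights(factor)
print(eq)
print(fw)
```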

What if securities are in both day 1 and day 2 and the factors change
from day 1 to day 2? Isn't there overlap? Is the periods parameter there
to help us identify holding periods? For example, a 5 day period looks
better than a 1 day period, so go with a strategy using 5 days (but then
again, pipeline runs every day before trading start, so I am confused as
to how this can work out efficiently). Am I looking at this all wrong? Or
is this just the return of the portfolio based on different periods (i.e.
p1/p0, p3/p0, p5/p0) and not the return of the portfolio if you held
the securities for 1, 3, or 5 days?

You are understanding it right, and I'll explain how Alphalens handles the overlap. I'll be a little verbose and summarize how Alphalens works, since this might help other people.

Alphalens expects as input a factor that ranks the stocks in a certain way. Alphalens helps you verify that the top and bottom ranked stocks perform in opposite ways (positive vs negative returns) in the days (periods) after the ranking is computed. To verify the ranking scheme, Alphalens groups the stocks into quantiles (or bins) and computes the average of the forward returns (returns in the days following the ranking) of the stocks in the same quantile.

Those quantile mean forward returns are calculated by Alphalens for every single date on which the input factor DataFrame has values (every date is considered a new starting point for the forward returns calculation). This makes sense because we want to test the factor quality every time the factor is computed. All in all, the question we are trying to answer is: after the stocks are ranked by our factor, what happens on average to the quantile returns?
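
Here is a small plain-Python sketch of that idea for a single date. It is a simplification, not the Alphalens implementation: the bucketing is a plain rank-based split into equal-sized groups, and the function names, tickers, factor values, and forward returns are all made up.

```python
def assign_quantiles(factor, n_quantiles):
    """Rank securities by factor value and split them into equal-sized buckets.

    Quantile 1 holds the lowest factor values, quantile n_quantiles the highest.
    """
    ranked = sorted(factor, key=factor.get)
    n = len(ranked)
    return {name: (i * n_quantiles) // n + 1 for i, name in enumerate(ranked)}


def quantile_mean_returns(factor, fwd_returns, n_quantiles):
    """Average the forward returns of the securities that share a quantile."""
    buckets = {}
    for name, q in assign_quantiles(factor, n_quantiles).items():
        buckets.setdefault(q, []).append(fwd_returns[name])
    return {q: sum(r) / len(r) for q, r in sorted(buckets.items())}


# One date: made-up factor values and made-up forward returns per security.
factor = {"A": 0.9, "B": 0.5, "C": 0.1, "D": -0.4, "E": -0.8, "F": -1.2}
fwd = {"A": 0.03, "B": 0.01, "C": 0.00, "D": -0.01, "E": -0.02, "F": -0.04}

means = quantile_mean_returns(factor, fwd, n_quantiles=3)
print(means)
```

Repeating this for every date and averaging the per-date results gives the mean return by factor quantile. A factor that works should show the top quantile clearly above the bottom one.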

Ideally a factor applies its ranking scheme every trading day, but this is not compulsory. E.g. a factor might generate values only on Mondays; in this case the factor we give to Alphalens would contain values only for the dates corresponding to Mondays, while the other days would be NaN or not present at all.

Now let's imagine we want to analyze a factor that ranks the stocks every day, and we would like to analyze the performance after 22 days/periods (this corresponds to an algorithm that rebalances its portfolio monthly, as there are 22 trading days per month on average). Alphalens would calculate the quantile 22-day forward returns for every single date (because the factor DataFrame has values every day) and average them. In this way you get the mean returns by factor quantile, but how does Alphalens calculate the cumulative return plots?

Alphalens builds the cumulative returns plots starting from the 22-day forward returns calculated every day. There seems to be an overlap problem, though. If we want to hold securities for 22 days, we cannot trade the factor every day, only every 22 days. In that case the cumulative returns depend heavily on the particular trading day on which we start the computation. E.g. in a year, a hypothetical algorithm that uses that factor would trade only about 12 times, one trade every 22 days. This doesn't give us statistically meaningful results. Alphalens is able to calculate the factor forward returns every single day, and we would make use of only 1/22 of that information?

To overcome the issue and give a more meaningful result, Alphalens builds 22 cumulative returns time series, where each cumulative return time series starts one day after the previous one. This way all the possible outcomes for a rebalancing period of 22 days are covered, and Alphalens returns an average of those results. In this way all the information coming from the daily 22-day forward returns is used, and the results can be trusted much more.
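
The staggering can be sketched in a few lines of plain Python. This is only an illustration of the idea with a 3-day period instead of 22, not the actual Alphalens code; `staggered_cumulative_returns` is a hypothetical name and the returns are made up.

```python
def staggered_cumulative_returns(fwd_returns, period):
    """Compound non-overlapping returns for each possible starting day.

    fwd_returns[t] is the `period`-day forward return starting on day t.
    Sub-portfolio k trades on days k, k + period, k + 2 * period, ...
    Returns the final growth factor of each of the `period` sub-portfolios.
    """
    finals = []
    for offset in range(period):
        growth = 1.0
        for t in range(offset, len(fwd_returns), period):
            growth *= 1.0 + fwd_returns[t]
        finals.append(growth)
    return finals


# Made-up 3-day forward returns of a quantile, one value per day.
fwd = [0.02, -0.01, 0.03, 0.00, 0.01, -0.02]

finals = staggered_cumulative_returns(fwd, period=3)
average_growth = sum(finals) / len(finals)

print(finals)          # one result per starting day
print(average_growth)  # the blended result across all starting days
```

The spread between the entries of `finals` shows how much a single start date can matter, while the average uses every daily forward return.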

Luca,

So, to understand what you are saying, it looks like I'm being provided an "overview" per se of what returns would be like for a specific holding period (because of the averaging of returns for a security that shows up every day). Based on what you're saying, I'd imagine that in practice my results will differ depending on which date I start to trade a security. But all in all, and statistically speaking, if results look better over a 5 day period than a 1 day period, then a 5 day period should theoretically be used in my strategy for better performance.

Seems pretty robust in terms of analysis, so kudos to Quantopian for their package.

A clarification is needed. Using your words, Alphalens provides an "overview" per se of what factor quantile returns would be like for a specific holding period. Alphalens computes the statistics by quantile, not by security.

So if Alphalens shows that your factor has the best performance with a holding period of 22 days (assuming the best performance comes from the first and last quantiles) and you want to trade that factor, your algorithm has to buy the full top quantile, go short on the full bottom quantile, and then hold that portfolio for 22 days. After 22 days the algorithm has to recalculate the factor values, extract the new top and bottom quantiles, calculate the difference between the current portfolio and the new one, and order the necessary securities to cover the difference. The Alphalens 'periods' correspond to the holding period of this hypothetical algorithm.

However, the above algorithm would be a naive one. As you say, the results will depend on which date you start to trade. So why not implement an algorithm that performs as Alphalens does? If your rebalance period is X days, divide your capital by X and trade each 1/X slice of capital on subsequent starting days (and hold/rebalance each 1/X slice for X days). This means the algorithm would trade every day instead of every X days, and it would get lower volatility, better leverage/exposure control, and returns very close to your factor's expected returns.
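
A minimal sketch of the 1/X capital idea in plain Python, under simplifying assumptions: each tranche is described only by its target weights, price drift inside a tranche is ignored, and the class name `TranchedPortfolio` and the tickers are hypothetical. Every day the oldest tranche is replaced by today's targets, and the overall portfolio is the sum of the tranches, each scaled by 1/X.

```python
from collections import deque


class TranchedPortfolio:
    """Holds the target weights of the last `holding_days` daily tranches."""

    def __init__(self, holding_days):
        self.holding_days = holding_days
        # deque with maxlen drops the oldest tranche automatically.
        self.tranches = deque(maxlen=holding_days)

    def rebalance(self, todays_weights):
        """Swap in today's tranche and return the blended target weights."""
        self.tranches.append(todays_weights)
        blended = {}
        for tranche in self.tranches:
            for name, weight in tranche.items():
                blended[name] = blended.get(name, 0.0) + weight / self.holding_days
        return blended


# Made-up daily long/short targets with a 2-day holding period.
portfolio = TranchedPortfolio(holding_days=2)
day1 = portfolio.rebalance({"A": 1.0, "B": -1.0})
day2 = portfolio.rebalance({"A": 1.0, "C": -1.0})
day3 = portfolio.rebalance({"C": 1.0, "B": -1.0})

print(day1)
print(day2)
print(day3)
```

On the first days the portfolio is only partially invested, because not all tranches have been opened yet.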

Hey Luca. I came to this thread because my short term mean reversion algo has a solid IC on a one day horizon, but a much better one over 5 days. It made me think: how does one actually trade the 5 day horizon? I tried a week_start trading schedule, but this is clearly only one of 5 alternatives that must be blended. How best to create this blend with pipeline? The easiest way I can think of is to run 5 independent backtests, each starting on a different day of the week, then combine them as a portfolio in the research environment.

How have you solved this?

I came to the same conclusion: the best way to trade it is with 5 independent portfolios. I don't have an elegant solution that uses the Q API. I had to write custom code that handles multiple portfolios, plus the logic that multiplexes those portfolios into one algorithm.

@Dan I forgot to mention a subtlety that might interest you. If you have a factor that has higher returns when traded every X days (and so a higher IC over an X day horizon too) compared to the same factor traded every day, then you can always find a day in the X range where the mean daily return for that day is higher than the mean return the factor has when traded daily. This means the best performing portfolio is still the one with 1 day rebalancing, but the best day to trade the factor is not the day right after the factor is computed. This is the reason why you might see mean reversion factors that discard the most recent one or two days; this is equivalent to trading a factor with a delay of one or two days.
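
A tiny plain-Python sketch of what "trading a factor with a delay" means: the values traded on day t are the ones computed on day t - delay. The helper name `delay_factor` and the data are hypothetical.

```python
def delay_factor(factor_by_day, delay):
    """Shift a daily list of factor values forward by `delay` days.

    Day t of the result carries the values computed on day t - delay.
    The first `delay` days have nothing to trade and are None.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    padded = [None] * delay + list(factor_by_day)
    return padded[:len(factor_by_day)]


# Made-up factor values for one security on three consecutive days.
factor_by_day = [{"A": 1.0}, {"A": 2.0}, {"A": 3.0}]

delayed = delay_factor(factor_by_day, delay=1)
print(delayed)
```

Running the same analysis on the delayed and the undelayed factor, each with a 1 day period, is one way to check which day after computation carries most of the return.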

Bottom line, you might still want to trade every X days for other reasons (slippage impact, high amount of capital, transaction costs) but the best performing algo is always the one that rebalances every day.

Hopefully what I said makes sense, please let me know if I need to clarify things.

@luca that makes total sense. I’ll let you know. On a related note, I found that trading on Fridays only was better than any other choice, but I was unconvinced of its statistical robustness.

@Luca,
"... the best performing algo is always the one that rebalances every day."
Is this necessarily always so? I don't know. I have been experimenting with the impact of varying the rebalancing frequency on at least some of my algos. So far I have tried monthly, weekly and daily, although not intermediate values such as every 2nd day or twice weekly. One of the key issues is the cost of rebalancing, especially if the portfolio holds a large number of stocks, as compared to the actual NEED to rebalance. If the target number of shares of security XYZ is very close to the actual number of XYZ shares held, then there is a rebalancing cost but relatively little impact. Compare that to the situation where the target number and the actual number are very different, in which case yes, more frequent rebalancing will certainly help.

Do any of you have an algo that you could share, in which rebalancing is "efficient" in that it only takes place if the difference between target number of shares and actual number held exceeds some minimum threshold level?
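
Not a full algo, but the thresholding step could look something like this in plain Python. It is only a sketch under assumptions: positions are given as share counts, the threshold is a relative one (5% of the larger of the held and target position), and the function name `orders_above_threshold` and the tickers are made up.

```python
def orders_above_threshold(current_shares, target_shares, min_fraction=0.05):
    """Return {symbol: share_delta} only for changes worth trading.

    A position is traded when the change is at least `min_fraction` of the
    larger of the held and target position (in absolute shares).
    """
    orders = {}
    for symbol in set(current_shares) | set(target_shares):
        held = current_shares.get(symbol, 0)
        target = target_shares.get(symbol, 0)
        delta = target - held
        if delta == 0:
            continue
        # Non-zero here because delta != 0 means held and target differ.
        reference = max(abs(held), abs(target))
        if abs(delta) / reference >= min_fraction:
            orders[symbol] = delta
    return orders


# Made-up holdings and targets.
current = {"XYZ": 100, "ABC": 50, "OLD": 20}
target = {"XYZ": 102, "ABC": 80, "NEW": 10}

orders = orders_above_threshold(current, target)
print(orders)
```

Here XYZ is left alone because moving from 100 to 102 shares is below the threshold, while opening or closing a position always counts as a full change.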

@Burrito Dan, I think I read about a similar finding by Perry Kaufman. I will follow up.

Thanks @Luca for your detailed description.