Python Pairs Trading - should the lookback period of the beta calculation match lookback period of the z-score?

I have a pairs trading system that is based on a relatively short lookback: 30 periods on 15-minute intervals. This works well for applying a z-score and catching smaller fluctuations between a pair, but I don't like the idea of relying on a 30-period sample for calculating the beta.

So instead, I calculate the beta in Python on a pandas DataFrame, on a rolling basis with a 300-period window. The beta calculation looks like this:

DataFrame1['Beta'] = DataFrame1['y_returns'].rolling(window=300).cov(DataFrame1['x_returns']) / DataFrame1['x_returns'].rolling(window=300).var()  

...where x represents the returns of the 'market' asset and y is the asset the beta is applied to. The pairs spread is calculated like this:

DataFrame1['pairs_spread'] = DataFrame1['Close_x'] - DataFrame1['Beta'] * DataFrame1['Close_y']  

Lastly, I calculate the mean, standard deviation and z-score based on a 30-period window:

DataFrame1['pairs_spread_mean'] = DataFrame1['pairs_spread'].rolling(window=30).mean()  
DataFrame1['pairs_spread_std'] = DataFrame1['pairs_spread'].rolling(window=30).std()  
DataFrame1['pairs_zscore'] = (DataFrame1['pairs_spread'] - DataFrame1['pairs_spread_mean'])/DataFrame1['pairs_spread_std']  

Is there an issue with calculating the beta over the last 300 periods and then calculating the z-score and its components (mean and standard deviation) over a 30-period window?
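
One way to check this empirically is to run the same pipeline twice, once with a 300-period beta and once with a 30-period beta, and compare the resulting z-scores. The following is only a minimal sketch on synthetic data: the random series, the true beta of 1.5, and the helper name pairs_zscore are all assumptions made for illustration, and the spread uses the same formula as the post.

```python
import numpy as np
import pandas as pd

# Synthetic 15-minute closes for two related assets (illustration only).
rng = np.random.default_rng(0)
n = 1000
idx = pd.date_range("2020-01-01", periods=n, freq="15min")
x_ret = rng.normal(0.0, 0.001, n)
y_ret = 1.5 * x_ret + rng.normal(0.0, 0.0005, n)  # assumed true beta of 1.5

df = pd.DataFrame(
    {
        "Close_x": 100.0 * np.exp(np.cumsum(x_ret)),
        "Close_y": 50.0 * np.exp(np.cumsum(y_ret)),
    },
    index=idx,
)
df["x_returns"] = df["Close_x"].pct_change()
df["y_returns"] = df["Close_y"].pct_change()


def pairs_zscore(df, beta_window, z_window=30):
    """Rolling beta of y on x, then a rolling z-score of the spread.

    Hypothetical helper; the spread formula mirrors the one in the post.
    """
    beta = (
        df["y_returns"].rolling(window=beta_window).cov(df["x_returns"])
        / df["x_returns"].rolling(window=beta_window).var()
    )
    spread = df["Close_x"] - beta * df["Close_y"]
    mean = spread.rolling(window=z_window).mean()
    std = spread.rolling(window=z_window).std()
    return beta, (spread - mean) / std


beta_long, z_long = pairs_zscore(df, beta_window=300)
beta_short, z_short = pairs_zscore(df, beta_window=30)

# Compare the two z-score series on the bars where both are defined.
both = pd.concat({"z_long": z_long, "z_short": z_short}, axis=1).dropna()
print(both.corr())

# How much each beta estimate moves around over the sample.
print("std of 300-period beta:", beta_long.std())
print("std of 30-period beta: ", beta_short.std())
```

On real data, the same comparison can be run by replacing the synthetic DataFrame with one that has the Close_x and Close_y columns from the post.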

2 responses

I also use separate BetaTrainingWindowSize and ErrorTrainingWindowSize parameters. I didn't give them specific values, but in the results optimized by a genetic algorithm they never end up very different: 23 vs. 29, 25 vs. 19, and so on. I assume that's because the z-score is calculated over the most recent periods, so there's no reason to use a long-term beta. When you calculate a beta over a long period, that beta is going to be very stable, but what we are usually hoping for is a beta that is dynamic. I even use something like a Kalman filter so that the beta weights the most recent few days more heavily and earlier days less.
But you could probably try the 300-period window with more weight on the last 100 periods and see if that's better. The beta would then be stable but still give more consideration to recent data, so it should be more dynamic than one based equally on the whole history.
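
The weighting idea can be sketched with pandas' exponentially weighted functions. This is one possible scheme, not necessarily the one meant above: ewm weights the entire history with exponentially decaying weights rather than cutting off at 300 periods, and the half-life of 100 periods, the synthetic returns, and the mid-sample jump in the true beta are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic returns where the true beta jumps from 1.0 to 2.0 halfway through.
rng = np.random.default_rng(1)
n = 1000
x_ret = pd.Series(rng.normal(0.0, 0.001, n))
true_beta = np.where(np.arange(n) < n // 2, 1.0, 2.0)
y_ret = true_beta * x_ret + rng.normal(0.0, 0.0003, n)

# Equal-weighted 300-period rolling beta, as in the original question.
beta_rolling = (
    y_ret.rolling(window=300).cov(x_ret) / x_ret.rolling(window=300).var()
)

# Exponentially weighted beta: recent bars count more than older ones.
# HALFLIFE is an assumed value; min_periods=300 only delays the first
# estimate so that it starts at a similar point to the rolling version.
HALFLIFE = 100
beta_ewm = (
    y_ret.ewm(halflife=HALFLIFE, min_periods=300).cov(x_ret)
    / x_ret.ewm(halflife=HALFLIFE, min_periods=300).var()
)

# Look at both estimates 100 bars after the shift in the true beta.
t = n // 2 + 100
print("rolling beta:", beta_rolling.iloc[t])
print("ewm beta:    ", beta_ewm.iloc[t])
```

The comparison is only meant to show how quickly each estimate reacts to a change; once the 300-period window lies entirely after the shift, the equal-weighted beta catches up as well.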

Thank you for the response, Manlin. Yeah, I've been comparing the results of longer- and shorter-window betas, and there usually doesn't seem to be much of a difference. But I like your idea of weighting the most recent periods more!