I have a pairs trading system that is based on a relatively short duration: 30 periods on 15-minute intervals. This works well for applying a z-score and catching smaller fluctuations between a pair, but I don't like the idea of relying on a 30-period sample for calculating the beta.
Specifically, in Python the beta is calculated in a pandas DataFrame on a rolling basis with a 300-period window. The beta calculation looks like this:
DataFrame1['Beta'] = DataFrame1['y_returns'].rolling(window=300).cov(DataFrame1['x_returns']) / DataFrame1['x_returns'].rolling(window=300).var()
...where x represents the 'market' asset's returns and y is the asset the beta is applied to. The pairs spread is calculated like this:
DataFrame1['pairs_spread'] = DataFrame1['Close_x'] - DataFrame1['Beta'] * DataFrame1['Close_y']
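Going back to the beta for a moment: as a sanity check, here is a minimal sketch on synthetic returns showing that the rolling covariance/variance ratio equals the OLS slope of y_returns on x_returns over the same 300 observations. The seed, the 0.8 "true" slope, the noise levels and the variable names are assumptions for illustration only, not my real data.

```python
import numpy as np
import pandas as pd

# Synthetic returns (assumed values, for illustration only)
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(0.0, 0.001, n)
y = 0.8 * x + rng.normal(0.0, 0.0005, n)
df = pd.DataFrame({"x_returns": x, "y_returns": y})

window = 300

# Rolling beta: cov(y, x) / var(x), both with pandas' default ddof=1
cov_yx = df["y_returns"].rolling(window=window).cov(df["x_returns"])
var_x = df["x_returns"].rolling(window=window).var()
beta = cov_yx / var_x

# OLS slope (with intercept) fitted on the last `window` observations
slope, intercept = np.polyfit(
    df["x_returns"].iloc[-window:].to_numpy(),
    df["y_returns"].iloc[-window:].to_numpy(),
    1,
)

print("rolling beta (last bar):", beta.iloc[-1])
print("OLS slope (last window):", slope)
```

So the 300-period beta is just a regression slope re-estimated on every bar; the first 299 rows are NaN because the window is not yet full.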
Lastly, I calculate the mean, standard deviation and z-score based on a 30-period window:
DataFrame1['pairs_spread_mean'] = DataFrame1['pairs_spread'].rolling(window=30).mean()
DataFrame1['pairs_spread_std'] = DataFrame1['pairs_spread'].rolling(window=30).std()
DataFrame1['pairs_zscore'] = (DataFrame1['pairs_spread'] - DataFrame1['pairs_spread_mean'])/DataFrame1['pairs_spread_std']
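For anyone who wants to reproduce this, here is a minimal self-contained sketch of the steps above on synthetic prices. The helper name add_pairs_zscore, the random price series, the seed, and the use of pct_change() for the returns are assumptions made for the example; only the column names and the two window lengths come from my actual setup.

```python
import numpy as np
import pandas as pd

def add_pairs_zscore(df, beta_window=300, z_window=30):
    """Hypothetical helper mirroring the steps above.

    Expects columns 'Close_x' and 'Close_y'; returns a copy with the
    beta, spread and z-score columns added.
    """
    out = df.copy()
    # Assumption: simple percentage returns
    out["x_returns"] = out["Close_x"].pct_change()
    out["y_returns"] = out["Close_y"].pct_change()

    # Long-window beta: cov(y, x) / var(x)
    cov_yx = out["y_returns"].rolling(window=beta_window).cov(out["x_returns"])
    var_x = out["x_returns"].rolling(window=beta_window).var()
    out["Beta"] = cov_yx / var_x

    # Spread, exactly as defined in the post
    out["pairs_spread"] = out["Close_x"] - out["Beta"] * out["Close_y"]

    # Short-window z-score of the spread
    roll = out["pairs_spread"].rolling(window=z_window)
    out["pairs_spread_mean"] = roll.mean()
    out["pairs_spread_std"] = roll.std()
    out["pairs_zscore"] = (
        out["pairs_spread"] - out["pairs_spread_mean"]
    ) / out["pairs_spread_std"]
    return out

# Synthetic prices (assumed values, for illustration only)
rng = np.random.default_rng(42)
n = 600
x_ret = rng.normal(0.0, 0.001, n)
y_ret = 0.8 * x_ret + rng.normal(0.0, 0.0005, n)
prices = pd.DataFrame({
    "Close_x": 100 * np.cumprod(1 + x_ret),
    "Close_y": 50 * np.cumprod(1 + y_ret),
})

result = add_pairs_zscore(prices)
print(result[["Beta", "pairs_spread", "pairs_zscore"]].dropna().tail())
```

Note that the z-score only becomes available once both warm-ups have passed: the 300-bar beta window first, then the 30-bar window on the spread.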
Is there an issue with calculating beta based on the last 300 periods and then calculating z-score/z-score factors based on a 30-period window?