Multifactor regression to test alphas

I have implemented a multi-factor linear regression model to predict future returns. This approach tests the predictive power of several factors in a single step, helps find an appropriate combination of alpha factors, and helps eliminate correlated/collinear factors.

Please share your feedback!

7 responses

Hi Shiv,

Very interesting work and notebook. I had a question: did you follow an external process to determine significant_factor_names? I noticed they are hardcoded as:
significant_factor_names = ['alpha2', 'alpha4', 'alpha11', 'alpha14', 'alpha31', 'alpha35']

I found the significant factors by adding factors progressively. If a factor comes in with a significant t-stat, I keep it; otherwise I move on to the other factors.

You can also run a regression with "factor_names" in place of "significant_factor_names". Check which factors are significant, keep them in the model, then progressively add the remaining factors (in order of decreasing significance) and see whether they improve the model.
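For anyone who wants to try the progressive-addition idea outside the notebook, here is a minimal sketch using only NumPy. It is not the notebook's code: the factor names (alpha_a, alpha_b, alpha_noise), the synthetic returns, and the |t| > 2 cutoff are all assumptions made for illustration, and candidates are tried in dictionary order rather than in order of decreasing significance.

```python
import numpy as np

def ols_tstats(X, y):
    """OLS coefficients and t-stats. X must already contain a constant column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)  # unbiased residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, beta / se

def forward_select(factors, y, t_threshold=2.0):
    """Add factors one at a time; keep a candidate only if its own |t-stat|
    exceeds t_threshold in the presence of the factors already kept."""
    const = np.ones(len(y))
    kept = []
    for name, values in factors.items():
        cols = [const] + [factors[f] for f in kept] + [values]
        _, t = ols_tstats(np.column_stack(cols), y)
        if abs(t[-1]) > t_threshold:  # t[-1] belongs to the candidate factor
            kept.append(name)
    return kept

# Synthetic data with hypothetical factor names (not the notebook's alphas).
rng = np.random.default_rng(0)
n = 2000
factors = {
    "alpha_a": rng.normal(size=n),
    "alpha_b": rng.normal(size=n),
    "alpha_noise": rng.normal(size=n),  # unrelated to returns by construction
}
fwd_returns = 0.5 * factors["alpha_a"] - 0.3 * factors["alpha_b"] + rng.normal(size=n)

kept = forward_select(factors, fwd_returns)
print(kept)
```

Because each candidate is judged with the kept factors already in the regression, a factor that is nearly collinear with one already in the model will usually show a weak t-stat and be skipped.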

I suggest cross-validating with out-of-sample data as an added step, to avoid simply overfitting to your in-sample data. I'm curious whether this is the procedure you used in the mini-contest?

I agree with James. As the purpose of the exercise is prediction, you wouldn't want to follow up by running the trading strategy on the same data.

@James, I totally agree with you. There could be over-fitting here. As a second step, one can run expanding-window regressions to test the performance of an alpha factor. Another OOS test would be to keep separate windows for modeling and validation. I will try that.
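A rough sketch of what an expanding-window test could look like, assuming the data has been reduced to one time-ordered row per observation (a real factor panel with a cross-section per date would need grouping by date). The function name, the min_train and step values, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def expanding_window_oos(X, y, min_train=250, step=50):
    """Fit OLS on rows [0, t), predict rows [t, t + step), then grow the window.
    Returns the out-of-sample predictions and their correlation with y."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])  # add a constant column
    preds = np.full(n, np.nan)             # rows before min_train stay NaN
    for t in range(min_train, n, step):
        beta, *_ = np.linalg.lstsq(Xc[:t], y[:t], rcond=None)
        preds[t:t + step] = Xc[t:t + step] @ beta
    mask = ~np.isnan(preds)
    oos_corr = np.corrcoef(preds[mask], y[mask])[0, 1]
    return preds, oos_corr

# Synthetic, time-ordered data: two informative factors and one noise factor.
rng = np.random.default_rng(1)
n_obs = 1500
X_factors = rng.normal(size=(n_obs, 3))
y_returns = 0.5 * X_factors[:, 0] - 0.3 * X_factors[:, 1] + rng.normal(size=n_obs)

preds, oos_corr = expanding_window_oos(X_factors, y_returns)
print(oos_corr)
```

Every prediction here comes from coefficients fitted only on earlier rows, so the correlation is a genuine out-of-sample measure rather than an in-sample fit statistic.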

I didn't use this exact procedure for the mini-contest. In the mini-contest, I would just research an individual factor and add it linearly to the other factors (if significant), but I have never liked that approach. It gives no visibility into the predictive ability of a new factor in the presence of the other factors. In a separate thread, I asked Q to add something like this to their Alphalens module, but there wasn't any feedback, so I thought of creating one myself (it didn't turn out to be difficult at all).

Very cool notebook @Shiv, thanks for sharing that!

You could try using LassoCV, RidgeCV, or ElasticNetCV instead of OLS.
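As a hedged illustration of that swap, the sketch below fits scikit-learn's LassoCV on synthetic data. The data, the coefficient values, and the choice of TimeSeriesSplit for the cross-validation folds (so each validation fold comes after its training rows) are assumptions, not something taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import TimeSeriesSplit

# Synthetic, time-ordered data: columns 0 and 1 drive returns, 2 and 3 are noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 4))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=n)

# LassoCV picks the L1 penalty strength (alpha_) by cross-validation.
model = LassoCV(cv=TimeSeriesSplit(n_splits=5)).fit(X, y)

print(model.alpha_)  # selected penalty strength
print(model.coef_)   # coefficients of uninformative factors are shrunk toward zero
```

RidgeCV and ElasticNetCV live in the same sklearn.linear_model module and are fitted the same way, though their constructor arguments differ. Lasso-type penalties are sensitive to feature scale, so real factors would normally be standardized first.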