Contrast set learning differs from classification in that it identifies differences between groups rather than predicting group membership. I applied it to fundamental data by splitting forward returns into "low", "medium", and "high" classes and trying to identify which combinations of fundamental factors contrast across these three classes.
I have some initial results, but my next challenge is to turn this into an algorithm/research notebook that does the following:
- Every week, identify the contrasting features that differentiate low from high forward weekly returns.
- Use these rules to rank stocks into different long/short buckets.
- Trade them and hold until next week.
- Continue this process every week.
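As a starting point, the ranking step in the list above could be sketched in plain Python as follows. This is only an illustration under assumptions, not a Pipeline implementation: the names `matches` and `rank_into_buckets`, the `(factor, low, high)` rule format, the scoring scheme (conditions matched for "high" minus conditions matched for "low"), and the sample tickers and ranks are all hypothetical. The rule-mining step itself is not shown.

```python
def matches(ranks, conditions):
    """Count how many (factor, low, high) conditions a security's ranks satisfy."""
    return sum(1 for f, lo, hi in conditions if f in ranks and lo <= ranks[f] <= hi)


def rank_into_buckets(factor_ranks, rules, n_per_side=1):
    """Score each security and pick long/short candidates.

    factor_ranks: {security: {factor: rank}} for the current week (assumed format).
    rules: {"low": [...], "high": [...]} lists of (factor, low, high) tuples.
    """
    # Assumed scoring: +1 per matched "high" condition, -1 per matched "low" condition.
    scores = {
        sec: matches(r, rules["high"]) - matches(r, rules["low"])
        for sec, r in factor_ranks.items()
    }
    # Sort by score descending; break ties by name so the result is deterministic.
    ordered = sorted(scores, key=lambda s: (-scores[s], s))
    longs = [s for s in ordered[:n_per_side] if scores[s] > 0]
    shorts = [s for s in ordered[::-1][:n_per_side] if scores[s] < 0]
    return longs, shorts, scores


# Hypothetical rules and ranks for three made-up securities.
rules = {
    "low": [("price_fcf_qf", 19.0, 36.0)],
    "high": [("wc_sales_qf", 0.0, 18.0), ("z_score_qf", 19.0, 36.0)],
}
factor_ranks = {
    "AAA": {"price_fcf_qf": 5.0, "wc_sales_qf": 3.0, "z_score_qf": 30.0},
    "BBB": {"price_fcf_qf": 30.0, "wc_sales_qf": 25.0, "z_score_qf": 4.0},
    "CCC": {"price_fcf_qf": 10.0, "wc_sales_qf": 20.0, "z_score_qf": 10.0},
}
longs, shorts, scores = rank_into_buckets(factor_ranks, rules)
print(longs, shorts, scores)
```

In a weekly loop, a function like this would be called once per rebalance with that week's freshly mined rules and factor ranks, and the resulting buckets held until the next rebalance.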
Could someone experienced with pipeline show me how to go about this?
The attached notebook shows some rules identified by contrast set learning that I want to apply to securities in order to trade them. For example, on a universe of 36 securities it produces this output:
Returns=>low ('price_fcf_qf =>(19.0, 36.0)', 'price_qf1_eps =>(19.0, 36.0)', 'cf_total_assets_af =>(18.0, 35.0)')
Returns=>high ('wc_sales_qf =>(0.0, 18.0)', 'price_eps_af =>(18.0, 35.0)', 'z_score_qf =>(19.0, 36.0)')
It says that securities with high price_fcf_qf, price_qf1_eps, and cf_total_assets_af performed poorly the following week, and that securities with low wc_sales_qf, high price_eps_af, and high z_score_qf performed well.
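To apply rules like these programmatically, the printed output first has to be turned into a structure that code can work with. Here is a minimal sketch using only the standard library. It assumes the output always has the `Returns=>label ('factor =>(low, high)', ...)` shape shown above; `parse_rules` and `COND_RE` are hypothetical names, not part of any library.

```python
import re

# Matches one condition such as 'price_fcf_qf =>(19.0, 36.0)'.
COND_RE = re.compile(r"'(\w+)\s*=>\(([\d.]+),\s*([\d.]+)\)'")


def parse_rules(text):
    """Return {class_label: [(factor, low, high), ...]} from the printed rules."""
    rules = {}
    # Each rule group starts with "Returns=>"; the first piece before it is empty.
    for chunk in text.split("Returns=>")[1:]:
        label = chunk.split(None, 1)[0]  # e.g. "low" or "high"
        rules[label] = [
            (factor, float(lo), float(hi))
            for factor, lo, hi in COND_RE.findall(chunk)
        ]
    return rules


output = (
    "Returns=>low ('price_fcf_qf =>(19.0, 36.0)', 'price_qf1_eps =>(19.0, 36.0)', "
    "'cf_total_assets_af =>(18.0, 35.0)') "
    "Returns=>high ('wc_sales_qf =>(0.0, 18.0)', 'price_eps_af =>(18.0, 35.0)', "
    "'z_score_qf =>(19.0, 36.0)')"
)
rules = parse_rules(output)
for label, conditions in rules.items():
    print(label, conditions)
```

Each condition becomes a `(factor, low, high)` tuple, which can then be checked against a security's factor value for that week.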