Combining Weighted Factors

Greetings,

Imagine I have the following factors and would like to weight them unequally (the number in parentheses is the weight):

• EPS_Growth_3M (10%)
• Mean_Reversion_1M (20%)
• Price_Momentum_12M (20%)
• Price_To_Forward_Earnings (15%)
• ForwardPEvsSector (15%)
• ForwardPEvsIndustry (20%)

I've written/borrowed the following code in research:

# Market Cap
class Market_Cap(CustomFactor):
    ...

# Forward Price to Earnings Ratio (MORNINGSTAR)
class Price_To_Forward_Earnings(CustomFactor):
    ...

# 3-month EPS Growth
class EPS_Growth_3M(CustomFactor):
    ...

# 12-month Price Rate of Change
class Price_Momentum_12M(CustomFactor):
    ...

# 1-month Mean Reversion
class Mean_Reversion_1M(CustomFactor):
    ...

def filter_universe():
    ...

pipe = Pipeline(
    columns={
        'Price_To_Forward_Earnings': Price_To_Forward_Earnings(),
        'EPS_Growth_3M': EPS_Growth_3M(),
        'Price_Momentum_12M': Price_Momentum_12M(),
        'Mean_Reversion_1M': Mean_Reversion_1M(),
        'morningstar_industry_group_code': asset_classification.morningstar_industry_group_code.latest,
        'morningstar_sector_code': asset_classification.morningstar_sector_code.latest
    },
    screen=filter_universe()
)

results = run_pipeline(pipe, '2014-01-01', '2014-06-30')
results = results.fillna(value=0.0)

sector_forward_pe = results.groupby(['morningstar_sector_code'])['Price_To_Forward_Earnings'].mean()
results['sector_forward_pe'] = results['morningstar_sector_code'].map(sector_forward_pe)

industry_forward_pe = results.groupby(['morningstar_industry_group_code'])['Price_To_Forward_Earnings'].mean()
results['industry_forward_pe'] = results['morningstar_industry_group_code'].map(industry_forward_pe)

results['ForwardPEvsSector'] = results['Price_To_Forward_Earnings']/results['sector_forward_pe']
results['ForwardPEvsIndustry'] = results['Price_To_Forward_Earnings']/results['industry_forward_pe']



At which point (and how) do I weight the various factors? I can't just apply the weights to the raw values (can I?). Do I rank them first then weight the rank before combining?

Comments, resources, or pointers to other posts are greatly appreciated.

2 responses

You are correct that you need to rank first (to normalize) and then weight. Maybe take a look at this post: https://www.quantopian.com/posts/how-to-combine-factors-in-alphalens

As Luca mentioned in that post:

"The easiest way would be to rank each factor and then sum them. This way each factor weights the same."

universe = my_universe_filter  # something like Q500US() or Q1500US()

In general terms, each factor needs to be normalized so that its values can be compared 'apples to apples'. This can be done with the '.rank' method; another option is '.zscore'.
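
To make the difference between the two normalizations concrete, here is a small pandas sketch on a single day's cross-section. The factor names come from the question, but the tickers and values are invented, and using the population standard deviation (ddof=0) is an assumption of this sketch rather than a statement about how the pipeline computes '.zscore'.

```python
import pandas as pd

# Hypothetical single-day cross-section; tickers and values are invented.
factors = pd.DataFrame(
    {
        "EPS_Growth_3M": [0.02, 0.10, -0.05, 0.90],  # last value is an outlier
        "Price_To_Forward_Earnings": [12.0, 25.0, 8.0, 40.0],
    },
    index=["AAA", "BBB", "CCC", "DDD"],
)

# Rank normalization: 1 = smallest value, N = largest; magnitudes are discarded.
ranked = factors.rank()

# Z-score normalization: subtract the column mean, divide by the column std.
# ddof=0 (population std) is an assumption made for this sketch.
zscored = (factors - factors.mean()) / factors.std(ddof=0)

print(ranked)
print(zscored.round(2))
```

Ranking spaces the four stocks evenly no matter how extreme the outlier is, while the z-score keeps the outlier's distance from the rest. Which behaviour is preferable depends on how much the raw magnitudes of a factor should be trusted.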

In your particular case maybe do something like:

score = (
    EPS_Growth_3M(mask=universe).zscore() * 0.10
    + Mean_Reversion_1M(mask=universe).zscore() * 0.20
    + Price_Momentum_12M(mask=universe).zscore() * 0.20
    + Price_To_Forward_Earnings(mask=universe).zscore() * 0.15
    + ForwardPEvsSector(mask=universe).zscore() * 0.15
    + ForwardPEvsIndustry(mask=universe).zscore() * 0.20
)



It is probably personal preference whether you do that calculation in the pipeline (by creating a new factor) or afterwards, on the dataframe the pipeline returns. However, if for any reason you want dynamic weights (weights that change based on some criteria), the weighting needs to be done after the pipeline output is returned, because the pipeline calculations run asynchronously from the backtest date.
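
As a rough sketch of the "after the pipeline" route, the snippet below z-scores each factor within each date and then takes the weighted sum. It assumes the returned dataframe is indexed by (date, asset) and already contains one column per factor. The 'combine' helper is hypothetical, and the random data only stands in for real pipeline output.

```python
import numpy as np
import pandas as pd

# Weights from the question; they sum to 1.0.
WEIGHTS = {
    "EPS_Growth_3M": 0.10,
    "Mean_Reversion_1M": 0.20,
    "Price_Momentum_12M": 0.20,
    "Price_To_Forward_Earnings": 0.15,
    "ForwardPEvsSector": 0.15,
    "ForwardPEvsIndustry": 0.20,
}

def combine(results, weights):
    """Z-score each factor within each date, then take the weighted sum."""
    cols = list(weights)
    by_date = results[cols].groupby(level=0)
    z = (results[cols] - by_date.transform("mean")) / by_date.transform("std")
    # Multiplying by a Series aligns its index with the factor columns.
    return (z * pd.Series(weights)).sum(axis=1)

# Stand-in for the pipeline output: random values on a (date, asset) MultiIndex.
rng = np.random.default_rng(0)
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2014-01-02", "2014-01-03"]), ["AAA", "BBB", "CCC", "DDD"]],
    names=["date", "asset"],
)
results = pd.DataFrame(
    rng.normal(size=(len(index), len(WEIGHTS))),
    index=index,
    columns=list(WEIGHTS),
)

results["score"] = combine(results, WEIGHTS)
print(results["score"])
```

Because the weights are just an argument, they can be swapped for a different dictionary on each call, which is what makes dynamic weighting straightforward here. Note that the sketch treats a higher value as better for every factor; for factors where lower is better, the sign would need to be flipped before combining.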

A separate note: consider passing the 'mask = universe' parameter when constructing the factors. This helps with speed and memory, especially when using the fundamentals dataset.

If you are careful, you can use numpy arrays to do the factor manipulation and weighting. This may make setting the weights easier.
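
The numpy idea can be sketched as below for a single date. This is not the code from the attached notebook. The values are invented, only three of the factors are used, and their weights (10%, 20%, 20% in the question) are rescaled to sum to 1 purely for illustration. The "careful" part is mostly about missing values and keeping rows aligned with assets.

```python
import numpy as np

# Rows are assets, columns are factors, for one date. Values are invented.
factor_names = ["EPS_Growth_3M", "Mean_Reversion_1M", "Price_Momentum_12M"]
values = np.array([
    [0.02, -0.01, 0.15],
    [0.10, 0.03, -0.05],
    [-0.05, np.nan, 0.30],  # one missing factor value
    [0.07, 0.02, 0.08],
])
weights = np.array([0.2, 0.4, 0.4])  # illustrative weights, one per column

# NaN-aware z-score of each column (factor) across assets.
z = (values - np.nanmean(values, axis=0)) / np.nanstd(values, axis=0)

# Assumption: treat a missing factor value as neutral (z = 0) before combining.
z = np.nan_to_num(z, nan=0.0)

# Matrix-vector product: one combined score per asset.
score = z @ weights
print(dict(zip(factor_names, weights)))
print(score)
```

Keeping the weights in a single vector means changing them is a one-line edit, and the same matrix product works for any number of factors.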

Just another idea. See the attached notebook.
