simple alpha factor combination techniques?

I'm looking for simple alpha factor combination techniques (compatible with the limitations of Pipeline).

Beyond the canonical sum-of-z-scored-factors technique, what might work better? A linear combination, with each factor weighted by its IC Mean or Risk-Adjusted IC from Alphalens? Computing some rolling metric of the variation in each factor, and weighting each factor by its inverse? Weighting on an individual-stock basis, versus at the alpha-factor level? Etc.

At this point, I'm not interested in fancy ML techniques...I'd like to start with something computationally lightweight (and easy to code, as well).

13 responses

I’d be interested in trying a weighting by risk-adjusted IC. It's on my very long list of things to try.

Maybe also have a look at Michael Matthews’ very cool NB on this?

Also perhaps check out this recent webinar where, around 16 minutes into the presentation, the speaker argues that the simple (linear regression) and moderate (ridge regression) approaches perform best OOS.

Thanks Joakim -

I think my first go will be to roll up my sleeves and run Alphalens on all of my factors individually. Then, I'll just try a static weighting, with risk-adjusted IC weights for the factors hard-coded into the algo (and could easily try other statistics from Alphalens). This way, I can straightforwardly compare equal weights to static weights, and not have to change my existing framework much at all.
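As a rough sketch of how the static weights could be derived offline: risk-adjusted IC is taken here as mean IC divided by the standard deviation of IC, and the weights are then scaled to sum to 1. The function name, the toy IC numbers, and the choice to zero out factors with negative risk-adjusted IC are my assumptions, not something prescribed by Alphalens.

```python
import pandas as pd

def risk_adjusted_ic_weights(ic):
    """ic: DataFrame of per-period IC values, one column per factor (hypothetical input)."""
    ra_ic = ic.mean() / ic.std()       # mean IC divided by its standard deviation
    ra_ic = ra_ic.clip(lower=0.0)      # assumption: give zero weight to negative scores
    return ra_ic / ra_ic.sum()         # scale so the weights sum to 1

# Made-up IC series for three illustrative factors
ic = pd.DataFrame({
    'fcf':      [0.02, 0.04, 0.03, 0.05],
    'mean_rev': [0.01, -0.01, 0.02, 0.00],
    'momentum': [-0.02, -0.01, -0.03, -0.02],
})
weights = risk_adjusted_ic_weights(ic)
```

The resulting numbers could then be hard-coded into the algo as the static weights.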

One way to adjust the weights, I suppose, would be to feed them in via the Self-Serve Data feature. Run Alphalens quarterly, for example, and adjust the weights.

Note that Robert Carver's book Systematic Trading has a recommended weighting scheme, as I recall. The trick is to weight by a scheme that doesn't tend to over-fit.

Here's a framework for adding fixed weights. I simply changed each factor entry to a (factor, weight) tuple:

# dictionary of (factor, factor weight)
return {
    'MessageSum':              (MessageSum,1),
    'FCF':                     (fcf,1),
    'Direction':               (Direction,1),
    'mean_rev':                (mean_rev,1),
    'volatility':              (volatility,1),
    'GrowthScore':             (growthscore,1),
    'PegRatio':                (peg_ratio,1),
    'MoneyFlow':               (MoneyflowVolume5d,1),
    'Trendline':               (Trendline,1),
    'Revenue':                 (Revenue,1),
    'GrossMarginChange':       (GrossMarginChange,1),
    'Gross_Income_Margin':     (Gross_Income_Margin,1),
    'MaxGap':                  (MaxGap,1),
    'CapEx_Vol':               (CapEx_Vol,1),
    'fcf_ev':                  (fcf_ev,1),
    'DebtToTotalAssets':       (DebtToTotalAssets,1),
    'CapEx_Vol_fs':            (TEM,1),
    'Piotroski':               (Piotroski,1),
    'Altman_Z':                (Altman_Z,1),
    'Quick_Ratio':             (Quick_Ratio,1),
}


Currently, weights are all set at 1--the equal weight case. Easy enough to change them.
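For illustration, here is roughly how per-factor weights could feed into the combination, sketched in plain pandas rather than Pipeline. The `combine_factors` helper, the tickers, and the factor values are all hypothetical; the idea is just a weighted sum of cross-sectionally z-scored columns.

```python
import pandas as pd

def combine_factors(factor_values, weights):
    """Weighted sum of cross-sectionally z-scored factor columns."""
    z = (factor_values - factor_values.mean()) / factor_values.std()
    w = pd.Series(weights)
    # multiply each column by its weight, then sum across factors per stock
    return z[w.index].mul(w, axis=1).sum(axis=1)

# Toy data: rows are stocks, columns are factors (values are made up)
factor_values = pd.DataFrame(
    {'fcf': [1.0, 2.0, 3.0], 'mean_rev': [3.0, 2.0, 1.0]},
    index=['AAA', 'BBB', 'CCC'],
)
equal = combine_factors(factor_values, {'fcf': 1, 'mean_rev': 1})
tilted = combine_factors(factor_values, {'fcf': 2, 'mean_rev': 1})
```

In this toy case the two factors are exact opposites, so they cancel under equal weights, while the tilted version follows the more heavily weighted factor.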

Note that this is largely an exercise. There are likely some bad factors in the mix (which should end up getting low weights).

Hi Joakim -

Regarding factor interactions, there is a lot of overlap with response surface methodology from process engineering (e.g. see https://www.itl.nist.gov/div898/handbook/pri/section3/pri336.htm).

For two factors, the alpha combination with interactions would look like:

F = b1*f1 + b2*f2 + b12*f1*f2

So one general question is whether interactions should be included in the alpha combination step and, if so, how the coefficients should be determined.

I guess one approach would be to run Alphalens on each of the factor pairs (e.g. f1*f2, f1*f3, f2*f3, etc.), basically treating them as new factors. Then, sum them with the linear terms (e.g. f1, f2, f3, etc.), weighting by risk-adjusted IC (or whatever seems best).
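A minimal sketch of building those pairwise product terms so they can be analyzed like any other factor. The helper name and the toy values are assumptions for illustration; in practice each input factor would probably be z-scored first so the products are on a comparable scale.

```python
from itertools import combinations
import pandas as pd

def add_pairwise_interactions(factors):
    """Return the input factors plus one product column per factor pair."""
    out = factors.copy()
    for a, b in combinations(factors.columns, 2):
        out['{}*{}'.format(a, b)] = factors[a] * factors[b]
    return out

# Rows are stocks, columns are (already normalized) factors; values are made up
factors = pd.DataFrame({
    'f1': [1.0, -1.0, 0.5],
    'f2': [2.0, 1.0, -1.0],
    'f3': [0.0, 3.0, 1.0],
})
expanded = add_pairwise_interactions(factors)
```

Each new column (f1*f2, f1*f3, f2*f3) can then be scored and weighted alongside the linear terms.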

Here's an example of weighting each factor by its risk-adjusted IC (I simply picked the largest value across the various periods, 1D, 3D, 5D, 10D, 21D, using start_date='2010-01-01', end_date='2017-11-2').

Here's a backtest.

Here's an example I cooked up. The basic idea is to apply a similarity metric (in this case, cosine similarity) and apply less weight to factors that are more similar to the other factors, and more weight to factors that are less similar to other factors.

There also needs to be a weighting by a measure of the goodness of each factor (e.g. its IC or risk-adjusted IC or something else), unless all factors are about the same in individual goodness.

Hi @Grant,

I read that Quantopian recommends against adding alphas together and suggests creating separate algos instead. (In some cases an alpha product could work, but there has to be a very strong economic rationale behind it, since you could be multiplying two negative alphas to create a positive one.)

Anyway, your last algo in this thread was very interesting. To create the weighted alpha you do:

    for alpha, A in pipeline_output('factor_pipeline').iteritems():
        w = 0
        # inner loop variable is `_` so it does not overwrite `alpha`
        for _, B in pipeline_output('factor_pipeline').iteritems():
            w += np.abs(A.dot(B)/np.linalg.norm(A)/np.linalg.norm(B))
        alpha_weighted[alpha] *= 1.0/w
    context.combined_alpha = alpha_weighted.sum(axis=1)


So, for each alpha, you iterate over all the alphas and compute the cosine similarity, treating each alpha's values across all stocks as a vector. The alphas that are very similar to the others get a lower weight:

w += np.abs(A.dot(B)/np.linalg.norm(A)/np.linalg.norm(B))


But for the cosine similarity, why do you write /np.linalg.norm(A)/np.linalg.norm(B) instead of dividing by the product np.linalg.norm(A)*np.linalg.norm(B)?

I understand the cosine similarity as:

from numpy import dot
from numpy.linalg import norm
cos_sim = dot(a, b)/(norm(a)*norm(b))


Also when you mention:

There also needs to be a weighting by a measure of the goodness of each factor (e.g. its IC or risk-adjusted IC or something else), unless all factors are about the same in individual goodness.

How is this reflected in the code?

For each alpha factor, I'm computing the sum of the absolute value of its cosine similarity with itself and the rest of the factors. This is then used as an inverse weight for the alpha factor.

Mathematically, you could write it like this:

w += np.abs(A.dot(B)/(np.linalg.norm(A)*np.linalg.norm(B)))


The double division is equivalent, since x/a/b is evaluated left to right as (x/a)/b, which equals x/(a*b).
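For reference, the inverse-similarity weighting described above can also be sketched without the nested loop. This is my own vectorized restatement with made-up data, not the code from the algo; it assumes the alpha DataFrame has no NaNs and no all-zero columns.

```python
import numpy as np
import pandas as pd

def similarity_weighted_alpha(alphas):
    """alphas: DataFrame, rows = stocks, columns = alpha factors (no NaNs)."""
    X = alphas.values.astype(float)
    unit = X / np.linalg.norm(X, axis=0)   # scale each factor column to unit length
    cos = unit.T.dot(unit)                 # matrix of pairwise cosine similarities
    w = np.abs(cos).sum(axis=0)            # per factor; includes self-similarity of 1
    inv_w = 1.0 / w
    combined = pd.Series(X.dot(inv_w), index=alphas.index)
    return combined, pd.Series(inv_w, index=alphas.columns)

alphas = pd.DataFrame({
    'a': [1.0, 0.0, -1.0],
    'b': [1.0, 0.0, -1.0],   # identical to 'a'
    'c': [1.0, -2.0, 1.0],   # orthogonal to 'a' and 'b'
})
combined, weights = similarity_weighted_alpha(alphas)
```

Here the two identical factors each end up with half the weight of the factor that is unrelated to them, so the duplicated signal is not counted twice.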

My sense is that Q is now looking for more raw, basic alpha factors/signals, versus scalable full algos. I suspect that they are doing the combination via techniques more sophisticated than one could implement on their platform, but this is just a guess.

I didn't implement the IC weighting.
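If one wanted to add it, one possible approach (an assumption on my part, not something tested in a backtest) would be to multiply each factor's inverse-similarity weight by a goodness score such as its IC, then rescale. The `blend_weights` name and all numbers below are made up for illustration.

```python
import pandas as pd

def blend_weights(inverse_similarity, goodness):
    """Multiply the two weight vectors and rescale to sum to 1."""
    # assumption: factors with negative goodness get zero weight
    raw = inverse_similarity * goodness.clip(lower=0.0)
    return raw / raw.sum()

inverse_similarity = pd.Series({'a': 0.5, 'b': 0.5, 'c': 1.0})
goodness = pd.Series({'a': 0.04, 'b': 0.02, 'c': 0.01})  # made-up IC values
final = blend_weights(inverse_similarity, goodness)
```

With these numbers, factor a ends up with half the total weight even though it overlaps with b, because its assumed IC is the highest.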

Thanks a lot for your clarification @Grant.

I've been playing a bit with the algo. Among other things:
- Added an option to the framework to normalize (z-score) each factor against either the whole universe or the stock's sector.
- Removed factors and constraints, and modified configurations.
- Made it cheaper to trade by reducing turnover, rebalancing just monthly, and reducing the number of held stocks.
- Emulated Interactive Brokers fees and backtested with a small initial capital.
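To illustrate the first item, here is a small pandas sketch of the two normalization modes. It is not the Pipeline code from the algo; the `zscore` helper, tickers, sector labels, and values are all hypothetical.

```python
import pandas as pd

def zscore(values, groups=None):
    """Z-score against the whole universe, or within each group if given."""
    if groups is None:
        return (values - values.mean()) / values.std()
    grouped = values.groupby(groups)
    return (values - grouped.transform('mean')) / grouped.transform('std')

factor = pd.Series([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], index=list('ABCDEF'))
sector = pd.Series(['tech'] * 3 + ['energy'] * 3, index=factor.index)

universe_z = zscore(factor)          # compares every stock to all others
sector_z = zscore(factor, sector)    # compares each stock to its sector peers
```

With sector normalization, the scale difference between the two toy sectors disappears, and each stock is ranked only relative to its own peers.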

Same version as before but just using 6 factors