How to accept/reject alpha factors?

Is there an objective, step-by-step method to accept/reject alpha factors?

With a bit of research and head-scratching, and by scouring the Q forums, tutorials, and help documentation, one can collect a relatively large number of alpha factors (1). Then one is faced with the task of evaluating the factors, sorting into The Good, the Bad, and the Ugly. The existing tools I'm aware of are:

Some possible accept/reject criteria:

  • p-values below 0.05 with a mean IC of above 0.01
  • Over-fitting tests
  • Q risk factor (sector and style) exposures
  • Alpha time scale (e.g. max IC delay)
  • Q fund exposure (presently, no way to measure)
  • Size of universe over which factor is effective
  • Sensitivity to specific time/day of week trading
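As a rough illustration of the first criterion in the list above (p-value below 0.05 with a mean IC above 0.01), here is a minimal sketch using only the Python standard library. It is not Alphalens code: the function names (`spearman_ic`, `screen`), the thresholds as default arguments, and the use of a normal approximation instead of an exact t-distribution are all assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Plain Pearson correlation (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_ic(factor, fwd_returns):
    """Rank correlation between factor values and forward returns on one date."""
    return pearson(ranks(factor), ranks(fwd_returns))

def screen(daily_ics, min_ic=0.01, max_p=0.05):
    """Return (mean IC, two-sided p-value, accept flag) for a series of daily ICs."""
    n = len(daily_ics)
    m = mean(daily_ics)
    t = m / (stdev(daily_ics) / math.sqrt(n))
    # Assumption: normal approximation to the one-sample t-test (fine for large n).
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return m, p, (m > min_ic and p < max_p)
```

One would compute `spearman_ic` once per date across the universe, collect the results into a list, and pass that list to `screen`. Daily ICs are usually autocorrelated, so the p-value from a test like this is probably optimistic.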

So, the question is how to perform the first-stage good/bad/ugly sort? 'Good' factors would be kept, and move on to be combined with other factors, in the alpha combination step, in one or more algos. 'Bad' factors would be rejected, and put on the compost heap. 'Ugly' factors would be scrutinized further, just to make sure one isn't throwing the baby out with the bath water.

As a side note, factors optimally trade on different time scales, but this would seem to be part of engineering the alpha combination, and not a consideration in the first-cut to decide if a factor has merit.

"An alpha is an expression, applied to the cross-section of your universe of stocks, which returns a vector of real numbers where these values are predictive of the relative magnitude of future returns" per

48 responses

Good question, I’d be interested as well. Basically I’ve relied on p-values below 0.05 with a mean IC of above 0.01. And Thomas’ odd/even quarters for training/testing factors to avoid overfitting.
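The odd/even-quarter idea can be sketched as follows. The exact details of Thomas' method are not given in this thread, so this is only one plausible reading: alternate calendar quarters between a training set and a test set. The function names and the choice of which parity is "train" are assumptions.

```python
from datetime import date

def quarter_index(d):
    """Count quarters since year 0, so consecutive quarters alternate parity."""
    return d.year * 4 + (d.month - 1) // 3

def split_odd_even(dates):
    """Return (train, test): dates in even-indexed vs odd-indexed quarters."""
    train = [d for d in dates if quarter_index(d) % 2 == 0]
    test = [d for d in dates if quarter_index(d) % 2 == 1]
    return train, test
```

The factor would be developed using only the `train` dates and then evaluated once on the `test` dates. Interleaving the two sets means both see similar market regimes, at the cost of the test set not being strictly later in time.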

Thanks Joakim -

How do you handle the Q risk factor exposures (sector and style)?

Morningstar has some 900 factors one can download for any set of stock price series. For all of them, we can apply any type of smoothing or filtering over any lookback period of our choice that fits within the time series themselves. A way of saying we can also force all that data to say what we want it to say.

Which 4, 5, or 6 of those factors will be relevant going forward?

With 6 factors, you could, by extensive testing, find which set, out of the 7.25 ∙ 10^14 possible combinations, might be worthwhile over some long-term past dataset.
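The size of that search space is easy to check, assuming the figure refers to unordered sets of 6 drawn from roughly 900 factors (the variable names are just for illustration):

```python
import math

def count_factor_sets(n_factors, set_size):
    """Number of unordered factor sets of a given size (n choose k)."""
    return math.comb(n_factors, set_size)

# Sets of 4, 5 and 6 factors out of 900; the 6-factor count is roughly 7.26e14.
for k in (4, 5, 6):
    print(k, f"{count_factor_sets(900, k):.3e}")
```

This counts only which factors are chosen. Any choice of smoothing, lookback, or weighting per factor multiplies the count further.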

The question would be: will it give you an advantage going forward? Will you have time to do all those tests?

We assume the data has been filtered, cleaned, rendered accurate as much as possible, and timely delivered. We never question their accuracy and almost take them for granted when we know that a lot of that data has been pasteurized before appearing in any company's books. Also, even the as-of-date of that data does not say that you would have had access to such data at that time. Still, you can use it for testing purposes.

The point being made is that even if you have some factors, whichever set you choose, it only represents your own data filtering mechanism on top of what the factors might provide. A smoothed EPS series with some lookback period is a different time series than its original as-of-date version. The same goes for any of the series. Once you manipulate, in some way, any of the Morningstar data series, you technically get a new set of factors which could hardly be called predictive due to their own delayed lookback periods.

The relevance of some of that data is to be questioned also. Quarterly data require longer lookback periods to make any of them significant. 100 quarters is still 25 years of quarterly data. Ten quarters is not significant enough to count as some reliable forecasting tool. Especially, if that smoothed data is already out of whack by half a quarter.

Guy -

I agree that the amount of data is important. For fundamentals data, I kinda had the same intuition that one needs many decades of data over many "business cycles" to firm up conclusions. It seems like Q might not have enough data, but I'm no expert. I guess the idea is to pull in other sources of information and/or rely on one's own professional/industry experience, versus just relying on Q data sets?

Stock A might be tracking alpha A perfectly and then switch to alpha B while stock B is completely all about alpha C.

Blue -

Not sure I follow. How would one apply the concept to evaluating an alpha factor?


As long as the ‘specific Returns’ are close to Total Returns, and also as long as the risk factors are within the bounds of the competition, I don’t worry about them too much. I used to constrain them very tightly in the Optimize API (e.g. in my old PARTY algo), but I think you lose alpha that way and might also be prone to overfitting, so I don’t do that anymore. Sometimes I instead use Thomas’ Orthogonalize Alpha function if I want to squeeze out only specific Returns.

On second thought, that's not completely true. I do worry about them and look at those risk exposures quite a bit. Ideally I'd like them to 'naturally' be as low as possible, without constraining or orthogonalizing them. I don't really know how to do that though, other than finding unique alpha. If you know, or find out, I'd be very interested to know.

@Grant, when considering dataset, we have to make choices.

When designing or modifying strategies, I start with the theoretical big picture: where it will all end, from a long-term perspective. Such a strategy might also have to make its own place among others, not only by outperforming the market in general but also by outperforming its peers.

I first look at it from a mathematical point of view. If I can put an equal sign somewhere, I consider it a lot better than an opinion or some guesses. An equal sign is very cruel: it answers only to yes or no. You made enough profits or you did not. As simple as that.

You can consider the whole stock market as this big price matrix P. Each column representing a stock over the duration of the portfolio. Each EOD entry \( (p_{d, j}) \) viewed as recorded history. We cannot change any of it. It is part of this big blob of market data.

For simulation purposes, and other considerations, we only take a subset of this huge price matrix on which we intend to trade. The selection process could be about anything. You will only barely scratch the surface of possibilities anyway.

One thing is sure, you cannot test the humongous set of available scenarios. You have to make a choice on whatever criteria you might find reasonable. Even something based on common sense will do.

Whatever that selection may be, it will be unique: on the order of 1 in 10^400+ possibilities. So, the point is: make a selection and live with it. It is important, but it is not the most important thing in this game. Time is. Will your trading strategy last or blow up?

The next thing that is important is the trading strategy H, as in Σ(H∙ΔP), your strategy's payoff matrix. Since the price difference matrix ΔP is also historical data, or going forward, never seen data that eventually will become history, all you have at your disposal to control anything is your trading strategy H. It is the same size as your price matrix subset and holds the value of your ongoing inventory in every position that you have taken, or will take, over the life of the portfolio.
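For concreteness, the payoff Σ(H∙ΔP) can be computed as below. This is only a sketch under assumed conventions that the post does not spell out: H[d][j] is the number of shares of stock j held from day d to day d+1, P[d][j] is the end-of-day price, and trading costs are ignored.

```python
def payoff(H, P):
    """Sum over days and stocks of holdings times the next price change.

    H[d][j]: shares of stock j held from day d to day d+1 (assumed layout).
    P[d][j]: end-of-day price of stock j on day d.
    """
    total = 0.0
    for d in range(len(P) - 1):
        for j in range(len(P[d])):
            total += H[d][j] * (P[d + 1][j] - P[d][j])
    return total

# Two stocks, three days of prices, so two days of price changes.
P = [[10.0, 20.0], [11.0, 19.0], [12.0, 21.0]]
H = [[100, 0], [100, -50]]
print(payoff(H, P))  # 100*1 + 0*(-1) + 100*1 + (-50)*2 = 100.0
```

Since ΔP is fixed by history, the only input under the trader's control in this expression is H.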

This makes H (your strategy), the most important part of the equation. It is trade and time agnostic. And this too has its own set of implications.

You could add an information and a decision matrix to all this. It would result in something like Σ(H∙I∙D∙ΔP). But that would not change the price matrix (P). It would only qualify the reasons for the volume changes in the stock inventory. But already the H matrix has the result of all the trade decision-making H = B - S. Technically, it is the only thing of interest for the relatively short-term trader. What governs your buying and selling matrices?

I look at the trade mechanics to answer that problem. I want the mechanics of the trade to be independent of the stock selection process. This way, I could change the stock selection and still find profits. It is like designing an intricate maze (using trading procedures) where the stock price will hit some of the boundaries and trigger trades. I try to find in the math of the problem the solutions to better returns.

However, I do know that if my trading strategy, for whatever reason, could not survive or generate more than the averages over some extended period of time over some historical data, then it has no value going forward. It ends up in the “remember this one and do not do that again bin”.

As much as long-term investors develop a long-term investment philosophy, the short to medium-term traders have to also develop a trader's philosophy. An understanding of what is being done during the whole trading process. And this can be driven by equations.

@ Joakim -

I never did quite understand the Q thinking in imposing the specific (idiosyncratic) returns versus common returns jazz, and now that they've gone over to a "signal combination" model for the Q fund, the requirement to have each algo with broad sector diversification and low style risk exposures makes even less sense, in my mind. But as the expression goes, "You can't fight city hall."

Here's a definition:

Specific Return
The return on an asset or investment over and above the expected return that cannot be explained by common factors. That is, the specific return is the return coming from the asset or investment's own merits, rather than the merits common to other, similar assets or investments. It is also called the idiosyncratic return.

What this implies is that common factors, which by definition are well-known, are predictive and can generate returns (better than just chance). This would say that the market at a gross level is not so efficient at all. This makes no sense. The alpha associated with common factors would have decayed a long time ago (if it ever existed in the first place). By definition, a factor is predictive or it is just noise. I suspect that common factors are really common noise.

If you look at A Professional Quant Equity Workflow, the risk exposure management doesn't get applied until the portfolio construction step. So, over-constraining risk exposure of the individual alpha factors may be self-defeating, in the sense that with enough diverse factors, the net risk exposure may be low enough, upon combination. My intuition is that there are dynamic diversification effects at play, and so if each alpha factor is orthogonalized with respect to the risk factors over some look-back period, any benefit of dynamic effects will be lost by applying the risk management too early in the workflow. There is a reason it is applied in the alpha combination step.

So, maybe for each factor to be evaluated, there is an "accretion test"--when it is combined with other factors and then the risk exposure constraint is imposed, does it add or subtract from the performance (with a test for over-fitting applied, as well)?

The shift to signal combination raises a valid question: when should risk exposure management be applied, at the individual signal level or at the signal combination (portfolio) level? At the individual signal level, alpha factors generated over the broad QTU base universe without constraining for anything are what I call "raw signals", and anything with a Q Sharpe of 1 or above is likely a good candidate. These "raw signals" inherently account for risk mitigation through asset diversification, simply by being evaluated over the whole QTU universe regardless of industry or sector. In short, I consider the factor's performance across the broad base universe, devoid of any other constraints as to style, risks and neutralities. The factor's performance is measured by one metric, Q Sharpe (returns/risks). I would then gather the raw signals that pass this threshold and apply leverage, dollar-neutral, position concentration and turnover constraints to the signal combination. You may notice that I left out beta and risk model (sector and common returns) constraints; I deem them unnecessary because they should already have been negated by the diversification of the QTU universe. Just my two cents.

@ James -

Yeah, it is not clear if the switch to a "signal combination" approach at the Q fund level means anything for writing algos. The contest rules were not changed, and so presumably they still represent general requirements for an algo (although I have been told that more niche strategies are considered, as well).

I would think that the "signal combination" approach, with each algo effectively an alpha factor in the architecture, to be assigned a weight and compensated proportionately, could greatly expand the participation rate of the users. This way, for example, one could write an alpha factor that trades a single industry sector or niche market, where one might have some expertise to bring to the table.

I gather that the Q fund alpha combination step may intercept the raw alpha vector input to MaximizeAlpha (or TargetWeights) anyway, so the Optimize API constraints of a given algo may not matter (at least that's how I would consider approaching the "signal combination" at the fund level, versus using the output of order_optimal_portfolio). So getting worked up over the Optimize API constraints probably doesn't make any sense; it all gets blended raw into the final combined alpha vector, and then constrained. This would be consistent with the architecture of such a fund operated under the "signal combination" paradigm, versus a "fund of funds" approach.

Quantopian's alpha generation problem is this:

Σ(H∙ΔP) = ω_a ∙ Σ(H_a∙ΔP) + , … , + ω_k ∙ Σ(H_k∙ΔP) + , … , + ω_z ∙ Σ(H_z∙ΔP)

They are trying to find the weighting factors ω that would maximize Σ(H∙ΔP). The problem is somewhat simplified since each of those trading strategies will be considered as some alpha source. However, each of those strategies does not generate the same amount of profits and therefore should be ordered by decreasing relevance. Which would imply that the highest producing strategy Σ(H_a∙ΔP) receives the highest weight ω_a and the highest allocation. Treating each strategy as equal is technically nonsensical. Strategies should battle to stay relevant and above a predetermined alpha threshold. This should read as: many try, but few are selected.

This does not change the author's compensation method: he/she receives 10% of the NET profits generated by ω_a ∙ Σ(H_a∙ΔP). Note that Quantopian intends to leverage these trading strategies. It was not said anywhere whether the author's strategy would be compensated accordingly, as lev_a ∙ ω_a ∙ Σ(H_a∙ΔP) ∙ 10%. I view the word NET as net of all trading expenses.

The advantage for Quantopian of considering strategies as alpha-signals is one of volatility reduction by diversification and outcome control. Simply by dynamically changing the weights ω_i(t) on the trading strategies, they can control what they want to see and treat each strategy as if it were a simple alpha-signal, a simple factor in their portfolio equation.

Guy -

The payout to each algo is its fraction of the total allocation (its weight) times the total net profit of the entire fund (not sure about leverage) times 10%. So the payout could be more, less or equal to the algo’s share of the total net profit. It all depends on how the algo weight is set and the net fund profit. The actual algo return doesn’t drive its payout, as I understand.

@Grant, we are saying the same thing. The sum of weights is Σω_i = 1. However, Quantopian could treat any of the trading strategies it considers for its fund as either overweight or underweight with varying degrees of leverage.

Quantopian in their business update said they had 25 authors and 40 allocated strategies. One author has the equivalent of 5 allocations ($50M). His or her strategy is more heavily weighted and should grab more of the net profits compared to others. The share of net profits will be ω_a ∙ Σ(H_a∙ΔP) / Σ(H∙ΔP). Which is what the equation said.

Guy -

Your equation ω_a ∙ Σ(H_a∙ΔP) / Σ(H∙ΔP) says that the payout depends on the profit of the algo, right? This is not the case. The algo could make $0 actual profit (Σ(H_a∙ΔP) = 0), but still get a payout if its weight is not zero.

@Grant, whatever the weight, if Σ(H_a∙ΔP) = 0, then ω_a ∙ Σ(H_a∙ΔP) = 0.

@ Guy - have a look at The relative allocation (weight) of capital determines the share of the overall net fund profit but the forward return of an algo doesn’t determine the payout to the author (although one would expect the trailing returns to impact the algos weight in the fund).

@Grant, would you pay anything as compensation for a strategy's participation in the Quantopian portfolio if all it generates in profits is Σ(H_a∙ΔP) <= 0. I do not think Quantopian would pay anything either for non-performing strategies. Otherwise, all the poorer performing strategies would drain the potential rewards of the best-performing ones.

@ Guy - With an extreme case, I was just illustrating the fact that the payout scheme has changed to:

Royalty = (weight of algorithm in signal combination) * (total net profit of the combination)

Your math and comments seemed to be inconsistent with this new formula, and so I wanted to make sure we were all on the same page. My interpretation of the change is that Q may be combining the raw alpha vectors from all of the algos, versus first running each through the Optimize API, and then combining the orders. If this is the case, the idea of individually traded strategies is not relevant. Based on what I can glean, the alphas from each algo are combined as a simple linear combination, since, per the Get Funded page "Each author’s share of that pool is proportional to their allocation or weight within the broader Quantopian investment strategy." On Quantopian Business Update, however, the description is a more general "The weight of your signal will be based on the quality of the alpha" and there is no reference to the weight being equivalent to the relative allocation of the algo in the fund. The linear combination of alpha vectors implied by equivalency of the weights and allocations seems over-constrained. I would think that a more general ML approach, using feature importance for the weights would be better, followed by application of the Optimize API for risk management. This would be more along the lines of weighting based on the "quality of the alpha" (i.e. the importance of the alpha in forecasting) without the constraint of a simple linear combination of alphas in the alpha combination step.
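The two readings of the payout being debated here can be put side by side. This is a sketch, not Q's attribution formula (which has not been published in this thread). It assumes `unit_profits[a]` is what algo a would earn at full allocation, so that the fund's net profit is Σ ω_i ∙ unit_profit_i, and it applies the 10% rate mentioned earlier in the thread. The function names are hypothetical.

```python
def royalty_by_fund_profit(weights, unit_profits, rate=0.10):
    """Reading 1: royalty = weight * total net profit of the combination * rate."""
    total = sum(weights[a] * unit_profits[a] for a in weights)
    return {a: weights[a] * total * rate for a in weights}

def royalty_by_own_profit(weights, unit_profits, rate=0.10):
    """Reading 2: royalty = weight * the algo's own profit * rate."""
    return {a: weights[a] * unit_profits[a] * rate for a in weights}

# Extreme case from the thread: algo "a" earns nothing but has non-zero weight.
weights = {"a": 0.5, "b": 0.5}
unit_profits = {"a": 0.0, "b": 200.0}
print(royalty_by_fund_profit(weights, unit_profits))  # a and b each get about 5
print(royalty_by_own_profit(weights, unit_profits))   # a gets 0, b gets about 10
```

When the weights sum to 1, both schemes pay out the same total (10% of the fund's net profit) and differ only in how it is split among authors.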

Back to the main topic of this thread ("How to accept/reject alpha factors?"), reportedly Q is working on some way of evaluating correlation to algos already in the fund. This might be the first cut in whether to accept/reject an alpha factor. Similar to the style risk factors, there's no point in submitting something already in the Q fund (although perhaps there's an advantage to diversifying the same strategy across multiple authors, assuming that the expenses associated with acquiring and maintaining an author are low; it also could have an amplification effect, since engaging more authors with real money will tend to attract more users via zero-cost peer-to-peer marketing..."Hey, I made $1000 on Q. You might want to give it a try, too.").

@Grant, we are saying the same thing. Let's take Q's formula:

Royalty = (weight of algorithm in signal combination) * (total net profit of the combination)

weight of algorithm in signal combination = Σ(H_a∙ΔP) / Σ(H∙ΔP) = ω_a

total net profit of the combination = Σ(H∙ΔP)


Royalty = [(ω_a) ∙ (Σ(H_a∙ΔP)) / Σ(H∙ΔP)] * Σ(H∙ΔP) = ω_a ∙ Σ(H_a∙ΔP)

and your 10% royalty remains proportional to what your trading strategy (ω_a) ∙ (Σ(H_a∙ΔP)) is producing. You produced more than some other trading strategy, you should get more in royalties. Just as in the contest leaderboard.

Should you first want to add the outcome of all the strategies and divide that by the number of allocated strategies, this would give:

Royalty share = Σ [ω_a ∙ Σ(H_a∙ΔP) + , … , + ω_k ∙ Σ(H_k∙ΔP) + , … , + ω_z ∙ Σ(H_z∙ΔP)] / i

which would penalize the higher performers to the benefit of the low-performing strategies. Low-performing strategies would find that scheme quite acceptable. In a way free money.

Nonetheless, Q should provide its “mathematical” attribution formula to clear things up.

@Grant, back to topic. Let's start with: there is no universal stand-alone factor able to explain the gigantic ball of variance that we see in stock market prices. There are gazillions of factor combinations we could test over past market data and it still would not give us which combination would best prevail going forward.

The only thing you can do is compare one factor to another within this ocean of variance where a lot of randomness also prevails. However, whichever set of factors you want to select, they will form a unique combination that will have to deal with the portfolio's selected stocks. You change the stock selection method and the strategy will behave differently giving different results: Σ(H_a∙ΔP) > , … , > ω_k ∙ Σ(H_k∙ΔP) > , … , > ω_z ∙ Σ(H_z∙ΔP)

You are always trying to find a better strategy than Σ(H_a∙ΔP) in order to get better results. You can also add a whole set of constraints in order to control the general behavior of the strategy as in the contest rules for instance. Or you can tweak it to death to really show that past results are no guarantee of future performance. But it will not change the mission, you want the better performing one nonetheless.

The preoccupation might not be looking for factors per se, but for trading methods you can control. Otherwise, you are at the mercy of your code, of your selection process, and your short-term trading philosophy. Not to mention the math of the game.

Guy -

your 10% royalty remains proportional to what your trading strategy (ω_a) ∙ (Σ(H_a∙ΔP)) is producing. You produced more than some other trading strategy, you should get more in royalties. Just as in the contest leaderboard.

Your statement is incorrect. The royalty is the algo weight times the net profit of the entire fund. What the algo produces is not relevant (although its historical returns must factor into computing the algo weight in the "signal combination" algorithm). I'm not sure what else to say...have a look at these two links again (and contact Q if it is still not clear):


Is there an objective, step-by-step method to accept/reject alpha factors?

I would do the following (though not all of it is supported by Q at this time, I guess):
1. Simple regression of alpha factor on forward returns (Alphalens metrics: p-value, IC, etc.). If significant, proceed to Step 2.
2. Control for Q factors. Throw in the Q factor returns (multiple linear regression) and check whether the alpha factor is still significant (this is not yet supported by Alphalens). If it is, proceed to Step 3.
3. Control for existing alpha factors. Throw in the other alpha factors (another multiple linear regression) and check whether the additional alpha factor is significant. This will not only help in assessing the significance of the additional alpha factor but will also help in finding the appropriate linear combination of all alpha factors.
4. Control for other participants' total alpha factor (this can be controlled by Q internally). However, at this time Q only gets to see each participant's trades/positions and not the alpha factor, so this may not be possible in the current framework.

I don't know if it's possible to implement Step 2 as historical Q factor returns are not exposed. The methodology is shared however so it can be created externally.
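Steps 1 to 3 above amount to checking the t-statistic on the alpha factor's coefficient as more control columns are added. Here is a minimal standard-library sketch of that check. Everything in it is an assumption for illustration: the function names, the data layout (one observation per list element), and the stand-in control series, since historical Q factor returns are not exposed.

```python
import math

def invert(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    k = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(m)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def ols_tstats(y, columns):
    """OLS of y on an intercept plus the given columns.

    Returns (betas, t_stats); index 0 is the intercept, index 1 the first column.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[r][a] * y[r] for r in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[r] - sum(X[r][a] * beta[a] for a in range(k)) for r in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    tstats = [beta[a] / math.sqrt(s2 * inv[a][a]) for a in range(k)]
    return beta, tstats
```

Step 1 would be `ols_tstats(fwd_returns, [alpha])`, step 2 `ols_tstats(fwd_returns, [alpha, risk_1, risk_2])`, and step 3 would append the already-accepted alpha factors as further columns, each time looking at the t-statistic at index 1. A factor that is significant alone but not after adding the controls is mostly re-expressing those controls.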

I edited my original post above to include a link to Thomas W.'s orthogonalize function, since it seems like it could be handy (Joakim had already pointed it out, but I figured it would help to promote it to the top level of this discussion thread).

It raises an interesting point, since on the Get Funded page it says clearly "We are building a portfolio of uncorrelated investments." In practice, what does this mean for evaluating alpha factors and algos? In the extreme, it may mean that only the uncorrelated returns of an algo (the specific/idiosyncratic versus common) are of interest, where anything that is a published risk factor (sector and style) and already represented in the fund would be considered 'common.' Basically, one has to orthogonalize with respect to the risk factors and the fund to determine if there's any specific/idiosyncratic alpha of interest to the fund.

Any guidance on how to use the time scale of the factor in determining if it should be accepted? For example, here Thomas W. points out "Here is a new iteration of this tearsheet. Instead of cumulative IC it now just displays daily IC which makes it easier to see which horizon the signal is predictive for." So what is one to do with this information, as a first-cut in evaluating a factor? Is there an optimal range for the peak in the IC versus delay, for example?

A way to evaluate the value of factors is with relative performance. A trading strategy has structure. It is usually quite simple, a single do while loop (do while not finished, if this then trade in this manner). And whatever stock trading strategy, it can be compared to a benchmark like SPY. Σ(H_a∙ΔP) > Σ(H_(spy)∙ΔP) ?

If all that is different between two strategies is the use of one factor, then comparing those two strategies would be as easy as Σ(H_a∙ΔP) > Σ(H_b∙ΔP) > Σ(H_(spy)∙ΔP) ? All other things being equal.

However, if you change the stock selection method, the number of stocks to trade, the time horizon, the leverage, the rebalance timing, or the available initial capital, the picture might change drastically. Saying that the factor difference in question might have been good at times and bad at others. As if the factor had some relevance in the stock selection _a and not so much in _b just because a different set of stocks were considered. And depending on the stock selection method, there could be gazillions of gazillions of possibilities.

This might render the value of an additional factor almost irrelevant as we increase the number of factors treated. Especially if the factor considered is not part of the primary factors: _f1 > _f2 > , … , > _f4 > , … , > _fn > since these factor values decay as you increase their number as illustrated in the following:

It would require more extensive testing, meaning a much greater number of stock selection methods just to partially ascertain the value of a single factor. As if trying to give statistical significance to a distant factor that might be more dependent on the fumes of variance than anything else.

Is there a need to go into that kind of exhaustive search beyond a few primary factors when you have no certainty that a more distant factor might prevail going forward? And are the few factors considered that relevant if they have little predictive power? The more distant the factor, the less predictive it should be, since its weight relative to the others is diminishing. These factors will remain in order of significance, no matter the set of factors we use.

@ Guy -

I think one needs a very high Sharpe ratio to be able to have much confidence whatsoever using even 10-20 years of look-back. I haven't seen an example in a while, but the Bayesian cone thingy that is used on some of the Q plots illustrates this point. It gets broad in a hurry, and then it's anybody's guess, unless the Sharpe ratio is really high (and if it is, it probably is due to some form of bias/over-fitting).

@Grant, on the Sharpe thing, not necessarily. Normally, I would tend to agree, but.

For instance, from my modifications to your clustering scenario, (see:, my average Sharpe was 1.81. And in my modified robot advisor script scenario (see:, the average Sharpe was 1.85. The average Sharpe ratio could be considered as almost equal and not that high considering. Nonetheless, the difference in performance between these two strategies was extremely high. They used quite different trading methods even if both used an optimizer to settle trades.

In the former case, all is done to reduce volatility and drawdown while squashing the beta to near zero. In the latter scenario, the strategy is seeking volatility and trades in order to increase performance. Evidently, the price was a higher beta (0.61 - 0.65) even though the selected stocks were all high beta stocks (>1.0) and should have averaged at greater than 1.0. But, that was not the case. The differences in performance are due to the trading methodology used and their long-term objectives. In essence, the result of the game they respectively played.

The Bayesian cones we display in the tearsheets have to be expanding due to recorded portfolio volatility and their increasing variance over time. These cones will increase for any trading scenario. They will grow wider and wider the more time you give them. We have a hard time predicting tomorrow, and there we go extrapolating a year or two ahead. What would you expect except a widening cone, since the “uncertainty” of the estimates is certainly rising?

Why is it that anyone generating higher performance levels is considered as over-fitting? Can't we generate some higher performance simply because we game the game differently?

For example, I plan for my game to grow with time. I usually concentrate on increasing the number of trades and structuring the game so that the average net profit per trade also rises with time. Should I not be able to do that? And if not, why not?

My sense is that chasing performance by ranking alpha factors and picking the top 5 or so will lead to a lot more volatility than incorporating all factors that have predictive power on an individual basis, regardless of relative risk-adjusted returns. Joakim summarizes one recipe:

Basically I’ve relied on p-values below 0.05 with a mean IC of above 0.01. And Thomas’ odd/even quarters for training/testing factors to avoid overfitting.

This still leaves the task of combining the accepted alpha factors, which could include some performance-based weighting (alpha combination is not the main topic of this thread). However, it would need to incorporate the statistics of very limited trailing data sets, relative to economic time scales. There is a statistical significance to X > Y; if X & Y have relatively large error bars, then one might as well say X = Y (all other things being equal).

Why is it that anyone generating higher performance levels is considered as over-fitting?

The question is really how much out-of-sample data is required to determine that something other than luck is the mechanism for higher performance? For example, have a look at It's a plot of the rolling 12-month Sharpe ratio (SR) for the S&P 500 over the last 25 years. Given the underlying volatility in the SR, how much out-of-sample data would be required to prove that a strategy does better than the market on a risk-adjusted basis, and that skill was the reason? There is such a thing as skill, but my read is that proving it might take a lifetime.
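A back-of-the-envelope version of that question: under the strong assumption of independent, identically distributed returns, the t-statistic of a mean return is approximately the annualized Sharpe ratio times the square root of the number of years. The sketch below inverts that relation. It ignores the estimation error in the volatility, autocorrelation, and regime changes, so it is likely a lower bound, and the function name is hypothetical.

```python
from statistics import NormalDist

def years_needed(annual_sharpe, confidence=0.95):
    """Years of data for t ~ Sharpe * sqrt(years) to clear a one-sided z threshold."""
    z = NormalDist().inv_cdf(confidence)
    return (z / annual_sharpe) ** 2

for sr in (0.25, 0.5, 1.0, 2.0):
    print(sr, round(years_needed(sr), 1))
```

To ask whether a strategy beats the market rather than zero, `annual_sharpe` would be the Sharpe ratio of the strategy's returns in excess of the benchmark, which is typically much smaller than the strategy's own Sharpe ratio, so the required history grows with the inverse square.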

@ Joakim -

And Thomas’ odd/even quarters for training/testing factors to avoid overfitting.

Is there an example? And what are the recommended pass/fail criteria?

It is not intuitive that somehow comparing odd/even quarters would be able to sniff out over-fitting, once the factor has been written. For example, if the author has inadvertently, or intentionally applied look-ahead bias, then the bias will likely be the same for any given quarter. The only way to detect the bias, it would seem, would be to wait for out-of-sample data and then try to detect a change (which is why the contest is 6 months...which I guess if there is something grossly funky going on might be enough to detect over-fitting...but only if the factor time scale is short relative to 6 months, and the signal-to-noise ratio is sufficiently high, which is not likely to be the case for an individual factor).

I continue to be confused by the new signal combination approach to constructing the Q fund. I'm pretty confident that Q should be incentivizing individual factors, not multi-factor algos, if they want to do the best job in combining the factors. This would say that each factor should get its own algo, which would allow Q to capture the "born on date" to determine how the factor has performed over 6 months or more out-of-sample. If this is correct, there's probably a way to construct a "factor algo" that allows for evaluating the individual alpha vector, without all of the irrelevant risk control and trading gobbledygook.


I'm pretty confident that Q should be incentivizing individual factors, not multi-factor algos, if they want to do the best job in combining the factors.

I think that your confusion is related to your above assumption. About 95% of fund algorithms are sourced from contest results and the rest from Q's automated screening processes. That said, most if not all of these submissions are multi-factor combinations, each of which is treated as an individual "signal" in the signal combination of the Q fund. Each of these individual signals has its own tradable universe within the broad-based QTU universe, position concentration, time to start trade execution, etc. The target is also for these individual signals to be uncorrelated with each other. So there is a layer that takes all these uncorrelated individual signals, post-processes them (i.e. nets out positions in the same stocks) and somehow applies a weighting scheme (perhaps proprietary to Q) that meets the desired trading strategy after factoring in the associated risk-mitigating schemes.
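To make the netting step concrete, here is a rough sketch of what combining signals could look like. It is purely illustrative: the actual Q process is not public, and the function name, the strategy names, and the gross-exposure normalization are all assumptions of mine.

```python
def combine_signals(signals, strategy_weights):
    """Net per-strategy target weights into one book.

    signals: {strategy_name: {ticker: target_weight}}
    strategy_weights: {strategy_name: weight given to that strategy}
    Returns net target weights rescaled to a gross exposure of 1.0.
    """
    net = {}
    for name, positions in signals.items():
        w = strategy_weights[name]
        for ticker, target in positions.items():
            # Opposite positions in the same stock cancel out here.
            net[ticker] = net.get(ticker, 0.0) + w * target
    gross = sum(abs(v) for v in net.values())
    if gross == 0:
        return net
    return {ticker: v / gross for ticker, v in net.items()}


# Hypothetical signals: the two algos disagree on AAPL, so it nets to zero.
signals = {
    "algo_a": {"AAPL": 0.5, "MSFT": -0.5},
    "algo_b": {"AAPL": -0.5, "XOM": 0.5},
}
book = combine_signals(signals, {"algo_a": 0.5, "algo_b": 0.5})
print(book)
```

Netting before trading means the cancelled AAPL leg is never sent to the market at all.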

@Grant, you say:

There is such a thing as skill, but my read is that proving it might take a lifetime.

I agree.

That kind of study has been done. It turns out it would take some 38 years for a professional money manager to show skill prevailed over luck at the 95% level based on sufficient data (10 years and more). No one is waiting or forward-testing for that long. And even if they did, they would again be faced with the right edge of their portfolio chart: uncertainty, all over again.

In all, it would be a monumental waste of time, opportunities, and resources. It is partly why we tend to do all those backtests: to demonstrate to ourselves whether, over some past data, our strategies succeeded in some way in outperforming. Evidently, we will throw away those strategies that had little or no value whatsoever. If a trading strategy cannot demonstrate it could have survived over extended periods of time, why should we even consider it for our arsenal of trading strategies going forward?

Often, in the trading strategies I look at, the impact of the high degree of randomness in market prices is practically totally ignored. It is as if people treat the market price matrix P as a database of numbers from which they can extract statistical significance of some kind, either at the market or stock level, and from there try to make some sense of all those numbers using all types of analysis methods.

Only under a high degree of randomness can any of the following methods have their moments of coincidental predominance, enough to make us think they might be somewhat predictive. Whether it is a sentiment indicator trying to show the wisdom of crowds, or machine learning, deep learning or artificial intelligence, they will all have their what-if moments. Whether you use technical indicators, parameters, factors, residuals, principal component analysis, wavelets, multiple regressions, quadratic functions, or more, it might not help either in deciphering a game approaching a heads-or-tails type of game. Upcoming odds will change all the time and even show somewhat unpredictable momentary biases.

Nonetheless, it is within this quasi-random trading environment that we have to design our trading strategies. And we have to design them in such a way as to not only provide positive results and outperform market averages, but also, at the same time, outperform our peers, not just over the immediate momentum thingy, but ultimately over the long term.

The end game is what really matters. Can our trading strategies get there? That is the real question. What kind of game can we design within the game that will allow us to outperform market averages and our peers? What kind of trading rules should we implement in order to do so? This goes back full circle to what our trading strategy H does, to how we manage our stock inventory, and what will be the outcome of our forward-looking payoff matrix:

Σ(H_(ours)∙ΔP) > Σ(H_(peers)∙ΔP) > Σ(H_(spy)∙ΔP) > Σ(H_(others)∙ΔP)?

@ James -

My main hypothesis is that rather than incentivizing multi-factor algos via the contest it would be better for Q to handle the alpha combination step at the fund level across all factors. Otherwise Q hasn’t really changed much with their switch to a signal combination approach. My bet is that there is some performance degradation by not accessing all of the factors; the pre-combination at the algo level may be sub-optimal.

@Grant, I think @James expressed it correctly. Q cannot use a single factor from within a trading strategy. It would require that they know which factor is within a particular trading strategy and its impact. That, in turn, would require that Q sees the code. It would go against the prime directive that one's code is protected from prying eyes. If Q can see your code at their own discretion, they do not need you in a future allocation picture or in the contest winning circle.

What Q can do, is take the outcome of a strategy as a whole, look at its trading order output as a single signal or trade vector it can mix with other signals from other strategies. As @James noted, this might give Q the ability to pre-add or pre-remove redundancies and cross-current trades. Thereby, going for the net trade impact for a group of strategies with the effect of slightly reducing overall commissions and increasing trade efficiency.

However, even doing this requires a 3-dimensional array: Σ(H∙ΔP), with strategy, stock, and price as respective axes. Q's job then becomes weighting each strategy's contribution to the whole, under whatever principles they like, as in ω_i ∙ Σ(H∙ΔP)_i, where ω_i is the weight (i = 1 to k) attributed to strategy i. The problem becomes even more complicated because of overlapping market orders: each strategy will nonetheless continue to act as if its trades had actually been taken, which distorts the future outcome of the affected strategies as well as Q's mix.
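As a toy illustration of the weighted payoff ω_i ∙ Σ(H∙ΔP)_i, here is a sketch under my own assumptions about the layout: each strategy's H is a list of per-period rows of share counts, ΔP is the matching list of per-period price changes, and the numbers are invented. It deliberately ignores the overlap and distortion problem described above.

```python
def payoff(holdings, price_changes):
    """Σ(H∙ΔP): shares held times price change, summed over periods and stocks."""
    return sum(
        h * dp
        for h_row, dp_row in zip(holdings, price_changes)
        for h, dp in zip(h_row, dp_row)
    )


def fund_payoff(strategy_holdings, price_changes, weights):
    """Σ_i ω_i ∙ Σ(H∙ΔP)_i over all strategies i."""
    return sum(
        weights[name] * payoff(h, price_changes)
        for name, h in strategy_holdings.items()
    )


# Two periods, two stocks; rows are periods, columns are stocks.
delta_p = [[1.0, 2.0], [0.5, -1.0]]
strategies = {
    "a": [[10, 0], [10, -5]],   # long stock 0, later short stock 1
    "b": [[0, 4], [2, 4]],
}
omega = {"a": 0.75, "b": 0.25}

print(payoff(strategies["a"], delta_p))          # 20.0
print(payoff(strategies["b"], delta_p))          # 5.0
print(fund_payoff(strategies, delta_p, omega))   # 16.25
```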

Also, they cannot use that many trading strategies in their mix, since the number of stocks appearing in more than one strategy will increase with the number of strategies considered. As a consequence, the more strategies they add to their signal mix, the worse the redundancy problem and the distortion impact become.

@ Guy -

All I'm saying is that at the fund level, Q should follow the workflow outlined by Jonathan Larkin on:

I could be wrong, but I think they'd be better off taking in individual alpha factors, versus having authors chug through the entire workflow, combining lots of alpha factors in an attempt to smooth out returns, manage risk, etc. My impression is that it's not what a hedge fund would do typically; each narrowly focused alpha would feed into the fund global alpha combination, as Jonathan has shown. Again, my intuition could be off, but incentivizing authors to do the entire workflow, basically constructing a super-algo, is the wrong approach.

I guess if I were tasked with doing signal combination, I'd want the individual signals (i.e. the alpha factors), versus having them pre-combined. For example, would it be better to have access to all ETFs, or just a handful of them? The number of degrees of freedom for the former is much higher.

@Grant, the problem here is the individual's IP.

In a hedge fund, they can mix and combine alpha signals any which way they want, at the portfolio and signal level. They have access to it all, it is their code after all. For Q, they are not “allowed” to look inside a trading strategy otherwise it blows their fiduciary assured trust. If we ever see they looked inside our trading strategies, the limited trust we might have might simply fly away.

You ask: For example, would it be better to have access to all ETFs, or just a handful of them? Due to the potential liquidity problems, it will turn out to be just the most active ETFs that should be of interest. You still need someone on the other side to take the trade.

@ Guy -

To evaluate the quality of an individual alpha factor, it is not necessary to review the code or even to know its “strategic intent.” It can be a black box (the same as an algo that combines multiple factors). Either the factor predicts the future across some universe of stocks or it doesn’t.

I think the liquidity problem goes away at the fund level once factors are combined. One isn’t trading the factors individually and independently.


@ Joakim -

And Thomas’ odd/even quarters for training/testing factors to avoid overfitting.

Is there an example? And what are the recommended pass/fail criteria?

Sorry I missed this earlier. Here's the link to Thomas' post that includes the notebook for researching and cross-validating alpha factors over odd/even quarters. I find it quite useful for avoiding overfitting, though I'd prefer to use 'random' quarters instead to minimize any risk of fitting on seasonal trends. Up to you of course but personally I would include this one in your Alpha Research Toolkit.
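For readers who haven't opened the notebook, the splitting idea can be sketched roughly as follows. This is not Thomas' code: the function names are made up, and which half counts as "train" versus "test" is an arbitrary choice here. The second function is the 'random' quarters variant mentioned above, with a fixed seed so the split is repeatable.

```python
from datetime import date
import random


def quarter_of(d):
    """Calendar quarter (1-4) of a date."""
    return (d.month - 1) // 3 + 1


def split_odd_even_quarters(dates):
    """Train on Q1/Q3 dates, test on Q2/Q4 dates (assignment is arbitrary)."""
    train = [d for d in dates if quarter_of(d) % 2 == 1]
    test = [d for d in dates if quarter_of(d) % 2 == 0]
    return train, test


def split_random_quarters(dates, seed=0, train_fraction=0.5):
    """Assign whole (year, quarter) blocks to train or test at random."""
    quarters = sorted({(d.year, quarter_of(d)) for d in dates})
    rng = random.Random(seed)
    rng.shuffle(quarters)
    n_train = round(len(quarters) * train_fraction)
    train_quarters = set(quarters[:n_train])
    train = [d for d in dates if (d.year, quarter_of(d)) in train_quarters]
    test = [d for d in dates if (d.year, quarter_of(d)) not in train_quarters]
    return train, test


# Toy calendar: first day of each month over two years.
dates = [date(y, m, 1) for y in (2018, 2019) for m in range(1, 13)]
odd_train, even_test = split_odd_even_quarters(dates)
rand_train, rand_test = split_random_quarters(dates, seed=7)
print(len(odd_train), len(even_test), len(rand_train), len(rand_test))
```

One would then tune the factor using only the train dates and compare its IC on the test dates; a large drop is the warning sign.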

I do also agree with your comment that future live data is the best for testing/measuring overfitting. This one is pretty good though I think when initially researching and developing alpha factors and we only have access to past data. Live paper trading, in my opinion, is more part of the final test before live trading with real money.

Using this notebook is not nearly as alluring (and addictive) as datamining in the backtester (which I'm still a victim of sometimes unfortunately), but I reckon it's a lot better for finding robust factors that are general enough to work reasonably well on future data. Assuming one starts off with some economic or market behaviour rationale first, and not just randomly trying stuff to see what works. Again, easier said than done. Datamining, because it works so well (on past data), tends to boost my ego and increase my blind-spots... :(

Thanks Joakim -

I updated my original post above with a link to Thomas' over-fit testing tool.

Size of universe over which factor is effective

Just a suggestion but maybe also include ‘type’ of universe as well? Some factors may only be predictive on small/large caps, high/low volatility or high/low beta stocks, only certain sectors, etc.

Some factors may also be better short indicators, and others better long indicators; in other words, they may not all be symmetrical. One challenge might be how to combine all these different factors on different universes into a ‘meta-factor’ that the Optimizer can use. Perhaps by using multiple pipelines and combining them outside, e.g. in Rebalance?
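One possible way to build such a 'meta-factor', sketched with plain dictionaries rather than actual pipeline output: z-score each factor within its own universe, keep only the side it is trusted on, and sum, treating stocks outside a factor's universe as contributing zero. The function names and the `side` convention are hypothetical, and equal weighting is just the simplest choice.

```python
from statistics import mean, pstdev


def zscore(scores):
    """Z-score a {ticker: value} dict within its own universe."""
    values = list(scores.values())
    m, s = mean(values), pstdev(values)
    if s == 0:
        return {t: 0.0 for t in scores}
    return {t: (v - m) / s for t, v in scores.items()}


def meta_factor(factors):
    """Combine (scores, side) pairs; side is 'both', 'long' or 'short'.

    'long' keeps only positive z-scores, 'short' only negative ones.
    Stocks missing from a factor's universe contribute zero for it.
    """
    combined = {}
    for scores, side in factors:
        for ticker, z in zscore(scores).items():
            if side == "long":
                z = max(z, 0.0)
            elif side == "short":
                z = min(z, 0.0)
            combined[ticker] = combined.get(ticker, 0.0) + z
    return combined


# Hypothetical factors on different, partly overlapping universes.
value_like = ({"A": 1.0, "B": 2.0, "C": 3.0}, "both")
short_only = ({"C": 0.0, "D": 2.0}, "short")
alpha = meta_factor([value_like, short_only])
print(alpha)
```

The combined dict could then be handed to an optimizer as a single alpha vector, which is roughly the "combine them outside the pipelines" idea.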

Thanks Joakim -

Yes, there are many "flavors" of factors that would then need to be combined. One issue is the potential for over-fitting goes up with the number of degrees of freedom (e.g. one could concoct a long-only factor and then apply it to the best-performing sector in recent times, and presto--we have a winner!).

Perhaps for a given factor, there's an optimal long/short tilt (without going totally long or short)?

One issue that I don't ever recall being tackled on Quantopian is financialization. At a gross level, governments and the financial sector are in cahoots and probably exert a dominant effect on individual stocks and the markets that may make this whole factor research thing moot. I wonder if there are companies/industries that tend to be more immune to the whims of finance, and actually make money the old fashioned way--by steadily providing goods and services to their customers that are more valuable than what their competition can provide.

Yes, good point. Overfitting I think is a real risk as the complexity of the factor model goes up. I find getting the balance right quite difficult and sometimes I have to remind myself to just follow the ‘KISS’ principle. The hunt for real alpha ain’t easy!

@Grant, you are looking for factors that might be almost irrelevant going forward due to the very nature of what you are trying to observe.

Some years back, someone doing a study went looking for the factor most highly, if coincidentally, correlated with the DJI index. It turned out to be the price of turnips at a local market in Mumbai. Now, even knowing this, I would still not bet my shirt on it continuing to be highly correlated in the future.

In the '70s, studies showed that we had the length of dresses correlated to the DJI index, and the sale of aspirin inversely correlated. And again, going forward, I would not bet that these kinds of trends might have or would have continued. Moreover, if we redid those studies today, we would find that they, in fact, did not. Note that I have not done those studies, do not intend on doing them and am not interested in ever doing them. But, you do not need to do them to ascertain that you would not rely on such things to build a stock portfolio.

Above, I added another consideration:

Sensitivity to specific time/day of week trading

There is some guidance by Thomas W. on :

If your algorithm is sensitive to trading times, it's indicative of short-term alpha or some noise you're trying to overfit to. My advice is to set to always trade as close to the close as possible and never change it.

...we are now focused on your EOD holdings, and evaluating your algorithm using those (using the tearsheet I posted).

So is the standard tool for evaluating both factors and backtests now the notebook Thomas W. posted (and not Alphalens or something else)?

I'm also confused...if trades are to be entered at EOD, what is the point of having a slow, minutely backtester? Why not revert back to the daily backtester? Overall, my sense is that Q wants slowly varying daily alpha factors, that they can combine at the fund level--the minutely backtester (and even running backtests) would seem to be overkill, right? If I understand correctly, they just need alpha factors for the Q fund.

If your algorithm is sensitive to trading times, it's indicative of short-term alpha or some noise you're trying to overfit to.

This isn't exactly clear to me. Why is it unlikely that an alpha factor has latched onto some structural time-of-day inefficiency in the market? e.g. at open there is a consistent overreaction to some risk that sorts itself out by the end of each day as the market digests information. Seems very plausible to me. I'd like to see some evidence that algorithms that are sensitive to time-of-day scheduling consistently perform worse OOS.

what is the point of having a slow, minutely backtester?

Presumably it produces a somewhat more accurate fill simulation than using EOD prices would. However, yes, if Quantopian wants us to find alpha factors where it doesn't matter if we get into a position a day or two or three after the signal flashes, minutely zipline doesn't seem the ideal tool for the task.

Viridian Hawk -

Thanks. Yeah, I'm not sure I follow the guidance, and it is buried in Thomas' thread, but it would seem like a pretty important consideration. I'll send a question in directly to Q support.