cardinality constrained mean reversion - proof of concept

Here is an attempt to implement the ideas in this research paper titled "Identifying small mean reverting portfolios": http://www.di.ens.fr/~aspremon/PDF/MeanRevVec.pdf

The gist of the paper is that given N stocks in a universe, the algorithm identifies k stocks that are mean reverting (stationary). As a proof of concept, the attached code does not implement a trading strategy but only logs if the smaller portfolio is stationary or not.

Unfortunately, the optimization using cvxopt for this algo is quite intensive and quantopian blows up with out of memory or timeouts if universe contains more than 15 or so stocks. I think this algo will work well if universe is large (>50 stocks) and we select a basket of 8-10 stocks that are mean reverting.

Please play around with it and let me know if you identify any defects.

168
Backtest from to with initial capital
Total Returns
--
Alpha
--
Beta
--
Sharpe
--
Sortino
--
Max Drawdown
--
Benchmark Returns
--
Volatility
--
 Returns 1 Month 3 Month 6 Month 12 Month
 Alpha 1 Month 3 Month 6 Month 12 Month
 Beta 1 Month 3 Month 6 Month 12 Month
 Sharpe 1 Month 3 Month 6 Month 12 Month
 Sortino 1 Month 3 Month 6 Month 12 Month
 Volatility 1 Month 3 Month 6 Month 12 Month
 Max Drawdown 1 Month 3 Month 6 Month 12 Month
# Backtest ID: 553a868cd40f100d9417a1e3
There was a runtime error.
49 responses

A simple crude trading strategy around the stationary series.

168
Backtest from to with initial capital
Total Returns
--
Alpha
--
Beta
--
Sharpe
--
Sortino
--
Max Drawdown
--
Benchmark Returns
--
Volatility
--
 Returns 1 Month 3 Month 6 Month 12 Month
 Alpha 1 Month 3 Month 6 Month 12 Month
 Beta 1 Month 3 Month 6 Month 12 Month
 Sharpe 1 Month 3 Month 6 Month 12 Month
 Sortino 1 Month 3 Month 6 Month 12 Month
 Volatility 1 Month 3 Month 6 Month 12 Month
 Max Drawdown 1 Month 3 Month 6 Month 12 Month
# Backtest ID: 553a9323ef02fe0d1a37a387
There was a runtime error.

Satyapravin, This is a really interesting result. Looks like all the 'jumps' in the portfolio are to the upside. Have you tried backtesting it over a longer time frame or over other stocks or ETF's?

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

The algo is too computationally intensive and demands lot of memory and CPU. I don't think it scales well in quantopian unfortunately. Have to run it outside quantopian with a larger universe of atleast 50-100 stocks.

There's an outfit (also in Boston) that offers high-performance computing for financial applications:

https://elsen.co/

Perhaps they could lend a hand, if the Q research platform can't handle it.

Thanks for the reference Grant. I was thinking of splitting the universe into buckets of 10-15 stocks and trying to identify a mean reverting subset in each basket on Q. I will take a look at elsen.

Finally managed to crack it. The lesson I learnt is that mean reversion is very noisy around mean and non gaussian. Standard deviation or zscores don't help. I need to find a better measure of dispersion. Anyway this backtest has a sharpe of 2 and can be improved because there is a long period of 6 months where algo does not trade at all. This can be changed by increasing the universe of stocks to scan for opportunities.

168
Backtest from to with initial capital
Total Returns
--
Alpha
--
Beta
--
Sharpe
--
Sortino
--
Max Drawdown
--
Benchmark Returns
--
Volatility
--
 Returns 1 Month 3 Month 6 Month 12 Month
 Alpha 1 Month 3 Month 6 Month 12 Month
 Beta 1 Month 3 Month 6 Month 12 Month
 Sharpe 1 Month 3 Month 6 Month 12 Month
 Sortino 1 Month 3 Month 6 Month 12 Month
 Volatility 1 Month 3 Month 6 Month 12 Month
 Max Drawdown 1 Month 3 Month 6 Month 12 Month
# Backtest ID: 553f0cdee0e8390d36933492
There was a runtime error.

Hi Satyapravin,

The paper http://www.di.ens.fr/~aspremon/PDF/MeanRevVec.pdf is quite involved; I'd really have to roll up my sleeves to understand it (although I have some of the basic linear alebra, statistics, and conceptual tools). Generally, my interest is to see if the approach could be applied in the Q research platform to identify baskets of securities that could then be used in an algorithm like OLMAR (http://arxiv.org/ftp/arxiv/papers/1206/1206.4626.pdf) or a variant.

I have the feeling that the Q research platform may "break down" at some point. For example, say I wanted to pluck out mean-reverting baskets of 50-200 stocks out of a universe of 2000. Would it work, if I'm willing to wait hours/days?

Would you be willing to provide a break-down of what portion of the paper you applied, what it does, and why it works?

Also, it seems important to be able to explore the time scale of mean reversion? Does this analysis provide a means to do that? For example, if the time scale is too short, then Quantopian, running on minute bars, won't work, and if it is too long, then the deviation will have to be very large (which seems unlikely).

Thanks,

Grant

Hi Grant,

The paper is complex but after several reads became much clearer. Here is the gist of the paper. First we run lasso regression on stock prices to obtain an estimator which is the coefficient in VAR(1) model like this:

for j in range(0, col):
Ahat[:,j] = Lasso(0.1).fit(pval[:-1,:], pval[1:,j]).coef_


Then we compute the covariance matrix using shrinkage model like this:

oas = OAS()
oas = oas.fit(pval)
B = oas.covariance_


The next step is to solve a semi definite optimization program described below. Most of the complexity is because Q does not support CVXPY and we have to make do with CVXOPT.

# minimize    Tr(MY)
# subject to
#           ||Y|| - kz <= 0
#           Tr(Y) - z   = 0
#           Tr(BY)      = 1
#           Y          >= 0 - positive semidefinite
#
# where
#           Y =    X             z =     1
#               --------             ----------
#                Tr(BX)                Tr(BX)
#
#           M = A'B A
#           A = lasso regression estimator
#           B = covariance matrix
#           X = xx'    - x are the weights
#           k = max no. of stocks in mean reverting portfolio
#


Important points:

1. Since X = xx' we require N square variables for N stocks in universe. This quickly explodes. For example if universe has 30 stocks then optimization requires 900 variables and Q dies.

2. Most times mean reversion quickly collapses out of sample.

3. I used adfuller to check for stationarity which implies mean reversion

4. Sometimes the weights are heavily skewed. What I mean is that if there are 10 stocks are we are investing 100K, then if a single stock has an exposure of 50K it is uneven. But that is easy to solve by introducing a new constraint on each stock's exposure.

5. You can use half life to measure the time it takes to mean revert. Please see fit_OU method.

6. There is also another paper which introduces variance in portfolio. That can be clubbed with this strategy by introducing a new constraint.

By the way, since mean reversion breaks out of sample, I am actually trying out a strategy that bets that mean reversion is going to break. That means if a stationary portfolio is 2 standard deviations about mean value then we assume that it will break out even more. This seems to work as well. I will post results shortly.

I think the algorithm in paper works well to identify stationary portfolios but I am not experienced to trade it. This is where I am failing.

Also it does not make sense to look for mean reversion in a portfolio of random stocks. A friend of mine who works in this industry suggested that I use clustering techniques to group similar stocks and then look for mean reversion amongst similar stocks. For example look for stocks that move in same direction with oil prices and build a mean reverting portfolio of these stocks.

Yes, brute force searching for mean reverting portfolios of random stocks is very time-consuming. On Quantopian, it's also really important to only use these techniques in really liquid assets with narrow spreads, otherwise you'll find phantom mean reversion due to bid-ask bounce in the trade data.

On that note, here is a clustering algorithm demo in python if it wasn't shared before. It uses affinity propogation (haven't understood it yet) on correlations to group stocks into clusters.

Thanks Satyapravin,

At least now when I refer back to the paper, maybe I can focus on the right things.

Regarding the clustering, I've attached an example that was posted on Quantopian some time ago. Perhaps you'll find it useful.

Grant

89
Backtest from to with initial capital
Total Returns
--
Alpha
--
Beta
--
Sharpe
--
Sortino
--
Max Drawdown
--
Benchmark Returns
--
Volatility
--
 Returns 1 Month 3 Month 6 Month 12 Month
 Alpha 1 Month 3 Month 6 Month 12 Month
 Beta 1 Month 3 Month 6 Month 12 Month
 Sharpe 1 Month 3 Month 6 Month 12 Month
 Sortino 1 Month 3 Month 6 Month 12 Month
 Volatility 1 Month 3 Month 6 Month 12 Month
 Max Drawdown 1 Month 3 Month 6 Month 12 Month
# Backtest ID: 55416fc8d8811d0d24656440
There was a runtime error.

@Satyapravin B., You appear to be a pretty smart guy. And I wish you all the best if you decide to participate in the Q's Open. But your code is so obscure as to be impenetrable to my tiny brain. I've reviewed a number of your submissions and frankly have to look away else my mind get sucked into some quantitative induced vortex from which I'll never escape. Sluuuuuurp -- there went my half a billion neurons. But I'm drawn to your investigations like a moth to a flame. Singed every time however, I wonder if there's not some salve to ease my pain. For example:

def getGs(n):
G = np.zeros((n**2,2*n**2+1))
row = 0
for i in range(0,n):
for j in range(0,n):
g = np.zeros((n,n))
if i == j:
g[i,j] = -1.0
else:
g[i,j] = -0.5
g[j,i] = -0.5
g = np.ravel(g).tolist()
g = g + ([0] *(n**2+1))
G[row,:] = g
row += 1
return [matrix(G)]



My head has started to throb just seeing this code from the periphery of my eyesight.

Is it at all possible to implore you to elaborate your code in such a way as to render it intelligible to the quantitatively challenged -- such as myself?

Hi Market Tech,

Apologies. I am trying hard to enter Q open but haven't managed to find a consistently profitable strategy. If things work in 2013-15 they fail miserably in 2006-2008 or something like that. I was in a rush to get something going and hence the lack of comments in my code.

Unfortunately CVXOPT is very complex. All of the complexity will reduce once Q supports CVXPY which I have requested. I will post a cleaner version as soon as its done.

Also I am not a python programmer which you would have known by now looking at my code. I program in C++ mostly and am learning python just for Q.

always great to be able to learn from someone like you Satyapravin, but I must say that it takes me a long time to read the lovely condensed code, lack of brain power like MarketTech I guess

@ Satyapravin

Regarding the contest, keep in mind that the only time periods that matter are the 2-year backtest, 1-month paper trading, and 6-month real-money trading. Personally, I didn't even bother backtesting beyond the 2-year period, and I got very lucky during the paper trading period, and so far I've been unlucky trying to profit off of Q's capital (which, by the way, I knew could happen--the contest is all about reward to the winners with all of the risk burdened by Q).

@ Market Tech,

As a professional programmer (right?), I think I hear you volunteering to assist Mr. Bezwada in writing elegant, computationally efficient code that could be read by a 5 year old. Regarding the example you provided, getGs(n), it appears to be a matrix of constants needed by the secret sauce optimizer, CVXOPT's sdp. If you really want to dispense with some neurons, have a look at http://cvxopt.org/userguide/coneprog.html?highlight=sdp#semidefinite-programming (or you could buy a nice bottle of Scotch instead). This is actually a great case for Q to consider in their quest for domination of the hedge fund space. It is my belief (which I've expressed to them) that they are missing a key element, which is that hard problems get solved by diverse multi-disciplinary teams, not by "geniuses" working in isolation. The person who can understand and explain CVXOPT's sdp probably won't be the same person who can implement it best in code. And you may need an experienced, creative, intuitive type to steer the whole effort in the right direction, but who would never be able to understand what's going on "under the hood." And someone who understands the nitty-gritty mechanics of trading, market microstructure, etc. And so on. I'm guessing that the best hedge funds have such teams working elbow-to-elbow.

Hmm, hard to choose, I do rather like scotch...

Collaboration is always an option that the Q has curiously provided through its web interface. I've only ever used it once.

I would mention, as I am wont to do, that the trading headspace is an odd duck. On the media extolled side we have "proprietary algos potentially worth millions of dollars" -- so why would you ever think of sharing them (collaborating)? On the other side, the reality side, we have, as you've explained Grant, single disciplined individuals who would benefit greatly by teaming up with others who could then, as a group, build much better systems and strategies. I've run into this dichotomy time and time again.

"I can't share my secret trading technique."
"Yeah, but you can't code it yourself so you'll have to share it to build it."
"But, but, someone will steal it!"
"Right, so keep it to yourself then so that it will never actually make anyone money."
"But, but..."

This philosophy may not be as prevalent here, it is after all a forum of algo writers. But I would point out how few of the leaderboard contestants actually participate here. So maybe it's more systemic than one might think. It's an intriguing situation.

But I would point out how few of the leaderboard contestants actually participate here. So maybe it's more systemic than one might think. It's an intriguing situation.

True. The contest did change things a bit.

But I would point out how few of the leaderboard contestants actually participate here. So maybe it's more systemic than one might think. It's an intriguing situation.

True. The contest did change things a bit.

Is there a more-or-less canned, tried-and-true brute force method for scoring random baskets of securities (from a limited universe) for their suitability to collectively mean revert? If so, it would be straightforward to write some code in Q research, just to see how it plays out. There is no parallel processing, and there are unknown memory limits, but it is my understanding that a notebook could run indefinitely just churning over data.

I suppose one could do the clustering first, followed by testing each cluster. Are there other divide-and-conquer techniques?

Any thoughts? Pointers?

This wanders some distance off Satya's path, but that's pretty much what the Directed Acyclical Graph DAG was intended to do, identify a group of securities that were highly likely to mean revert. Alternatively one could perform a simple "arrhg!, who's with me?" kind of selection by picking an anchor security and then locating every other security that was either correlated or cointegrated with it. That way you'd limit your search. Pick 10 or 20 consistent anchors and then run dickyfuller or leastsqrs on the universe against each one; picking the top N as your group. Could even be automated perhaps and included in a strat.

Last year, in MATLAB, I tested all baskets of 2, 3 and 4 ETFs out of a universe of ~200 ETFs I think on two years of daily trade data, using Johansen's method for cointegration, several different stationarity tests, and a quick backtest as a final filter. It ran on my Macbook Pro for about four months. So far, none of them have really held up to intraday backtesting with quote data, leading me to believe that most of the so-called mean-reversion was just bid-ask bounce noise.

Why does "bid-ask bounce noise" lead to false positives? And is there a way to filter it out?

I deal with it by back-testing these things using quote data, and making sure that my backtest algo buys the basket at the asks and sells at the bids. I am not sure how one could get rid of it when using trade data, you just have to restrict your universe to those stocks with very narrow spreads.

I drew a picture of how it is deceptive non-mean-reversion on this thread: https://www.quantopian.com/posts/inaccurate-max-drawdowns-on-no-trade-days-slash-periods

Well, I'll have to think about why the bid-ask spread would come into play. I guess you are saying that price swings would need to be outside the bid-ask range to be useful, but if you don't know the bid-ask, then how would you ever know signal from noise.

Yeah, well imagine you had a perfectly hedged basket of X and Y which was perfectly stationary when long +1 of one and short -1 of the other. Suppose the value of this spread is now, and forever, $5. Generally, it would awesome to buy the spread when it dipped below 5, and sell it when it poked above$5.

Now, suppose that X has a bid-ask spread of $1 and so does Y. That means that the spread of the basket is$2, and so the bid of the basket is $4 and the ask of the basket is$6. Now, through the random noise trading of X and Y, some trades will be at the ask and some trades will be at the bid. If you calculate your basket value from trade data alone, you'll see the value of your spread jumping from $4 to$6 and in between. A correctly optimized algorithm for trading this mean-reverting spread would be to buy it when it dips to $4, and sell it when it trades at$6.

However, this is totally not executable, unless you are market-making both legs, since it costs $6 to buy the spread and it costs$4 to sell the spread. All the motion is really just random noise.

Statistical security trading seems replete with exceptions to the rules of mean reversion trading. I've built and tested so many variations of pairs, triangular arbs and basket strategies that I'm not even sure why I bother anymore. For the record NONE of the clients we ever had traded securities in such a fashion. They either day-traded equities using highly proprietary strats, traded options using broader strats or, most often, didn't trade equities at all. 90% of our clients traded futures. And there they had much better success using mean reversion in pairs and small groups. Grain complex, metals, energies, all with multiple expirations have much more predictable behaviors.

if you don't know the bid-ask, then how would you ever know signal from noise.

Indeed!

I believe the issue lies in the fact that there is so little information in the price values securities. On the surface it seems tractable, analyzing historic OHLCV price to project the path of said price over the next few minutes or days. But the lack of true predictive information tells another tale. This along with the mechanics, as Simon points out, belie the truth of profitable executions.

How much information is really in a WIO (weighted index oscillator) that has come out of its tight standard deviation channel and jumped up a deviation or two? Oh, it's bound to revert, stock A will fall back down or stock B will lift to match its brother; so sell A and buy B. But then the relationship changes. WIO is now at a new level. The information we had was incomplete. Stock A signed a big contract or stock B got hit by an EPA lawsuit. Price just doesn't contain enough information to reliably, profitably trade.

That's where all these social data feeds and insider trading stats (the legal kind) and earnings and fundy and blah blah blah type information sources try to overlay price with something, anything that can give credence to the reason why stock A jumped while stock B fell. But all these extra information sources are so damn hard to leverage. I bet the Chicago Traders Convention will have a ton of clever characters talking about their strikingly new data sources.

@ Market Tech,

For a small number of securities, the risk will always be there. As an institution, it seems you are gonna want something like a portfolio of at least 100 securities, and a minimum of $100M in capital. Then, you can have$1M in each security (on average) to keep trading costs down, and all of the special events should average out. Of course, this assumes that one could find 100 securities for which there is enough inefficiency for there to be an overall arbitrage, right? Sounds kinda hard.

@ Grant,

Once I get access to research platform, I will try to run it locally on my PC to see if CVXOPT can work with 100 stocks. That should provide some insights.

Found this interesting paper on how to model mean reverting spreads. Since above algorithm already finds a stationary portfolio the next step would be to trade the spread. This paper describes an algorithm to model these spreads so that they can be traded. Still above me for the moment but hopefully in time I will be able to understand and verify it.

@ Simon,

Regarding this business of the trade prices bouncing randomly between the bid and ask, shouldn't this be evident in a histogram of prices? Wouldn't one see a bi-modal distribution, at least over short time scales? And for OHLCV minute bars, wouldn't the H & L values be be representative of the ask & bid prices, respectively? Or are you thinking more in the limit of thinly traded securities where perhaps bid-ask becomes kinda murky? I did some quick skimming, and I gather that if I, guy-off-the-street retail trader goes to buy, I'll pay the ask, and when I sell, I'll sell at the bid. So, I'm supposing that if I looked at something like SPY in the research platform, I'd see trades distributed around two values (e.g. something like two superimposed Gaussian distributions). Or no?

Not really, since the mid-point of the bid-ask spread moves throughout the day. It contributes to the appearance of mean-reversion on short timescales, and I suppose it affects the kurtosis of returns, but it won't be bi-modal afaik. The bid-ask is never really murky, it's always there, no matter how thinly traded the securities are.

Perhaps I'll make a research notebook that has a model...

Yeah, if the midpoint is noisy or drifts quickly, then the bid-ask effect in the reported trades will be all smeared out, I suppose. But if you factor out the drift then shouldn't you see a bi-modal trade price distribution for large enough spreads? Or maybe the spreads are always buried in the noise?

Plot a couple of WIOs and see if you think you can trade them. In the end what you end up trading is a single composite price-like-thing. You can fairly easily see these in TradingView... For instance, here's a few simple WIOs

You get the picture.

Now, those are daily prices of course. And most likely one would try to pairs/basket trade on a much tighter periodicity. But then you get into Simon's case of what's a spread and what's a bid/ask wiggle? So you'd have to back out your trading analysis time period to one where that wiggle gets swamped by the actual drift in the spread. And then what you've got to worry about is your WIO going directional. And ultimately that's what kills every pairs strat. The basket strats, as Grant points out, are where I've tended to concentrate my arb strats. But still, they have their directional tendencies too.

If you look at futures calendar spreads they seem to be more easily traded in this way:

(In a basket arb strategy one ends up realizing that the instruments you trade are effectively (n * n-1) / 2 WIOs (pairs strategies). 10 legs gives you 45 pairs that you wind up trading. Now, with that thought in mind you get the added information of the composite WIO, either a blend of 10 individual technical variations, or a grander more complex bundle of 45 WIOs. Trading 45 WIOs, cherry picking the widest aberrations among them, may be the best bet.)

Grant: I stand corrected, in my simulations, if the spread is much wider than the drift, there are detectable traces in the distribution of returns, but only if the spread size is constant. If the spread size is stochastic too, then it's completely smeared away...

@ Simon,

That's the picture in my mind. If the spread dominates, then it'll be detectable. But if it "breathes" versus time, then the cumulative stats. will be smeared. Instantaneously, though, the distribution will be bi-modal.

@ Simon,

Well, there you go. If you can detect a signature of the bid-ask spread in the trade data, then perhaps it's enough information to help in a trading strategy. And if you can't detect it, then the strategy could just stay out of securities with unknown estimated bid-ask until the spread can be detected above a level of statistical significance. Another thought is that over short enough time scales, perhaps the drift in the mid-point (average of bid-ask prices) could be modeled as a low-order polynomial (linear or possibly quadratic), over N minutes intra-day.

Regarding thinly traded securities, I don't understand why, at an instant of time, there would always be both bid and ask prices. Say I want to sell 100 shares of PIG (Pork Industries Gigante), my massive Midwestern farming enterprise. If nobody is interested in PIG, then when I put the lot out for sale, there will be only an ask price, right? Or does a bid price always magically appear since there will always be some algorithm following PIG that will automatically pick a bid of \$0.

Generally, because designated market makers are required to make two-sided markets. I don't know if there are any stocks without designated market makers or specialists, it's been a while since I was involved in that stuff, and I think some rules changed after the flash crash.

Just a FYI: I tried something similar some time ago, see here: https://github.com/harpone/PyArb

The holding times I used were very short and profits were pretty much killed by transaction costs... I think it's quite a bit simpler to implement than the other model in regards to estimation etc. Also IMO the model is way too simple, and in fact it might be smarter to model the returns instead of (log) prices (there are additional subtleties in estimating the price time series process).

Never seemed to have enough time to try that out in Quantopian, but if anyone gets something out of it, let me know :)

Hi Pravin,
I was wondering if you tried to implement this nice algo with CVXPY to scale with bigger stock universe ?

Hi Mat,

I wasted several months on this algorithm only to realize that it relies on covariance matrix in level data (non-stationary). I have since then discarded this approach to look for alternative techniques. Will port it to CVXPY if time permits.

Best regards,
Pravin

@Mat, if you read page 3 of this paper, you will see that it assumes that asset prices follow a stationary VAR process which is not at all reflective of true nature of stock prices. It could work on something like implied volatility but Quantopian does not have option data.

Thanks for your answer Pravin, maybe we can add some regime switching model to this algorithm like HMM or other based on volatility to detect changes in series behavior ?
What is your last research for finding stationary asset prices ?
Regards,

@Matt, If could find a way to compute stationary covariance matrices from non-stationary prices, it will be revolutionary. My current research is much simpler based on return series which are at least weakly stationary.