Trading baskets co-integrated with SPY

This is my first attempt at a co-integration strategy. It has only been run on 2014 data and is still a work in progress. Could someone please help with the following issues:

  1. This is a daily trading strategy, but I had to run it in minute mode: when the algo finds a profitable spread it should capture it immediately rather than wait for the next day's close, by which time the spread could have changed. The solution I found was to run in minute mode but use schedule_function so that the algo runs only once per day. Is there an alternative?

  2. The implementation of Johansen's test that I borrowed from the internet fails very often because the matrices it needs to invert are not invertible. I understand that regularization can turn a non-invertible matrix into a "useful" invertible one. Please advise if you know of any such methods. Because of this limitation, there are long periods where the algo doesn't trade at all because Johansen fails.

  3. Ideally I would like to run Johansen's test on all combinations of stocks to find a nicely cointegrated basket. But handle_data times out. Is there a better alternative?
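On item 2, one common form of regularization is to add a small multiple of the identity matrix (a "ridge") before inverting. The sketch below only illustrates that idea with NumPy. The helper name ridge_inverse and the default ridge size are made up for this example, and nothing here checks whether the shifted matrices leave the Johansen statistics statistically valid, so treat it as a numerical workaround to experiment with rather than a fix.

```python
import numpy as np

def ridge_inverse(matrix, ridge=1e-8):
    """Invert a symmetric matrix after adding a small multiple of the identity.

    Hypothetical helper: the name and the default ridge size are assumptions.
    """
    matrix = np.asarray(matrix, dtype=float)
    n = matrix.shape[0]
    # Scale the ridge by the average diagonal entry so it is relative
    # to the magnitude of the data rather than an absolute number.
    scale = np.trace(matrix) / n
    if scale <= 0:
        scale = 1.0
    return np.linalg.inv(matrix + ridge * scale * np.eye(n))

# Two perfectly collinear series produce a singular moment matrix,
# which plain np.linalg.inv cannot invert.
x = np.arange(1.0, 6.0)
data = np.column_stack([x, 2.0 * x])
moment = data.T.dot(data) / len(x)

inv = ridge_inverse(moment)
```

A larger ridge makes the inverse more stable but moves it further from the original problem, so the value would need to be tuned against the data.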

8 responses

I think I have stumbled on the same Johansen test implementation that you used here. I believe there might be an issue with it somewhere, because it never made it into the statsmodels repo. I'm not sure whether this is the same implementation, but you might want to check this one out.

When the Johansen test fails, you could just try again on the next bar until it succeeds, but you might not be able to use schedule_function if you do it that way. I wish I could help out more, but the math gets pretty intense in this one, and I'm not sure I have the background to really critique the implementation.
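A rough sketch of the "try again on the next bar" idea, assuming the failure surfaces as numpy.linalg.LinAlgError (the borrowed implementation may raise something else). The names run_with_fallback and toy_test are hypothetical, and toy_test is only a stand-in for the real Johansen routine.

```python
import numpy as np

def run_with_fallback(test_fn, prices, state):
    """Call test_fn(prices); on a singular-matrix error keep the last good result.

    state is a plain dict here; in an algo it could live on context.
    """
    try:
        state["result"] = test_fn(prices)
        state["failed_bars"] = 0
    except np.linalg.LinAlgError:
        # Count consecutive failures so the caller can decide when the
        # stored result is too stale to keep trading on.
        state["failed_bars"] = state.get("failed_bars", 0) + 1
    return state.get("result")

def toy_test(prices):
    # Stand-in for the Johansen routine: inverts a moment matrix and
    # raises LinAlgError when that matrix is singular.
    return np.linalg.inv(np.dot(prices.T, prices))

state = {}
first = run_with_fallback(toy_test, np.eye(2), state)           # succeeds
second = run_with_fallback(toy_test, np.zeros((3, 2)), state)   # fails, reuses first
```

Calling this from handle_data on every bar would retry automatically, at the cost of doing the work more often than a once-a-day scheduled function.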


Thanks for the reference David. I will try this implementation and see how it works.

Hi Pavy,
I found this fairly recent Stack Exchange thread that talks about using Johansen from statsmodels.tsa.johansen. It looks like johansen was added to statsmodels within the past year or so. I haven't tried the method in Quantopian yet, but thought I'd share it in case it's useful.


Hello Pavy,

Regarding your #3 above, the problem with handle_data timing out: you could consider spreading your computations over multiple calls to handle_data. If I understand correctly, you want to compare each security in a set of N securities to SPY. So could you compare just M of the securities per call to handle_data (M < N)? For example, do 10 comparisons per call, and in 10 calls you'll have cycled through 100 securities. Another approach would be to use a timer and do as many comparisons as you can within the time allotted to a call to handle_data (~50 seconds).
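Both ideas can be sketched in one small helper, written here in plain Python so it runs outside the platform. The names process_batch, compare, and the state dict are hypothetical. In an algo the state would presumably be kept on context and the function called once per handle_data.

```python
import time

def process_batch(symbols, state, compare, batch_size=10, budget_seconds=None):
    """Run compare(symbol) on the next slice of symbols, resuming from state['cursor'].

    Stops after batch_size symbols or, if budget_seconds is given, once that
    much wall-clock time has elapsed. Returns True when all symbols are done.
    """
    start = time.time()
    cursor = state.get("cursor", 0)
    results = state.setdefault("results", {})
    done = 0
    while cursor < len(symbols) and done < batch_size:
        if budget_seconds is not None and time.time() - start >= budget_seconds:
            break  # out of time for this call; resume on the next one
        symbol = symbols[cursor]
        results[symbol] = compare(symbol)
        cursor += 1
        done += 1
    state["cursor"] = cursor
    return cursor >= len(symbols)

# Toy run: 25 symbols, 10 per call, so three calls cover the whole list.
symbols = ["S%d" % i for i in range(25)]
state = {}
finished = [process_batch(symbols, state, compare=len) for _ in range(3)]
```

Once the helper returns True, the cursor can be reset to 0 to start the next pass. A budget well below the ~50 second limit leaves room for the rest of handle_data.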

Ideally, it should be possible to properly handle the time-out error, but I think this is a fatal error that can't be managed, and your backtest/algo will crash. But maybe someone has a clever way of making it work.


Great idea Grant. I will try the multiple calls to handle_data. Many thanks.

For item 2, does the error happen when computing the pseudo-inverse?
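For context on why the question matters: a plain inverse fails on a singular matrix, while the Moore-Penrose pseudo-inverse is still defined. A minimal NumPy sketch of the difference, using a made-up 2x2 matrix rather than anything from the actual algorithm:

```python
import numpy as np

# A rank-1 matrix: the second row is twice the first, so it has no inverse.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])

try:
    np.linalg.inv(singular)
    inv_failed = False
except np.linalg.LinAlgError:
    inv_failed = True

# The pseudo-inverse is computed via SVD and does not require full rank.
pseudo = np.linalg.pinv(singular)

# Defining property of the pseudo-inverse: A * A+ * A reproduces A.
reconstructed = singular.dot(pseudo).dot(singular)
```

If the borrowed implementation already uses pinv and still fails, the problem is likely somewhere other than the inversion step.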

Hi Pavy, I'm not sure whether this is a bug in your code, but coint_johansen is called as
jres = coint_johansen(y, 0, p)
while the function is defined as coint_johansen(y, p, k).
However, if y is detrended with order 0 and k = 1, the code works without problems.
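One way to rule out this kind of mix-up is to pass the arguments by keyword. The stub below is hypothetical: it only mirrors the (y, p, k) parameter order quoted above, and reading p as the detrend order and k as the number of lags is inferred from this comment, not checked against the real implementation.

```python
def coint_johansen_stub(y, p, k):
    """Hypothetical stand-in that just reports how its arguments were bound."""
    return {"detrend_order": p, "lags": k}

lags = 1  # named 'p' at the call site in the algorithm, which invites confusion

# Positional call, as in the algorithm: correct only if the order is right.
positional = coint_johansen_stub([0.0], 0, lags)

# Keyword call: the binding is explicit and survives a reordered signature.
by_keyword = coint_johansen_stub([0.0], p=0, k=lags)
```

Both calls bind the same values here, which matches the observation that the original call works when the detrend order is 0 and k is 1.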

Thanks Han Chen. I will take a look at the implementation. Maybe I got the order wrong.