Improved Minimum Variance Portfolio

After a nudge in the right direction from Grant Kiehne, I made a new version of a minimum variance portfolio. It uses a Lagrangian to solve for the weights that minimize the variance. I used VWAP returns for everything, and the portfolio was randomly generated.

I'm not a huge fan of shorts, so I added context.allow_shorts; when it's False, the algo uses non-negative least squares regression to solve the Lagrangian system. I also added the ability to re-invest cash, but I'm thinking there's a cleaner way to do it. Ideas?
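
In case it's useful, here is roughly the idea behind the lagrangize helper. This is just a sketch of the bordered system it builds, not the exact code from the backtest:

import numpy as np

def lagrangize(cov):
    # Sketch: build the bordered (n+1)x(n+1) system for
    #   minimize w' Sigma w   subject to   sum(w) = 1.
    # The first-order conditions stack into
    #   [[2*Sigma, ones], [ones', 0]] [w; lambda] = [0, ..., 0, 1]',
    # and cov is passed in already multiplied by 2.
    cov = np.asarray(cov)
    n = len(cov)
    p = np.zeros((n + 1, n + 1))
    p[:n, :n] = cov   # the 2*Sigma block
    p[:n, n] = 1.0    # constraint gradient column
    p[n, :n] = 1.0    # budget row: weights sum to one
    return p

Solving that system directly gives the shorts-allowed weights; feeding the same matrix to non-negative least squares gives the no-shorts version.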

I'm pretty happy with the results I've seen so far; I'm gonna have to port it over to minute data at some point to start paper trading. Does anybody have any idea how to stop that negative cash dip on the first day it invests? I commented out what I tried (line 51).

Dave

[Attached backtest: ID 52a57b7b3aa1c3076d9c084d]
19 responses

This fixes the crazy oscillations in the positions and cash. It performs a lot better too: over 400% on the same portfolio.

def re_invest_order(sym, weights, context, data):
    # Target share count for sym, sized off the current portfolio value
    # (positions plus any free cash) rather than the starting capital,
    # with a small cash buffer held back.
    P = context.portfolio
    if P.cash > 0:
        new_pos = int(
            (weights[sym] * (P.positions_value + P.cash) / data[sym].price)
            * (1 - context.cash_buffer))
    else:
        new_pos = int(
            (weights[sym] * P.positions_value / data[sym].price)
            * (1 - context.cash_buffer))
    return new_pos
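
And roughly how it gets used on each rebalance. This is a sketch; the context.weights name and the order-the-difference loop are just illustrative:

def rebalance(context, data):
    # Sketch: move each position toward its target share count.
    # context.weights is assumed to hold the {sid: weight} dict
    # produced by min_var_weights.
    for sym in context.stocks:
        target = re_invest_order(sym, context.weights, context, data)
        current = context.portfolio.positions[sym].amount
        order(sym, target - current)  # trade only the difference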

Very interesting result with the non-negative least squares! One thing I noticed is that there seems to be a lot of churn in the portfolio with the NNLS approach; a slower re-calibration period may save on transaction costs/slippage if you included those, especially if you are considering paper trading. I would be interested to see some measure of portfolio turnover comparing the two methods.

Just to note, I adjusted your min_var_weights function a bit where you called lagrangize; it's not really necessary, since the version with short sales that I posted originally has the closed form built into it. If you look at the difference in the calculated weights you can see that it is within machine precision.

import numpy as np
from numpy.linalg import inv
from scipy.optimize import nnls

def min_var_weights(returns, allow_shorts=False):
    '''
    Returns a dictionary of sid:weight pairs.

    allow_shorts=True  --> minimum variance weights returned
    allow_shorts=False --> least squares regression finds non-negative
                           weights that minimize the variance
    '''
    cov = 2 * returns.cov()
    x = np.ones(len(cov) + 1)
    x[-1] = 1.0
    p = lagrangize(cov)
    if allow_shorts:
        # this is exactly the same as the closed form below (weights = ...) with less mess
        weights2 = np.linalg.solve(p, x)[:-1]
    else:
        weights2 = nnls(p, x)[0][:-1]
    # closed-form minimum variance weights: w = (Sigma^-1 1) / (1' Sigma^-1 1)
    precision = np.asmatrix(inv(returns.cov()))
    oned = np.ones((len(returns.columns), 1))
    weights = precision * oned / (oned.T * precision * oned)  # these are nearly equal, check the logs
    weights2 = np.asmatrix(weights2).T
    log.info(weights - weights2)
    return {sym: weights2[i] for i, sym in enumerate(returns)}

Excellent coding style, by the way, much neater than mine. Nice job!

Nice, they are definitely within machine precision; that's good confirmation that they're both working as intended. I separated out the Lagrange function because it shows up in other problems, and I thought the portability might help at some point down the road. Thanks for the style props; I borrowed the enumerates and a couple of other things from you, so you're complimenting yourself too.

I fixed the churn in the portfolio by using the positions value rather than starting cash to re-balance. It smoothed everything out and made the returns a lot higher. I also added a couple of lines to invest evenly on the first day while the number of observations builds up. That way there's no lag getting into the market, and it fixed the initial negative cash dip I was getting before. This test has those changes with the same portfolio and settings.
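
The couple of lines for the even first-day allocation look roughly like this at the top of handle_data; it's a sketch, with context.first_day and context.cash_buffer as illustrative names:

if context.first_day:
    # Sketch: split starting cash evenly while return observations accumulate
    even_weight = 1.0 / len(context.stocks)
    cash = context.portfolio.cash
    for sym in context.stocks:
        shares = int(even_weight * cash * (1 - context.cash_buffer)
                     / data[sym].price)
        order(sym, shares)
    context.first_day = False
    return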

Something worth noting is that I ran a test where everything went horribly wrong with the non-negative approach. It was caused by the algo going .999999 into a single security and 1e-15 ish in some others. I'm thinking that a ceiling needs to be put on the weight that can be given to any one security, and any weights within machine precision of 0 need to be set to 0. I would be a poor man due to an aberration if that happened in a live situation.
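
Something along these lines is what I have in mind; just a sketch, and the cap value, tolerance, and clean_weights name are made up:

import numpy as np

def clean_weights(weights, max_weight=0.25, tol=1e-9):
    # Sketch: zero out weights within numerical noise of 0, cap any single
    # weight, and renormalize so the portfolio stays fully invested.
    # weights is a {sid: weight} dict like min_var_weights returns.
    syms = list(weights)
    w = np.array([float(weights[s]) for s in syms])
    w[np.abs(w) < tol] = 0.0        # drop machine-precision dust
    w = np.minimum(w, max_weight)   # ceiling on any one security
    w = w / w.sum()                 # renormalize
    return {s: w[i] for i, s in enumerate(syms)}

Renormalizing can push a weight back over the cap, so a real version would need to iterate or solve the constrained problem directly, but that's the idea.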

[Attached backtest: ID 52a6044f6508bd07620f3a20]

How was the list of stocks chosen for this portfolio? Is it a random list? Has anyone tried other sets of symbols?
Is there a process for picking a stock list for this algorithm if I wanted to try other portfolios?

Sarvi

Ya, it's a random list; I'm not sure what they are. You can just replace the context.stocks list with whatever securities you want. Just make sure they were traded throughout the time period you're testing. There's no method to it, just change that one list.
Also, I think there's a mistake in this version.

# in min_var_weights
x = np.ones(len(cov) + 1)

# Should be
x = np.array([0.] * (len(cov) + 1))  # then x[-1] = 1.0 as before

The vector being solved for should be all zeros except for a single 1 at the end. For some reason this doesn't seem to change the outcome of the solutions, but it's worth mentioning.
I would clone the updated version above; it got rid of the oscillations in the cash and positions.

I want a huge global universe of stocks, i.e. 1000, not just a cute handful of 15 stocks!

Backtests let you use up to 100. Look at DollarVolumeUniverse in the docs; it gives you a sample of the market.

Has anyone tried modifying this to run on minute data to replicate the results of daily data?
You can't do paper trading until the algo can handle minute data. The simple tip I got about converting is to add this to the beginning of handle_data, so the logic only runs once a day:

    exchange_time = pd.Timestamp(get_datetime()).tz_convert('US/Eastern')
    log.info('{hour}:{minute}:'.format(hour=exchange_time.hour, minute=exchange_time.minute))
    if exchange_time.hour != 10 or exchange_time.minute != 0:
        return

But that doesn't reproduce the results of daily data, though.

Sarvi

I have a version that runs on minute data; however, the backtester tells me the sids I used with daily data are not available in minute mode. I'll run a full test on minute data, then replicate it with a daily test afterwards to avoid this. I'll post the results later.

Marcus, I have warned about this here before, but I'll repeat myself. Let's assume you even wanted to use just 43 stocks:

One pitfall of this Markowitz type of analysis is the curse of dimensionality. You have 43 stocks, which means that you are estimating a covariance matrix containing 43*42/2 + 43 = 946 parameters using just 252 observations. This is complete statistical nonsense; the parameters are not even uniquely identified. You are going to want to increase the length of your observation window to tighten the standard errors on the covariance estimates. 90,000 days would probably be sufficient for nice, tight estimation with 43 assets. Obviously that size of estimation window is unreasonable, which is why the next step would be to implement dimension reduction techniques. An exogenous factor model could work nicely; the principal components approach is another methodology, or the factors-on-demand methodology of Meucci. Asymptotic principal components from Connor and Korajczyk (1986) could be an excellent solution with a very large number of assets.
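
For anyone who wants to try the principal components route, a minimal sketch of the idea follows; the choice of k and the pca_covariance name are arbitrary illustrations, not a recommendation:

import numpy as np

def pca_covariance(returns, k=3):
    # Sketch: keep the top-k principal components of the sample covariance
    # as the systematic part and treat the leftover variance as diagonal
    # idiosyncratic noise. returns is a T x N array of asset returns, k << N.
    R = np.asarray(returns)
    sample_cov = np.cov(R, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(sample_cov)   # eigenvalues ascending
    top = np.argsort(eigvals)[::-1][:k]             # indices of the k largest
    V = eigvecs[:, top]
    factor_cov = V.dot(np.diag(eigvals[top])).dot(V.T)
    resid = np.diag(np.diag(sample_cov - factor_cov))
    return factor_cov + resid

An estimate like this could be swapped in wherever returns.cov() is used above, though choosing k sensibly is its own problem.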

Wayne, I forgot that you wanted to see a comparison of the negative vs. non-negative results. I have found that the non-negative approach is a little bit more consistent, but neither one is conclusively better. For example, this backtest does 473% while allowing shorts and about 153% without, but the test above does 400% with the non-negative approach and about 180% when allowing negative weights.

The code for this test works with minute and daily data, and the shorting can be toggled as well. I made another version that solves the problem on page 12 of the paper in your comment above; it is more erratic, though, especially with negative weights allowed, and it goes all in on one security without them.

[Attached backtest: ID 52b08f3c4699f2074840372f]

@ Wayne Nilsen

"One pitfall of this Markowitz type of analysis is the curse of dimensionality. You have 43 stocks which means that you are estimating the covariance matrix containing 43*42/2+43 = 9,073 parameters utilizing just 252 observations. This is complete statistical nonsense, the parameters are not even uniquely identified. You are going to want to increase the length of your observation window to tighten the standard errors on the covariance estimates. 90,000 days would probably be sufficient for nice tight estimation with 43 assets. Obviously that size of estimation window is unreasonable which is why the next step would be to implement dimension reduction techniques. An exogenous factor model could work nicely, the principal components approach is another methodology or, the factors on demand methodology of Meucci. Asymptotic principal components from Connor and Korajczyk 1986 could be an excellent solution with a very large number of assets. "

Interesting. I have to read it again to understand it :-) Could you please have a look at my paper:

Davidsson, M. (2013), "The Use of Least Squares in the Optimization of Investment Portfolios," International Journal of Management, Vol. 30, No. 10, pp. 310–321.
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2366298

Let me know what you think?!

I agree with Wayne: the size of the covariance matrix is O(n^2), while the number of observations is O(n). However, I don't agree that the dimension of the covariance matrix is exactly n*(n-1)/2 + n; there are hidden relations among the parameters. So with too many equities in the portfolio, we need far more observations for the estimates to be statistically meaningful.

On the other hand, in a time series model, too many observations means using outdated data for the prediction, which does not make sense either. So we cannot make the number of observations arbitrarily large.

The final conclusion: we need to restrain the number of equities in the portfolio, or this Markowitz type of analysis runs into the curse of dimensionality.

Also, I rewrote part of the code to fix the negative cash dip on the first day.

And I used the SPDR sector ETFs as the portfolio picks, since they have low variance, which fits the trading idea. I was going to use Quantopian's set_universe feature, but the only implementation is DollarVolumeUniverse, which picks stocks by liquidity. If we pick high-liquidity equities and all of them have high volatility, that works against the strategy idea of minimizing variance. Second, some of these liquid stocks are recent IPOs, such as EEM and QID, and some are heading toward default, such as MER and OIH; they strongly influence the portfolio picks and the rebalancing.
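
For reference, the sector list is roughly the nine Select Sector SPDR ETFs, something like this; I'm writing it with the symbols() helper for readability, and the exact tickers in the backtest may differ:

def initialize(context):
    # Illustrative list: the nine original Select Sector SPDR ETFs
    context.stocks = symbols('XLB', 'XLE', 'XLF', 'XLI', 'XLK',
                             'XLP', 'XLU', 'XLV', 'XLY')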

[Attached backtest: ID 52b1265e8590290781328d37]

Last word: I don't believe this trading strategy can print money, since it is very conservative, but it should beat the market.

The original poster gets such good results only because the picks are lucky. If I go long these 20 equities evenly on the first day and do nothing, my return is still 251%. :)

[Attached backtest: ID 52b25ff5293c16274c6fde4a]

I agree that this is a long strategy. The game would be to put promising securities in there and just leave it. It has flaws but is a good start towards something better.

There are new versions of this in this thread. One uses minute data with history(), and the other works like this one but can run on minute or daily data. They also don't have the churn in the cash and positions. There were a couple of mistakes in this version; see my earlier comments for the fixes.

I think there is a lot of promise in combining this with something like Frank Grossman's process of selecting a portfolio on the basis of relative strength and volatility. See http://02f27c6.netsolhost.com/papers/darwin-adaptive-asset-allocation.pdf for some qualitative ideas.

I have played around with this strategy and found that when I include assets that are uncorrelated or negatively correlated, the weighting algo over-allocates to those assets. I adjusted the max weight, but it only seems to register a log output. Any suggestions on how to restrict the over-allocation?