Back to Community
The efficient frontier: Markowitz portfolio optimization in Python using cvxopt

In this blog post you will learn about the basic idea behind Markowitz portfolio optimization as well as how to do it in Python. We will then show how you can create a simple backtest that rebalances its portfolio in a Markowitz-optimal way. We hope you enjoy it and get a little more enlightened in the process.

To view the full blog post, see here.

While cvxopt is available on the research platform, we're still in the process of adding it to the Quantopian backtester.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

68 responses

Nice, I guess it is better known as the Efficient Frontier, right?
How can I clone it?

Xavier: Good point, I changed the title to include that key word.

Are you signed up for the research alpha?

Nice Thomas. Just curious, does finding the efficient frontier require an optimizer, or is there a closed-form solution? --Grant

If I remember correctly, there are closed-form (Lagrange, critical line?) solutions if one only has equality constraints. If there are inequality constraints, in general, I believe you need a numerical optimizer.

Correct Simon. The unconstrained version that allows long and short positions can be solved easily by using Lagrange multipliers. The constrained version can also be solved by Lagrange multipliers, but the evaluation is non-trivial due to the inequality constraints.
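For the record, the unconstrained long/short minimum-variance case has the well-known Lagrange closed form w = S⁻¹1 / (1ᵀS⁻¹1). A minimal numeric sketch (the covariance matrix is made up for illustration, not from the notebook):

```python
import numpy as np

def min_var_unconstrained(cov):
    """Closed-form (Lagrange) minimum-variance weights: w = S^-1 1 / (1' S^-1 1).
    Weights sum to 1 but may be negative, i.e. shorting is allowed."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Made-up 3-asset covariance matrix, for illustration only
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
w = min_var_unconstrained(cov)
```

With a non-negativity constraint added, no such closed form exists in general, which is where the numerical solver comes in.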

So which problem did Thomas solve, the unconstrained or the constrained one? If the former, it would be interesting to compare CVXOPT performance to the closed-form solution, particularly in the context of backtesting/live trading, where only 50 seconds are allocated for computation (due to the handle_data time-out). --Grant

Correct, this is the constrained problem (all weights >= 0 and summing to 1). David has a version of the critical line algorithm here: https://www.quantopian.com/posts/critical-line-algorithm-for-portfolio-optimization

As you can see, David's code is quite hairy (not his fault but that of the algorithm) so I very much prefer to use the optimizer for this and solve the problem directly. Since it's a convex problem this is very fast and accurate but of course it depends on universe size.

Hi Thomas,

Poking around, I came across this:

http://www.researchgate.net/profile/Ming-Chang_Lee2/publication/264547651_Capital_market_line_based_on_efficient_frontier_of_portfolio_with_borrowing_and_lending_rate/links/53e493160cf25d674e94daa8.pdf

Skimming over it, it appears to solve the same problem. In Section 4.1, the authors state:

Portfolio optimization involves a mathematical procedure called quadratic programming problem (OPP). There considered two objectives: to maximize return and minimize risk. The OPP can be solved using constrained optimization techniques involving calculus or by computational algorithm applicable to non-linear programming problem. This paper use matrix operation, it includes matrix inverse, matrix multiplication, and matrix transpose.

So, I'm wondering if there isn't a more efficient, scalable approach that would involve a single computational iteration (presumably, CVXOPT is an iterative solver).

Grant

Hi Grant,

That might very well be the case. cvxopt is iterative, but in general these types of problems can be solved with a small number of iterations. In addition, matrix inversion is quite costly too, so to me it's not a given that the analytical approach will be much faster and scale that much better. There are also some other, commercial, convex solvers that are more optimized than cvxopt, so that's also a way to make this faster.

Thomas

Thanks Thomas,

Good point about matrix inversion being non-trivial. Regarding your comment about the difficulty of using scipy.optimize.minimize, I've been playing around with it, and it's not so bad (although maybe CVXOPT is better-suited for quadratic problems).

In your blog post, if I'm reading it correctly, you are saying that the return versus volatility is parabolic ("...perfectly fit a parabola..."), but isn't it expected to be a hyperbola (e.g. see http://en.wikipedia.org/wiki/Modern_portfolio_theory)?

Grant

Grant,
I'm pretty sure you're correct about the hyperbolic curve of the efficient frontier. I don't recall the justification for using the quadratic in the whitepaper; I believe it was a simplistic approximation to simplify the math. I went ahead and compared quadratic vs. hyperbolic curve fits, and the hyperbolic curve fits the data much better, especially towards the vertex of the frontier curve.

EDIT: Pretty sure the plot in the original notebook was incorrect; the quadratic works well. I updated the notebook.
Take a gander at this notebook

David

FWIW a casual google indicates it's a parabola in variance space, hyperbola when using volatility. In any case, I would expect the parabola to be fit sideways, apex on the left.

Sounds like both are correct depending on the problem. I had always heard parabola, but read it was a hyperbola more recently. I re-ran that notebook making sure all the axes were lined up the same way, and the quadratic fit looks more correct; I must have been doing something odd in that first one.
I forgot how to insert pics, so here's a link

May I note that the general problem with Markowitz optimization isn't the model fitting (that's well known, though code and examples are always good) but estimating the future mean returns and covariance. In this example, you simply use the historical mean return and covariance, which isn't necessarily particularly predictive or reliable. If you're interested in applying this in production, refining those estimates is a good direction to push on.

Hi David,

The paper I referenced above shows a hyperbolic relationship (see eq. 16):

It explains that the portfolio frontier is a hyperbola in
mean-standard deviation space.

Odd that you are getting a better fit with a parabola...doesn't seem correct.

You use this function:

def hyperbola(x, x0, y0, a, b):  
    """  
    formula used:  
        (y - y0)^2 / a^2 - (x - x0)^2 / b^2 = 1  
    """  
    return y0**2 + np.sqrt(a**2 * ((x - x0)**2 / b**2 + 1))  

But, it is incorrect. You should return:

y0 + np.sqrt(a**2 * ((x - x0)**2 / b**2 + 1))  
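With that fix in place, fitting it with scipy.optimize.curve_fit would look something like this (synthetic frontier-like data with made-up parameters; as noted below, the fit is still sensitive to the initial guess p0):

```python
import numpy as np
from scipy.optimize import curve_fit

def hyperbola(x, x0, y0, a, b):
    """Upper branch of (y - y0)^2 / a^2 - (x - x0)^2 / b^2 = 1."""
    return y0 + np.sqrt(a**2 * ((x - x0)**2 / b**2 + 1))

# Synthetic frontier-like data (made-up parameters, illustration only)
x = np.linspace(-1.0, 1.0, 50)
y = hyperbola(x, 0.0, 0.5, 0.2, 0.4)

# Fit; p0 is the initial guess the solver iterates from
popt, _ = curve_fit(hyperbola, x, y, p0=[0.1, 0.4, 0.15, 0.3])
```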

Grant

Thanks Grant, that's pretty bad that I screwed up the algebra. They both fit about the same when the math is correct, but the hyperbola is really sensitive to the initial guess for the curve-fit function. I generally don't think mean-variance optimization is very good for equity portfolio construction anyway; like Alex and others have pointed out, it's the future returns and covariance that really matter. It's still interesting, though.

Hi David,

For the hyperbola fit, you can still fit to a polynomial of the form:

y^2 = a*x^2 + b*x + c

Once you find a, b, and c, then you can solve for y by taking the square root of both sides. Then you won't have to use a solver that requires initial guesses for a, b, & c.
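That is, square the observed risks and fit them with np.polyfit directly; a sketch on synthetic data (made-up coefficients):

```python
import numpy as np

# Synthetic frontier obeying y^2 = a*x^2 + b*x + c (made-up coefficients)
a_true, b_true, c_true = 2.0, -1.0, 1.5
x = np.linspace(0.0, 1.0, 50)
y = np.sqrt(a_true * x**2 + b_true * x + c_true)

# Fit y^2 with an ordinary 2nd-degree polynomial -- no initial guesses needed
a_fit, b_fit, c_fit = np.polyfit(x, y**2, 2)
y_fit = np.sqrt(a_fit * x**2 + b_fit * x + c_fit)
```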

Grant

David, Thomas,
This code is always assigning ~100% to one of the securities and 0 to the rest. Is there a way to obtain an optimal allocation that maximizes the Sharpe ratio using cvxopt?

Karol,

Sounds like adding an inequality constraint would be useful here (e.g. sharpe_ratio > 1.5). It appears that it might be doable, per:

http://cvxopt.org/userguide/modeling.html

Note also that you should be able to change the bounds on the individual security allocations, since (0,1) for each will allow the optimizer to land on a single security as the optimum.

Grant

Karol -- yes that's a good observation and certainly not the behavior one would expect.

I didn't want to go into the intricacies of applying this to real-world data so as to not distract from the theory. There are several issues here:
1. The universe is way too small. If you were to plot the individual stock returns the same way as in the bullet plot, one of them would sit in the upper left corner, appearing to have much higher returns and lower volatility. That's why the optimization puts all its weight behind that one stock.
2. The mean is a terrible, terrible estimator -- especially for stock returns. In practice you'll often find that people heavily regularize the mean towards some common value. In the extreme case you can set the mean vector to all 1s and essentially ignore its contribution. That way you'll get a weighting that only tries to minimize the variance.
3. Covariance estimation is extremely noisy. df.cov() looks pretty harmless and you'd expect to get a clean covariance matrix but unfortunately it's very brittle. That's why you'll also want to apply regularization to the covariance matrix. See here for more details and tools to do that: http://scikit-learn.org/stable/modules/covariance.html

In case it would be useful to someone here are some routines I wrote to estimate the covariance matrix in a more robust way:

import numpy as np
import pandas as pd
import sklearn.covariance

def cov2cor(X):
    D = np.zeros_like(X)
    d = np.sqrt(np.diag(X))
    np.fill_diagonal(D, d)
    DInv = np.linalg.inv(D)
    R = np.dot(np.dot(DInv, X), DInv)
    return R

def cov_robust(X):
    oas = sklearn.covariance.OAS()
    oas.fit(X)
    return oas.covariance_

def corr_robust(X):
    cov = cov_robust(X)
    shrunk_corr = cov2cor(cov)
    return pd.DataFrame(shrunk_corr, index=X.columns, columns=X.columns)

def is_pos_def(x):
    return np.all(np.linalg.eigvals(x) > 0)
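As an aside, cov2cor never needs an explicit matrix inverse: dividing by the outer product of the standard deviations gives the same correlation matrix and is cheaper for larger universes (the helper name below is hypothetical):

```python
import numpy as np

def cov2cor_fast(X):
    """Same result as cov2cor, but without building or inverting D."""
    d = np.sqrt(np.diag(X))
    return X / np.outer(d, d)

# Tiny made-up covariance matrix: variances 4 and 9, covariance 2
cov = np.array([[4.0, 2.0],
                [2.0, 9.0]])
corr = cov2cor_fast(cov)
```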

And here is an updated optimization function that takes a covariance matrix as well as allows you to shrink the means to 1:


def markowitz(returns, cov=None, shrink_means=False):
    n = len(returns)
    returns = np.asmatrix(returns)
    N = 100
    mus = [10**(5.0 * t/N - 1.0) for t in range(N)]

    # Convert to cvxopt matrices
    # minimize: mu * w'Sw - pbar'w
    if cov is None:
        S = opt.matrix(np.cov(returns))
    else:
        S = opt.matrix(cov)
    if shrink_means:
        pbar = opt.matrix(1., (n, 1))
    else:
        pbar = opt.matrix(np.mean(returns, axis=1))

    # Create constraint matrices
    # Gx <= h: every weight is non-negative
    G = -opt.matrix(np.eye(n))   # negative n x n identity matrix
    h = opt.matrix(0.0, (n, 1))
    # Ax = b: weights sum to 1
    A = opt.matrix(1.0, (1, n))
    b = opt.matrix(1.0)

    if not shrink_means:
        # Calculate efficient frontier weights using quadratic programming
        portfolios = [solvers.qp(mu*S, -pbar, G, h, A, b)['x']
                      for mu in mus]
        # Calculate risks and returns along the frontier
        returns = [blas.dot(pbar, x) for x in portfolios]
        risks = [np.sqrt(blas.dot(x, S*x)) for x in portfolios]
        # Fit a 2nd-degree polynomial to the frontier curve
        m1 = np.polyfit(returns, risks, 2)
        x1 = np.sqrt(m1[2] / m1[0])
        # Calculate the optimal portfolio
        wt = solvers.qp(opt.matrix(x1 * S), -pbar, G, h, A, b)['x']
    else:
        wt = solvers.qp(opt.matrix(S), -pbar, G, h, A, b)['x']
    return np.asarray(wt).ravel()

Do you already regret asking? ;). Unfortunately those are the issues one has to deal with when actually applying clean theory to messy real-world data.

The notebook should not be publishable in the forum until everyone has access to it. Or maybe create a forum for users on the beta version only?

Lucas, you can also use the stand-alone version from here: https://github.com/quantopian/research_public

Thanks Thomas,
So let me see if I'm following correctly. I was expecting that by using a small universe the optimizer would actually find it easier to get the optimal solution, not the other way around. Finding the best combination of 4 symbols should be a piece of cake; put in 1000 and then things get complicated.

So, are you saying that because of the noisiness of the real data the optimizer is not able to find an optimal solution? (My definition of an optimal solution is the one that maximizes the Sharpe ratio, not the one with the highest return nor the one with the smallest variance.)

P.S. @Grant, I tried changing the lower bound constraint to -0.1 instead of 0, h = opt.matrix(-0.1, (n, 1)), and that always assigns 10% to each symbol except one, which gets the rest of the portfolio (if using 4 symbols, 3 get 10% and one gets 70%) -- definitely not the optimal solution.

Optimizing for the maximum mean return will always put everything into the one stock with the highest return, since the portfolio return is just a weighted mean (linear in the weights). If you want to optimize for Sharpe, you need to set up the optimization for that. I don't recall whether that is still a convex optimization.

However, if I remember correctly, the maximum Sharpe portfolio is the tangent of frontier and the risk-free rate intercept, so that might be another way to get at it.

Some relevant info. and links to papers here:

http://quant.stackexchange.com/questions/8594/derivation-of-the-tangency-maximum-sharpe-ratio-portfolio-in-markowitz-portfol
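In the unconstrained case, that tangency (maximum-Sharpe) portfolio even has a closed form, w ∝ S⁻¹(mu − rf·1). A minimal sketch with made-up expected returns and covariance:

```python
import numpy as np

def tangency_weights(mu, cov, rf=0.0):
    """Unconstrained maximum-Sharpe (tangency) weights: w ~ S^-1 (mu - rf*1),
    rescaled to sum to 1 (fully invested)."""
    w = np.linalg.solve(cov, np.asarray(mu) - rf)
    return w / w.sum()

mu = np.array([0.08, 0.12, 0.10])    # made-up expected returns
cov = np.array([[0.04, 0.01, 0.00],  # made-up covariance matrix
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
rf = 0.02
w = tangency_weights(mu, cov, rf)
sharpe = (w @ mu - rf) / np.sqrt(w @ cov @ w)
```

Among fully invested portfolios, no other weighting should achieve a higher Sharpe ratio against the same inputs.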

If I understand correctly, the efficient frontier is the minimum standard deviation for a given return, so in effect, the Sharpe ratio is maximized along the frontier, so long as the returns are referenced to a benchmark.

I don't think it makes sense to talk about maximizing the Sharpe ratio without a constraint on the expected return. I could go to the local bank and take out a CD at 1% interest and have an infinite Sharpe ratio, but my return would be disappointing.

Well, I would love to have @Simon's contest algo annual return of just 23.68% but with a superb Sharpe of 4.5 because of its low volatility of 4.743%. That's on a clear path to win this month's contest.

Is there a way to read an external csv from the Research platform? fetch_csv is not there yet and pd.DataFrame.from_csv seems to be disabled.

Hi Karol and Q support,

I think one should be able to drag a file into the notebooks/data area, and it'll be uploaded. However, for the life of me, I can't get it to work. The operation just ends up making a copy of the file on my local drive. I get the same behavior with both chrome and firefox browsers.

Grant

Karol, please keep discussions relevant to the thread.

As to the NB, there's a specific area where you have to drop the file I think, specifically the top line that says "Notebooks" with the "new notebook" button.

Hi everyone,

  • Thank you very much, Thomas, for this nice post. I am learning a lot by playing around with this code.
  • I just wanted to attach the results of the backtest compared to the S&P 500. I thought that it might give a nice way of looking at the results in a bit more detail. I hope that I translated it properly into the backtester, but it is my first try....
  • Also, concerning the question by Grant on the Sharpe ratio: I understand the Sharpe ratio to be basically the slope of a straight line, as it is basically return/volatility. If you now optimize the Sharpe ratio you get a line that goes through zero (or the fixed interest rate) and touches the efficient frontier. I really loved Shiller's interpretation of this line in his Coursera course. Imagine that you have the possibility to put any fraction into the tangential portfolio or the zero-risk option. And you can even borrow money. Then you can choose any amount of risk by moving money between the fixed rate and the tangential portfolio. And the neat thing is that you get a higher return for a given risk than for any individual stock. Does this make any sense? And do others agree with this interpretation?

--Fred

[Attached backtest, cloned 118 times -- Backtest ID: 552697ee43ce990d45991068]

Fred,

That's really cool. What's especially cool is that cvxopt didn't use to work on Quantopian, and I'm not sure why it does all of a sudden (we did roll out a few bigger changes, so we must have fixed it without realizing)!

The code looks good to me but note that you don't have to skip the first 100 days as history() automatically backfills on Quantopian (only needed for zipline where we don't do that).

I was surprised that it runs too, but figured that you simply did not announce it.

Both Robert Shiller's and Tucker Balch 's Coursera courses exposed the EF as class material and I agree, it was enlightening. What I've continued to take issue with however, is this concept that one need only "pick a point" along the curve and achieve one's balanced risk/reward. Like it's easy to just slide your finger along the curve and select, at your leisure, what you'd like to earn on your money. Yeah, if only. The focus on past prices that make up these curves and the assumption that those prices exist in perpetuity always dumbfounded me. 20/20 is great, and these EF curves are exactly that. What is rarely talked about is that the future will NOT fall along that line. Yeah it's nice to know what you could have earned. But what you will earn surely won't be so easily cherry picked off of some historic curve of "what ifs".

Addition: What would be interesting to see would be a series of EF charts where, like @Fred J. has used, the re-optimized portfolio is plotted within the historic EF curve to see how well the optimization performs and stacks up against a perfect view of the past. "Here's the frontier, and here's the actual returns for the optimized portfolio as the group and its ratios stood prior to the measurement."

Hi Market,

What do you mean by 20/20? For the rest, it seems that everyone in the world agrees that Markowitz optimization is a rather rough framework, but it seems to be a nice starting point for portfolio optimization

--Fred

@Fred J., From what I remember of T Balch's class, I spent a hellofa lot of time creating these EF plots (and all the data that drives them), then assuming I could pick an N sized group that made up one of the excellent points that fell along the perimeter of my curve and forward testing them -- to continued disappointment. Even when I decided to simulate a monthly recreation of the curve, and pick a new portfolio from approximately the same spot on the curve, the walkfoward returns from that point were always way off of what put that group on the curve in the first place.

It was as if I was picking the brightest stars, right before they dimmed and faded. Every time.

A starting point for portfolio optimization? Sure. But wouldn't you rather want to find the groups that were NOT on the curve yet, but were headed that way? That was what I concluded: the EF was and will always be history and, from my observations, has less (than I'd expected) to do with the future. All of this, of course, reflects my findings, and your results will most certainly vary or refute what I've said here. Looking back to build the curve produces a nice curve of what "could have been", e.g. 20/20.

Hi,

I played a bit more around and summed up the results in a notebook.

@Market it is really funny to see how poorly the portfolio performs return-wise. But the algorithm is actually not that bad at reducing the risk, so I only half agree. What I also find impressive is how large the error bars are on the return anyway, so I really do not know what one should expect. It is really sad/impressive to see how robust fully diversified portfolios are.

@Thomas I wanted to learn the pymc stuff to get actual error bars all over the place. Does this work on the Quantopian platform? And do you know of an example where the minimization of the risk is done with pymc or something like that instead of cvxopt?

I gotta say, the biggest problem with these studies is not the Markowitz part, it's using historical returns/covariances as estimators for future returns/covariances.

The really interesting stuff comes when you combine this stuff with the "low volatility anomaly", dissected thoroughly by Eric Falkenstein's former blog http://falkenblog.blogspot.ca/ , and commercially exploited by ETFs such as SPLV, USMV, EFAV and EEMV (fyi all of which I own). It's a great anomaly; persistent, broad, high-capacity, and with a solid rationale in behavioural finance.

I can't get cvxopt to work in the backtester, but apparently Fred did?

@klon

Yes, I was able to run backtests. Does it throw errors for you if you clone 'my' algorithm?

Really enjoying all these discussions.

I agree that the efficient frontier is a very idealized framework and mostly a way of looking at the past. Especially the mean returns are a bad indicator. If you replace the mean vector with ones, however, (as is the case in the code I posted above with shrink_means=True), you only get diversification according to variance and covariance. Based on returns that might still be very noisy however. One other approach is to instead only take the exposure to certain risk factors and then correlate based on those. Here's an interesting paper I'm currently reading on that topic (by Markowitz): http://pubsonline.informs.org/doi/abs/10.1287/opre.1050.0212

FYI I just started reading http://www.lhpedersen.com/efficiently-inefficient , it talks about this, but more generally it's (so far) a really excellent book, and relevant for folks here!

@ Thomas Wiecki,

I agree. It is a way of looking at the past. It could be a brilliant technique if we can simulate (using any pricing algorithm) scenarios that have not been encountered in the past, but can be realistically expected sometime in the future. For example negative interest rates. This might give a better idea of the expected value of returns (mean) and the expected variance of returns.

By the way, what is your rebalancing philosophy?

The original Markowitz paper is still one of the best intros to portfolio optimizations. Like Simon alluded to, he doesn't fall prey to the pitfalls that many EF practitioners do. IIRC, he suggests that the ER estimates should be subjective and even historical covariance was up for modification.

https://www.math.ust.hk/~maykwok/courses/ma362/07F/markowitz_JF.pdf

Here is an alternate version that reformulates the cvxopt input a bit and also more accurately focuses the search range for the efficient frontier closer to the 'convex' part of the curve.

Together this results in an algo that does select multiple securities to minimize variance with maximum Sharpe ratio -- typically 3 to 5 on any one day -- from a universe of 8.

The search for the efficient frontier is improved by limiting the range from zero (or the minimum expected return) to the maximum observed return. If a solution exists, this is where the efficient frontier curve is most "convex" which sets the table to get a good polynomial fit. I selected 25 points which should be plenty to get a good fit for a second order polynomial.

The change to the cvxopt input is in the G and h matrices. The idea here is to include the target return in the G and h matrices rather than multiplying the covariance matrix by the target return. I think this is a more accurate formulation of the problem to be solved.
Reference : https://wellecks.wordpress.com/2014/03/23/portfolio-optimization-with-python/

Also added are some basic record statements to track the number of holdings and close-outs.

Just as a comment, I agree with Market Tech and others in the thread in that you need to look more closely at how the covariance and expected returns are determined if you plan to use this type of algo in a real market. Variance minimization algos always seem to latch on to the outliers!

Richard
http://quant-coder.prokopyshen.com/ContactMe

Providing custom quant programming, tutoring, linux cloud support and troubleshooting services.

[Attached backtest, cloned 100 times -- Backtest ID: 55324656e071250d496053f9]

Hey there,

I'm new to this field. I've read through the posted code and I do not get why

x1 = np.sqrt(m1[2] / m1[0])

is the optimal mu to optimize.

Can you explain to me how you derive x1?

Cheers

Christopher -

The curve-fitted function is a second-order polynomial, which in general can be written as y = a*x^2 + b*x + c, where x is the return and y is the risk.

np.polyfit returns these constants as a = m1[0], b = m1[1] and c = m1[2].

The algo wants to find the maximum Sharpe ratio, i.e. the value of return where return/risk (x/y) is a maximum. This is more easily found by solving for where y/x is a minimum.

At this return value (x1 in your snippet), the line from the origin is tangent to the second-order polynomial curve:

y/x = a*x + b + c*x^-1

Take the derivative, set it to zero and solve:

d(y/x)/dx = a + 0 - c*x^-2 = 0
a = c*x^-2
x^2 = c/a

And finally

x = sqrt(c/a)

which is coded as:
x1 = np.sqrt(m1[2] / m1[0])
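The algebra can also be sanity-checked numerically against a brute-force grid search (made-up polyfit coefficients, for illustration):

```python
import numpy as np

# Made-up polyfit coefficients: risk y = a*x^2 + b*x + c, x = return
a, b, c = 2.0, 0.5, 1.8
x = np.linspace(0.05, 3.0, 10000)
ratio = (a * x**2 + b * x + c) / x   # risk/return along the fitted curve

x_star = np.sqrt(c / a)              # closed-form minimizer of y/x
x_grid = x[np.argmin(ratio)]         # brute-force check on the grid
```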

Richard
http://quant-coder.prokopyshen.com/ContactMe

Hey Richard,

thanks a lot for the nice and detailed explanation! Now it's clear to me.

Christoph

Hi everyone,

Strange that CVXOPT works in the backtester only if you skip the first day.
Otherwise, it throws an error (Runtime exception: ImportError: cannot import name misc).

From Fred's code above, you don't have to skip the first 100 days, as history() automatically backfills. Just skip 1 day and it will work. Thanks

Thanks for the intuitive and clear explanation Richard.

Desmond: That's really curious. This will certainly help us track down this bug.

See also here for some portfolio optimization strategies implemented in Python.

Christian: That looks really promising and definitely fills a gap. Do you plan to add other optimization methods?

Also, in case anyone wants to clone the NB, here is a separate thread that allows for that: https://www.quantopian.com/posts/the-efficient-frontier-markowitz-portfolio-optimization-using-cvxopt-repost-cloning-of-nb-now-enabled

I'm not sure yet, any suggestions? Currently I solve a quadratic program using CVXOPT, if things don't fit into this problem class then it's getting more tricky.

Hi everyone. First of all, thanks Thomas for the great blog post. I've added some features to the original post. As can be seen in the notebook, the Capital Market Line (which is tangent to the maximal Sharpe ratio portfolio and passes through the risk-free rate) is added to the plot.


Hi Thomas Wiecki,
Thanks for the great post.
I see you use numpy to generate random numbers for the returns. Does your calculation assume logarithmic or arithmetic returns, and how significantly does this impact the output?
I have seen 'one-shot' optimisations typically use arithmetic returns and multi-period optimisations use either arithmetic or log and wanted to get your take on this.
Thanks

Tom: that's a great addition!

Toby: Yes, these are assumed to be arithmetic returns. Log returns make some of the math easier but can have some non-intuitive consequences (see http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1549328).
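For concreteness, the two conventions differ in how they aggregate over time: log returns add across periods, while arithmetic returns compound. A tiny sketch with a made-up price series:

```python
import numpy as np

prices = np.array([100.0, 105.0, 102.9, 110.1])   # made-up price series

arith = prices[1:] / prices[:-1] - 1              # arithmetic (simple) returns
logret = np.log(prices[1:] / prices[:-1])         # log returns

# Log returns sum across periods; arithmetic returns compound.
total_from_log = np.exp(logret.sum()) - 1
total_from_arith = np.prod(1.0 + arith) - 1
```

Both recover the same total return over the period; the difference shows up in how means and variances behave when you estimate them from one or the other.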

Dear Contributors:

As an economics student it is really amazing to see some of the principles we learn in class applied in the real world.
However, I am still left with a couple of questions:

  1. The code does not incorporate expected utility theory. How would one go about incorporating indifference curves for individual investors into the code to determine the optimal portfolio?
    I am asking this since I thought it might be interesting to rebalance the portfolio according to a changing risk aversion that is linked, for example, with VIX or investor sentiment on Twitter.

  2. We saw that introducing a short-sale constraint will always lower expected utility for an individual investor.
    Even the CLA algorithm mentioned above will only allow short positions of maximum -20%.
    How would one solve this limitation?

  3. Thirdly, I notice that the more assets we introduce, the farther our efficient frontier shifts away from our individual portfolio bullet points.
    Is this simply because of diversification effects?

  4. Finally, we saw that historic volatility and means are bad predictors of future values.
    Indeed, much of the literature in risk management seems to agree. What if we were to use EWMA and the weighting of observations in line with Boudoukh, Richardson and Whitelaw?

Is there any chance of reposting the original blog post because right now it is unavailable?

The blog is having an extended maintenance window - I expect it to be back up in a few hours, tomorrow at the latest.


It's back.

Why does plugging in "mu*S" solve for the optimal weights at return = mu?

# Calculate efficient frontier weights using quadratic programming  
portfolios = [solvers.qp(mu*S, -pbar, G, h, A, b)['x']  
                   for mu in mus]  

Hi Thomas,
Thanks for your program. After running it I hit a problem that I can't solve: I can't understand what is causing the "error converting array"?

Program:
def optimal_portfolio(returns):
    n = len(returns)
    returns = np.asmatrix(returns)

    N = 100                                                # line 59
    mus = [10 ** (5.0 * t / N - 1.0) for t in range(N)]    # line 60

    # Convert to cvxopt matrices (line 61)
    S = opt.matrix(np.cov(returns))                        # line 62
    pbar = opt.matrix(np.mean(returns, axis=1))            # line 63
Error:
File "D:/Python/meachine learning/.idea/Markowitz.py", line 63, in optimal_portfolio
pbar = opt.matrix(np.mean(returns, axis=1))
TypeError: error converting array

Note that CVXPY is now available on Quantopian, in case folks prefer to use it instead.

Perhaps a silly question, but does anyone know how to use the code here to replicate the results of the Lagrangian approach in https://www.quantopian.com/posts/improved-minimum-variance-portfolio? That post gets the weights as follows:

def min_var_weights(returns):  
    """  
    Minimum Variance Portfolio. Solves the Lagrangian for the weights that minimize the variance  
    see: https://www.quantopian.com/posts/minimum-variance-portfolio  
    and http://faculty.washington.edu/ezivot/econ424/portfolioTheoryMatrix.pdf  
    :param returns: dataframe of returns  
    :return: weights  
    """  
    cov = 2*returns.cov()  
    cov['lambda'] = np.ones(len(cov))  
    z = np.ones(len(cov) + 1)  
#     x = np.ones(len(cov) + 1)  
    x = np.array([0.]*(len(cov)+1))  
    z[-1] = 0.0  
    x[-1] = 1.0  
    m = [i for i in cov.as_matrix()]  
    m.append(z)  
    return np.linalg.solve(np.array(m), x)[:-1]  

I just want to be able to cross-check the results against the optimal_portfolio function here by rephrasing the problem with the constraints. Thanks!
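One way to cross-check: with only the full-investment constraint (no inequality constraints), the minimum-variance weights also have the closed form w = Σ⁻¹1 / (1ᵀΣ⁻¹1), which should agree with the Lagrangian linear system in the quoted function. A sketch using numpy only (the helper names and the random toy covariance are mine):

```python
import numpy as np

def min_var_closed_form(cov):
    """Closed-form minimum-variance weights: w = Sigma^-1 1 / (1' Sigma^-1 1)."""
    ones = np.ones(len(cov))
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

def min_var_lagrangian(cov):
    """Solve the same KKT system as min_var_weights above:
    [2*Sigma  1] [w     ]   [0]
    [1'       0] [lambda] = [1]
    """
    n = len(cov)
    m = np.zeros((n + 1, n + 1))
    m[:n, :n] = 2 * cov
    m[:n, -1] = 1.0   # lambda column of ones
    m[-1, :n] = 1.0   # budget constraint row: 1'w = 1
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    return np.linalg.solve(m, rhs)[:-1]

# Toy positive-definite covariance for the cross-check.
rng = np.random.default_rng(1)
a = rng.normal(size=(6, 4))
cov = a.T @ a
w1 = min_var_closed_form(cov)
w2 = min_var_lagrangian(cov)
```

Both routes should give identical weights that sum to one; adding the long-only (G, h) constraints is where you would need a QP solver like cvxopt instead.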

I am having trouble running the last chunk because 'add_history' no longer exists. Does anyone know a workaround? Thanks!

I was wondering if someone could explain why the P matrix is mu * S and the q vector is -pbar. I thought I understood this when I took mu to be a risk-tolerance parameter (q in the Wikipedia article https://en.wikipedia.org/wiki/Modern_portfolio_theory#The_efficient_frontier_with_no_risk-free_asset). However, once I realized that mu was not just a risk-tolerance constant but represented the return of the portfolio, I looked further into it. Further down in the Wikipedia article, and in this notebook, it states that we can parameterize mu = R^T w and aim to minimize the variance while keeping mu constant. I believe I understand that, and it is expressed better in either of these two links:
http://dacatay.com/data-science/portfolio-optimization-python/
OR
http://ehremo.blogspot.fr/2013/08/markowitz-optimization-with-cvxopt.html
The difference I see between the method in this notebook and the methods in those links is that the solver in this notebook captures mu in the P matrix, while the other links capture mu in the h vector. Obviously the results are the same.

When I treated mu like any risk-tolerance constant, I was fine. But now that I know mu represents the return, I am struggling to understand why multiplying the covariance matrix by mu gives us the P matrix for the solver.
Apologies if this is vague; I've looked at this so many times that the question makes sense in my head, but possibly not on paper.

I believe that Richard Prokopyshen touches on the same issue that I described in my previous post, in a post from April 2015:

The change to the cvxopt input is in the G and h matrices. The idea here is to include the target return in the G and h matrices rather than multiplying the covariance matrix by the target return. I think this is a more accurate formulation of the problem to be solved.
Reference : https://wellecks.wordpress.com/2014/03/23/portfolio-optimization-with-python/

However, while he believes that using the G and h matrices is a more accurate formulation than multiplying the covariance matrix by the target return, I am a step further behind: I don't even understand why multiplying the covariance matrix by the target return works in the first place.

Thanks to Richard for that post as I believe it articulated my query much better than I had done.
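One way to see the equivalence the thread is circling: minimizing mu * w'Σw - pbar'w for a given mu (a risk-aversion sweep, which is what solvers.qp(mu*S, -pbar, ...) does, up to the 1/2 factor in cvxopt's objective) gives the same weights as minimizing variance subject to a matching target return. Ignoring the long-only (G, h) constraints for clarity, both problems reduce to linear KKT systems that can be checked against each other directly. A sketch (function names, mu value, and random data are mine, purely illustrative):

```python
import numpy as np

def risk_aversion_weights(cov, pbar, mu):
    """min mu*w'Sw - pbar'w  s.t. 1'w = 1.
    Stationarity: 2*mu*S*w + lambda*1 = pbar."""
    n = len(cov)
    k = np.zeros((n + 1, n + 1))
    k[:n, :n] = 2 * mu * cov
    k[:n, -1] = 1.0
    k[-1, :n] = 1.0
    rhs = np.append(pbar, 1.0)
    return np.linalg.solve(k, rhs)[:-1]

def target_return_weights(cov, pbar, target):
    """min w'Sw  s.t. pbar'w = target and 1'w = 1.
    Stationarity: 2*S*w + l1*1 + l2*pbar = 0."""
    n = len(cov)
    k = np.zeros((n + 2, n + 2))
    k[:n, :n] = 2 * cov
    k[:n, n] = k[n, :n] = 1.0       # budget constraint
    k[:n, n + 1] = k[n + 1, :n] = pbar  # return constraint
    rhs = np.zeros(n + 2)
    rhs[n] = 1.0
    rhs[n + 1] = target
    return np.linalg.solve(k, rhs)[:n]

# Toy positive-definite covariance and expected returns.
rng = np.random.default_rng(2)
a = rng.normal(size=(8, 4))
cov = a.T @ a
pbar = rng.normal(size=4)

# Solve the risk-aversion form, then hit its realized return as a target.
w_mu = risk_aversion_weights(cov, pbar, mu=3.0)
w_tr = target_return_weights(cov, pbar, target=pbar @ w_mu)
```

The two weight vectors coincide: the risk-aversion solution satisfies the target-return KKT conditions with l2 = -1/mu, so sweeping mu over a grid (the mus list in the notebook) traces out exactly the frontier you would get by sweeping the target return instead.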