The Process of Naming The September Prize Winner

Today we named our 7th winner of the Quantopian Open. The judging process was harder than usual. For the first time we had to invoke the "financially prudent" rule and disqualify two of the entries.

Over the last 7 months we've learned a lot about judging algorithms. We have taken what we've learned and put it into software tools. We've made those tools available to the community and made them open source for further scrutiny and improvement.

We used these new tools on the contest leaderboard, and they showed us that two of the leading algorithms wouldn't be financially prudent for investing. The prize, of course, is for the winner, but the trading risk is entirely ours. We need to manage that risk. When an algorithm isn't financially prudent, it isn't eligible to win the prize.

We thought it would be helpful to share the risks we see in these algorithms so that we can all learn from them, as a community.

Algo 1: Too Unpredictable

The first algorithm lacks consistency, and it is impossible for us to determine what the algorithm will do going forward. It's easiest to explain with these pictures coming out of pyfolio.

Here is the performance of the algorithm from July 2013 through this past Friday, August 28th. The algorithm was submitted to the contest early on August 3rd. As you can see, it drops 19% in its out-of-sample test period.

[Figure: cumulative returns, in-sample vs. out-of-sample]
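This kind of split is what pyfolio produces when you pass a `live_start_date` to `create_returns_tear_sheet`. Here is a minimal pandas sketch of the underlying idea - split a daily returns series at the submission date and compound each side separately. The returns series below is randomly generated stand-in data, not the contestant's algorithm:

```python
import numpy as np
import pandas as pd

# Sketch of the in-sample / out-of-sample split that pyfolio performs
# when given a live_start_date. The returns are random stand-in data,
# NOT the contestant's algorithm.
idx = pd.date_range("2013-07-01", "2015-08-28", freq="B")
rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0003, 0.01, len(idx)), index=idx)

live_start = pd.Timestamp("2015-08-03")  # contest submission date
in_sample = returns[returns.index < live_start]
out_of_sample = returns[returns.index >= live_start]

# Compound each period separately (geometric cumulative return).
cum_is = (1 + in_sample).prod() - 1
cum_oos = (1 + out_of_sample).prod() - 1
print(f"in-sample: {cum_is:+.1%}, out-of-sample: {cum_oos:+.1%}")
```

Comparing the two numbers side by side is exactly the check described above: a backtest that looks nothing like the paper-traded period is a red flag.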

However, when you look at the paper trading results on the leaderboard, the algorithm had positive returns of more than 20% during the out-of-sample test period. That is head-scratchingly different - why would an algorithm behave differently in the month of August depending on what day the test started? You gain additional information - and uncertainty - by looking at the algorithm's long/short exposure over the longer backtest:

[Figure: long/short exposure over the full backtest]

What you see there is the algorithm making a dramatically different bet on the first day of out-of-sample testing. After a couple of years of generally flat investment and unmanaged (declining) leverage, the algorithm suddenly doubles its long exposure, and it also doubles the number of stocks that it holds.
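The exposure numbers behind a chart like that reduce to simple arithmetic on position values. Here is a sketch with hypothetical dollar figures (pyfolio derives the real ones from the backtest's positions):

```python
# Gross leverage and net exposure from long/short position values.
# All dollar figures are hypothetical, for illustration only.
long_value = 105_000.0     # market value of long positions
short_value = -5_000.0     # market value of short positions (negative)
portfolio_value = 100_000.0

# Gross leverage counts both sides; net exposure is the directional bet.
gross_leverage = (abs(long_value) + abs(short_value)) / portfolio_value
net_exposure = (long_value + short_value) / portfolio_value
```

A sudden jump in either series on the first out-of-sample day - with no precedent in the backtest - is the behavior change flagged above.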

When you add all of that information up, we have an algorithm whose activity, risk, and performance are unpredictable. We don't have an in-sample backtest with matching out-of-sample trading behavior. That's not something that we can prudently invest in.

Algo 2: Too Overfit

We disqualified the second algorithm because it was overfit. The first thing that concerned us was the month-to-month comparison of the returns of the algorithm.

[Figure: month-to-month returns]

As you can see, the month of July was very high, far higher than the other months. That comes through most clearly when we look at the Bayesian cone. Below is a Bayesian cone that uses January 1st as its start date.

[Figure: Bayesian cone, January 1st start date]

What you see there is an algorithm that behaves a certain way for many months, and then, the month before the contest starts, it takes off dramatically. The algorithm exits the Bayesian cone; it's no longer behaving the way it used to. That's what it looks like when a machine-learning algorithm overfits a specific set of data. The training period looks fantastic, but it's only in-sample data. Once the algorithm goes out-of-sample, the performance drops dramatically. It did pretty well in August, but that looks like luck more than a repeatable trading strategy.
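pyfolio's actual Bayesian cone fits a posterior over returns and samples forward paths; as a rough frequentist stand-in, you can project cumulative returns from the in-sample mean and standard deviation and flag any day that leaves the ~2-sigma band. All numbers below are made up for illustration:

```python
import numpy as np

# Rough stand-in for a Bayesian cone check: does out-of-sample
# performance leave a 2-sigma band projected from in-sample behavior?
# All numbers are hypothetical.
in_sample = np.tile([0.0105, -0.0095], 125)   # mean 0.0005, std 0.01
out_of_sample = np.full(21, 0.008)            # a suspiciously hot month

mu, sigma = in_sample.mean(), in_sample.std()
t = np.arange(1, len(out_of_sample) + 1)
center = mu * t                      # expected cumulative return path
half_width = 2 * sigma * np.sqrt(t)  # cone half-width grows like sqrt(t)

cum = np.cumsum(out_of_sample)
exits_cone = bool(np.any(np.abs(cum - center) > half_width))
print(exits_cone)  # True: the hot month leaves the cone
```

An algorithm whose out-of-sample path stays inside the cone is behaving consistently with its backtest; one that exits it, in either direction, is not.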

I know that some people will look at that graph and think we're crazy not to use this algorithm. The argument goes something like this: "Who cares what's going on, so long as you're making money?" We don't subscribe to that school of thought. When you don't know what's going on, it hurts you just as often as it helps you. Furthermore, we have experience watching overfit machine-learning algorithms, and they always crumble over time. We don't believe that this algorithm can maintain the upside, and we believe it will fall apart during the coming 6 months of the prize period. It's not a financially prudent investment.

Algo 3: The Winner

Having shown you a couple of algorithms that we weren't comfortable investing in, I'd like to show you one that we do like.

This algorithm was submitted on May 29th, so it has three months of out-of-sample performance to study. As you can see, it has stayed within the predictive cone for that period, and it is making pretty good money.

[Figure: Bayesian cone for the winning algorithm]

There are some nits to pick in the full tear sheet review. The algorithm doesn't do everything right. But on balance, it passes the prudence test. The backtest and out-of-sample test are consistent and positive. It's a good one to manage the $100,000, and we hope to write the author a big check at the end of the prize period.

Wrapping Up

It's important that algorithm writers keep the long term in mind and not optimize for a single month's performance. The fund page has a list of "What we look for" and "What we don't want" that should be helpful. The way to win the contest, and the way to get into the fund, is to build a fundamentally sound algorithm based on a good investment thesis.

Finally, over the coming weeks we plan on doing a lot of education about how to use pyfolio to evaluate your own backtests.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

9 responses

That makes perfect sense to me. Your post is also a good example for people (like me) of what good and bad algorithms look like.

This is a useful example of pyfolio usage. Thanks for sharing.

Hi Dan,

How did you manage to pull in the live trading results, and combine them with the backtest data? Or did you simply do the analyses on a backtest, run over the entire period of interest, and then delineate the original backtest and live trading periods?

I realize that you can only pick one winner (your capital is finite, and there's a benefit to spreading it out over time), but it would be interesting if you looked at, say, the top 50 contestants, to see to what extent Pravin is statistically the best pick, or whether he just got lucky. In other words, there is the implication that he created something special, but maybe there were 25 other algos with results just as good - maybe there was no winner, but a 25-way statistical tie! I'm not saying you shouldn't have individual winners, but if we are being scientific here, it should be clear whether the winner actually stands out from the crowd.

Pravin's algo is making about 14% return, compounded annually. Is it reasonable to assume that you need annual returns of about 10% or higher for your hedge fund? Is there some rough cut-off for returns (after all, assuming you diversify away the risk, absolute return is what matters)? Assuming you are starting to understand the institutional market for your hedge fund product(s), what is it expecting?

Grant

Hi Dan,

That makes perfect sense to QuantvsMarket, but not to me.
I really do not understand why you disqualified Algo 2.
I think it is Tibor Szabo's algo, which has the second-largest Sharpe ratio in the September 2015 Prize.
Isn't it the same as disqualifying a marathon runner for running the last mile faster than the others?

Marathon runners don't spontaneously turn around and run back to the starting line.

Vladimir: I believe they disqualified the other algos because they aren't sufficiently confident that the backtest performance reflects expected live results. Part of their difficulty comes from not seeing the source -- who's to say that the algo writers didn't hard-code or (accidentally) immensely overfit the past to get great backtest performance that doesn't generalize to walk-forward. Alternatively, the algo could have a couple of different regimes where the backtest / walk forward only covers one and they're not sure if the others will perform well in the future (leverage change in algo #1, unusual performance in algo #2). That said, I'm not quite sure what their problem w/ algo 2 was -- that equity curve looks great for the kind of Sharpe ratios they're looking for and occasional burstiness isn't unreasonable to expect.

Grant: Unless the returns are insanely low, they don't care too much about the absolute number; instead what matters is risk-adjusted returns because they can leverage up as needed.
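That point about risk-adjusted returns can be sketched in two lines: leverage (ignoring borrowing costs) scales both return and volatility by the same factor, so the Sharpe ratio is unchanged. The numbers here are hypothetical:

```python
# Leverage scales return and volatility equally, so the Sharpe ratio
# is unchanged (ignoring borrowing costs). Hypothetical numbers.
mu = 0.06      # annual excess return of the unlevered strategy
sigma = 0.04   # annual volatility of the unlevered strategy
L = 3.0        # leverage factor

sharpe_unlevered = mu / sigma
sharpe_levered = (L * mu) / (L * sigma)  # L cancels: same Sharpe
```

That's why a low-but-steady return stream can still be attractive: the fund can dial the absolute return up with leverage, as long as the risk-adjusted number holds up.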

@quantopian staff - did you ask the contest disqualified winners to explain what might have been causing the peculiar behaviour with which you were concerned?

@Alex S: I have read many times what you stated: "what matters is risk-adjusted returns because they can leverage up as needed". But how can Quantopian be sure the algorithm can be leveraged up as needed? How does leveraging work? I thought that leveraging was like giving more money to the algorithm, but if that were true, not all algorithms would perform well with more money (for example, they would buy too many shares of a single stock, and that would move the market). So there must be something I don't understand about leveraging.

@Alex S. - Regarding the return, I was just trying to get a feel if Pravin's algo is somehow exemplary? I understand that if the risk appears low enough, then leverage (borrowing) can be applied. Say I have an algo that consistently returns X% per day, without any variability whatsoever. It's like a high-yield bank CD. Assuming my strategy is not already leveraged, Quantopian can go to 3X% per day, it seems. But, there must be some minimum value of X that would make sense, to overcome expenses, compete with other hedge funds, make enough money to pay back the Quantopian VC's, sustain the business, etc. Is there some number that Quantopian needs to hit?

@Luca - The $100K figure for the contest is unrealistic, compared to the $25M to $100M the algo would need to sustain in the $10B Q hedge fund (in theory, if Q can get a huge number of algos, then the capital could be less, maybe $5M to $25M?). Q hasn't really discussed how they are going about managing the leap from $100K to fund-level capital levels, which would include best execution to minimize slippage and managing leverage. As I understand, there are so-called prime brokers (not IB) who manage this kind of thing.