contest algo - feedback?

Attached, please find a tear sheet for one of my contest algos. Per https://www.quantopian.com/contest, it was "Created at 2018-03-09 6:24:19 am" so results after this date are out-of-sample (I think there is a way to show this on the tear sheet--if anyone knows, please advise).

Questions:

1. If the performance continues, might it be attractive for an allocation? Or is it a dud based on current, or anticipated new requirements (we've seen some dramatic changes over the years)?
2. If the answer to #1 is "yes" then would a 2-year backtest plus 6 months of out-of-sample data be sufficient to make a decision? Or would a longer backtest, plus possibly more out-of-sample data be required? Or would I be done?
3. Are there any results on the tear sheet that indicate the need, or opportunities for improvement?
4. Aside from the tear sheet, are there other analyses that could be applied at this point to provide actionable feedback?
5. I do not have a "Strategic Intent" for the algorithm (per the requirement on https://www.quantopian.com/allocation). The algorithm has more than one factor, and I have not attempted to suss out the relative contributions of each, nor do I have a "hypothesis" for each that I have tested. If the algo has legs, I could try to piece together a story, but I'm unclear what would be required (e.g. a slide pack? report?). Perhaps an example could be provided of the expected deliverable?
6. I would permit a limited number of Quantopian employees to view the code, and make specific recommendations for minor changes. One benefit of this approach would be that perhaps treating the algo as completely new could be avoided; if changes are understood and minor, then the risk of introducing bias ("over-fitting") could be mitigated, and an additional 6 months of out-of-sample testing could be avoided.
7. To what extent might the limitations of the backtester give misleading results for this algo? For example, in light of the problem raised on https://www.quantopian.com/posts/short-selling-in-backtester-time-for-improvement-1, and the stocks traded, might there be problems when going live with real money? Aside from shorting, is there anything else that might lead to lower performance when trading, relative to the simulation (other than "alpha decay" which is not a problem with the accuracy of the simulation)? Generally, are simulation inaccuracies taken into account when assessing algos for an allocation? If so, what are the risks for this algo, and what penalty would be assessed in determining its allocation-worthiness?
8. Are algos evaluated anonymously on stand-alone technical merit first, before engaging their authors? I would think that this would be the practice, to avoid bias (e.g. an algo submitted by a famous hedge fund manager might be looked upon favorably, and the evaluation biased). However, I can also see that user profiling and meta-data might be brought to bear, as well, and presumably would be in line with the Quantopian terms of use (one example is the paper and talk, https://www.quantopian.com/posts/q-paper-all-that-glitters-is-not-gold-comparing-backtest-and-out-of-sample-performance-on-a-large-cohort-of-trading-algorithms). Some insight into the evaluation process, vis-a-vis meta-data would be helpful (e.g. would running lots of backtests on a given algo count against me?).
9. Is there a representative licensing agreement available for review? Putting a lot of effort into this, and then finding out I would not want to sign (or could not sign, for some legal reason) would be a bummer.

Hi Grant,

Thanks for sharing this tearsheet with the community. I'll see if we can round up someone to review it for you.

In the meantime, there are a couple of options for the create_full_tear_sheet() method and research notebooks that you can use to improve the analysis and generally be productive in a research notebook:

1) In order to see docstrings, you can append a ? to the end of the method name and the Jupyter notebook will invoke a new window with the docstring. So if you run bt.create_full_tear_sheet? in a cell, you'll see the full docstring and all the other parameters you could feed to the tear sheet call.

I'd encourage folks who aren't familiar with Jupyter notebooks to check out our lecture series, specifically the first lecture, which provides an introduction to Jupyter notebooks and Quantopian Research.

2) Given you have some out-of-sample time on this algo, one of the parameters listed in the docstring will be useful.

The live_start_date is a good one to use in this case.

So running a cell like bt.create_full_tear_sheet(live_start_date='2018-03-09'), using your specified out-of-sample date, would provide an improved analysis.

All the best,
Josh

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

@Grant, first, great work. Most interesting.

A 25.9% CAGR is impressive. Not many fly at that level, especially with a drawdown of -5% or less and a -0.10 beta. Your strategy, mixed with another 25% CAGR strategy having a higher beta and higher drawdown, could temper that strategy's volatility level and beta swings (if > +0.10), since yours is inversely correlated to the market. Hint: (-0.10 beta) + (+0.10 beta) = ?
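To make the hint concrete (a toy calculation of my own, not from any Quantopian API): portfolio beta is just the capital-weighted average of the component strategies' betas, so a 50/50 mix of a -0.10-beta strategy and a +0.10-beta strategy nets out to zero.

```python
def blended_beta(betas, weights):
    """Capital-weighted average beta of a set of combined strategies."""
    return sum(b * w for b, w in zip(betas, weights))

# 50/50 mix of a -0.10-beta strategy and a +0.10-beta strategy:
print(blended_beta([-0.10, 0.10], [0.5, 0.5]))  # -> 0.0
```

The same weighted-average logic applies to any number of strategies, which is why a negative-beta strategy is valuable as a portfolio ingredient even on its own merits.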

Your payoff matrix equation is: Σ_n (q∙Δp) = n∙x_bar = F_0∙((1 + g_bar)^t - 1), where n is the number of trades, x_bar is the average net profit per trade, and g_bar is your average return rate. Note: n and x_bar are provided in the round_trips section.

Therefore, to improve on your design you simply need to find ways to increase n, x_bar, or both at the same time. This would increase your average CAGR (g_bar).

A simple suggestion to improve on your design: find ways to do more of what you already do.

Hi Grant,

Very impressive! The positives I think speak for themselves, so here is my 'critical' feedback, along with some questions:

1. The strategy looks very position concentrated, and just within the bounds of the max 5% requirement of the contest. Does it do as well if you increase the number of bets (and decrease individual bet size)? My guess is that it doesn't since your probability of making a profitable trade is below 50% (yours is about 45% in total it looks like), meaning 'the house' has the edge using a gambling analogy. It looks like few large winners (and not as large losers) are making up for it though so over all the strategy is quite profitable. If this is by design e.g. by using the Kelly criterion (when odds are in your favour, 'bet big'), then that might be fine. If it's not by design, and 'just luck' then that probably wouldn't be too interesting to investors. I believe this is one reason why they ask for 'economic rationale' behind the strategy. Perhaps look at trying to increase your batting (and betting :)) average, unless your strategy is intentionally ok with having lots of small losers that are more than compensated by a few big winners?

2. It looks like over all your Longs are more profitable than your Shorts, but that your average Short winning trades are much larger than the Short losing trades. If you understand the reason behind this, and have a rationale for it, you may want to try to adjust the Long/Short weights accordingly (maybe you're already doing this if you're indeed using the Kelly criterion)? I have a similar challenge and I need to do a lot more research to better understand the reason behind my return skewness.

3. By design, it looks like your algo is mostly looking to trade volatile large caps that are either trending or due for a short-term reversion? There must be some rationale behind this? I believe investors would likely want to have a good reason for believing that the strategy is likely to continue to perform in the future, and not 'only' as evidenced from true OOS performance.

4. I wonder how the algo is able to have a few positions over 5% and one over 6% even, and still be qualified for the contest? Since your max position size parameter must be close to 5%, you may be running a risk of having the algo pulled from the contest if any position goes over again.
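On the Kelly point in #1, an illustrative sketch only (nothing here is taken from the actual algo): the Kelly fraction for a bet with win probability p and win/loss payoff ratio b is f* = p - (1 - p)/b. At a ~45% hit rate, a positive edge requires the average winner to be meaningfully larger than the average loser, which matches the "few large winners" pattern described above.

```python
def kelly_fraction(p_win, payoff_ratio):
    """Kelly fraction f* = p - (1 - p) / b, where p is the win
    probability and b is the average-win / average-loss ratio."""
    return p_win - (1.0 - p_win) / payoff_ratio

# At a ~45% hit rate, even-money (1:1) bets have a negative edge,
# so Kelly says bet nothing:
print(kelly_fraction(0.45, 1.0) < 0)   # True
# But if average winners are, hypothetically, 1.8x average losers,
# the edge (and the optimal bet size) turns positive:
print(kelly_fraction(0.45, 1.8) > 0)   # True
```

The 1.8x payoff ratio is a made-up number for illustration; the round-trips section of the tear sheet would supply the real win/loss statistics.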

In regards to your Q #2, I could be wrong, but hopefully Jess and her team have automated jobs to run longer backtests on algos that look interesting and that have held up in >6mo true OOS.

Just my 2 SEK worth :). Overall a very impressive strategy that is so far proving to hold up very well in OOS.

@Leo, the tear sheet option command is:

bt.create_full_tear_sheet(round_trips=True)

Thanks @Guy. Very interesting statistics.

I updated the post above, with a notebook run with:

bt = get_backtest('5afd4565a6737843af0292a3')
bt.create_full_tear_sheet(live_start_date='2018-03-09', round_trips=True)

This gives the in-sample/out-of-sample analyses, as Josh recommended above.

We'll see where this goes. My goal is to do nothing else, and get an allocation in 4 months. Show me the money!

@Grant, it would appear you did not let your tear sheet terminate. The “round_trips=True” option did not execute.

Nonetheless, great numbers.

Hmm? I’ll have to look into that. Should be able to set both the out-of-sample start date and the round trips jazz at the same time.

@Grant, it might take a few minutes between the two sections. And, you could also time out.

@ Guy - Should be fixed now. I see both the in-sample/out-of-sample analysis and the round-trips analysis. If something is still missing, just let me know.

Hello,
Seeing the comments on returns of 25%, I realised that:
In contest 36, the top 3 algorithms (which are the only ones to receive a capital allocation) have returns between 3 and 5%. Given that bond yields are about 3% and may rise shortly, why would a hedge fund invest any money in such an algorithm? Maybe the website needs to change its criterion for which algorithms make the cut? I mean, if the risk-free rate of return is 3%, why pay fees to a hedge fund to get you 3.05% (the 3rd ranked algorithm) or 3.96% (the 1st ranked algorithm)?
That aside, I think Kelly's criterion will flop in case of a cataclysmic event or if the market goes haywire.

thanks
-kamal

There has been feedback from Q that hedge funds don’t care about the risk-free rate but I never quite understood the perspective. Perhaps Josh or somebody else can fill us in.

@Grant, to show how difficult it is to achieve your CAGR numbers, I added the live_start_date='2016-01-15' to the example presented in Backtest Analysis Webinar with the following results (see attached notebook).

The future is often not too kind to badly designed stuff. Even if it was intended to be only for “educational” purposes.

Now, we both would agree that you have the foundation for a great trading strategy. Your notebook shows 5,310 trades generating on average a net profit of $1,248.88 per trade. And 5,310 of something is a sufficient sample size to accept the term: on average.

So, the questions would be: how can you increase the number of trades over the same time interval? And, how can you increase the spread (Δp)? Presently, you are requesting, on average, in the vicinity of a 0.5% profit. Only half a percentage point... Could you not do more?

Remember, your payoff matrix equation: Σ_n (q∙Δp) = n∙x_bar = F_0∙((1 + g_bar)^t -1). All that is required to improve your CAGR is to increase n the number of trades, and x_bar, the average net profit per trade.
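As a numeric sanity check of this identity (a sketch only: the trade count and average profit are the figures quoted above, but the $10M starting capital and the ~2.5-year span are assumptions on my part), one can solve the equation for g_bar:

```python
# Solve F_0 * ((1 + g_bar)**t - 1) = n * x_bar for g_bar.
def implied_cagr(n_trades, avg_profit, start_capital, years):
    total_profit = n_trades * avg_profit  # n * x_bar
    return (1.0 + total_profit / start_capital) ** (1.0 / years) - 1.0

# 5,310 round trips averaging $1,248.88 each, on an assumed
# $10M book over an assumed ~2.5 years:
g_bar = implied_cagr(5310, 1248.88, 10e6, 2.5)
print(round(g_bar, 3))  # -> 0.226, in the ballpark of the quoted ~26% CAGR
```

The formula also makes the improvement levers explicit: doubling either n or x_bar (with everything else fixed) raises total profit, and hence the implied g_bar, directly.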


@ Guy - Thanks. Let’s see what actionable feedback I get from Q, if any. I can certainly understand if 4 more months are required to avoid it being a potential waste of their time. Frankly, if it might be good enough as is, then I’ll look into something entirely different, rather than working to improve it. Also, I gather that every time I run a new full backtest, I reset to zero and would need another 6 months of out-of-sample data. Hence, my basic question for Q here is “What, if anything, will be required for this algo to qualify for an allocation, assuming its performance holds for the next 4 months?”

@Grant, yes. Then use your current design to make a better one for next time. Note that the two variables are somewhat independent of your methodology but are more part of gaming your trading strategy than anything else.

Just changing a few numbers controlling your trading strategy, or adding new procedures, could have a major impact. I think I have demonstrated that a number of times.

Also, at the 25% CAGR level and up, Quantopian is not the only game in town...

Quantopian is not the only game in town...

Yeah, and I'm curious if there might be a supported use case for a Quantopian Enterprise investor (where presumably the algo would run without modification). I'd inquired about this on https://www.quantopian.com/posts/important-news-for-our-community, but there's been no response yet publicly, and I've heard nothing privately.

Another avenue would be to look into zipline-live, and then have the option of either feeding to Quantopian (via the My Data feature) or to somebody else with capital (or perhaps as a signal feed to the masses). The My Data approach would seem to be more risk for Q, though, since I can't see how Q would track code changes, but maybe they don't care? I don't think that Q is fully supporting zipline-live for 1:1 compatibility with Quantopian Community/Enterprise, and the data feeds might be an issue, etc. Not the most attractive option.

Splendid, Grant, after all that you have shared with the community you deserve success, if I may say so. Also, I've always assumed that algorithms with a performance like that (i.e., in the league of Jim Simons) existed somewhere out there, but never saw real proof until now. It is good to know that algorithmic trading can deliver after all :-).

@ Tim - Thanks. It's been a fun hobby. I'm skeptical about the origin of the returns for the algo used to generate the tear sheet posted. I can't say I was at all systematic in its development. I probably need to circle back and try to sort out what might be going on, lest it go down the tube.

@Grant - I am sure you will find a rationale behind the algo's performance. Still, understanding is desirable, but serendipity is pivotal. :-)

Hi Grant,

I admire the confidence you have in this algo. Since you asked for some feedback, here's my honest but frank opinion. Your algo's volatility, at 10.5%, is I think on the high side for what Q is looking for in their specific long/short market-neutral strategy, which they plan to leverage 6-7x. Having said that, your nice high returns can only be achieved with these levels of volatility (i.e. high risk, high returns). Your stock selection filter is perhaps based on the top percentile of volatility. What this implies is that you're catching stocks that are either recent high flyers or duds with a good mean reversion algo, and together with the bullish market of the past 2-3 years, you may have created an "almost perfect storm". If current market conditions hold for, say, the next 6 months to a year, overall it is not a bad short-term strategy for those who have a risk tolerance of 10% volatility in exchange for 25+% annual returns.
I highly doubt though that this algo's performance will hold in long backtest with different market conditions. Try backtesting for years 2008-2009 and see if it holds, if it does, I'll be truly impressed!

Thanks James -

Yes, perhaps the volatility is too high. I'm running a longer backtest now (starting in 2010), and it is ugly; if that is a problem, it would be great to hear it from the Q team (presumably, they can just grab the backtest and run it). Or maybe more recent results are weighted heavily, and 2 years is sufficient?

@Grant, I do not consider your strategy's volatility as too high. On the contrary. Your strategy has a negative beta, and this changes the picture.

From your backtest '5afd4565a6737843af0292a3', your beta is -0.10, meaning that you are slightly inversely correlated to the market. If combined with a strategy having a +0.10 beta, your strategy would help bring the other's beta down closer to zero, making the ensemble close to a zero-beta strategy. This would also reduce the volatility of the combined trading strategies. However, you would see the overall strategy CAGR decrease, unless the other has a higher CAGR than yours, and as you know, that is not very common. You are more likely to see a reduced combined CAGR with lower beta and lower volatility. But that is what you were looking for in the first place. Note, however, that two years for a backtest is not much, though it does give you a nice head start on developing something even better, since at least you know you are heading in the right direction.

A hedge fund which provides a rate of return close to the risk-free rate and collects 2% plus 20% above the watermark is definitely doomed. The same goes for hedge funds and actively managed mutual funds which deliver less than the S&P 500. Ask the fund managers and they will tell you I'm telling the truth.

thanks
-kamal

You're assuming a hedge fund's role is to outperform something (as the media likes to portray). It's not; a hedge fund's role is to provide an uncorrelated return stream for people to add to their portfolios to diversify them.

It looks like many people who can afford a hedge fund actually don't want one, unless the returns are commensurate with the high fees. The same holds true for actively managed mutual funds. I'm saying the audience doesn't want that diversification at the expense of returns, and if the returns are going to be close to the risk-free rate of return, that is an even worse selling point. The best way to look at it is to compare the top performing hedge funds with the ones whose rate of return is close to the risk-free rate. Most likely, the latter are in the process of closing up shop.

thanks
-kamal

A few quick thoughts on the original tearsheet.

Sharpe Ratio seems to be decreasing as time goes on. This could be due to a new regime, model decay, or the strategy being a bit overfit. Doesn't disqualify from an allocation, but the hope would be that the Sharpe was at least consistent during out of sample.

Style factor exposures seem to be just inside the box, mostly, but have some pretty consistent patterns. I would look and try to determine why the strategy has consistent exposures to one style factor or another and decide whether that's safe/makes sense given the economic rationale.

Otherwise everything looks pretty good.


Thanks Delaney -

Yes, I noticed the SR trend (shameful, though, that you use the term "Sharpe Ratio" since, in academic circles, I gather that the risk-free rate is subtracted).
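A toy sketch of the distinction being pointed at here (all numbers are made up, and this is my own illustration, not Quantopian's computation): the academic Sharpe ratio subtracts a risk-free rate from returns before dividing by volatility, so a nonzero risk-free rate always lowers the reported ratio relative to the rf = 0 convention.

```python
import statistics

def sharpe(returns, rf_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio; subtracts a per-period risk-free
    rate, as in the academic definition."""
    excess = [r - rf_per_period for r in returns]
    mu = statistics.mean(excess)
    sigma = statistics.stdev(excess)
    return (mu / sigma) * (periods_per_year ** 0.5)

# Hypothetical daily returns (252 trading days). With rf = 0 this is
# the rf-free convention; with rf > 0 it is the academic one.
rets = [0.002, -0.001, 0.0015, 0.0005, -0.0005, 0.001] * 42
print(sharpe(rets) > sharpe(rets, rf_per_period=0.03 / 252))  # True
```

Since subtracting a constant shifts the mean but not the standard deviation, a 3% risk-free rate shaves a fixed amount off the ratio; for a high-return strategy the difference is modest, but for one returning near the risk-free rate it is decisive.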

My end goal here is to determine if the algo is good enough for an allocation as-is, or if I need to put more work into it. I guess you are saying it might be o.k.

Note that a longer backtest really sucks. If you want, I can give you permission to view the algo, roll up your sleeves, and analyze it to the Nth degree. What I really need is a high-level go/no-go on the potential for an allocation. If the latter, then specific recommendations for changes would be helpful. I have very limited time these days to faff around, but if there might be a path to money in my pocket, I could make time.

If it might be a go, perhaps you could provide guidance on how to determine and articulate the "economic rationale" required for an allocation, as I have no statement at this point.

And if you help, and I make some money, I'll kick back some to you--how does a 15% gratuity sound? Or if this is not allowed, I'll buy you a beer.

@Grant, your trading strategy will decay, and the longer the time interval, the more so. Note that it is “designed” to do so. But at least it starts from a higher level than a 3-5% CAGR. In fact, we could model your strategy's payoff matrix as:

Σ_n (q∙Δp) = n∙x_bar = F_0∙((1 + g_bar∙e^(-φt))^t - 1)

where the average return g_bar is decaying at a rate of e^(-φt). Your first question should be: how large is φ, the decay rate? And a second question: how can I reduce its impact or reverse it?

The declining “Quantopian” Sharpe ratio is more pronounced than the real one (might be about 50% overstated), therefore the decay rate might appear larger than it really is. Nonetheless, decay there will be.

What I would suggest with such a strategy is let it ride while it can. After 3 to 4 years, or after reaching a lower threshold, shut it down, start anew, and repeat the process. Use the strategy for the time interval where it performs best, even if it is just for 3 to 5 years.

BTW, it is totally normal that the “Quantopian” Sharpe will decline as your portfolio grows for the simple reason that your variance is growing at the square-root of time.
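To visualize the decay model (the φ values below are hypothetical, not estimated from the actual strategy, and the $10M starting capital is an assumption), the terminal equity F_0∙(1 + g_bar∙e^(-φt))^t can be tabulated for a few decay rates:

```python
import math

def equity_with_decay(f0, g_bar, phi, years):
    """Terminal equity under the decaying-growth model above:
    F(t) = F_0 * (1 + g_bar * exp(-phi * t)) ** t."""
    return f0 * (1.0 + g_bar * math.exp(-phi * years)) ** years

f0, g_bar = 10e6, 0.26  # assumed starting capital and initial CAGR
for phi in (0.0, 0.05, 0.15):  # no decay, mild decay, strong decay
    print(phi, round(equity_with_decay(f0, g_bar, phi, 5)))
```

The larger φ is, the more the 5-year terminal equity shrinks, which is the quantitative motivation behind the "let it ride for 3 to 5 years, then restart" suggestion above.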

What I strongly recommend is getting as wide a variety of different ideas accumulating OOS as possible. For each individual idea/strategy, nobody really knows if it's gonna hold up. The best way to spend your time is likely to try getting some new ideas up and running while this one seasons over time. If you can maintain your current stats during out of sample, we certainly might be interested in taking a look.

Basically I don't think it's as productive to really hone one strategy a ton, I think it's likely best to get something to a state where it's producing positive alpha and constrained risk exposures, and then move on to your next idea. With our new data upload feature, plus all the new data coming in as part of the FactSet deal there should be plenty of new ideas for models/strategies.

I wish I could give you more certainty now, but if I could it would mean we could predict the future and wouldn't need the out of sample. I think your best bet is to diversify your model pool and hope a few survive out of sample testing.

For economic rationale, you need to have a justification for why you believe this model works. Basically what is the underlying reasoning behind why your model makes predictions one way or the other. An example might be: "I believe that companies which make large campaign contributions will see favorable govt treatment and therefore higher profits and share prices. My algorithm therefore longs companies with high campaign contributions in the last election cycle and shorts companies with low. Additionally, I've found this model has more efficacy in the defense sector. So my strategy trades more frequently there compared to other sectors." This of course would be using data that's not on the platform currently and imported through our MyData feature linked earlier.

Thanks Delaney -

I guess I'm hearing that things are potentially headed in the right direction, with the additional feedback:

1. Too early for y'all to take a deep dive and do an assessment--6 months OOS is the minimum, so I'll check back. Seems reasonable.
2. A backtest longer than 2 years may be irrelevant (since you did not comment on that), although I'd thought the guidance was that you like to see at least back to 2010.
3. While waiting for ~3 more months, try some other stuff--good advice. I'll put this aside and try a fresh start.
4. Have some sort of plausible story about how an algo might work, in a handful of sentences (even if I don't care, since the economic rationale statement is required to get an allocation).

One observation is that it seems really inefficient to require a full 6 months of OOS testing prior to engaging with authors in a detailed way. I'm wondering if this is also how things are approached for authors who have already received allocations? If Q applies this approach uniformly, you basically have authors working in isolation. Algo A is licensed, after 6 months of OOS plus whatever else is required. Then the author comes up with a concept for Algo B (unrelated to Algo A), and wouldn't get any feedback for 6 months?

It doesn't sound very synergistic; Q has lots of expertise and potential resources and guidance, but if authors are isolated, I think the end result will suffer. Or maybe the licensing of Algo A also includes some additional terms that make it easier to work with the author on follow-on work (e.g. a non-disclosure agreement, an exclusivity agreement, etc.). It seems that rather than licensing individual algos, you'd be better off with consulting agreements that would allow broader interaction.

I kinda understand that for the algo results I posted above, more "seasoning" is required. However, I got to thinking, "Hmm? Does Q really have a policy of not engaging until 6 months of OOS can be provided on any algo?" Probably not the best policy across the board.

The other consideration is that it is not clear how authors can add to existing algos. For example, say I develop a multi-factor algo with 5 factors, and it gets an allocation. What if I find a sixth one, which alone wouldn't merit an allocation, but combined with the other 5 would provide an incremental improvement. Your system captures backtests, so I guess I would write a single-factor algo, and run a full backtest, wait 6 months, and then contact you, and claim that by adding the factor to the algo I reference above, it would be an improvement? Or do I add it to the algo above? Or something else? The mechanics of the addition of factors is not clear.

On the last point, one possible approach: the author receives an allocation after OOS, and then claims the 6th factor is an improvement. While the original runs with real money, both versions could be started in paper trading; a side-by-side comparison might provide enough confidence to adopt the new version instead. So the author could begin the parallel paper trading ahead of time. It would at least lend itself to one's own certainty. I wonder what their point of view would be.

Blue - Yeah, that kinda makes sense, but the system is really set up to treat each algo as independent, unless Q is working more hand-in-hand with authors once one of their algos is licensed. I'm wondering how Q intends to connect the dots between the most-excellent multi-factor Algo A and the addition of a factor that improves it. I guess it would just be treated as a brand-new algo? A new evaluation, and a second license, I suppose?

You're kind of getting at the heart of what Lopez de Prado talks about in his already-famous new book. He believes most quant firms fail because they have, e.g., 50 PhDs working in parallel doing the same thing, when they should have them working together sequentially, as a sort of quant assembly line.

Quantopian sort of acknowledges this and does the data preparation, simulation and execution for us, but this still leaves feature analysis, algorithm development and backtesting. I'm personally not 100% sure it's a good idea to have everyone write their own entire algorithm, while essentially just providing alpha should be enough, in theory. (I.e. what's the difference between asking users for a stock ranking at any point in time vs. asking them for an algo that basically says which stocks to own at which point in time?)