New Quantopian Contest: Daily Prizes, Enter Today

Today, you can enter the new Quantopian Contest. The new competition will award daily cash prizes to the top quantitative trading algorithms written on Quantopian. Contest entries will first be scored on Friday, February 16th, with prizes being awarded the next morning.

The new contest moves at a much faster pace: instead of making you wait months to see your performance, it delivers results - and 10 prizes - every day.

In the new contest, many more community members are going to get paid. The old contest had only 31 different winners over three years. We're expecting to eclipse that number in just the first week of the daily contest.


How Does It Work?

Criteria
The contest is designed to evaluate cross-sectional, long-short equity strategies. As such, contest algorithms are required to have certain structural properties that are aligned with our allocation process. The following criteria must be met by all contest algorithms:

  • Positive returns.
  • Leverage between 0.8x-1.1x.
  • Low position concentration.
  • Low beta-to-SPY.
  • Mean daily turnover between 5%-65%.
  • Low net dollar exposure (holds equally weighted long and short books).
  • Trades stocks in the QTradableStocksUS.
  • Low exposure to sector risk.
  • Low exposure to style risk.
  • Place orders with Optimize API.

The official rules have all the details.
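For a sense of what these criteria look like in code, here is a minimal, unofficial sketch of a contest-style skeleton. It assumes the standard Quantopian IDE environment (quantopian.algorithm and quantopian.optimize); the 5-day reversal factor is only a placeholder alpha, and a fuller template (see the Contest Tutorial linked below) would also constrain sector and style exposure via the risk model, which this sketch omits.

import quantopian.algorithm as algo
import quantopian.optimize as opt
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import Returns
from quantopian.pipeline.filters import QTradableStocksUS

def initialize(context):
    # Trade only the QTradableStocksUS, per the contest criteria.
    universe = QTradableStocksUS()
    reversal = -Returns(window_length=5, mask=universe)  # placeholder alpha factor
    algo.attach_pipeline(Pipeline(columns={'alpha': reversal}, screen=universe), 'contest_pipe')

    # Rebalance daily, shortly after the open.
    algo.schedule_function(
        rebalance,
        algo.date_rules.every_day(),
        algo.time_rules.market_open(minutes=30),
    )

def before_trading_start(context, data):
    context.alpha = algo.pipeline_output('contest_pipe')['alpha'].dropna()

def rebalance(context, data):
    # All orders go through the Optimize API.
    objective = opt.MaximizeAlpha(context.alpha)
    constraints = [
        opt.MaxGrossExposure(1.0),    # keeps leverage near 1x
        opt.DollarNeutral(),          # roughly equal long and short books
        opt.PositionConcentration.with_equal_bounds(-0.01, 0.01),  # low concentration
    ]
    algo.order_optimal_portfolio(objective, constraints)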

Scoring
Contest algorithms that meet all the criteria are ranked after every trading day based on a score. The score of an algorithm is based on its out-of-sample returns and volatility: algorithms earn a high score by achieving high returns with low volatility. The exact scoring function can be found here.

The score of an algorithm is strictly cumulative over its first 63 trading days (~3 months), starting from when it was submitted. Each day after it has been submitted, the daily return of the algorithm is divided by its trailing 63-trading-day volatility to compute a volatility-adjusted daily return (VADR). Over the first 63 trading days, the score of the algorithm is the sum of its VADRs since submission. After an algorithm has been running in the contest for more than 63 trading days, its score each day is the sum of its most recent 63 VADRs.
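As a rough, unofficial sketch of that calculation in pandas (the 2% volatility floor and the flooring of the score at 0 come from Jamie's explanation later in this thread; the exact scoring function is linked above):

import numpy as np
import pandas as pd

def contest_score(returns, oos_start, window=63, vol_floor=0.02):
    # returns: daily returns of the nightly backtest (in-sample + OOS) as a
    # pd.Series indexed by date; oos_start: the submission date. Early in the
    # OOS period the volatility window reaches back into the in-sample history.
    trailing_vol = (returns.rolling(window).std() * np.sqrt(252)).clip(lower=vol_floor)
    vadr = (returns / trailing_vol).loc[oos_start:]  # numerator is NOT annualized
    # Sum the most recent 63 VADRs (all of them during the first 63 OOS days).
    return max(vadr.tail(window).sum(), 0.0)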

Submitting
The Quantopian Contest is a continuously running competition. Entries to the contest can be submitted at any time from the IDE or the new contest dashboard (more on this below). When an algorithm is submitted to the contest, Quantopian runs a backtest on the algorithm over the last two years with $10M starting capital, and default slippage and commissions. This backtest is used to verify that the required criteria have been met. If all criteria are satisfied, the entry will receive a score after the next trading day. After each subsequent trading day, a new backtest is run starting from 2 years before the submission date, through the most recent trading day (the backtest grows in length by one day each night). The criteria are re-checked on the new backtest each night.

Entries that meet all of the criteria are considered ‘active’ entries. Participants are allowed to have a total of up to 3 active and/or pending entries (submissions that haven’t yet been evaluated). All active entries are scored after each trading day using the out-of-sample returns in their nightly backtest.

Prizes
Each day, contest participants assume the score of their active entry with the highest current score. Participants are then ranked by their score. This means that all contest participants receive a single rank, even if they have multiple active entries. Every day, the top 10 participants are awarded a cash prize and displayed on the leaderboard. Here are the daily prizes:

Rank    Prize
1st     $50
2nd     $45
3rd     $40
4th     $35
5th     $30
6th     $25
7th     $20
8th     $15
9th     $10
10th    $5

Dashing New Look

Along with the new contest, we’ve added a new contest dashboard. The dashboard allows you to track the status and score of your own contest entries. The dashboard is also where you can find the leaderboard for past days of the contest, cash out your winnings, and more.


Get It While It's Hot

The new scoring function is designed to reward algorithms that perform well consistently over a long period of time. Entries that make it to the top of the leaderboard will likely stay on the leaderboard for a while since they can accumulate score over a ~3 month window. However, at the start of the new contest, the leaderboard will be empty, and everyone will be starting with a score of 0, so the leaderboard will have more turnover. Be sure to test your algorithm to see that it meets the criteria, and then submit it by Feb. 16 @ 9:30am ET so that you start accumulating score on the first day!


Learn More

To learn more about writing an algorithm that meets the contest criteria, check out the Contest Tutorial. The tutorial provides more detail on each of the contest criteria and offers tips and references to help you write or tweak an algorithm that is eligible for the contest.

For more details on the contest rules, read the official rules here.

We will be running a webinar on Monday, February 5th, to go over the new contest, including a walkthrough of the new dashboard, the contest tutorial, and the rules.

The Quantopian Closed

The Quantopian Open is no longer accepting new submissions. Contests 33-38 will run to completion under the rules with which they began, including the old prize structure. They will run in parallel to the daily contest until their respective end dates. The 3-entry limit in the daily contest is separate from the 3-entry limit in the old Quantopian Open: if you had 3 entries in the Quantopian Open, you can still submit 3 more to the daily contest.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

112 responses

Not sure, Jamie, if my understanding is correct; to confirm:

  • Entries can be made at any time into the contest and upon entering the 63-days OOS starts afresh for that entry.
  • One may withdraw/crash out any entry at any time, and re-enter at any time from whence the 63-days OOS restarts.
  • If so, the new entry would be competing against others at varying ±63 days into the contest.
  • Prizes are awarded only after the first 63-days ie. as the VADR can be computed for scoring.

Thanks for clarifying.

Karl,

Entries can be made at any time into the contest and upon entering the 63-days OOS starts afresh for that entry.
One may withdraw/crash out any entry at any time, and re-enter at any time from whence the 63-days OOS restarts.

Correct.

If so, the new entry would be competing against others at varying ±63 days into the contest.

Correct, with the exception of the first 3 months, where others won't have that much OOS yet.

Prizes are awarded only after the first 63-days ie. as the VADR can be computed for scoring.

Prizes are awarded after the first trading day, Feb. 16. So we will be awarding prizes after the first day, when participating entries have 1 day of OOS data. The volatility calculation reaches back into the in-sample period for the first 63 days of OOS.

Let me know if you have any other questions.

Will new allocations be selected from these contests?

I feel as though it would make more sense to only score algorithms once they have run for 63 days OOS? I'm curious if the current/proposed setup will be rewarding lucky algorithms that perform well for a day or two (which would encourage participants to be constantly withdrawing and resubmitting entries). Or does the scoring look at cumulative return over the trailing 63 days? In which case a fresh algorithm will have a hard time competing against ones that have been running for a few months.

I do like that the scoring is only on the past 63 days, keeps it apples-to-apples for new and old algorithms. But I'm again confused if an algorithm that hasn't been OOS for 63 days should be considered.

On a somewhat related note, any idea when contest 32 will become finalized? I have a horse in that race and I'm curious if it's placed!

Lastly, will the leaderboard still be displaying our Quantopian assigned code names or our username?

@Kieran: The allocation process and the contest are aligned in their requirements of structural criteria. However, the allocation process still requires a longer out-of-sample period (usually 6 months) to measure performance. Allocations won't be decided based on contest performance, but qualifying for the contest, and ranking highly for an extended period of time are both steps in the right direction.

@Stephen: Good questions. At the start of the new contest, luck will play more of a factor than it will a couple of months from now. The first few weeks will only see a few days of OOS returns contributing to the score. The scoring function uses cumulative returns (not annualized), which means that fresh entries will have a harder time winning a prize later on. However, it will be possible to start winning before 3 months. We're expecting most of the prizes to go to entries that have been doing well for 3+ months, but some of them should be won by entries in less time.

We're still working on finalizing the results of contest 32. We'll likely publish the result early next week.

The leaderboard in the new contest will continue to use assigned color-animal names.

Thanks Jamie for clarifying. Now that you explained it, the time window makes a ton of sense. And I like the scoring system overall, really removes many of the frustrations I've had and seems to be better aligned with what your team is looking for. It is a tough set of criteria so I'm excited to see how people stack up against it.

One last question, will there be any history of the leaderboard or will users only be able to see that day's leaderboard (and their previous winnings, if any)?

Hi Stephen,

No problem. We're planning to add a feature that will allow you to see historical leaderboards on the contest dashboard. We're expecting to add it when entries start getting scored in a couple of weeks.

Having a hard time submitting my algorithm. Validation says I must use order_optimal_portfolio, which I already do.

must use order_optimal_portfolio

Only. We can't use stop or limit orders, for example. Make sure there are no instances of any other order method.
If any other order calls are commented out, the validator still picks them up; you can neutralize them with spaces or something like

#ord _ er_target()

Yep that's the only order function I use

Ben, is there any chance you have one of the old order functions commented out somewhere in the code? Unfortunately, the code validator will think that the ordering function is used, even if it's commented out.

That was the issue. Thank you, never would've caught that otherwise.

Hi Jamie -

Just curious - how will winnings be transferred? Sending lots of paper checks to daily winners sounds pretty clunky.

Jamie - what's the story with transferring winnings? I'm hoping to buy a sandwich or take my wife to dinner with my Rube Goldberg algo (I've added to the complexity of the risk model, optimizer, plethora of constraints, etc.). Or maybe we will have the option of shares in Quantopian and/or its 1337 Street Fund? Now that would be cool...maybe...sandwich...start-up crowd-sourced hedge fund...sandwich...start-up crowd-sourced hedge fund...can't decide...

Hey Grant, sorry for the radio silence. We're integrating with a 3rd party service to facilitate payments. I'll have more details on this either tomorrow or next week. To start out, we are expecting to pay out accumulated winnings at the end of each week.

@Everyone: The first deadline for the daily contest is tomorrow at 9:30am ET, so be sure to make your submissions tonight. And don't forget to test your algos before you submit them to make sure that they pass the criteria!

We are expecting to post the first day's results this weekend.

Hi Jamie,

I submitted an algorithm to the contest last night that has its rebalance function set for the beginning of each week. Will this mean that today my score will be 0 because my algorithm won’t make any trades? Just want to confirm this won’t negatively affect my cumulative score going forward?

Hi Joe,

When the leaderboard is updated, your algorithm will be backtested from 2 years before the submission date* up to today. Each day in the contest, that backtest will be extended by 1 day, so if your algorithm holds positions going into today, you should see non-zero returns. Does this make sense?

*Note: Any algorithms submitted in the last two weeks will have an effective submission date of today.

Thanks Jamie, that makes sense.

Hey Everyone,

The new contest is officially underway. All entries that were submitted before 9:30AM ET this morning will be evaluated this weekend. Entries passing the criteria will be scored and ranked. The top 10 participants will be awarded a cash prize, and displayed on the leaderboard.


After the leaderboard has been posted:

If you have an entry that passes the criteria, you will get an email telling you that it passed, and the entry will appear as an active entry on your dashboard. The active entry section of your dashboard will be the place where you can track the performance of your running contest entries.

If you have an entry that failed at least one of the criteria, you will get an email saying your entry was stopped, along with the reason why it was stopped. You can also find stopped entries in the pending/withdrawn section of your dashboard. You will be able to see the backtest that was used in evaluating your entry by clicking the icon next to your entry in the ‘withdrawn’ section (the withdrawn section is updated whenever the leaderboard is posted). Entries that raise a runtime exception during the evaluation backtest can also be found in the withdrawn section.

If you are one of the prize winners, you will receive a congratulatory email this weekend. You will get another email in about a week to initiate the prize payment. Winners of prizes next week should expect a similar email.


There’s Always Tomorrow

Remember, if you missed the submission deadline this morning, you can still enter the contest. Submissions to the contest can be made at any time, and will be evaluated and scored after the next trading day. The next deadline is 9:30AM ET on Tuesday, Feb. 20 (Monday is a market holiday). That said, the sooner you enter, the sooner you could win one of the 10 daily cash prizes!

Dear Q, i have been "out of the loop" for a few weeks. From what I understand, effectively the "Quantopian Open" with its potentially meaningful reward structure has now disappeared, to be replaced by daily cash prizes of $5 to $50 .... i.e. the maximum Daily prize amount is equivalent to only 0.2 points on the S&P500 futures contract. That's just not worth the effort for most people who are serious about trading. Are all meaningful rewards at Q now gone altogether, or have i got this wrong? Please advise ... anyone?

Yes, the Quantopian Open is no longer, and this has taken its place. The more meaningful award is an allocation for the hedge fund; the contest, in my opinion, is more-or-less training for the crowd to write hedge-fund conforming algos with small, regular Pavlovian incentives in the form of cash prizes.

Overall, they are paying out $275 per trading day, which amounts to $69,300 per year (assuming 252 trading days per year). Not much in the grand scheme of things...so your overall point is still valid.

Talking about the more meaningful award, which is in the form of an allocation: say it tops out at $10M to manage, of which you receive 10% of profits, in an equity market neutral strategy that historically yields one notch above 3-month US Treasury rates. Given that 5% returns should be about the average, your total potential annual pay would be $50K. Now if you're a super quant and can churn out a 20% return on a 2-3% annual volatility, that is potentially a $200K take. So as not to give false hope, I want Q to publish a model algo with long term backtest that meets all of its criteria, preferably one that was given allocation, so that we are given guidance as to what to expect. On the other hand, Q might not want to do that for the risk of being exposed, but then if the "unique product" is doing well, shouldn't one want to flaunt it for its competitors to chew on!

Hey Everyone,

The results from yesterday were just posted to the leaderboard. Congratulations to the 10 winners and to everyone who passed the criteria! The leaderboard will next be updated on Wednesday with the results from Tuesday's trading day (since Monday is a market holiday). If you have an entry that passed the criteria, you can now find it under the active entries section of your dashboard. You can track your entry's score on that page. We will be adding some other features to the dashboard next week, so stay tuned.

If you didn't pass the criteria, or if you didn't have time to enter, the next submission deadline is Tuesday, Feb. 20th at 9:30AM ET.

Good luck to everyone next week!

@ James Villa -

The idea behind Q, I think, is to attract folks who aren't necessarily doing the math on time invested/opportunity cost versus potential payout from an allocation. It is more for hobbyists, students, quant-wanna-be's, tinkerers, etc. than professionals; there is no implied full-time career path with Q (just the potential of a short-term contract). It is also important to note that an allocation payout (even one-time) in the range of $50K to $200K is a lot of money for most folks (potentially life-changing, for some), and sure beats your typical hobby that just costs money.

So as not to give false hope, I want Q to publish a model algo with long term backtest that meets all of its criteria, preferably one that was given allocation, so that we are given guidance as to what to expect.

It would be interesting to understand the alignment of the contest rules/constraints with what Q customers would be willing to invest in. For example, presumably Q is still in cahoots with Point 72, and they are on board with pouring money into algos that meet certain criteria. How do the Point 72 expectations match up with the contest rules/constraints? Taking the top ten contest algos six months from now, what is the agreement with Point 72 for picking the fund-able ones, and what are the steps to a full allocation? Or if Point 72's requirements are not so relevant, how would one know, in general, if a given algo will be fund-able, and at what level? It has always been a wishy-washy answer, but now that we have the full-blown set of constraints, maybe something more definite could be said?

Hi @Grant, thanks for your comments. Based on what I know so far, I would agree with your analysis. Regarding the sort of people you think Q is now targeting, your comment almost exactly parallels that of my wife, who has nothing to do with financial markets but who is a professional psychologist. Her comment on it today was that the target audience certainly would now appear to be students. Actually I think that it's a great incentive for young kids; a fun educational tool in line with Q's stated aim of "democratizing finance". My concern however is with the ".... what else?" part. If this "new contest" is IN ADDITION TO the more serious process of providing allocations for funding, then great! By all means, let's incentivize the participation of bright young students. But what about all of the existing people who are almost certainly not going to continue to bother participating for a few payouts of a few tens of dollars each? What is the basis now for allocations, if in fact they will even continue at all? Basically, is this an EITHER/OR situation or a BOTH/AND situation?

Hi @Jamie, clearly you have put a lot of thought & effort into organizing this "new contest", with its focus on very small fast cash payouts on a daily basis. I assume that you have done this either in response to what most participants have said they want, or perhaps because of a shift in the focus of what Q wants? It's certainly a great idea for satisfying those people who aspire to fast $5 to $50 payouts. But what about allocations? Please can you (or @Dan Dunn, or someone else at Q) give some guidance on that? Also please can you explain what is the "Point 72" that Grant is referring to?
Cheers, best regards, TonyM.

It is more for hobbyists, students, quant-wanna-be's, tinkerers, etc. than professionals;

I wonder what prevents professionals from writing algorithms for Quantopian. Why wouldn't they be interested in aiming for an allocation? All in all, they know what strategies work, and it would just be a matter of implementing them on Quantopian. Maybe that's the problem: standard hedge fund strategies require tools, data, or computing resources not available on Q? I don't know, but I would be interested to know how professionals see Quantopian.

@ Luca - Not to derail Jamie's post (Q really needs to add discussion threading), but one obvious issue is conflict of interest for professionals. Presumably, if one works at a hedge fund already, then moonlighting for Quantopian would be discouraged, if not forbidden, by non-compete/NDA type arrangements, and perhaps regulatory requirements.

@ Tony - Google "Point72 Quantopian". Point72 is/was putting up $250M to kick-start the Q hedge fund. So, presumably, what they want, they'll get. I'm assuming that a big driver here is Point72, but maybe not, if other money has become available.

Hi @Karl, certainly i have no disagreement with your philosophical overview. Also i have no problem with Q providing fast-payout incentives for students & others who want to learn about quant finance. As an educational platform Q is great. I also think people like Delaney & Jamie are right on track with that.

My concern however is with what has happened to Q with regard to incentivizing people who are interested in making a longer-term commitment to writing algos for professional use? Specifically:

1) Why was the Quantopian Open actually removed?
2) Why not a "both/and" approach for short-term daily payouts as well as the pre-existing long-term incentives (as per the old Q Open) as well?
3) Is Q actually making allocations for algos for fund use?
4) If so, then how are the allocations (if any) being made now?
5) What fund? Is Q already running its own hedge fund, or is Q assisting or setting up some other fund, as per @Grant's comment: "Point72 is/was putting up $250M to kick-start the Q hedge fund.". I had thought that Q's fund was already well and truly "kick-started" and up and running for quite some time already. Clarification please....

Perhaps someone at Q might also like to comment on the following observation:
After having watched the Q Open leaderboard carefully every day, as well as learning as much as i could about all the individual participants by reading their Forum posts, it became evident that some participants were clearly attempting to game the old system by writing algos overly curve-fitted to very recent data, so as to get immediately to No.1 position after they entered and then posting comments on the Forum like: "...so where is my money?". No doubt the new competition will be very satisfying to those sort of people, although much less so to the people who are genuinely trying to construct robust, stable, long-life algos that are actually suitable for fund use. Are "one-day wonder" algos what Q really wants? Anyone care to comment?

Hi Tony -

The basic idea behind the new contest, as I see it, is to stimulate interest in and conformance to a set of algo characteristics that would put authors in the best possible position for a fund allocation. Q has actually just gotten in place the skeleton infrastructure described by Jonathan Larkin in A Professional Quant Equity Workflow, and is now rolling it out to the masses in the form of the new contest. The recent addition of the experimental risk model has rounded things out.

My sense is that early prizes are basically a reward for putting the effort in to understand the constraints and associated tools and applying them, while as the contest progresses, legitimate long-term algos should bubble up and more consistently win prizes. Maybe gaming is still possible, but it would seem to be a lot harder.

There was an announcement awhile back regarding allocations that had been made to the fund (perhaps someone has a copy?). My sense is that perhaps $25M has been allocated across 10-15 authors with a plan to do even more in the coming year. I don't know if that is still the case.

If you do a search on "1337 Street Fund" you'll see that Quantopian has established a fund under this name.

Hi Tony,

I share the same sentiments, observations and frustrations as you have. Personally, I think part of the problem here is that the design, configuration and structure of the contests do not exactly mirror their allocation process and philosophy. The biggest flaw is the 2-year backtest, which is very susceptible to gaming or overfitting. Second would be accurate measurement of metrics. We have witnessed this happening in the old contests and will continue to see this again in the new contest structure. With all their modelling and risk experts, I cannot understand why they are not comprehending these shortcomings. I would rather that they standardize a longer backtest with, say, 8 years of training and 2 years of out-of-sample, then go and evaluate live. Aside from their established requirements and thresholds, scoring should also include other pertinent factors such as (1) CONSISTENCY - are training and OOS results similar, to rule out "luck" and "gaming" factors; (2) ORIGINALITY - is the algo not highly correlated to others; and (3) ADAPTABILITY - is the algo able to adapt to different market conditions/regimes.

The way I understand their current process is they pick some algos from winners of the contests and do further evaluation, presumably including long backtests, before they become candidates for allocation. So it begs the question: why not do it all in one shot by structuring the contest to be in line with their allocation process?

I have no problem with them giving daily payouts as this attracts more participants, which is one indicator that VC's and investors look at for valuation purposes in this kind of business model. More members = increased valuation, I get it. But I would also like for Q to retain the reward payouts for the first ($5,000), second and third prizes after the six-month live trading period. Best of both worlds!

Secondly, the contest should be targeted to anyone, including the proverbial "monkey throwing darts", as no one has a monopoly on skill, talent or intelligence in this stock market game.

Hi @James, again our thoughts here seem to be very much aligned. My concern is that Q's "new contest, fast payout" approach is almost guaranteed to encourage algos that are curve fit and have a 1-day prediction horizon. This would seem to be totally at odds with Q's previously expressed or implied concerns about wanting to get stable & consistent algos that can be trusted to continue working after their implementation in a fund setting.

Yes, we know that even having a long OOS test period does not guarantee that an algo will continue to perform the day after the test period ends, but a long test period gives at least SOME reassurance that an algo can continue to perform over time. However Q has now effectively reduced the OOS test period from 6 months to only 1 day! Are these 1-day algos what Q now wants for the fund? If so, then why the sudden almost complete about-face?

There is one aspect to your post that I had thought of briefly but not really seriously considered until now. Specifically, you write: "... daily payouts .... attracts more participants ... that VC's and investors look at ...for valuation purposes" This starts to look like perhaps Q's objective might have changed altogether. Rather than encouraging committed people to write reliable algos, if what Q wants most now is to attract people simply to increase the numbers and thereby raise the potential value of Q itself for sale to other VC's so that the current principals can flip it, then we might indeed wonder how long Q will actually continue to be around .......

@Tony: Grant's latest answer in this thread is a good one. In particular, I want to highlight this comment:

My sense is that early prizes are basically a reward for putting the effort in to understand the constraints and associated tools and applying them, while as the contest progresses, legitimate long-term algos should bubble up and more consistently win prizes.

The criteria for the new contest are aligned with what we look for in our allocation process. An allocation is still the "big prize" and the new daily contest is meant to provide faster feedback (and rewards) to community members who take the first step toward an allocation.

Regarding the 1-day horizon, the score of contest algorithms is cumulative, which means the score of competing algorithms was only generated from 1 day of OOS data on the first day of the contest (last Friday). As the contest moves forward, entries will be scored based on rolling 63-trading-day windows of OOS returns. The expectation is that there will be some entries that 'stick' on the leaderboard for a while. Here's a quick description I gave earlier in this thread:

At the start of the new contest, luck will play more of a factor than it will a couple of months from now. The first few weeks will only see a few days of OOS returns contributing to the score. The scoring function uses cumulative returns (not annualized), which means that fresh entries will have a harder time winning a prize later on. However, it will be possible to start winning before 3 months. We're expecting most of the prizes to go to entries that have been doing well for 3+ months, but some of them should be won by entries in less time.

Adding to this explanation, there will likely be participants who win many consecutive prizes, which should result in larger payouts to participants who do consistently well over a multi-month period.

Regarding the choice of a 2 year backtest over a longer period of time, most of the contest criteria are 'structural' properties of an algorithm that generally remain true regardless of the backtest length, so the 2 year backtest should be enough to determine these properties. And when it's not, the criteria are checked every day going forward, so an entry will eventually start to fail the criteria OOS.

@Jamie,

I guess I'm still a bit confused about the scoring, so I'll try to illustrate by way of some hypothetical examples and see if I got the calculations right (or wrong) below:

Entry # 1:
Entered on Feb. 16, 2018
On May 16, 2018:
3 month cumulative returns - 4%
3 month cumulative rolling 63 day volatility - 2%
score = 2.00

Entry # 2:
Entered on March. 16, 2018
On May 16, 2018:
2 month cumulative returns - 7%
2 month cumulative rolling 63 day volatility - 3%
score = 2.33

Entry # 3:
Entered on April. 16, 2018
On May 16, 2018:
1 month cumulative returns - 10%
1 month cumulative rolling 63 day volatility - 4%
score = 2.50

Questions:
1) Did I compute the scoring correctly?
2) If not, can you please provide the correct computation by way of same illustration as above?

Hi James,

The examples you provided don't have enough information to compute the score. Each day, an algorithm's score is the sum of its trailing volatility-adjusted daily returns (VADRs). A VADR can be computed as follows:

vadr = daily_return_today_non_annualized / max(63_day_trailing_volatility_annualized, 0.02)

It's important to note that the numerator in a VADR is not annualized. This was decided so that algorithms that run for longer periods of time in the contest have the ability to accumulate score. In the examples you provided, I noticed that you gave higher cumulative returns to the examples with less OOS data. Generally speaking, we expect the non-annualized returns to be higher for algorithms that have been OOS longer.

I've attached a notebook which you can use to see the score over time of a backtest. The notebook is currently configured to run on a backtest that is 2+ years long, and it considers the entry date into the contest to be 2 years after the start of the backtest. Note that the score is floored at 0, which is true in the contest.

Let me know if this helps clarify things.

[Notebook attached; preview unavailable.]

@Jamie,
Hi, and many thanks for your reply. I was under the (mistaken) impression that Q was implicitly encouraging people to write algos with an effective 1-day life, which some people would simply re-tune and re-submit daily. Thank you for your explanation. No doubt the fast cash prizes may be an incentive to some, so nothing lost by having them. As to whether that provides useful feedback with regard to robust algo development, well let's see. Mostly I'm delighted that Q is still on track with regard to the longer term goals and allocations for good algos. That provides me with the encouragement to want to come back. Thanks & best regards. TonyM.

Hi Jamie -

There is a significant difference between the backtest length typically used for assessing algos for allocations (starting in ~ 2010) and the 2-year backtest required for the contest (although I guess it ends up being longer, as the contest progresses). As you say:

Regarding the choice of a 2 year backtest over a longer period of time, most of the contest criteria are 'structural' properties of an algorithm that generally remain true regardless of the backtest length, so the 2 year backtest should be enough to determine these properties. And when it's not, the criteria are checked every day going forward, so an entry will eventually start to fail the criteria OOS.

So, if the backtest is solely for checking the 'structural' properties, I guess it makes sense. However, it is a very short amount of time for assessing the quality of an algo, and you run the risk of incentivizing bad practices by the crowd (myself included). My sense is that a better practice would be to do development as far back as the data will support, with perhaps a 2-year hold out of data.

James asked basically the same question I had asked awhile back:

why not do it all in one shot by structuring the contest to be in line with their allocation process?

Frankly, I feel as though you are being evasive. Surely, a bit more thought went into deciding on the 2-year contest backtest period? Please elaborate.

@Jamie,

Thanks for the notebook and clarification. I am still unclear, though, as to how you will put the algo entries on an equal footing for evaluation and ranking purposes given that entries have different start dates. Will you run/evaluate all algos starting with the official kick-off date (Feb. 16) and starting capital ($10M) and only account for the true OOS period based on the entry's start date? Can you please explain how that would work? My concern is those entries with later start dates have the benefit of more in-sample data and the gaming aspect of this benefit. I have seen, in the old contest format, newer entries overtake older entries because of this unwarranted advantage.

@Grant:

My sense is that a better practice would be to do development as far back as the data will support, with perhaps a 2-year hold out of data.

Many of the datasets have different start dates. If backtests were started depending on the datasets used, entries would be evaluated over different time periods. In the allocation process, dataset lifespans can be taken into consideration. To keep things simple in the contest, we decided to pick a single backtest period that could accommodate all of the datasets.

The other thing to keep in mind is that the performance (score) of the algo comes from out-of-sample returns. If an entry is completely overfit to the in-sample period (regardless of the time period), it is unlikely that it will rank highly in the contest for a significant period of time.

@James: The start date of each contest entry will be 2 years before its submission date. Each entry will start with $10M at the start of the backtest, and the score uses the returns from the true OOS period of that entry (returns after the submission date). For example, entries that were evaluated this weekend have a submission date of Feb. 16th. These entries were backtested from Feb. 16th, 2016 - Feb. 16th, 2018, and their score was generated using the returns starting on Feb. 16th 2018. If an entry is submitted on Mar. 1, its backtest will be run from Mar. 1st, 2016 - Mar. 1st, 2018, and its score will be generated using the returns starting on Mar. 1st, 2018.
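As a purely illustrative helper (hypothetical, not Quantopian code), the window described above can be written as:

from datetime import date
from dateutil.relativedelta import relativedelta

def evaluation_window(submission_date, latest_trading_day):
    # The nightly backtest runs from two years before the submission date
    # through the most recent trading day; only returns after the
    # submission date (the OOS period) count toward the score.
    backtest_start = submission_date - relativedelta(years=2)
    return backtest_start, latest_trading_day, submission_date

# An entry submitted Mar. 1, 2018, checked on Apr. 2, 2018:
# backtest 2016-03-01 through 2018-04-02, scored on returns after 2018-03-01.
print(evaluation_window(date(2018, 3, 1), date(2018, 4, 2)))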

Does this make sense?

@Jamie,

Yes, it does make sense since it measures only true OOS. However, the point I'm driving at is how do you put two algos with different OOS lengths on an equal footing when you evaluate and rank at a forward date. In your above example, the first entry had a start date of Feb. 16 and the second entry on Mar. 1. Come April 1, when you evaluate and rank the entries, the first entry has 1 1/2 months of OOS while the second entry has 1 month of OOS. Assuming all things equal, wouldn't you think the one with a shorter OOS timeframe has an unwarranted advantage over the other one?

@James:

Assuming all things equal, wouldn't you think the one with a shorter OOS timeframe has an unwarranted advantage over the other one?

The opposite, actually. The scoring function is designed to favor entries that have accumulated a longer OOS period. Remember, the score is based on cumulative, non-annualized returns. The longer the OOS period, the more time you have to accumulate score.

To make it possible for new entries to be compared on equal footing, the OOS period is limited to the last 63 trading days (~3 months). It's possible for entries to win before the 3 months, but they won't be on equal footing OOS-wise until they have been in the contest for 3 months.

@Jamie,

The opposite, actually. The scoring function is designed to favor entries that have accumulated a longer OOS period. Remember, the score is based on cumulative, non-annualized returns. The longer the OOS period, the more time you have to accumulate score.

Really, I have an equal chance to make cumulative returns of 5% in 1 and 1/2 months as I would in 1 month. Also, the longer you are in the market, the riskier it is based on the principles of time value of money.

To make it possible for new entries to be compared on equal footing, the OOS period is limited to the last 63 trading days (~3 months). It's possible for entries to win before the 3 months, but they won't be on equal footing OOS-wise until they have been in the contest for 3 months.

As in your example, the entry that started on Feb. 16 would have reached 63 trading days around May 16, while the entry that started on March 1 would have reached the 63-day mark around June 1. In my opinion, the only way you can compare the two entries on an equal footing is to recalculate the first entry as if it also started on March 1 with the same initial capital. This way, you remove the time bias, the start/end dates are the same, and the number of OOS days is the same.

@James:

Really, I have an equal chance to make cumulative returns of 5% in 1 and 1/2 months as I would in 1 month.

Wouldn't this only be true if the expected (mean) daily return of a strategy is 0%? If you had an expected daily return of 1% (extreme example), wouldn't your expected return after 1.5 months be higher than the expected return after 1 month?
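To put rough numbers on that (illustrative arithmetic only, assuming a constant 1% expected daily return and about 21 trading days per month):

mu = 0.01  # assumed constant expected daily return (extreme, for illustration)
for days in (21, 31):  # ~1 month vs ~1.5 months of trading days
    print(days, round((1 + mu) ** days - 1, 3))
# -> 21 0.232
# -> 31 0.361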

As in your example, the entry that started on Feb. 16 would have reached 63 trading days around May 16, while the entry that started on March 1 would have reached the 63-day mark around June 1. In my opinion, the only way you can compare the two entries on an equal footing is to recalculate the first entry as if it also started on March 1 with the same initial capital. This way, you remove the time bias, the start/end dates are the same, and the number of OOS days is the same.

This is a fair point, and was taken into consideration during the design of the new contest. The issue with shifting the backtest start date every day for all entries is that an algorithm's path might change from day-to-day depending on the start date. In expectation, changing the start date of the algo shouldn't change things too much, but it's more confusing for someone tracking their score on a daily basis. At some point in the future, we may bring all entries back to a common start date (for example, if there's a major rule change), but for now, we're ok comparing with different start dates given that the OOS periods are aligned, and returns are measured as a % instead of an absolute dollar amount.

@Jamie

Wouldn't this only be true if the expected (mean) daily return of a strategy is 0%? If you had an expected daily return of 1% (extreme example), wouldn't your expected return after 1.5 months be higher than the expected return after 1 month?

You've got to be kidding me. Expected returns depend highly on the accuracy of your model predictions. What if the other guy had an expected daily return of 2%? Wouldn't his expected return after 1 month be higher than the expected return of 1% after 1.5 months? And even if they had equal expected daily returns, if you account for the time value of money, the entry with a shorter timeframe will have a higher net discounted rate of return. The daily returns of stocks are neither static nor linear; they are non-stationary and nonlinear. So your assumption on expected returns over time does not hold water.

Bottom line, I just want the measurement of metrics to be fair and accurate, so we are all playing on a level field.

@James, the methods proposed by @Jamie are the most reasonable way to make the contest worthwhile for anyone, whatever their entry date.

If a strategy comes in later than yours, and manages to outperform yours on a cumulative return basis, it simply states that it was better than yours. Period.

It was able not only to catch up the lost time, but also exceed your own level of performance. And thereby deserves to be ahead.

Will there be a luck factor involved in such a contest setup? Definitely YES.

However, those starting late will be disadvantaged. The more they wait, the more so.

@James:

Expected returns depend highly on the accuracy of your model predictions.

No argument from me on this statement.


I interpreted your example as suggesting that a single algorithm should have an equal chance to make 5% cumulative returns in 1 month as it does in 1.5 months. Maybe I misinterpreted what you meant, but I thought you were trying to say that an algo's expected cumulative return is independent of the amount of time over which it is measured. Taking an extreme example, an algorithm measured over 1 month should not have the same expected cumulative return as it does over 24 months, which is why I disagreed. If I misinterpreted your statement, I apologize.

I'm having a hard time understanding the case where an algorithm with less OOS has an advantage in the scoring function over an algo with more OOS. I think it would help if I had a concrete example. Any chance you could provide a comparative example in the scoring function example I provided above? Even if they're sample return streams (not from a real backtest), I think that would help.

@Guy,

If a strategy comes in later than yours, and manages to outperform yours on a cumulative return basis, it simply states that it was better than yours. Period.

It was able not only to catch up the lost time, but also exceed your own level of performance. And thereby deserves to be ahead.

Period? Better? Yes, with the benefit of more recent data in his in-sample backtest that the earlier entry did not have. In financial time series prediction, the most important data is the most recent data, and the most important element is the accuracy of the predictions. I'm surprised you reached this conclusion. You are not considering the time bias factor and the time value of money. Is that fair? Measurement should be apples to apples, not apples to oranges.

I have seen you take other people's backtest and back track their backtest to an earlier date to show them that it fails under different market conditions. So you can do the same here, take the later entry and backtrack his backtest to the earlier entry's start date and see if it holds. Market conditions change over time and the only way to accurately measure cumulative returns / 63 day volatility is to have the same start date on the 2 year backtest because it freezes the market conditions of that period.

@James, the following chart is a crude estimation of late contest registration. The cumulative return is randomly generated, and some of the starting dates are delayed.

The point I was making, which was easy to picture, is that the delayed entries are at a disadvantage.

@Guy,

I'm glad you posted this randomly generated chart. Why do you conclude that the delayed entries are at a disadvantage, when you can clearly see that some of the delayed entries beat some of the earlier entries? The delayed entries are in the middle of the extreme spectrum of the earlier entries. What does that tell you?

@ James -

Take two equivalent good-performing algos. The one that starts later will win less money. Seems like a fair deal. I don’t see any advantage or gaming to be had by starting late.

@Jamie,

Most of my explanation is articulated in my above answers to Guy. It is not the scoring mechanism that is flawed; it is fine since it measures only true OOS. The issue is the measurement of OOS performance in terms of time. From a modeling/backtesting standpoint, and knowing how market conditions change over time, the earlier entry is at a disadvantage simply because some of its OOS data (data which, by strict definition, its backtest hasn't seen) is being used by a later entry as in-sample data. Given that market conditions change over time, the most recent data is the most valuable in-sample data for the backtest. This is what I call the unwarranted advantage of the later backtest. Now if all entries started on the same date for their two-year backtest, they would all be using the same in-sample data, with market conditions frozen in time, and the measurement of OOS performance would be level.

@Grant,

Take two equivalent good-performing algos. The one that starts later will win less money. Seems like a fair deal. I don’t see any advantage or gaming to be had by starting late.

The advantage of a later start is having the most recent data available as in-sample data for the backtest. For the early starter, this is already OOS data. As I mentioned above, given that market conditions change over time, the most recent data is the most valuable in-sample data for the backtest. So given two equivalent good-performing algos, the late starter has the benefit/advantage of having more recent data, which represents the most recent market conditions, now included in his in-sample backtest. Hope this explains it.

@ James -

I'll sleep on it, but it sure seems like your basic foot race. Unless the guy who starts later can run faster than everyone else, he'll have a hard time getting to the head of the pack.

I guess if you think there's a big advantage, just wait to enter the contest...I'm in now.

@James, very simple answer. Look at it from start to finish for each participant. Each player had a randomly generated CAGR. Some went up while others went down. And randomly meant you could not know who would be on top or do miserably. However, they would all reach the end point.

Someone registering late like (P3, the yellow line), can still outperform others like the blue line (P4). But would still be at a disadvantage if compared to P1 or P2.

Pressing F9 would generate a new set of curves with different random returns, different performance levels. Nonetheless, the late entries would still underperform, even if they could beat some of the other entries.

If a trading strategy comes in late and outperforms all the rest, then it needed a higher CAGR than everyone else to do so, and thereby deserved to finish on top. If you took P5, which came in the latest on that run, it hardly had any time to show its merits.

Coming in 3 months late in a 6-month contest is not an advantage, even if you had 11% more data to consider. And, if you did catch up and beat everyone else, then you deserved to win since I would consider that strategy as the best of the group, all other things being equal.

The knowledge gained by the increased in-sample data could go both ways. My estimation is that in this type of contest, it will not have any significant impact. In fact, almost none at all.

It is only if, and it is a big if, your trading strategy has real predictive powers that you could outperform others, even if you came in late. Still, the task becomes harder and harder the longer you delay your entry.

You already had 2 years to show what you had. And the other late guys will also have 2 years to show they had it too.

Starting late will still give them the presented chart.

I agree with @Grant's observations.

Hi Jamie, a few questions about the criteria. Forgive me if these have been beaten to death.

Leverage between 0.8x-1.1x -- I thought it was standard practice for quant funds to significantly vary their leverage based on their models. I understand the reason for the upper bound, but wouldn't an algorithm that scales its leverage based on a measure of confidence in its alpha signal be preferable to (and better performing than) one that maintains constant leverage? I mean if the algo is throwing up its hands and saying "this is just going to be a wash today" then what's the point in risking the money? Seems like this constraint would remove good algorithms from consideration, or they might have to introduce inefficiencies in order to not be DQed. If the worry is that people are underleveraging in order to improve their volatility stats, doesn't the volatility-adjusted daily returns gauge account for that now? If an algo with 0.6 leverage outperforms an algo with 1.0 leverage, are you guys really going to dock its score? Market conditions and associated opportunities are constantly in flux -- not sure why you'd want to force rigidity onto something so fluid.

Mean daily turnover between 5%-65%. -- Why does this matter? I can understand that the problem with low turnover is you would need a longer backtest and OOS in order to generate enough data points to be statistically viable. (Considering that, wouldn't it make sense to make the OOS requirement something like "6 months and at least 2000 trades"?) I also realize that high turnover adds frictional costs, and so an algo with a higher turnover is at a disadvantage. But if an algorithm with a 75%+ turnover delivered alpha that far eclipses those frictional costs, then what's the advantage in leaving money on the table? Again, this constraint seems like it discourages people from pursuing potentially profitable strategies. Seems the constraint could also discourage a dynamic algo from reacting quickly to changing market conditions. More rigidity.

Positive returns -- At the very least I really think this should be excess returns over the risk free rate. I don't think positive is enough. And you only require it positive over a two-year window? I think you guys can be a little more strict with this one! You should DQ based on drawdowns--depth and length--as well.

So on the new leaderboard we can't see returns, sharpe, beta or any metrics at all? That's not as fun. What are you guys trying to hide and from whom? :P We don't have names, we don't even have metrics. It's the most obscure competition ever. Haha Somebody (but we don't know who) is in first place because of something (but we don't know what).

Why is there a download link for a CSV file?

@Everyone: We just published the result from yesterday's market day. The leaderboard now shows the winners from yesterday. We are working on adding a feature to the dashboard that will allow you to display results for past contest days. In the meantime, only the most recent result is displayed.

@Viridian Hawk: You can find an explanation for each rule in the contest tutorial. Each lesson has a 'Why Is It Required?' section. Some of the rules you listed boil down to limitations of the risk model. The risk model was designed to determine risk exposure for algorithms that hold positions at the end of the day, and hold positions for at least 2-3 days. Some of them are there to match the cross-sectional, long-short equity algorithm structure. Regarding the positive returns requirement, we decided to keep things simple to start. If we learn that the rule is not strict enough, we might decide to change it later on.

Right now, the downloadable .csv file only includes scores, but we're working on adding more information. The plan is to include the metrics that go into computing the criteria in addition to the score in the .csv file.

@Jamie, can I play something back to you so it's clear. The contest is in warm-up mode where most/all the participants only have 2 days of VADRs, which is an expanding window until 63 days, when it becomes a rolling window?

I like the contest by the way. #hatersgonnahate

Hi Dan,

That sounds right to me. Folks who submit to the contest later on will go through a similar 'warm up' period (expanding window) until they hit the 63 day mark when that window starts to roll.

Hi Jamie,

Any chance you could provide a comparative example in the scoring function example I provided above?

Here's an example of the same algo submitted at two different start dates, which illustrates how a later start can be an advantage.
This notebook is for the early start:

[Notebook attached; preview unavailable.]

And here's the one started at a later date.

[Notebook attached; preview unavailable.]

@ James - with a crystal ball to predict the future, then yes, there could be an advantage delaying entry to the contest. Otherwise, I still don't see it. Perhaps you are onto something, but it seems like it only applies if one has super-human predictive powers. It could turn out that for a given contest period, folks who enter later end up doing better, but it could just as well go the other direction, with those who enter early doing best. Given that the scoring is based on cumulative returns, I kinda think that there isn't going to be a "sweet spot" for entering the contest--earlier is better. Certainly, waiting too long is bad, since the most that can be earned in a given day is $50.

@Grant,

The above notebooks are actually a clone of your algo, so you must have super-human predictive powers, just joking! The one thing that you guys are not taking into consideration is that the contest is designed to be perpetual and continuous, with no designated end date. We are dealing with a time series that is non-stationary and nonlinear, where market conditions change over time. Given these conditions, one can theoretically find a "sweet spot", especially when there are regime shifts, and incorporating the most recent data in a backtest will give one an "edge" going forward.

If you observe some of the more mature contests out there like Kaggle and Numerai, they give everyone the same datasets to backtest and predict on. They also judge based on the same OOS periods. This is the proper measurement methodology: apples to apples.

Here's your algo, for completeness:

[Backtest attached; preview unavailable. Backtest ID: 5a8e82a80e9e8c46fbc8bd5f]

@ James -

The Quantopian contest does have a standardized data set -- the required QTU universe.

You may have a point regarding comparing apples-to-apples. Q is talking about sharing a lot more statistics (as they have in the past). By normalizing the start date across algos, one could then draw more definite conclusions. It would also put everyone on an equal footing for evaluation of the trailing "structural properties" of the algo. The present contest and prize structure aside, if one were tasked with picking out the best M algos after N months of out-of-sample data (which is not quite the present contest), having a common start date would make a lot of sense.

It is a reasonable question why Q did not bake this into the contest rules, so that the final ranking would be apples-to-apples. My guess, though, is that they will run a different evaluation on all of the algos to assess their viability for an allocation. It will go back to 2010 (or as far as the data will support). And other criteria beyond the pseudo-Sharpe ratio will be applied, as well. For whatever reason, Q just alludes to the fact that the contest is now better aligned with their evaluation process, versus actually describing it to us in detail, and then showing how the contest overlaps with it. I guess that's all part of the shtick of being a hedge fund--mysterious alchemy is a big part of the marketing.

By the way, my algos do predict their own future...consistently miserable returns! A perfect shorting opportunity.

We just added a calendar to the dashboard which allows you to see the results from previous days. The calendar also lets you look at the scores of your entries from previous days if you navigate to Active Entries.

The results from yesterday's trading day were just published. The result published today will show up with a date of yesterday on the dashboard, since it is the result from yesterday's market activity.

@James (responding to your question from the other contest thread here for continuity): I think I understand your point a little better now. My understanding is that you're saying strategies making predictions farther into the future from when they were initially written/trained are at a disadvantage compared to those written closer to the OOS period being tested. If I'm not mistaken, this is sometimes referred to as alpha decay. If that's what you're referring to, I agree with you. That said, my understanding is that alpha typically decays over a period longer than a couple of months. I also believe that the benefit of getting a head start and accumulating score for a month will generally outweigh the advantage of submitting a model trained 1 month closer to the contest date. I think Grant's suggestion from an earlier response is a good one. If you think the advantage of waiting outweighs the advantage of accumulating score, you can wait to enter the contest until a later date. I personally recommend the opposite tactic.

If you think the advantage of waiting outweighs the advantage of accumulating score, you can wait to enter the contest until a later date

Or restart your algo regularly when it hits adverse conditions.

Hi Jamie,

Sorry, I posted in the wrong thread. Yes, you are correct in your understanding of that part, the alpha decay. However, the pace or rate of alpha decay depends on many factors, among them but not limited to: changes in market conditions over time, how the algo responds to those changes, major regime shifts, getting no benefit from the most recent data, etc. So assuming that alpha typically decays over periods longer than just a couple of months is not really statistically valid.

But my more important point is the matter of proper measurement, and this hinges mainly on the fact that we do not really know what the future will bring in terms of price action. To erase or neutralize that uncertainty, OOS data should be measured and ranked with the same start and end dates.

The above notebook based on Grant's algo clearly illustrates that the contest can be "gamed" by starting or re-starting at a later date. All one has to do is update his/her backtest every day and see if re-starting at a later date would outscore the original date; if it does, he/she can withdraw the original entry and re-submit it with a later backtest start date. I don't think this is the intention of the contest. It is not too late to change the format of the contest to reflect a more equitable and proper way to rank the performance of the algos; your fine print gives you a lot of leeway to do this at any time. Please rethink this point very carefully to avoid future protests and complaints.

@James, the winner of the contest will have for CAGR equation: g = (F(T)/F_0)^(1/t) – 1. As will any other participant, for that matter.

You are stating that a contestant getting in the game late has an unfair advantage from having more OOS data to analyze. Maybe, but that is unimportant. What is important, however, is the 6-month deadline.

Comparing contestants a and b gives:

F_0∙(1+g_a)^(t-τ) -1 >? F_0∙(1+g_b)^(t) -1

where t = 6 months, and τ ≤ 6 months – up to 63 days. As a late entry, one is subject to the 63-day rolling window (3 months of data).

The later a person gets in the game, the more difficult it becomes to win (not enough time to win). Second, and maybe more important, there are not enough trades to win either, since the winner will also have for equation F(T) = F_0 + n_b∙x_b_bar.

The only way to compensate is to actually have a higher performance level: g_a > g_b. And if such is the case, then bravo, that person should win.

@Guy,

You have things mixed up:

the winner of the contest will have for CAGR equation: g = (F(T)/f_0)^(1/t) – 1

Per contest rules as quoted above:

The score of an algorithm is strictly cumulative over its first 63 trading days (~3 months), starting from when it was submitted. Each day after it has been submitted, the daily return of the algorithm is divided by the trailing 63-trading-day volatility to compute a volatility adjusted daily return (VADR). The score of the algorithm over the first 63 trading days, is the sum of the algorithm’s VADRs since submission. After an algorithm has been running in the contest for more than 63 trading days, its score each day will be the sum of its most recent 63 VADRs.

And where did you get the 6 month deadline?

Again per contest rules quoted from above:

The Quantopian Contest is a continuously running competition.

You also don't seem to grasp how it can be "gamed", as I articulated in my above post showing that Grant's algo, when started a month later, outscored the original entry date. Same algo, different start dates, same end date: why did the later start outscore the earlier one? Simply because one can pick and choose a start date that is favorable to him/her with regard to the scoring mechanism. Your closed-form formulas are too linear and do not apply.

Hi James -

Maybe I haven't fully grasped your observation yet, but as I understand it, you would like to see a fixed starting date for all entries for a given contest period, with the idea that this would preclude a certain type of unfair advantage (contest "gaming"). First off, since prizes are awarded on a daily basis on cumulative VADR, there is no "winning" at the end (unless the entrant with the most cumulative winnings is crowned the "winner"). If I understand correctly, after 63 trading days, Quantopian hits the reset button (with potentially tweaked rules), and it all starts over again (presumably, one has the choice of rolling over algos from the prior 63-day period, or submitting new ones). You contend that allowing late entries into a given 63-day period is unfair, but it is not clear that it is, given the way the contest is constructed.

If anything, the rules allow users to apply some judgement. If an entrant has evidence that the source of his alpha has decayed, but he has found a new one, then he can swap out algos, but the penalty for such a change is that the cumulative VADR for the new submission starts from the submission date, so getting into the top 10 will require an increasing level of skill as the 63-day period progresses. I wouldn't call this unfair. For it to succeed as a strategy, skill is required that actually is germane to writing algos for an allocation--being able to detect and react to alpha decay.

The other thing is that allowing late starts provides an entry point for folks to plug into the game, without having to wait. We'll see how things play out, but providing for the possibility of latecomers to make a few bucks, just at random, would seem to be a better approach (since it is to the benefit of all to have lots of interest in Quantopian).

@ Jamie -

Once the 63-day period is over, it would be interesting to see an in-depth summary of the results. Ideally, you would put out a data set for the crowd to analyze. For example, a certain segment of the crowd is passionate about computing the Sharpe ratio conventionally. So, if you were to provide data that would allow them to do this, then it might be revealing. The other thing that would be interesting would be to scatter-plot the reward versus risk for all of the algos (e.g. expected return versus standard deviation in returns), in the framework of Modern Portfolio Theory. This may (or may not) reveal the relevance of the risk-free rate to the Quantopian framework, among other things.

It would also be interesting to put the new contest, with its multitudinous constraints, in the context of the nice work that Thomas Wiecki, et al. did a while back (see All that Glitters Is Not Gold: Comparing Backtest and Out-of-Sample Performance on a Large Cohort of Trading Algorithms and this post). Presumably, you have a hypothesis that the new contest will help with the over-fitting problem; you could see if there was any improvement over the historical Quantopian baseline, or if over-fitting is just as prevalent.

Hi Grant,

Let me try to give you a hypothetical example of a specific source of alpha decay: a market regime change from momentum to mean reversion. Say you have an algo that performs very well in a 2-year backtest under current momentum market conditions, and you entered the contest on Feb. 16. Your performance score is doing very well up until March 1, when the market takes a sharp turn (much like what happened two weeks ago) and your alpha starts to deteriorate rapidly. Another contestant, recognizing a possible regime shift, creates an algo tuned to possible mean-reversion conditions, incorporates the most recent data into his 2-year backtest, gets good performance, and enters the contest on March 16. Come the June 16 scoring (approx. 63 days later), we see the first algo's alpha decay while the delayed-start algo's alpha increases and thus overtakes it. Here we have two algos of opposite styles, started at different dates and scored at the same end date, and the one that wins is the one with the delayed start because his algo is more in tune with new market conditions. Historically, market regimes persist for a while, so the early entrant, now recognizing that his original algo is already out of tune with current market conditions, withdraws the entry, readjusts his algo, and re-enters the contest. Now we have a cat-and-mouse game of trying to fit to current market conditions. That is why, in my opinion, the only proper way to measure performance under this scoring system is for the OOS to have the same start and end dates, because then you are measuring performance under the same exact market conditions without any time bias.

the only proper way to measure performance under this scoring system is for the OOS to have the same start and end dates, because then you are measuring performance under the same exact market conditions without any time bias

I'll have to continue to mull it over, but I think there is already a built-in penalty for late starts, since the payout is essentially continuous, based on OOS performance. I agree that if the objective were to take a basket of algos and compare them apples-to-apples, one would want to have exactly the same in-sample and OOS test periods. In the end, keep in mind that Q will have all of the algos to analyze for their fund (any algo with a full backtest can be analyzed with their automated screening system), and they can apply exactly the analysis you are proposing. For the contest, though, making late entrants wait 6 months doesn't make sense--give them a shot at making some money. And allow current entrants to swap out algos, if they want, for whatever reason. It's not just about a scientific apples-to-apples comparison of algos (which, as I understand, will be done by Q as part of their automated evaluation process); there's an element of fun randomness to it.

I'd say anyone with real predictive powers should not waste their time on the Q contest. Just take their life savings and go long or short on the market based on their forecast, and retire early.

Here we have two algos of opposite styles, started at different dates and scored at the same end date, and the one that wins is the one with the delayed start because his algo is more in tune with new market conditions.

I agree the contest format by design favors YOLO algorithms that outperform in spurts that happen to align with the contest window. This problem, however, persists regardless of start time -- and I've argued this before -- because each author can, per your example, enter both a momentum and a mean-reversion algorithm concurrently. No matter the market regime, the author can have an algo that will at least temporarily outperform an all-weather algorithm.

The contest format encourages risk, and rewards the lucky, to the detriment of the fund.

To combat this, I think a number of secret backtest periods representing various market regimes and stresses should be evaluated as well, and the scoring shouldn't be cumulative; rather, it should take the worst of the bunch, e.g. scores of 30, 90, 80, 90 would be 30.
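Something like this, as a toy sketch in Python (the period scores are made up, and nothing like this exists in the actual contest scoring):

def worst_of_periods(period_scores):
    # Score an entry by its weakest showing across several hidden
    # stress/regime backtest periods, instead of a single cumulative score.
    return min(period_scores)

print(worst_of_periods([30, 90, 80, 90]))  # -> 30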

The contest format encourages risk, and rewards the lucky, to the detriment of the fund.

I have the feeling that Q knows this. One of the objectives, I gather, is to first get the crowd all aligned on the workflow and constrained framework, and then worry about algos that will show consistent 20-year-plus returns. And they needed to replace the real-money trading offering with something casino-like, right? I wouldn't confuse marketing and herding cats with sound investment practices--in the new contest, we have a mix of both.

I also suspect that initial investors in the 1337 Street Fund may be looking for something off the beaten path, kinda speculative. So, encouraging some short-term thinking on the part of Q quants might be just the ticket.

Hi Jamie,

Attached is the algo backtest for the second notebook.

[Backtest attached; preview unavailable. Backtest ID: 5a8e85d40acd7c46418d60bb]

Thanks James,

I realized that some of the disagreement on the 'enter today' or 'enter later' discussion might be stemming from a bug in my scoring notebook (thankfully, I'm not the one implementing this in the actual contest :) ). The contest rules say that your minimum score on any day is 0. This rule is in place so that no one has to withdraw and resubmit an entry if they end up in negative score territory. I implemented that rule incorrectly in my notebook shared above. Here's a correct version of the notebook with the two backtests of Grant's algo that James shared above. Note that the score is exactly the same once the algos are both being scored. Admittedly, this is a bit of a theoretical example. In reality, the two algorithms might take different paths depending on their start date (usually only with minor differences), but I think the idea that entering earlier is better still holds. There's definitely an argument to be had about how long a model stays predictive. If you think your algo is no longer predictive, then yes, it might be worth withdrawing it and submitting a new version. But in most cases, I believe leaving a submission running is to the advantage of the participant.

@Grant: You made a comment earlier:

If I understand correctly, after 63 trading days, Quantopian hits the reset button (with potentially tweaked rules), and it all starts over again (presumably, one has the choice of rolling over algos from the prior 63-day period, or submitting new ones).

There's no reset period after 63 days. After a submission has been in the contest for 63 trading days, its score becomes a rolling computation. Each day, an algo's score will be the sum of its VADRs from the previous 63 trading days. Of course, we will revisit the rules on a regular basis. If major changes are required, we might consider resetting the leaderboard, but it's not something that we plan to do on a scheduled basis (or soon). Minor changes may be made on occasion, but these shouldn't require a full restart.
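For concreteness, here's a rough sketch of the scoring mechanics as described in this thread (illustrative only, not the production implementation; the names contest_score and the exact volatility calculation, as well as how the zero floor behaves inside the rolling window, are assumptions based on the discussion above):

import numpy as np
import pandas as pd

WINDOW = 63  # trading days (~3 months)

def floored_cum_sum(vadrs):
    # Running sum of VADRs, floored at zero each day; returns the final
    # value as a scalar so it can also be used inside rolling().apply().
    score = 0.0
    for r in np.asarray(vadrs):
        score = max(score + r, 0.0)
    return score

def contest_score(daily_returns, submission_date):
    # daily_returns should include the ~2 years of backtest history so the
    # trailing 63-day volatility is defined from the first scored day.
    trailing_vol = daily_returns.rolling(WINDOW).std()
    vadr = (daily_returns / trailing_vol).loc[submission_date:]

    scores = {}
    for i, day in enumerate(vadr.index):
        # Expanding window over the first 63 scored days, rolling thereafter.
        window = vadr.iloc[max(0, i - WINDOW + 1): i + 1]
        scores[day] = floored_cum_sum(window)
    return pd.Series(scores)

Applying the floor inside each window is also why a floored_cum_sum used with pandas' rolling().apply() has to return a single value rather than a Series.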

Hi Jamie,

Please attach the new corrected notebook.

Ah, sorry. Here's the notebook.

[Notebook attached; preview unavailable.]

Hi Jamie,

I tried to move the start dates further back (June 1, 2015) and ran your corrected notebook, but it gives errors. Please check.

[Notebook attached; preview unavailable.]

@James: Can you share those backtests so I can run the notebook?

Hi Jamie,

Well, after 63 days, newcomers will be at a pretty severe disadvantage until another 63 days pass. Then both old and new algos will have scores based on 63 days total. I guess this is reasonable. If an algo has staying power, then it deserves to continue to run until finally potentially being displaced by the new kid on the block. It is also consistent with your 6-month out-of-sample rule of thumb for evaluating algos for the fund.

By the way, I won $35 today...I've met my first goal of being able to buy a sandwich (and then some) from Q winnings/earnings. Thanks.

Jamie,

I’m curious why the scoring uses a sum of daily volatility-adjusted returns instead of looking at the trailing 63 days of cumulative returns (and then adjusting for volatility by simply dividing, like you currently do)?

By taking the sum of daily returns, aren’t we failing to take full account of how compounding works and, in effect, not weighting losses heavily enough?

As I was trying a few different risk management techniques, I would run the notebook to check/score my backtests. There were a couple of times I’d be comparing two algorithms where one had better metrics in my/an investor's mind (Sharpe ratio and total returns) but a lower score. These two algorithms had the same volatility of around 0.06. The contest may be missing algorithms that are actually performing better and are, I’d think, more desirable from an investor's standpoint.

Just curious if I’m missing something and/or this was considered?
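To illustrate the compounding point with toy numbers (hypothetical, exaggerated return streams; the volatility adjustment is ignored here so the two streams can be compared on raw returns):

import numpy as np

choppy = np.array([0.03, -0.027] * 10)   # large offsetting daily moves
steady = np.array([0.00135] * 20)        # small, consistent daily gains

for name, r in [("choppy", choppy), ("steady", steady)]:
    simple_sum = r.sum()                 # what a sum-of-daily-returns score sees
    compounded = np.prod(1 + r) - 1      # trailing cumulative (compounded) return
    print(name, round(simple_sum, 4), round(compounded, 4))

# choppy: sum 0.030, compounded ~0.022
# steady: sum 0.027, compounded ~0.027
# The simple sum ranks the choppy stream higher; compounding ranks the steady one higher.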

To optimize scoring, would this work? Apply various scoring methods to past contest entries (for contests that ended prior to one year ago) to determine which ones would float to the top, then backtest the top ten from each contest through the past year and see how they do. Or, the reverse: do the backtesting first and determine which scoring method would best identify those that do best.

Part of the struggle here may be that the contest, in my estimation, has a mish-mash of objectives, so any discussion of a kind of global optimization of the contest rules and rewards would require getting the objectives straight in the first place. Part of the grand Q experiment is to see if they can replicate some semblance of a traditional "brick-and-mortar" hedge fund with global crowd-sourcing of analytical talent. They are effectively trying to plug into an existing market, with a little twist in how they approach development.

One of their challenges is engagement and motivation. They advertise 160,000 members, yet we only have ~60 of them participating in the contest (rounding up, that's 0.04%). Perhaps the participation level in writing and submitting algos for an allocation is much higher, but if that is the case, I would think it strange for those members not to just click a button and enter the contest. Another challenge is that I suspect most of the algos they've gotten to date have structural problems, vis-a-vis what is expected by the market. Hence, we have the various constraints and their associated tools.

The other challenge, potentially, relative to other hedge funds, is that Q is in a position where they need to fund full-up scalable strategies, along the lines of what is described in A Professional Quant Equity Workflow. My sense is that in a hedge fund that is up and running with lots of capital, the focus would be more on individual sources of accretive alpha, versus tasking quants to write diversified algos that do everything. The contest is structured to encourage soup-to-nuts algos that are stand-alone, versus individual alpha sources (e.g. ones that focus on individual industry sectors).

There is also a big time-scale difference between what is needed to get any statistical certainty in a given strategy and achieving shorter-term objectives. To know if a given algo really has legs, it probably takes something like a 10-year backtest, followed by 6 months out-of-sample, then seed money for another 6 months, followed by an initial injection of working capital. In 2 years, looking at a 10-year backtest plus the real-money trading, one could draw a conclusion. So, the relevance of performance on the time scale of a contest that would capture and hold the crowd's attention has to be considered. For the contest to work, it needs to have a strong element of regular Pavlovian reinforcement, but also have some overlap with long-term objectives. In ~2.5 years, it would be interesting to see if the 1337 Street Fund is composed of lots of prior contest algos--this would be the best measure of success for the contest.

The new contest rules are forcing me to rewrite strategies that did well for over 12 months but are ineffective now. Kind of back to square 1 but now equipped with python language knowledge.

Hi Jamie,

Sorry for the late reply. I just needed to see if the correction holds with longer OOS data. Here's the first backtest, start date June 1, 2015:

[Backtest attached; preview unavailable. Backtest ID: 5a95900230b6c845a4e7a8d1]

And here's the other start date: January 4, 2016.

[Backtest attached; preview unavailable. Backtest ID: 5a8e82a80e9e8c46fbc8bd5f]

James, it looks like I needed to use a version of floored_cum_sum that returned just a value instead of a Series for the rolling component of the computation. The attached notebook seems to work.

[Notebook attached; preview unavailable.]

Thanks, Jamie!

@ Jamie -

Any thoughts on a path for parlaying winnings? My original plan was to make some money in a Q contest, and then effectively keep it with Q, by setting up a Robinhood algo. Alas, this is no longer an option (presumably, still a done-deal), so I'm trying to sort out what to do with my $35. I realize that there are regulatory impediments to allowing mere mortals into the hedge fund club (one has to be "qualified" with a series of financial initiation rites, and presumably you have set a high minimum investment), but I'd think that with a little cleverness, y'all could figure out how Q members could roll contest winnings into the 1337 Street Fund, no? Alternatively, if Q has been engaged in the Zipline.IO project, I'd be curious what your take is on the feasibility of parlaying contest winnings (and hedge fund allocation earnings) on this Q-compatible platform.

Generally, it would seem that for both contest winnings and hedge fund allocation earnings, it would be in Q's best interest to provide a path to parlay the money within the Q ecosphere, but maybe that didn't work out from a business standpoint?

Grant, the cash prizes will be paid out to winners directly. We don't currently have plans to change that path. Contributing to the Zipline live project sounds like a nice idea. It looks like the contact page is a good starting point if you'd like to contribute.

Jamie,

Any feedback on why the sum of daily returns was chosen over a trailing 63-day cumulative return for the contest scoring? As I mentioned, and I’m sure we’re all aware, an algorithm with higher average daily returns isn’t necessarily going to have higher cumulative returns. I would have thought a trailing 63-day Sharpe ratio (but still putting a floor on volatility - great idea there!) would have made sense and would allow better comparisons to other financial instruments.

Also is there a particular time we can expect the contest leaderboard to update? The old contest updated at a fixed time which was nice.

Hi Stephen,

Sorry that I missed your question earlier. You're right that the compounding effect isn't taken into account when the daily VADRs are summed together. As you suggested, generating the score from the trailing cumulative returns would include the compounding effect. There's already an advantage for entries that have been running in the contest for a longer period of time (more time to accumulate returns). Adding a compounding effect could make it even harder for new participants to win in the 3 month span.

That said, your suggestion is interesting. At regular intervals, we'll take a look at winning algorithms, the scoring function, and the criteria to see what's working and what's not. I'll keep your suggestion in mind when we review the scoring function down the road. It's hard to gather much from the results at this point since we don't have much data, but we should be able to learn more when we've collected data from a few months of competition.

Regarding the update time, the daily leaderboard is currently kicked off at around 8am ET. The leaderboard can't be published until all entries have finished running. Backtests are capped at a 5 hour limit, but there's also time needed to spin up servers, verify results, etc. In general, I'd say you can expect to see results by around 3pm. We're hoping to push forward a little bit in the coming weeks.

Hi Jamie,

It would be nice if some measure of the backtest translated to the contest. The performances need to be merged in some way, even if they are not out of sample for the same time periods. It would be nice if the 2-year backtest performance could be used to smooth the short-term contest score.
Considering Q prefers low volatility, I would say a score like min[backtest_annualized_score, contest_annualized_score, (backtest_annualized_score + contest_annualized_score)/2.0] is a simple way to smooth the score and provide a more realistic measure of what one might expect to see long term.

The problem with using a short-term measure to judge an algo is that it might not necessarily hold up over the very long time frame that you are seeking for an allocation.

-Leo
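A toy sketch of the blend suggested above (the input scores are hypothetical; nothing like this exists in the current scoring):

def blended_score(backtest_annualized_score, contest_annualized_score):
    # min of the backtest score, the contest score, and their average.
    # (The average always lies between the two, so this reduces to the
    # smaller of the two scores.)
    avg = (backtest_annualized_score + contest_annualized_score) / 2.0
    return min(backtest_annualized_score, contest_annualized_score, avg)

print(blended_score(2.0, 0.4))  # a strong backtest can't carry a weak OOS stretch -> 0.4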

Thanks Jamie for the feedback, makes sense to adjust a bit after you've let it run for a few months.

I will second Leo's suggestion that some inclusion of backtest results would be beneficial and may help reduce the element of luck. I know you are trying to discourage overfitting, but it is frustrating to see contest winners with terrible backtests - those seem like obviously lucky 3-to-6-month stretches rather than a truly good and robust algorithm. Taking the minimum of the backtest score and the OOS score may help get the benefit of measuring consistency without rewarding overfitting.

Lastly, I posted in this thread too but I am receiving an error in live trading when using the risk pipeline. The reason I mention it here is that checking on live trading results would be nice to quickly understand where your score will move in the contest instead of waiting until 3 PM. And as I mentioned in the other thread, I find seeing live trading performance as immensely beneficial in the development of a sound algorithm.

it is frustrating to see contest winners with terrible backtests

For the new contest, we have no visibility of the contestant backtests (other than that 70 algos managed to pass the constraint tests).

I think the question is, if an algo meets the constraints for 2.5 years (2-year backtest plus 0.5 year out-of-sample), and has a realistic, decent Sharpe ratio (0.5 to 1.0 or better, with the risk-free rate subtracted), will it have a shot at an allocation?

I do find it kinda odd that Sharpe ratio is not being used...it could be reported raw and with a lower limit on the denominator. Perhaps Jamie could elaborate on why Q is not using it? SR is kinda the standard thing, right?

Q, partly because a sudden jump upward in returns lowers Sharpe? It's a measure of combined upwardness together with steadiness.

Double-edged sword, Stephen, as averaging the OOS with IS scores can also benefit those algorithms overfitted to the backtests.

@ Jamie & all -

Is it a correct interpretation of the scoring that if a given algo's score drops to zero, there would be no downside in simply replacing it with one that is potentially better? In other words, is it correct that there is no incentive to keep an algo in the contest that has a score of <= 0?

Hi Grant,

I have been contemplating this since Jamie introduced the floored_cum_sum implementation in the scoring routine. I quote Jamie from an above post:

The contest rules say that your minimum score on any day is 0. This rule is in place so that no one has to withdraw and resubmit an entry if they end up in negative score territory.

My conclusion is that with the addition of floored_cum_sum, when your algo's performance score hits <= 0, it is floored to zero, and this to me is equivalent to restarting your algo at that point in time. This is how I think they fixed the issue of putting the algos on an equal footing regardless of what your start date is.

This, however, raises other issues in terms of true algo performance, because negative scores are floored to zero, which, technically and strictly speaking, results in an overstatement of true performance. This confuses me! I still stand by my original statement that the fairest way to score algo performance is to have the same start and end dates in the 2-year backtest and the same amount of OOS data, with no flooring of scores.

@ James -

I'd think of the contest more as a kind of training exercise--the more authors with 10,000+ hours of Q-conforming algo development experience, the better off Q will be in the long run. The details of how the contest is scored, apples-to-apples comparisons, etc. aren't that important in my opinion. It's better if at a regular clip, authors are engaging with the material, and improving their strategies. I think they are basically incentivizing a daily head-scratch: "Hmm...should I replace one of my contest algos with a new and improved one?"

Part of the confusion here, as well, is that the actual evaluation process for getting an allocation hasn't been spelled out in detail. In the end, a different process will be applied to the contest submissions. I'd expect this to happen in 6 months, without any visibility to contest participants (unless, of course, one's algo is selected). It'll be done in the background (as it is now, pretty much on any full backtest, as I understand).

Good point, Karl, about overfitting. What if they made their “positive returns” constraint a little more realistic and competitive? In line with what Grant just said, I’d be interested in what their minimum is. Does an algorithm need 5% annual returns? Or is 4 okay, even 3?

Whatever the realistic minimum is would be helpful to know, and it would filter out contest leaders/winners like this one, which is returning ~2% a year.

https://www.quantopian.com/leaderboard/34/59d0467e663e2d0010063ab0

Grant,

If the intention of the "contest" or "training exercise" is to harness more authors with better algos that pass their 2-year backtest requirements in the long run, then measuring what "better algos" are through an equitable scoring system becomes paramount. If equitable scoring is not important, then how do you measure which algos are better?

Or, put another way: let's not nitpick whether the scoring is fair or accurate and just accept Q's current scoring system, because it is not that important; the main purpose of the contest is to get more authors to submit, incentivized with daily prize winnings and the prospect of being given an allocation. Six months from now, Q will arbitrarily choose which ones need further evaluation for allocation purposes, regardless of their ranking in the contest. Is this what your interpretation of the contest is? Maybe I'm not understanding correctly.

@ James -

I just wouldn’t convolve the contest rules and payouts too tightly with what would correspond to evaluating algos for a long-term investment. That will be done in 6 months, by Jess Stauth’s hedge fund team (I think she runs this part of the operation).

In the end, if we just have access to the names and scores as they are being published today, it’ll be hard to draw conclusions. So it would be interesting to shift the discussion to what other data will be shared as the contest progresses.

My cut on the contest scoring is that, because of the zero-floor constraint, you're better off with an algo that has long positive return runs, of whatever magnitude, perhaps followed by a huge fall; when the score goes below zero, you just start again...no harm, no foul!

Our problem is that when I look at our daily returns trace on the backtest page, our algos kinda look like a random walk, with no discernible positive runs!

@ Jamie -

Could you explain your decision to limit the published data and metadata of contest entries? There is a much larger data set you could provide, but you decided to limit it to the daily colored-animal scores versus time. I suppose that is all that is needed to get a sense for how one is doing with respect to the field of entrants, but it does feel like another Q step in the direction of opacity versus transparency.

Anyone know what the role of the contest is in the allocation process? I've gotten mixed messages from the Q team.

"Doing well in the contest will get you a front row seat in the allocation process" - Thomas Weicki. "Pay more members of the community" - Jamie. "Contest is just to get you thinking along the lines of the type of algorithms we require for the allocation" - Dan Dunn.

In particular, is there a special consideration/process for contest algorithms in the allocation process?

Quantopian has very specific requirements for what they want to see, driven by their institutional investors, and they no longer want contributions from those who can't write algorithms that meet them.

I received an email saying a couple of my backtests appear to meet requirements for funding, with links to them. They were merely in the IDE and unfortunately I had deleted them already. I think they were just the Q Risk Model Algorithm example with small changes.

So they have an automatic process that does canvassing. When I run across something that looks consistently good, I'll run a Full Backtest from now on and that'll make a run less deletable and hopefully visible to that process.

Tip: I'm keeping separate little browser windows for different years to quickly toss code into them when I find good results with my usual date range full screen. So far, one of those other years always tanks (English slang for plummet).

@ Jamie -

There have been several forum posts on the unrealistic simulation of shorting. This would seem like a potential avenue for contest "gaming", since my understanding is that even if the QTradableStocksUS universe is used, unrealistic outcomes may result (due to the cost of shorting and/or the availability of shorts). Is there anything in the present contest constraints that would impose a proportionate penalty for trading in difficult-to-short stocks? If not, how might the rules/constraints be modified down the road to do a better job of simulating shorting?

Also, presumably you could analyze all of the contest backtests to get a sense if there might be gaming going on (assuming you have some way to back out the impact of not accounting for shorting costs and unavailable shorts). Is this something you could do, and report back?

+1 for simulating shorting availability and costs. Otherwise there appear to be sources of alpha that aren't really there.

In other news, contest payouts look like they're happening now via a clunky web 1.0 site called globalewallet. They're asking me to email them documentation and lie about my address because of a bug/incompatibility on their side. Where'd you guys dig up this provider? 2002? You'd think being in the fintech space, Quantopian would be a little more with it. I'm only teasing. :P