Hi @Jamie, many of us are still a LONG way from being done with our comments & feedback :-) Here are a few more items ....
1) Rebalancing:
Some algos are reasonably sensitive to whether rebalancing is done daily, weekly, monthly, or at some other interval(s). Therefore I must respectfully but very strongly disagree with the comment by @Steven Williams: "...what about implementing a long/short scoring system and then rebalancing monthly based on the scoring system?". NO, please Q, absolutely not. Rebalancing should be at the discretion of the algo WRITER (who will then take the consequences of their own choice), not something enforced by Q monthly!
2) Scoring/Ranking basis:
@Jamie, in your original post, you write: ".... they will receive an absolute score based on two metrics: returns and volatility". This of course makes perfect sense because, after specifying & satisfying various constraints (such as whatever Q may need to impose as requirements for the fund, and whatever is needed to facilitate evaluation of the algos), some combination of the two metrics RETURN & VOLATILITY is really all that is necessary to quantify the success or otherwise of each algo. The return part is fairly easy: either a "Terminal Wealth Relative" to starting equity or a Compound Annual Growth Rate (CAGR%) will do, and both are pretty much the standard ways of doing it.
However the "VOLATILITY" part is definitely not so simple. I ask Q please to give this part very careful thought. It can be done as some sort of conventional volatility-related measure, of which there are many, such as Standard Deviation .... but if so then of WHAT exactly: symmetrical Upside & Downside moves, or just Upside moves (no, only joking about that ;-)), or just Downside moves (and thanks for your support @Vladimir on that)? Or alternatively it can be done as Drawdown (DD) .... but again what DD exactly: Maximum DD, Average DD, or Root-Mean-Square (RMS) DD? Personally I would suggest the last of these, as it is the best way of also taking into consideration the TIME element of the DD as well as its magnitude.
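To make the difference between these DD measures concrete, here is a minimal Python sketch. It is only an illustration, not anything Q actually uses: the function names and the two toy equity curves are made up, and "Average DD" is assumed here to mean the average over every point of the equity curve (other definitions exist).

```python
import math

def drawdown_series(equity):
    """Fractional drawdown from the running peak at each point (0.0 = at a new high)."""
    peak = equity[0]
    drawdowns = []
    for value in equity:
        peak = max(peak, value)
        drawdowns.append(1.0 - value / peak)
    return drawdowns

def drawdown_stats(equity):
    """Maximum, average and root-mean-square drawdown of an equity curve."""
    dd = drawdown_series(equity)
    return {
        "max_dd": max(dd),
        "avg_dd": sum(dd) / len(dd),
        "rms_dd": math.sqrt(sum(d * d for d in dd) / len(dd)),
    }

# Two hypothetical equity curves with the same 20% maximum drawdown:
# one recovers immediately, the other stays under water for four periods.
quick_dip = [100, 80, 100, 100, 100, 100]
long_slump = [100, 80, 80, 80, 80, 100]

quick_stats = drawdown_stats(quick_dip)
slump_stats = drawdown_stats(long_slump)
print(quick_stats)
print(slump_stats)
```

Both curves have an identical Maximum DD, so that measure alone cannot tell them apart, while the RMS DD of the long slump comes out larger because the time spent under water counts as well as the depth.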
Then there is the choice of whether the "volatility" part is best expressed on its own (as described above), or instead as some ratio to the returns part? The latter choice makes more sense to me, as it highlights the combination of reward AND risk together. If it is expressed as a ratio, then there are some resulting well known conventional measures already familiar to Q such as, for example:
Returns / StDev of Returns (Up and Down) --> Sharpe ratio,
Returns / StDev of Downside only --> Sortino ratio,
Returns / MaximumDD --> MAR or CALMAR ratio,
Returns / RMS DD --> "Ulcer Performance Index" or UPI ratio,
Slope & Standard Error of regression line --> Lars Kestner's "K-ratio", and others.
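As a rough illustration of how the first four ratios in this list could be computed from a series of periodic returns, here is a Python sketch under some loud assumptions: CAGR is used as the "Returns" numerator throughout, the risk-free rate and the downside target are both taken as 0, the returns are treated as daily (252 per year), and the function names are made up. The textbook Sharpe and Sortino definitions use excess returns, so treat this as the simplified "Returns / risk" framing used above, not a reference implementation. The K-ratio is left out because several different versions of it are in circulation.

```python
import math
import statistics

def equity_curve(returns, start=1.0):
    """Compound a list of per-period returns into an equity curve."""
    curve = [start]
    for r in returns:
        curve.append(curve[-1] * (1.0 + r))
    return curve

def drawdowns(curve):
    """Fractional drawdown from the running peak at each point."""
    peak, out = curve[0], []
    for value in curve:
        peak = max(peak, value)
        out.append(1.0 - value / peak)
    return out

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the risk measure is zero."""
    return numerator / denominator if denominator > 0 else float("nan")

def risk_adjusted_ratios(returns, periods_per_year=252):
    curve = equity_curve(returns)
    years = len(returns) / periods_per_year
    cagr = (curve[-1] / curve[0]) ** (1.0 / years) - 1.0
    annualise = math.sqrt(periods_per_year)
    stdev = statistics.pstdev(returns) * annualise
    # Downside deviation: only returns below the 0% target contribute.
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in returns) / len(returns)) * annualise
    dd = drawdowns(curve)
    max_dd = max(dd)
    rms_dd = math.sqrt(sum(d * d for d in dd) / len(dd))
    return {
        "sharpe": safe_ratio(cagr, stdev),      # Returns / StDev (up and down)
        "sortino": safe_ratio(cagr, downside),  # Returns / downside StDev
        "calmar": safe_ratio(cagr, max_dd),     # Returns / Maximum DD
        "upi": safe_ratio(cagr, rms_dd),        # Returns / RMS DD
    }

# Hypothetical daily returns, for illustration only.
sample_returns = [0.01, -0.02, 0.015, -0.005, 0.02, 0.01, -0.01, 0.005]
ratios = risk_adjusted_ratios(sample_returns)
print(ratios)
```

Since RMS DD can never exceed Maximum DD, the UPI figure is always at least as large as the Calmar figure for a profitable algo, so the two are not directly comparable as raw numbers.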
My personal preferences would probably be either for Sortino, as it highlights the downside part of volatility, or even better the UPI ratio based on RMS Drawdown, but the others have merits too. What I think is most important is that Q does not take an "academic" approach (such as simply using StDev), but favors the use of more industry / practical trading related concepts such as Downside Volatility and/or Drawdown (@Vladimir & I are very much aligned here, I think).
Finally on this point, there is the question of whether or not Q is going to continue making a single (1-dimensional) ranking of participants in contests. From the point of view of algo selection for Q, this is probably neither necessary nor desirable, and a 2-D "minimum threshold" requirement on the dimensions of BOTH 1) Return & 2) Volatility (or ratio) will presumably be required for selection of algos to actually be used by Q in future. However if Q is going to have a "ranking of winners" (which some participants like), then how will the 1) Return and 2) Volatility (or ratio) parts be combined into a single final score number? Transparency please.
3) Application of Algos in Q: This one is especially for @Dan & others in Q, as well as the general Forum readers.
My apologies if I'm taking up a fair bit of space here, but I believe this part is ESSENTIAL.
In addition to the items above, there is an aspect to the contests and how they relate to the professionalism and the future viability of Q as a hedge fund which has not, as far as I know, been mentioned in any of the Forums or elsewhere yet.
Obviously, to attract and hold the interest of participants (algo authors), the contests must be limited in duration, and 6 months of live trading is probably about as long as most authors would be happy to wait (let's call this perspective "a"). However Q presumably wants algos that generally have a longer shelf life than that and so, from that side (perspective "b"), 6 months is probably the very minimum for giving any assurance that an algo is actually viable under a range of different market conditions. In my opinion, 6 months is actually WAY too short unless some other measures are taken. However 6 months probably represents about the best workable compromise between the two conflicting perspectives a) & b). Now, assuming that Q actually wants algos that are durable and robust under different market conditions, is there anything that Q can do to improve the chances of success in future, rather than just simply running a live test for 6 months? Yes, there certainly is, and it takes very little extra time or resources.
Before describing it, it is necessary to understand the following: The RETURN from any algo or trading system of course depends on the results of all the individual trades, and is ORDER-INDEPENDENT. For anyone to whom this is not intuitively obvious, it is easy to prove: the final equity is just the starting equity multiplied by (1 + return) for each trade, and a product is the same whatever order its factors are taken in (this assumes each trade's result is expressed as a percentage of equity at the time). So the order of a sequence of gains & losses can be changed, but the final return remains the same. Conversely, DRAWDOWN (or any other path-based way of describing and quantifying the flip-side of returns from trading) also depends on the results of all individual trades, but this aspect is ORDER-DEPENDENT. A plain StDev of the per-trade results does not change with order, but as far as Drawdown (DD) is concerned, it makes a LOT of difference whether gains & losses alternated, or a big string of losses all occurred together.
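A quick way to convince yourself of this is the following Python sketch. The trade list is made up, the function names are just for illustration, and each trade result is assumed to be a fraction of the equity at the time of the trade (full compounding).

```python
import math

def terminal_wealth(trade_returns, start=1.0):
    """Final equity after compounding every trade result in sequence."""
    equity = start
    for r in trade_returns:
        equity *= 1.0 + r
    return equity

def max_drawdown(trade_returns, start=1.0):
    """Largest fractional drop from a running equity peak along the path."""
    equity = peak = start
    worst = 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

# The same ten hypothetical trades in two different orders.
alternating = [0.05, -0.04] * 5            # win, loss, win, loss, ...
clustered = [-0.04] * 5 + [0.05] * 5       # all the losses first

same_return = math.isclose(terminal_wealth(alternating), terminal_wealth(clustered))
print("same final return:", same_return)
print("max DD, alternating:", round(max_drawdown(alternating), 4))
print("max DD, clustered:  ", round(max_drawdown(clustered), 4))
```

The final equity matches (up to floating-point rounding) in both orderings, while the Maximum DD is several times larger when the losses arrive as one string.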
Now, when Q tests an algo for 6 months, a lot of trades are generated. It can be assumed that this (generally quite large) set of actual trade results is reasonably representative of the algo in general and is in some sense "typical" of what the algo will probably do in future. This may or may not turn out to be true, but it is the best assumption that can reasonably be made. Of course we don't know what the future will hold in terms of overall market behavior, but if algo XYZ generated N trades in a 6 month test period, then it is reasonable to assume that i) it will probably generate about N trades in the next 6 month period, and ii) the overall distribution of trade (gain & loss) sizes will probably also be similar. Assumptions i) & ii) may turn out to be wrong, but they are about as good as anyone can do in advance and are consistent with statistical theory as well as common sense. However even if the trade distribution remains the same, the specific ORDER of the trade results will not, as market history is unlikely to repeat itself. The astute reader has no doubt already figured out where this is leading. The best way for Q to estimate the most likely NEXT 6 months performance of an algo after 6 months of testing is not to assume that it will just be identical to the last 6 months, but rather to take a random sample (with replacement) of N trade results from the distribution of actual trades, see what the resulting Return & Volatility or other metrics are, and then repeat a few times, preferably at least about 10,000 or so, and then look at the Most Likely (or Median, or P50) outcome. Of course this is a Monte Carlo (MC) simulation technique, and it only takes a matter of seconds or maybe a minute or so on a PC. It is common nowadays in most commercial trading software packages (e.g. AmiBroker, etc), and is easy to implement.
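The resampling procedure described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the trade results are hypothetical, each one is assumed to be a fraction of equity at the time of the trade, trades are treated as independent draws from the observed distribution, and the function names are my own.

```python
import random
import statistics

def simulate_path(trade_returns, rng):
    """Draw N trades with replacement and return (total return, max drawdown)."""
    sample = rng.choices(trade_returns, k=len(trade_returns))
    equity = peak = 1.0
    max_dd = 0.0
    for r in sample:
        equity *= 1.0 + r
        peak = max(peak, equity)
        max_dd = max(max_dd, 1.0 - equity / peak)
    return equity - 1.0, max_dd

def monte_carlo(trade_returns, n_runs=10_000, seed=0):
    """Repeat the resampling n_runs times; returns two lists of outcomes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = [simulate_path(trade_returns, rng) for _ in range(n_runs)]
    sim_returns = [total for total, _ in results]
    sim_max_dds = [dd for _, dd in results]
    return sim_returns, sim_max_dds

# Hypothetical per-trade results from a 6 month test, for illustration only.
trades = [0.021, -0.013, 0.008, -0.030, 0.015, 0.004, -0.009, 0.027,
          -0.018, 0.011, 0.006, -0.022, 0.019, -0.005, 0.013, -0.011]

sim_returns, sim_max_dds = monte_carlo(trades)
print("median (P50) return:", round(statistics.median(sim_returns), 4))
print("median (P50) max DD:", round(statistics.median(sim_max_dds), 4))
```

Because sampling is done with replacement, each simulated path has the same number of trades as the test period but a different mix and order of results, which is exactly what lets the DD outcomes vary from run to run.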
The big advantage of running a MC simulation using the results from an actual 6 month period of trading is not just that it gives a "best estimate" of the likely results from the NEXT (i.e. future) 6 months or whatever period of trading, but ALSO it generates a probability distribution for the possible trading results that allows quantification of the Black Swan type "tail risk" (to Quantopian) of continuing to use this algo in future (at least to the best of anyone's ability to do that).
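Once a set of simulated outcomes exists, reading off the tail is simple. The Python sketch below assumes the MC run has already produced one total return and one Maximum DD per simulated path; the 20 values in each list are hypothetical stand-ins for that output (a real run would have thousands), and the function names and the 20% DD limit are made up for illustration.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile (pct from 0 to 100) of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def tail_summary(sim_returns, sim_max_dds, dd_limit=0.20):
    """Median and tail figures from simulated returns and max drawdowns."""
    breaches = sum(1 for d in sim_max_dds if d > dd_limit)
    return {
        "p50_return": percentile(sim_returns, 50),   # "most likely" outcome
        "p05_return": percentile(sim_returns, 5),    # bad-case return
        "p95_max_dd": percentile(sim_max_dds, 95),   # bad-case drawdown
        "prob_dd_breach": breaches / len(sim_max_dds),
    }

# Hypothetical simulated outcomes standing in for the output of an MC run.
sim_returns = [-0.12, -0.06, -0.03, 0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.05,
               0.06, 0.07, 0.08, 0.08, 0.09, 0.10, 0.12, 0.14, 0.17, 0.21]
sim_max_dds = [0.05, 0.06, 0.06, 0.07, 0.08, 0.08, 0.09, 0.10, 0.10, 0.11,
               0.12, 0.12, 0.13, 0.14, 0.15, 0.17, 0.19, 0.22, 0.27, 0.35]

summary = tail_summary(sim_returns, sim_max_dds)
print(summary)
```

One caveat worth keeping in mind: resampling can only recombine trade results that were actually observed, so it quantifies unlucky sequences of known losses and cannot produce a single loss bigger than anything seen in the test period.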
What I find absolutely incredible is that Q is putting so much effort into encouraging people to write good algos with supposedly leading edge "Risk" evaluation tools, but is apparently doing nothing at all with regard to quantifying even the basic elements of this different kind of risk, namely that of actually using the algos in future. To most people who have been writing trading systems or algos for at least a few years, this is very basic and well documented stuff. We would not consider using any system for which this sort of consideration of tail risk had not been made, and for a professional fund not to do this, especially when it is so easy, would be bordering on downright negligent!! I don't know if Q is doing this sort of MC tail risk evaluation regarding continuing to use supposedly "successful" algos, but so far I have not seen any evidence of it. I have to wonder why not, when it is so easy to implement?
Please Q, for the sake of your and our future, if you are not implementing this already, then start doing it now AND include it in the final evaluation of all algos, even if it is not also run as an ongoing item while the algos are still running during testing (which would be easy enough). After all, if people can wait for 6 months to see their "final results", then a few more minutes of calculation time after that should be easy for everyone. The future of Q may just depend on this. Evaluating and quantifying as well as possible the tail risk associated with continuing use of algos is CRITICAL. Ignoring tail risk leads to consequences like LTCM. Please Q, don't let that happen to you/us. Start adding in MC evaluations of tail risk based on the distribution of trade returns from all algos ASAP.
Thanks in advance for your consideration, TonyM.