A New Contest is Coming: More Winners and a New Scoring System

We are getting ready to launch a new quantitative investment contest. We're sharing a draft of the new rules so that you can start developing a strategy, and so we can get your feedback on the rules before we finalize them.

The new contest will give you better and faster feedback on your algorithms. The rules will be better aligned with how we evaluate algorithms for allocations. The Quantopian Risk Model is a key part of our allocation process, and it will soon help drive the contest.

The new contest starts with a set of constraints to help keep your risk exposures low. Then, the contest uses a new scoring method to rank entries. Importantly, the new scoring method does not depend on the performance of other participants, so you'll be able to know the score of a backtest as soon as it finishes.

There are a few more details about the contest that we're not sharing yet, like the prize structure. The new contest includes more prizes than the old contest, so many more community members will be winning cash prizes. We'll share more details as we get closer to the contest kickoff.

Here is the current draft of the rules. We're open to feedback, so let us know what you think.

Constraints -- Updated on Jan. 31, 2018
One of the major changes we are making is that algorithms will need to behave within a set of constraints in order to be eligible to compete in the new contest. When you submit an algorithm to the contest, we will automatically run a 2-year backtest over which these constraints will be tested. As your algorithm continues to run out-of-sample, these constraints will continue to be monitored. Algorithms will be required to satisfy all of these constraints on a continuing basis in order to stay active in the contest.

These are the constraints that will be applied:

Constraint                             Threshold                     Computation Window  Tool
Use order_optimal_portfolio to Order   Must use.                     Every day.          Optimize API
Trade Within QTradableStocksUS         >= 95%                        Every day.          QTradableStocksUS
Mean Daily Turnover                    5% to 65%                     Trailing 90 days.   schedule_function, tearsheet
Position Concentration                 <= 5%                         Every day.          optimize.PositionConcentration
Leverage                               0.8x to 1.1x                  Every day.          optimize.MaxGrossExposure
Sector Exposures                       -20% to 20% in each sector.   Trailing 90 days.   optimize.experimental.RiskModelExposure
Style Exposures                        -40% to 40% in each style.    Trailing 90 days.   optimize.experimental.RiskModelExposure
Beta-to-SPY (absolute value)           <= 0.3                        Every day.          optimize.FactorExposure
Net Dollar Exposure (absolute value)   <= 0.1                        Every day.          optimize.DollarNeutral
Positive Returns                       > 0                           Through today*      Research

* Your algorithm needs to have positive total returns starting from the beginning of the 2-year backtest, up to the current date in the contest.
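To make three of the daily thresholds concrete, here is a minimal sketch of how leverage, net dollar exposure, and position concentration could be computed from one day's positions. The function names and the input layout (a dict of signed dollar values) are hypothetical and chosen only for illustration; this is not the evaluation notebook's code, and the contest's exact definitions may differ.

```python
def exposure_metrics(positions, portfolio_value):
    """Return (gross leverage, net dollar exposure, max position concentration).

    positions: dict mapping a ticker to its signed dollar value
               (positive = long, negative = short). Hypothetical layout.
    portfolio_value: total account value in dollars (must be positive).
    """
    if portfolio_value <= 0:
        raise ValueError("portfolio_value must be positive")
    long_value = sum(v for v in positions.values() if v > 0)
    short_value = sum(-v for v in positions.values() if v < 0)
    gross_leverage = (long_value + short_value) / portfolio_value
    net_exposure = (long_value - short_value) / portfolio_value
    max_concentration = max(
        (abs(v) / portfolio_value for v in positions.values()), default=0.0
    )
    return gross_leverage, net_exposure, max_concentration


def passes_daily_limits(positions, portfolio_value):
    """Check one day's snapshot against the draft thresholds in the table."""
    gross, net, concentration = exposure_metrics(positions, portfolio_value)
    return (
        0.8 <= gross <= 1.1        # Leverage 0.8x to 1.1x
        and abs(net) <= 0.1        # Net dollar exposure (absolute value)
        and concentration <= 0.05  # Position concentration
    )


# Example: 20 longs and 20 shorts of $25,000 each in a $1,000,000 account.
example = {"LONG_%d" % i: 25000 for i in range(20)}
example.update({"SHORT_%d" % i: -25000 for i in range(20)})
example_ok = passes_daily_limits(example, 1000000)
```

In this example each position is 2.5% of the account, gross leverage is 1.0x, and net exposure is zero, so the snapshot stays inside all three limits.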

Scoring/Ranking
Another big change is that algorithms will no longer be scored relative to their peers. Instead, they will receive an absolute score based on two metrics: returns and volatility. Each day, the daily return of your algorithm will be normalized by its trailing 90-day annualized volatility. As your algorithm accumulates out-of-sample performance, these volatility-normalized daily returns will be summed together to form a cumulative score. All active algorithms will be ranked based on their cumulative score.
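As a rough illustration of the scoring idea described above, the sketch below sums daily returns divided by trailing annualized volatility. Several details are assumptions, not confirmed rules: the 63-trading-day window (taken from the Jan. 18 bug-fix note), annualizing with the square root of 252, using only the days before the scored day in the window, and skipping days with zero volatility. The function name `contest_score` is hypothetical.

```python
import math
import statistics


def contest_score(daily_returns, window=63, periods_per_year=252):
    """Sum of daily returns, each normalized by trailing annualized volatility.

    daily_returns: list of daily returns as fractions (0.01 = 1%).
    window: number of prior trading days used for the volatility estimate
            (assumption; must be at least 2).
    """
    score = 0.0
    for day in range(window, len(daily_returns)):
        trailing = daily_returns[day - window:day]
        annualized_vol = statistics.stdev(trailing) * math.sqrt(periods_per_year)
        if annualized_vol > 0:  # Assumption: skip days with no measurable volatility.
            score += daily_returns[day] / annualized_vol
    return score
```

Because the score depends only on the algorithm's own return series, it can be computed as soon as a backtest finishes, without reference to any other entry.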

The attached notebook can be used to evaluate a backtest. It first checks 8 of the 9 constraints (it doesn't verify that Optimize was used to place orders). It then scores the backtest on the performance after the first two years. The first 2 years are used for the constraint computations and are meant to be similar to the 2-year backtest that is kicked off when you enter the contest. The score is computed on the remainder of the backtest. Try running it on one of your backtests (it must be longer than 2 years), and let us know if the results you get are surprising.

Notable updates made on Dec. 13, 2017:
- Maximum position concentration lowered from 10% to 5%. After community feedback and internal discussion, we feel that 5% is more appropriate (and still not too restrictive) as an upper limit for this contest.
- Beta-to-SPY limit relaxed to 0.3, up from 0.1 after feedback from the community. In addition, the calculation is now simply the 6 month rolling beta, and is no longer a 90-day mean.
- Positive returns constraint added. This should have been included in the original post, oops!
- Tool added for controlling exposure to sector and common style risk factors. See the announcement of this new tool here.

Notable updates on Jan. 18, 2018:
- Fixed a bug in the scoring function that was incorrectly computing volatility from the entire backtest duration instead of on a trailing 63-trading-day basis.

Notable updates on Jan. 31, 2018:
- Increased the minimum threshold of the QTradableStocksUS rule to 95%.
- QTradableStocksUS updated to the new definition.
- Added outlier guarding to each criterion. Each rule has a wider limit permitted on 2% of trading days.


268 responses

This is a backtest that was evaluated with an older version of the above notebook. Note that it failed to meet two style constraints.

[Attached backtest. Backtest ID: 5a15d6dad86a273fd9bfb0c5]

And these new rules, once finalized, will be effective starting January 2nd, 2018?

Regardless of the details, this makes a lot of sense:

The rules will be better aligned with how we evaluate algorithms for allocations.

Regarding using the QTradableStocksUS as a base universe, whatever y'all want. But beyond that, I don't understand what is meant by the threshold >= 90%. What if I use:

market_cap = Latest(inputs=[Fundamentals.market_cap])
universe = market_cap.top(2*NUM_TOTAL_POSITIONS) & QTradableStocksUS()


How would the threshold >= 90% be interpreted? Would the universe size need to be >= 90% of the size of the QTradableStocksUS? And how does this relate to the number of positions held, on a daily basis? Or does it just mean that you'll check to see if 90% of the positions held on a daily basis are within QTradableStocksUS (e.g. if 100 positions are held, at least 90 of them need to be found within the QTradableStocksUS on any given day)? And does this allow ETFs? Leveraged ETFs? If only 90% of the portfolio (by count?) is from QTradableStocksUS, can the rest be anything?

Regarding turnover, it can be constrained using the Optimize API, unless you are not supporting this feature?

For your nice table of requirements above, it would be interesting to understand what they are for the 1337 Street Fund as a whole (where presumably some of these contest algos will land). I won't hold my breath for feedback, but it would give some insight into whether the crowd-sourcing from 160,000 worker bees is expected to have a dramatic uncorrelated diversification (i.e. Sharpe ratio) effect as predicted, or if each algo will need to stand on its own more-or-less. It sorta feels like each algo needs to be stand-alone...

Mean Daily Turnover 5% to 65%
A lower bound of 5% might be too restrictive for some datasets that update quarterly or strategies that rebalance weekly or monthly. A lower limit like 3.33% would be more accommodating of those strategies.

Trade within QTradableStocksUS >= 90%
If QTradableStocksUS is all that the fund can trade, I'm not sure why this is not 100%.

Leverage 0.8x to 1.1x
Might be better to require the leverage setting in optimize api (optimize.MaxGrossExposure) to be set to 1 so that it is uniform across the board in the contest across algorithms.

Glad Q is moving towards more automated and immediate feedback. Is it possible to standardize the automated score to a range of 0 to 10, or to a fraction up to 1? Then it becomes easier to compare across algorithms. I would love to see that automated score printed in the backtests as well.

Updating optimize to automatically handle risk exposures is an excellent addition to the platform. Great job!

I second @Karl's comment re Mean Daily Turnover. Why is it necessary that portfolios should churn at least 5% / day?

As to the 2 key metrics that Q is proposing, obviously having high return as one goal makes sense.
However 90d annualized volatility, although being conventional in nature, is actually a relatively poor measure of variability for trading and it is not hard to find various significantly better alternatives. Firstly, do you really want to penalize fast windfall gains, as you will certainly do if you use symmetrical volatility? If you really must use volatility, then surely a better choice is to use DOWNSIDE volatility.

There are also several other ways to look at this. Think what you really want in a trading algo, in addition to returns. I would suggest the following attributes:
a) Smallest possible absolute value of MaxDD,
b) Smallest possible absolute value of either Average or (even better choice) RMS DD,
c) Smallest possible durations of individual DD periods between successive equity highs, and
d) Smallest "total flat time" as the sum of the individual items in c).

Another way to look at it is simply to think what you would really like in terms of the SHAPE of the IDEAL equity curve over time. It would presumably be a semi-log straight line of the steepest possible slope (highest return, already taken care of by max return). Obviously every real equity curve has blips & bumps and the aim should be to minimize these by considering not only the total magnitude of deviations from semi-log straight line behavior (as per volatility) but also the number of reversals of direction (that stress investors) per unit time. Conventional volatility does not adequately capture that.

Contests:
Statistical variability will ensure that some contests will naturally have more good entries than other contests. Presumably Q wants the best algos possible, rather than a fixed number per month. I like the idea that algos will be scored on an absolute basis rather than relative to their peers, and also the idea of immediate feedback regarding algo performance relative to Q's standards.

The requirement "Use Optimize API to Order" needs to be clarified. Do you mean that orders must be placed using order_optimal_portfolio(objective, constraints) or something broader? For example, this would do the trick:

objective = opt.TargetPortfolioWeights(weights)
constraints = []
order_optimal_portfolio(objective, constraints)


But it doesn't really use the Optimize API; nothing is being optimized. Presumably this would be o.k.? It seems that if one can construct a set of weights that will meet the requirements without using the Optimize API, it avoids the obfuscating churn of the optimization altogether. When the algo is combined with some fraction of 160,000 other ones, you can then assess if any constraints need to be applied in the Portfolio Construction step for the fund (see flow chart on https://blog.quantopian.com/a-professional-quant-equity-workflow/). The bulk of the diversification and risk management should be coming from the variety and number of algos in the 1337 Street Fund, so it may be counterproductive to over-constrain them on an individual basis. But maybe you've already tried that, and it doesn't work...

I don't understand "These are the constraints that will be applied" and then specifying the Tool. In your notebook, you are not using the Optimize API. Are you requiring use of the specified constraints in the algo by the user? Or are you saying that your evaluation notebook is designed to match up with the Optimize API constraints? Or something else?

One minor gripe I have about the Optimize API and the risk model is that to my knowledge, you have yet to open-source the code. For those so-inclined, digging into the code (and perhaps even "rolling their own" should they see fit) should be an option.

Those new rules seem like fun :)

I really like the new rules. One thing that I think would be great to add is some measure of consistency from backtest to OOS (like the consistency score in the Bayesian analysis tearsheet). Probably a little too CPU intensive for the scoring process though...

One comment regarding the rule "Use Optimize API to Order -> Must use". This assumes one algorithm has one single entry point for its orders, but one algorithm might spread its orders across the day (every ten minutes, or every hour) to avoid excessive slippage or to simply being able to trade a bigger capital than otherwise would be impossible. I feel like Optimize API enforces rigidity on orders execution and algorithms in general (but I understand why you want/need this single entry point for the orders).

Hi @Jamie, had one more suggestion. Since the pipeline is calculated before trading start and trades get executed after market open a big market gap can cause some deviations from our optimize api settings. Hence I would recommend not having an overly restrictive daily limit and instead converting the daily limit numbers to trailing 90 days with maybe a wider daily hard limit that is perhaps 1.5X of the numbers mentioned in the post.

Net Dollar Exposure (absolute value) <= 0.1 Every day.
goes to
Net Dollar Exposure (absolute value) <= 0.1 Trailing 90 days and <= 0.15 Every day

Jamie, a couple of questions:

“We expect to be releasing a new tool to help control the style and sector exposure via Optimize soon.” Please can you include market beta in this. I know we can calculate with a linear regression to SPY, but why not use a common definition with a fast lookup.

Will December’s competition go ahead or will you have the new rules ready in time?

Happy thanksgiving!

EDIT: I get a warning:

10 assets were missing factor loadings, including: AWE-21361, CE-1403, CF-1745, COX-12527, FBF-25132..WLP-8210. Ignoring for exposure calculation and performance attribution. Ratio of assets missing: 0.019. Average allocation of selected missing assets:

AWE-21361    39689.227521
CE-1403      38545.228421
CF-1745      53628.293125
COX-12527    52956.153936
FBF-25132    53753.143175
WLP-8210     -9342.179927



NOTE: evaluate_backtest says my max position concentration was 12.32%, which is later revealed to be a short position in AIG. I am ordering every day with Optimizer with a 5% max position constraint. How is this possible? Surely a short can at most double as the stock goes to zero? Anyway, it seems the short was rather successful, so it seems a bit harsh to say it failed the constraint. Likewise, it reports a max net exposure of 11.37%, even though the positions are chosen every day by Optimizer at max 5% exposure. Presumably this is AIG again causing a very large single day gain.
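For anyone puzzling over the same numbers, here is a simplified arithmetic sketch of how the weight of a single short position changes when the stock price moves between rebalances. It assumes nothing else in the portfolio changes and ignores costs, and the function name is hypothetical. It is not a diagnosis of the AIG result above.

```python
def short_weight_after_move(initial_weight, price_change):
    """Weight of a single short position after the stock moves by price_change.

    initial_weight: size of the short as a positive fraction of portfolio
                    value (0.05 = 5%).
    price_change: fractional move in the stock price (0.5 = +50%).
    Assumes every other holding is unchanged and ignores costs.
    """
    position_value = initial_weight * (1.0 + price_change)
    # A rising price is a loss on the short, which also shrinks total equity.
    portfolio_value = 1.0 - initial_weight * price_change
    return position_value / portfolio_value


weight_if_stock_doubles = short_weight_after_move(0.05, 1.0)   # 0.10 / 0.95
weight_if_stock_halves = short_weight_after_move(0.05, -0.5)   # 0.025 / 1.025
```

Under these assumptions a short grows as a share of the portfolio when the stock rises and shrinks when the stock falls, so the upside move is the one that can push a position past its target weight.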

I set my target leverage to 90%, as I note the constraint is not symmetric.

As Burrito Dan mentioned, the Max Position constraint with the Optimizer doesn't seem to work correctly for me either. Is anyone else seeing glitches? I also have an algo that seems to enter trades and close them 10 to 20 minutes later once Optimizer was implemented.

Why is daily turnover necessary? What about implementing a long/short scoring system and then rebalancing monthly based on the scoring system? Does the contest include commission fees and slippage? Also, is turnover required every day or is that just the average? What about day 1-4: 0% turnover, day 5: 50% turnover (average turnover = 10%)?
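The arithmetic in that last question can be checked with a short sketch. It assumes the constraint is a plain mean of daily turnover over the trailing window and that the window is counted in trading days; both are assumptions, and `trailing_mean_turnover` is a hypothetical helper name.

```python
def trailing_mean_turnover(daily_turnover, window=90):
    """Mean of the most recent `window` daily turnover values."""
    recent = daily_turnover[-window:]
    if not recent:
        raise ValueError("need at least one day of turnover")
    return sum(recent) / len(recent)


# Four days with no trading followed by one day at 50% turnover, repeated
# 18 times to fill a 90-day window.
weekly_pattern = [0.0, 0.0, 0.0, 0.0, 0.5]
turnover = weekly_pattern * 18
mean_turnover = trailing_mean_turnover(turnover)
within_bounds = 0.05 <= mean_turnover <= 0.65
```

With this pattern the trailing mean works out to 10%, which sits inside the 5% to 65% band even though most individual days have no turnover at all.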

Hi Jamie, had one more suggestion and thoughts about beta.

Beta-to-SPY (absolute value) <= 0.2 Trailing 90 days.

Beta-to-spy is a noisy estimate in the short term. I got some clarification on that from our ex-CIO Jonathan Larkin in this post https://www.quantopian.com/posts/beta-in-backtest-seems-above-bounds-set-in-algorithm-for-brief-periods when I noticed my beta was changing considerably outside the bounds I had set in order_optimal_portfolio in the first few weeks of the backtest. I would suggest that the only requirement be what beta bounds we set in order_optimal_portfolio.

What beta we get in the future has some chance involved as to the distribution of stocks that the optimizer picks based on current beta calculations and future behavior of stocks that are completely out of our control and hence likely to exceed the thresholds we set in order_optimal_portfolio. An algo shouldn't get disqualified because of something that it cannot predict or control. I would recommend we use long term (trailing 2 year portfolio beta) as that is likely to be more in line with the bounds set in order_optimal_portfolio.
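To see why short windows give noisy estimates, it helps to write the calculation out. Below is a minimal sketch of a realized rolling beta, computed as the covariance of algorithm and benchmark returns divided by the variance of benchmark returns. The 126-day default (roughly six months of trading days) and the function names are assumptions for illustration, not the contest's implementation.

```python
def beta(algo_returns, benchmark_returns):
    """Ordinary least-squares slope of algo returns on benchmark returns."""
    n = len(algo_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_a = sum(algo_returns) / n
    mean_b = sum(benchmark_returns) / n
    covariance = sum(
        (a - mean_a) * (b - mean_b)
        for a, b in zip(algo_returns, benchmark_returns)
    )
    variance = sum((b - mean_b) ** 2 for b in benchmark_returns)
    if variance == 0:
        raise ValueError("benchmark returns have zero variance")
    return covariance / variance


def rolling_beta(algo_returns, benchmark_returns, window=126):
    """Beta over each trailing window; one value per day once the window fills."""
    return [
        beta(algo_returns[end - window:end], benchmark_returns[end - window:end])
        for end in range(window, len(algo_returns) + 1)
    ]
```

Running `rolling_beta` with a shorter and a longer `window` on the same return series is one way to compare how much each estimate moves from day to day.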

When I try and use

for stock in context.universe:

    close = data.history(stock,'close',3,'1d')[:-1]
    close_price = close.mean()
    current = data.current(stock,'price')
    current_position = context.portfolio.positions[stock].amount

    Long_objective = opt.TargetWeights({stock:.10})
    short_objective = opt.TargetWeights({stock:-.10})
    pos_size = opt.PositionConcentration.with_equal_bounds(-.02, .02)
    if current_position == 0 and data.can_trade(stock) and current > close_price:
        algo.order_optimal_portfolio(objective = Long_objective, constraints = [pos_size])
        print "short %s" % stock

    if current_position == 0 and data.can_trade(stock) and current < close_price:
        algo.order_optimal_portfolio(objective=(short_objective), constraints = [pos_size])
        print "long %s" % stock


I get error TypeError: 'PositionConcentration' object is not iterable and can't figure out why?

I would like 2% allocations for up to 10% of the portfolio long and 10% of the portfolio short. Thanks for any help!

@ Luca -

You comment above:

This assumes one algorithm has one single entry point for its orders, but one algorithm might spread its orders across the day (every ten minutes, or every hour) to avoid excessive slippage or to simply being able to trade a bigger capital than otherwise would be impossible. I feel like Optimize API enforces rigidity on orders execution and algorithms in general (but I understand why you want/need this single entry point for the orders).

As far as I can tell, the only requirement is to use order_optimal_portfolio which presumably is equivalent to order_target_percent if no constraints are used. Although Jamie's example algo shows a single call to order_optimal_portfolio with constraints to perform a portfolio update, it could be used differently.

The picture changes if the requirement "Use Optimize API to Order" means using order_optimal_portfolio with a specified set of constraints. A check for conformance will be not only to confirm that the algo uses order_optimal_portfolio but that every time it is used, certain constraints are applied, as well.

One open question is, I wonder if there is an implied requirement to use MaximizeAlpha as an objective (as Jamie uses in his example algo), or if TargetWeights would be acceptable?

@ Jamie -

Requiring algos to run live to be active in the contest should be considered. The downside is that when some inevitably crash, then they are disqualified (even if the problem was due to a time-out, for example, which was not detected in backtesting).

The other angle is that in terms of a best practice, you might be better off encouraging a "Set it and forget it!" approach, versus a constant dopamine drip. One approach would be to allow submission of one algo per month and then not get back to the contestant until 6 months of out-of-sample have elapsed (which seems to be the natural gestation period for the kind of algos you need). After six months have elapsed, contestants could get feedback (and money) on a monthly clip, which seems about right to avoid unproductive fiddling with their strategies.

In create_perf_attrib_tear_sheet(), please can you change the graphs of daily common factor exposures to graphs of the 90 day trailing exposure, to be consistent with competition measurement.

@Burrito, @Jamie: Let me remind you that December contest algorithms are already running, and applying these not-yet-finalized rules to them would amount to changing the rules after the game has started.

Leverage and Mean Daily Turnover constraints will disqualify any trading system that does not hold positions overnight. Should algorithms that don't hold overnight positions really be disqualified? I am having a hard time understanding the rationale there.

Hey everyone, thanks for all the feedback and questions.

@André: The contest set to begin on December 1st will not use these new rules. It will use the same rules that were used for the contest that began at the start of November. I’m sorry that I didn’t make that clear. When we have an expected start date for the new contest, we will share it here.

@Grant, @Leo: The QTradableStocksUS >= 90% should be interpreted as “at least 90% of your positions at all times should be in the QTradableStocksUS.” I can see how that was not made clear in the table. I’ll make it clear when we write up an official rules page. The reason this constraint is only at 90% is because the QTradableStocksUS has some daily churn. We want to be considerate toward algorithms that enter positions in the QTradableStocksUS and then have one or two names fall out (especially if they don’t rebalance daily). That being said, we can probably come up with a better constraint to enforce the desired behavior of “only trade names in the QTradableStocksUS”. I’ll bring up the discussion internally and post back here with an update soon.
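One possible reading of that interpretation is sketched below. It counts positions rather than weighting them by dollar value, which is an assumption; the post does not say which is intended. The function name and inputs are hypothetical.

```python
def fraction_in_universe(held_assets, tradable_universe):
    """Fraction of held positions (by count) that belong to the universe."""
    held = set(held_assets)
    if not held:
        # Assumption: an empty portfolio trivially satisfies the rule.
        return 1.0
    return len(held & set(tradable_universe)) / len(held)


# Example: 100 positions, 8 of which have dropped out of the universe.
universe = {"STOCK_%d" % i for i in range(500)}
holdings = ["STOCK_%d" % i for i in range(92)] + ["OTHER_%d" % i for i in range(8)]
meets_90_percent = fraction_in_universe(holdings, universe) >= 0.90
```

A portfolio in this state would still pass a 90% threshold, which matches the stated goal of tolerating a few names that fall out of the universe between rebalances.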

@Leo, @Karl, @Tony: Regarding turnover, the reasons for the lower and upper bounds are actually different from each other. The lower bound of 5% is there because the less frequently an algorithm places “bets” (trades), the longer it needs to be evaluated. Algorithms that trade more frequently accumulate a larger sample of trades in less time. Leo, you’re right that the lower bound on turnover can exclude quarterly strategies or low frequency event-based strategies. For this edition of the rules, we’re focusing on a specific type of strategy: long-short, cross-sectional equity trading with a mid to high frequency trade schedule. Eventually, we’d like to be able to create new sets of rules for other types of strategies, but right now, we’re focusing on making the best contest for this type of strategy.

On the other end, the maximum turnover limit is there because algorithms that have too much portfolio turnover are likely to be significantly affected by transaction costs. Karl, if you have an algorithm with >65% mean daily turnover over a 90 day window that still performs well, would it be possible for you to email in a tear sheet to [email protected]? I’d love for our investment team to take a look.

@Leo, @Dan: One of the difficulties with leverage and other risk metrics is that it’s impossible to predict with absolute certainty what the values will be the next day. For example, if you tell Optimize to move your portfolio to a new state with a MaxGrossExposure constraint of 1, there are many factors that can push the portfolio away from that exact 1x desired leverage. For instance, if you have short positions that drop in value or your orders are heavily impacted by slippage, you might end up being over-leveraged. The 1.1x guard is meant to protect against your portfolio moving about that 1x bound. Now, similar to the commentary around the QTradable participation required to be >= 90%, we can probably solve this problem in a more clever way. One thing we’ve talked about is measuring the 95th percentile of daily leverage to be <= 1.1x, and then requiring the 100th percentile to be <= 1.2x. I didn’t include that detail in my notebook, but we’re planning to use a similar solution in the implementation of these constraints. When I have more detail, I’ll share it here. Note that another way to limit the risk of sudden leverage jumps is to hold more positions. The lower your concentration in a given short position, the less it will affect your leverage if the price drops.
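The percentile idea described above could look something like the following sketch. The nearest-rank percentile definition, the function names, and checking only the upper bound are all assumptions; the actual implementation has not been shared.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def passes_leverage_guard(daily_leverage, soft_limit=1.1, hard_limit=1.2):
    """95th percentile must stay within soft_limit, every day within hard_limit."""
    return (
        percentile(daily_leverage, 95) <= soft_limit
        and max(daily_leverage) <= hard_limit
    )


# Example: 96 days at 1.0x and 4 days that drifted to 1.15x.
mostly_on_target = [1.0] * 96 + [1.15] * 4
guard_result = passes_leverage_guard(mostly_on_target)
```

In this example the four days at 1.15x are tolerated as outliers because they fall above the 95th percentile but below the hard limit.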

@Grant: The requirement to use Optimize is simply that you use order_optimal_portfolio to move your portfolio from one state to another. There is no requirement on the objective or constraints that you use, but you will probably want to use some of the constraints listed in the table to get your algorithm to behave within the constraints in the new contest.

@Charles: A consistency constraint certainly sounds like an interesting idea! Realistically, this isn’t something that we’ll have ready for the start of the new contest. However, I can imagine it being something that we add in at some point.

@Leo: Regarding net dollar exposure, that’s certainly a valid point about the pipeline being computed with yesterday’s prices. I think this one falls under the same category as the QTradable and leverage constraints where we can probably allow for outliers by checking that the 95th percentile is below 0.1, and the 100th is below 0.15. Similar to the others, I’ll have to think more on this and post back here. Thanks for the suggestion.

@Dan: Last week, one of our senior engineers, Scott Sanderson, discovered that the RollingLinearRegressionOfReturns factor in pipeline is extremely slow at computing just alpha and beta of the linear regression. It spends most of its time doing other things. He has a PR up that should make the beta computation ~1000x faster. As for a standard computation, I’m not sure that there is a standard. Using the trailing beta of the stocks in a portfolio to predict the next day’s overall beta is a good approach to keep your beta low, but it’s not perfect. Sometimes your portfolio’s beta can rise or fall outside the bounds you set. Because of that, it’s probably worth doing a bit of research to see what sort of constraint best works for your algorithm. To do this, you might want to use the calculate_optimal_portfolio function in research.

@Jay: Any chance you could send in a tear sheet to support with the algorithm ordering bigger-than-expected positions?

@Steven: Note that the turnover constraint is measured each day as the mean over a trailing 90 day window. There is no daily requirement. The contest will include commission and slippage models. Right now, we are planning to use a 0.001/share commission model as well as a 0.05% fixed slippage model.

@Leo: You’re right that beta-to-SPY can be tough to predict. We might have to compute the 95th percentile of beta-to-SPY to avoid disqualifying algos that have a spike on a particular day.

@jamie what I mean is you’ll have defined beta to spy in some specific way as a competition constraint (here absolute beta over trailing 6 months). Why not make this a built in factor for each symbol just like the other common factors? Sure I could code it up, but why is it fundamentally different from say the Momentum factor?

Hi Jamie, I'd prefer we use longer term beta 1yr or more as there is lesser chance of that going out of bounds. Predicted and observed beta are diverging quite a bit for a wide range of stocks over a short time period affecting algorithm beta as well. No amount of research work can prepare one for the variations that the market can trigger in the future. I concur with @Burrito Dan that Beta should be computed and managed internally by Q. Recomputing beta on a daily basis in our algorithms using standard formulas such as the following doesn't seem to be adding any value. But this could be a larger change outside the scope of the new contest itself.

beta = 0.66*RollingLinearRegressionOfReturns(
    target=sid(8554),
    returns_length=5,
    regression_length=260,
    mask=long_short_screen
).beta + 0.33*1.0

As to the returns/volatility calculation for the score, I am wondering why we are not using the more standard measure of Sharpe and how the score differs from that. Also would like to know if the new contest does not worry about MDD, because the same 90day volatility could be derived from a wide range of MDD.

I find the new contest rules far more interesting than previous ones:

1) As many people noted, previous rules incentivized ultra-low vol strategies in order to avoid getting a bad ranking on any of the 7 criteria.
2) The removal of drawdown, stability, and Sortino ratios makes perfect sense to me. It is logical for Quantopian’s investors to care about drawdowns at a portfolio level (i.e. the Q fund). However, Quantopian should not be overly concerned about the drawdowns of a specific algorithm, provided that those drawdowns are not in sync with each other (hence the importance of having uncorrelated algos).
3) Relatedly, I like the return to basic risk-adjusted returns. At the end of the day, we have to accept the fact that a honest trading strategy (one which is not negatively skewed) is a random walk with, hopefully, a positive drift. The only thing we can hope for is having a high mu/sigma (i.e. Sharpe ratio).

If I were in Quantopian’s shoes, I would also incentivize people to write algos that trade at specific times during the day (e.g. at the opening, near the close and maybe a few other specific points during the course of the day). This incentive could be in the form of lower trading costs / slippage, since the algorithm aggregation would benefit from trade netting.

@ Jamie - I recommend changing from "Use Optimize API to Order" to "Use order_optimal_portfolio to order" if that is, in fact, the requirement. Also, I would just state the requirement "Only trade stocks in the QTradableStocksUS" and then describe how you will perform the verification in the context of the contest.

1. Productivity metrics.
Replacing existing ranking system based on seven risk - reward measures heavily over-weighted by risk components with single return-volatility metric for Quantopian is step forward to bring good algos to front page. I strongly support Tony Morland request to use "downside volatility" as a volatility component.

2. Constraints.
Instead of four we now have nine.
Use Optimize API to Order Must use.
Needs more explanation how we must use it.

Trade within QTradableStocksUS >= 90%.
As I do not know what is in and what is out of this list please reduce to >= 70% or replace it with:
Asset Class Exposures -60% to 60% in each asset class.

Good algos may have:
Sector Exposures -30% to 30% in each sector.
Style Exposures -60% to 60% in each style.

I would increase both
Beta-to-SPY (absolute value) <= 0.3
Net Dollar Exposure (absolute value) <= 0.3

For Net Dollar Exposure 0.1 you should have 55% longs and 45% shorts. To my mind paying 45% insurance in bull market is too much and may not survive slippage and commissions expenses and borrowing costs.

3. Important notes
Change the rule no more than 3 algo in all contests to no more than 3 algo in any contest.
Please let us change algo in forthcoming contests leaving existing in already running contests.
Please make changes to the rules that will require on first day of contest run backtest for all participants for the same period and start them running in paper trading the same time with the same initial capital.

I'm in a similar boat as Karl. I'm attaching a notebook from a backtest with 10M and the contest slippage and commission. I'm hoping that you reconsider the leverage and turnover rules.

While I think these rules make sense to set a level playing field for all entries, I'm not sure they make sense in terms of creating the optimum feedstock for the Q fund. If I were running the Q fund, I would probably be keen to specifically target strategies that are optimized to mine certain factors. I could pick which sectors/style/factors are needed to complement/diversify my existing portfolio of strategies... I would recommend that you give more leeway in terms of sector/style exposure limits if you are to receive entries with some of these more specialized strategies.

@Vladimir, you might not get an answer to your suggestion "Net Dollar Exposure (absolute value) <= 0.3", but I will try to explain why that won't be possible with the way the Q fund is operating. Their approach appears to be to use a high degree of leverage on low-drawdown, low-volatility algos. High exposure to any risk factor like the one you are suggesting (market risk) => taking on more risk at the algorithm level => a high drawdown at some point => closing a highly leveraged algo after taking a big loss, which will fall on Q's plate. Consequently they will have to manage every risk factor in every algorithm they choose in order to have the confidence to operate it at high leverage. They pay you on profits but don't come after you on losses, so it is in their best interest to make sure the algorithm doesn't lose money to the point of having to close it. Even if the algorithm doesn't make stellar returns, they might be okay with that prospect as opposed to the alternative scenario.

@Leo M It is well known to any practical trader that exposure to any factor is not only a risk, it is also an opportunity to make money. A 30% Net Dollar Exposure (absolute value) is not high at all. There are many other efficient ways to hedge beta and reduce drawdowns than being 45% short US stocks in a bull market.

@Vladimir, when your business model is to invest millions in algorithms sourced from the crowd, of which you haven't seen a single line of code, and then leverage them to a high degree, that type of exposure won't be compatible with the long-term viability of the business model. The times you end up closing an algorithm will most likely be because you didn't account for an outsized risk exposure that the algorithm was taking and was unable to manage, hence the big drawdown and the bad consequences.
Hi Jamie - I recommend making sure you are consistent with everything on:

https://www.quantopian.com/allocation

Regarding the "Low Correlation to Peers" requirement, this would seem to run counter to:

the new scoring method does not depend on the performance of other participants

Something may be lacking if you do not include a correlation analysis with other contest entries and the 1337 Street Fund itself (I gather you are already kinda doing this with the style risk factors, which potentially could be expanded in scope).

There is also the "Strategic Intent" requirement, not captured at all by the new contest rules. So, one could put a lot of work into an algo, meet all of the quantitative criteria, wait 6 months or more for out-of-sample data, and then be rejected for an allocation due to a qualitative assessment based on whatever story was provided to describe the strategy. It might actually be better if Q were not privy to these stories, since they allow for bias on the part of the Q allocation assessment team (and since Q doesn't look at the code and only uses algo exhaust, there will be a tendency for authors to embellish their stories to get an allocation).

In terms of process, I would strongly recommend releasing a contest/allocation specification and evaluation tools on Github, so there is a single point of comprehensive revision control.

If you are going to post contest results, get rid of the mandatory goof-ball animal names; just allow users to pick separate aliases for the contest, if they choose.

Hi, @Jamie, many of us are still a LONG way from being done with our comments & feedback :-) Here are a few more items ....

1) Rebalancing: Some algos are reasonably sensitive to whether rebalancing is done daily, weekly, monthly, or at some other interval(s).
Therefore I must respectfully but very strongly disagree with the comment by @Steven Williams: "...what about implementing a long/short scoring system and then rebalancing monthly based on the scoring system?". NO, please Q, absolutely not. Rebalancing should be at the discretion of the algo WRITER (who will then take the consequences of their own choice), not something enforced by Q monthly! 2) Scoring/Ranking basis: @ Jamie, in your original post, you write: ".... they will receive an absolute score based on two metrics: returns and volatility" This of course makes perfect sense because, after specifying & satisfying various constraints, such as whatever Q may need to impose as requirements for the fund, and also to facilitate evaluation of the algos, some combination of the two metrics RETURN & VOLATILITY are really all that is necessary to quantify the success or otherwise of each algo. As for the return part, that's fairly easy. Either a "Terminal Wealth Relative" to starting equity, or a Compound Annual Growth Rate (CAGR%) concept are easy and are pretty much the standard ways of doing it. However the "VOLATILITY" part is definitely not so simple. I ask Q please to give this part very careful thought. Either it can be done as some sort of conventional volatility-related measure, of which there are many, such as for example Standard Deviations .... but if so then of WHAT exactly: Symmetrical Upside & Downside moves, or just Upside moves (no, only joking about that;-)) , or just Downside moves (and thanks for your support @Vladimir on that), or alternatively as Drawdowns (DD) .... but again what DDs exactly?: MaximumDD , AverageDD, Root-Mean-Square (RMS) DD? Personally I would suggest the latter, as it is the best way of also taking into consideration the TIME element of the DD as well as its magnitude. Then there is the choice of whether the "volatility" part is best expressed on its own (as described above), or instead as some ratio to the returns part? 
The latter choice makes more sense to me, as it highlights the combination of reward AND risk together. If it is expressed as a ratio, then there are some well-known conventional measures already familiar to Q, for example:

Returns / StDev of Returns (Up and Down) --> Sharpe ratio
Returns / StDev of Downside only --> Sortino ratio
Returns / MaximumDD --> MAR or CALMAR ratio
Returns / RMS DD --> "Ulcer Performance Index" or UPI ratio
Slope & Standard Error of regression line --> Lars Kestner's "K-ratio"
and others.

My personal preference would probably be either for Sortino, as it highlights the downside part of volatility, or even better the UPI ratio based on RMS Drawdown, but the others have merits too. What I think is most important is that Q does not take an "academic" approach (such as simply using StDev), but favors the use of more industry / practical trading related concepts such as Downside Volatility and/or Drawdown (@Vladimir & I are very much aligned here, I think).

Finally on this point, there is the question of whether or not Q is going to continue making a single (1-dimensional) ranking of participants in contests. From the point of view of algo selection for Q, this is probably neither necessary nor desirable, and a 2-D "minimum threshold" requirement on the dimensions of BOTH 1) Return & 2) Volatility (or ratio) will presumably be required for selection of algos to actually be used by Q in future. However, if Q is going to have a "ranking of winners" (which some participants like), then how will the 1) Return and 2) Volatility (or ratio) parts be combined into a single final score number? Transparency please.

3) Application of Algos in Q: This one is especially for @Dan & others in Q, as well as the general Forum readers. My apologies if I'm taking up a fair bit of space here, but I believe this part is ESSENTIAL.
In addition to the items above, there is an aspect to the contests and how they relate to the professionalism and the future viability of Q as a hedge fund which has not, as far as I know, been mentioned in any of the Forums or elsewhere yet. Obviously to attract and hold the interest of participants (algo authors) the contests must be limited in duration, and 6 months of live trading is probably about as long as most authors (with let's call it perspective "a") would be happy to wait. However Q presumably wants algos that generally have a longer shelf life than that and so, from that side (perspective "b"), 6 months is probably the very minimum for giving any assurance that a algo is actually viable under a range of different market conditions. In my opinion, 6 months is actually WAY too short unless some other measures are taken. However 6 months probably represents about the best workable compromise between the two sets of conflicting perspectives a) & b). Now, assuming that Q actually wants algos that are durable and robust under different market conditions in future, is there anything that Q can do to improve the chances of success in future, rather than just simply running a live test for 6 months? Yes, there certainly is, and it takes very little extra time or resources. Before describing, it is necessary to understand the following: The RETURN from any algo or trading system of course depends on the results of all the individual trades, and is ORDER-INDEPENDENT. For anyone to whom this is not intuitively obvious, it is easy to prove. The order of a sequence of gains & losses can be changed, but the final return remains the same. Conversely however, while DRAWDOWN or VOLATILITY (or however else one chooses to describe and quantify the flip-side of returns from trading), again depends on the results of all individual trades, but this aspect is ORDER-DEPENDENT. 
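For anyone who wants to check the order-independence claim numerically, here is a minimal sketch. The trade returns are made up for illustration, and the helper names `total_return` and `max_drawdown` are hypothetical, not part of any Quantopian API. It compounds the same six trade results in two different orders and compares the final return with the maximum drawdown.

```python
import numpy as np

def total_return(trade_returns):
    """Compounded return over a sequence of per-trade returns."""
    return np.prod(1.0 + np.asarray(trade_returns, dtype=float)) - 1.0

def max_drawdown(trade_returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity = np.cumprod(1.0 + np.asarray(trade_returns, dtype=float))
    equity = np.concatenate(([1.0], equity))   # start from 1.0 before the first trade
    peak = np.maximum.accumulate(equity)       # running high-water mark
    return np.max(1.0 - equity / peak)

# Same six trade results (illustrative numbers), two different orderings.
alternating = [0.05, -0.04, 0.05, -0.04, 0.05, -0.04]
clustered = [0.05, 0.05, 0.05, -0.04, -0.04, -0.04]

print(total_return(alternating), total_return(clustered))   # same up to rounding
print(max_drawdown(alternating), max_drawdown(clustered))   # clustered losses draw down more
```

With these numbers the alternating sequence never draws down more than 4%, while the clustered sequence draws down about 11.5%, even though both end at the same equity.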
As far as Drawdown (DD) is concerned, it makes a LOT of difference whether gains & losses alternated, or a big string of losses all occurred together. Now, when Q tests an algo for 6 months, a lot of trades are generated. It can be assumed that this (generally quite large) set of actual trade results are reasonably representative of the algo in general and are in some sense "typical" of what the algo will probably do in future. This may or may not turn out to be true, but it is the best assumption that can reasonably be made. Of course we don't know what the future will hold in terms of overall market behavior, but if algo XYZ generated N trades in a 6 month test period, then it is reasonable to assume that i) it will probably generate about N trades in the next 6 month period, and ii) the overall distribution of trade (gain & loss) sizes will probably also be similar. Assumptions i) & ii) may turn out to be wrong, but they are about as good as anyone can do in advance and are consistent with statistical theory as well as common sense. However even if the trade distribution remains the same, the specific ORDER of the trade results will not be, as market history is unlikely to repeat itself. The astute reader has no doubt already figured out where this is leading. The best way for Q to estimate the most likely NEXT 6 months performance of an algo after 6 months of testing is not to assume that it will just be identical to the last 6 months, but rather to take a random sample (with replacement) of N trade results from the distribution of actual trades, see what the resulting Return & Volatility or other metrics are, and then repeat a few times, preferably at least about 10,000 or so, and then look at the Most Likely (or Median, or P50) outcome. Of course this is a Monte Carlo (MC) simulation technique, and it only takes a matter of seconds or maybe a minute or so on a PC. It is common nowadays in most commercial trading software packages (e.g. 
AmiBroker, etc), and is easy to implement. The big advantage of running a MC simulation using the results from an actual 6 month period of trading is not just that it gives a "best estimate" of the likely results from the NEXT (i.e. future) 6 months or whatever period of trading, but ALSO it generates a probability distribution for the possible trading results that allows quantification of the Black Swan type "tail risk" (to Quantopian) of continuing to use this algo in future (at least to the best of anyone's ability to do that). What I find absolutely incredible is that Q is putting so much effort into encouraging people to write good algos with supposedly leading edge "Risk" evaluation tools, but is apparently doing nothing at all with regard to quantifying even the basic elements of this different kind of risk, namely that of actually using the algos in future. To most people who have been writing trading systems or algos for at least a few years, this is very basic and well documented stuff. We would not consider using any system that had not made this sort of consideration of tail risk, and for a professional fund not to do this, especially when it is so easy, would be bordering on downright negligent!! I don't know if Q is doing this sort of MC tail risk evaluation regarding continuing to use supposedly "successful" algos, but so far i have not seen any evidence of it. I have to wonder why not, when it is so easy to implement? Please Q, for the sake of your and our future, if you are not implementing this already, then start doing it now AND include it in the final evaluation of all algos, even if not (which would be easy enough) as an ongoing item while the algos are still running during testing. After all, if people can wait for 6 months to see their "final results", then a few more minutes of calculation time after that should be easy for everyone. The future of Q may just depend on this. 
Evaluating and quantifying as well as possible the tail risk associated with continuing use of algos is CRITICAL. Ignoring tail risk leads to consequences like LTCM. Please Q, don't let that happen to you/us. Start adding in MC evaluations of tail risk based on the distribution of trade returns from all algos ASAP. Thanks in advance for your consideration, TonyM.

Hi Jamie, as to your statement "You’re right that beta-to-SPY can be tough to predict. We might have to compute the 95th percentile of beta-to-SPY to avoid disqualifying algos that have a spike on a particular day.": I wasn't referring to the fact that daily beta might spike. I am referring to the fact that the beta predicted from the last 252 days (as used by the standard formula) is useless when the behavior of a stock, or of the stocks in a sector, has changed considerably in the near term. If we continue to use the trailing 1-year beta, we keep making errors in the beta estimates of sectors that have gone from hot to cold or from cold to hot. It takes a longer period of time for these estimation errors to even out and for the portfolio beta to come within the bounds that we set in the order_optimal_portfolio constraints. That is my guess as to why beta swings so wildly over different time ranges going forward but always converges to 0 over a very long multi-year timeframe.

@ Jamie - How do you plan to handle your fee-based data sets for the contest? It would sure make sense, if you want the crowd to use them, to figure out how to allow them to be used in the contest. I realize that you have to pay for the data and are presumably just passing along the cost, but my understanding is that you can evaluate algos that use the fee-based data all the way up to the present, even though the author has not paid fees. So, maybe you could apply the same scheme for the contest?
Personally, I'm not going to pay for a data set (I'm not sure why anybody would, now that retail trading is no longer supported), but I might be motivated to attempt to develop an algo using fee-based data if I would eventually get summary performance feedback at the end of a 6-month out-of-sample contest period.

@Jamie-- Will algos using fetched data be allowed in this new competition? I've been jamming on an algo that uses the fetcher with the sole intent of readying it for the Q Fund, but it'd be great to be able to put it into a contest as well. Since the Q fund accepts algos with fetched data, it seems like including these in the competition would work towards the goal of aligning the contest more closely with the fund, right?

@ Jamie - I'd be curious about the rationale behind a 2-year backtest. I can see a requirement for a minimum 2-year backtest, but wouldn't it be prudent to run the backtest as far back as the data will support (possibly awarding bonus points for long backtests)? It helps guard against over-fitting to recent market conditions, and also samples more than just a single market cycle.

@Grant I'm with you on the fee-based data. There needs to be some way to encourage the use of alternative data. Paying up front for data with only a chance that an algo might win a contest seems like a poor bet for most of us. A typical data subscription costs ~$30/mo (for all data sources it's $395/mo). A 6-month out-of-sample 'live' run would cost $180 out of pocket.

If the use of 'alternative data' is still a stated goal of the Q fund then there needs to be some way to make this data available without 'taxing' the developers.

@Georges: Right now, the risk model uses end-of-day positions to compute factor exposures. We’re currently working on improvements that will allow it to compute exposures regardless of the portfolio state at EOD, but it isn’t quite ready. I imagine when this is available for us to use in the contest, we will reconsider the lower bound on leverage, or at least how it is computed (e.g. not requiring EOD positions). For the upper bound on turnover, I’ll have to take a look at the tear sheet that you sent in with our investment team and get back to you.

@Dan (Burrito): Thanks for clarifying your suggestion. That sounds reasonable to me. Currently, it is not in the risk model tear sheet because it is not used in the multiple linear regression model mentioned here. That being said, we should be able to include the trailing 6 month value in the full backtest view as well as the tear sheet. We are currently working on improving the feedback you get on backtests (in addition to the new contest rules), so we’ll see if we can fit it in there.

@Vladimir: The requirement for using Optimize to place orders is simply “you must use order_optimal_portfolio to place orders”. Other methods like order_target_percent will not be allowed. Regarding the QTradableStocksUS rule, as others have mentioned, I was not clear enough, so I apologize. This constraint will be reworded to something like “Only order stocks in the QTradableStocksUS”. You can learn about what is in the QTradableStocksUS and see how to use it in a pipeline here.

The constraints around net dollar exposure, sector & style exposures, and beta-to-SPY are unlikely to change. You are correct that there might be some algorithms that perform well without meeting these constraints. However, these constraints were chosen to more closely reflect the types of algorithms that we are looking for in the allocation process.
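To make the net dollar exposure discussion concrete, here is a small sketch of the arithmetic behind the "55% longs and 45% shorts" figure raised earlier in the thread. It assumes that net exposure is the sum of signed position weights and gross exposure is the sum of absolute weights, both as fractions of portfolio value. That definition, the helper name `net_and_gross_exposure`, and the four example weights are assumptions for illustration, not the contest's exact computation.

```python
import numpy as np

def net_and_gross_exposure(weights):
    """Return (net, gross) exposure for signed position weights.

    Assumed definitions: net = sum of weights, gross = sum of |weights|.
    """
    w = np.asarray(weights, dtype=float)
    return w.sum(), np.abs(w).sum()

# Hypothetical book: 55% of capital long, 45% short, fully invested.
weights = np.array([0.275, 0.275, -0.225, -0.225])
net, gross = net_and_gross_exposure(weights)
print(round(net, 3), round(gross, 3))
```

Under these assumptions, a fully invested book with 55% long and 45% short has a net exposure of 0.10, so any tighter long/short balance stays inside a 0.1 limit.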

@Tony: I don’t think that Steven was suggesting that the schedule be mandated by Q (and that’s certainly not something that we’re considering). I believe he was asking about a strategy that trades monthly. Regarding the ranking system, you’re right that we are going to use a 1-dimensional system (your ‘score’) to rank participants in the contest. As noted in the original post, that scoring function is the cumulative sum of the 90-day volatility-adjusted daily returns (included in the attached notebook for clarity). The output at the bottom of the notebook includes a ‘Cumulative Score’ as well as an ‘Out-of-Sample Score Over Time’. Note that the rankings will be based on the ‘Cumulative Score’ over the out-of-sample period for participating algos.
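For readers who cannot open the notebook, here is a rough sketch of what "cumulative sum of volatility-adjusted daily returns" could look like. This is not the notebook's code. It assumes a trailing window of 63 trading days as a stand-in for "90-day", uses a plain standard deviation with no annualization or floor, and skips the first 504 days to mimic the 2-year backtest. The function names `daily_scores` and `cumulative_score` and the random returns are illustrative only.

```python
import numpy as np

def daily_scores(returns, window=63):
    """Each day's return divided by the volatility of the preceding `window` days.

    window=63 is an assumed trading-day equivalent of "90-day".
    Days without a full trailing window are left as NaN.
    """
    returns = np.asarray(returns, dtype=float)
    scores = np.full(returns.shape, np.nan)
    for t in range(window, len(returns)):
        vol = np.std(returns[t - window:t])   # trailing window, excludes day t (assumption)
        if vol > 0:
            scores[t] = returns[t] / vol
    return scores

def cumulative_score(returns, window=63, skip=504):
    """Running sum of daily scores after dropping the first `skip` days."""
    return np.cumsum(daily_scores(returns, window)[skip:])

# Hypothetical daily returns: 504 backtest days plus 126 out-of-sample days.
rng = np.random.default_rng(0)
rets = rng.normal(0.0005, 0.01, size=504 + 126)
score = cumulative_score(rets)
print(len(score), score[-1])
```

One property worth noticing in this sketch: scaling every return by the same factor (for example, doubling leverage) leaves the score unchanged, because the volatility in the denominator scales by the same factor.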

@Jim: We are currently working on a new system that will allow you to import your own data for use in research and algorithms. You will be allowed to use data pulled in with the new system in the contest. We’re aiming to have this new system available by the time the new contest starts. If it’s not ready by contest launch, it should be available soon after.

@Dan W, @Grant: We don’t yet have plans to change the model around partner datasets, but we’re exploring some ideas to make these types of datasets available to you for free in development and contest participation. We’re also working to expand the number of free datasets that are available for use on the platform.

Jamie, to keep leverage at a minimum of 0.8, I'll need ~400% daily turnover. I can send you the tear sheet of this variation as well. I kept the model trading only during the day because I see that as being significantly less risky.

@ Jamie -

For the contest, I'd be fine with how you are doing it today for allocation assessments--allow an algo to be developed and a backtest run up to the last date of free sample data, and then (presumably at your discretion, depending on the backtest results), run another backtest up to the present (and, if indicated, into the 6-month out-of-sample period). Then if the whole thing looks decent, you'd award a little prize (out of which you could subtract your cost for the partner data, if this is required to make it fly). There seem to be a variety of ways to keep the partner data cost in check, if you approach things in stages for the contest.

It would seem a bit unfair to allow your non-free data sets to be used in the contest, but not provide any path whatsoever for users to use them without paying (putting them at a potential disadvantage to those who are paying or otherwise have access).

On a separate note, it would be interesting if you opened up the contest to Quantopian employees/consultants (and used their real names), but didn't allow them to win any contest money. I'd be curious how the crowd compares with the team you are assembling.

In support of Grant's comment: I'd be curious about the rationale behind a 2-year backtest.

There is no doubt that the market behaves differently in its bull and bear legs.
The backtest period should include an equal number of full bull and bear legs.
As of today, that means more than 10 years.

@ Jamie
What about from above:

I strongly support Tony Morland request to use "downside volatility" as a volatility component.

Change the rule no more than 3 algo in all contests to no more than 3 algo in any contest.
Please let us change algo in forthcoming contests leaving existing in already running contests.

Please make changes to the rules that will require on first day of contest run backtest for all participants for the same period and start them running in paper trading the same time with the same initial capital.

Finally figured out the reasoning for the ranking change. Try this code, folks; it simulates, to some extent but not exactly, the algo ranking using the cumulative score. The lesson, I think, is that Q wants to favor algos that have more positive-return days over those that have sudden big positive-return days and a steady stream of negative-return days. This is indicated by the higher score below for the case 2 returns distribution than for case 1, even though both have net returns of +1 at the end of the 5-sample period.

Jamie, am I right about the motivation for the ranking change? It looks like the new contest has tightened the beta and sector exposure limits and added style exposure limits, along with a volatility denominator to penalize big jumps in returns or drawdowns. Basically, the algorithm needs to make returns at a steady rate throughout the 6-month out-of-sample period. A returns curve resembling a low-slope straight line could get a high score thanks to its low volatility and steady rate of return.

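import numpy as np

# case 1: four small losses, then one large gain (net return +1)
a = np.array([-1, -1, -1, -1, 5])
s = np.std(a)
print(round(np.sum(a / s), 2))
# 0.42

# case 2: gains spread over two days (net return +1)
a = np.array([-1, -1, -2, 3, 2])
s = np.std(a)
print(round(np.sum(a / s), 2))
# 0.52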

Jamie, I think there is a typo in compute_score in the notebook. I think the slice in this line should be [-504:]:
cumulative_score = np.cumsum(daily_scores[504:])

Modified Jamie's notebook to provide average yearly score for a back test id. Might be helpful to select algorithms to enter into the contest. I gained some valuable insights by looking at the scores of some of my algorithms which are not apparent from just the backtest results.


@ Guy

The start time and rules are fixed for everyone. There are no entry fees. But, in order to win anything, you must be at the starting gate with your algo and from there get to the finish line.

If it were as you wrote, there would be no problems.
But there is no starting gate that opens at the same time for everyone.
And there is really no finish line for anyone but the winner.

One of my entries has been in the race for more than a year and has already brought me 6 T-shirts.
It carries in its bag everything earned until May 2017 and is successfully competing with those who just started seven months later in May 2017.

My opinion: it should be as you wrote.

@ Jamie

Please make changes to the rules that will require on first day of contest run backtest for all participants for the same period and start them running in paper trading the same time with the same initial capital.

Hi @Leo.
You write: "I think that Q wants to favor algos that have more positive returns days over those that have sudden big positive return days and steady stream of negative returns days". I expect that you are probably quite correct, and it makes sense doesn't it. A large hedge fund with wealthy and mature investors will presumably have the fund's goals closely aligned with the investors' aspirations, which are likely to be the achievement of returns that are reasonably high, AND with as much as possible of the steady reliability of a bond or bank account with "no surprises". So that means an equity curve that looks as close as possible to a (semi-log) straight line.

The challenging part for the fund then is how to achieve this on a long-term basis, using algos that have only been tested for a 6-month out-of-sample contest period. @Vladimir, @Grant & others make the point that any algo is likely to perform differently during different market regimes (bull, bear, volatile, quiet, etc.) and that 6 months is actually too short. As @Vladimir correctly points out, from today we have to go back more than 10 years to even see a bear leg (and more like 20 years for multiple bull-bear cycles). That's the sort of time frame that I personally like to look at if possible when developing & testing my own algos. However, if we do that, then we also find that many things which worked 20 years ago or even 10 years ago just don't work any more (e.g. a lot of the traditional TA "indicators"), after costs are included.

As a result, while personally I agree with @Grant's notion that running backtests as far back as the data will support is generally a good idea for an algo developer to take on board, I must then DIS-agree strongly with @Grant's follow-up idea of: "possibly awarding bonus points for long backtests". I think it would be very wrong for Q to allocate any sort of "bonus" or other scoring for the back-test period, as doing so would only encourage people to "cheat" (actually to cheat themselves really) by indulging in the usual type of curve-fitting that seems great when looking backwards but is even-worse-than-useless when looking forwards.

This brings us to @Guy's comments about the contest being like a race in that it has a (somewhat artificial) "finish line", whereas real life investing doesn't, and i certainly agree with that notion and its implications, as Guy describes. The problem of course is what to do about it!!

Q has to have some basis for selecting algos, and participants won't wait around forever, but Q also needs to have some sort of assurance that selected algos will have a reasonable probability of continuing to be good AFTER each contest is over. Does anyone have any concrete suggestions about this? To me this seems like an area that is still very much lacking, not so much from the point of view the CONTESTS themselves, but rather from the point of view of the application of the RESULTS of the contests and the relevance to Q's fund's future.

Several posts earlier I expressed my views about one possible way to address this issue being the use of MC simulation applied to each algo, which should provide Q with a high "benefit to cost ratio" in terms of the relatively little effort required to implement it, in exchange for at least some quantification of tail risk. If the reason that people at Q apparently haven't picked up on this suggestion is because they don't fully understand, then I can certainly send a list of references to read.
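To show how little code the idea takes, here is a minimal sketch of the resampling described in the earlier post: draw N trade results with replacement from the observed trades, compound them, and repeat about 10,000 times to get a distribution of outcomes. The trade returns below are synthetic placeholders, the function name `bootstrap_outcomes` is hypothetical, and resampling trades independently is itself an assumption (it ignores any serial correlation between trades).

```python
import numpy as np

def bootstrap_outcomes(trade_returns, n_sims=10_000, seed=0):
    """Resample trades with replacement and return (final_return, max_drawdown)
    arrays, one entry per simulated path."""
    trades = np.asarray(trade_returns, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(trades)
    # Each row is one simulated sequence of n trades drawn with replacement.
    samples = rng.choice(trades, size=(n_sims, n), replace=True)
    equity = np.cumprod(1.0 + samples, axis=1)
    final_return = equity[:, -1] - 1.0
    # Prepend starting equity of 1.0 so a loss on the first trade counts as drawdown.
    curve = np.hstack([np.ones((n_sims, 1)), equity])
    peak = np.maximum.accumulate(curve, axis=1)
    max_dd = np.max(1.0 - curve / peak, axis=1)
    return final_return, max_dd

# Synthetic stand-in for 200 observed trade results (not real data).
observed = np.random.default_rng(42).normal(0.002, 0.02, size=200)
final_return, max_dd = bootstrap_outcomes(observed)

print("median return:", np.median(final_return))
print("5th percentile return:", np.percentile(final_return, 5))
print("95th percentile max drawdown:", np.percentile(max_dd, 95))
```

The median gives the "most likely" style estimate, while the tail percentiles give a rough handle on how bad the next period could plausibly be if the trade distribution stays the same.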

Coming back to Guy's post above, there are 2 items that I would like to question now.
The first question (to @Guy) is: where does the number 2,000 "algos at the gate" come from? Is this an extrapolation from the current number of about 600 or so in recent contests, or from some other source?

The second and more important question, which I address now to Q, is the following, and to be honest it really shocked me to read Guy's assertion that: ".... only one individual has had an allocation to date". Is that correct? I can only hope that it is NOT correct and that Guy's info about this is very much out of date, i.e. as in more than 30 contests ago!!

Conversely, if it is actually so, then it would make a falsehood out of Q's statement, on the very first page of the Introduction on the website, which specifically states: "Quantopian is providing capital allocations for our community's top-performing algorithms. The author of the strategy shares in the net profits on the algorithm".

I am now preparing to commit a lot of time & effort to writing good quality algos for Q and building a long-term relationship for mutual benefit, sharing in the fruits of that work. I know other people are doing likewise. Personally I like the Q platform and the Q people I have met. However, simply as part of my own process of conducting due diligence before embarking on a major effort, I would like to be reassured that Q does really deliver as promised. Thirty five contests have now been run, with the stated aim of encouraging a diversity of algos for Q to use. Of course I don't need to know any personal or proprietary info such as how many algos Q is actually using in the fund, or what they are, or whose they are, but surely we are entitled at least to know that Q does actually deliver on more than just trivial cash prizes. So then, simply for the sake of giving us confidence in Q, the question is: At least approximately, how many allocations HAVE ACTUALLY been made?

@ Jamie -

Above, you say:

When you submit an algorithm to the contest, we will automatically run a 2 year backtest over which these constraints will be tested. As your algorithm continues to run out-of-sample, these constraints will continue to be monitored.

But this doesn't say anything about how the algo performance will be evaluated. Will the scoring take into account both the backtest & the out-of-sample period? Will there be any feedback/judging based on an assessment of over-fitting (along the lines of https://www.quantopian.com/posts/q-paper-all-that-glitters-is-not-gold-comparing-backtest-and-out-of-sample-performance-on-a-large-cohort-of-trading-algorithms)?

As mentioned above, 2 years is already too short for a solid performance assessment, but if the contest awards are based solely on the 6 months out-of-sample, then you are likely to be incentivizing the wrong behavior. But then, maybe your model is that only the most recent trends are valid for investment?

@Grant, my understanding is that the purpose of the backtest is to ensure you meet the constraints going back 2 years, and also to provide the seed 90-day volatility needed to start scoring the contest algo when the out-of-sample period starts. The final contest score will be the cumulative sum of the out-of-sample daily scores during the contest period.

@Tony, I vaguely remember that the number of algos that are currently running in the Q fund is 14 as mentioned in a Quantopian webinar two months ago in reply to a similar question as yours . Youtube url for the webinar (in the Q&A towards the end) https://www.youtube.com/watch?v=SYob6WIUaFs

@ Leo M, thanks. I wasn't able to see the webinar at the time it was presented but I just watched/listened to it now. It's a nice overview presentation by Jess Stauth about the algo selection process that Q uses. As you say, in the webinar it was confirmed that Q is indeed currently running 14 algos in the fund. However, that is still not quite the same as the question I am asking. Please can someone at Q advise whether the number of ALLOCATIONS actually made (or at least agreed to, even if not yet paid out) to individuals is 14, or only one as claimed by Guy, or some other number?

@Vladimir: At this time, we are not planning to change the rule from 3 algos in all contests to 3 algos in any contest. However, there is a longer term plan to have more contests run in parallel. In one of my earlier responses, I mentioned that we are focusing on a specific type of strategy: long-short, cross-sectional equity trading with a mid to high frequency trade schedule. In the longer term, we are hoping to have other contests running for other types of strategies. When this happens, I expect we would allow more entries to be run at a time.

Regarding the fixed start date, we hear you on the problem. With the current rules, entries can run for a year and be compared directly to those that run over 6 months. We are planning to fix this problem another way, by capping the evaluation window. The reason we don't want to fix the start date is because we want someone who submits an algorithm to the contest to be able to see their entry on the leaderboard as soon as possible. The fixed evaluation period window length should solve both problems.

Regarding downside volatility, we are not currently planning to use it in the score computation, but I'll discuss with some other folks internally. It's an interesting suggestion.

@Leo: That's certainly a big part of why we're looking at volatility adjusted returns. I don't think there's a typo in the line cumulative_score = np.cumsum(daily_scores[504:]). The goal of the slice is to remove the first 2 years of the backtest to emulate the automatic 2 year backtest that gets kicked off when you submit to the contest. The score is only generated from the out-of-sample performance, which comes after that 2 year period. Of course, any backtest that you run through this notebook doesn't really have out-of-sample data, but the idea is that you can run it as though you had submitted on a particular day. For example, if I wanted to see how an algo would have done had I submitted on 6/1/2017 using this notebook, I'd run a backtest from 6/1/2015 to today. The entire 2.5 years would be used to evaluate the constraints, but the score would only come from the returns since 6/1/2017. Does that make sense?
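
To make the slicing concrete, here is a minimal sketch of the idea described above, not the contest notebook itself. It assumes 252 trading days per year (so 504 days covers the 2-year backtest), uses made-up daily scores, and uses itertools.accumulate as a dependency-free stand-in for np.cumsum. The names BACKTEST_DAYS and out_of_sample_cumulative_score are hypothetical.

```python
from itertools import accumulate

# Assumption: 252 trading days per year, so the automatic 2-year backtest
# covers the first 504 daily scores.
BACKTEST_DAYS = 2 * 252


def out_of_sample_cumulative_score(daily_scores):
    """Running total of the daily scores that fall after the 2-year backtest.

    Mirrors np.cumsum(daily_scores[504:]) using only the standard library.
    """
    return list(accumulate(daily_scores[BACKTEST_DAYS:]))


# Made-up data: 504 in-sample days followed by 3 out-of-sample days.
daily_scores = [0.5] * BACKTEST_DAYS + [0.25, -0.5, 1.0]
cumulative_score = out_of_sample_cumulative_score(daily_scores)
print(cumulative_score)  # [0.25, -0.25, 0.75]
```

The in-sample scores never reach the total: a backtest of exactly 504 days yields an empty result, which matches the statement that the score only comes from returns after the 2-year period.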

Jamie, thanks. I wrongly assumed that you were trying to compute the 2-year backtest score as in the current contest. It makes perfect sense after your explanation.

Jamie,

At this time, we are not planning to change the rule from 3 algos in all contests to 3 algos in any contest.
Think it over.
It's very easy to implement.

In live trading we have in the right upper corner the red Stop Algorithm button.
If you add to that button a drop-down list with the numbers of the live running contests (36, 35, 34, 33, 32, 31, 30), then everyone can choose the contest in which to stop the algo.

If the algo is stopped in the forthcoming contest (36), then there should be space for another one in that contest.

@Tony: We have had a few authors receive two allocations, but most have only gotten one so far.

@Vladimir: You're right that stopping an algo to free up a spot for an upcoming contest is frustrating. We'll think that over. Thanks for the suggestion.

Hello!

I would like to join the colleagues who disagree with the turnover limit. A limit below 65% excludes daily strategies. Here is an example of a daily strategy (rebalanced once per day) that is profitable despite trading costs. The strategy could be improved, but I think it is a good example. It also fails the leverage and net exposure limits, but I believe this could be managed (more liquid stocks, net exposure control during the trading session). I expected turnover of around 85%, but it is actually above 100%. I think turnover is doubled because of the short positions. That means that the acceptable strategies are those with a weekly or longer period, or maybe strategies based on fundamentals.

Speaking of fundamentals, I found out with one of my clients that the quality of Morningstar data is not very high. We compared the data with FactSet. Even EBIT, for example, is often calculated incorrectly, and this happens even more often with balance sheet variables. So I wonder how reliable algorithms based on the current fundamentals are.

[Attached notebook; preview unavailable.]

Hi @Jamie, thanks for the info that "a few authors receive two allocations". That's good news, but it still doesn't answer the question I'm asking. You have at least implicitly told us that the total number of allocations must be > 1 (which someone claimed), and also must be > 2 (if there are "a few" authors ...), but it still seems rather evasive. Please permit me to repeat for clarification: After a total of 36 contests so far, how many allocations have actually been made?

Changing track, I also support @Vladimir's concerns about the current need to kill a perfectly good algo to make space for a new one at each competition, as I am now forced to do within the next 2 days. Whatever way the solution to this is best implemented by Q, thanks @Jamie for at least considering it.

Will there be a one-liner optimiser set up with all the current contest rules? It should be sufficient to redefine initialize() to build a pipe with an 'alpha' column and end with schedule_contest(pipe) and be done with it.

@Tony, on the allocation count, I will try to give you a helpful answer without being totally explicit. We don't give regular updates on the number of allocations because it's a closely-held business metric. Also, as a practical matter, the number changes frequently: allocations are made, algorithms are dropped. In September we shared that the number of allocations was 14 (as Luca noted, it was in a webinar and a few other places). I can also share that most of the allocations are made to different individuals, with a few people getting 2 allocations. I think that answers your basic question of "roughly how many people are getting allocations these days," though I acknowledge that it doesn't answer the actual question you asked.

Also, just for extra clarity: the number of allocations is totally independent of the contest results. Allocations are made to algorithms that meet our criteria, regardless of contest performance.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

@Dan, yes that does adequately answer the true intent of my question. I have no wish to pry for info that is in any way confidential. There have however been a number of negative comments in the Forum that cast doubt on Q's integrity with regard to actually making awards. I simply wanted reassurance that those Forum comments were in fact incorrect and baseless. Thanks for reassuring me. Best regards, TonyM.

I'd like to echo Norbert Halbzeit's suggestion above to encapsulate all of the boilerplate code required, should a user want to meet the constraints by applying the Optimize API in all of its glory. I'm gathering that the strong recommendation by Q is that all orders get pushed through order_optimal_portfolio with a specific set of constraints, consistent with the contest constraints.

Perhaps you could ease the cognitive load and simply release a function on Github (along with the Optimize API code and the risk model code), and then support something like:

from quantopian.experimental import order_with_contest_constraints

Then, users could either use the short-cut order function, or modify the base code, as they see fit.

It would be interesting to hear more about the rationale of using only the out-of-sample period for contest ranking. I gather that Q is looking for algos that exhibit (1+r)^n returns, like a high-yield bank CD, supporting 10 M in capital. But I'm wondering if the constraints applied over 2.5 years plus the 0.5 year performance analysis will suffice, or if statistically, there will be a lot of dud winners that won't actually exhibit the (1+r)^n returns over a relevant time frame (e.g. 20 years)? For the statisticians out there, I'm wondering if the multiple comparisons problem applies here, and if some correction needs to be made?

@Grant -

"But I'm wondering if the constraints applied over 2.5 years plus the 0.5 year performance analysis will suffice, or if statistically, there will be a lot of dud winners that won't actually exhibit the (1+r)^n returns over a relevant time frame (e.g. 20 years)?"

The contest has always had a relaxed version of requirements, while getting an allocation requires much more from an algorithm. Dan Dunn wrote:

"For the purposes of this evaluation process we look at tear sheets that start at 2010. If the algorithm passes this screen, it is evaluated of other time frames as well."

I believe it is OK not to make the contest too difficult and to leave the strict algorithm analysis for the allocation process.

"For the statisticians out there, I'm wondering if the multiple comparisons problem applies here, and if some correction needs to be made?"

This is a really interesting question, I'd love to hear the answer.

I just tested a few long/short algos with fixed slippage @ 0.05%. Brutal. :)

@Charles Piché, if you used the FixedSlippage class in your test then please note it doesn't have a volume limitation (from the documentation: "Fills under fixed slippage models are not limited to the amount traded in the minute bar").
I am pretty convinced that when Jamie wrote "0.05% fixed slippage model" he meant fixed cost but some sort of volume limitation must be included as well.

The only constraint I have concerns about is beta < 0.2 throughout or 95th percentile. Beta estimation and bounds maintenance are both Quantopian code. We don't necessarily program for beta and beta is more of an observed quantity based on our risk restrictions. Requiring a future observation to be within 0.2 throughout might require one to go ultra conservative on the beta bounds like 0.15 or 0.1 to account for some unexpected beta variance in the next 6 months. Lowering beta threshold could affect alpha optimization but we are also required to maintain a steady alpha so that could become a catch 22 type of situation. I prefer a long term beta 1yr or above to remove the noise fluctuations in beta disqualifying your algo in the contest. I am just beginning to get excited about the contest and will have to run some simulations to see how much more to cut down beta bounds. 0.3 beta bounds as mentioned in allocation page is considered low-beta, no?

@Luca Ah, right. Thanks.

Hi @Jamie, During the early time after a new algo kicks off, many of its performance metrics are initially highly unstable. It makes little sense to apply any constraints during that startup time, and I note in your original post you mention that Q will run initial 2-year backtests. Personally I don't try "tweaking" my algos to get the best possible fit at the end of the history period, and so I usually don't worry about extending my backtests right up to the date immediately before submitting. However if Q is using their "initial 2-year backtests" as you mentioned for rolling-window calculations, then presumably Q's own backtest periods must always be contiguous with the date on which each algo is actually launched. Please can you confirm?

@Norbert: Thanks for the suggestion.
We are indeed working toward creating defaults in the Optimize API to help meet these criteria.

@Leo: You're right about beta. This is one of the toughest constraints to predict. The argument against increasing the duration of beta to 1 year is that it puts too much stress on the in sample beta of the algo. When you submit an algo to the contest, the beta is completely computed from the backtest. If you think about it, the 90-day mean of the 1 year beta means that returns from ~340 days in-sample are used in the computation (252 + 90). That's a lot from the in-sample and it sort of loosens the requirement of following the beta constraint in the early parts of the out-of-sample period. Of course, the argument for a longer period is exactly as you said: beta is noisier over shorter periods of time and requiring more conservative beta constraining attempts could negatively impact alpha optimization. What do you think of a 90-day mean of absolute 6 month beta being capped at 0.3? It gives a bit more wiggle room and I'm thinking that algos which meet the other set of constraints might have beta < 0.3 naturally.

@Charles: Would you be willing to share your implementation of the 5bps fixed slippage model in this thread? It would be great to have others see how it affects their algos too.

@Charles/@Luca: Regarding volume share limits in the fixed slippage model, the current plan is to only have a price impact component and to not restrict volume. We're still iterating on the model. I'll post back here when I have more information to share.

"the current plan is to only have a price impact component and to not restrict volume"

This is unexpected. I am curious to know how the orders can be realistically modelled without volume consideration. I guess we have to wait for a dedicated post to know more about this topic.
Regarding slippage, I suppose with the QTradableStocksUS and the constraints, including the max exposure to any given security, any algo that will be worth its salt won't end up trading in any significant volume of a single stock.

@Jamie, a 6-month mean beta being capped at 0.3 sounds reasonable, allowing entirely for out of sample beta at end of contest period along with giving the optimizer some wiggle room to optimize for alpha. Having added a volatility denominator the purpose of steady returns has already been achieved to some extent. Overly constraining beta at the expense of alpha optimization might work against the primary objective of the contest by constraining the optimizer. I don't think a 6month out of sample beta is in any way representative of what you are likely going to see in a different 6 month period going forward or backward. Mean and standard deviation of beta over a longer period like 10+ years and whether out of sample beta lies within some number of standard deviations of that is probably a better indicator of out of sample beta validity, but that kind of extended analysis is probably out of bounds for the contest, although it might give a better picture of conformity to the allocation guidelines.

On a different note 10% position concentration sounds way too high for an institutional allocation. That appears to be excessive risk taking on a single equity. If you haven't made any allocation to that kind of position concentration there appears to be some room to lower that max towards the purpose of achieving lower portfolio volatility.

@Jamie, "It gives a bit more wiggle room and I'm thinking that algos which meet the other set of constraints might have beta < 0.3 naturally."

Not sure if other constraints would naturally limit beta and the optimizer will still have to do that.
For the purpose of the contest we wouldn't want to be disqualifying a whole bunch of algos in the middle of the contest by having a very strict limit on beta that is a noisy estimate in any case. A more relaxed estimate will allow for more algos to complete the contest, and allow more room in the optimizer for alpha to be maximized. I am expecting that if I set my limits to 0.2 for beta bounds in the optimizer the mean is unlikely to go over 0.3 for a 6 month period. I expect my mean to oscillate between -0.2 and 0.2 with occasional forays up to 0.25.

@Jamie, can you give some details about "More Winners"? Here is some information about how the Chinese competitor Joinquant awards the winners of their contests.

Hi Jamie - For a multi-factor pipeline-based algo, I'd be interested in an example of how to constrain the factors individually within pipeline, per your requirements, prior to their combination.

Jamie - I'm wondering if there is any evidence that rewarding what will amount to the extreme tail of a volatility-adjusted return distribution, across contest entries, works? It seems that particularly with only 6 months of out-of-sample data, one might run the risk of selecting unrealistic outliers. Is there any evidence from Q's own data or published literature or otherwise that would suggest that performance tends to persist for such outliers?

@Luca: Yeah, I might have been too strong with my wording about not using volume with the fixed slippage model. You should read it as "we haven't made a decision yet". I'll let you know as soon as I have more info. @Grant's comment about the QTradableStocksUS filter is a big part of why we're considering not enforcing a volume limit.

@Leo: You might be right about the position concentration. I'll revisit this and get back to you.
@Vladimir: I can't give exact details on the 'more winners' comment, but I can tell you that we will be increasing the number of prizes given out, and generally making the prize structure more evenly distributed. The reasoning for this is that top-heavy prize structures reward the extremes. We're hoping that a more balanced prize structure will incentivize participants to consistently do well over more contests (longer period of time), instead of taking a bigger risk or getting lucky/having favorable conditions in one particular contest (short period of time).

@Grant: The request to apply constraints within pipeline is an interesting one, thanks for the suggestion.

It is my understanding that it is obligatory to use the Optimize module, but not necessarily the Pipeline. Would it be acceptable to have an algorithm based on minute data, read in with the help of the old-fashioned history function?

@ Tim - Note that the only requirement in using the Optimize API is to use order_optimal_portfolio for placing orders. It can be used without actually doing any optimization, like this, for example:

objective = opt.TargetPortfolioWeights(weights)
constraints = []
order_optimal_portfolio(objective, constraints)

Regarding not using Pipeline at all, I'm not clear on that point, since you may still need to use it in some sense, to confirm that all stocks are in the required universe, QTradableStocksUS (although in theory, you could just fly by the seat of your pants, and hope that your universe is contained within QTradableStocksUS point-in-time).

It would be interesting to hear from Q regarding what optimization/risk management they intend to use on algos that receive allocations. If a specific implementation of the Optimize API (and whatever risk management tool they are cooking up) will be required in the end, then we might as well use it now.
@ Jamie - I note in your example algo above, you use:

import quantopian.algorithm as algo

Then, you use it in a few places in the algorithm. Is there an example/style guide that talks about quantopian.algorithm and its usage? A bit off-topic, I realize, but maybe for well-written contest algos, it would be beneficial?

@ Grant - Thank you very much for your comments,

objective = opt.TargetPortfolioWeights(weights)
constraints = []
order_optimal_portfolio(objective, constraints)

this is actually exactly what I had in mind when posing the question,

"Regarding not using Pipeline at all, I'm not clear on that point, since you may still need to use it in some sense, to confirm that all stocks are in the required universe,"

and that indeed is the other part of it, where Pipeline is only called to get the required list of stocks, which is done easily enough. I also concur with your request for a clarification from Quantopian on the possible future requirements on the use of Optimize and Pipeline.

Regarding the use of minute data, eventually one would hope that it would be incorporated into the Pipeline framework in some fashion (I gather it is a computational resource limitation at this point), so it would be imprudent to discourage using minute data in the contest. My hunch is that there is a lot of information thrown out, due to the daily bar limitations of Pipeline. However, I suppose if the model is that new alpha will come from alternative data sources, and those sources are daily (or less frequent), then a framework that supports minutely factors won't buy anything until minutely alternative data sources become available.

@Grant - I have spent a lot of time working with OCHLV daily data and was unable to find any edge in them (in recent years, at least), or at least no allocation-material approaches, so I was hoping that switching to minute data might help.
But I think you may well be absolutely right that the only solution is alternative data sources, since the massive use of quantitative trading and methods has profoundly eroded any profits that may be gained from the traditional ones. I have definitely observed some real "regime changes" in the OCHLV-based algorithms, comparing the pre- and post-2008 period performances, roughly speaking.

@ Jamie - Will the contest require the algo to run live for 6 months (on the Q simulated trading platform)? In my opinion, this would be o.k., so long as if the algo crashes, it still stays in the contest (since it would be evaluated via a backtest at the end of the 6 month out-of-sample period, not on its live trading performance). You could leave it up to contest participants whether they want their algos to run live, or they could simply independently launch their contest algos live to run during the contest (and avoid the complication altogether of having contest algos crash).

I personally think Quantopian is making a mistake by limiting themselves to long-short US equities and piling on these other restrictions. The US equities market is already one of the most heavily scrutinized. The QTradableStocksUS is even a little worse, there is nothing there I want to short. Add all these additional restrictions plus the hard-to-predict effects of the Optimize API, and I am skeptical anyone will be able to find statistically rigorous alpha. In general the cutting edge can be pushed forward by setting performance metrics -- if you tell a team of engineers that you need 50 megawatts of power, eventually you'll get a jet engine. If you tell them they need to use an internal combustion engine, then at best you'll get a monstrosity, more likely nothing.

Hi @Douglas, I certainly agree with your comment: "The US equities market is already one of the most heavily scrutinized."
It is well known that, not surprisingly, the best opportunities for finding alpha come from somewhat less mature markets. I'm not suggesting that Q starts looking at wild frontier markets where stock-exchanges as such barely exist, but there are excellent opportunities in markets that are reasonably large, reasonably mature (but less so than the US), and are well regulated. In particular I would strongly recommend that Q looks to diversify to at least the Canadian & Australian markets, i.e. the Toronto exchange and the ASX which are a lot less picked-over than the US market. @Jamie, @Dan: comments?

@ Douglas - Q is pretty transparent with respect to users; they are providing a decent sense for what they want. Presumably, they have folks with money burning holes in their pockets who will invest in the 1337 Street Fund, if certain evidence in line with the contest requirements can be shown indicating that investing would be worthwhile. As Jamie mentions above, I have to think that they are doing their best to align the contest with what will be fund-able by the moneyed powers-that-be (if they aren't working for good alignment with their potential customers, then Q has a very strange business, indeed).

I guess the argument would be that the whole enterprise of long-short US equities, as Q is approaching it, will be like trying to get blood out of a turnip. The counter argument I've heard in the past is that it is a huge segment of the hedge fund market and just capturing a small piece of the large pie would be workable. The other angle I've heard is that you gotta start somewhere and get things off the ground. Fanciness will follow.

In the end, it sounds like Q will be doling out more cash to the masses (or at least the cash will be spread over more users), with just opportunity cost to the contest participants. Whether that leads to more algos getting fund allocations, and a long-term successful hedge fund enterprise...we'll see.
@Tim: You are free to use minute data if you'd like. You will need to use Pipeline to get the QTradableStocksUS, but beyond that, there are no requirements to use Pipeline. Of course, you probably want to use it for anything other than minute data.

@Douglas: The new contest will start out focused on long-short US equity strategies. Going forward, we would like to add contests for different types of algorithms. Other types of algorithms which can typically meet the allocation criteria and can be written on our platform today include pair trading, event-driven, futures strategies, and more. This isn't to say that these are the only good or cutting-edge algorithms, there are plenty of good algorithms that do not meet the constraints of this contest. The constraints of the new contest are designed to help you write a long-short equity algorithm that could be considered for an allocation.

@Tony: The long term plan is to add more markets and more instruments as you suggest.

Jamie,

Regarding the possibility of:

"not using volume with the fixed slippage model"

what if there is no volume at all in a given minute bar? Would you still fill the order completely?

"Right now, we are planning to use a $0.001/share commission model as well as a 0.05% fixed slippage model."

Does this correspond to:

set_slippage(slippage.FixedSlippage(spread=0.05))


or

set_slippage(slippage.FixedSlippage(spread=0.0005))


or something else (since it is not explicit how "a 0.05% fixed slippage model" relates to the FixedSlippage spread parameter)?

By the way, note that the help page doesn't spell things out explicitly regarding the units of "spread" (fractional/percentage):

When using the FixedSlippage model, the size of your order does not affect the price of your trade execution. You specify a 'spread' that you think is a typical bid/ask spread to use. When you place a buy order, half of the spread is added to the price; when you place a sell order, half of the spread is subtracted from the price.

Fills under fixed slippage models are not limited to the amount traded in the minute bar. In the first non-zero-volume bar, the order will be completely filled. This requires you to be careful about ordering; naive use of fixed slippage models will lead to unrealistic fills, particularly with large orders and/or illiquid securities.

@Grant the FixedSlippage model is a fixed dollar amount. Attached is what the 0.05% fixed percentage (bps) slippage model would look like.

[Attached backtest, ID 5a246ec35e8e6540e24aeadd; the preview failed to load with a runtime error.]
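
The difference between the two readings can be worked through by hand. The sketch below is plain arithmetic, not the platform's slippage classes: fixed_spread_fill follows the help-page description quoted above (half of a dollar spread added on buys, subtracted on sells), and fixed_bps_fill is one assumed reading of "0.05% fixed slippage" in which the price moves by 5 basis points of itself against the order. Both function names and the $20 example price are hypothetical, and the final contest model may differ.

```python
def fixed_spread_fill(price, spread, is_buy):
    """Fill price under a fixed dollar spread: half the spread is added
    for buys and subtracted for sells, as the help page describes."""
    half = spread / 2.0
    return price + half if is_buy else price - half


def fixed_bps_fill(price, basis_points, is_buy):
    """Fill price when slippage is a fixed fraction of the price.

    Assumption: '0.05% fixed slippage' means the price moves 5 bps
    against the order.
    """
    impact = price * basis_points / 10000.0
    return price + impact if is_buy else price - impact


price = 20.0  # hypothetical share price in dollars

# spread=0.05 read as dollars: a buy pays 2.5 cents more per share.
buy_dollar = fixed_spread_fill(price, spread=0.05, is_buy=True)

# 5 bps of a $20 share is 1 cent per share.
buy_bps = fixed_bps_fill(price, basis_points=5, is_buy=True)

print(round(buy_dollar, 4), round(buy_bps, 4))
```

Under these assumptions the two costs only coincide for a $50 share, where half of a 5-cent spread equals 5 bps of the price; for cheaper shares the fixed dollar spread is the harsher of the two.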

Hi Jamie -

I'm gathering that one could have (Annualized Specific Return)/(Annualized Common Return) approaching infinity, and still need to reduce risk factors, but this just doesn't make sense. Why bother? Does it make sense to impose a requirement to reduce risk factors that are in the noise?

How does the proposed FixedBPSSlippage handle bars with zero volume? I thought that orders could not be filled if no historical trade occurred? Is it correct that FixedBPSSlippage would set the price, but the order would not be filled until the next non-zero volume bar?

Please correct your calculations of Sharpe/Sortino Ratios to account for risk free returns (i.e. 1 year US Treasury Bills) in your new scoring system.
For reference: Sharpe ratio = (Mean portfolio return − Risk-free rate)/Standard deviation of portfolio return
This has caused a lot of anguish from contest participants who see the no. 1 ranked contestant with an annual return of 0.00003 something!

Hi James,

You've identified one of the problems we are looking to solve with the new scoring system. Note that the current scoring function floors the annualized volatility at 2%, so we only expect to see extremely low volatility in the case where an entry also has high returns.

You can find this in the volatility_adjusted_daily_return function in the original notebook on this thread:

import empyrical as ep

def volatility_adjusted_daily_return(trailing_returns):
    """
    Normalize the last daily return in trailing_returns by the annualized
    volatility of trailing_returns.
    """
    todays_return = trailing_returns[-1]
    # Volatility is floored at 2%.
    volatility = max(ep.annual_volatility(trailing_returns), 0.02)
    score = (todays_return / volatility)
    return score
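
A small worked example shows what the 2% floor does. This is a sketch that avoids the empyrical dependency: annual_volatility below is a stand-in that assumes the sample standard deviation of daily returns scaled by sqrt(252), and both return series are invented for illustration.

```python
import math
import statistics

TRADING_DAYS = 252
VOLATILITY_FLOOR = 0.02  # the 2% floor quoted in the post above


def annual_volatility(daily_returns):
    """Stand-in for ep.annual_volatility (assumption: sample standard
    deviation of daily returns scaled by sqrt(252))."""
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


def volatility_adjusted_daily_return(trailing_returns):
    """Last daily return divided by the floored annualized volatility."""
    todays_return = trailing_returns[-1]
    volatility = max(annual_volatility(trailing_returns), VOLATILITY_FLOOR)
    return todays_return / volatility


# Invented 90-day series: a nearly flat entry and a more active one.
quiet = [0.0, 0.00001] * 45
active = [-0.01, 0.012] * 45

quiet_score = volatility_adjusted_daily_return(quiet)
active_score = volatility_adjusted_daily_return(active)

# What the quiet entry would score if volatility were not floored.
quiet_unfloored = quiet[-1] / annual_volatility(quiet)

print(quiet_score, quiet_unfloored, active_score)
```

In this made-up case the quiet entry's raw volatility is far below 2%, so the floor shrinks its daily score by a large factor, while the active entry is unaffected because its volatility is already above the floor.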


Yeah, it will be good to get the fourth-decimal-point gamers out of the contest instead of rewarding them with a top place on the leaderboard. An exponentially decaying penalty term that kicks in if the 95th percentile of the annualized absolute return falls below a certain threshold should do the trick.

@Jamie, yeah thanks for pointing that volatility is floored at 2% thus degrading the performance of abysmally small returns. That should do the trick.

Hi Jamie,

While normalizing the last daily return in trailing_returns by the annualized volatility of trailing_returns and flooring it at 2% is a step in the right direction, it still does not solve the error in the computation of Sharpe and Sortino ratios. Without subtracting the risk free return from the annualized portfolio returns, you are actually overstating the result of this metric which deviates from the original intention of the author, the venerable William F. Sharpe. The reason Prof. Sharpe adjusted the returns with the risk free return is to emphasize that if your portfolio returns are less than the risk free return, then your model returns are inadequate in terms of sound investment principles. You might as well invest your capital in a one year tax free U.S. Treasury Bill with relatively no risk at all. This correction is absolutely necessary to weed out the flawed scoring that ranks number one a model that has an annualized return of 0.00003. Had you subtracted the current risk free rate of 1.62%, this said model will have a negative Sharpe/Sortino Ratio and thus a much lower ranking, if not thrown away altogether!
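
The point about the risk-free rate can be checked with a few lines of arithmetic. The sketch below applies the Sharpe formula quoted earlier in the thread; the 0.00003 return and the 1.62% risk-free rate come from the posts above, while the 0.001% annualized volatility is an invented figure chosen only to illustrate a very low-volatility entry.

```python
def sharpe_ratio(annual_return, annual_volatility, risk_free_rate=0.0):
    """Sharpe ratio = (portfolio return - risk-free rate) / volatility."""
    return (annual_return - risk_free_rate) / annual_volatility


annual_return = 0.00003      # figure quoted in the thread
risk_free_rate = 0.0162      # 1.62%, as quoted in the thread
annual_volatility = 0.00001  # hypothetical, for illustration only

without_rf = sharpe_ratio(annual_return, annual_volatility)
with_rf = sharpe_ratio(annual_return, annual_volatility, risk_free_rate)

print(round(without_rf, 2), round(with_rf, 2))
```

With these inputs the ratio is about 3 when the risk-free rate is ignored and strongly negative once it is subtracted, which is the ranking reversal the post argues for.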

@James, well said.

@Jamie, I would like to suggest that Q publish the mathematical details of all formulas used in contest scoring. Not only would this contribute significantly to trust and transparency, it would also provide a good quality control check for Q to ensure that there are no more errors in the formulas being used, as is well illustrated by James Villa's post above.

@Tony, Amen!

If Q is truly a community-based platform, then open sourcing all calculations and code used in the analysis and evaluation of models would be the best policy for trust and transparency. The community itself can help spot errors and inconsistencies, and even provide the 'right' guidance towards achieving Q's ultimate goals.
Q probably has a battery of young PhDs and mathematicians who work on creating tearsheets and various risk models in an effort to give the model developer tools to scrutinize and evaluate how the model works or doesn't work. However, I suspect that most of these young guns have very little experience in investing and financial trading; I may be wrong, though. Then there's the matter of what Q is looking for in terms of their definition of an 'ideal' portfolio. I don't know if this is dictated by their investors or by an internal consensus of their management, but anyway, this is their prerogative and we should respect that. However, I noticed that there seems to be an emphasis on risk management, volatility and uncorrelated beta, which at the outset is not a bad thing, except when there is overkill that could constrain returns. Investment is all about risks and returns and what the investor can tolerate. Basic principles still hold: high risk/high returns, low risk/low returns, just the nature of the beast! I have yet to see the 'holy grail', low risk/high returns...the search continues!

@James, @Tony: Note that neither Sharpe nor Sortino is used in the new contest. The notebook attached at the top of this thread has the scoring function. The code might change a bit to be more efficient on the backend, but the scoring function will remain the same. For the constraints, I used pyfolio and empyrical to compute common risk metrics, both of which are open source.

@Everyone: We're still working on some minor tweaks to the constraints based on the feedback you provided. There is one more constraint that I forgot to include in the original post, which is that your algo needs to have positive returns since the start of the 2 year backtest (this one existed in the old contest as well).

Jamie,

Me and, I think, James and Tony will be happy if you change:

your algo needs to have positive returns since the start of the 2 year backtest.
to
your algo needs to have positive excess over risk free returns since the start of the 2 year backtest.

IMHO, a 2-year backtest at daily frequency is too short to be meaningful. That being said, Q has set this as their criterion and, as a contestant, I respect that.
However, let me point out that the last two years can be defined as a bull market, which introduces an upward return bias. From a modeling standpoint, this is not good because it is skewed towards one type of market condition, the bull market. Exposed to a bear market or consolidation conditions in the future, a model trained on just the last two years will likely fail. I recommend that you try to incorporate historical data that encompasses all possible market conditions, especially the 2008 shocker, to get a more meaningful generalization to the future.

My two cents on the returns calculation is that the risk-free rate may not be so relevant. I doubt that an algo with a return equivalent to the risk-free rate would be attractive for a Q fund allocation. Incorporating the risk-free rate would be a starting point, but some multiple of the risk-free rate would probably be more realistic.

So, I would change from:

your algo needs to have positive returns since the start of the 2 year backtest

to:

your algo needs to have positive excess returns since the start of the 2 year backtest

I'm not sure what the offset should be, but there should be some feedback to users on whether they are off the mark and wasting their time, or have an algo that is potentially viable, as indicated by positive excess returns. This way, if excess returns are flat or negative, it sends the message "To get an allocation, you still have some work to do, buddy!"

@Grant, I'm a bit confused by your statements. First, you start with "My two cents on the returns calculation is that the risk-free rate may not be so relevant." Then you recommend a change to "your algo needs to have positive excess returns since the start of the 2 year backtest". By using the words "positive excess returns", you are actually saying that the risk-free rate is relevant, because you are incorporating the risk-free rate, or a multiple of it, as you suggest. Again, Prof. Sharpe introduced the risk-free rate as a minimum threshold. Basically, he is saying that if your portfolio can't beat the returns of a one year US Treasury Bill, then it's not worth trading. By using a multiple of the risk-free return, you are raising the threshold, and this is arbitrary, depending on what is acceptable to the individual investor vis-a-vis their risk tolerance.

Hi James -

I'm simply suggesting that the direct risk-free rate may not be relevant to the long-short equity hedge fund style of investing. A different rate may apply for computing the excess return; that rate may be tied to the risk-free rate, but it would be higher. It would be nice to have some threshold as a guide, since one has to figure that a strategy with slightly greater than 0% returns will never get an allocation; there should be guidance to authors on what return is required.

Also, I'm not sure all trading and operational costs are incorporated into the simulations; the return we see is idealized and not what an investor would see. So, one would roll these in, in addition to whatever return is needed for Q to be a sustainable, competitive business. I have to figure that the baseline is greater than the risk-free rate.

Let me put things in the proper context. When we throw around words like "excess returns", we need to define the baseline from which the excess is derived. From the Sharpe/Sortino ratio perspective, the baseline is the risk-free rate. In the US, the investment industry standard is the 3-month US Treasury Bill. The annualized risk-free rate is computed as (Value at Maturity - Discounted Purchase Price) / Value at Maturity x (12/3). Say you purchased today a 3-month US Treasury Bill with a maturity value of $1,000 at a discounted price of $995: annualized risk-free rate = (1,000 - 995) / 1,000 x (12/3) = 0.02, or 2%. This basically sets the minimum threshold for evaluating portfolio performance. If you can't beat this, then your portfolio is not worth trading.
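The Treasury Bill arithmetic above can be checked with a few lines of Python. This is only a sketch of the simple annualization described in the post (discount relative to maturity value, scaled by 12 over the months to maturity); the function name `annualized_discount_rate` is made up for illustration.

```python
def annualized_discount_rate(maturity_value, purchase_price, months_to_maturity):
    """Discount as a fraction of maturity value, scaled up to one year."""
    discount = (maturity_value - purchase_price) / maturity_value
    return discount * (12 / months_to_maturity)


# The example from the post: a $1,000 bill bought at $995, maturing in 3 months.
rate = annualized_discount_rate(1000.0, 995.0, 3)
print(f"{rate:.2%}")
```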

Now, let's take a more stringent baseline for "excess returns" in the context of the Capital Asset Pricing Model (CAPM), which describes the relationship between systematic risk and expected return for assets. It is computed as follows:
Expected return (CAPM) = risk-free rate + portfolio beta x (expected market return - risk-free rate)
The expected market return is normally represented by S&P 500 returns. Investopedia explains the rationale below, and I quote:

"The general idea behind CAPM is that investors need to be compensated in two ways: time value of money and risk. The time value of money is represented by the risk-free (rf) rate in the formula and compensates the investors for placing money in any investment over a period of time. The risk-free rate is customarily the yield on government bonds like U.S. Treasuries. The other half of the CAPM formula represents risk and calculates the amount of compensation the investor needs for taking on additional risk. This is calculated by taking a risk measure (beta) that compares the returns of the asset to the market over a period of time and to the market premium (Rm-rf): the return of the market in excess of the risk-free rate. Beta reflects how risky an asset is compared to overall market risk and is a function of the volatility of the asset and the market as well as the correlation between the two. For stocks, the market is usually represented as the S&P 500 but can be represented by more robust indexes as well.

The CAPM model says that the expected return of a security or a portfolio equals the rate on a risk-free security plus a risk premium. If this expected return does not meet or beat the required return, then the investment should not be undertaken."
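As a rough illustration of the CAPM formula quoted above, the sketch below computes the required return for a few betas. The 2% risk-free rate and 8% expected market return are hypothetical inputs chosen only to make the arithmetic easy to follow, and `capm_required_return` is an illustrative name.

```python
def capm_required_return(risk_free_rate, beta, expected_market_return):
    """Risk-free rate plus beta times the market premium."""
    market_premium = expected_market_return - risk_free_rate
    return risk_free_rate + beta * market_premium


risk_free_rate = 0.02          # hypothetical
expected_market_return = 0.08  # hypothetical

for beta in (0.0, 0.5, 1.0):
    required = capm_required_return(risk_free_rate, beta, expected_market_return)
    print(f"beta={beta:.1f} -> required return {required:.2%}")
```

With a beta of 0, the required return reduces to the risk-free rate; with a beta of 1, it equals the expected market return.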

I cannot emphasize enough the importance of incorporating the risk-free rate in all returns analysis, as it represents the time value of money, a very basic investment principle. I checked Q's open source code in empyrical for Sharpe, Sortino and Omega; the calculations are all correct and have a provision for the risk-free return. HOWEVER, IT IS SET/DEFAULTED TO ZERO IN ALL OF THEM, TSK, TSK ... Prof. Sharpe and Sortino must be turning in their graves!

@Karl, LOL! That would be from 2010 to mid 2015. Click the Maximum option in the chart and see that in 1981 it was 14.3%, thereafter averaging around 6% before going down to around 0.2% in mid 2015 and slowly creeping up again. These are significant historical numbers. So imagine that in 1981 your portfolio had to beat 14.3% annual returns. The S&P 500's annual return in 1981 was -4.7%. Even if your portfolio made money in 1981, say 10%, and beat the S&P 500, you would still have been better off investing in risk-free Treasury Bills. Just saying....

@Karl, so your justification is no harm, no foul! That is not good enough for me, because when I backtest I use 10-15 years of data to encompass all possible market conditions (bull, bear, consolidation, random) for better generalization to the future. And all this time I am staring at performance metrics like the Sharpe and Sortino ratios that are overstated because they do not deduct the risk-free rate. ZERO is not a reasonable historical number, and it has allowed the leader of contest 34/35 to game the contest. Look at his numbers:
300  ANNUAL RETURNS     0.0003249%
1    ANNUAL VOLATILITY  0.00009459%
31   SHARPE             3.486

If you subtracted the current risk-free rate of 1.05% from his annual returns, he would have a negative Sharpe ratio. This is a major flaw!
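The effect described above can be reproduced from the three figures quoted in the post. This sketch uses the simple annual ratio (return over volatility), which is an approximation: the leaderboard presumably computes Sharpe from daily returns, so the zero-rate result lands near, not exactly on, the 3.486 shown.

```python
# Figures quoted in the post, converted from percent to fractions.
annual_return = 0.0003249 / 100
annual_volatility = 0.00009459 / 100
risk_free_rate = 1.05 / 100

# Sharpe-style ratio with the risk-free rate defaulted to zero.
sharpe_zero_rf = annual_return / annual_volatility

# The same ratio after subtracting the risk-free rate from returns.
sharpe_with_rf = (annual_return - risk_free_rate) / annual_volatility

print(round(sharpe_zero_rf, 3), round(sharpe_with_rf, 1))
```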

@James makes the very important point that: "the expected return of a security or a portfolio equals the rate on a risk-free security plus a risk premium. If this expected return does not meet or beat the required return, then the investment should not be undertaken."

The second part of this statement implies that we can substitute the words "Minimum Required" in place of the word "expected", and then the first sentence holds true for any prudent investor, irrespective of whether or not they specifically use the CAPM.

No sensible investor would invest in "risky" stocks unless they can get a HIGHER return than they would from "risk-free" treasuries!

So, for sensible and prudent investors, the MINIMUM required return for investing in stocks or any other non-bond assets = Rf + Premium for that particular asset.

Premiums for different asset classes and specifically for different shares depend on the size and quality of the company, the degree of leverage it uses (e.g. Debt to Equity ratio), the countries in which it operates, and various other company-specific factors, as well as the recent overall state of the markets. Median values of Equity Risk Premium (ERP) are typically of the order of 6% or so. Damodaran has some good info on this topic, see for example:
http://people.stern.nyu.edu/adamodar/pdfiles/papers/ERP2012.pdf
http://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/ctryprem.html

So, if we assume Rf = approx 1 or 2% now, then prudent investors (not wild speculators) are going to be looking for minimum returns of about 1 to 2 + something around 6 as ERP = required return of 7 or 8% pa from stocks now, which makes a complete nonsense of assuming 0% in the calculation of meaningful reward:risk ratios that prudent investors would use.

@Guy, thanks for concurring.

Now that we've established E[R(p)] as the objective function to be maximized, let's look at the constraints. There are 9 constraints, as detailed by @Jamie in the top post. I assume that these constraints are designed to shape what Quantopian and/or their investors consider their "ideal portfolio" or, better yet, "preferred portfolio". This is where the dilemma lies. One can question whether this is the right mix for the desired return/risk objective and, even if it is, whether it is optimal. I haven't dug deep yet into the impact of each individual constraint on the desired returns/risks. However, I worry about an overkill of constraints that limits the return potential. Volatility, as measured by the standard deviation of daily returns, is slightly flawed because daily percent changes are not normally distributed, so this risk measure is also slightly off from reality. It is better to use daily log differences, which are closer to a normal distribution with fat tails, similar to a Pareto distribution. Are drawdown-based risk measures closer to reality? These are just a few questions to sort out.
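For readers unfamiliar with the distinction between daily percent changes and daily log differences, here is a small NumPy sketch. The price series is made up; the point is only to show how each return type is computed and that log returns add up across days.

```python
import numpy as np

# Hypothetical daily closing prices.
prices = np.array([100.0, 102.0, 99.0, 101.0, 104.0])

# Daily percent change (simple returns).
simple_returns = prices[1:] / prices[:-1] - 1.0

# Daily log differences (log returns).
log_returns = np.diff(np.log(prices))

# Log returns sum to the log of the total price ratio over the period.
total_log_return = np.log(prices[-1] / prices[0])

print(simple_returns)
print(log_returns)
print(log_returns.sum(), total_log_return)
```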

Well, my vote would be to define the Sharpe ratio in whatever way is the most widely accepted (I guess that means subtracting off the point-in-time risk-free rate). However, it would still make sense to have another number, added to the risk-free rate that would be representative of the kind of return Quantopian needs, based on its business and the market for such investments. For example, presently, if Quantopian needs 10% above the risk-free rate (unleveraged) and the risk-free rate is 1.5% (a minimum return of 11.5%), then if the risk-free rate doubles, a minimum return of 13% would be required to be in the black.

It does raise an interesting question. If the risk-free rate were to go to 10% or higher, would this whole long-short equity enterprise make sense? Why, under such conditions, would one expect to be able to make a go of it? If this were the case, then I suppose it would be captured by the contest rules by awarding nobody prizes (since all of the returns adjusted by the risk-free rate would be negative).

Let me break down the equation qualitatively:

Expected Portfolio Returns = actual annual portfolio returns + risk free rate
This is because the constraint of beta to SPY is targeted at 0 as optimal. This means they want the portfolio returns to be totally uncorrelated with the market (S&P 500) returns.
But since they set the risk-free rate at 0, they are not accounting for the risk-free rate either. So the equation becomes:
Expected Portfolio Returns = actual annual portfolio returns
So in the end, the actual returns are what they call alpha. And this is reflected as the parameter to maximize using the Optimize API.
And currently this is scored by this formula/code:

import empyrical as ep

def volatility_adjusted_daily_return(trailing_returns):
    """
    Normalize the last daily return in trailing_returns by the annualized
    volatility of trailing_returns.
    """
    todays_return = trailing_returns[-1]
    # Volatility is floored at 2%.
    volatility = max(ep.annual_volatility(trailing_returns), 0.02)
    score = todays_return / volatility
    return score

@Jamie, is there any basis for the 2% floor on volatility, or is it just an arbitrary threshold? I am not convinced that volatility_adjusted_daily_return describes alpha best. My beef with volatility as computed is the assumption that daily returns have a normal distribution, you know, the bell-shaped distribution chart. Empirical studies say they do not. What does this mean? To me it means that volatility, as computed by standard deviation and used as the risk measure, is not in tune with the reality of the market and is therefore not accurate. This is why post-modern portfolio theorists are proposing alternatives that are more in tune with reality: newer risk measures such as Prof. Sortino's, which accounts only for negative or downside deviations, or those that rely on drawdown information as a better measure of risk, like CALMAR, Ulcer, etc. Tony Morland has touched upon these in his previous post. This is something that Q should be open-minded about.

There's a discussion of risk here:

http://www.greenwichai.com/index.php/hf-essentials/measure-of-risk

It seems the contest rules may be leaving out some important measures of risk. In particular, negative skew would seem to be undesirable:

Negatively Skewed Distribution - characterized by many small gains and a few extreme losses and has a long tail on its left side. Relative to the mean return, negative skewness amounts to a limited, though frequent, upside compared with a somewhat unlimited, but less frequent downside.

But perhaps with only ~2 years of data with a limited number of trades, it would be hard to capture the skew with any confidence, anyway, since one needs to go way out in the tails.
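To make the negative-skew description concrete, here is a minimal NumPy sketch. The return stream is synthetic (99 small gains and one large loss), and `sample_skewness` is an illustrative helper using the plain moment-based definition rather than any bias-corrected estimator.

```python
import numpy as np


def sample_skewness(returns):
    """Third central moment divided by the cubed standard deviation."""
    returns = np.asarray(returns, dtype=float)
    centered = returns - returns.mean()
    return np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5


# Synthetic example: many small gains and a single extreme loss.
returns = np.array([0.001] * 99 + [-0.05])

skew = sample_skewness(returns)
print(returns.mean() > 0, skew)
```

The mean return is positive, yet the skew is strongly negative because one day in the left tail dominates the third moment. With only a handful of such tail days in a 2-year sample, the estimate would swing heavily on whether they happen to be included.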

@Grant, thanks for the link. I'll give it a read.
This is why I choose to backtest over 10-15 years: to try to capture different regime shifts and design a model that will adapt to these changes, as financial time series are nonstationary. Relying on a 2-year backtest only is a dangerous proposition, in my view, and I have pointed out my reasons in an earlier post. A model that survives the test of time with a consistent reward/risk ratio that beats the market is what we should be shooting for.

@ James -

My understanding is the Q fund team likes to evaluate back to 2010 (which avoids the Great Recession which, if I understand, is exactly the kind of thing hedge funds are good for...). This is consistent with their various "alternative data sets" (excluding fundamentals), which don't go back very far. I guess the idea, too, is that if one goes too far back, the market conditions were not relevant. I'd be curious why the new contest is limited only to 2 years. Why not go back to 2010 to have better consistency with the way candidate algos are evaluated for the fund? Maybe they just don't want to make the bar too high for the contest? But then, it may tend to give users false hope, and also encourage short-term, over-fitting type behaviors.

@ Jamie - Why 2 years and not back to 2010?

@Grant, I found that a 10-15 year backtest gives a fairly good representation of how I view the dynamics of market price evolution. A pattern that persistently occurs historically is: bull market -> consolidation -> bear market. I see these as fractal patterns, as evidenced by self-similarity measures that say they are invariant, meaning you can see the same patterns occur at different time scales, be it one minute or monthly. The duration and magnitude of each regime are what change, and they are a challenge to predict. Since I only use variations of prices as my inputs, I can go as far back as I want. I believe that prices reflect the equilibrium of demand versus supply among market participants who have made their buy/sell decisions based on all the information available to them at that time. In the end, the accuracy, adaptability and consistency of my predictions are the driving factors that I look for in achieving alpha, which in my definition is excess returns relative to the S&P 500 (market returns) and/or the risk-free rate, whichever is higher.

@James, @Grant, there is a lot of good content in each of your posts above. Although you are responding to each other's comments, I think there are also 2 very good and separate themes here.

Time period for (back)testing as part of the ideas-to-algo development process:
Whatever Q might choose to use for their evaluation (for whatever their reasons) is not necessarily the same as what we would be wisest to use in our own algo design process. As far as I'm concerned, the longer the better because, exactly as James says, we really do need to see as wide a range of different market regimes as possible, and preferably more than just one repetition of each of the bull-bear or bull-consolidation-bear cycles. At least that's what we need if we want to design robust algos that can work in the long-term, rather than just in the immediate future for only as long as the current market regime prevails.

Negatively Skewed Distributions: I don't like these. The implications and likely eventual consequences are exactly the reason why I don't do option writing as a "profit" strategy in my own trading. It's also why I am considerably more cautious about applying counter-trend mean reversion strategies as compared to either trend following or with-trend mean reversion. I just don't like the idea of "picking up nickels & dimes from in front of a steam-roller" (the classic metaphor for negative skew strategies), no matter how easy it may seem to be. Eventually, sooner or later, just one little slip and .........

Hi Jamie, in the attached backtest I added the number of stocks held by the algo at any time. This number is about 50 long and 50 short at any time while there are about 2000 stocks in the universe. Is this the outcome of algo.order_optimal_portfolio? Thanks

[Attached backtest. Backtest ID: 5a2d20e27568c94060c65a5b]

@Ioannis: That's correct, the outcome is from order_optimal_portfolio. When MaximizeAlpha is used as the objective function in order_optimal_portfolio, Optimize will attempt to order the portfolio with the greatest alpha (as defined in the values passed to MaximizeAlpha). If we didn't pass any constraints to order_optimal_portfolio, Optimize would just place orders for the stock with the highest alpha value passed to MaximizeAlpha. When we pass constraints to order_optimal_portfolio, Optimize has to find the best solution subject to those constraints. In this case, since the maximum short and long position sizes were capped at 1% of the portfolio, and max gross exposure was constrained to 1x, it makes sense that Optimize ordered about 50 long and 50 short names.
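The "about 50 long and 50 short" outcome follows from simple arithmetic: filling 1x gross exposure with positions capped at 1% takes 100 names. The sketch below is not how Optimize works internally (it solves a constrained optimization problem); it is a simplified stand-in that assumes equal long and short exposure and random alpha values, and the function name `capped_long_short_weights` is made up.

```python
import numpy as np


def capped_long_short_weights(alphas, max_position=0.01, max_gross=1.0):
    """Go long the highest alphas and short the lowest, each at the cap."""
    alphas = np.asarray(alphas, dtype=float)
    # Half of the gross exposure on each side, filled at the position cap.
    n_per_side = int(round(max_gross / (2 * max_position)))
    order = np.argsort(alphas)
    weights = np.zeros_like(alphas)
    weights[order[-n_per_side:]] = max_position   # highest alphas: long
    weights[order[:n_per_side]] = -max_position   # lowest alphas: short
    return weights


# Hypothetical alpha values for a 2000-stock universe.
rng = np.random.default_rng(0)
alphas = rng.normal(size=2000)

weights = capped_long_short_weights(alphas)
n_long = int((weights > 0).sum())
n_short = int((weights < 0).sum())
gross_exposure = float(np.abs(weights).sum())
net_exposure = float(weights.sum())

print(n_long, n_short, gross_exposure, net_exposure)
```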

Does this help?

It's clear now, Jamie. Thanks

I've updated the original post on this thread to reflect some changes to the criteria based on feedback from participants on this thread. I noted the changes at the bottom of the original post. I also added some links to a new risk API that should help control your exposure to factors in the Quantopian Risk Model.

Here is a new example algorithm that meets all of the contest constraints. It also uses a 5 bps fixed slippage model which we expect to use in the new contest. Note that the slippage model included in this example enforces a 10% volume share limit, which is contrary to comments that I made earlier in the thread. We are now expecting to use a 10% volume share limit so you should use this model to prepare for the new contest. We have a built-in version of this model along with a forum post + explanation coming soon.

You might also notice a new SimpleBeta factor which replaces the RollingLinearRegressionOfReturns factor in my earlier example. This is a new factor that simply gets the beta of the daily returns of stocks, regressed against the target. SimpleBeta is more than 1000x faster than the old term, so I swapped it in. We also have a post coming with a bit more of an explanation on SimpleBeta.

[Attached backtest. Backtest ID: 5a31777ccef40c43e4313255]

@Jamie, these are all nice changes. However, you seem indifferent to the discussion above on the importance of factoring the risk-free rate into the calculation of the Sharpe and Sortino ratios, or any other metric that measures returns. Although the Sharpe and Sortino ratios are not part of the new scoring system, Quantopian's calculation of them is, quite frankly, incorrect by the standards of the investment industry and of financial/investment textbooks and publications. Since you still include them in your Risk Metrics section, this can pose a problem for Quantopian in the future.

On the new scoring function (returns/max(volatility, 2%)), is there any basis for the 2% floor on volatility, or is it just an arbitrary threshold? Without adjusting returns for the risk-free rate, this could still be a problem. Here's a concrete example: returns = 2%, volatility = 1%, risk-free rate = 3%. Under the new scoring function, the score = 2/max(1, 2) = 1. While this curtails what I call scalping/gaming algos, it does not answer a basic investment question: are the returns of the equity portfolio better than those of the risk-free Treasury Bill? In the above example, they are not. I hope you see the point I'm harping on.
A better scoring function, in my opinion:

score = (returns - risk free rate) / (maximum(volatility of portfolio, market volatility))
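The two scoring rules can be compared side by side on the example from this post (returns 2%, volatility 1%, risk-free rate 3%). This is only a sketch of the formulas as written here, applied to aggregate figures; the function names are illustrative, and the 10% market volatility is a hypothetical value, since the post does not give one.

```python
def current_score(returns, volatility, volatility_floor=0.02):
    """Returns divided by volatility, with volatility floored at 2%."""
    return returns / max(volatility, volatility_floor)


def proposed_score(returns, volatility, risk_free_rate, market_volatility):
    """Excess returns divided by the larger of portfolio and market volatility."""
    return (returns - risk_free_rate) / max(volatility, market_volatility)


returns = 0.02
volatility = 0.01
risk_free_rate = 0.03
market_volatility = 0.10  # hypothetical, not from the post

print(current_score(returns, volatility))
print(proposed_score(returns, volatility, risk_free_rate, market_volatility))
```

Under the current rule the example scores 1; under the proposed rule it scores below zero because the returns fall short of the risk-free rate.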

@James, I fully understand and agree with you.

@ Jamie, especially as Q now has all the necessary data (e.g. 10-year Bond futures, etc), is there any good reason why Q is NOT taking risk-free rate into consideration in any way?

@ Jamie -

It is tedious and seemingly inefficient, but if you solicit feedback from the crowd with a thread like this, you really need to respond to all comments/questions in some fashion, even if they seem irrelevant from your perspective. Perhaps you could do a scrub and reply to all points brought up here by users? For example, a longer backtest was discussed (e.g. 10 years instead of 2 years), but there has been complete radio silence from Quantopian. It is a very reasonable point, I think. Are you still chewing on it? Or have you locked into a 2-year backtest? If the latter, just say so and explain why.

Hi Jamie,
Regarding "One thing we’ve talked about is measuring the 95th percentile of daily leverage to be <= 1.1x, and then requiring the 100th percentile to be <= 1.2x": are the 95th/100th percentile changes still under consideration for the contest?
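A check of the kind quoted above could look roughly like the following. This is a sketch of the idea only, not the contest's actual implementation; the function name `leverage_within_limits` and the sample leverage series are made up, and NumPy's default linear interpolation is assumed for the percentile.

```python
import numpy as np


def leverage_within_limits(daily_leverage, p95_limit=1.1, max_limit=1.2):
    """True if the 95th percentile and the maximum are both within limits."""
    daily_leverage = np.asarray(daily_leverage, dtype=float)
    p95 = np.percentile(daily_leverage, 95)
    peak = daily_leverage.max()  # the 100th percentile
    return bool(p95 <= p95_limit and peak <= max_limit)


# Hypothetical series: steady 1.0x leverage over 100 days.
steady = np.full(100, 1.0)

# One brief spike to 1.15x is tolerated by the outlier guard.
one_spike = steady.copy()
one_spike[10] = 1.15

# A spike to 1.3x breaches the hard 1.2x cap.
big_spike = steady.copy()
big_spike[10] = 1.3

print(leverage_within_limits(one_spike), leverage_within_limits(big_spike))
```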

I appreciate that people here have strong opinions on the risk-free rate in the Sharpe Ratio, and many discussions were had. I do want to highlight that that discussion is unrelated to the new contest which is not using Sharpe Ratio but just plain risk-adjusted returns. Subtracting any constant from all algorithms' returns will not change the ranking which determines the prize. As such, I would prefer if we didn't derail this thread with this discussion and move it elsewhere, for example here was a prior discussion: https://www.quantopian.com/posts/feedback-requested-improvements-to-quantopians-risk-and-performance-calculations, or let's open a new thread.

Because only OOS performance is used for scoring, I don't understand why a longer backtest would have an influence on the score. The backtest is only used to check if structural criteria are matched (and to compute volatility at first).

@Leo: Yes, we plan to have outlier guarding for each constraint (other than ordering with order_optimal_portfolio, and the positive returns constraint). If you look at the notebook in the original post (updated), I added outlier guarding to the beta-to-SPY constraint as an example. Sorry for not mentioning that.

I get this Exception in my backtest in August 2008:

ValueError: NaN or Inf values provided to FactorExposure for argument 'loadings'.
Rows/Columns with NaNs:
row=Equity(32430 [SHA]) col='industrials'
row=Equity(32430 [SHA]) col='momentum'
row=Equity(32430 [SHA]) col='short_term_reversal'
row=Equity(32430 [SHA]) col='size'
row=Equity(32430 [SHA]) col='value'
... (1 more)
Rows/Columns with Infs:
None


Does anyone else get this? The backtest worked until this point. It's using Q500US.

UPDATE: This seemed to fix it:
opt.experimental.RiskModelExposure(context.risk_loading_pipeline.dropna(axis=0), version=opt.Newest)
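For anyone unsure what `dropna(axis=0)` does to the loadings, here is a small pandas sketch with a made-up loadings table. The tickers `AAA` and `BBB` and the loading values are hypothetical; only the `SHA` row mirrors the error above by having NaN in every column.

```python
import numpy as np
import pandas as pd

# Hypothetical risk loadings: one row per stock, one column per factor.
loadings = pd.DataFrame(
    {
        "momentum": [0.3, np.nan, -0.1],
        "size": [1.2, np.nan, 0.4],
        "value": [-0.5, np.nan, 0.2],
    },
    index=["AAA", "SHA", "BBB"],
)

# axis=0 drops every row (stock) that contains at least one NaN.
clean = loadings.dropna(axis=0)

print(clean)
```

Note that the dropped stock is simply absent from the cleaned loadings; how the constraint treats an asset with no loadings is not shown here.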

@Thomas, I feel the need to respond to your post above, as I believe I am one of the persons you are alluding to. You are right that under the proposed new scoring system, the right or wrong way to calculate Sharpe is irrelevant. My discussion here of the risk-free rate and Sharpe refers to the existing contest scoring and my discovery of its flaws, with the intent of providing suggestions to improve the proposed new scoring system. Q just disqualified a contestant who was ranked no. 1 today, and this has exposed the flaws of the scoring system. Had you included the risk-free rate in your calculation of Sharpe, gaming/scalping algos would not have been able to slip through the cracks. No one at Q seems to grasp the basic investment principle that the risk-free rate is the minimum threshold. The new scoring system includes a constant, something you don't seem to like, by flooring volatility at 2%. Jamie hasn't answered my question (which I have already asked twice): what is the basis of this floor constant, if any?

@Thomas,

What is your opinion about Tony Morland's request to use "downside volatility" as the volatility component of risk-adjusted return?

@Guy: While it's not my place to say, I think I would find an additional investor who would like to dabble in algos with few restrictions.

@Jamie Since the market sees our orders and can react in a way backtests cannot, that difference is something we can strategize to profit from.
Request: Please publish the differences between real world fund algorithms that have been active and backtests over the same time period.
I'm not asking for their absolute metrics, it would be dumb to ask because that is proprietary. I think I'm leaning toward intelligent and respectful by just asking for the differences. For example: Algo A: Backtest returns are 2.4% higher than real world performance was, and what about alpha, beta, sharpe & drawdown in backtest vs real? Algo B: 34.8% higher returns in backtest, the worst difference, would tell us something. You don't need to say how many of them you wind up comparing, just highs, lows and average differences.
Think real world algo at 10M with 50k orders across 200 stocks. Sometimes 50k (think long for example) might be a lot for certain stocks, no? Momentum day traders out there might see that along with the effect of a price rise from it (yes?) and say, I want to ride that wave and they buy too. The other crowd might take profit. In something like that, seems the backtest-vs-real magnitude can give us a clue on how much we might benefit in giving that thought. There's a lot of brain power in the world. Someone might find a brilliant new method, triggered by the new info. We might look at price movement and volume 30 minutes after order_optimal_portfolio and make decisions based on some extremes or who knows?

Hi Jamie, Thomas, Q team:

My understanding is that the primary rationale for the new contest, as Jamie states above, is captured in the statement:

The new contest will give you better and faster feedback on your algorithms. The rules will be better aligned with how we evaluate algorithms for allocations. The Quantopian Risk Model is a key part of our allocation process, and it will soon help drive the contest.

As a side note, it would seem that this should read "The Quantopian Risk Model will be a key part..." since it is still in the process of being released and implemented...or perhaps you are saying that you've been using it awhile for evaluation purposes, and are now rolling it out into the API? Anyway, back to the main point.

If your allocation evaluation process includes backtesting over a longer period (e.g. back to 2010), then if one of the goals of the contest is to provide better feedback vis-a-vis your allocation evaluation process, then my Vulcan logic concludes that it would make sense to start contest algo backtests back further than 2 years. Presumably, as well, more than just a 6-month out-of-sample data set is used to determine an allocation; an assessment of the backtest is done.
Awarding prizes solely on the basis of a 6-month out-of-sample simulation would seem to be in poor alignment with the allocation process.

Again, applying Vulcan logic (I'm only part Vulcan, so it could be flawed), presumably the allocation process includes some lower limit on returns, everything else being copacetic. So, contest rules and prizes aside, that number should be included in the feedback to users. The risk-free rate would seem to be a logical starting point (unless, of course, allocations are awarded to algos making less than the risk-free rate, but this would seem to be illogical).

In the context of the allocation process, how would the proposed 2% floor in volatility for the contest be incorporated? If it is not part of the allocation process, then it would seem illogical to include it in the contest entry evaluation criteria. It sounds like an ad hoc fudge-factor for a problem that might be better fixed in a different way. Generally, chucking fudge-factors into a model is a bad practice; it should be a red flag that something is amiss.

On a separate note, how will Q gauge and report the success in aligning the contest with the allocation process, since this is a key goal of this effort? It would be interesting to publish the percentage of the contest entries that made it on to the next stage of being seriously evaluated for an allocation (e.g. get designated as a candidate algo for the fund). How will we know if the contest is meeting its goal of doing a good job of generating decent algos for the fund?

@James Villa: The problem you point to is a real one, but I don't think adding a risk-free rate is the right fix (although it would help in this specific scenario). Instead, we should have implemented a lower bound for leverage and volatility, as this contest does. So I'm not concerned about the failure mode you highlight because it's already fixed more properly here.
What's left then is a (perfectly valid) general discussion about Quantopian's usage of the term "Sharpe Ratio" with a risk-free rate of 0, but which is unrelated to this contest.

@Vladimir: Same answer, cheating by using a very low leverage will not be possible in this new contest. I think that's the right way to fix it.

@Vladimir: I think downside volatility has its uses, especially if you want to measure anything related to utility to an investor. Here we really just want to put everyone on equal footing in terms of a risk-budget so the fact that investors care more about downside volatility isn't as prominent and would just make the volatility calculation more noisy.

@Grant: There is no hard cut-off, and if it were, it would be defined assuming a risk-free rate of 0, so the effect would cancel out.

@Guy: Why couldn't a strategy that does well in the contest not be levered?

There is no hard cut-off, and if it were, it would be defined assuming a risk-free rate of 0, so the effect would cancel out.

Not sure I follow. I guess I'm concerned that I and others could spend a lot of time working on algos, only to find out that the return, while positive, is insufficient for an allocation. Presumably you'd at least need to cover the cost of analyzing the algo and the paperwork to license it? And whatever operational costs come into play to deploy it? Sorry, just not following your rationale for a zero return cut-off. There must be some estimate of a minimum as guidance, no?

@Grant: Let's put it like this: an algo that has the right structural requirements (like those in the new contest) but an OOS Sharpe Ratio of significantly less than 0.5 (without risk-free rate, you can factor that in and calculate the adjusted SR) will have low chances of getting an allocation.

Sounds like the cut-off is ~ 0% returns referenced to the risk-free rate, or ~ 1% returns absolute (since a SR ~ 0.5, with a 2% floor in volatility could be achieved with 1% returns).
Seems pretty meager for a hedge fund, but what do I know...

The SR > ~ 0.5 sounds realistic, per what I learned in Rob Carver's book (not sure if he incorporates the risk-free rate and a 2% floor in volatility). As I recall, he claims that a long-term SR of > ~ 1.0 is probably dope-smoking, and should be a flag that something is wrong.

I guess I'm confused why SR would be irrelevant to the contest? You are effectively computing a SR by computing volatility-normalized returns. So shouldn't that computation be done in a standard fashion (e.g. https://en.wikipedia.org/wiki/Sharpe_ratio where a reference benchmark is described)?

My personal theory is that a SR consistently above 1 in a backtest just means your backtest hasn't had a regime change or black swan event that goes against your strategy.

@Thomas:

The problem you point to is a real one, but I don't think adding a risk-free rate is the right fix (although it would help in this specific scenario). Instead, we should have implemented a lower bound for leverage and volatility, as this contest does. So I'm not concerned about the failure mode you highlight because it's already fixed more properly here

Can you please prove to me your statement that ..."it's already fixed more properly here"? Because here's my proof that your volatility fix is not sufficient, and I quote myself from another thread:

So basically it is a volatility adjusted return measure with volatility floored at 2%. This implies that you are establishing a minimum of 2% assuming the formula's benchmark is 1. This satisfies my proposed correction no. (2) above, defining a minimum return. However, without factoring in my proposed correction no. (1), inclusion of risk free rate adjustment to returns, you can run into a problem in the future.
If we define the risk free rate as the 3 month US Treasury Bill, currently at around 1.05%, today it might seem negligible, but historically, over the last say 30 years, it has ranged from a high of around 14% to a low of 0.2%, averaging about 5-6%. This means that even if you established your minimum required returns via flooring the volatility measure at 2%, in times when the risk free rate is above 2%, your benchmark falls apart.

@Grant: The key is that if you combine a couple of 0.5 SR strategies, you can get something that has a much higher SR in the aggregate (if the strategies are uncorrelated). That's the key idea behind Quantopian, find as many uncorrelated 0.5 SR strategies as we can and combine them. Of course no algo will be excluded for having a SR higher than 0.5 ;).

The academic definition of the SR is irrelevant for the contest and for the success of Quantopian. We just want to reward people who write strategies that perform well OOS and meet our structural requirements. The risk-free rate does not affect this in any way. Trust me, in the industry no one is concerned about the risk-free rate. Please, let's move on to more relevant topics.

@ Burrito Dan - It's been awhile, but I recall Carver talking a lot about negative skew. Not sure if the Q risk model captures this. I guess certain strategies are known to be negative skew, and in the end, the Q fund team, once the economic rationale is explained, can try to make an assessment of the regime change/black swan risk.

The SR discussion would seem to be very germane, since with short backtests and not referencing the risk-free-rate, greed-driven delusions can set in. Another thing Carver talks about is the fact that the error bars on SR are very large, even for longer backtests. I guess the Q risk model will provide some sobriety, but assuming that the assessment (both for the contest and the fund, since the intent is to align them) is only 6 months, I'm not sure it really makes sense.
The key is that if you combine a couple of 0.5 SR strategies, you can get something that has a much higher SR in the aggregate (if the strategies are uncorrelated).

So how would I and other users know that our algos are uncorrelated with what you already have in the fund? Will you be adding additional risk factors (e.g. a risk factor called already_in_fund)?

EDIT: No matter how many uncorrelated SR = 0.5, 1% return algos are combined, the SR goes up, but the return is still only 1%. I guess one can then hit the 6X leverage, and get 6% return? Sorry, very confusing. 1% still seems pretty low.

Trust me, in the industry no one is concerned about the risk-free rate.

That's really surprising. I guess it is because whatever target return they have is much, much greater than the risk-free rate, and so it is in the noise?

@Thomas

That's the key idea behind Quantopian, find as many uncorrelated 0.5 SR strategies as we can and combine them

One of the well known stylized facts of financial instruments is that all correlations increase during times of increased volatility. How then do you plan on finding a suitable number of uncorrelated strategies given that the upcoming strategy constraints are squeezing our strategies through an increasingly narrow funnel, essentially creating a bunch of structurally similar strategies that may appear uncorrelated when all is well and liquidity is plentiful but show increasing correlations during market disruptions or panics as that liquidity dries up? (Think quant-quake of 2007 where unrelated liquidity risk from bets on sub-prime mortgage instruments created forced selling in many long/short market neutral strategies and a race to the exits with significant and sharp short-term drawdowns).

Aside from trusting Thomas, I also think the risk free rate is irrelevant. Since long short equity is self financing, the strategy’s capital can be put into short term treasuries, and this return stream is added to the strategy’s.
It then gets taken off again if you want to calculate a SR using the strict CAPM definition.

WOW, @Thomas, I couldn't believe you made these statements:

The key is that if you combine a couple of 0.5 SR strategies, you can get something that has a much higher SR in the aggregate (if the strategies are uncorrelated). That's the key idea behind Quantopian, find as many uncorrelated 0.5 SR strategies as we can and combine them. Of course no algo will be excluded for having a SR higher than 0.5 ;).

If the Wall Street Journal's article is accurate about Q's fund being down 3% in a bull market, this strategy is very telling, albeit a short run!

The academic definition of the SR is irrelevant for the contest and for the success of Quantopian. We just want to reward people who write strategies that perform well OOS and meet our structural requirements. The risk-free rate does not affect this in any way. Trust me, in the industry no one is concerned about the risk-free rate. Please, let's move on to more relevant topics.

I've been in the investment industry for 30 years, and we do care about risk free rates. I'd rather be sipping piña coladas in the Bahamas earning a meager 1.05% return on T-Bills, worry free, than trying to act like a genius churning out returns of 0.5% in a long/short equity fund.

@Thomas

Trust me, in the industry no one is concerned about the risk-free rate.

This is just not true. Many of us in the industry ARE concerned about the risk-free rate, particularly in light of the potential future increase of such. Our pricing models (in theory and in practice) are based on the spread (premium) between a given asset's return and that of a "risk-free" instrument.

@Burrito Any leveraged strategy is by definition NOT self-financing. The whole Quantopian model as presented to us up to this point is to find "low-risk" strategies and leverage them to obtain competitive returns.

@HarlequinSheep, you sound negative on the risk model.
I think if anything it stops the community from writing and rewriting short term mean reversion and common factor strategies. Common factors are exactly what blew up in the quant quake. Funds like AQR observed their more proprietary factors didn’t suffer as much from the deleveraging as the more common factors. This Chat with Traders podcast with Aaron Brown is good, as is his “risk management for dummies”, which has a section on the quant equity crisis.

The cash you get from the shorts finances the longs... even if you leverage

@Burrito And just who lends you those securities you plan to short at 0% interest? There are no such things as (leveraged) self-financed portfolios, just as there are no such things as perpetual motion machines.

@Burrito Dan, as they say here in Brooklyn, Furgitboutit, there's no such thing as a free lunch!

I also think the risk free rate is irrelevant. Since long short equity is self financing, the strategy’s capital can be put into short term treasuries, and this return stream is added to the strategy’s...The cash you get from the shorts finances the longs... even if you leverage

There is the cost of borrowing, the cost of paying the genius running the fund, the cost of entertaining clients in a strip bar, office expenses, etc. Why go to all this trouble, risk, and expense if you can't beat the worry-free, risk-free rate?

I would think that at some point the inefficiency in the market which is the hedge fund lifeblood wouldn’t be high enough. For example, if the risk-free rate goes to 10%, unless the market inefficiency scales accordingly, hedge funds would struggle. What am I missing?

For reference, from Rob Carver's book, Systematic Trading, pg. 32:

"Strictly speaking you should take the 'excess' return over and above a risk free interest rate, although this isn't relevant for a trader using derivatives, nor as important in the post 2009 low interest rate era as it was before."
I guess the Q model is that interest rates will stay low indefinitely, and hedge fund returns will be comparatively high, such that fussing around with computing point-in-time excess returns doesn't add anything to the picture. Leaving it out of the model, however, means that the model only applies under the present historically low-interest environment. Is this a correct conclusion?

Can we take the risk free rate discussion into a thread of its own, instead of hijacking this one, which is about the new contest rules?

@Leo M, we are not hijacking this thread by discussing the risk free rate. The discussion of the risk free rate here is by way of proposing ways to improve the new contest rules. If you carefully read these posts, you might learn something meaningful and relevant in the context of contest scoring.

Let's move the risk-free discussion over here: https://www.quantopian.com/posts/risk-free-rate-on-quantopian Even if you think it's related to the contest, please post there about why you think so.

@Thomas, OK then, I think I've said enough on that matter that if it still doesn't sink in, it's a futile exercise on my part. But please answer me this, a question I keep asking without getting a response: What is the basis of the 2% (a constant) floor on volatility in the new score function? To me it implies that Q has set the minimum benchmark of returns at 2%. If the benchmark is set to 1 in the new score function, by way of mathematical extrapolation, you get 2%. Am I right? Answer me this first, please.

@James: The 2% volatility floor is in the contest scoring function to guard against gaming. The scoring function is used to rank participants relative to each other, so there's no benchmark. We're starting with 2% because we think that this covers the most extreme cases (based on analysis of past contest submissions). When the new rules roll out, we will keep a close eye on the rankings and results and see if we need to make a change.

@Jamie, nice and safe answer.
So in other words, you've designed a scoring mechanism that doesn't have a benchmark, which includes a constant factor that guards against gaming, purely for ranking purposes, that passes your risk model constraints but without regard to relative returns of the market or other alternative asset classes including (apologies for using the bad word) the risk free rate. I will now refrain from commenting any further because doing so will involve me using or referencing the "bad word". But if you go up further in my posts, it's in there!

In six months, if Q publishes contestant performance metrics, as they have in the past, we'll have a frame of reference regarding absolute returns (assuming that top-ranked algos are legitimate candidates for allocations). As I commented on https://www.quantopian.com/posts/risk-free-rate-on-quantopian, I'd expect the official Q baseline to be 0%, for various reasons.

@ Jamie - what do you plan to publish, in terms of individual (or summary) contest performance metrics? Hopefully, this question is considered in-bounds.

@James: That sounds like the correct interpretation.

@Grant: Right now, the plan is to just show the out-of-sample score, and whether or not an algo has passed all of the criteria. Given the questions about the scoring function so far in this thread, I can see why more information would be useful. I'll think about ways that we could make more of the risk and performance metrics available.

I'm not sure how much guidance relative to the competition a simple out-of-sample score and pass/fail will provide. Basically, assuming that the contest is a good proxy for getting an allocation, then it would be good to know specifically what shortcomings need to be corrected, relative to the competition. I guess, at a high level, it'll be either returns or volatility.

@Grant: I think I misunderstood your question. When you submit to the contest, you will know whether you passed the constraints outlined in the original post.
If you fail to pass a constraint, you will receive a helpful message indicating why it failed, and providing guidance on how to correct the behavior (similar to what the attached notebook attempts to do on failed constraints). This should provide the feedback necessary to correct the shortcomings.

The criteria are independent of other algorithms in the competition. You need to meet all of the criteria to participate in the contest. The score-based rank is relative to the competition. The higher your score, the more likely you are to win.

@ Jamie - Feedback & scoring sound good. As I understand, the scoring will simply be based on returns over volatility, so I guess you could report those separately, as summary statistics across all participants. Then one could see placement within the overall returns histogram and the volatility histogram, in addition to a histogram for the returns-to-volatility ratio used for scoring.

@Karl: That's correct. The plan is to allow entries in the current running contests to continue to run under the old rules until their 6 month contest is complete, and you will be allowed 3 new entries under the new rules.

@Guy: I apologize, I sent a message to the email associated with your account, I'm not sure why it didn't go through. Next time, I'll wait for you to respond first. I certainly didn't want you to feel like I don't want you to participate in the conversation.

One of the limitations on our forums is that conversations are single-threaded (i.e. you can't respond directly to a comment). As a result, we try to keep the discussion of a thread on one topic, and encourage new topics to be opened up as new posts in the community with links back to the original thread. Your post was indeed about the example that I posted at the top of this thread. However, the point of my posting the example was just so that someone could use the notebook. I'll make that more clear next time.
An analysis of the strategy itself would best be done in its own post -- would you mind starting that thread and kicking off the discussion? Next time, I'll be more patient, sorry!

I wanted to draw attention to the likely consequences of changing slippage to 5bp in future contests, combined with the mandatory use of the Optimize API. If I understand correctly, current contests use the default VolumeShareSlippage model with volume limit at 0.025 and price impact at 0.1. The slippage is therefore currently capped at 0.1 x (0.025)^2 = 0.625 bp. In future contests, the default slippage will be 5bp, which corresponds to a minimum eight-fold increase from current levels. For a $20 stock, the slippage would represent 1 cent, and would therefore dwarf commissions (0.1 cent).
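The arithmetic in this post can be checked with a short sketch. It only reproduces the numbers quoted above (volume limit 0.025, price impact 0.1, a flat 5 bp charge, a $20 share price); the function names are illustrative and this is not the platform's slippage code.

```python
# Worst-case impact of the volume-share model as described in the post,
# compared with a flat 5 bp charge. Helper names are hypothetical.

VOLUME_LIMIT = 0.025       # max fraction of a bar's volume an order may take
PRICE_IMPACT = 0.1         # impact coefficient quoted in the post
FIXED_SLIPPAGE_BPS = 5.0   # proposed flat slippage


def volume_share_cap_bps(volume_limit: float, price_impact: float) -> float:
    """Largest possible impact in basis points: price_impact * share**2."""
    return price_impact * volume_limit ** 2 * 1e4


def slippage_cents(price_dollars: float, bps: float) -> float:
    """Per-share slippage in cents for a given price and a rate in bp."""
    return price_dollars * bps / 1e4 * 100


cap = volume_share_cap_bps(VOLUME_LIMIT, PRICE_IMPACT)
print(f"volume-share cap:   {cap:.3f} bp")
print(f"flat 5 bp vs. cap:  {FIXED_SLIPPAGE_BPS / cap:.1f}x")
print(f"$20 stock at 5 bp:  {slippage_cents(20.0, FIXED_SLIPPAGE_BPS):.2f} cents per share")
```

Note that the cap is only reached when an order takes the full 2.5% of bar volume, so for smaller orders the gap between the two models is wider than the factor printed here.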

I believe that this increase in slippage will make all high turnover strategies non-viable (in real life, most aren’t viable anyway, at least when using market orders). From that perspective, I notice that the example algorithm provided by @Jamie trades on a weekly basis. If we change the trading frequency to daily, and use only the most volatile alpha (the one from PsychSignal), we obtain a Sharpe ratio lower than minus 5!

More importantly, I believe that, due to this sharp increase in slippage, the question of trade execution becomes far more important when designing an algo. It seems to me that the now mandatory optimize API has a “Markovian” view of markets. It does not care about the current portfolio positions, and simply readjusts positions without considering the trading costs involved in doing so. Consequently, I believe that with the new 5bp slippage, readjusting one’s portfolio on a daily basis using the Optimize API will clearly be suboptimal.

In a part of the quant world I worked in for a few years (CTAs), models are generally deceptively simple. The PhDs those funds hire (especially multi-billion ones) work mostly on minimizing trading costs. Since the Optimize API will become mandatory, it will be difficult for us to find ways to minimize the trading costs incurred by the new 5bp slippage.

To sum up, I believe it will be difficult from now on to produce algorithms that perform nicely in the long-term*, whatever the metric (SR, return etc.)

* That said, we should always keep in mind that on a gross basis and a 6 month horizon, 24% of a population of dart-throwing monkeys can be expected to have a Sharpe ratio higher than 1!
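The 24% figure can be reproduced with a back-of-the-envelope calculation. This sketch assumes i.i.d. returns with a true Sharpe ratio of zero and uses the normal approximation in which the standard error of an annualized Sharpe estimate over T years is about 1/sqrt(T); those assumptions are mine, not stated in the post.

```python
import math


def prob_sharpe_exceeds(threshold: float, years: float, true_sharpe: float = 0.0) -> float:
    """P(estimated annualized Sharpe > threshold), normal approximation.

    Assumes i.i.d. returns, so the standard error of the estimate is
    roughly 1 / sqrt(years). The small true_sharpe**2 / 2 correction to
    the standard error is ignored.
    """
    std_err = 1.0 / math.sqrt(years)
    z = (threshold - true_sharpe) / std_err
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p = prob_sharpe_exceeds(threshold=1.0, years=0.5)
print(f"share of zero-skill strategies with 6-month Sharpe > 1: {p:.1%}")
```

Under these assumptions the result lands at roughly 24%, matching the footnote. Lengthening `years` shrinks the standard error and with it the share of lucky zero-skill strategies.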

@ Jamie -

One of the limitations on our forums is that conversations are single-threaded (i.e. you can't respond directly to a comment). As a result, we try to keep the discussion of a thread on one topic, and encourage new topics to be opened up as new posts in the community with links back to the original thread.

It is not your forum that is the fundamental limitation, as there are numerous ways to manage things differently without modifying your forum (I prefer your simple flat design). One simple approach would be to set up a spreadsheet and capture every bit of feedback in it, along with what are considered closed and open issues. Alternatively, you could use your Github (https://github.com/quantopian/) in a similar fashion. The latter would be the preferred approach, since you are basically engineering a software product with the new contest.

The other piece of advice is that if Q is going to go down the path of requesting that separate threads be opened, you need to list all of them at the top of the original thread (with brief descriptions). This would serve several purposes:

1. Easy, for trace-ability, to find all of the input to this development effort.
2. Clear to contributors that they are not being suppressed, but rather you are making an effort to organize the feedback, to keep it tractable.
3. Easy way for someone coming into the conversation to see upfront what issues have been classified as out-of-bounds, off-topic, low priority, etc.

Also, it might help for you to list what you consider areas that, in your opinion, still need scrutiny by and feedback from the crowd. What's left to get this bird off the ground?

I'd also recommend periodically updating your notebook and algo/backtest above, capturing the latest API improvements and contest rules as a working example of a viable contest entry.

Here's an example of an algo that is completely wiped out by the proposed slippage model.

Backtest ID: 5a34f089b55c9944d662e6c1

@ Karl -

Where did you find SimpleBeta()? How is it used?

@ Jamie -

I'm trying to reconcile the proposed slippage model with reality. I see that SPY has an expense ratio of 0.09% (see http://financials.morningstar.com/etfund/operations.html?t=SPY). Yet I think you are saying that the slippage would be 0.05%, implying that going in and out of SPY just once in a given year would double the expenses, excluding commissions (say I bought SPY at the start of the year, and sold SPY at the end). Alternatively, how could SPY keep its expense ratio at 0.09%, if it has re-balancing slippage anywhere close to 0.05%? Or is this an incorrect interpretation?

Intuitively, back-of-the-envelope, the 0.05% slippage seems too high, no?

Hi Jamie, for the amount of turnover you are seeking in the algorithms for the contest, modeling transaction costs accurately would be important, no? A fixed 0.05% slippage doesn't seem to account for the volume of a stock, liquidity in the market, etc., which are likely determinants of slippage (I am no expert, just guessing). Having used the prime broker for 6 months, are we not in a position to model slippage accurately yet? It appears that would be a critical component of the transaction costs, something that needs to be accounted for as closely as possible to the real trading costs you are seeing in the Q fund. Slippage should affect the performance of algorithms quite a bit, no? Is it not true that slippage is not a constant and varies quite a bit between stocks? Modeling it incorrectly may lead to incorrect selection of algorithms that won't perform as well in live trading as they do in a backtest or a contest.

Reference to some prior published work:

How Accurate is Our Slippage Model: Comparing Real and Simulated Transaction Costs
By Gus Gordon
December 6, 2016

https://blog.quantopian.com/accurate-slippage-model-comparing-real-simulated-transaction-costs/

I guess this is the basis for the 5 bps slippage model proposed by Jamie?

That study shows a wide range in slippage, [0, 20] basis points. I'm not sure that using the median of 5 basis points is the right way to model slippage when the range is that wide: you end up understating the slippage of low-volume stocks ([15, 20] basis points) and overstating it for the [0, 1] basis point stocks. I'm kind of surprised there aren't industry-standard slippage models available already for each stock in the Q1500US or QTradableStocksUS.

Here are some simple calculations comparing two portfolios: A) only [0, 5] basis point stocks vs. B) only [15, 20] basis point stocks. The slippage difference is 15 basis points (17.5 - 2.5 basis points, i.e. 0.15%). If we assume the portfolio turns over 25% daily (in the mid-range of the contest requirements), then you are turning over about 5 times a month, so 5 * 0.15% = 0.75% slippage differential per month, or roughly 9% per year of unaccounted slippage from using a constant 5 basis points. This is the extreme case; portfolios will in general hold stocks with varying slippage, and the unaccounted differential will be smaller than this. But why take the chance, when the performance estimates used to evaluate algorithms could be off by that much?
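The estimate above can be laid out step by step. This is a sketch of the post's own arithmetic, assuming 25% daily turnover (which is what "5 times a month" implies) and 20 trading days per month; both the day count and the variable names are my assumptions.

```python
# Slippage differential between a low-slippage and a high-slippage portfolio,
# following the post's back-of-the-envelope numbers.

TRADING_DAYS_PER_MONTH = 20   # assumption behind "5 times a month"
MONTHS_PER_YEAR = 12


def bucket_midpoint(low_bps: float, high_bps: float) -> float:
    """Midpoint of a slippage bucket, in basis points."""
    return (low_bps + high_bps) / 2.0


low_bucket_bps = bucket_midpoint(0, 5)      # portfolio A
high_bucket_bps = bucket_midpoint(15, 20)   # portfolio B
differential_bps = high_bucket_bps - low_bucket_bps

daily_turnover = 0.25  # fraction of the portfolio traded each day
turns_per_month = daily_turnover * TRADING_DAYS_PER_MONTH

monthly_gap_pct = turns_per_month * differential_bps / 100.0  # bp -> percent
annual_gap_pct = monthly_gap_pct * MONTHS_PER_YEAR

print(f"differential:     {differential_bps:.1f} bp per unit of turnover")
print(f"turns per month:  {turns_per_month:.1f}")
print(f"monthly gap:      {monthly_gap_pct:.2f}%")
print(f"annual gap:       {annual_gap_pct:.1f}%")
```

The annual figure is a simple sum of monthly gaps with no compounding, which is how the post arrives at roughly 9%.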

Vanguard ETF average bid/ask spreads:

https://advisors.vanguard.com/VGApp/iip/site/advisor/investments/bidaskspread

Not sure if it is apples to oranges, but the numbers are roughly in line with the proposed 5 bps slippage for the QTradableStocksUS.

@Leo: The default of SimpleBeta is window_length=2 (daily returns). There's no mask by default. The mask was usually included on RollingLinearRegressionOfReturns to speed up the compute time since it was much slower. There will be a post coming out on this shortly.

Regarding the slippage model, we expect a post to come out shortly that should answer some of your questions.

Scott just posted about SimpleBeta. You can't tweak the window_length by default, but you can build your own factor that does so. The notebook he shared has more info.

And here is the post on the new slippage model.

Here's an example of a new contest-conforming algo (I think) that is basically a monkey-on-a-typewriter example. Note the nice run at the start of the backtest, which of course could happen to be during the 6-month out-of-sample contest period.

Are we sure judging solely on 6-months will be much better than random?

Will post tear sheet analysis next.

Backtest ID: 5a38f3a8d4c27f45a8f965bd

Tear sheet analysis for example algo immediately above.


Grant, unfortunately that algorithm doesn't quite meet the position concentration and beta limits. I'd recommend running it through the notebook at the top of this post to see if it meets the constraints. The constraints that were not met were the position concentration and the beta-to-SPY. It looks like the max position concentration being passed to Optimize was ~7%. I modified it to 2.5% and it meets the criteria now.

Side note: I removed the try/except block around order_optimal_portfolio. In general, it's dangerous to have an except statement without catching a specific type of error. If there's an exception that you want to handle, you can do something like except SpecificErrorType:.
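A minimal sketch of the pattern described in the side note, catching one specific error type instead of using a bare except. Everything here is hypothetical: `InfeasibleError` and `rebalance` are stand-ins written for this example and are not the names of any Quantopian exception or function.

```python
class InfeasibleError(Exception):
    """Hypothetical stand-in for an 'optimizer could not satisfy constraints' error."""


def rebalance(target_weights: dict, max_position: float) -> dict:
    """Hypothetical stand-in for an ordering call that can fail on constraints."""
    for asset, weight in target_weights.items():
        if abs(weight) > max_position:
            raise InfeasibleError(
                f"{asset} weight {weight:.3f} exceeds limit {max_position:.3f}"
            )
    return dict(target_weights)


def safe_rebalance(target_weights: dict, max_position: float, current_weights: dict) -> dict:
    """Keep current positions if, and only if, the constraints cannot be met."""
    try:
        return rebalance(target_weights, max_position)
    except InfeasibleError as exc:
        # Only the anticipated failure is handled here. Any other exception
        # (a typo, bad input, a real bug) still propagates and stays visible.
        print(f"skipping rebalance: {exc}")
        return current_weights


held = {"AAA": 0.02}
print(safe_rebalance({"AAA": 0.07}, 0.025, held))   # constraint violated, positions kept
print(safe_rebalance({"AAA": 0.02}, 0.025, held))   # constraint met, new targets returned
```

The design point is that a bare `except:` would also swallow programming errors, so a broken algo could silently stop trading; naming the exception limits the fallback to the one case it was written for.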

[Attached notebook]

And here's the modified algo.

[Attached backtest, ID: 5a3924bc95218e44e948bca3]

Thanks Jamie -

My point/question still stands regarding the potential for incentivizing randomness, with the 6-month only out-of-sample evaluation. I'm gathering that from your lack of response on this point, it is set in stone for the new contest. It wouldn't offend me if you were to answer my question directly (whether you provide rationale or not):

Are we sure judging solely on 6-months will be much better than random?

I gather that you've decided to base the judging solely on the 6-month or longer out-of-sample period? Is this consistent with your goal of aligning the new contest with how you evaluate algos for the fund?

Regarding the try/except block around order_optimal_portfolio, that's a good point. In your examples, it would be good to have a recommended way of doing this. In the past, particularly when constraining turnover, I've had problems with it crashing. If you were to publish a recommended way of handling things when constraints can't be met, it would be helpful.

A 6 month out-of-sample evaluation does not incentivize randomness. Sure, it might only capture the short term upside of an algo at some point, but good algos will have the best chance of winning. I believe this statement holds true for any out-of-sample period, really. The longer the OOS, the more likely an algo is performing well without getting lucky. If you have a weighted coin that has a 51% chance of landing on heads, even if you only flip it once, you have a better chance of getting heads than tails. So, yes, I'm sure that 6 months is better than random.
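The weighted-coin argument can be made concrete with a small calculation. This is only a toy sketch: treating each flip as one independent trading day, and the choice of 125 and 501 flips, are assumptions for illustration, and real returns are not coin flips.

```python
from math import comb


def prob_majority_heads(n_flips, p_heads=0.51):
    """Exact probability of strictly more heads than tails in n_flips."""
    return sum(
        comb(n_flips, k) * p_heads**k * (1 - p_heads)**(n_flips - k)
        for k in range(n_flips // 2 + 1, n_flips + 1)
    )


# Odd flip counts avoid ties; 125 is roughly six months of trading days.
for n in (1, 125, 501):
    print(n, round(prob_majority_heads(n), 3))
```

With a single flip the edge is exactly the 51% quoted above, and the probability of coming out ahead grows as the number of flips grows, which is the sense in which a longer OOS period separates skill from luck.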

Hi Jamie -

My point, perhaps not well posed, is that you have a very loose performance requirement on the in-sample period (the backtest). The return just needs to be positive (and the return is absolute, not excess...see discussion on the risk-free rate). One way to look at it is, say all 160,000 of your users each writes 3 algos and submits them to the contest (~ 0.5 M algos). Also, suppose that they sample a wide variety of alpha factors (e.g. grab 101 Alphas and whatever else they can get their hands on) and combine them willy-nilly, in a monkey-on-a-typewriter fashion. Basically, we have a Monte Carlo simulation, with prizes awarded for the tail of the distribution of volatility-normalized return. So, it is not obvious that winners wouldn't be chosen at random. The question really is, for the contest as constructed, what is the probability of awarding a prize to a monkey on a typewriter (or contest gamer) over a whip-smart, hard-working aspiring quant aiming ultimately to get an allocation?

And, of course, since one of your goals is to align the contest with your allocation process, you'd have the same risk there (but I get the sense that you aren't aiming for a 1:1 correspondence...if not, then it would help to flesh this out).

By the way, if you have all of the feedback you need, then I'm good with that, too. In the end, it is your deal. But based on my understanding of what you are trying to accomplish here, there is still room for improvement, in my opinion.

@Grant: Yes, 6 months of OOS is not a lot. Ideally we'd do much longer but then the turn-around would be quite slow. Who would want to wait 2 years until they know if they won the contest? We've talked many times about a hold-out set (historical data that no one has access to) but it becomes extremely tricky to avoid cheating and overfitting there.

For fund selection we are looking at longer times. The contest is not directly related to how we evaluate algorithms for the fund for OOS performance. But the structural requirements enforced here are definitely the type of strategies we're looking for.

@ Thomas -

I'm guessing you have a model in mind where, if the structural requirements are met, then only those algos that are legitimate will bubble up to the top and would be awarded prizes. It is not so obvious. You'll definitely incentivize the crowd to get the algo structure in line, but beyond that, it could be a casino.

Simply ignoring the backtest, beyond testing for structure and a weak test on return doesn't seem prudent. I'm sure if you did a head-scratch, something better could be devised.

Regarding the correspondence with allocations, what is the goal here? I'd think that you'd want X% of the contest winners to have a Y% probability of getting an allocation. After the contest runs for N years, you could look back and see if X & Y are what you had hoped for.

Maybe we should increase the backtest SR lower-bound to be at least 0.5.

Yes, on the probabilities, that's the goal. An algorithm that meets all structural requirements and has compelling OOS performance over 6 months puts you in the front row of getting an allocation (although we might wait a bit longer than 6 months to evaluate it for the fund).

Well, I suggest not using the term Sharpe ratio, since in its conventional form, with the risk-free rate included, it doesn't apply to hedge funds. : )

Generally, some sort of evaluation of the performance of the backtest would seem to be in order. I'm guessing y'all know how to do this really well, from your fund allocation process.

Another note is that you should be able to do the Monte Carlo virtual monkey-on-a-typewriter simulation to test my hypothesis. You have a large library of pipeline factors, and can run backtests in parallel, so getting distributions of simulated contest entries should be straightforward. You also know what legitimate algos might look like. If the tail of the monkey-on-a-typewriter simulation exceeds what you'd expect for legitimate algos, then we have a problem with the proposed new contest.
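A rough sketch of what such a simulation could look like is below. It does not use real pipeline factors or backtests: returns are drawn from a normal distribution, and every number in it (population sizes, daily volatility, the Sharpe ratio of a "legitimate" entry, the 126-day window) is an assumption chosen only to show the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

N_DAYS = 126          # assumed ~6 months of trading days
N_MONKEYS = 20_000    # zero-skill entries
N_SKILLED = 200       # entries with a genuine edge
DAILY_VOL = 0.005     # assumed daily volatility for every entry
SKILLED_SHARPE = 1.0  # assumed annualized Sharpe of a skilled entry


def simulate_scores(n_algos, annual_sharpe):
    """Volatility-normalized return of each simulated entry over N_DAYS."""
    daily_mean = annual_sharpe * DAILY_VOL / np.sqrt(252)
    returns = rng.normal(daily_mean, DAILY_VOL, size=(n_algos, N_DAYS))
    return returns.sum(axis=1) / (returns.std(axis=1) * np.sqrt(N_DAYS))


monkey = simulate_scores(N_MONKEYS, 0.0)
skilled = simulate_scores(N_SKILLED, SKILLED_SHARPE)

# Rank everyone together and see who occupies the top 10 places.
all_scores = np.concatenate([monkey, skilled])
is_skilled = np.concatenate([np.zeros(N_MONKEYS, bool), np.ones(N_SKILLED, bool)])
top10 = np.argsort(all_scores)[-10:]
n_skilled_top10 = int(is_skilled[top10].sum())
print("skilled entries in top 10:", n_skilled_top10)
```

Changing the ratio of zero-skill to skilled entries, or the assumed Sharpe ratio, shows how quickly the top of the ranking fills up with lucky entries under these toy assumptions.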

Any update on the rough timeline for the new contest?

We are looking into making some of the rule changes for contest 38, and we expect to make the rest the following month. If we decide to make any changes to contest 38, we'll announce them soon.

Is anyone else getting the following error?

NameError: global name 'LiquidityExceeded' is not defined

This is from an algo based on Jamie's modified one above. Reference Line #56 (raise LiquidityExceeded()). Any info regarding how to resolve this would be appreciated.

    def process_order(self, data, order):

        price = data.current(order.asset, "close")
        volume = data.current(order.asset, "volume")

        max_volume = self.volume_limit * volume

        remaining_volume = max_volume - self.volume_for_bar
        if remaining_volume < 1:
            # We can't fill any more transactions.
            raise LiquidityExceeded()


Thanks,

Troy

You can try the new slippage model here:

https://www.quantopian.com/posts/changes-coming-to-the-default-slippage-model

See if you get the same error.

Thanks Grant. Looks like that resolved it.

Troy

Hi,

Can someone explain why there is a lower bound on leverage? Don't you want your algorithms to have low leverage? Why force them to be leveraged?

@Ishwar: A good way to stay above the lower bound is to use the MaximizeAlpha objective and provide 'alpha' values for a large number of assets. Since the position concentration is capped at 5% and the minimum leverage is 0.8, you will need to hold at least 16 assets (0.8 / 0.05); realistically, it will be more. Your best bet is probably to provide alpha values for at least 50-100 assets, but I'd recommend playing around to see how that translates to actual holdings in a backtest.

@Rafael: You only get into money-borrowing territory when you exceed 1x leverage. The minimum for the new contest will be 0.8, which translates to "having at least 80% of your capital invested". The minimum leverage rule is there to enforce the idea that algorithms should be invested at all times.

@Tony: Instead of mixing risk and return into one value and trying to optimize that, I would rather treat them separately. Then one can create a nice Pareto front of the algos. If you only choose algos on the front, you are sure that for the risk given, you get the best return. Adding the result of the "risk-free" algo to the graph makes sure that you can basically never choose something that is worse than bonds. And the risk that an investor is willing to take might be very individual.
When it comes to the risk, I totally agree that it should be something that rates moving down differently from moving up, mainly because otherwise I would not be able to identify the "guy from the future with the best tips" as the best strategy.
I also like your MC idea, but I would tend not to do it on individual trades. I would rather think of a random subset of the stocks that are available for trading. So basically a random subset of QTradableStocksUS.
I also like your point on the individual risks attributed to stocks. What I would also like to see as an input to the algos (from Q) is the risk THEY associate with going long/short for a given stock (at the current time or any time in the past). Maybe multiplying that with the real downward moves the stock makes below the entry point (kind of "I told you it's dangerous, and I don't like it").
Comments?
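For anyone who wants to try the Pareto-front idea, here is a small sketch. The algo names and the (risk, return) figures are made up, and "risk" is left as an abstract number since the choice of risk measure is exactly what is being debated.

```python
def pareto_front(algos):
    """Return the algos not dominated by any other, sorted by risk.

    Each algo is a (name, risk, ret) tuple. Algo A dominates algo B when
    A has risk <= B's and return >= B's, with at least one strict.
    """
    front = []
    for name, risk, ret in algos:
        dominated = any(
            (r2 <= risk and ret2 >= ret) and (r2 < risk or ret2 > ret)
            for _, r2, ret2 in algos
        )
        if not dominated:
            front.append((name, risk, ret))
    return sorted(front, key=lambda a: a[1])


# Made-up figures, including a "risk-free" reference point.
algos = [
    ("risk_free", 0.00, 0.02),
    ("algo_a", 0.05, 0.06),
    ("algo_b", 0.05, 0.04),   # dominated by algo_a (same risk, lower return)
    ("algo_c", 0.10, 0.09),
    ("algo_d", 0.12, 0.01),   # dominated by risk_free (more risk, lower return)
]
front = pareto_front(algos)
print([name for name, _, _ in front])
```

Including the risk-free point means anything that returns less than bonds for more risk drops off the front automatically.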

@Rafael Quantopian can choose how much of their capital to allocate to any one algorithm. They have overall leverage limits and capital backing for the particular class of algorithm, so far exclusively long-short US equities algorithms, as captured in these contests. If they deploy an algorithm with $1M and 10x leverage, how different is that from deploying it with $10M and 1x leverage, if they rebalance (and dynamically change the allocations) each month? So the 1.0 leverage limit is just a normalization procedure. Subtleties like what @George pointed out have to be sorted out, but otherwise it makes little real difference except making their lives easier.

Jamie, any thoughts on what happens when I use PositionConcentration of 5%, but then I get a favourable movement on one of my positions, and it ends up being well over 5%? See my above example of a short on AIG, which more than doubled in a short timeframe.

@Jamie-- an edge case concern regarding the lower bound on leverage-- what about the first days of the contest for algos that rebalance weekly? That is, if my algo rebalances Thursdays, and the contest starts on a Tuesday, wouldn't I be immediately DQ'd on the first day due to being underleveraged?

The workarounds I've tried on this are actually pretty tricky given the current scheduling functions... (I would have to rebuild Q's trading calendar to define week_end, for example) and they'd impose a penalty of trading the algo on a day in which it's not meant to be traded. A scheduling function that just trades the first day of the backtest/contest would solve the error-prone workarounds, but would still impose an effective penalty.

Would you all consider waiting until the algo's first trade to impose the leverage constraint? Waiting a week to enforce would solve my case, but maybe there are folks out there with algos only trading on specific days of the month, and so on. That is, instead of "algo must always end the day above XX leverage", how about "once an algo starts trading, it must always end the day above XX leverage"?

By using the Optimize API, I ended up with worse performance than without it. I can elaborate if required. Basically, it results in a more concentrated holding and a more volatile performance.

thanks
-kamal

@Dan: Any chance you can send that algo/backtest into [email protected] and grant permission on it? I'd love to dig into what's happening there. In general, we're applying outlier guards to most of the metrics. For position concentration, you'll be allowed a couple of days up to 10%, but 12% would break even that guard, so I'm interested in what's happening here.

@Jim: There's a 5 day grace period allowed at the beginning of the backtest where we don't check the minimum leverage guard if you haven't yet traded. That's the case in contest 38 and it will be the case in the new contest. The goal of the grace period was to account for the exact case that you described: a weekly strategy that doesn't necessarily trade 2 years prior to its submission date. For anything less frequent than monthly, we'll ask contest participants to add a special case to open up trades on the first day of the simulation. You can find that rule written in bullet 3. under Rules on this page.
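As a rough illustration of how a grace period like this could be checked, here is a sketch. It is not the contest's implementation: the function name is hypothetical, and "has not yet traded" is approximated as "gross leverage has been zero on every day so far", which is an assumption.

```python
import numpy as np
import pandas as pd

MIN_LEVERAGE = 0.8   # lower bound quoted in this thread
GRACE_DAYS = 5       # grace period quoted in this thread


def leverage_violations(gross_leverage, grace_days=GRACE_DAYS):
    """Return the dates on which end-of-day leverage is below the minimum.

    Days inside the initial grace window are exempt only while the algo
    has not yet traded (approximated as leverage still equal to zero).
    """
    lev = gross_leverage.to_numpy()
    has_traded = np.maximum.accumulate(lev > 0)
    in_grace = np.arange(len(lev)) < grace_days
    exempt = in_grace & ~has_traded
    too_low = lev < MIN_LEVERAGE
    return list(gross_leverage.index[too_low & ~exempt])


dates = pd.bdate_range("2018-02-13", periods=7)  # starts on a Tuesday

# Weekly algo whose first rebalance is on the Thursday.
weekly = pd.Series([0.0, 0.0, 1.0, 1.0, 1.0, 0.98, 1.0], index=dates)
# Algo that waits longer than the grace period before trading.
idle = pd.Series([0.0] * 6 + [1.0], index=dates)

print(leverage_violations(weekly))
print(leverage_violations(idle))
```

In this sketch the weekly algo passes, while the idle one is flagged on its sixth day, the first day outside the grace window.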

@Kamal: Can you start a separate forum post and share your example? It's tough to help without seeing the code.

Hi Jamie -

I'm trying to reconcile your +/- 0.3 beta constraint with earlier Quantopian guidance that beta is unproductive in a hedge fund of the sort you are constructing, since it is cheap and customers will get it elsewhere. Why are you allowing such a broad beta about zero? Is the idea that in the Q fund, you would null it out? Or would a higher beta be o.k.?

@Grant: The thresholds are intentionally wide in the beginning to give a bit more leeway. But you're right, ideally beta is very close to 0 and 0.3 is quite high. We might tighten those bounds in a future contest, especially if users start submitting strategies that intentionally go close to that threshold.

Hi @Thomas, since you're here, I would like to redirect your attention to my thread here and would love to hear your thoughts / comments. Thanks.

@Grant, we could have our algo disqualified from the contest if beta goes over the bounds we set, for instance if we constrain beta to 0.05 and get 0.21 instead. Beta is a noisy estimate and we need some room and margin of error to work with. The contest is a very short out-of-sample period of 6 months, and that period's beta should not be used to judge what an algorithm's beta is. We need to look at 10+ years of beta for all the noise in the estimate to cancel out and get a better handle on the algorithm's beta.

@Thomas,

A zero-beta constraint, if correctly applied, should produce near-zero return.
To my mind, it does not matter at all whether an algo makes most of its money on days when SPY is up or down.
It is more important to build an algo that makes more money than SPY with low volatility.
The Holy Grail, an algo that is long only on SPY up days, has a beta of around 0.5.

Do not constrain to get there.

@Vladimir, talking about the zero-beta constraint, this thread might interest you: Beta Constraint
I'd appreciate your feedback.

From Vanguard Market-Neutral Investing Strategy Overview and Evaluation:

Hi Jamie -

Is it a correct assumption that you are not interested in ETFs in the Q fund, given that the contest does not allow them and the contest is aligned with the fund needs? I'm asking, since it might be advantageous to use a "hedging instrument" such as SPY to adjust beta, but if it is disallowed, then so be it. Is there a reason why the universe is restricted to the QTradableStocksUS? Would you consider allowing certain ETFs?

Can we keep the universe allowed in the contest aligned with the fund 1:1? If we are making exceptions (like allowing certain ETFs), can we make sure leveraged ETFs are not allowed in the new contest (if they are not allowed in the fund)? Having the same universe to work with (for all contest participants) will level the playing field and enable an apples-to-apples comparison.

The requirement for the new contest is that algorithms trade stocks in the QTradableStocksUS. The contest is designed to be a competition for cross-sectional, long-short equity strategies, which we want to see trade exclusively from the QTradableStocksUS. The reason for choosing this type of algorithm is that we believe that cross-sectional, long-short strategies are some of the best candidates to meet the allocation criteria. We're hoping that a more specific set of contest rules tailored to one type of algorithm will help direct you toward writing something that could be considered for an allocation.

There are some strategies that may trade ETFs that could be eligible to receive an allocation, but they aren't really supported in the rules of the new contest. At some point, we might consider adding new contests that are targeted toward different types of strategies, but we're just focusing on the one type for now.

Vladimir, you're right, that description is a bit outdated. We'll need to update it to distinguish between dollar neutral (half long, half short) and market neutral (low beta-to-SPY).

The contest criteria are laid out at the top of this thread. You should refer to those for now, so both dollar neutrality and beta-to-SPY are constraints.

Hey everyone, I just noticed a bug in the scoring function in the notebook attached to the original post of this thread. The bug was that the volatility was being computed over returns of the entire backtest instead of on a 63-day rolling basis. I updated the notebook and left a note that it was edited in the original post. You should re-run your backtests through the new version of the notebook to see if your score changes by a significant amount due to the correction.
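To see why the two volatility calculations can differ, here is a small sketch on synthetic returns. It is not the notebook's scoring code: the returns are random, and the 252-day annualization factor is an assumption; only the 63-day window comes from the description above.

```python
import numpy as np
import pandas as pd

WINDOW = 63               # rolling window length quoted in this thread
ANNUALIZATION = 252       # assumed trading days per year

rng = np.random.default_rng(1)
dates = pd.bdate_range("2016-01-04", periods=504)
# Synthetic returns: calm first year, turbulent second year.
daily_vol = np.where(np.arange(504) < 252, 0.002, 0.010)
returns = pd.Series(rng.normal(0.0, daily_vol), index=dates)

# Behaviour described as the bug: one number for the whole backtest.
whole_period_vol = returns.std() * np.sqrt(ANNUALIZATION)

# Corrected behaviour: volatility recomputed on a trailing 63-day window.
rolling_vol = returns.rolling(WINDOW).std() * np.sqrt(ANNUALIZATION)

print("whole-period vol: %.3f" % whole_period_vol)
print("rolling vol, end of year 1: %.3f" % rolling_vol.iloc[251])
print("rolling vol, end of year 2: %.3f" % rolling_vol.iloc[-1])
```

When volatility changes over the backtest, a single whole-period figure overstates it in calm stretches and understates it in turbulent ones, so any score that divides by volatility will shift once the rolling version is used.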

Sorry for the trouble.

Hi Jamie, could you provide us an update on contest 38. It appears it is still using the old scoring method that uses rankings on beta, volatility, stability etc. although the submission criteria has changed a bit in that it requires a back test. When will the new scoring go into effect in contest 38 or will it be later. I might have missed a post on when the new contest is starting, or is contest 38 the one?

Hi Leo,

Contest 38 will continue to use the old scoring method. Our current plan is to open up the new contest for submission in early February. We will make an announcement at that time including the official rules and the new prize structure.

Hi Thomas -

Regarding your comment above:

The thresholds are intentionally wide in the beginning to give a bit more leeway. But you're right, ideally beta is very close to 0 and 0.3 is quite high. We might tighten those bounds in a future contest, especially if users start submitting strategies that intentionally go close to that threshold.

I gather that all other things being equal, the payout to licensed algos would be something like:

(payout) = (percentage)(algo return)(1-|beta|)

In other words, you'll pay more for algos that have beta ~ 0 than for algos that have beta ~ 1, and to first-order, the relationship would be linear (since the market will pay X for alpha and Y for beta, on average).

For the contest, all that counts is the Q Sharpe ratio:

SR_Q = SR + (risk-free rate)/(volatility)
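A quick numeric check of that relation, as a sketch: the return, volatility, and risk-free figures below are made-up inputs, and SR_Q is taken to mean return divided by volatility with no risk-free rate subtracted, which is my reading of the formula above.

```python
def sharpe(annual_return, annual_vol, risk_free=0.0):
    """Return-to-volatility ratio, optionally net of a risk-free rate."""
    return (annual_return - risk_free) / annual_vol


annual_return = 0.08   # made-up example inputs
annual_vol = 0.05
risk_free = 0.02

sr = sharpe(annual_return, annual_vol, risk_free)   # conventional Sharpe ratio
sr_q = sharpe(annual_return, annual_vol)            # no risk-free subtraction

print("SR   = %.2f" % sr)
print("SR_Q = %.2f" % sr_q)
print("SR + rf/vol = %.2f" % (sr + risk_free / annual_vol))
```

The gap between the two is the risk-free rate divided by volatility, so it is largest for low-volatility algos.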


So, one approach would be to submit three algos: one biased toward +0.3, one biased toward -0.3, and one beta ~ 0. This would be a way of diversifying the market risk, with respect to winning/losing the contest, right? It's kinda where I might be headed, but I'm guessing this is not quite what you are aiming for here.

Hi Jamie -

Sorry if it was already covered above, but as I understand there is a minimum volatility used in the volatility-adjusted return calculation, and the motivation is to preclude certain types of gaming of the contest. Could you explain how such gaming is done, or provide an algo example? I'm curious if it might be more conventional simply to use the SR (with the risk-free rate subtracted) to preclude such gaming. Without understanding the nature of the potential for gaming, it is hard for me to make this assessment.

Additionally, I don't understand "the calculation is now simply the 6 month rolling beta"--to compute beta for the algo, doesn't this imply that you would need 6 months out-of-sample? Or by rolling, do you mean that you would use less than 6 months of algo returns? If the latter, then will algos be DQ'd if they go out of the |beta| < 0.3 constraint?

Also, how will you compute beta (i.e. what is the exact formula/code/procedure)? One thing I'm concerned with is if you are using OLS, then there is the possibility of regression dilution (I think...maybe one of the stats whiz kids there can comment). Additionally, I'd like to make sure that the beta used for performance matches whatever is under the hood in SimpleBeta. It should be apples-to-apples not apples-to-oranges.

More things should not be used than are necessary.

The following are not necessary:

Trade within QTradableStocksUS
Sector Exposures
Style Exposures
Beta-to-SPY
Net Dollar Exposure

Beta neutrality, Dollar neutrality, Sector neutrality, Style neutrality are not risks per se but rather the means to manage the risks of market exposures.
Only the imbalance of neutrality creates an opportunity to make money.

Hi Jamie -

I just finished the book:

Fortune's Formula: The Untold Story of the Scientific Betting System That Beat the Casinos and Wall Street
by William Poundstone
Link: http://a.co/dI8lRx2

Although it is a popular, qualitative book, it makes the case that the Kelly criterion in practice works quite well. However, if one is constrained to go all-in, then it is sub-optimal in the long run. So, I'm wondering how this jibes with your lower-limit constraint on leverage of 0.8 (with the ideal of a leverage of ~ 1.0)? As I understand, for Kelly betting, one would never go all-in unless there is some sort of inside information and the game is rigged in one's favor. The downside of not maintaining some cash on the sidelines is that the tail risk can lead to ruin. So, I'm wondering if there should be a more generous lower-limit on the leverage?

Hi Jamie,
I am attaching a recent back-test of my algo that goes back some 13 years. Although most of the time the algo stays within the gross leverage target of 0.8-1.1, during the financial crisis of 08-09 it seems that it slightly goes out of those bounds. Maybe one or both bounds should be extended a bit.

[Attached notebook]

Regarding the leverage, it is also worth noting that optimize.MaxGrossExposure sets an upper bound constraint; there is no guard against the 0.8 lower limit.

There is a workaround, which requires computing the optimal weights with constraints, and then de-meaning and re-normalizing, and ordering without constraints:

    weights = weights - weights.mean()
    weights = weights/weights.abs().sum()

    objective = opt.TargetWeights(weights)
    order_optimal_portfolio(
        objective=objective,
        constraints=[],
    )



This is seat-of-the-pants, however, since it assumes that the constraints will still be met going forward, even though the optimizer had to reduce the leverage to meet all of the constraints.
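Here is a standalone sketch of what the de-mean and re-normalize step does to a set of weights, using plain pandas. The tickers and starting weights are hypothetical, standing in for whatever the optimizer returned.

```python
import pandas as pd

# Hypothetical optimizer output: net long, gross exposure only 0.6.
weights = pd.Series({"AAA": 0.30, "BBB": 0.10, "CCC": -0.05, "DDD": -0.15})

demeaned = weights - weights.mean()          # net exposure becomes zero
rescaled = demeaned / demeaned.abs().sum()   # gross exposure becomes 1.0

print("gross before: %.2f, after: %.2f" % (weights.abs().sum(), rescaled.abs().sum()))
print("net before:   %.2f, after: %.2f" % (weights.sum(), rescaled.sum()))
print("largest position before: %.2f, after: %.2f"
      % (weights.abs().max(), rescaled.abs().max()))
```

The last line shows the caveat in action: scaling up to full leverage also scales up individual positions, so a per-position cap that held before the adjustment may not hold after it.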

Hi Jamie,

Above, you say:

The requirement for the new contest is that algorithms trade stocks in the QTradableStocksUS. The contest is designed to be a competition for cross-sectional, long-short equity strategies, which we want to see trade exclusively from the QTradableStocksUS.

The contest requirements, though, state:

Trade within QTradableStocksUS: >= 90%

I think the requirement is that all stocks traded either currently are in or need to have been in the QTradableStocksUS (back to 2002?), and that point-in-time >= 90% of the stocks need to be in the QTradableStocksUS. In other words, suppose that point-in-time, only 90% of the stocks are in the QTradableStocksUS. The requirement is that the remaining 10% (by count?) need to have been in the QTradableStocksUS at some point in the past.

@ioannis: There will be outlier guarding in the new contest that allows you to go outside of the required bounds by a small amount every once in a while. I'll be posting an update to the notebook at the top of this thread later tonight or tomorrow morning that includes outlier guarding, which might mean your algo meets the criteria for the whole backtest period.

@Grant: On leverage, Douglas Staple nailed it earlier in the thread. It's a normalization procedure. We want contest entries to be scored based on their ability to invest ~$10M.

On the QTradableStocksUS, the requirement will be updated to be >= 95% in the QTU. The % is 'percentage of position value currently in the QTU'. There is no requirement for a stock to have been in the QTU in the past. That said, the spirit of the rule is still 'trade in the QTU' (as you suggested earlier in this thread). The reason it is not 100% is to avoid disqualifying algorithms that hold companies which undergo a spinoff, or don't turn over their entire portfolio every day. For the specific implementation of this rule, please see the code posted in the notebook at the top of this thread. Note: I'll be updating the notebook later tonight/tomorrow morning with the 95% update as well as outlier guarding, so you might want to wait to read through it until I do that (I'll add a note when I do).

Here's an updated version of the algorithm I shared earlier in this thread. I removed the beta-to-SPY constraint in order_optimal_portfolio since others pointed out it doesn't always help, and the algo stays within the beta bounds without it. This backtest was also run with the new definition of the QTradableStocksUS.

[Attached backtest, ID: 5a70ffcb0da48847ea9d6c6e]

I just updated the notebook at the top of this page and added a note about the differences from the previous version.
@Jamie, are you now acknowledging the flaws I pointed out about beta-to-SPY? And thank you for crediting "others".

Other changes I propose for the new contest and going forward:

1) I noticed in the old contest format, when you roll over paper trading results to the next month's contest, you just carry over the results from when the contestant started participating, which I think gives undue advantage to new contestants who are just coming in. Measurement should be apples to apples and not apples to oranges. You should restart everyone with the same initial capital and the same start date for the six month paper trading.

2) I also think you should account for consistency of both the 2 year backtest and the six month OOS paper trading to remove the "luck" factor. You could, for example, give a 40% weight to the 2 year backtest and a 60% weight to OOS to come to an overall score. This way you eliminate algos that do poorly on the backtest but perform very well on paper trading mainly due to just getting lucky.

I have this simple first acid test for trading strategies: give them more time. All I did was change the start and end dates. The objective is to see if they can maintain a modicum of performance. It is a step taken before wasting more time on exploring a strategy's potential.

On that basis, I consider that the above trading strategy failed miserably. It could not sustain itself on OOS data, which should lead to an evident conclusion.

It is one thing to have low beta and low volatility. But a doubling time of 35 years might simply not be the way to go whatever the circumstances, objectives or constraints. Nonetheless, one can observe the strategy broke down going forward, showing a -19% drawdown for a 2% CAGR!

I cannot see a way of making the attached trading strategy support any kind of leverage. Most certainly not 6x. It simply could not afford the added expenses. If someone could be nice enough, they might try to explain how it could be done using such a trading strategy. I am all ears.

As a side note: I am losing confidence in the output of the tear sheets and backtest results. One section shows graphs where you end with some profits, while another shows an overall average loss per trade. This problem has been reported before. I do not mind the bugs; I just want to know which is the right answer.

[Attached backtest, ID: 5a7268d4131c584317f005ce]

Hi Jamie -

Regarding beta-neutrality control, given that the baseline approach of SimpleBeta and the Optimize API doesn't work so well, I'm wondering if Q has any ideas that we could try? I would think that this would be of primary importance, since as I understand, every little bit of beta in the 1337 Street Fund (a.k.a. Q Fund...personally, I like the Q Fund better) will be money out the window. Also, presumably you'll need to convince your investors that you can forecast beta, so that they'll be confident that they aren't buying any SPY when they buy your fund.

Make a copy of the scoring system. Make a change to that copy (such as the Sharpe ratio calculation). Press a button and in two days you have a profit figure representing that change. What the button does is use that modified copy to re-run the top contest entries for a given number of past contests for each of their own contest time-periods, then backtest the resulting top 5 from each contest during what would be, for them, a common OOS period, which starts at the end of the last contest chosen (like last June, to be able to compare results to the real-world 3% downturn).

Past contests are a gold mine waiting to be mined.

There's a bug in the evaluation Notebook. You'll get an error UnboundLocalError: local variable 'max_sector_exposure_day' referenced before assignment if you hit the branch starting with if (abs_mean_sector_exposure_98 > SECTOR_EXPOSURE_98TH_MAX):. There's a similar bug with the abs_mean_style_exposure_98 > STYLE_EXPOSURE_98TH_MAX branch later on.

Thanks Douglas, I just posted an update with the fix.

@All: The new contest was announced today, along with the new daily prize structure.

Can I please get some help on how to correct the only failure of my algo? Will reducing the number of holdings help? Thanks

    Checking turnover limits...
    FAIL: Minimum turnover of 2.06% is below 3.0%.
    FAIL: 2nd percentile leverage of 2.54x is below 5.0x

@ioannis: It looks like I had a mistake in the error message there; both of those should say that your turnover is too low. What type of signal is your strategy using? If the signal itself is too slow, maybe you can try adding another component to it. What is the update frequency of the input data to your algo?

Thanks Jamie. The algo rebalances weekly, but since it is more heavily dependent on fundamentals and less on technical indicators, it does not change all positions weekly. I will try to add another technical indicator and see if this will increase turnover.

@Jamie I am still getting errors when I use the notebook from the top of this thread, if the 98th percentile check fails for style or sector exposure. I've attached a fixed version.

Also, I have two questions:
Is the top of this thread still the canonical place to get an up-to-date version of that notebook?
Is this notebook still our main tool for checking that our algorithms are passing these constraints?

[Attached notebook]
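For readers hitting the UnboundLocalError discussed above, it is the usual symptom of a variable that is assigned on one code path and read on another. The sketch below is not the notebook's code: the function names, arguments, and message format are hypothetical, chosen only to reproduce the pattern and one possible fix.

```python
def check_exposure_buggy(exposures, limit):
    """Reproduces the failure mode: worst_day is set on one branch only."""
    worst = max(abs(x) for x in exposures.values())
    if worst <= limit:
        worst_day = max(exposures, key=lambda d: abs(exposures[d]))
        return "PASS: max exposure %.2f on %s" % (worst, worst_day)
    # worst_day was never assigned on this path.
    return "FAIL: max exposure %.2f on %s" % (worst, worst_day)


def check_exposure_fixed(exposures, limit):
    """Assign worst_day before branching so both paths can use it."""
    worst_day = max(exposures, key=lambda d: abs(exposures[d]))
    worst = abs(exposures[worst_day])
    status = "PASS" if worst <= limit else "FAIL"
    return "%s: max exposure %.2f on %s" % (status, worst, worst_day)


exposures = {"2018-01-02": 0.10, "2018-01-03": -0.25}

try:
    check_exposure_buggy(exposures, limit=0.20)
    raised = False
except UnboundLocalError:
    raised = True

result = check_exposure_fixed(exposures, limit=0.20)
print(raised, result)
```

Bugs of this kind only surface when the failing branch is actually taken, which is why a backtest that passes every check never triggers them.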
@Douglas, would you be so kind as to re-run your presented backtest full tear sheet with the round_trips option turned on, as in: bt.create_full_tear_sheet(round_trips=True)  Interested in viewing a few of the numbers. Thanks. @Guy it's not my backtest: that's the one from the example posted by Jamie. @Douglas, thanks. Then, I will wait for @Jamie's kind response. @Douglas: Thanks again! I copied over your changes and updated the notebook at the top of this thread. And yes, this is the place to get the most up-to-date version. The version that you get by clicking the link in lesson 11 of the contest tutorial also normally up-to-date (though the change with your fix is still on it's way there). This notebook is still the main tool for checking whether or not you meet the criteria. I've been working with our engineers to make sure the notebook is consistent with what the overnight contest scoring job uses to check constraints. Evidently, there were certain branches of my notebook that I missed, but the current version (with your fix) is the best version to use. Once the contest begins, your contest entry will be tested each time the leaderboard is updated by the overnight contest scoring job. @Jamie, sorry for the request. Realized that I can get those numbers by myself. Thanks. Hi Jamie - Should entering the contest kick off a backtest automatically? I'm not sure it did for me. I figure this should be the backtest used to plug into the evaluation notebook, but it didn't run automatically. Also, I got an odd error in the notebook. 
See attached and below:

    IndexError                                Traceback (most recent call last)
    <ipython-input-10-242362714c09> in <module>()
    ----> 1 evaluate_backtest(positions, transactions, algorithm_returns, factor_exposures)

    <ipython-input-8-5c3f85a79755> in evaluate_backtest(positions, transactions, algorithm_returns, risk_exposures)
          2     if len(positions.index) > 504:
          3         check_constraints(positions, transactions, algorithm_returns, risk_exposures)
    ----> 4         score = compute_score(algorithm_returns[start:end])
          5     else:
          6         print 'ERROR: Backtest must be longer than 2 years to be evaluated.'

    <ipython-input-5-0d22abc0c68c> in compute_score(algorithm_returns)
         16
         17     cumulative_score = np.cumsum(daily_scores[503:])
    ---> 18     latest_score = cumulative_score[-1]
         19
         20     print ''

    /usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in __getitem__(self, key)
        581         key = com._apply_if_callable(key, self)
        582         try:
    --> 583             result = self.index.get_value(self, key)
        584
        585         if not lib.isscalar(result):

    /usr/local/lib/python2.7/dist-packages/pandas/tseries/index.pyc in get_value(self, series, key)
       1380
       1381         try:
    -> 1382             return _maybe_box(self, Index.get_value(self, series, key),
       1383                               series, key)
       1384         except KeyError:

    /usr/local/lib/python2.7/dist-packages/pandas/indexes/base.pyc in get_value(self, series, key)
       1984
       1985         try:
    -> 1986             return tslib.get_value_box(s, key)
       1987         except IndexError:
       1988             raise

    pandas/tslib.pyx in pandas.tslib.get_value_box (pandas/tslib.c:17017)()

    pandas/tslib.pyx in pandas.tslib.get_value_box (pandas/tslib.c:16774)()

    IndexError: index out of bounds

Hi Grant,

Submissions to the new contest don't yet kick off automatic backtests. Submissions will be evaluated on a nightly basis once the contest officially starts scoring overnight on Feb. 16. In the meantime, best practice is to run a backtest with default slippage and commissions from 2 years ago to today, and then run it through the notebook above.

Thanks for reporting that bug in the notebook. I haven't seen that one yet, and I can't tell from the trace what's triggering the exception. Would you be willing to create a support ticket with permission on the notebook so I can try to figure it out? If not, I'd recommend commenting out the call to compute_score in evaluate_backtest for now and just testing whether the backtest passes the criteria.

Thanks Jamie - I just cloned the notebook above, re-ran it, and got the same error. I submitted a help request and checked the box to allow you to access the notebook. If you need anything else, just let me know.

Thanks Grant. I took a look and discovered the problem. I had updated the rolling calculation used in the scoring function to use the same function used in the constraints, but hadn't realized that the change in definition also meant that the first 62 NaNs were getting dropped, so the hard-coded slice on the score was off by 62. I updated the notebook on top of this thread with the correct slice. Sorry for the confusion.

I am kind of confused why my backtest is failing with the QTradeableStocksUniverse requirement. I am filtering with this universe correctly, and my results change with it removed, so I know it's working. How come I am still failing the requirement?

"Submissions to the new contest don't yet kick off automatic backtests. Submissions will be evaluated on a nightly basis once the contest officially starts scoring overnight on Feb. 16."

Hi Jamie, I don't follow. So is a 2-year trailing backtest run every night, starting on Feb. 16? If so, how will we access the backtests?

That's correct. 2-year trailing backtests will be kicked off after each trading day starting on Feb. 16. You'll be able to get a link to the full backtest from the contest dashboard.

@Jay: It looks like the issue is that the leverage in your backtest drops to 0.
The notebook reports 0% holding in QTU if the leverage drops to 0. If you fix that, I imagine your QTU % will look more sensible. Sorry for the confusion.

Thanks Jamie, this is an intraday algo and usually clears out positions by end of day... so would it be possible to pass this requirement if there are no positions being held?

Hi Jay,

Unfortunately, contest entries will need to hold positions at the end of the day in order to meet the leverage criteria. End-of-day positions are required to accurately measure exposure to the risk model.

Hi Jamie -

Regarding your comment:

"That's correct. 2-year trailing backtests will be kicked off after each trading day starting on Feb. 16. You'll be able to get a link to the full backtest from the contest dashboard."

A few questions/comments:

1. I'm confused why the backtest would be limited to 2 years. Wouldn't you want to run a backtest starting with the original in-sample backtest start date, and every day, extend the end date by one day? This way, the current backtest would contain the full in-sample and out-of-sample data sets. Also, the rules state that positive returns are required back to the original in-sample backtest start date, so wouldn't this mean that you'd need to re-compute the total return relative to the original start date every day?

2. I'm trying to understand how users might keep track of everything associated with a given contest entry, with all of the related backtests, notebooks, forum threads, examples, support requests, etc. flying around in various places. Is there any way some sort of folder could be made available, for capturing everything germane to a given entry?

Got to say, I like the old system, where you could see whether your algorithm was accepted, along with the associated backtest and score, right after submission.

@Grant:

1. You're right, sorry, I made a mistake in that post. The start date of the backtest is fixed.
Here's the description from the announcement post:

"After each subsequent trading day, a new backtest is run starting from 2 years before the submission date, through the most recent trading day (the backtest grows in length by one day each night)."

The same explanation can be found in the official rules.

2. The new contest dashboard will be the best place to keep track of your contest entries and the associated backtests. It's blank now, but the Active Entries view will be the spot to navigate these backtests as well as your contest score.

@Alexander: The new system will start providing feedback on a T+1 basis like the old system after Feb. 16th. We're in a special submission period now. Once the contest gets going, you'll have a similar submission experience to the old system.

I think $10M forces too much into high dollar-volume stocks plus large numbers of them (digging too deeply into weaker signals). In my experience, easy alpha and profits exist in low-priced stocks (I'm looking at a 2017 backtest with 0.0 drawdown, 230% returns, beta .05). I would consider other possibilities, and I would look for ways to discuss it, a lot. Maybe even scheduled conference calls, maybe even allowing anonymous participation. It's unfortunately possible that new winners in today's status quo could merely be luck of the draw, a fragility that would not meet anyone's higher goals.

Jamie - Presently, is there no way to get the backtest IDs of submitted entries? The code?

@Grant: Unfortunately, you can't get the backtest ID yet. You'll be able to see the backtests after the first leaderboard run.

@Karl: Are you asking if downsample should be used in the rule or in an algo?

In the algorithm, Jamie. For example:

Fundamentals.market_cap.latest.percentile_between(2, 98).downsample('week_start')


Thanks for clarifying, Karl. The necessity of downsampling a factor will likely depend on the algo. I'd recommend experimenting with both approaches to see how it affects your results. There are likely situations where downsampling your signal might actually mean the algo has a harder time keeping above the 95% threshold (since the algo wouldn't make the mid-week adjustment).
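
To make the trade-off above concrete, here is a minimal pandas sketch of what weekly downsampling does to a daily signal. It is only an analogy: the helper name downsample_week_start is hypothetical, and this is not how Pipeline's downsample is implemented. Each week simply holds the value seen on its first trading day, so a mid-week change is not picked up until the following week.

```python
import numpy as np
import pandas as pd


def downsample_week_start(daily_signal: pd.Series) -> pd.Series:
    """Hold each week's first observed value for the rest of that week.

    Hypothetical helper for illustration only; assumes a DatetimeIndex.
    """
    # Label every date with its calendar week (Monday through Sunday).
    week = daily_signal.index.to_period("W")
    # Broadcast the first value of each week to every day in that week.
    return daily_signal.groupby(week).transform("first")


# Two business weeks of a toy signal that changes every day.
dates = pd.bdate_range("2018-01-01", periods=10)
daily = pd.Series(np.arange(10.0), index=dates)
weekly = downsample_week_start(daily)

# The downsampled signal ignores mid-week changes until the next week starts.
print(pd.DataFrame({"daily": daily, "week_start": weekly}))
```

Comparing the two columns shows why a downsampled filter can leave an algo holding names that dropped out of the universe mid-week.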

Hi Jamie -

I recall that for live-trading algos, you require that the algo be exactly repeatable, given the same backtest start and end dates. Thus, for example, any pseudo-random number generators need to have a fixed seed. But for the current contest, you aren't running the algos live. So, does the repeatability requirement apply?

@Grant: Correct me if I'm wrong, but I believe that non-deterministic algorithms were not supported in the live trading system; it's not that determinism was required. Our paper trading system was built in such a way that the current state of the algo each morning was retrieved by re-simulating the history of the algorithm (a backtest of sorts that we call 'catch-up'). If the algorithm was non-deterministic, then the current state each morning could be inconsistent and the live algorithm results would not be continuous.

The same is true for the new contest. Your 2+ year backtest could look different each day if the algorithm is non-deterministic. As a participant, I'm not sure that there's an advantage to having a non-deterministic algorithm, so I'd recommend that you use deterministic algorithms, but maybe I'm missing something. In the allocation process, we require algorithms to be deterministic. We use the same 'catch-up' process in our internal live trading setup, so an algorithm needs to be deterministic in order to be eligible to receive an allocation. We considered adding determinism as a requirement to enter the contest at some point, but right now, it's not a high priority item on our to-do list. Detecting determinism in an efficient, automated fashion is not trivial.

Thanks Jamie -

Sounds like an algo wouldn't be disqualified if it had non-deterministic output. I don't have a particular use case at this point (I was toying around with np.random and realized that you hadn't said anything for the contest). I'll use a fixed seed if I do end up using np.random.

Testing for it seems pretty easy, assuming that only two tests are required. Just run the algo, and run it again, and compare. This wouldn't be exhaustive, but would probably catch algos with non-fixed random number generator seeds.
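
The run-twice check described above can be sketched in a few lines. This is a toy illustration, not a contest tool: toy_signal and is_repeatable are hypothetical names, and toy_signal merely stands in for any algo step that draws random numbers.

```python
import numpy as np


def toy_signal(n_assets, seed=None):
    """Hypothetical stand-in for an algo step that uses random numbers."""
    # With seed=None, RandomState is seeded from fresh entropy on every call.
    rng = np.random.RandomState(seed)
    return rng.normal(size=n_assets)


def is_repeatable(run, n_runs=2):
    """Call run() n_runs times and report whether all outputs match the first."""
    first = run()
    return all(np.array_equal(first, run()) for _ in range(n_runs - 1))


# A fixed seed reproduces the same output on every run.
seeded = is_repeatable(lambda: toy_signal(5, seed=42))
# Without a seed, two runs are expected to differ.
unseeded = is_repeatable(lambda: toy_signal(5))
print("seeded repeatable:", seeded)
print("unseeded repeatable:", unseeded)
```

As noted, two runs are not exhaustive: a check like this catches unseeded generators but could miss rarer sources of non-determinism.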

@Jamie,

Any feedback on the notebooks I showed you above regarding the possibility of a later start date outscoring an earlier start date? And would you rethink the scoring or revise it to an apples-to-apples measurement?

Also, any updates on the TargetWeights issue?

Hey Guys, I would like to congratulate you on this contest. I believe it is significantly better because of the constraints that are used to screen algorithms before entering the contest, as well as the new universe selection framework and, finally, portfolio construction.

I would like to offer an idea to improve upon based on an observation. It is about the three algorithm limit allowed in the contest.

I currently have a number of algorithms. Some use different fundamental indicators; others are the same with adjustments to the feature weights to perform better over a specific period of time, etc. All algorithms that I deem appropriate, I live trade for my own information. As of now, my best performing algorithm (since the kick-start of this contest) is not among the 3 algorithms I selected to use in the contest. This is a limitation both on my side and on Quantopian's, since Quantopian does not get access to such an algo.

What I am asking is to consider allowing more than 3 algorithms to compete at a given time by each user.

Thanks

Hi ioannis,

Why not withdraw your worst-performing algo so you can submit your best-performing one? Only the algo with the highest current score counts in the contest anyway. Since the score is based on a 63-day rolling window (cumulative from the day you submitted the algo for the returns in the numerator, so it may take a few days for a new algo to accumulate a higher return), there's no real 'start' day for the contest.
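
As a rough illustration of the scoring idea mentioned above (returns in the numerator, a 63-day rolling window, summed cumulatively), here is a minimal sketch. The function name, the annualization by sqrt(252), and the volatility floor are assumptions made for illustration; the official rules and the evaluation notebook remain the reference for the real formula.

```python
import numpy as np
import pandas as pd


def compute_score_sketch(daily_returns, window=63, vol_floor=0.02):
    """Cumulative sum of daily returns scaled by trailing volatility.

    Illustrative only: vol_floor and the sqrt(252) annualization are
    assumptions, not confirmed contest parameters.
    """
    # Trailing volatility; the first (window - 1) values are NaN.
    vol = daily_returns.rolling(window).std() * np.sqrt(252)
    # Assumed floor so very quiet periods do not inflate the score.
    vol = vol.clip(lower=vol_floor)
    # Dropping the NaNs shortens the series by (window - 1) rows, so any
    # hard-coded positional slice applied afterwards must account for that.
    daily_scores = (daily_returns / vol).dropna()
    return daily_scores.cumsum()


# Toy input: 100 days of random returns with a fixed seed.
rng = np.random.RandomState(0)
returns = pd.Series(rng.normal(0.0005, 0.005, size=100))
scores = compute_score_sketch(returns)
print(len(returns), len(scores))
```

With a 63-day window the output is 62 rows shorter than the input, which is the same offset behind the slicing bug discussed earlier in the thread.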

Personally I think 3 algos per person is a fair balance. The more algos they allow, the higher the likelihood that one of the algos will be 'overfit' (either by design or by chance) to perform very well (high returns with low volatility) during certain short-term market conditions, which is not really what they are looking for.

For being considered for an allocation (the REAL prize), I believe they are looking for algos that have been demonstrated to generate, out of sample, consistently high Sharpe ratios (high returns with low volatility), and that are also somewhat scalable if you throw a lot of capital/leverage at them. Unfortunately the incentive for the contest and for an allocation are not completely aligned in my view (the former is more short term, whereas the latter is longer term), but I also understand the difficulty in having them completely aligned and also being able to have a 'rolling' start date for new entrants. Perhaps having a longer rolling window that's exponentially weighted to the short term would be more aligned with the allocation objective, though I do also appreciate the simplicity of the current scoring method.

There will always be people who are incentivised to only win the daily cash prizes, but I think Q wants us to submit algos that can be competitive both in the short term (for the contest) and in the long run, possibly for an allocation.

Just my thoughts and opinion, feel free to disagree.

Joakim

Joakim, thanks for writing your thoughts. I understand the overfitting problem, but the new criteria are set in order to minimize submission of overfitted algos. The only drawback I see in what I’m proposing is that Q will have to review more algos. Before allocating capital to an algo, Q does a more thorough analysis and runs longer backtests to find times when the algo was performing poorly.
My proposal minimizes the probability of missing a good algo at the cost of reviewing more algos. At the end of the day, you can only allocate money to your best algo.
As to your point of submitting the algo now, the algo was already competing and was removed to be replaced with one which had better returns in the recent past. Removing one algo that was performing less (I am 9th in the contest) to start another that was performing better in a short period of time (~month) is (in my view) like chasing your tail.
Also, if one factor weight was enhanced to perform better over a period of time, or another feature was added, etc., why should the old algorithm be removed? It did have positive returns after all.
-Ioannis

Hi Jamie,

I'm wondering if it might be time to start talking about a review and update of the rules of the present long-short U.S. equity contest, given that we are coming up on the 6-month mark?

Also, I recall the idea that y'all would devise other "flavors" of contests. What do you have in mind?