New tearsheet challenge on the estimates dataset!

Our contest remains the most successful method to inspire new ideas, license new algorithms, and provide guidelines on the type of algorithms we are looking for. Right off the bat: the contest is not changing. It is unrelated to this challenge and remains your best bet to receive an allocation. But because of its success, we want to experiment with some additional challenges, and this is the first of those experiments.

Here are the rules:

  • There is no submission process or live-updated leaderboard like there is for the contest. To enter this challenge, simply post an alpha tearsheet as a reply to this thread: run a backtest on your factor, then run the alpha notebook, which loads in your backtest results (see the sketch after this list).
  • The deadline to submit a factor is September 16, 2019 at 9:30am ET.
  • There is no hold-out testing; just post your best factor over the period June 1, 2015 through September 1, 2018.
  • We will look at all submissions and select the 5 best algorithms at our discretion. Each winner gets a $100 prize.
  • There is no limit on the number of submissions.
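
For reference, loading your backtest results in the research environment looks roughly like this ('your_backtest_id' is a placeholder you copy from the URL of a completed backtest in the IDE; the exact loading cell in the alpha notebook may differ):

# Research environment sketch
bt = get_backtest('your_backtest_id')   # get_backtest is built into research
bt.create_full_tear_sheet()             # standard pyfolio tearsheet
# The alpha notebook consumes the same backtest object near the top of the notebook.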

Algorithm requirements to enter the challenge:

  • Use at least one of the estimates datasets (Consensus, Actuals, Long Term, Broker Recommendations, Guidance) as your primary signal source.
  • Use TargetWeights in the optimizer and do not put any constraints on common risk exposures.
  • Use the QTU (QTradableStocksUS) as your universe.
  • Set slippage and commission costs to 0.

When selecting a winner, we will primarily look at:

  • Specific Sharpe Ratio (IR) in the first 5 days (higher is better).
  • Turnover (lower is better).
  • Universe size (larger is better).
  • Not driven mainly by common risk (there is no reason to try to artificially reduce your exposures; ideally, your idea is dissimilar enough from common factors that it will be naturally uncorrelated).

These rules are mostly derived from our updated guidelines on getting an allocation and are based on many community members' feedback. Thank you for all your input and creative suggestions. If this challenge is well-received, we will continue to offer more experimental challenges.
Ultimately, we want to keep improving our ability to find your best ideas and fund them!

Good luck and happy coding!


77 responses

Game on!

What if we run out of memory trying to load such a long backtest into the notebook?

Viridan: Good point, I hadn't thought of that. I considerably reduced the duration of the backtest so that there shouldn't be any memory problems.

Edit: see below for my two "final" submissions. One based on this algo, and another that uses higher turnover to drive better short-term returns.

3 estimates factors

[Notebook attached; preview unavailable]

thank you Quantopian!

Nice. My submission -

[Notebook attached; preview unavailable]

@Balakrishnan Swaminathan I think you posted the wrong notebook.

first try

[Notebook attached; preview unavailable]

Turnover is likely too high to be exciting for Q, but here's another shot.

[Notebook attached; preview unavailable]

(Re-submitting because I made some fixes.)

To my surprise, there actually appears to be some good alpha in analyst price targets, albeit with very quick alpha decay. Trading at the close is already much worse than trading at the open, and within a day it's all gone.

Interesting to see the style analysts latch onto -- heavily momentum.

It's a shame Q can't do anything with this. Fundamental alpha seems like a lot of hocus pocus to me, like predicting next year's fashions, whereas this has a clear mechanism that explains it.

[Notebook attached; preview unavailable]

This is the same strategy, just starting on Jan 1, 2010.

[Notebook attached; preview unavailable]

Would Quantopian be kind enough to post a template algorithm (backtest) using all the features and data so that we can work on it? The learning curve is quite steep otherwise for many of us.

Portfolio Blend.

[Notebook attached; preview unavailable]

I previously uploaded the wrong NB, so I just deleted the post to keep things clean. This consists of 4 FactSet Estimates factors. There is a drop-off from Day 1 to Day 2, but it seems to remain stable thereafter. Turnover averages out at around 10%. One take-away for me is to see if I can minimize risk tilts a little better, but overall it's not too bad.

[Notebook attached; preview unavailable]

My first one: 4 estimates-based factors (maybe too complex?). This was my 'validation' period (single backtest).

[Notebook attached; preview unavailable]

Here is a template for creating an algo for this challenge.

"""
Sample algo using estimates datasets for mini-contest
(Consensus, Actuals, Long Term, Broker Recommendations, Guidance)

This template is based on an example algorithm from Daniel Cascio.
"""

# Import required pipeline methods
from quantopian.pipeline import Pipeline
from quantopian.algorithm import attach_pipeline, pipeline_output

# Import any built in filters and/or factors
from quantopian.pipeline.filters import QTradableStocksUS

# Import the estimates datasets. Only PeriodicConsensus is used below;
# the rest are imported to show what is available.
from quantopian.pipeline.data.factset.estimates import (PeriodicConsensus,
                                                        Actuals,
                                                        ConsensusRecommendations as Consensus,
                                                        LongTermConsensus,
                                                        Guidance)

# Import optimize
import quantopian.optimize as opt

import pandas as pd

def initialize(context):
    # Normally a contest algo uses the default commission and slippage
    # This is unique and only required for this 'mini-contest'
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0))    
    set_slippage(slippage.FixedSlippage(spread=0))

    attach_pipeline(make_pipeline(context), 'sales_estimate_pipeline') 

    # Place orders towards the end of each day
    schedule_function(rebalance, date_rules.every_day(), time_rules.market_close(hours=2))
               
    # Record any custom data at the end of each day    
    schedule_function(record_positions, date_rules.every_day(), time_rules.market_close())

        
def create_factor():
    # Use the QTradableStocksUS universe as a base
    qtu = QTradableStocksUS()
    
    # Create an alpha factor
    # Must use at least one of the estimates datasets as the primary signal
    # (Consensus, Actuals, Long term, Broker recommendations, Guidance)
    # Replace this logic with your own factor(s).
    # Slice once: up/down revision counts and the number of estimates
    # for next-fiscal-quarter ('qf', 1) sales
    fq1_sales = PeriodicConsensus.slice('SALES', 'qf', 1)
    up = fq1_sales.up.latest
    down = fq1_sales.down.latest
    num_est = fq1_sales.num_est.latest
    alpha_factor = (up - down) / num_est
        
    # Filter out securities with very few estimates or an invalid factor value
    screen = qtu & alpha_factor.isfinite() & (num_est > 2)
    
    return alpha_factor, screen

def make_pipeline(context):  
    alpha_factor, screen = create_factor()
    
    # Winsorize to remove extreme outliers
    alpha_winsorized = alpha_factor.winsorize(min_percentile=0.05,
                                              max_percentile=0.95,
                                              mask=screen)
    
    # Zscore to get long and short (positive and negative) alphas to use as weights
    alpha_zscore = alpha_winsorized.zscore()
    
    return Pipeline(columns={'alpha_factor': alpha_zscore}, 
                    screen=screen)

def rebalance(context, data): 
    # Get the alpha factor data from the pipeline output
    output = pipeline_output('sales_estimate_pipeline')
    alpha_factor = output.alpha_factor
    
    # Weight securities by their alpha factor
    # Divide by the sum of absolute weights so gross leverage is 1
    weights = alpha_factor / alpha_factor.abs().sum() 
    
    # Must use TargetWeights as an objective
    order_optimal_portfolio(
        objective=opt.TargetWeights(weights),
        constraints=[],
    )
    
def record_positions(context, data):
    pos = pd.Series()
    for position in context.portfolio.positions.values():
        pos.loc[position.sid] = position.amount
        
    pos /= pos.abs().sum()
        
    quantiles = pos.quantile([.05, .25, .5, .75, .95]) * 100
    record(q05=quantiles[.05])
    record(q25=quantiles[.25])
    record(q50=quantiles[.5])
    record(q75=quantiles[.75])
    record(q95=quantiles[.95])

Thank you @Thomas

Thank you @Thomas.

The IR dropoff :(

[Notebook attached; preview unavailable]

@Jamie, by the looks of the turnover chart, you're turning over more than 100% per day. Under the new guidelines, they only look at end-of-day positions, with likely maximum turnover limits in the range of 20%-30% per day.

I'm a little bit confused by the turnover, to be honest; I'm only ordering once at the beginning of the day.

@Jamie, I do like that equity curve. Great job.

Sorry for the submission spam, I've had a lot of fun with some of these datasets :). Will Q do a live video evaluating our submissions?

[Notebook attached; preview unavailable]

Five estimate factors (across LongTermConsensus, ConsensusRecommendations, PeriodicConsensus, Guidance, Actuals).

[Notebook attached; preview unavailable]

Does the entry need to include estimates datasets only? Can I use other factors as well in addition to factors derived from estimates datasets?
Here is my entry. This contains only the estimates datasets. Thanks!

  • Newer Version Below
[Notebook attached; preview unavailable]

Shiv: No, every factor should use estimates as its primary source, but if there is alpha in combining estimates data with other data sources, that's totally fine. Your tearsheet looks great, but: a) it looks like you're close to equal weight; b) can you increase your universe size?; and c) you have a pretty high tech exposure, which doesn't have to be bad, but maybe that is a spot where you could apply the optimizer to constrain just the tech exposure.

Signal Blend.

[Notebook attached; preview unavailable]

@Jamie -- Let's say you have a $100mm book and rebalance the entire portfolio once a day: if you exit $100mm worth of positions and enter $100mm worth of new positions, you just turned over a total of $200mm. That's presumably why your notebook's turnover is hitting 2.0 (i.e., $200mm transacted / $100mm portfolio value).
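
For what it's worth, here is the arithmetic from the reply above as a short sketch (numbers are the hypothetical $100mm example):

portfolio_value = 100e6            # $100mm book
exits, entries = 100e6, 100e6      # full daily rebalance: sell everything, buy anew
turnover = (exits + entries) / portfolio_value
print(turnover)                    # 2.0, i.e. the ~200% spikes in the turnover chart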

Four Factors

  • Newer Version Below
[Notebook attached; preview unavailable]

I'm really excited about all these fantastic submissions!

One request: If you post an updated version of your factor, please remove the previous version. That will make it much easier for us to sort through when determining the winners. We will take correlations and similarity into account.

four-factor model updated

[Notebook attached; preview unavailable]

This one uses https://www.quantopian.com/posts/build-alpha-factors-with-cointegrated-pairs as the starting point. @Rene uses cointegrated pairs to look at earnings surprises and goes long or short based on the direction of the surprise. Because there are only a limited number of pairs, we apply a minimum-total-number-of-assets filter (60 in this case), which yields a factor that both holds a limited number of assets and switches investment on and off.

[Notebook attached; preview unavailable]

Three estimates factors

[Notebook attached; preview unavailable]

My final submission.

2 Sales and 2 EPS estimates-based factors.

[Notebook attached; preview unavailable]

Three-factor alpha. There is a newer version further below.

[Notebook attached; preview unavailable]

Final Submission:
Added 4 more factors (8 in total). Correlation with my last entry is 0.80.

[Notebook attached; preview unavailable]

Final Submission 1: Four Estimates Factors

[Notebook attached; preview unavailable]

Here are two factors with better risk-exposure tilts.

[Notebook attached; preview unavailable]

Two factors to get in early before the start of analyst momentum herding.

[Notebook attached; preview unavailable]

Five estimate factor model

[Notebook attached; preview unavailable]

My second attempt

[Notebook attached; preview unavailable]

One factor to keep things simple

[Notebook attached; preview unavailable]

Final Submission 2: Seven Estimates Factors

[Notebook attached; preview unavailable]

single factor

[Notebook attached; preview unavailable]

Could someone please guide me on how to create a CustomFactor with a 90-day window on FactSet PeriodicConsensus data?
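
One way to do this is a CustomFactor with window_length=90 over a PeriodicConsensus slice. A minimal sketch, assuming the next-quarter sales consensus and its mean column (the class name and revision formula are illustrative):

import numpy as np
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.factset.estimates import PeriodicConsensus

# Next-fiscal-quarter sales consensus (forward-filled daily by pipeline)
fq1_sales = PeriodicConsensus.slice('SALES', 'qf', 1)

class EstimateRevision90d(CustomFactor):
    """Percent change in the consensus mean over a trailing 90-day window."""
    inputs = [fq1_sales.mean]
    window_length = 90

    def compute(self, today, assets, out, mean_est):
        # mean_est is a (90 x n_assets) array; row -1 is the most recent day.
        # Division by zero yields inf/nan; screen those out with isfinite().
        out[:] = (mean_est[-1] - mean_est[0]) / np.abs(mean_est[0])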

4 estimates factors. Tried to focus on a large universe, long-term stability, and controlled risk exposure via factor construction.

[Notebook attached; preview unavailable]

Single Estimates alpha without constraints.
There is a better version further below.

[Notebook attached; preview unavailable]

Three-factor alpha without constraints.

[Notebook attached; preview unavailable]

Another single factor

[Notebook attached; preview unavailable]

Attempt 1.

[Notebook attached; preview unavailable]

Pure raw 8 Estimates factors

[Notebook attached; preview unavailable]

Pure raw 5 factors, quantile binning

[Notebook attached; preview unavailable]

Wow, some really impressive tearsheets. My submission uses the Broker Recommendations dataset as its main source.

[Notebook attached; preview unavailable]

Another tearsheet, also using Broker Recommendations as the main dataset.

[Notebook attached; preview unavailable]

Late submission. I tried to replace my second model but wasn't able to.

Edit: Deleted my second model above, as this one replaces it.

[Notebook attached; preview unavailable]

Thanks for all your submissions; we are really excited about the algorithms, some of which look really good. What I'll do next is go through all of them, select the winners together with David, and then do a live review where we will announce the winners (of course, I'll post them here too).

Given the success of this challenge, I'd really like to get some feedback, as we'll definitely do more of these in the future. What did you enjoy, what was not so great, what questions came up, and was the time allotted enough or not nearly enough?

Also, if you want to make my life easier, please go back and delete any outdated submissions that are superseded by more up-to-date ones. Or, even better, edit the old ones to say that there is a newer version further below; that way we can see the progression. Thanks!

Thanks, @Thomas, for your template algorithm. I mostly struggled with the API, and it would be nice if we could see more examples of using FactSet data.

  1. I did not know how to get a series of consensus estimates over a historical window (say, 90 days).
  2. I could not figure out why this does not work as expected: factor.rank(groupby=sector, mask=screen).zscore(groupby=sector) (see the sketch after this list).
  3. Q does a great job pre-trade, but access to post-trade analytics within an algorithm is very poor. I want to see the IC of my signals as the algo progresses so that I can use some form of dynamic weighting.
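
On point 2, one likely culprit (a guess, not a confirmed diagnosis): if the mask is not passed to every stage, unmasked assets enter the z-scoring as NaN-ranked values and can distort the result. A sketch assuming the Morningstar Sector classifier:

from quantopian.pipeline.classifiers.fundamentals import Sector

sector = Sector()
# Pass the same mask at every stage so the rank and the z-score
# are computed over exactly the same universe.
ranked = alpha_factor.rank(groupby=sector, mask=screen)
alpha = ranked.zscore(groupby=sector, mask=screen)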

Thanks, everyone, for submitting your factors; we are really excited about the results of this challenge!

I announced the winners during our webinar, where I also went over the new alpha tearsheet, showed some common failure modes (and what to do about them), reviewed how the winners performed out-of-sample, and showed what happens when we combine the winning factors on our end. If you missed the live webinar, you can watch it here: https://www.youtube.com/watch?v=FYnxvdHPan8&feature=youtu.be

The winners are (in no particular order), drumroll:

  • Antony Jackson
  • Shiv Chawla
  • Vedran Rusman
  • Kyle M
  • Vladimir

We liked these 5 factors so much that we are sending licensing agreements to all five winners. Please join me in congratulating them; they really did outstanding work.

If you submitted a factor but didn't win, we still want to thank you for submitting by sending you a Quantopian T-shirt (if shipping is too hard because of where you live, we will send you an Amazon gift card instead).

If your factor received some feedback that you now want to fix, please post an updated version here and I'll be happy to take another look.

Stay tuned for the next data challenge coming up soon!

I don't believe you have my address; how are you going to ship a t-shirt?

@Jamie: My people will get in touch with your people ;)

Yes, please get in touch with me, too.
I want the Q T-Shirt!
Even if I didn't have good results on this particular challenge, I'm doing really great in the daily contest, so I really think I deserve one!

Thanks!

Here's my resubmission. Six estimates factors.

[Notebook attached; preview unavailable]

Here's the same as above, but implemented with an unsupervised ML algo. It looks almost identical, but I wanted to illustrate the implementation possibilities.

[Notebook attached; preview unavailable]

@James: Those look excellent, thanks for posting the updates. There is a good chance we'll want to license that one too.

My updated submission.

[Notebook attached; preview unavailable]

I'll be posting an updated version with some changes.

@Thomas
I worked further and added (and improved) more factors. This includes the latest work with the Guidance dataset, which I have shared separately in the other post.

[Notebook attached; preview unavailable]

Thank you

[Notebook attached; preview unavailable]

My updated submission. Daily turnover and alpha decay are still incredibly high; that's just part of this strategy.
There's a thread on the evolution of this strategy; thanks again for the great feedback I got there!

[Notebook attached; preview unavailable]

Thanks everyone for posting the updates, we'll evaluate them carefully.

Apologies that the winners and contestants have not received their prizes yet. As this is the first time we've done this, we're still figuring things out. The winners should have their cash prizes by the end of this week. If you submitted and want to get your t-shirt, please email [email protected] with your address and shirt size (S, M, L, XL).

Finally, I can't promise that we will keep the T-shirts for future challenges; we'll play that by ear.

Here is the updated version for the estimates challenge. I spent almost all of my time in the research environment for most of the last three weeks, going over it as a fun exercise and following the methodology laid out by Thomas. Attached are the notebooks for the 2007-2012 and 2013-2018 periods. I plan to post OOS updates of this submission in this thread a year and two years from now.
2007-2012

[Notebook attached; preview unavailable]

2013-2018

[Notebook attached; preview unavailable]

I have some feedback on the allocation process. From the webinar, it appears that what you did with these strategies was apply a set of minimum criteria from the alpha notebook (consistent turnover, a consistent and low spread in the percentage holdings, returns that are mostly specific returns), then sort on IR and take the 5 best, with in-sample IRs of 2.5-3.0, rejecting the rest. However, three of the five strategies were flat in the 1-year FactSet holdout/OOS period, and there is no market validation confirming that they are in fact 2.5-3.0 strategies out of sample, commensurate with their in-sample Sharpe/IR. I'd recommend an alternate approach: use hierarchical risk parity among the submitted strategies and allocate based on your current confidence level, as sketched below. If you think some strategies are 5 times better than others, allocate 5:1 within that cluster. That way you can keep recalibrating based on how they perform in real market conditions going forward. Going all-in on the highest-IR strategies of the last 3 years, at what may be the tail end of a bull market, carries risks of its own. Having a wider range of strategies within your estimates factor will probably give you more options if the factor struggles under the currently assigned weights. Personally, I don't think the difference between the strategies that received allocations and the others that were submitted and met all the basic criteria is a binary yes/no; maybe it is more like a fraction between 0 and 1 based on the information currently available, which could always change as real market performance comes in.
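
To make the suggestion concrete, here is a toy sketch of the cluster-then-tilt idea (this is not López de Prado's full HRP recursion; the function, its inputs, and the confidence scores are hypothetical):

import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_tilt_weights(strategy_returns, confidence, n_clusters=4):
    """Toy allocation: cluster strategies by return correlation, split
    capital equally across clusters, then tilt within each cluster by a
    discretionary confidence score (e.g. 5:1 for favored strategies)."""
    corr = strategy_returns.corr().values
    dist = np.sqrt(0.5 * np.clip(1.0 - corr, 0.0, 2.0))  # correlation distance
    condensed = dist[np.triu_indices_from(dist, k=1)]
    labels = pd.Series(
        fcluster(linkage(condensed, method='single'), n_clusters,
                 criterion='maxclust'),
        index=strategy_returns.columns)
    weights = pd.Series(0.0, index=strategy_returns.columns)
    for c in labels.unique():
        members = labels.index[labels == c]
        within = confidence[members] / confidence[members].sum()
        weights[members] = within / labels.nunique()
    return weights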

Finding a way to keep more talent engaged in ongoing challenges, with rewards that vary based on current information once a basic bar is met, would help sustain participation and a sense of community in the effort.

Thanks for the feedback, Leo. The example equal-weighted allocation I showed in the webinar is not how we allocate in our fund, so I completely agree.

Thomas, one more piece of feedback: it would probably be a good idea to compare an industry-standard estimates factor's performance over the in-sample and out-of-sample periods, and measure how much relative degradation each submission suffered with respect to that benchmark's degradation (a simple version is sketched below). In the absence of an industry-standard estimates factor, one could take the average degradation (or the degradation of the top 50%) across the submission universe.
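
For concreteness, the comparison could be as simple as a ratio of IR retention rates (all numbers below are made up):

# Hypothetical in-sample / out-of-sample IRs
strategy_is, strategy_oos = 2.8, 0.9    # a submitted strategy
bench_is, bench_oos = 1.5, 0.8          # an industry-standard estimates factor

# How much of the strategy's IR survived, scaled by how much of the
# benchmark's IR survived over the same periods.
rel_retention = (strategy_oos / strategy_is) / (bench_oos / bench_is)
print(rel_retention)   # about 0.60: the strategy degraded more than the benchmark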

@Leo M: That's a really good idea, thanks!