New tearsheet challenge on the estimates dataset!

Our contest remains the most successful method to inspire new ideas, license new algorithms, and provide guidelines on the type of algorithms we are looking for. Right off the bat: the contest is not changing. It is unrelated to this challenge and remains your best bet to receive an allocation. But because of its success, we want to experiment with some additional challenges, and this is the first of these experiments.

Here are the rules:

• There is no submission or live-updated leaderboard like for the contest. To enter this challenge, simply post an alpha tearsheet as a reply to this thread. To do this, run a backtest of your factor, then run the alpha notebook, which loads in your backtest results.
• The deadline to submit a factor is September 16, 2019 at 9:30am ET.
• There is no hold-out testing; just post your best factor over the period from June 1, 2015 to September 1, 2018.
• We will look at all submissions and manually determine the 5 best algorithms at our discretion. Each winner gets a $100 prize.
• There is no limit on the number of submissions.

Algorithm requirements to enter the challenge:

• Use at least one of the estimates datasets (Consensus, Actuals, Long term, Broker recommendations, Guidance) as your primary signal source.
• Use TargetWeights in the optimizer and do not put any constraints on common risk exposures.
• Use the QTU.
• Set slippage and commission costs to 0.

When selecting a winner, we will primarily look at:

• Specific Sharpe Ratio (IR) over the first 5 days (higher is better).
• Turnover (lower is better).
• Universe size (larger is better).
• Not driven mainly by common risk (but there is no reason to try to artificially reduce your exposures; ideally your idea is dissimilar enough from common factors that it will be naturally uncorrelated).

These rules are mostly derived from our updated guidelines on getting an allocation and are based on many community members' feedback. Thank you for all your input and creative suggestions. If this challenge is well received, we will continue to offer more experimental challenges. Ultimately, we want to keep improving our ability to find your best ideas and fund them!

Good luck and happy coding!

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment.
No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action, as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

72 responses

What if we run out of memory trying to load such a long backtest into the notebook?

Viridan: Good point, I hadn't thought of that. I considerably reduced the duration of the backtest so that there shouldn't be any memory problems.

Edit: see below for my two "final" submissions. One based on this algo, and another that uses higher turnover to drive better short-term returns.

3 estimates factors

thank you Quantopian!

Nice. My submission -

@Balakrishnan Swaminathan I think you posted the wrong notebook.

first try

Likely too high of turnover to be exciting for Q, but here's another shot.

(Re-submitting because I made some fixes.) To my surprise, there actually appears to be some good alpha in analyst price targets, albeit with very quick alpha decay.
Trading at close is already much worse than trading at open, and within a day it's all gone. Interesting to see the style analysts latch onto -- heavily momentum. It's a shame Q can't do anything with this. Fundamental alpha seems like a lot of hocus pocus to me, like predicting next year's fashions, whereas this has a clear mechanism that explains it.

This is the same strategy, just starting at Jan 1, 2010.

Would Quantopian be kind enough to post a template algorithm (backtest) using all the features and data so that we can work on it? The learning curve is quite steep otherwise for many of us.

I previously uploaded the wrong NB, so I just deleted the post to keep things clean. Consists of 4 FS EST factors. There is a drop-off from Day 1 to Day 2, but it seems to remain stable thereafter. Turnover averages out around 10%. One take-away for me is to see if I can minimize risk tilts a little better, but overall not too bad.

My first one. 4 Estimates-based factors (maybe too complex?). This was my 'validation' period (single backtest).

Here is a template for creating an algo for this challenge.

"""
Sample algo using estimates datasets for mini-contest
(Consensus, Actuals, Long Term, Broker Recommendations, Guidance)
This template is based on an example algorithm from Daniel Cascio.
"""
# Import required pipeline methods
from quantopian.pipeline import Pipeline
from quantopian.algorithm import attach_pipeline, pipeline_output

# Import any built in filters and/or factors
from quantopian.pipeline.filters import QTradableStocksUS

# Import datasets being used
from quantopian.pipeline.data.factset.estimates import (PeriodicConsensus,
                                                        Actuals,
                                                        ConsensusRecommendations as Consensus,
                                                        LongTermConsensus,
                                                        Guidance)

# Import optimize
import quantopian.optimize as opt

import pandas as pd


def initialize(context):
    # Normally a contest algo uses the default commission and slippage.
    # Zeroing them out is unique to, and only required for, this 'mini-contest'.
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

    attach_pipeline(make_pipeline(context), 'sales_estimate_pipeline')

    # Place orders towards the end of each day
    schedule_function(rebalance,
                      date_rules.every_day(),
                      time_rules.market_close(hours=2))

    # Record any custom data at the end of each day
    schedule_function(record_positions,
                      date_rules.every_day(),
                      time_rules.market_close())


def create_factor():
    # Use the QTradableStocksUS universe as a base
    qtu = QTradableStocksUS()

    # Create an alpha factor.
    # Must use at least one of the estimates datasets as the primary signal
    # (Consensus, Actuals, Long term, Broker recommendations, Guidance).
    # Replace this logic with your own factor(s).
    up = PeriodicConsensus.slice('SALES', 'qf', 1).up.latest
    down = PeriodicConsensus.slice('SALES', 'qf', 1).down.latest
    num_est = PeriodicConsensus.slice('SALES', 'qf', 1).num_est.latest
    alpha_factor = (up - down) / num_est

    # Filter out securities with very few estimates or where the factor is invalid
    screen = qtu & alpha_factor.isfinite() & (num_est > 2)

    return alpha_factor, screen


def make_pipeline(context):
    alpha_factor, screen = create_factor()

    # Winsorize to remove extreme outliers
    alpha_winsorized = alpha_factor.winsorize(min_percentile=0.05,
                                              max_percentile=0.95,
                                              mask=screen)

    # Zscore to get long and short (positive and negative) alphas to use as weights
    alpha_zscore = alpha_winsorized.zscore()

    return Pipeline(columns={'alpha_factor': alpha_zscore}, screen=screen)


def rebalance(context, data):
    # Get the alpha factor data from the pipeline output
    output = pipeline_output('sales_estimate_pipeline')
    alpha_factor = output.alpha_factor

    # Weight securities by their alpha factor.
    # Divide by the sum of absolute weights to create a leverage of 1.
    weights = alpha_factor / alpha_factor.abs().sum()

    # Must use TargetWeights as an objective
    order_optimal_portfolio(
        objective=opt.TargetWeights(weights),
        constraints=[],
    )


def record_positions(context, data):
    pos = pd.Series()
    for position in context.portfolio.positions.values():
        pos.loc[position.sid] = position.amount
    pos /= pos.abs().sum()
    quantiles = pos.quantile([.05, .25, .5, .75, .95]) * 100
    record(q05=quantiles[.05])
    record(q25=quantiles[.25])
    record(q50=quantiles[.5])
    record(q75=quantiles[.75])
    record(q95=quantiles[.95])

Thank you @Thomas

Thank you @Thomas. The IR dropoff :(

I'm a little bit confused by the turnover, tbh -- I'm only ordering once at the beginning of the day.

@Jamie, I do like that equity curve. Great job.

Sorry for the submission spam, I've had a lot of fun with some of these datasets :). Will Q do a live video evaluating our submissions?
Five estimate factors (across LongTermConsensus, ConsensusRecommendations, PeriodicConsensus, Guidance, Actuals).

Does the entry need to include estimates datasets only? Can I use other factors as well in addition to factors derived from estimates datasets?

Here is my entry. This contains only the estimates datasets. Thanks!

• Newer Version Below

Shiv: No, every factor should be using estimates as its primary source, but if there is alpha in combining estimate data with other data sources, that's totally fine. Your tearsheet looks great, but a) it looks like you're close to equal weight, b) can you increase your universe size, and c) you have a pretty high tech exposure, which doesn't have to be bad, but maybe that is a spot where you could apply the optimizer to constrain only the tech exposure.

@Jamie -- Let's say you have a $100mm book and rebalance the entire portfolio once a day. If you exit $100mm worth of positions and enter $100mm worth of new positions, you just turned over a total of $200mm. That's presumably why your one notebook's turnover is hitting 2.0 (i.e. $200mm transacted / $100mm portfolio value).
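The arithmetic above can be sketched in a few lines (a toy illustration of two-sided turnover, not Quantopian's exact definition; the function name is made up):

```python
def daily_turnover(sold_value, bought_value, portfolio_value):
    # Turnover counts both sides of the rebalance: dollars sold plus
    # dollars bought, relative to the portfolio value.
    return (sold_value + bought_value) / portfolio_value

# Exiting $100mm of positions and entering $100mm of new ones on a
# $100mm book transacts $200mm in total, i.e. turnover of 2.0.
print(daily_turnover(100e6, 100e6, 100e6))  # -> 2.0
```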

Four Factors


I'm really excited about all these fantastic submissions!

One request: If you post an updated version of your factor, please remove the previous version. That will make it much easier for us to sort through when determining the winners. We will take correlations and similarity into account.

four-factor model updated


This one uses
https://www.quantopian.com/posts/build-alpha-factors-with-cointegrated-pairs
as the starting point.
@Rene uses co-integrated pairs to look at earnings surprises, and goes long or short based on the direction of the surprise.
Because there are limited pairs, we use a minimum-total-number-of-assets filter (60 in this case), and hence create a factor that both holds a limited number of assets and is investment-on-off filtered.
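The investment-on-off gating described above can be sketched like this (a hypothetical helper in plain pandas; the actual pairs logic lives in the linked post):

```python
import pandas as pd

def gate_by_breadth(weights, min_assets=60):
    # If fewer than `min_assets` names carry a nonzero weight, stand
    # aside entirely; otherwise normalize to gross leverage 1.
    active = weights[weights != 0].dropna()
    if len(active) < min_assets:
        return weights * 0.0
    return weights / weights.abs().sum()

# Only 3 active names here, so the gate zeroes everything out.
w = pd.Series({'A': 0.5, 'B': -0.3, 'C': 0.2})
print(gate_by_breadth(w, min_assets=60))
```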


Three estimates factors


Three-factor alpha. There is a newer version further below.


Final Submission:
Added 4 more factors (8 in total). Correlation with my last entry is 0.80.


Final Submission 1: Four Estimates Factors


Here are two factors with better risk-exposure tilts.


Two factors to get in early before the start of analyst momentum herding.


Five estimate factor model


My second attempt


One factor to keep things simple


Final Submission 2: Seven Estimates Factors


single factor


Could someone please guide me on how to create a Custom Factor with a 90-day window on Factset PeriodicConsensus data?
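I can't give an official answer, but the general shape is a pipeline CustomFactor with window_length=90 whose inputs are PeriodicConsensus columns; inside compute(), each input arrives as a (days x assets) array. Here is just the windowed computation, demonstrated on synthetic data (plain numpy; the Quantopian dataset wiring is assumed, and the function name is made up):

```python
import numpy as np

# Inside a pipeline CustomFactor with window_length=90, each input is
# delivered to compute() as a (90, n_assets) array. This sketch averages
# a daily consensus value over the trailing 90 days, per asset.
def compute_90d_mean(mean_est):
    # mean_est: (90, n_assets) window of daily consensus estimates
    return np.nanmean(mean_est, axis=0)

# Synthetic stand-in for the pipeline-provided window: 90 days, 3 assets.
window = np.tile(np.array([1.0, 2.0, 3.0]), (90, 1))
print(compute_90d_mean(window))  # -> [1. 2. 3.]
```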

4 estimate factors. Tried to focus on a large universe, long-term stability, and controlled risk exposure via factor construction.


Single Estimates alpha without constraints.
There is a better version further below.


Three-factor alpha without constraints


Another single factor


Attempt 1.


Pure raw 8 Estimates factors


Pure raw 5 factors, quantile binning


Wow, some really impressive tearsheets. My submission uses the Broker Recommendations dataset as its main source.


Another tearsheet, also using Broker Recommendations as the main dataset.


Late submission. Tried to replace my second model but wasn't able to.

Edit: Deleted my second model above, as this one replaces it.


Thanks for all your submissions; we are really excited about all the algorithms, some of which look really good. What I'll do next is go through all of them, select the winners together with David, and then do a live review where we will announce the winners (of course I'll post it here too).

Given the success of this challenge, I'd really like to get some feedback, as we'll definitely do more of them in the future. What did you enjoy here, what was not so great, what questions came up, and was the time allotted enough or not nearly enough?

Also, if you want to make my life easier, please go back and delete any outdated submissions that are superseded by more up-to-date ones. Or, even better, edit those old ones saying that there is a newer version further below, that way we can see the progression. Thanks!

Thanks, @Thomas, for your template algorithm. I mostly struggled with the API, and it would be nice if we could see more examples of using Factset data.

1. Did not know how to get a series of consensus estimates for a historic window (say 90 days).
2. Could not figure out why this does not work as expected: factor.rank(groupby=sector, mask=screen).zscore(groupby=sector)
3. Q does a great job pre-trade, but access to post-trade analytics within an algorithm is very poor. I want to see the IC of my signals as algo progresses so that I could use some form of dynamic weighting.
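I can't speak to why the pipeline call in item 2 misbehaves, but the intended sector-relative transform can be sanity-checked in plain pandas (toy data, hypothetical column names):

```python
import pandas as pd

df = pd.DataFrame({
    'factor': [3.0, 1.0, 2.0, 10.0, 30.0, 20.0],
    'sector': ['tech', 'tech', 'tech', 'energy', 'energy', 'energy'],
})

# Rank within each sector, then z-score those ranks within each sector.
ranks = df.groupby('sector')['factor'].rank()
zscores = ranks.groupby(df['sector']).transform(lambda r: (r - r.mean()) / r.std())
print(zscores.tolist())  # -> [1.0, -1.0, 0.0, -1.0, 1.0, 0.0]
```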

Thanks, everyone, for submitting your factors; we are really excited about the results of this challenge!

I announced the winners during our webinar, where I also went over the new alpha tearsheet, showed some common failure modes (and what to do about them), how the winners performed out-of-sample, and what happens when we combine the winning factors on our end. If you missed the live webinar, you can watch it here: https://www.youtube.com/watch?v=FYnxvdHPan8&feature=youtu.be

Drumroll... the winners are (in no particular order):

• Antony Jackson
• Shiv Chawla
• Vedran Rusman
• Kyle M

We liked these 5 factors so much that we are sending licensing agreements to all of them. Please join me in congratulating them, as they really did outstanding work.

If you submitted a factor but didn't win, we still want to thank you for submitting by sending you a Quantopian T-Shirt (if shipping is too hard because of where you live we will send you an Amazon gift card instead).

If your factor received some feedback that you now want to fix, please post an updated version here and I'll be happy to take another look.

Stay tuned for the next data challenge coming up soon!

I don't believe you have my address, how are you going to ship a t-shirt?

@Jamie: My people will get in touch with your people ;)

Yes, please get in touch with me, too.
I want the Q T-Shirt!
Even if I didn't have good results on this particular challenge I'm doing really great on the daily contest, so I really think I deserve one!

Thanks!

Here's my resubmission. Six estimates factors.


Here's the same as above, but implemented with an unsupervised ML algo. It looks almost identical, but I wanted to illustrate the implementation possibilities.


@James: Those look excellent, thanks for posting the updates. There is a good chance we'll want to license that one too.

My updated submission.


I'll be posting an updated version with some changes.

@Thomas
I worked further and added (and improved) more factors. This includes the latest work with the Guidance dataset, which I have shared separately in the other post.


Thank you


My updated submission. Daily turnover and alpha decay are still incredibly high; that's just part of this strategy.
There's a thread on the evolution of this strategy; thanks again for the great feedback I got there!


Thanks everyone for posting the updates, we'll evaluate them carefully.

Apologies that the winners and contestants have not received their prizes yet. As this is the first time we've been doing this we're still figuring things out. The winners should have the cash prizes by the end of this week. If you submitted and want to get your t-shirt, please email [email protected] with your address and shirt-size (S, M, L, XL).

Finally, I can't promise that we will keep the T-Shirts for future challenges, we'll play that by ear.

Here is my updated version for the estimates challenge. I spent most of the last three weeks in the research environment, treating this as a fun exercise and following the methodology laid out by Thomas. Attached are the notebooks for the 2007-2012 and 2013-2018 periods. I plan to post OOS updates of this submission in this thread one and two years from now.
2007-2012
