$10K Third-Party Challenge: Design a Factor for a Large US Corporate Pension

We’ve been working with a large US corporate pension fund that is extremely interested in the Quantopian community’s ability to come up with interesting new factors. They want you to find something that they can’t get anywhere else, so we are asking you to send us your most unique and innovative ideas. 15 winners will share in a total prize pool of $10,000 and will be eligible to potentially have their factors licensed for inclusion in an entirely new strategy!

Quantopian is interesting to this pension fund because of the diversity of ideas that come from our community. The pension fund would like to construct an entirely new strategy combining the best unique ideas that fit within their criteria and to provide diversification from their current sources of returns.

This challenge has a new element compared to previous Quantopian challenges: we have been provided with anonymized returns data for several existing strategies that the pension fund uses to evaluate potential new managers. We are providing this obfuscated data to you so that you can check its correlation with your factors. We'll be using this correlation in the final scoring criteria.

Strategy Objectives:

This challenge will have a wider set of constraints than previous ones, which should be interesting to many of you. We are looking for factors with the following properties:

• The portfolio must hold at least 100 assets in the QTU.
• There should be a daily turnover of 5% to 20%.
• There are no constraints on risk exposures or beta to SPY, but your exposures must be time-varying — these tilts should be moving daily.
• The specific Sharpe ratio over the first 5 days must be positive.
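
The holdings and turnover properties above can be checked offline from a table of daily end-of-day weights. The following is a minimal sketch, not the challenge notebook's own check: the function name `check_constraints`, the toy data, and the turnover definition (half the sum of absolute daily weight changes) are all assumptions, and Quantopian's turnover figure may be computed differently.

```python
import numpy as np
import pandas as pd


def check_constraints(weights, min_holdings=100, turnover_range=(0.05, 0.20)):
    """Summarise holdings count and daily turnover for a weights table.

    `weights` is a DataFrame indexed by date with one column per asset;
    zero or NaN means the asset is not held on that day.
    """
    w = weights.fillna(0.0)
    # Number of non-zero positions held each day.
    n_holdings = (w != 0).sum(axis=1)
    # Assumed turnover definition: half the sum of absolute weight changes.
    turnover = w.diff().abs().sum(axis=1).iloc[1:] / 2.0
    mean_turnover = turnover.mean()
    return {
        "min_holdings": int(n_holdings.min()),
        "mean_turnover": float(mean_turnover),
        "holdings_ok": bool(n_holdings.min() >= min_holdings),
        "turnover_ok": bool(turnover_range[0] <= mean_turnover <= turnover_range[1]),
    }


# Toy example: 120 assets whose weights drift a little each day.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2014-01-06", periods=20)
raw = rng.normal(size=(20, 120)).cumsum(axis=0) + rng.normal(size=(20, 120))
toy = pd.DataFrame(raw, index=dates)
toy = toy.div(toy.abs().sum(axis=1), axis=0)  # gross leverage of 1
report = check_constraints(toy)
print(report)
```

In practice the `weights` table would come from the recorded end-of-day positions of a backtest rather than from random numbers.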

Requirements:

Post an alpha tearsheet as a reply to this thread to submit to the challenge. To do this, run a backtest on your factor from January 4, 2014, to August 29, 2018. Then run the alpha decay notebook attached to this thread to analyze your backtest results. Note that this notebook is custom for this challenge.

Selection Criteria:

Similar to previous challenges, we will evaluate only your factor’s end-of-day holdings. For more examples of what we look for, check out our last live tearsheet reviews.

Please avoid combining too many ideas into a single factor. You are free to submit multiple ideas, but keeping them isolated lets us better find the most interesting submissions. The scoring will be based on a combination of the following:

• alpha decay analysis - if your strategy decays too quickly it is more difficult to implement.
• consistency between in-sample and out-of-sample testing.
• uniqueness to the provided benchmark dataset (you can view your uniqueness score at the top of the alpha decay plots).
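
A rough way to gauge the uniqueness criterion offline is to correlate a factor's daily returns with each of the provided benchmark return series. This is a minimal sketch on synthetic data; `max_abs_correlation` is a hypothetical helper, and the uniqueness score shown in the challenge notebook may be defined differently.

```python
import numpy as np
import pandas as pd


def max_abs_correlation(factor_returns, benchmark_returns):
    """Largest absolute Pearson correlation between a factor's daily
    returns (Series) and each benchmark strategy (DataFrame columns)."""
    aligned = benchmark_returns.join(factor_returns.rename("factor"), how="inner")
    corr = aligned.corr()["factor"].drop("factor")
    return corr.abs().max()


rng = np.random.default_rng(1)
dates = pd.bdate_range("2014-01-06", periods=250)
bench = pd.DataFrame(rng.normal(0, 0.01, size=(250, 3)),
                     index=dates, columns=["s1", "s2", "s3"])
# A factor that is mostly noise plus a small loading on benchmark s1.
factor = 0.2 * bench["s1"] + pd.Series(rng.normal(0, 0.01, 250), index=dates)
score = max_abs_correlation(factor, bench)
print(round(score, 3))
```

A lower value suggests the factor's returns are less similar to the existing strategies; a factor that simply replicates one benchmark column would score 1.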

Prizes:

A total of $10,000 will be awarded as follows:

• 5 winners will receive $1,000 each
• 10 winners will receive $500 each

You will be eligible to win multiple prizes if you have multiple unique ideas that are selected!

Important Dates:
The submission deadline for this challenge is March 3, 2020, at 9 a.m. EST.

We look forward to seeing what you produce!

Thomas Wiecki,
VP of Data Science at Quantopian

431 responses

Here is a template algorithm from the insiders challenge you can use.

# Template algorithm for the insiders challenge. Based on an algorithm provided by Leo M
# The algo uses documented example from: https://www.quantopian.com/docs/data-reference/ownership_aggregated_insider_transactions

from quantopian.algorithm import attach_pipeline, pipeline_output

import quantopian.optimize as opt
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import QTradableStocksUS
from quantopian.pipeline.domain import US_EQUITIES

# Form 3 transactions
from quantopian.pipeline.data.factset.ownership import Form3AggregatedTrades
# Form 4 and Form 5 transactions
from quantopian.pipeline.data.factset.ownership import Form4and5AggregatedTrades

import pandas as pd
import numpy as np


def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Normally a contest algo uses the default commission and slippage
    # This is unique and only required for this 'mini-contest'
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

    # Rebalance every day, 2 hours after market open.
    schedule_function(
        rebalance,
        date_rules.every_day(),
        time_rules.market_open(hours=2),
    )
    # Create our dynamic stock selector.
    attach_pipeline(make_pipeline(context), 'pipeline')

    # Record any custom data at the end of each day
    schedule_function(record_positions,
                      date_rules.every_day(),
                      time_rules.market_close())


def create_factor():
    # Base universe set to the QTradableStocksUS
    qtu = QTradableStocksUS()
    # Slice the Form3AggregatedTrades and Form4and5AggregatedTrades
    # DataSetFamily into DataSets. Here, insider_txns_form3_90d is a DataSet
    # containing insider transaction data for Form 3 over the past 90 calendar
    # days, and insider_txns_form4and5_90d is a DataSet containing insider
    # transaction data for Forms 4 and 5 over the past 90 calendar days. We only
    # include non-derivative ownership (derivative_holdings is False).
    insider_txns_form3_90d = Form3AggregatedTrades.slice(False, 90)
    insider_txns_form4and5_90d = Form4and5AggregatedTrades.slice(False, 90)
    # From each DataSet, extract the number of unique buyers and unique sellers.
    # We do not need to include unique sellers using Form 3, because Form 3 is
    # an initial ownership filing, and so there are no sellers using Form 3.
    unique_filers_form3_90d = insider_txns_form3_90d.num_unique_filers.latest
    unique_buyers_form4and5_90d = insider_txns_form4and5_90d.num_unique_buyers.latest
    unique_sellers_form4and5_90d = insider_txns_form4and5_90d.num_unique_sellers.latest
    # Sum the unique buyers from each form together.
    unique_buyers_90d = unique_filers_form3_90d + unique_buyers_form4and5_90d
    unique_sellers_90d = unique_sellers_form4and5_90d
    # Compute the fractions of insiders buying and selling.
    frac_insiders_buying_90d = unique_buyers_90d / (unique_buyers_90d + unique_sellers_90d)
    frac_insiders_selling_90d = unique_sellers_90d / (unique_buyers_90d + unique_sellers_90d)

    # Compute the factor as the buying fraction minus the selling fraction
    # (it is ranked and z-scored later, in make_pipeline).
    alpha_factor = frac_insiders_buying_90d - frac_insiders_selling_90d

    screen = qtu & ~alpha_factor.isnull() & alpha_factor.isfinite()

    return alpha_factor, screen


def make_pipeline(context):
    alpha_factor, screen = create_factor()

    # Winsorize to remove extreme outliers
    alpha_winsorized = alpha_factor.winsorize(min_percentile=0.02,
                                              max_percentile=0.98,
                                              mask=screen)

    # Zscore and rank to get long and short (positive and negative) alphas to use as weights
    alpha_rank = alpha_winsorized.rank().zscore()

    return Pipeline(columns={'alpha_factor': alpha_rank},
                    screen=screen, domain=US_EQUITIES)


def rebalance(context, data):
    # Get the alpha factor data from the pipeline output
    output = pipeline_output('pipeline')
    alpha_factor = output.alpha_factor
    log.info(alpha_factor)
    # Weight securities by their alpha factor
    # Divide by the abs of total weight to create a leverage of 1
    weights = alpha_factor / alpha_factor.abs().sum()

    # Must use TargetWeights as an objective
    order_optimal_portfolio(
        objective=opt.TargetWeights(weights),
        constraints=[],
    )


def record_positions(context, data):
    # Collect the share amount held in each open position
    pos = pd.Series()
    for position in context.portfolio.positions.values():
        pos.loc[position.sid] = position.amount

    # Normalize so the absolute values sum to 1
    pos /= pos.abs().sum()

    # Show quantiles of the daily holdings distribution
    # to show if weights are being squashed to equal weight
    # or whether they have a nice range of sensitivity.
    quantiles = pos.quantile([.05, .25, .5, .75, .95]) * 100
    record(q05=quantiles[.05])
    record(q25=quantiles[.25])
    record(q50=quantiles[.5])
    record(q75=quantiles[.75])
    record(q95=quantiles[.95])

Model 0.1

Model #1

Model #2

Model #3

Model #4

Model #5

Model #6

Model #7

@Thomas,

If we want to include the Insiders dataset (two year holdout period), can we use the period January 4, 2014 - January 4, 2018 instead?

Very excited to have a go at this challenge!

@Joakim: Good question, yes absolutely. It won't matter for the scoring as we can run it internally on the full IS and OOS period, so the only disadvantage is to you in not being able to see part of the backtest period other users have access to.

model v0.1

Thomas, I posted this concern on the tearsheet thread -- this tearsheet doesn't factor in beta as a risk factor? So if I submit a short-only algorithm, for instance, the specific IR will be much lower than it should be.

model v0.2

Thanks @Anotony, I got mixed up on my notebooks!

factset estimates

alternative (self-serve) SHORT-ONLY

as mentioned earlier, the correct specific IR is closer to 2.0 than 1.0.

Factor was created > 6 months ago.

Model #8

Model #9

Model #10

Model #11

Model 1A

Model 1B (alternative ranking).

Model 2A.

Model 1

model 1

Model #1

Submission v1. Fixed some coding issues with my previous v1 submission to better reflect my economic hypothesis.

Thank you for all the submissions so far! One pattern I see here in submissions by Mikhai, Arun, and Emiliano is that the lower-left plot shows the portfolio to be equal weighted. We will probably punish that type of pattern and invite you to not rely on maximize_alpha or the optimizer at all. You can see the template algorithm I shared above for how to get to a more sensible weighting scheme. Also, see the tearsheet review of an earlier challenge for more info on what we are looking for: https://www.youtube.com/watch?v=r5FRV5XnY1M
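
The difference between an equal-weighted book and one weighted by the factor itself can be illustrated outside the Quantopian environment. This is a minimal sketch in plain pandas, assuming a made-up Series of raw factor values; `factor_to_weights` is a hypothetical helper that mirrors the rank, z-score, and normalize steps of the template above rather than reproducing the platform's pipeline.

```python
import numpy as np
import pandas as pd


def factor_to_weights(alpha):
    """Rank, z-score, and scale a raw factor to gross leverage 1."""
    ranked = alpha.rank()
    zscored = (ranked - ranked.mean()) / ranked.std()
    return zscored / zscored.abs().sum()


rng = np.random.default_rng(2)
alpha = pd.Series(rng.standard_t(df=3, size=200),
                  index=["asset_%d" % i for i in range(200)])
weights = factor_to_weights(alpha)

# Weights sum to roughly zero and their absolute values sum to 1.
print(weights.sum(), weights.abs().sum())
# Unlike an equal-weighted book, the weights span a range of magnitudes.
print(weights.abs().quantile([0.05, 0.5, 0.95]))
```

If the quantiles of the absolute weights come out nearly identical, the portfolio is effectively equal weighted, which is the pattern described above.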

@Thomas. Thanks for the review. I'll remove the optimizer constraints and resubmit.

looks bad over this time span...

Model 2B (alternative ranking).

Model 3A.

Model 3B (alternative ranking).

.

Model 3.1A (updated factors with slightly higher uniqueness score).

Model 4A.

alphaV

Model 5A.

Model 5B (alternative ranking).

Model 4B (alternative ranking).

Model 4.1B

technical

Model 6A

Model 6B (alternative ranking).

Model 7A - Average Daily Turnover: 21.03% (raw, un-smoothed).

Model 7B (alternative ranking) - Average Daily Turnover: 20.85% (raw, un-smoothed).

2005-2019 Model #2

2014-2018 Model #2

Model 1

Model 0.2

Model 7.1A (smoothed). Average Daily Turnover: 10.26%.

Model 7.1B (smoothed). Average Daily Turnover: 10.19%.

Model 8.

Model 9A.

Model 9B (alternative ranking).

Model 10A.

Model 10B (alternative ranking).

Model 11 - Value Factor Composite, each statistically robust (or so I've convinced myself anyway).

Model 12 - Growth Factor Composite, each statistically robust (or so I've convinced myself anyway).

Model 12.1 - Growth Factor Composite (alternative ranking)

Model 2

Strategy #1, looks like the universe is small based on the FS fields used.

Model1:

Model 13A - Three pure Insiders factors (each statistically significant on held out data).

I'm worried that my previous models were possibly overfit, so this and future models have increased focus on robustness, simplicity, and statistical significance.

Model 13B - Same factors; alternative ranking.

Model 14.

Strategy #2

Model 15

Model 3

Model 1

Model: 3A

@Thomas,
Is it OK to use TargetWeights with some constraints? The graphs look much nicer.

Model #12

Model #13

Model #14

Model#15

model #16

Model #17

Strategy #3 mixes various sentiments around the market

Model #18

Model #19

Model #20

Model 1

V6 - 2 Factors . Only uses FS Guidance and FS Estimates.

Submission v2

Model 16

My entry 1

Model #21

Model #22

@all: The submissions so far look really great. One pattern, however, seems to be pretty common: exposures that are long momentum and short value and volatility. There is nothing wrong with that per se, but note that being in a cluster of strategies with very similar exposure patterns will decrease your chances of winning. It also seems that this type of exposure pattern results in lower uniqueness scores.

If you find that pattern in your algorithm, do NOT use the optimizer to get rid of it, but rather realize that your idea might not be as unique as you might have hoped for and see if you can try things more off the beaten path.
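
One way to see whether a portfolio's style tilts are static or moving is to compute its daily exposure to a style factor as the weighted sum of per-asset loadings. This is a minimal sketch on synthetic data; the loadings table and the name `daily_exposure` are hypothetical, and this is not the risk model used by the challenge notebook.

```python
import numpy as np
import pandas as pd


def daily_exposure(weights, loadings):
    """Portfolio exposure per day: sum over assets of weight * loading.

    Both inputs are DataFrames indexed by date with one column per asset.
    """
    return (weights * loadings).sum(axis=1)


rng = np.random.default_rng(3)
dates = pd.bdate_range("2014-01-06", periods=60)
assets = ["asset_%d" % i for i in range(150)]
momentum = pd.DataFrame(rng.normal(size=(60, 150)), index=dates, columns=assets)

# Static tilt: weights proportional to the loadings every single day.
static_w = momentum.div(momentum.abs().sum(axis=1), axis=0)
# Unrelated signal: weights drawn independently of the loadings.
other = pd.DataFrame(rng.normal(size=(60, 150)), index=dates, columns=assets)
other_w = other.div(other.abs().sum(axis=1), axis=0)

static_exp = daily_exposure(static_w, momentum)
other_exp = daily_exposure(other_w, momentum)
print(static_exp.mean(), other_exp.mean())
```

A series that stays on one side of zero, like `static_exp` here, indicates a persistent tilt, while one that changes sign over time indicates exposures that move from day to day.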

Seems like if you have both low turnover and positive returns, that would translate to at least incidental momentum exposure. It's a tautology, no?

Low turnover and low alpha decay will lead to static exposures to factors. If the factor generates a return, it'll also be exposed to returns (momentum). If value generated a negative return over the time period, it will have been short value.

Is there a way around this? Perhaps by shrinking the portfolio to a subset of the universe where the exception to the rule prevails?

The S&P is in a bull market. Value is not doing well.

Model 1.

Model 2.

Low volatility equity returns

Model 2

Model #3

Model 2

Model 3

Alt

Model4.1

Model #23

Model #4

1. Is the @ operator in compute_uniq() specific to Quantopian? I have never seen it used in NumPy or pandas.

2. I don't think you need the line longs = expos.loc[expos > 0] in get_max_median_position_concentration(), or the similar line after it for shorts. I don't see either of them used anywhere.

3. You can get rid of the annoying red warning when fixing the risk loadings bug (risk_returns.loc[risk_returns.value.idxmax(), 'value'] = 0) by adding factor_returns.is_copy = False prior to the bug-fix line.
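
On the first question above: `@` is Python's built-in matrix multiplication operator (available since Python 3.5), and both NumPy arrays and pandas objects support it, so it is not specific to Quantopian. A short illustration with made-up numbers (the arrays below are not taken from the notebook):

```python
import numpy as np
import pandas as pd

weights = np.array([0.5, -0.25, 0.25])
returns = np.array([[0.01, 0.02, -0.01],
                    [0.00, -0.01, 0.03]])

# For NumPy arrays, `@` gives the same result as np.dot here.
portfolio_returns = returns @ weights
assert np.allclose(portfolio_returns, np.dot(returns, weights))

# pandas aligns on labels: DataFrame columns are matched to the Series index.
df = pd.DataFrame(returns, columns=["a", "b", "c"])
w = pd.Series(weights, index=["a", "b", "c"])
labelled = df @ w
print(labelled)
```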

Model #5 - Single Value Factor

Model 17A

Model 4

Model 17B

Model 0.3

Model 0.4

Model 1.1

Model 1.2

Model 4 v2 - Alternate weighting

Wow, so many low turnover, high Sharpe ratio, low correlation strategies. It's actually quite discouraging. Unless this is a festival of overfits, congrats everybody for cracking the secrets to the stock market.

Model #24, using the optimizer

Model 1.3

Model 1.4

Single Factor - 1st Submission Model 1

Model A

Submission v2.1

Submission v2.2

Model7

Model8

Model 2 , based on Earnings Expectations

Submission v1.2

Submission v1.1

Model 5 revised - Nonlinear ML

Model 6 - Nonlinear ML

Long Backtest of Model 6 (For illustration purposes only to show long term consistencies of performance under different market regimes/conditions).
From 01/04/2006 To 01/31/2019

Submission v3

Submission v3.1

Submission v3.2

Submission v4

Submission v4.1

Submission v4.2

Model 18 - Financials specific model.

Model 19

Insider-only

Model #24 One factor

Single Factor 0

Model #25

Model #26

Model #27

Model #28

Model #29

Model #30

Model 12

Model 7 -Nonlinear ML

Model A

Model AA

I have a question regarding the following property:

• The portfolio must hold at least 100 assets in the QTU

My question:
Would it be ok if the portfolio on average holds 140 assets in the QTU and 10 assets outside the QTU?

Model 3

Model 8 - same inputs and ML algo as Model 7, plus another factor derived from a separate ML algo that determines which state each individual asset is currently in, where a state is defined as trending, mean reverting or neutral.

Model 8a - same inputs and ML algo as Model 7, plus another factor derived from a separate ML algo that determines which regime the market is currently in, where a regime is defined as trending, mean reverting or neutral.

Model 1

Model 2

Strategy 2a

Model 9

Model 10

Alpha Factor #2

another factor

Model 9 - ChaosTheory principles implemented in Nonlinear ML

Alpha #3

Here's the long backtest of Model 9 (01/04/2006 - 02/08/2019). This is an alternative validation process to see whether the model's in-sample performance is consistent on data it has not seen before. It can give some sense of statistical confidence and checks how well the model generalizes in the face of changing market regimes and conditions.

Model 2.1

Model 2.2

Model 3

Model 4

Model 2.3

Model 2.4

Model 2.5

Model 2.6

Model 2.7

Model 2.8

Model 5

Model 6

Model 7

Model 3.1

Model 3.2

ST BB V

My first submission

Submission v5

Submission v5.1

Submission v5.2

Submission v6

Submission v6.1

Submission v6.2

Submission v7

Submission v7.1

Submission v7.2

Submission v8

Submission v8.1

Submission v8.2

Model15

Model16

Model17

Model14

My first submission

alt 2.0

Model 10

Model 11

Model13

Model18

Model 12

Model19

Model 13

2

3

Submission 4/ 2 factors with data limited to 400-450 Stocks

Model v1, using some insider data.

Model 14

Model A1

Model B1

Model C1

Model D1

Model E1

Model F1

Model G1

sub 5/ Only 1 factor with limited data (not estimates)

Sub 5.1/ the same 1 factor above.

Fund V1

Model #31

Model #32

Model #33

Model #34

Model 16

Model 17

Model 18

Model 19

Model 20

Model #35

Model 4.0

Model 1.1

Model 21

Model 22

Model 23

Model 24

Model #36

Model #37

v1.12

6/ Only 1 factor.

Model 2

Model 1, Version 1

I am not sure what to make of the huge turnover spike around Dec 2015. In the backtest it is a lot tamer, corresponding to an increase in turnover from 16% to 22%.

Model 1

Pension-01-bt7--Insiders+ML
Only up to 02-21-2018 due to Insiders data availability.

v 1.1

Model 2

Submission

Model 1, Version 2

Avg_IRS = 2.41; uniqueness = 97.05; Avg_n_holdings = 279; Avg_turnover = 0.078; v3 Backtest 34

Submission

Strat #8, high turnover

Avg_IRS = 2.48; uniqueness = 97.19; Avg_n_holdings = 402; Avg_turnover = 0.080; v5 Backtest 1

Avg_IRS = 2.51; uniqueness = 96.90; Avg_n_holdings = 616; Avg_turnover = 0.073; v5 Backtest 3

Avg_IRS = 2.53; uniqueness = 97.06; Avg_n_holdings = 737; Avg_turnover = 0.076; v5 Backtest 6

Avg_IRS = 2.54; uniqueness = 97.36; Avg_n_holdings = 725; Avg_turnover = 0.086; v5 Backtest 7

Avg_IRS = 2.55; uniqueness = 97.40; Avg_n_holdings = 765; Avg_turnover = 0.087; v5 Backtest 9

Avg_IRS = 2.57; uniqueness = 97.42; Avg_n_holdings = 795; Avg_turnover = 0.089; v5 Backtest 11

Model 6.1

Model #6

model 1

v 1.13

v 1.14

v 1.15

Model 8

Model 9

Model 10

Model 11

Model 10

Model 3

Model #38

Model #39

Model #40

Full tearsheet

Pension-01-bt27--MultiFactors+ML
Only up to 02-21-2018 due to Insiders data availability.

v1,1

v1.5