Back to Community
New Tearsheet Challenge: Insider Transactions Dataset, $5000 in Prizes!

Good morning Mr. Hunt,

Your mission, should you choose to accept it, is to come up with predictive factors for the Insider Transactions Dataset. It is the most recent addition to our collection of datasets and is equal parts interesting and challenging. Submit to this challenge for an opportunity to explore Insider Transactions, win prizes, and vie for an allocation. Since this is our last challenge of 2019, we are 10x-ing the prize pool to a total of $5,000: the top 5 submissions will receive $1,000 each!

If you submit, please fill out the questionnaire (your submission will still be valid even if you don't fill it out).

About the Challenge:

Insiders are informed market participants who must follow strict regulations when trading shares of the company for which they possess material non-public information. Most people associate "insider trading" with traders illicitly exploiting insider knowledge for their own gain. In the case of this dataset, however, it refers to registered insiders transacting in the stock of the companies for which they work.

Insiders can trade through two regulated channels: (i) scheduled transactions at recurring frequencies, or (ii) open market transactions. Scheduled transactions allow insiders to buy or sell shares in a consistent manner; a pre-arranged, consistent schedule reduces the risk that material information informs their decisions.

Insiders must be careful to ensure they aren't using material information to inform their decisions. A big undisclosed contract, a new drug discovery, etc. are all information that the market hasn't had an opportunity to process, and insiders are therefore prohibited from acting on it. Insiders do, however, possess information about the general pulse of the company, so these transactions carry a general sentiment about the company.

For more information, see the announcement post or the documentation on Insider Transactions.

Requirements:
- Post an alpha tearsheet as a reply to this thread to submit to the challenge. To do this, run a backtest on your factor, then run the alpha notebook, which loads in your backtest results.
- Post your best factor backtested from Jan 4, 2014, until Dec 19, 2017.
- The scoring will be based on the alpha decay analysis in the backtest as well as hold-out period (for more details see below). We will also evaluate consistency between the backtest and hold-out periods to disincentivize overfitting.
- Your algorithm must not use any stocks outside the QTU.
- There is no limit on the number of submissions. If you submit multiple iterations, you may version them.
- For an easy start, clone the template algorithm attached in this post (thanks to Leo M for providing an example algorithm).
- Do not simulate any transaction costs; turnover will be used to evaluate the cost of trading your algorithm.

Selection Criteria:
- Just like with previous challenges, we will only evaluate your algorithm based on its end-of-day holdings.
- Specific Sharpe Ratio (IR) over the first 5 days in the alpha decay analysis (higher is better).
- Turnover (lower is better).
- Not driven mainly by common risk (but no reason to try and artificially reduce your exposures, ideally your idea is dissimilar enough from common factors that it will be naturally uncorrelated).
- Universe size (larger is better).
- For more examples of what we look for, check out our last live tearsheet reviews.

Prizes:
Top 5 factors receive $1,000 each and a chance for an allocation.

Important Upcoming Dates:
The submission deadline for this challenge is February 1, 2020, at 9 a.m. EST.

I hope to see your submission on the list!

Thomas Wiecki,
VP of Data Science at Quantopian

# Template algorithm for the insiders challenge. Based on an algorithm provided by Leo M
# The algo uses documented example from: https://www.quantopian.com/docs/data-reference/ownership_aggregated_insider_transactions

from quantopian.algorithm import attach_pipeline, pipeline_output

import quantopian.optimize as opt
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import QTradableStocksUS
from quantopian.pipeline.domain import US_EQUITIES

# Form 3 transactions
from quantopian.pipeline.data.factset.ownership import Form3AggregatedTrades
# Form 4 and Form 5 transactions
from quantopian.pipeline.data.factset.ownership import Form4and5AggregatedTrades

import pandas as pd
import numpy as np

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Normally a contest algo uses the default commission and slippage
    # This is unique and only required for this 'mini-contest'
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0))   
    set_slippage(slippage.FixedSlippage(spread=0))
    
    # Rebalance every day, 2 hours after market open.
    schedule_function(
        rebalance,
        date_rules.every_day(),
        time_rules.market_open(hours=2),
    )
    # Create our dynamic stock selector.
    attach_pipeline(make_pipeline(context), 'pipeline') 
    
    # Record any custom data at the end of each day    
    schedule_function(record_positions, 
                      date_rules.every_day(),
                      time_rules.market_close())
    
    
def create_factor():
    # Base universe set to the QTradableStocksUS
    qtu = QTradableStocksUS()
    # Slice the Form3AggregatedTrades DataSetFamily and Form4and5AggregatedTrades
    # DataSetFamily into DataSets. Here, insider_txns_form3_90d is a DataSet
    # containing insider transaction data for Form 3 over the past 90 calendar
    # days, and insider_txns_form4and5_90d is a DataSet containing insider
    # transaction data for Forms 4 and 5 over the past 90 calendar days. We only
    # include non-derivative ownership (derivative_holdings is False).
    insider_txns_form3_90d = Form3AggregatedTrades.slice(False, 90)
    insider_txns_form4and5_90d = Form4and5AggregatedTrades.slice(False, 90)
    # From each DataSet, extract the number of unique buyers and unique sellers.
    # We do not need to include unique sellers using Form 3, because Form 3 is
    # an initial ownership filing, and so there are no sellers using Form 3.
    unique_filers_form3_90d = insider_txns_form3_90d.num_unique_filers.latest
    unique_buyers_form4and5_90d = insider_txns_form4and5_90d.num_unique_buyers.latest
    unique_sellers_form4and5_90d = insider_txns_form4and5_90d.num_unique_sellers.latest
    # Sum the unique buyers from each form together.
    unique_buyers_90d = unique_filers_form3_90d + unique_buyers_form4and5_90d
    unique_sellers_90d = unique_sellers_form4and5_90d
    # Compute the fractions of insiders buying and selling.
    frac_insiders_buying_90d = unique_buyers_90d / (unique_buyers_90d + unique_sellers_90d)
    frac_insiders_selling_90d = unique_sellers_90d / (unique_buyers_90d + unique_sellers_90d)
    
    # Compute the factor as the net fraction of insiders buying;
    # ranking and z-scoring happen in make_pipeline.
    alpha_factor = frac_insiders_buying_90d - frac_insiders_selling_90d
    
    screen = qtu & ~alpha_factor.isnull() & alpha_factor.isfinite()
    
    return alpha_factor, screen

def make_pipeline(context):  
    alpha_factor, screen = create_factor()
    
    # Winsorize to remove extreme outliers
    alpha_winsorized = alpha_factor.winsorize(min_percentile=0.02,
                                              max_percentile=0.98,
                                              mask=screen)
    
    # Rank, then z-score, to get long and short (positive and negative) alphas to use as weights
    alpha_rank = alpha_winsorized.rank().zscore()
    
    return Pipeline(columns={'alpha_factor': alpha_rank}, 
                    screen=screen, domain=US_EQUITIES)
    

def rebalance(context, data): 
    # Get the alpha factor data from the pipeline output
    output = pipeline_output('pipeline')
    alpha_factor = output.alpha_factor
    log.info(alpha_factor)
    # Weight securities by their alpha factor
    # Divide by the sum of absolute values to target a leverage of 1
    weights = alpha_factor / alpha_factor.abs().sum() 
    
    # Must use TargetWeights as an objective
    order_optimal_portfolio(
        objective=opt.TargetWeights(weights),
        constraints=[],
    )

    
def record_positions(context, data):
    pos = pd.Series()
    for position in context.portfolio.positions.values():
        pos.loc[position.sid] = position.amount
        
    pos /= pos.abs().sum()
    
    # Show quantiles of the daily holdings distribution
    # to show if weights are being squashed to equal weight
    # or whether they have a nice range of sensitivity.
    quantiles = pos.quantile([.05, .25, .5, .75, .95]) * 100
    record(q05=quantiles[.05])
    record(q25=quantiles[.25])
    record(q50=quantiles[.5])
    record(q75=quantiles[.75])
    record(q95=quantiles[.95])
Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

219 responses

First submission to this contest
Model #1,

Click to load notebook preview

Model #2

Click to load notebook preview

@Thomas, just confirming: which tearsheet are we supposed to use? The old one that loads in the backtest, or your new alpha-decay one? Also, are we supposed to use default friction or no friction, or does it not matter because you're looking at EOD holdings?

Would rebalancing 1 hour before close be OK? The above template, I believe, uses default friction and rebalances an hour after the open. Or are you looking at intraday alpha (on day T) as well in this one?

Looking forward to spending plenty of time on this one!

Great questions Joakim.

You should use the new alpha decay tearsheet like in the financials, guidance and estimates challenge; i.e. run a backtest, load it into research, and generate the plot which you would post here. Do not use trading costs, although as you say it doesn't matter as the analysis uses EOD holdings (I updated the algorithm to reflect that).
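
For anyone new to that workflow, the loading step in Research might look like this (a minimal sketch; the backtest ID is a placeholder, and the alpha-decay plot itself comes from the notebook attached to the announcement):

# Paste the ID from your backtest's URL to load its results into Research.
bt = get_backtest('5e1f...')  # placeholder ID
# Sanity-check with the standard pyfolio tearsheet before running the
# alpha-decay notebook against the same backtest.
bt.create_full_tear_sheet()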

Yes, I think 1h is fine, 2h is probably even safer. No intraday alpha for this one, same as with the previous ones.

Hi,

Is it allowed to use pricing and sector data?

Best regards,
Oleg

Model #3
High turnover: 85%

Click to load notebook preview

Model #4
Sky-high turnover: 140%

Click to load notebook preview

Model #5

Click to load notebook preview

Are there style and sector constraints just like in the normal contest?

Are there style and sector constraints just like in the normal contest?

@Alexander Katyk -- It says in the "Selection Criteria":

Not driven mainly by common risk (but no reason to try and artificially reduce your exposures, ideally your idea is dissimilar enough from common factors that it will be naturally uncorrelated).

It also says:

  • Specific Sharpe Ratio (IR) over the first 5 days in the alpha decay analysis (higher is better).

In other words, there aren't directly any hard risk constraints. However, if risk exposure is leading to low Specific IR, then you won't place high in the contest.

Wow, a $1,000 prize! Will definitely participate in this one!

My very first entry :)

Click to load notebook preview

@Alex: Thanks for the submission. You might want to check out one of my tearsheet reviews: https://www.youtube.com/watch?v=r5FRV5XnY1M&feature=youtu.be In it, I discuss various structural properties we look for in an algorithm, like not squashing risk with the optimizer, which your algorithm appears to be doing. This often causes positions to get equal-weighted, which is revealed by the lower-right plot. Also check out the algo template, which is a good starting point. Let me know if you have more questions.

Are we allowed to use other factors with the Insider Transactions factor? Thanks!

@Youness: No, all factors need to include Insider Transaction data. However, you can certainly come up with ideas of how insider and maybe outside sentiment could interact and then include factors that exploit that interaction.

Hi Thomas, how do you verify that insider transaction data is included in a factor if submissions are anonymous? Also, just to clarify the last question: are we only allowed to use factors that include insider data? What if we sum together several factors and only one of them uses insider data?

@Jay: All factors need to include insider data, otherwise you could just have that be one factor among many and down-weight it significantly to where the contribution is so small that one wouldn't call it an insider factor anymore. The challenge is to develop one targeted factor based on insider data. That targeted factor can be made up from different individual factors which all are based on insider data too. Remember that we are combining these factors ourselves with other ones, like fundamentals or estimates ones, if you already do that combination it gives us less optionality.

One of the most elegant ways for us to test whether an algorithm actually uses insider data is to rerun it with the insider data randomly shuffled and see whether, and by how much, its holdings are affected.
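
As a rough illustration of that shuffle test (a hypothetical pandas sketch, not Quantopian's actual evaluation code):

import numpy as np
import pandas as pd

def shuffled_weights(factor, seed=0):
    # factor: Series indexed by (date, asset), like the pipeline output.
    rng = np.random.RandomState(seed)
    # Shuffle factor values across assets within each date.
    shuffled = factor.groupby(level=0).transform(
        lambda x: rng.permutation(x.values))
    # Recompute leverage-1 weights, as in the template's rebalance().
    return shuffled.groupby(level=0).transform(
        lambda x: x / x.abs().sum())

If the resulting weights barely differ from the originals, the factor isn't really driven by the insider data.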

Here's my first submission using two insider factors. I'm wary of overfitting so I'll work to iterate off of this and minimize that possibility.

Submission v1:

Click to load notebook preview

@Thomas, thank you for taking time to have a look at my submission. Will review the video and will make improvements.

My first attempt.

Click to load notebook preview

@Neeraj Salodia Looks like your factor improves with delay. Have you tried putting a 20-day SimpleMovingAverage filter on your final factor? That would also tame down your turnover spikes.
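
For reference, a hedged sketch of that suggestion on top of the template's pipeline (alpha_winsorized comes from the template above; ranked factors are window-safe, so they can feed another factor):

from quantopian.pipeline.factors import SimpleMovingAverage

# Smooth the ranked factor over 20 days to damp turnover spikes and
# shift the signal toward the delay at which it performs best.
alpha_rank = alpha_winsorized.rank()
alpha_smoothed = SimpleMovingAverage(
    inputs=[alpha_rank],
    window_length=20,
).zscore()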

@Thomas Wiecki - This dataset only gives us access to aggregate tallies of unique insider buyers/sellers and nothing else? No share amounts? No transaction amounts? No job titles? No percentage increase/decrease of an existing stake in the company? This seems inadequate for investigating the most basic conventional wisdom on insider buying/selling.

@Viridian Hawk: Thanks for giving feedback to others on how to improve factors!

Yes, a lot of very interesting information is missing that would be required to build more elaborate factors. We are aware of that limitation but are currently unable to provide more, so we'll have to do the best with what we've got.

@Viridian Hawk: Thanks a lot for the guidance! I will upload the results for version 2 with the SMA applied.

Version 2 with SMA15 applied. Decay looks much better, and the turnover spikes have been subdued.

Click to load notebook preview

Model 0.1

Click to load notebook preview

@Thomas,

I need some clarification to get started with this challenge.
In the Tearsheet Challenge announcement:
- I did not find anything about QTradableStocksUS.
- "Post your best factor starting on June 1, 2014, until Sept 1, 2017."
The sample backtester's date range is set up 9 months longer:
01-04-2014 - 12-19-2017.
And I see some participants have already used this range.
What is your recommendation for everyone?
The date range should be the same for all participants.

@Vladimir: Thanks for highlighting that, those were oversights which I fixed in the post above.

@Everyone: Going forward, please submit backtests in the range 01-04-2014 - 12-19-2017, like in the example algorithm. As we rerun your backtests for evaluation anyway, existing submissions will not have to be updated to be considered, but note that we will use the longer time period to rank your algorithm. Also, make sure your algorithms do not trade any stocks outside the QTU. Apologies for the oversight.

@Leo M: Good point: if a factor only shows specific returns but no (or negative) total returns, it should definitely be penalized for that. I will keep that in mind when I score the algorithms.

Version 3 with the required date range.

Click to load notebook preview

Model 1

Click to load notebook preview

Idea #3

Click to load notebook preview

Idea #2

Click to load notebook preview

Idea #4

Click to load notebook preview

Idea 2 Version 1.

Click to load notebook preview

To clarify: are we looking for factors derived from this data to predict the market? (If so, can we include other data and factors?) Or are we trying to predict the future values of the insider transactions data themselves?

Wish all of You with no Decay
A Great,
Prosperous,
Blissful,
Healthy,
Bright,
Delightful,
Mind Blowing,
Energetic,
Terrific
& Extremely HAPPY NEW YEAR 2020.

Avg_IRS = 1.98; Avg_turnover = 0.037; Avg_n_holdings = 1077; Backtest 6

Click to load notebook preview

Hi!

So there's no limit on a single position's size, right? Though a larger universe size is better, according to the selection criteria.

Happy new year, everyone! Really liking all the submissions so far. It's obvious that this is a rather difficult challenge compared to the ones before: the data set is not as rich, and there seems to be no obvious huge alpha in there. However, I never doubted that the Quantopian community would be up to the task, and you don't disappoint!

@Robin: Yes, predict (relative) market movements by building a long-short factor. The factor needs to be based on insider transaction data but you can use other data sets to combine. For example, maybe it is meaningful if the internal sentiment (insiders) and outside sentiment (e.g. price or analyst or twitter sentiment) disagree.
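
A hedged sketch of that interaction idea, reusing names from the template pipeline (qtu, frac_insiders_buying_90d, frac_insiders_selling_90d); the 60-day return as the "outside sentiment" proxy is purely illustrative:

from quantopian.pipeline.factors import Returns

# Inside sentiment: net fraction of insiders buying.
insider_sentiment = (
    frac_insiders_buying_90d - frac_insiders_selling_90d
).zscore(mask=qtu)

# Outside sentiment proxy: trailing 60-day price return.
outside_sentiment = Returns(window_length=60).zscore(mask=qtu)

# Long where insiders buy into market pessimism, short the reverse.
disagreement = insider_sentiment - outside_sentiment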

@Emiliano: There is no official limit but if it starts to become high (like 2-5%) I would be a bit worried and see what's going on. Maybe you have outliers in your factor and need to winsorize.

v1 - happy new year all!

Click to load notebook preview

Happy New Year everyone!
Here is my 2nd attempt - Model 0.2 - derived Insider transaction factors combined with Consensus data.
The first 5-day average specific Sharpes: 1.997
The first 10-day average specific Sharpes: 1.942
The 14-day average specific Sharpes: 1.997
Average turnover: 10.8%
Average holdings: 946

Click to load notebook preview

1st attempt.

Click to load notebook preview

Model 2

Click to load notebook preview

I enjoyed the financials challenge review just now. I was disappointed, though, by how much the criteria favored overfit strategies. A non-overfit strategy could never compete. It leaves one feeling like one must overfit to have a chance, even knowing that the result is a non-predictive signal. I think the solution would be to rank entrants entirely on OOS performance. The gold medal goes to the winner of the race, without factoring in who had the best coach or did the most training.

Does anybody have any tips for how to balance size exposure without resorting to brute-forcing it via the optimizer? I'm encountering huge size skew in the insider transactions factors I'm coming up with. I know the latest guidance is to let risk exposure run free, but I don't think that always makes sense. Sometimes it makes sense to compare a stock to its peers (companies with similar market caps, or within the same industry, etc.).

@Viridian Hawk,

I recommend standardizing, ranking, and/or winsorizing to help adjust for skewed weights or outliers.
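
Another option sometimes used for this (a hedged sketch, not from the thread; alpha_factor and screen as in the template) is to demean the factor within peer groups via a Pipeline classifier, which neutralizes sector or size skew without optimizer constraints:

from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.pipeline.data import Fundamentals

# Demean within sectors, or within market-cap deciles, so each peer
# group nets out to roughly zero weight.
sector = Sector()
size_decile = Fundamentals.market_cap.latest.quantiles(10)

sector_neutral_alpha = alpha_factor.demean(groupby=sector, mask=screen)
size_neutral_alpha = alpha_factor.demean(groupby=size_decile, mask=screen)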

Hi guys! This is my first algo submission (Alpha 1).
Do you think it's eligible for the contest?
I am new and still learning. Thanks

Click to load notebook preview

@Viridian: Yes, I agree the incentive is skewed unfavorably there. However, in this challenge we will place much higher emphasis on OOS performance, so that overfitting the backtest will not be rewarded.

Idea 5

Click to load notebook preview

Model 0.3
Average 5-day SR: 2.011
Average Holdings: 1028
Average turnover: 10.9%

Click to load notebook preview

Model 0.4
Average 5-day SR: 1.91
Average Holdings: 1112
Average turnover: 14%
(with higher total return and increased holdings)

Click to load notebook preview

First entry. I would like to try and get rid of some early alpha decay. Maybe put an SMA on the factor as suggested earlier by someone else.

36.6% Total Returns
32.4% Specific Returns
1.69 Sharpe
-6.2% Max Drawdown
0.06 Volatility
34% Turnover
50 Avg Positions
-0.01 Beta to SPY

This was a fun idea for a contest, thank you @Thomas for putting it all together.

Click to load notebook preview

My first ever submission has been stuck at "Pending Review" since yesterday. Did I make a mistake?

@Vivek,

Do you mean your submission to the Q main contest, which is different from the challenge in this thread? If so, it’s probably only because it takes a few trading days for the results to be published, and for new submissions to be reviewed as either qualified as meeting all the required criteria, or not qualified, in which case it will show which criterion/criteria failed.

@Joakim Arvidsson I posted a submission to this thread, and it is for the insider-transactions dataset. I can see it in this thread with a "Pending Review" tag, i.e. not yet public.

@Vivek: Your submission was so good that our spam filter thought "no way this is true" ;) I approved it now.

@Thomas Wiecki: Thanks, I have just started learning things, following the tutorials and lectures.
Hopefully, future submissions will be better.

Idea #3, iteration #2

Click to load notebook preview

Model 3

Click to load notebook preview

I am still a bit confused about the dates for the contest. I am using 01/04/2014-12/19/2017 as mentioned in an earlier post, but the first post still mentions a different date, and some recent entries might still be using that incorrect date.

My previous submission has an in-sample Sharpe ratio of 2.6, an 8-year hold-out Sharpe ratio of 0.8, and stability of 0.89. I'm actually quite happy with that. It's fairly consistently profitable out-of-sample year after year. Considering how abstract the dataset is, I wouldn't expect anything more.

Thomas mentioned potentially using a cone projection to weed out overfit strategies. My strategy likely would not survive that, despite having a decent OOS Sharpe, because the in-sample Sharpe is much too good. How much of a ding will otherwise fine strategies get for having too-good in-sample performance?

Sorry everybody, I thought I had updated the backtest period but it seems that never happened. The correct time period is Jan 4, 2014, until Dec 19, 2017. If you submitted over the old time period you do not need to resubmit, but your submission will be graded on the Jan 4, 2014, until Dec 19, 2017 period.

@Viridian Hawk: Great question. It sounds like in your case you would be incentivized to worsen your factor so that it has a better chance of surviving the cone. That certainly doesn't sound like the right thing either, so maybe taking both IS and OOS into account (with OOS weighted even heavier) is the most fair and incentivizes the right behavior.

I would tend to agree with @Thomas Wiecki that when you partition the dataset into, say, 60% IS and 40% OOS, a more prudent weighting scheme would favor OOS: say, 40% weight on IS specific returns and 60% weight on OOS. This seems to be the balanced approach toward guarding against overfitting that is used in many time-series prediction algos.
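
As a small illustration of that blending (hypothetical numbers):

specific_sharpe_is = 2.6    # hypothetical in-sample specific Sharpe
specific_sharpe_oos = 0.8   # hypothetical out-of-sample specific Sharpe

# Weight OOS more heavily, as suggested above.
score = 0.4 * specific_sharpe_is + 0.6 * specific_sharpe_oos  # 1.52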

Model 1.1 daily rebalance
average turnover 0.12
average holdings 1135.52

Click to load notebook preview

Model 1.2 weekly rebalance
(I ran all models with daily and weekly rebalancing, resulting in slightly different turnovers, holdings and alpha decays. Model x.1 is daily, model x.2 weekly)

average turnover 0.11
average holdings 1135.61

Click to load notebook preview

Model 2.1
average turnover 0.16
average holdings 1135.18

Click to load notebook preview

Model 2.2
average turnover 0.11
average holdings 1152.51

Click to load notebook preview

Model 3.1
average turnover 0.17
average holdings 1154.27

Click to load notebook preview

Model 3.2
average turnover 0.12
average holdings 1153.43

Click to load notebook preview

A little different approach

Model 4.1
average turnover 0.13
average holdings 1016.69

Click to load notebook preview

Model 4.2
average turnover 0.09
average holdings 1064.72

Click to load notebook preview

Model 5.1
average turnover 0.15
average holdings 1050.51

Click to load notebook preview

Model 5.2
average turnover 0.1
average holdings 1096.26

Click to load notebook preview

So, I think I've posted enough for now. Reading the discussion about evaluation, and since I don't know how my algorithms will perform in the holdout period, I'm a bit torn between not spamming the forum with tearsheets and posting more models that don't look so good but might do well in the holdout period...

@Tentor -- One thing Quantopian has recommended in the past is to keep a "holdout period" when working on a strategy: only once you're ready to stop tweaking the strategy do you test it over the holdout period to see if it holds up out of sample. You have to resist the urge to revise the algo to make it perform better during the holdout period, because at that point you would be introducing overfit bias.

So, for instance, if you developed your strategy by iterating it on 2014-2017 data, you can test the final version on 2006-2014 data to see if it holds up. If not, then it likely is just an overfit algo.

@Viridian, thanks, I already started doing that and will keep it up. I got the idea from your earlier post ;)

Model 1.0

Backward OOS (1/4/2011-12/31/2013) - Sharpe = 2.56

Click to load notebook preview

@Thomas Wiecki: Hello Thomas, I am still in the data discovery phase of alpha research, trying to understand the data we can get using the new FactSet functionality. So I downloaded some data on MSFT with:

insider_txns_form4and5_1d = Form4and5AggregatedTrades.slice(derivative_holdings=False, days_aggregated=1)
unique_sellers_form4and5_1d = insider_txns_form4and5_1d.num_unique_sellers.latest
base_universe = StaticAssets(symbols(['MSFT']))

and after running the pipeline I got something like this (let's concentrate on the end of the in-sample period):

2017-11-21 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-11-22 00:00:00+00:00 Equity(5061 [MSFT]) 1.0
2017-11-24 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-11-27 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-11-28 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-11-29 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-11-30 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-12-01 00:00:00+00:00 Equity(5061 [MSFT]) 1.0
2017-12-04 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
2017-12-05 00:00:00+00:00 Equity(5061 [MSFT]) 0.0
Name: unique_sellers_form4and5_1d, dtype: float64

I believe the list above indicates that there was one unique insider seller of MSFT shares on December 1st and also on November 22nd.

After that I went to the SEC homepage and downloaded the insider transactions data for MSFT, which looks like this (please note I filtered for common stock transactions, to be in line with the derivative_holdings=False setting in the FactSet function):

Transaction Date | A/D | Deemed Exec. Date | Reporting Owner | Form | Transaction Type | Ownership | Shares Transacted | Shares Owned | Line | Owner CIK | Security
2017-11-14 | D | NaN | Capossela Christopher C | 4 | S-Sale | D | 4000 | 185278 | 1 | 1601944 | Common Stock
2017-11-17 | D | NaN | SMITH BRADFORD L | 4 | G-Gift | D | 17500 | 870158 | 1 | 1193119 | Common Stock
2017-11-20 | D | NaN | Capossela Christopher C | 4 | S-Sale | D | 1000 | 184278 | 1 | 1601944 | Common Stock
2017-11-20 | D | NaN | Nadella Satya | 4 | G-Gift | D | 12000 | 1120377 | 1 | 1513142 | Common Stock
2017-11-21 | D | NaN | Capossela Christopher C | 4 | S-Sale | D | 3000 | 181278 | 2 | 1601944 | Common Stock
2017-11-27 | D | NaN | Hood Amy | 4 | G-Gift | D | 36000 | 466763 | 1 | 1576843 | Common Stock
2017-11-28 | A | NaN | WARRIOR PADMASREE | 4 | A-Award | D | 589 | 6905 | 1 | 1186249 | Common Stock
2017-11-28 | A | NaN | STANTON JOHN W | 4 | A-Award | D | 589 | 70763 | 1 | 904858 | Common Stock
2017-11-28 | A | NaN | SCHARF CHARLES W | 4 | A-Award | D | 589 | 36750 | 1 | 1195358 | Common Stock
2017-11-28 | D | NaN | PANKE HELMUT | 4 | F-InKind | D | 177 | 51706 | 2 | 1268397 | Common Stock
2017-11-28 | A | NaN | PANKE HELMUT | 4 | A-Award | D | 589 | 51883 | 1 | 1268397 | Common Stock
2017-11-28 | A | NaN | Johnston Hugh F | 4 | A-Award | D | 464 | 707 | 1 | 1377489 | Common Stock
2017-11-28 | A | NaN | Morfit G Mason | 4 | A-Award | D | 589 | 589 | 1 | 1325920 | Common Stock
2017-11-30 | D | NaN | BROD FRANK H | 4 | F-InKind | D | 412 | 115793 | 1 | 1183681 | Common Stock
2017-12-04 | D | NaN | BROD FRANK H | 4 | S-Sale | D | 18000 | 82380 | 1 | 1183681 | Common Stock

As you can see, there are significantly more insider sales in the actual list:
2017-11-02 2.0
2017-11-10 1.0
2017-11-14 1.0
2017-11-17 1.0
2017-11-20 2.0
2017-11-21 1.0
2017-11-27 1.0
2017-11-28 1.0
2017-11-30 1.0
2017-12-04 1.0

Did I use the FactSet function incorrectly?

Hey Katalin,
It's tricky to use days_aggregated=1 because that will only consider transactions that have been made and processed by FactSet in the past day. Typically, there is some delay period that is greater than 24 hours between when an insider makes their transaction and when the transaction gets processed by FactSet and added to their data. This delay causes transactions to effectively fall out of the 1-day lookback when days_aggregated is set to 1. For more information on this, see the trailing calendar days section of our Insider Transactions documentation. Using days_aggregated=1 leads to some erratic and generally unexpected behavior, so you must be careful when using it. If possible, see if your strategy can use a larger value (7, 30, or 90).
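
For illustration, the same query with a longer aggregation window, following the slicing pattern used in the template algorithm:

from quantopian.pipeline.data.factset.ownership import Form4and5AggregatedTrades

# A 30-day window absorbs FactSet's processing delay, so transactions
# don't fall out of the lookback before they appear in the data.
insider_txns_form4and5_30d = Form4and5AggregatedTrades.slice(
    derivative_holdings=False,
    days_aggregated=30,
)
unique_sellers_form4and5_30d = insider_txns_form4and5_30d.num_unique_sellers.latest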


Model v0.5

Click to load notebook preview

Revised Model v0.6

Click to load notebook preview

Model #6

Click to load notebook preview

Model #7

Click to load notebook preview

Model 3 v1. Fast-moving alpha that had a similar Sharpe decay profile on the preceding years' OOS.

Click to load notebook preview

Test Submission

Click to load notebook preview

v1.0

@Vladimir,
Thank you for the tip.

Click to load notebook preview

My first challenge submission: v0.1
It's still in development, but I wanted to submit a first try.
Good alpha for three days, then it decays.

Click to load notebook preview

Model 2
Backwards OOS (1/3/2011-12/31/2013) - Total Sharpe = 2.54
- Specific Returns = 10.14%
- Common Returns = (1.41)%
- Volatility = 1.1%

Click to load notebook preview

Everyone: Do not be dismayed if you can't find super strong alpha in this challenge; it is a much harder one than previous challenges. However, factors from different data sources, even with an SR in the range of 1, still provide a diversification benefit, so submit them anyway.

Model 3

Click to load notebook preview

Dear Thomas,

Are there any rules about long/short neutrality?

Dear Quantopians, @Thomas

What is the best practice when your alpha factor gets better with delay?

Should I use the alpha factor from, let's say, 5 days ago using a custom factor like the one below, or just use a SimpleMovingAverage factor with window_length=5?

class L_Value(CustomFactor):
    # `inputs` is supplied at instantiation, e.g. L_Value(inputs=[my_factor])
    window_length = 5

    def compute(self, today, assets, out, input_factor):
        # Use the oldest row in the window, i.e. the factor value
        # from window_length - 1 days ago.
        out[:] = input_factor[0]
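
For comparison, the moving-average variant mentioned above might look like this (a sketch; my_factor is a placeholder for the window-safe ranked factor):

from quantopian.pipeline.factors import SimpleMovingAverage

# Averages the last 5 days of the factor instead of taking the single
# value from 5 days ago; smoother, but it mixes in fresher data.
delayed_smooth = SimpleMovingAverage(inputs=[my_factor], window_length=5)

The custom factor above preserves the signal's shape and purely lags it; the SMA blends lags 0 through 4, which usually lowers turnover more.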

v3 new range Backtest 6

Click to load notebook preview

v2.0

Click to load notebook preview

Certificate of Low Correlations

Click to load notebook preview

Oh, that's cool, Alex. I hadn't thought of checking how my own strategies correlate to each other. Interesting: I wouldn't have expected them to still be highly correlated even though they use completely different factors.

Click to load notebook preview

Model v1.0 - Pure Insiders (no other dataset imported).

Click to load notebook preview

v2.3

Click to load notebook preview

Model v2.0 - Also Pure Insiders, but with more emphasis on generalization.

Will try some interactions with other data-sets next.

Click to load notebook preview

Hi, version 1.0

Click to load notebook preview

v2.4

Click to load notebook preview

entry #2

Click to load notebook preview

Model 4
Nonlinear signal combination

Click to load notebook preview

Model 5
Nonlinear signal combination

Click to load notebook preview

Model 6
Nonlinear signal combination

Click to load notebook preview

Idea 3 version 1: A trial with Insiders dataset combined with price data.

Click to load notebook preview

v5.5

Click to load notebook preview

Model #8

Click to load notebook preview

Model 3A - Non-additive interactions with 'Value' factors.

Click to load notebook preview

Model 3B - Non-additive interactions with 'Value' factors (alternative ranking).

Click to load notebook preview

Version #1 using EPS & Insiders.

Click to load notebook preview

Idea 4 V1: Insiders data with sentiment data.

Click to load notebook preview

Version 1.3 based on combined factors - longer rebalance cycle

Click to load notebook preview

Version 1.2 based on combined factors - medium rebalance cycle

Click to load notebook preview

Version 1.1 based on combined factors - short rebalance cycle

Click to load notebook preview

First time entering a contest.

Still some work to do but making progress.

What does it mean when the lines cross over in the total/specific returns areas?

Click to load notebook preview

Model 1

Click to load notebook preview

Version 1.4 added a new factor to version 1.2

Click to load notebook preview

Slightly modified version.
My model combines insider data with value factors .

Click to load notebook preview

Model 1.1

Click to load notebook preview

Model 1.2

Click to load notebook preview

Model 1.3

Click to load notebook preview

Model 4A - Non-additive interactions with 'Growth' factors.

Click to load notebook preview

Model 4B - Non-additive interactions with 'Growth' factors (alternative ranking).

Click to load notebook preview

MultiFactor (Insider++) + ML

Click to load notebook preview

Pure factor (just using insider data)

Click to load notebook preview

Can anyone help me understand whether my submission was correct? Other people post just the graphs, while I also posted the notebook code; is that correct? Is there a way to share just the graphs as you do?
THANKS

@Emiliano, you'll notice in the upper-right corner a tab called "Notebook". Click that and you'll see "Make A Copy". Run the code with your backtest ID, and once completed you can delete all the lines of code except for the graph at the end. I suggest renaming the clone to something new to stay organized. I keep the BT ID section and the graph at the end before posting in the forum. Hope this helps.

Thanks @Daniel! Anyway, it seems to me that not everybody deletes the content before the graphs...
For instance, Alan, who posted just before me, seems to have included the code, too.

@Emiliano Fraticelli -- when posting a Notebook you can choose which cell shows up as the "Preview". It's encouraged to choose the graphs as your preview cell so people don't have to open up the notebook to see them.

Model 5A - Non-additive interactions with the Estimates dataset.

Click to load notebook preview

Model 5B - Non-additive interactions with the Estimates dataset (alternative ranking scheme).

Click to load notebook preview

Hi, model 2.0.

Click to load notebook preview

My first algorithm, purely based on Form3AggregatedTrades and Form4and5AggregatedTrades.

Click to load notebook preview

Model 2.1

Click to load notebook preview

Model 2.2

Click to load notebook preview

Model 2.3

Click to load notebook preview

v5.2

Click to load notebook preview

model v1.2

Click to load notebook preview

Model 7 - nonlinear combination ML

Click to load notebook preview

A combination of factors

Click to load notebook preview

Model 3.1

Click to load notebook preview

Model 3.2

Click to load notebook preview

Model 3.3

Click to load notebook preview

1.0

Click to load notebook preview

1.1

Click to load notebook preview

1.2

Click to load notebook preview

Dear Thomas,

It has indeed turned out to be "Mission Impossible" for me! But there are many "Hunts" in this challenge.
Here is my model for the challenge. All the best, everyone.

G S Rao

Click to load notebook preview

Model 2 - Insiders+Estimates ---> ML ---> Factor

Click to load notebook preview

Model 6A - non-additive interactions with Quality factors.

Click to load notebook preview

Model 2.

Click to load notebook preview

2nd model

Click to load notebook preview

Model 3 - Simple Insiders+Estimates ---> ML ---> Factor

Click to load notebook preview

Ver 1 submission. Uses insider transactions only - no other data sets, fundamentals or pricing data.

Click to load notebook preview

Model 4 - Simple 4-factor-(Insiders+Estimates)~Healthcare---> ML ---> Factor

Click to load notebook preview

Model 7A.

Click to load notebook preview

Model 7B (alternative ranking).

Click to load notebook preview

ver 2

Click to load notebook preview

Model 7.1A.

Click to load notebook preview

Model 7.1B (alternative ranking).

Click to load notebook preview

MODEL A1

Click to load notebook preview

MODEL A2

Click to load notebook preview

MODEL A3

Click to load notebook preview

Model 2.1

Click to load notebook preview

Model 1

Click to load notebook preview

Howdy everyone, I think I am going for the participation medal here; it's a silly ensemble of somewhat thought-out factors.

I am probably fitting a bit of noise.

What I basically did was the following:

1) Found a few decent factors. Decent how, you ask? Well, they had a decent Alphalens reading and seemed significant on a rank-correlation basis (IC score).

2) Normalized them, winsorized them, and ensembled them using a random forest regressor (tried to use very shallow trees so as not to overfit; a rough sketch of this step appears below). Retrained the forest once per month (didn't actually test any other interval).

3) Factor-weighted the outputs, and that's it.

I had not developed any of the features before this challenge so it was completely green field. I think this kind of challenge is great. Creativity is easier to muster up when you have to work with strict constraints.
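
The promised sketch of step 2, on synthetic data (hypothetical, not the poster's actual code):

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)

# Hypothetical training set: rows = stock-days, columns = normalized,
# winsorized insider factors; target = forward returns.
X_train = rng.randn(500, 4)
y_train = X_train.dot([0.5, 0.3, -0.2, 0.1]) + 0.5 * rng.randn(500)

# Very shallow trees to limit overfitting; retrain roughly monthly.
model = RandomForestRegressor(n_estimators=50, max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Today's factor values -> one combined alpha per stock.
X_today = rng.randn(300, 4)
combined_alpha = model.predict(X_today)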

Click to load notebook preview

Model 4.1 - Reduced average daily turnover by rebalancing weekly.

Click to load notebook preview

Model 8A - Three pure Insiders factors (each statistically significant on held out data).

I'm worried that my previous models were overfit, so this model has increased focus on robustness, simplicity, and statistical significance.

Click to load notebook preview

Model 8B - Same factors; alternative ranking.

Click to load notebook preview

Model 9.

Non-additive interaction of two 'robust' insider factors with six 'robust' Value factors.

Click to load notebook preview

First Submission

Click to load notebook preview

Model 10

Non-additive interaction of two 'robust' insider factors with six 'robust' Growth factors

Click to load notebook preview

Model 11

Non-additive interaction of two 'robust' insider factors with four 'robust' Quality factors.

Click to load notebook preview

MODEL B

Click to load notebook preview

model 1 pure insiders

Click to load notebook preview

model 2 reducing some risk factors

Click to load notebook preview

1st/ Only 1 insider factor.

Click to load notebook preview

Model 1: No fundamental, No technical. Mainly Insider data with interactions with other datasets

Click to load notebook preview

Model 2: No fundamental, No technical. Mainly Insider data with interactions with other datasets. Reduction.

Click to load notebook preview

Go against the flow from bad info by using insider and estimates

Click to load notebook preview

model 3 reducing more risks

Click to load notebook preview

model 4 pure insiders one more variation

Click to load notebook preview

Model 4.1

Click to load notebook preview

Model 4.2

Click to load notebook preview

Model 4.3

Click to load notebook preview

model 5 one more slight modification

Click to load notebook preview

v6.17

Click to load notebook preview

Model 1. It only trades when all criteria are met, hence the periods without holdings. Not sure if this is in accordance with the rules.

Click to load notebook preview

2

Click to load notebook preview

randomfor_insider_1.0. This is an initial version, in case time does not permit an improvement.

Click to load notebook preview

Combined all four insider factors. The highest correlation between any two factors is 0.2.

Click to load notebook preview

Another go against the flow from bad info by using insider and estimates. Got rid of one factor.

Click to load notebook preview

V 1.0

Click to load notebook preview

Model 12 - Portfolio combination of four 'robust' models.

Click to load notebook preview

Model 2.0

Click to load notebook preview

Go with the flow insider and estimates 2.0

Click to load notebook preview

v6.18

Click to load notebook preview

v3 Backtest 28

Avg_IRT = 3.04; Avg_IRS = 2.90; Avg_n_holdings = 1690; Avg_turnover = 0.050;

Click to load notebook preview

experimental attempt

Click to load notebook preview

experimental attempt 2

Click to load notebook preview

Model 2.

Click to load notebook preview

Model 4

Click to load notebook preview

Model 5

Click to load notebook preview

Model 6

Click to load notebook preview

One man’s trash is another man’s treasure. Insiders give a clue. Final model #7.

Click to load notebook preview

This is a different system without the optimizer. The tearsheet seems negative, although the backtest returns seemed strongly positive.

Click to load notebook preview

Final V5

Click to load notebook preview

Thanks for the submissions everyone - you rock! We'll be analyzing every submission and then post the date for the webinar and winner announcement here.

Late submission - Model A
@Thomas, I think it might be too late to submit. It is up to you whether you check/grade this model. (I would post this model in the pension contest.) Sorry for the inconvenience.

Click to load notebook preview

Late submission - Model AA

Click to load notebook preview

Hey Everyone,

The Tearsheet Review & Winners Announcement webinar for the Insider Transactions Dataset Challenge will take place this Thursday, February 27 at 11 am EST. You can watch it at this link if you would like to participate in the live chat, or down below if you would just like to watch!


Congrats to the winners!

How you managed OOS consistency at such high Sharpe ratios baffles me. In my view this dataset is mostly noise/meaningless. What does the number of unique buyers/sellers during a time period tell you without knowing the position of whoever is doing the buying/selling (e.g. if the general counsel is buying, that's supposedly bullish), in what amounts, and whether they are regular or unexpected sales/purchases (e.g. unscheduled large buying is bullish, whereas scheduled regular selling supposedly doesn't tell you anything)? There's no way to differentiate between twenty insiders buying $10k worth each and one insider buying $10mm worth. On top of that, because a larger company will simply have more insiders, many of these "insider transactions" factors may indirectly just be size factors.

I'm curious if the winners would mind sharing whether you approached this with existing knowledge regarding insider transactions or whether you just data-mined until you found something.

Also if anybody would care to dispel my reservations, feel free.

@ Thomas & Quantopian Team,

Thank you. Many lessons learnt from this review and the previous financials challenge tearsheet review.
Why not create a live leaderboard for ongoing tearsheet challenges based on your internal assessment of out-of-sample performance, uniqueness score, and other metrics? It would help us align our efforts effectively.
Regards,
G S Rao

@Thomas & @Rene,
Thanks for the webinar...very instructive!

A suggestion: I can't learn anything from your scoring system without knowing what my scores are.
I understand your need for privacy, yet if you make clear upfront what you're agreeing to by entering the competition, I'm fine with that.
You could individually mail me my scores, but that is a lot of work.
Perhaps a more Kaggle-like scoring system?
alan

Is there a danger of people overfitting to the scoring metric if they have access to it upfront? Or would it just improve the quality of the entries (since OOS testing addresses the overfit issues)?

Not to the scoring metric itself, but for the actual OOS-related score there would be data leakage. That's another reason why I think scores based on live data are a lot more important. In order for held-out data to remain valuable, Q should keep OOS-based scores internal, in my opinion.

I like this sort of challenge, where the data is rather constrained. However, in this particular case I found it rather hard to gauge performance, as some of the scoring metrics are a bit hard to test for (as I think other posters have also pointed out).

Hi Everyone! If you missed the Insider Transactions Winner Announcement & Tearsheet Review Webinar while it was being held live - it's available to rewatch here:

Congrats once again to our winners!

  • Munkh-Od Jargalsaikhan
  • Joakim Arvidsson
  • Daniele Carabini
  • Magnus O
  • Jay Ross

Your prizes are being processed. You should receive an email from our payment service by this afternoon.
