Analyzing Alpha in 10-Ks and 10-Qs

This is the second post in a two-part series. For a walkthrough of the dataset construction, see the Data Cleaning notebook.


Companies generally do not make major changes to their 10-K and 10-Q filings. When they do, it is predictive of significant underperformance in the next quarter. We find alpha in shorting the companies with the largest text changes in their filings and buying the companies with the smallest text changes in their filings.


Publicly listed companies in the U.S. are required by law to file "10-K" and "10-Q" reports with the Securities and Exchange Commission (SEC). These reports provide both qualitative and quantitative descriptions of the company's performance, from revenue numbers to qualitative risk factors.

When companies file 10-Ks and 10-Qs, they are required to disclose certain pieces of information. For example, companies are required to report information about "significant pending lawsuits or other legal proceedings". As such, 10-Ks and 10-Qs often hold valuable insights into a company's performance.

These insights, however, can be difficult to access. The average 10-K was 42,000 words long in 2013; put in perspective, that's roughly one-fifth of the length of Moby-Dick. Beyond the sheer length, dense language and lots of boilerplate can further obfuscate true meaning for many investors.

The good news? We might not need to read companies' 10-Ks and 10-Qs from cover to cover in order to derive value from the information they contain. Specifically, Lauren Cohen, Christopher Malloy and Quoc Nguyen argue in their recent paper that we can simply analyze textual changes in 10-Ks and 10-Qs to predict companies' future stock returns. (For an overview of this paper from Lauren Cohen himself, see the Lazy Prices interview from QuantCon 2018.)
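To make "textual changes" concrete, here is a minimal sketch of two common text-similarity measures, cosine similarity on word counts and Jaccard similarity on word sets, applied to a pair of filings. The tokenizer and the function names are illustrative assumptions, not the exact procedure used in the paper or in the Data Cleaning notebook. A score near 1 means the new filing barely changed; a score near 0 means it was largely rewritten.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text has no tokens
    return dot / (norm_a * norm_b)


def jaccard_similarity(text_a, text_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    words_a = set(tokenize(text_a))
    words_b = set(tokenize(text_b))
    union = words_a | words_b
    if not union:
        return 0.0  # both texts have no tokens
    return len(words_a & words_b) / len(union)


# Hypothetical one-sentence "filings" from consecutive periods.
previous_filing = "We face no litigation"
current_filing = "We face significant litigation"
print(cosine_similarity(previous_filing, current_filing))
print(jaccard_similarity(previous_filing, current_filing))
```

In practice each filing would be compared against the same company's filing from the comparable earlier period, and real filings would need the cleaning steps described in the first post of this series.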

In this investigation, we attempt to replicate their results on Quantopian.

The notebook attached below presents an analysis of the "similarity score" alpha factor. We begin by retrieving our alpha factor data with Pipeline; after transforming the data into the appropriate format for Alphalens, we run a full tearsheet that details mean returns, turnover, normality, and more.
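As a rough illustration of what a quantile-based factor analysis measures, the sketch below ranks stocks by similarity score, splits them into quintiles, and computes the spread between the mean forward return of the top quintile (smallest text changes, the long side) and the bottom quintile (largest text changes, the short side). It is plain Python rather than the Pipeline or Alphalens APIs, and the tickers, scores, and returns are made-up values used only to show the mechanics.

```python
from statistics import mean


def assign_quintiles(scores):
    """Map each ticker to a quintile from 1 (lowest score) to 5 (highest score)."""
    if len(scores) < 5:
        raise ValueError("need at least 5 stocks to form quintiles")
    # Sort by score, breaking ties by ticker so the result is deterministic.
    ranked = sorted(scores, key=lambda ticker: (scores[ticker], ticker))
    n = len(ranked)
    return {ticker: (i * 5) // n + 1 for i, ticker in enumerate(ranked)}


def long_short_spread(scores, forward_returns):
    """Mean forward return of quintile 5 minus mean forward return of quintile 1."""
    quintiles = assign_quintiles(scores)
    top = [forward_returns[t] for t, q in quintiles.items() if q == 5]
    bottom = [forward_returns[t] for t, q in quintiles.items() if q == 1]
    return mean(top) - mean(bottom)


# Hypothetical similarity scores and next-quarter returns for ten tickers.
scores = {"A": 0.1, "B": 0.2, "C": 0.3, "D": 0.4, "E": 0.5,
          "F": 0.6, "G": 0.7, "H": 0.8, "I": 0.9, "J": 1.0}
forward_returns = {"A": -0.02, "B": -0.04, "C": 0.00, "D": 0.01, "E": 0.00,
                   "F": 0.01, "G": 0.02, "H": 0.00, "I": 0.03, "J": 0.01}

print(assign_quintiles(scores))
print(long_short_spread(scores, forward_returns))
```

A positive spread in this toy setup corresponds to the pattern described above: the least-changed filings outperform the most-changed ones. The tearsheet in the notebook reports this kind of statistic, along with many others, across dates and holding periods.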



5 responses

Thanks Lucy for the Alphalens study - results show some compelling spreads and fat tails!

What is the actual portfolio construction process? The paper says that a stock in the 1st or 5th quintile enters the portfolio the next month and stays in it for 3 months. So the "trading signal" (i.e., the calculation of the factor scores/stock ranking) is only computed monthly, but according to the paper it is reviewed quarterly at the end of the report release month, and finally executed with a one-month lag. Am I correct?
As an illustration, the first set of stocks to long/short is easy, since it's based on the initial score. Let's say the first construction gets executed in March (one month after the Jan/Feb release period for the prior year's Q4 reports). April and May would then just be a rebalancing period. The factor score would be calculated continuously in this period until the end of May, and the final (or 2nd) set of the portfolio would be executed in June. Then this process repeats.
Let me know if I am interpreting the paper's methodology correctly.

This anomaly has already vanished! Not worth the effort of collecting all that data. Lauren Cohen should have kept it, instead of publishing it.

Not realistic for real trading

@ George Tyrakis - I disagree. An academic paper that outlines a market anomaly is useless if the anomaly cannot be acted upon. A large part of what quant funds do is based on some version or other of published trading anomalies and risk premiums.