Robinhood Account Growth Factor Analysis Research

I did a study a month or so ago on using data on the number of Robinhood accounts that hold a certain security to predict cross-sectional equity returns. Here are some of the results. Please feel free to comment, ask questions (no guarantee I can answer them), and/or point out mistakes I made.

Shoutout to SMB Capital as they posted a video showing the work one of their traders has done to use Robinhood Data along with other datapoints to develop trading strategies. This encouraged me to share some of the work I have done on the topic.

Summary of Key Points (more discussion in attached notebook)

• Robinhood is a fast-growing FINRA broker-dealer whose clients are relatively inexperienced, yet trade actively. It has a popular
• Data from Robintrack.net tracks the number of Robinhood accounts that hold a given stock.
• A 5-day, long-short account growth factor (i.e., the percent change in the number of accounts holding the stock) appears to show positive alpha over the next 1 to 5 trading days. Performance deteriorates in the out-of-sample period, but exceptional market volatility makes it difficult to tell whether this deterioration reflects a difficult market regime or overfitting and/or signal decay.
• There is evidence that the source of alpha could be a herding mentality: Robinhood users see which stocks are "trending" in the app and then pile on, leading to a momentum effect.
• The factor does show some exposure to the short-term mean reversion factor, suggesting Robinhood traders may have a tendency to buy beaten-down names.
• Results are noisy and should be viewed skeptically, especially given the relatively short sample period (2 years). The factor is not likely to be robust on its own. Combination with other complementary factors is likely necessary.
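
For readers who want to experiment, here is a minimal sketch of the factor described in the summary: the percent change in holder counts over 5 trading days, ranked cross-sectionally on each date. It is not the code from the attached notebook. The function name `account_growth_factor`, the synthetic `holders` table, and the tickers `S0`..`S9` are all hypothetical.

```python
import numpy as np
import pandas as pd

def account_growth_factor(holders: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Percent change in accounts holding each stock over `window` rows,
    converted to a cross-sectional percentile rank on each date."""
    growth = holders.pct_change(window)
    # Rank across stocks (columns) within each date (row); NaN stays NaN.
    return growth.rank(axis=1, pct=True)

# Synthetic holder counts: 30 business days x 10 hypothetical tickers.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=30)
tickers = [f"S{i}" for i in range(10)]
holders = pd.DataFrame(
    1000 * np.exp(np.cumsum(rng.normal(0, 0.02, (30, 10)), axis=0)),
    index=dates,
    columns=tickers,
)

factor = account_growth_factor(holders)
print(factor.tail(3).round(2))
```

The first `window` rows are NaN because there is no earlier observation to compare against. Ranking as a percentile keeps the factor comparable across dates even when the number of tracked stocks changes.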

Thanks for sharing this analysis, very cool.

I've been wondering how this data source might interact with upcoming earnings dates and expectations. Would a surge in popularity on RH suggest that popular expectations for an earnings surprise are also surging?

One question about your factor -- Robinhood is growing their account base quite quickly. Have you thought about how underlying growth in Robinhood accounts may or may not bias your factor? How much of the percent growth in account holders is due to underlying account growth?

fawce

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Analyzing interactions with earnings dates and expectations sounds like an interesting avenue of research. It might be fruitful to use this data with an earnings prediction model to determine when expectations and reality are misaligned.

As for whether the underlying growth in total Robinhood accounts may bias the factor, my assumption is that growth in total accounts should affect holdings across all stocks proportionally. If this is the case, then the ranked growth rate (which is what I use in the model) should still be informative, as stocks whose holder counts grow faster than total accounts should increase as a percentage of the whole and be ranked higher.

That said, with the rapid growth Robinhood is experiencing, new users may have different characteristics or behavior than those in the past, which could be a problem for the model.

It's a good point, and something that I'll have to think more about.
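
The ranking argument above can be checked with a toy example. This is a sketch on made-up numbers, assuming platform growth lifts every stock's holder count by the same factor `g`. The second half illustrates the caveat: if new users concentrate in particular names, ranks can change. All values and names here are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
# Hypothetical 5-day holder growth for six stocks, before platform-wide growth.
base_growth = pd.Series(rng.normal(0.0, 0.05, 6), index=list("ABCDEF"))

# Assumption: new accounts lift every stock's holder count by the same factor g.
g = 0.10
observed_growth = (1 + base_growth) * (1 + g) - 1

# A uniform lift is a monotone transform, so cross-sectional ranks are unchanged.
same_ranks = base_growth.rank().equals(observed_growth.rank())
print("ranks unchanged under uniform growth:", same_ranks)

# Caveat: if new users pile into one name, that name's rank can jump.
lowest = base_growth.idxmin()
tilted_growth = observed_growth.copy()
tilted_growth[lowest] += 1.0  # exaggerated, purely illustrative tilt
print("rank of", lowest, "before/after tilt:",
      observed_growth.rank()[lowest], tilted_growth.rank()[lowest])
```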

Very cool analysis, thanks for sharing. A few random thoughts on this:
- Your out-of-sample period is unlikely to be representative of your in-sample data. This is the period with the most growth, and also when many people impacted by COVID joined the platform (either due to necessity from being unemployed or out of boredom due to the shut-downs).
- Given that, your out-of-sample data also had the COVID shock and associated volatility, not sure how you get around this. I wonder if using some shares as out of sample during the entire period would be better.
- Also, in the data, did you see any indication of the 'pump-and-dump' phenomenon many of the RH traders might be susceptible to? The hypothesis here is that many of them might get their ideas from StockTwits or chatrooms, rush in at market tops due to FOMO (fear of missing out), and then get left holding the bag as others (including institutional investors) rush to sell.

Hi Erol,

Thanks for your thoughts. By posting, I was hoping it would start some discussion.

• Regarding your first point, yes, there is the question of whether the high growth in accounts due to COVID biases the recent sample period. I'm not sure it is a huge problem, as the factor ranks cross-sectionally to normalize for the overall trend in new accounts. Interestingly, it is possible that the high growth in Robinhood accounts could amplify the effect that was observed pre-COVID as "Robinhooders" become a bigger influence on the market. This may not show up in the out-of-sample data, which was negatively impacted by the extreme market volatility (which may suggest that the "high account growth" assets suffer from an unwinding effect during periods of excess volatility).

• Regarding point 2, this is a good suggestion. I probably should have held out some stocks or used something like Thomas Wiecki's odd/even quarter cross validation method.
• Regarding the pump-and-dump phenomenon, the only evidence I saw here was that the 5th quantile performance was lower than the 4th quantile, which could have just been due to noise in the data. However, I do suspect that there is likely to be some mean reversion on some timeframe. I hypothesized that this factor is a form of momentum due to the positive feedback loop generated by the Robinhood user interface (e.g. trending tickers widget). Typically in the literature, momentum is explained by under-reaction to fundamentals or over-reaction by bandwagon jumpers. Given that Robinhood traders can generally be classified as "uninformed" traders, it is likely that the phenomenon I am capturing is momentum by bandwagon jumpers. In other words, in the short term there is momentum due to traders following the herd of other Robinhood traders, and over the intermediate to longer term, price is likely to mean revert.
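
The two hold-out ideas from point 2 can be sketched as follows. This is one reading of an odd/even quarter split (calendar quarters 1 and 3 in-sample, 2 and 4 out-of-sample), not necessarily the exact method referred to above. The function names, the 30% hold-out fraction, and the tickers are hypothetical.

```python
import numpy as np
import pandas as pd

def odd_even_quarter_masks(index: pd.DatetimeIndex):
    """Boolean masks: dates in calendar quarters 1/3 vs. quarters 2/4."""
    is_odd = np.asarray(index.quarter % 2 == 1)
    return is_odd, ~is_odd

def split_tickers(tickers, holdout_frac=0.3, seed=0):
    """Randomly reserve a fraction of tickers as a hold-out universe."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(sorted(tickers)).tolist()
    n_holdout = int(round(len(shuffled) * holdout_frac))
    return shuffled[n_holdout:], shuffled[:n_holdout]

dates = pd.bdate_range("2018-05-02", "2020-06-30")
in_sample, out_of_sample = odd_even_quarter_masks(dates)
train_names, holdout_names = split_tickers([f"S{i}" for i in range(10)])

print(in_sample.sum(), "in-sample days,", out_of_sample.sum(), "out-of-sample days")
print("hold-out tickers:", sorted(holdout_names))
```

Interleaving quarters spreads both samples across the whole history, so a single regime such as the COVID shock does not land entirely in the out-of-sample set.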

I would also add that the pump-and-dump phenomenon tends to be more pronounced in low-float stocks. This study focused on a more liquid universe (not to say that overreaction doesn't happen there as well), so it would be an interesting study to measure this pump-and-dump effect in lower float stocks.

Hi Michael,

Thank you for sharing this detailed analysis. I'm not sure if I've understood this correctly, so in your backtest you would go long in the top 20% of stocks that showed the most growth in Robinhood Account numbers in the past 5 days, and short in the bottom 20%?

In my anecdotal experience, there appear to be two types of investor behaviour. One is that investors are "FOMOing" into a stock that has already spiked, in which case a correction is likely to occur soon after. On the other hand, sometimes the investors are trying to "buy the dip" in a stock that has just crashed, in which case a correction, if any, is likely to be more long-term. I think it may be worth distinguishing the two types of drivers behind RH account growth, as with our strategy it may be best to avoid the stocks where RH investors are buying the dip.
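
One simple way to separate the two drivers is to condition high account growth on the sign of the stock's trailing return. The sketch below works on a single date and is only an illustration. The function name `label_growth_driver`, the labels, the 20% cutoff, and all the numbers are hypothetical.

```python
import pandas as pd

def label_growth_driver(acct_growth: pd.Series, trailing_ret: pd.Series,
                        top_frac: float = 0.2) -> pd.Series:
    """Among stocks in the top `top_frac` by account growth, label those with a
    positive trailing return 'chasing' and the rest 'dip_buying'."""
    cutoff = acct_growth.quantile(1 - top_frac)
    is_top = acct_growth >= cutoff
    labels = pd.Series("other", index=acct_growth.index)
    labels[is_top & (trailing_ret > 0)] = "chasing"
    labels[is_top & (trailing_ret <= 0)] = "dip_buying"
    return labels

names = list("ABCDEFGHIJ")
# Made-up 5-day account growth and 5-day price return for ten stocks.
acct_growth = pd.Series([0.30, 0.25, 0.05, 0.04, 0.03, 0.02, 0.01, 0.0, -0.01, -0.02],
                        index=names)
trailing_ret = pd.Series([0.15, -0.20, 0.01, 0.0, -0.01, 0.02, 0.0, 0.01, -0.02, 0.03],
                         index=names)

labels = label_growth_driver(acct_growth, trailing_ret)
print(labels[labels != "other"])
```

The "dip_buying" names could then be dropped from the long leg, or the two groups could be analyzed as separate factors.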

Also, it is my experience that it is common for a stock to see huge spikes in RH account numbers, but the fall in these numbers tends to be more gradual even as the stock price falls. Has this also been your observation? Perhaps it would imply that a long/short strategy may not be the most appropriate strategy.

The dataset I downloaded from Robintrack.net only provided data on 4311 stocks, which is about half of the stocks that the website tracks and is even missing popular stocks like TSLA. Did you also have an issue with this and would you mind sharing your custom dataset so that I could also run this notebook and fiddle with it? That would be much appreciated.

Cheers
Mark

Hi Mark,

To your first question, yes, you are interpreting it correctly. The backtest is going long the stocks with the highest growth in Robinhood account numbers and short the lowest.
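
As a rough illustration of that construction, here is a sketch of dollar-neutral weights for a single rebalance date: long the top 20% of names by account growth and short the bottom 20%. Equal weighting within each leg is an assumption, not something stated in the thread. The function name `long_short_weights` and the data are hypothetical.

```python
import numpy as np
import pandas as pd

def long_short_weights(factor_row: pd.Series, frac: float = 0.2) -> pd.Series:
    """Equal-weight long the top `frac` of names by factor value and short the
    bottom `frac`; legs sum to +0.5 and -0.5 (dollar neutral, gross 1.0)."""
    ranked = factor_row.dropna().rank(method="first")
    n = len(ranked)
    k = int(n * frac)
    weights = pd.Series(0.0, index=factor_row.index)
    if k == 0:
        return weights
    weights.loc[ranked[ranked > n - k].index] = 0.5 / k
    weights.loc[ranked[ranked <= k].index] = -0.5 / k
    return weights

names = [f"S{i}" for i in range(10)]
# Made-up 5-day account growth; S0 is lowest and S9 is highest.
growth = pd.Series(np.linspace(-0.10, 0.35, 10), index=names)
weights = long_short_weights(growth)

# Made-up forward returns, used only to show how a period return is computed.
forward_returns = growth * 0.1
portfolio_return = float((weights * forward_returns).sum())
print(weights[weights != 0])
print("period return:", round(portfolio_return, 4))
```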

To your second point, it sounds like you are describing a scenario where mean reversion is likely to happen. I apologize if I am misunderstanding here, but yes, I do think a mean reversion study makes sense especially when looking at the tails of the distribution.

Regarding whether spikes up in accounts are more likely (or less gradual) than spikes down, I'm not sure to be honest. There might be something there or it could be an artifact of the time period we are in. I imagine if there was something to shock the market, there could come a tipping point where a bunch of Robinhood users all pile out at once causing a spike down, but who knows.

Regarding whether a long/short strategy is the most appropriate, nothing is perfect, it is just one way to slice the data. In the video I linked to in my original forum post, these traders are using it as one variable in more of an event-based framework (they also trade in a discretionary/hybrid fashion as opposed to pure quant). I happen to like their idea of combining this data with Google Trends data on actual searches (as people are more likely to search for a stock before they buy or sell it).

Lastly, regarding the dataset, I simply downloaded it from https://robintrack-data.ameo.design/robintrack-popularity-history.tar.gz. I just downloaded and unzipped the file to double check, and I got 8,534 files (1 for each security). I'd be more than happy to help troubleshoot, but I'm not sure reposting the file will do any good.
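
For anyone troubleshooting the per-security files, here is a small sketch that collapses one file's intraday snapshots to a single end-of-day value. It assumes each file is a CSV with `timestamp` and `users_holding` columns; that layout is an assumption, so check the header of your own download. The sample text and the function name `end_of_day_holders` are made up.

```python
import io
import pandas as pd

def end_of_day_holders(csv_text: str) -> pd.Series:
    """Keep the last holder count observed on each calendar day.
    Column names `timestamp` and `users_holding` are assumed, not confirmed."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])
    series = df.set_index("timestamp")["users_holding"].sort_index()
    return series.groupby(series.index.normalize()).last()

# Made-up sample standing in for the contents of one per-security file.
sample = (
    "timestamp,users_holding\n"
    "2020-06-01 14:00:00,100\n"
    "2020-06-01 20:00:00,110\n"
    "2020-06-02 15:00:00,120\n"
)
eod = end_of_day_holders(sample)
print(eod)
```

With real files, the same function can be applied to each file's text and the results joined into one table with a column per ticker.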

Mike

Good news,

We've been working with Casey Primozic at Robintrack to develop a single historical file that can be directly uploaded via our Self-Serve Data functionality (a single value per asset per day). https://robintrack-data.ameo.design/robintrack-eod-combined.csv contains all the end-of-day values for all the assets Robintrack supports going back to 2018-05-02 and is updated on a daily basis. This should make it easier to analyze the Robintrack data and come up with your own ideas.

I just downloaded the file this morning and there were 7858 assets that were symbol mapped to assets in the Q universe. @Mark TSLA is definitely supported

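# Assumes `robintrack` is your uploaded Self-Serve dataset, already imported from your own user namespace (the import is not shown in the original post).
from quantopian.research.experimental import query_self_serve_data
query_self_serve_data(robintrack.columns, 'US', assets=symbols(['TSLA'])).tail(20)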


We are also working towards providing a daily updating live Robintrack dataset, email us at [email protected] to let us know if you are interested.

Very cool, Chris!

Hey Chris,

This sounds awesome, but I've heard news that Robinhood might no longer provide data on customer stock holdings: https://twitter.com/Kr00ney/status/1291873556683534342

Do we know if the project will continue if so?