Long-only Trading Strategy with NLP derived Social Media Sentiment - Tear Sheet Attached

Hey all,

Trader mood indices attempt to measure the emotional pulse of the markets: How bullish are traders on $GOOG? How bearish are traders on $SPY? Are people equally bullish and bearish on $AAPL?

To see what trader mood indices might look like in a trading strategy, I collaborated with the folks over at PsychSignal and backtested a trading strategy based on their data.

The strategy is pretty simple. The PsychSignal dataset is derived from a natural language processing engine that detects bullish/bearish trader moods from raw messages. I follow the BULLISH indicators within the NASDAQ 100 to create a long-only strategy. The raw messages that the scores are based on come from StockTwits - you can check out previous work with them here.

PsychSignal has datasets that date back to 2009 with sources like StockTwits, Twitter, T3Live Chat and other private stock chat rooms.

Here are the strategy notes:

Data Fields Used:

  • Bullish Intensity (-5 to 5) – The PsychSignal bullish sentiment score
  • Bullish Minus Bearish Intensity (-5 to 5) – The PsychSignal bullish minus bearish sentiment score
  • Number of Messages Analyzed – The total number of messages analyzed

Strategy Notes:

  • Look for where total messages over the last 30 days are greater than 50
  • Look for where the average bullish - bearish signal is greater than 0
  • Score each security by its bullish_intensity level and the number of bull messages
  • Go long on the top 7 securities scored by our factor ranking
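The filter-and-rank steps above can be sketched in plain Python. This is only an illustration, not the code from the attached backtest: the field names (`total_messages`, `bull_minus_bear`, `bullish_intensity`, `bull_messages`) are hypothetical, and because the notes do not say how the two scoring inputs are combined, the sketch assumes the score is the average bullish intensity multiplied by the bull message count.

```python
from statistics import mean


def rank_long_candidates(rows, min_messages=50, top_n=7):
    """Pick the top_n symbols to go long.

    rows: iterable of dicts, one per (symbol, day) inside the trailing
    30-day window. All keys are hypothetical names for illustration.
    """
    # Group the daily rows by symbol.
    by_symbol = {}
    for row in rows:
        by_symbol.setdefault(row["symbol"], []).append(row)

    scores = {}
    for symbol, days in by_symbol.items():
        total_messages = sum(d["total_messages"] for d in days)
        avg_spread = mean(d["bull_minus_bear"] for d in days)

        # Filters from the strategy notes: more than 50 messages in the
        # window and a positive average bullish-minus-bearish signal.
        if total_messages <= min_messages or avg_spread <= 0:
            continue

        # ASSUMPTION: the notes do not give the scoring formula, so this
        # multiplies average bullish intensity by the bull message count.
        avg_intensity = mean(d["bullish_intensity"] for d in days)
        bull_messages = sum(d["bull_messages"] for d in days)
        scores[symbol] = avg_intensity * bull_messages

    # Highest score first, then keep the top_n symbols.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]
```

Calling `rank_long_candidates(rows)` once per rebalance would give the list of symbols to hold; swapping in a different scoring rule only requires changing the line marked as an assumption.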

While the backtest outperforms the market, there are still many more roads to explore, and any interested quant is free to travel down them. On that note, Dr. Checkley from PsychSignal comments:

“There are no short positions, which goes hand-in-hand with the beta problem. But clearly, it would be no great challenge to, say, ‘invert’ the analysis above and seek out strong bear signals for short trades. And there is unlimited scope for changing filters, the buy and sell rules, the universe of stocks, etc. As you get comfortable with using these sentiment metrics, we recommend blending the bull and bear measures with other indicators, such as long and short-term price trend or trade volume. You can also use a more sophisticated model, such as a Neural Network, to create your predictions and trading signals. With hundreds of heavily-tweeted stocks and assets to choose from, and your full arsenal of analytical tools and algorithm-refinements, you can improve on our results in short order.”

Request access to the sample data used in this backtest by sending an email to [email protected]

I'd love to hear all of your thoughts and questions on this strategy and/or dataset!


11 responses

Here are the results of the backtest

[Attached backtest, ID: 559155788aa037124ffa7fb9]

Hello Everyone, my name is Matthew Checkley and I'm a Research Scientist with PsychSignal. I'll keep a beady eye on comments from the excellent Community here and attempt to add helpful comments when I can. Good algo-ing!

Hi Quantopian,

If anyone would like access to any of our full historical data sets, just shoot a quick email to: [email protected]

In addition to the pure StockTwits sourced dataset used in this example, we have four other "flavors" which are:

  • Twitter without retweets
  • Twitter with retweets
  • StockTwits & Twitter without retweets, aggregated
  • StockTwits & Twitter with retweets, aggregated

We also have a nice collection of white-papers we're happy to share.

There are some errors in your script; I can't run it.

Tiosa,

Thanks for checking in. If you'd like to run the script, James ([email protected]) is making the full historical dataset available. Just shoot him an email and you can get it set up by replacing the data_link found in psychsignal_data with the link he sends you.

This might be interesting to some folks so I thought I would share.

A Quantopian user (Will Wright) just created this oscillator and sent it to me.

Chart of the PsychSignal Trader Mood Oscillator

The attached chart shows the PsychSignal Trader Mood Oscillator overlaid against the SPDR S&P ETF (SPY). The black rectangles highlight areas of prolonged bearish oscillator readings. Each rectangle starts at the moment the oscillator crosses from positive to negative readings. It's instructive to observe price action within these areas of interest. Notice the apparent advance warning the signal provides before the response in price.

Formula for PsychSignal Trader Mood Oscillator from Will:

(RD_BULLISH/RD_BEARISH) -1

where
RD_BULLISH = Rolling 10 period Standard Deviation of the differenced Bullish Message Count
RD_BEARISH = Rolling 10 period Standard Deviation of the differenced Bearish Message Count

Note: I did most of the data work in R and calculated the standard deviation with sample=TRUE, so technically it is a 9-period rolling standard deviation (it does not include the current day's change in the calculation).
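For anyone who wants to try the formula outside R, here is a rough standard-library Python sketch. It is not Will's code. It assumes two equal-length lists of daily bullish and bearish message counts, takes the window as the 10 most recent daily changes including the current day's, and uses the sample standard deviation (n - 1 denominator) via `statistics.stdev`. The exact window alignment in the original R work may differ, as the note above suggests.

```python
from statistics import stdev


def mood_oscillator(bull_counts, bear_counts, window=10):
    """Return (RD_BULLISH / RD_BEARISH) - 1 for each day.

    Days without enough history, or where the bearish rolling standard
    deviation is zero, are reported as None.
    """
    if len(bull_counts) != len(bear_counts):
        raise ValueError("bull_counts and bear_counts must have equal length")

    # Day-over-day change in message counts (the "differenced" series).
    # diff[j] is the change from day j to day j + 1.
    bull_diff = [b - a for a, b in zip(bull_counts, bull_counts[1:])]
    bear_diff = [b - a for a, b in zip(bear_counts, bear_counts[1:])]

    result = [None] * len(bull_counts)
    for j in range(window - 1, len(bull_diff)):
        # ASSUMPTION: the window holds the `window` most recent changes,
        # ending with the change into day j + 1.
        rd_bullish = stdev(bull_diff[j - window + 1 : j + 1])
        rd_bearish = stdev(bear_diff[j - window + 1 : j + 1])
        if rd_bearish == 0:
            continue  # ratio undefined; leave None
        result[j + 1] = rd_bullish / rd_bearish - 1
    return result
```

Positive values mean the bullish message count has been changing more erratically than the bearish count over the window; negative values mean the reverse.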

Happy 4th of July folks!

It's about time you got on here James :) ! Awesome work - This looks very impressive!

Someone should do a similar test but filter out any social media posts with poor grammar. I've seen some silly things on stocktwits.

Hi everyone,

We created a file with 5 years' worth of daily data containing the top 100 symbols in terms of conversation volume.

Feel free to test with this file linked below and if you'd like more data don't hesitate to ask.

PsychSignal Top 100

This is an update to the algorithm that Seong Lee from Quantopian posted earlier in this thread.

The main change is to use the new publicly available “quantopian-100” file that James Crane-Baker from PsychSignal posted above.

While I am sure James would welcome your comments, you can now test and investigate this signal file as you like.

Other changes are:
1) The universe is now set by the contents of the file. The file contains a few symbols that are not tradeable today (like DAX), but they could still be used for signals. (When running, you will see fetcher warn about dropped rows because of these non-tradeable symbols.)
2) Removed the requirement that a symbol be in the NASDAQ 100.
3) Use schedule_function to record leverage at market close, which allows handle_data to be a bare pass and speeds up execution a bit.
4) Corrected the number of stocks traded to match the program description (7).
5) Call log.info when a new leverage high is detected.
6) Call log.info with the symbols that are not available for trading on the first trading day found in the signal CSV file.
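As a rough illustration of change 5, the "new leverage high" check can be written as a small standalone helper. This is a sketch and not the code in the attached algorithm: the class name `LeverageTracker` and its arguments are hypothetical, and it uses Python's standard `logging` module in place of the platform's `log` object. Leverage is assumed here to be gross position value divided by total portfolio value.

```python
import logging

log = logging.getLogger("leverage")


class LeverageTracker:
    """Remember the highest leverage seen so far and log each new high."""

    def __init__(self):
        self.high = 0.0

    def update(self, gross_position_value, portfolio_value):
        """Record one end-of-day observation.

        Returns True when this observation sets a new leverage high.
        """
        if portfolio_value <= 0:
            raise ValueError("portfolio_value must be positive")

        # ASSUMPTION: leverage = gross exposure / total portfolio value.
        leverage = gross_position_value / portfolio_value
        if leverage > self.high:
            self.high = leverage
            log.info("New leverage high: %.2f", leverage)
            return True
        return False
```

Calling `update` once per day from a function scheduled at market close keeps the per-bar handler empty, which is the speed-up described in change 3.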

There are many more columns of interesting information in the 4+ years' worth of financial-market social media chatter in the CSV file provided by James than are being used here. What is interesting to me is that some time periods show real potential. I think Seong’s work is just scratching the surface of what could be done with this data.

[Attached backtest, ID: 55a63d820baa310c695ab3bc]

Looks promising @Richard, though a longer backtest (using the full range of data James provided) doesn't fare as well.

[Attached backtest, ID: 55a71dbafeb67a0c88458564]