The Social Media Trader Mood Series: Introduction

This is the first post in a multi-part series that will introduce you to the PsychSignal dataset. PsychSignal is a data analytics firm that provides real-time Trader Mood metrics for US equities. PsychSignal uses its natural language processing (NLP) engine to analyze millions of social media messages in order to provide two quantified sentiment scores for each security.

My motivation for creating this series came from Vinesh Jha, CEO of ExtractAlpha, during our crowdsourced earnings webinar. In that webinar, Vinesh broke down the main steps that he uses to find and extract alpha from data. It was surprising to hear because his process was about 90% research-oriented, with only the last 10% reserved for out-of-sample testing/backtesting.

For those familiar with my work, I tend to work the other way around: 90% backtesting and 10% research. So with help from Dr. Jess Stauth, VP of Quant Strategy at Quantopian, I’m creating this series in an attempt to learn the way that smart and successful quants operate. Here’s an outline of the series:

  1. Introduction - Examining the data. My goal here is to simply look at the dataset and understand what it looks like. I’ll be answering simple questions like, “How many stocks are covered?”; “Which sectors have the most coverage?”; and “What’s the distribution of sentiment scores?”. These are very basic but fundamentally important questions that lay the groundwork for all further development.
  2. Research Design - Here, I’ll be setting up my environment for hypothesis testing and defining my in-sample and out-of-sample datasets, both cross-sectionally and through liquidity thresholds.
  3. Hypothesis Testing - This is where I’ll be setting up a number of different hypotheses for my data and testing them through event studies and cross-sectional studies. The Factor Tearsheet and Event Study notebooks will be used heavily. The goal is to develop an alpha factor to use for strategy creation.
  4. Strategy Creation - After I’ve developed a hypothesis and seen that it holds up consistently over different liquidity and sector partitions in my in-sample dataset, I’ll finally begin the process of developing my trading strategy. I’ll be asking questions like “Is my factor strong enough by itself?” and “What is its correlation with other factors?”. Once these questions have been answered, the trading strategy will be constructed and I’ll move on to the next section.
  5. Out-Of-Sample Test - Here, my main goal is to verify the work of steps 1-4 with my out-of-sample dataset. It will involve repeating many of the steps in 2-4 as well as the use of the backtester (notice that only step 5 involves the backtester).
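
To make the step 1 questions concrete, here is a minimal sketch of how they could be answered on a toy table. It is not the notebook code: the records, field names (`symbol`, `sector`, `bullish`, `bearish`) and helper functions are all hypothetical, and it uses only the Python standard library rather than the research environment.

```python
from collections import Counter

# Hypothetical toy records. Field names and values are invented for
# illustration and are not taken from the actual PsychSignal dataset.
records = [
    {"symbol": "AAA", "sector": "Technology", "bullish": 1.8, "bearish": 0.4},
    {"symbol": "AAA", "sector": "Technology", "bullish": 0.6, "bearish": 1.2},
    {"symbol": "BBB", "sector": "Energy", "bullish": 2.4, "bearish": 0.0},
    {"symbol": "CCC", "sector": "Technology", "bullish": 0.2, "bearish": 2.9},
    {"symbol": "DDD", "sector": "Healthcare", "bullish": 1.1, "bearish": 1.0},
]


def coverage_count(rows):
    """How many distinct stocks are covered?"""
    return len({row["symbol"] for row in rows})


def sector_coverage(rows):
    """How many distinct stocks does each sector contain?"""
    # De-duplicate (sector, symbol) pairs so repeated days are not double counted.
    pairs = {(row["sector"], row["symbol"]) for row in rows}
    return Counter(sector for sector, _ in pairs)


def score_histogram(rows, field, bin_width=1.0):
    """Bucket a score field into bins of width bin_width, keyed by bin index."""
    return Counter(int(row[field] // bin_width) for row in rows)


print("stocks covered:", coverage_count(records))
print("coverage by sector:", sector_coverage(records).most_common())
print("bullish score bins:", sorted(score_histogram(records, "bullish").items()))
```

Counting distinct symbols per sector, rather than rows per sector, keeps a heavily discussed stock from inflating its sector's coverage.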

As this is my first time working through this flow, the steps above are subject to change as I learn and iterate through my mistakes. Feel free to post feedback and questions.

Strategies with Data

Our allocation process attaches high value to algorithms that use alternative datasets. While many datasets contain a premium section only accessible via subscription, strategies developed on the sample portions of these datasets will also be evaluated.

For questions on accessing this data, please email [email protected]


7 responses

Looking forward to following this!

Thanks Andrew, I've added an extra plot to part 1 of this series to do a quick sketch of bullish/bearish intensities versus stock price movement. You can find it attached to this reply as well as the main post.
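
For anyone who wants a number to go with a plot like that, one rough option is a Pearson correlation between net sentiment and the following day's return. The sketch below is an assumption-laden illustration, not the notebook's method: the `pearson` helper is hand-written, "net sentiment" is simply bullish minus bearish, and every value is made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical daily values for one stock, invented for illustration only.
bullish = [1.2, 0.4, 2.1, 0.9, 1.7]
bearish = [0.3, 1.5, 0.2, 1.1, 0.6]
next_day_return = [0.004, -0.010, 0.012, -0.002, 0.006]

# Net sentiment here is assumed to be bullish minus bearish intensity.
net_sentiment = [b - r for b, r in zip(bullish, bearish)]
print("correlation:", round(pearson(net_sentiment, next_day_return), 3))
```

A single correlation on one stock says very little by itself; it is only a starting point before the event studies and cross-sectional tests later in the series.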


Look forward to your presentation as well!

Nice, I like this way of presenting the data feeders more compared to a sample trading algorithm, which can be misleading.

Hi Seong,

I've always been interested in how trader sentiment (publicly available posts etc.) actually correlates with moves in particular stocks. Continuing on that: if the sentiment scores actually correlate with the price move, what exposure would algorithmic trading funds have to manipulation of the sentiment scores (assuming the sample size PsychSignal uses isn't large enough)?

Disclaimer: I haven't done due diligence on PsychSignal yet so this info could be quite easily found and I just haven't looked into it yet... (only just stumbled upon this)

Cheers,
JB

Thanks for the feedback. Part 2 of this series is up (https://www.quantopian.com/posts/the-psychsignal-trader-mood-series-research-design).

It tackles setting up the environment for hypothesis testing and defining my in-sample and out-of-sample datasets, both cross-sectionally and through liquidity thresholds.
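
As a rough idea of what that split can look like, here is a minimal sketch under stated assumptions. It is not the Part 2 notebook code: the dollar-volume figures, the 1,000,000 threshold, and both helper functions are hypothetical, and the alternating assignment is just one simple deterministic way to divide securities rather than dates.

```python
def liquid_universe(avg_dollar_volume, min_dollar_volume):
    """Keep symbols whose average dollar volume meets the threshold."""
    return sorted(
        symbol
        for symbol, volume in avg_dollar_volume.items()
        if volume >= min_dollar_volume
    )


def cross_sectional_split(symbols):
    """Split securities (not dates) into two disjoint groups.

    Symbols are sorted and assigned alternately, so the result is
    deterministic and the two groups are roughly equal in size.
    """
    ordered = sorted(symbols)
    return ordered[0::2], ordered[1::2]


# Hypothetical average daily dollar volumes, invented for illustration.
avg_dollar_volume = {
    "AAA": 50_000_000,
    "BBB": 200_000,
    "CCC": 8_000_000,
    "DDD": 12_000_000,
    "EEE": 3_000_000,
}

universe = liquid_universe(avg_dollar_volume, min_dollar_volume=1_000_000)
in_sample, out_of_sample = cross_sectional_split(universe)
print("in-sample:", in_sample)
print("out-of-sample:", out_of_sample)
```

Applying the liquidity filter before the split keeps thinly traded names out of both groups, so the out-of-sample securities stay comparable to the in-sample ones.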


Join us for our webinar on April 4th, 2016 at 2PM EST as we walk through this series with James Crane-Baker from PsychSignal. Register here: https://attendee.gotowebinar.com/register/8832468538788974596