Sentiment Filter for Equity Investment

This post aims to show that sentiment is a viable stock selection criterion and that even a simple, obvious strategy exhibits sentiment alpha.

Long-Only Positive Sentiment Filter Strategy
The strategy can be described as follows:
- At the start of each trading day:
- Select the stocks in the universe whose previous-day sentiment is > 7 and invest in them, weighted by that previous-day sentiment, holding each position for 1 trading day.

In short, we invest in stocks that had strong sentiment on the previous trading day, on the belief that the sentiment will drive further gains.
The weight assigned to each stock is (its sentiment / sum of the sentiment of all stocks that pass the filter), so that the stocks with the best sentiment receive the most weight.
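
The filter and weighting rule can be sketched in plain Python. This is a minimal illustration, not the code attached to this post: the function name sentiment_weights, the tickers, and the score values are hypothetical, and only the > 7 threshold and the sentiment / sum rule come from the description above.

```python
def sentiment_weights(prev_day_sentiment, threshold=7.0):
    """Weight each stock that passes the filter by its share of total sentiment."""
    # Keep only stocks whose previous-day sentiment is strictly above the threshold.
    selected = {s: v for s, v in prev_day_sentiment.items() if v > threshold}
    total = sum(selected.values())
    if not selected or total == 0:
        return {}  # nothing passes the filter: hold no positions
    return {s: v / total for s, v in selected.items()}


# Hypothetical previous-day scores, for illustration only.
scores = {"AAPL": 8.0, "MSFT": 7.5, "IBM": 6.0, "KO": 9.5}
weights = sentiment_weights(scores)
```

With these made-up scores, IBM is filtered out and the remaining three weights sum to 1, with KO receiving the largest share.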

Testing Methodology
For testing, 29 stocks that are either current components of the Dow Jones Industrial Average or were components of the index in the past were used. The list of stocks can be found in the code itself. The benchmark index can be changed on line 27. The backtest period can also be extended up to August 2017, as the data for that period is in the file.

Trading Too Many Stocks
It is important to ensure that the strategy does not trade too many stocks. With an extremely short holding period of 1 day and high portfolio turnover, trading too many stocks can lead to extremely high transaction costs.

A secondary selection criterion can be used if the first filter selects too many stocks. We can further reduce the number of positions opened by taking the n largest or n random stocks from the selected stocks, depending on how many we need. The posted code has 2 helper functions for this, n_largest and n_random, which can be applied to the scores variable before it is converted to weights.
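
For readers without the attached code, here is a rough sketch of what two such helpers could look like. The names n_largest and n_random match the post, but the bodies, the dict-based scores representation, and the seed parameter are assumptions made for illustration; the attached algorithm may implement them differently.

```python
import random


def n_largest(scores, n):
    """Keep the n stocks with the highest sentiment scores."""
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]
    return dict(top)


def n_random(scores, n, seed=None):
    """Keep n stocks drawn uniformly at random (all of them if there are fewer than n)."""
    rng = random.Random(seed)  # seed is optional; pass one for reproducible picks
    picked = rng.sample(sorted(scores), min(n, len(scores)))
    return {symbol: scores[symbol] for symbol in picked}


# Hypothetical scores that already passed the first filter.
scores = {"AAPL": 8.0, "MSFT": 7.5, "KO": 9.5}
top_two = n_largest(scores, 2)
random_two = n_random(scores, 2, seed=0)
```

Either result can then be converted to weights in the usual way, so the weights are renormalised over the smaller set.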

Other Possible Filters
Here are some suggestions for other possible filters:
- Sentiment – 5-day Sentiment Simple or Exponential Moving Average
- (Sentiment – 5-day Sentiment Low)/(5-day Sentiment High – 5-day Sentiment Low) (Variant of the Stochastic Oscillator in Technical Analysis)

Both of the above filters work on the concept of relative sentiment: to judge how good or bad the current sentiment is, we also need to consider the sentiment of the recent past.

For example, if sentiment has been extremely bad for several successive days and today's sentiment is not as bad, we view this as a good sign and count the current sentiment as good even though its absolute value is not high. The above filters let us compare the current sentiment with the sentiment of the recent past.
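
The two suggested filters can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 5-day window is assumed to include the current day, only the simple moving average variant is shown, and the stochastic variant takes the daily Sentiment High / Low series as inputs.

```python
def sma_deviation(history):
    """Latest sentiment minus the simple moving average over the window."""
    return history[-1] - sum(history) / len(history)


def stochastic_sentiment(current, highs, lows):
    """(current - window low) / (window high - window low), or None if the range is zero."""
    window_low, window_high = min(lows), max(highs)
    if window_high == window_low:
        return None  # flat window: the ratio is undefined
    return (current - window_low) / (window_high - window_low)


# Hypothetical 5-day window: four weak days, then a less weak one.
history = [2.0, 1.5, 1.0, 1.5, 4.0]
deviation = sma_deviation(history)
# Assumption for this example: daily scores stand in for the daily high / low fields.
relative = stochastic_sentiment(history[-1], highs=history, lows=history)
```

In this made-up window the latest score of 4.0 is far below the absolute threshold of 7, yet both relative measures rate it as strong compared with the recent past.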

More about the data
The data used in this particular instance is powered by FinSentS published by InfoTrie Financial Solutions.

The FinSentS News Sentiment database offers daily media sentiment indicators for 23,000+ global equities, calculated by applying sophisticated real-time machine-learning algorithms to the content of thousands of news websites and media sources from around the world.

Each stock has 5 indicators:
- Sentiment Score: a numeric measure of the bullishness / bearishness of news coverage of the stock.
- Sentiment High / Low: highest and lowest intra-day sentiment scores.
- News Volume: the absolute number of news articles covering the stock.
- News Buzz: a numeric measure of the change in coverage volume for the stock.

[Attached backtest. Backtest ID: 59795c7717e2774e02f33aea]
7 responses

Great work Daniel! We at Quantopian are always excited by members of the community using fetch_csv to import more experimental datasets while we work to add more to the supported store.

Here is some feedback on your algorithm; feel free to start a dialogue on this thread if anything is unclear:

  1. Try changing your universe to better fit the quantitative workflow. In quantitative trading strategies, traders generally stick to liquid, more easily tradable assets. At Quantopian, we have developed 2 dynamic universes called the Q500 and the Q1500 that constantly reevaluate all US equities for liquidity and tradability (through M&A data sources). Having this larger universe would also reduce the position concentration risk of your strategy, which can improve your drawdown and volatility statistics. You mention that trading too many stocks increases the position turnover rate; this is true. Perhaps it would be useful to set up your trading universe as the intersection of the Q500 and the dataset, which will result in a universe somewhere in the low hundreds. This can be implemented in a few lines of code using our Pipeline API. Pair this with the next tip and you will still trade only quality signal and still have a relatively low position turnover, in the realm of cross-sectional equity strategies that is. To trade the Q500, which is based upon the S&P 500, change the baseline index to SPY.
  2. Try using your signal to go both long and short. Your data has a top-line sentiment score, and you are actively choosing to go long only on the stocks that had high sentiment on the previous day. If your investment hypothesis is true, it stands to reason that stocks with low sentiment on the previous day will have lower returns on the next day, at least lower than the high-sentiment stocks. Why not go long-short on the top and bottom deciles of the cross-sectional universe that I suggested in the previous tip? This would be a targeted bet on the quality of your signal: in other words, by longing and shorting the top and bottom performers in your sentiment signal, the performance of your strategy would be based purely on the spread of the sentiment score. The previous statement is not entirely true, since the Q500 and Q1500 are not risk optimized for sector or common factor neutrality, but since you are using the Optimize API you can pass in any factors you want and ask the API to tune your weightings so that the performance of the strategy comes purely from your imported signal and not from simply riding a sector or risk factor "wave of returns". We at Quantopian believe that, when developing data-based algorithms, the user should tune out the noise using a dollar-neutral and risk-factor-neutral implementation.
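
As a rough illustration of the second tip, a dollar-neutral decile portfolio can be sketched in plain Python. This does not use the Pipeline or Optimize APIs and does nothing about sector or risk factor neutrality; the function name, the equal weighting within each leg, and the placeholder tickers are all assumptions made for the example.

```python
def long_short_decile_weights(scores, fraction=0.1):
    """Go long the top fraction and short the bottom fraction, equal weight per leg."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    if len(ranked) < 2 * k:
        return {}  # too few names to form two non-overlapping legs
    weights = {symbol: 0.5 / k for symbol in ranked[:k]}          # long leg: +50% in total
    weights.update({symbol: -0.5 / k for symbol in ranked[-k:]})  # short leg: -50% in total
    return weights


# Hypothetical universe of 20 placeholder names with scores 0.0 .. 19.0.
scores = {"S%02d" % i: float(i) for i in range(20)}
weights = long_short_decile_weights(scores)
```

The weights sum to zero (dollar neutral) and their absolute values sum to 1, so the return of this sketch depends only on the spread between the two legs.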

Overall, I definitely want to mention that you are headed down the right path of algorithm development, and that by implementing some of the changes above you will move closer to receiving an allocation from the fund. Below are some links to guide you in implementing the tips above:

Q500 and Q1500

Pipeline API Tutorial, look into combining filters here

Optimize API, look into constraining common risk factors and dollar neutrality here

Good luck!

The margin can be avoided by multiplying weights[key] by around 0.55. It is currently pretty high.

@Tanay
Yes, I found specifying the stocks in the trading universe rather clunky, since I had to list them out one by one. At the same time, as the external data was pre-downloaded specifically for the universe that I used, a dynamic universe might not be suitable, since stocks that are not in the external data could be added to the universe. The external data would have to be either dynamic or extremely large to account for all the stocks that could be added to the universe.

@Blue
Is there unintended margin? I used order_optimal_portfolio on a daily basis, and that could lead to 100% portfolio turnover (sell everything and buy positions equal to the portfolio value). That could make it look like there is 2x leverage. Another explanation is that the division is sometimes rounded up, which leads the sum of weights to exceed 1, though sometimes it is also lower, so the effect somewhat balances out. There is also the point that stocks have to be bought in whole lots.
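
To make the whole-lot point concrete, here is a small sketch of how rounding to whole shares can push the weights actually held away from the targets. It assumes round-to-nearest share counts and uses made-up prices; it does not reproduce how the platform sizes orders.

```python
def realized_weights(target_weights, prices, portfolio_value):
    """Convert target weights to whole-share positions and return the weights actually held."""
    realized = {}
    for symbol, weight in target_weights.items():
        # Assumption: share counts are rounded to the nearest whole share.
        shares = round(weight * portfolio_value / prices[symbol])
        realized[symbol] = shares * prices[symbol] / portfolio_value
    return realized


# Hypothetical two-stock portfolio worth 10,000.
targets = {"A": 0.5, "B": 0.5}
prices = {"A": 300.0, "B": 1300.0}
held = realized_weights(targets, prices, 10_000)
gross = sum(held.values())
```

Here 17 shares of A and 4 shares of B are bought, so the held weights are 0.51 and 0.52 and sum to 1.03. With round-to-nearest, each position is off by at most half a share's price divided by the portfolio value.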

Hey Daniel,
There is a way of joining two different filters in the Pipeline such that the intersection of the two sets is what your universe operates on. https://www.quantopian.com/help#api-data-fetcher_assets gives you the list of securities that are actively in your fetch_csv file. If you "&" that with the Q1500 in Pipeline, which dynamically chooses tradable assets, you will have selected the assets in your file's universe that are safe to trade with a quantitative method. This step of the Pipeline Tutorial details that process: https://www.quantopian.com/tutorials/pipeline#lesson6
Thanks,
Tanay Trivedi
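
The effect of combining the two filters with "&" is a set intersection, which can be illustrated with ordinary Python sets. This is not Pipeline code; the function name and the tickers are hypothetical and only show which names would survive.

```python
def tradable_universe(file_symbols, liquid_symbols):
    """Return the symbols present in both collections, sorted for stable output."""
    return sorted(set(file_symbols) & set(liquid_symbols))


# Hypothetical tickers, for illustration only.
in_data_file = ["AAPL", "IBM", "KO", "XYZ"]
liquid = ["AAPL", "KO", "MSFT"]
universe = tradable_universe(in_data_file, liquid)
```

Only names that appear both in the data file and in the liquid universe remain, so the strategy never trades a stock for which it has no sentiment data.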

Hi!
First post here :)
Just for fun (trying to get used to the website), I added some shorts and leverage recording. I played around with the "shorting limit" and "longing limit", and also added a "long_short_ratio", but it didn't seem to affect the results much.
Long story (not so long... it was done in some minutes I think) short: Got beta below 0.3, with still pretty good results on Alpha, Cumulative Rets, Sharpe...
Great dataset! (it seems...)
When does the original data start? (may be interesting to do some Out of Sample tests...)

I hope the attached backtest will appear correctly...

Edit: Oh! Just realized why the "long_short_ratio" wasn't working... haha, little bug (it was supposed to be (1-long_short_ratio) in the shorts...)

[Attached backtest. Backtest ID: 597bc6b3ccbf22532c4d7767]

One last result (fixed the "long_short_ratio"). Sorry for the double post (don't know if I can attach in an edit).
It has beta close to 0, and still beats the benchmark...

(Ok, I now return the algorithm to @Daniel . It was just fun playing with it for a while... thanks for sharing it!)

[Attached backtest. Backtest ID: 597bca0bae317c4f36b7bf77]

Is there unintended margin?

1.2 million over the initial 1 million, for a total cash use of 2.2 million.

If one were to do the following, they wouldn't see that, since most of it is intraday, and also because of scale.

def before_trading_start(context, data):
    # Records cash once per day before the open, so intraday dips are not captured.
    record(cash=context.portfolio.cash)

Ways to reveal margin
https://www.quantopian.com/posts/max-intraday-leverage would show 1.91 but a version for cash low can be made from it.
https://www.quantopian.com/posts/pvr does both of those already and more, it catches several max|min.
https://www.quantopian.com/posts/margin-charting (see any overnight margin). That will chart overnight margin for you (where costs are incurred). Then if you click legend items to turn off everything except Margin, you'll see when spikes happen.
In before_trading_start, log.info(context.portfolio.cash). Your worst overnight (not intraday) was -14,086.
This works and is simple for catching just overnight margin:

def before_trading_start(context, data):
    # Record cash only when it is negative, i.e. margin carried overnight.
    if context.portfolio.cash < 0:
        record(margin=context.portfolio.cash)

It isn't all bad news. Once margin is under control, the changes usually result in higher actual returns.