News & Blog Sentiment Pipeline Factors with Accern

For those interested in using news sentiment data feeds, this thread is dedicated to showing pipeline factor examples created with Accern's Alphaone data feed. These factors are meant for you to build on, iterate on, and use in other Pipeline-based algorithms you may have.

There will be a number of different factors on this thread, and new ones will be added periodically. To view a complete list of template pipeline factors for all data feeds, please visit this factor library.

So here's what you're going to get:

  • Research: You'll get the economic hypothesis behind the factor and the iterations I went through before arriving at the final factor. You'll also be able to clone and visualize this factor yourselves by adjusting the date ranges and data imports available (for those who don't have the premium datafeed). This process is something you can replicate for yourselves easily by using the Factor Tearsheet.
  • Pyfolio in-sample and out-of-sample factor performance: You'll get the in-sample (26 Aug 2012 - 14 Apr 2014) and out-of-sample (15 Apr 2014 - 11 Apr 2016) results of a template algorithm using this factor.
  • Template algorithm: You'll get the template algorithm used to evaluate this factor. This algorithm is not geared for trading as commissions and slippage are set to zero; however, it provides a good, liquid universe filter for you to build and add other pipeline factors. Commissions and slippage are set to zero in order to analyze the alpha signal of the factor independent of other variables. I recommend using these factors as building blocks for your overall strategy.

Data Description

  • article_sentiment - a score in [-1, 1] reflecting the sentiment of articles written about the company in the last day. The higher the score, the more positive the outlook.
  • impact_score - a score in [0, 100] giving the probability that the stock price will change by more than 1% (measured as (close - open) / open) on the next trading day
  • Coverage extends to, on average, 5,000 securities per year
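
To make the 1% move in the impact_score definition concrete, here is a minimal sketch. It assumes the return is measured open-to-close as (close - open) / open and that "change" means a move in either direction. The function names and prices are illustrative and not part of Accern's feed.

```python
def intraday_return(open_price, close_price):
    """Open-to-close return: (close - open) / open."""
    return (close_price - open_price) / open_price


def moved_more_than_one_percent(open_price, close_price, threshold=0.01):
    """True when the absolute open-to-close move exceeds the threshold."""
    return abs(intraday_return(open_price, close_price)) > threshold


# Hypothetical prices, for illustration only.
big_move = moved_more_than_one_percent(100.0, 101.5)    # a 1.5% move
small_move = moved_more_than_one_percent(100.0, 100.5)  # a 0.5% move
print(big_move, small_move)
```

An impact_score of 70 would then read as roughly a 70% chance that the first case, rather than the second, happens on the next trading day.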

All Factors

Each factor is linked to the full research and backtest. You can find them all listed here:

Factor 1: Average monthly article sentiment weighted by sentiment volatility

class WeightedSentimentByVolatility(CustomFactor):  
    # Economic Hypothesis: Sentiment volatility can be an indicator that  
    # public news is changing rapidly about a given security. So securities  
    # with a high level of sentiment volatility may indicate a change in  
    # momentum for that stock's price.  
    inputs = [alphaone.article_sentiment]  
    window_length = 30

    def compute(self, today, assets, out, sentiment):  
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)  
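
If you want to see what this compute step does outside the platform, here is a rough NumPy sketch on a toy window. The 4-day window and the sentiment values are made up for illustration; on the platform the window is 30 rows, one per day, with one column per asset.

```python
import numpy as np

# Hypothetical 4-day window of article_sentiment for 3 assets
# (rows = days, columns = assets); NaN marks days with no articles.
sentiment = np.array([
    [0.2, -0.5, np.nan],
    [0.4, -0.5, 0.1],
    [np.nan, -0.5, 0.3],
    [0.6, -0.5, 0.2],
])

volatility = np.nanstd(sentiment, axis=0)  # per-asset std, ignoring NaNs
average = np.nanmean(sentiment, axis=0)    # per-asset mean, ignoring NaNs
factor = volatility * average              # one value per asset

print(factor)
```

Note that the second asset, whose sentiment never changes, has zero volatility and so gets a factor value of zero regardless of how negative its average sentiment is.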

Factor 2: Daily article sentiment weighted by impact score

class DailySentimentByImpactScore(CustomFactor):  
    # Economic Hypothesis: Accern reports both an `impact score`  
    # and `article sentiment`. The `impact score` is used to measure  
    # the likelihood that a security's price changes by more than 1%  
    # in the following day. The `article sentiment` is a quantified daily  
    # measure of news & blog sentiment about a given security. This combined  
    # measure of `impact score` and `article sentiment` may hold information  
    # about price changes in the following day.  
    inputs = [alphaone.article_sentiment, alphaone.impact_score]  
    window_length = 1

    def compute(self, today, assets, out, sentiment, impact_score):  
        out[:] = sentiment * impact_score  
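
Here is the same calculation as a standalone NumPy sketch, assuming a one-row window (window_length = 1) and made-up values for three assets. It is only meant to show the range of the output and how missing data behaves.

```python
import numpy as np

# Hypothetical one-day window (window_length = 1) for 3 assets.
sentiment = np.array([[0.5, -0.2, np.nan]])    # scores in [-1, 1]
impact_score = np.array([[80.0, 50.0, 90.0]])  # scores in [0, 100]

# Take the single row of each input and multiply element-wise.
factor = sentiment[-1] * impact_score[-1]

print(factor)
```

Since sentiment lies in [-1, 1] and impact_score in [0, 100], the product lies in [-100, 100]. A NaN in either input gives a NaN factor value for that asset.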

Algorithms using Accern's News Sentiment

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

14 responses

This looks great! Really simplifies adding a news sentiment factor to an existing pipeline, thanks!

I keep having a NaN error when using this. How can I prevent that?

Can you clarify for what algorithm and dates you're experiencing the error?

I end up getting ValueError: cannot convert float NaN to integer starting in August 2015

I'm sorry to hear that. I want to understand what you're running into, but using the algorithms I've posted in this thread, I can't seem to reproduce the ValueError that you're seeing. Are there any changes that you've made?

Yes, it's quite a bit different, but it uses Accern in the pipeline. I added you as a collaborator so you can take a look.

Factor 1: Average monthly article sentiment weighted by sentiment volatility

Updated for Q2:

class WeightedSentimentByVolatility(CustomFactor):  
    # Economic Hypothesis: Sentiment volatility can be an indicator that  
    # public news is changing rapidly about a given security. So securities  
    # with a high level of sentiment volatility may indicate a change in  
    # momentum for that stock's price.  
    inputs = [alphaone.article_sentiment]  
    window_length = 30

    def compute(self, today, assets, out, sentiment):  
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)  

Factor 1: Average monthly article sentiment weighted by sentiment volatility

Updated for Q2: In-sample template backtest

class WeightedSentimentByVolatility(CustomFactor):  
    # Economic Hypothesis: Sentiment volatility can be an indicator that  
    # public news is changing rapidly about a given security. So securities  
    # with a high level of sentiment volatility may indicate a change in  
    # momentum for that stock's price.  
    inputs = [alphaone.article_sentiment]  
    window_length = 30

    def compute(self, today, assets, out, sentiment):  
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)  
import pandas as pd
import numpy as np

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector

# Sample Version available from 26 Aug 2012 - 08 Feb 2014
from quantopian.pipeline.data.accern import alphaone_free as alphaone

# Premium Version found at https://www.quantopian.com/data/accern/alphaone
# from quantopian.pipeline.data.accern import alphaone as alphaone

class WeightedSentimentByVolatility(CustomFactor):
    # Economic Hypothesis: Sentiment volatility can be an indicator that
    # public news is changing rapidly about a given security. So securities
    # with a high level of sentiment volatility may indicate a change in
    # momentum for that stock's price.
    inputs = [alphaone.article_sentiment]
    window_length = 30
    
    def compute(self, today, assets, out, sentiment):
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)
    
def make_pipeline():
    # Create our pipeline
    pipe = Pipeline()
    
    # Screen out penny stocks and low liquidity securities.
    dollar_volume = AverageDollarVolume(window_length=20)
    is_liquid = dollar_volume.rank(ascending=False) < 1000
    
    # Create the mask that we will use for our percentile methods.
    base_universe = (is_liquid)

    # Filter down to stocks in the top/bottom 10% by sentiment rank
    factor = WeightedSentimentByVolatility()
    longs = factor.percentile_between(90, 100, mask=base_universe)
    shorts = factor.percentile_between(0, 10, mask=base_universe)

    # Add Accern to the Pipeline
    pipe.add(longs, "longs")
    pipe.add(shorts, "shorts")

    # Set our pipeline screens
    pipe.set_screen((longs | shorts) & (factor != 0))
    return pipe

# Put any initialization logic here. The context object will be passed to
# the other methods in your algorithm.
def initialize(context):
    attach_pipeline(make_pipeline(), name='factors')
    
    # Create our scheduled functions
    schedule_function(rebalance, date_rules.month_start())
    schedule_function(record_positions, date_rules.every_day(),
                      time_rules.market_close())

    set_commission(commission.PerShare(cost=0, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

def before_trading_start(context, data):
    # Assign long and short baskets
    results = pipeline_output('factors')
    assets_in_universe = results.index
    context.longs = assets_in_universe[results.longs]
    context.shorts = assets_in_universe[results.shorts]
    
def record_positions(context, data):
    # Record our leverage, exposure, positions, and number of open
    # orders
    record(lever=context.account.leverage,
           exposure=context.account.net_leverage,
           num_pos=len(context.portfolio.positions))
    
def rebalance(context, data):
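    # Equal-weight each basket; fall back to 0 if a basket is empty to
    # avoid dividing by zero.
    short_weight = -1.0/len(context.shorts) if len(context.shorts) else 0.0
    long_weight = 1.0/len(context.longs) if len(context.longs) else 0.0
    assets_in_universe = context.longs.union(context.shorts)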

    # Order our shorts
    for security in context.shorts:
        if data.can_trade(security):
            order_target_percent(security, short_weight)
            
    # Order our longs
    for security in context.longs:
        if data.can_trade(security):
            order_target_percent(security, long_weight)
            
    # Close positions that are no longer in the long or short baskets
    for security in context.portfolio.positions:
        if data.can_trade(security):
            if security not in assets_in_universe:
                order_target_percent(security, 0)
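
For anyone who wants to check the basket construction away from the platform, here is a rough pandas sketch of the top/bottom 10% selection and the equal weighting used in rebalance. The asset names and factor values are made up, and pandas quantiles are only an approximation of percentile_between, which may treat ties and boundaries differently.

```python
import numpy as np
import pandas as pd

# Hypothetical factor values for 20 assets, for illustration only.
rng = np.random.RandomState(0)
factor = pd.Series(rng.uniform(-1, 1, size=20),
                   index=["asset_%02d" % i for i in range(20)])

# Top and bottom 10% by factor value.
longs = factor[factor >= factor.quantile(0.9)].index
shorts = factor[factor <= factor.quantile(0.1)].index

# Equal weights within each basket; guard against empty baskets.
long_weight = 1.0 / len(longs) if len(longs) else 0.0
short_weight = -1.0 / len(shorts) if len(shorts) else 0.0

weights = pd.Series(0.0, index=factor.index)
weights.loc[longs] = long_weight
weights.loc[shorts] = short_weight

print(weights[weights != 0])
```

With both baskets populated, the long weights sum to +1 and the short weights to -1, so the portfolio is dollar neutral with a gross leverage of 2.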

Factor 1: Average monthly article sentiment weighted by sentiment volatility

Updated for Q2: Out-of-sample template backtest

class WeightedSentimentByVolatility(CustomFactor):  
    # Economic Hypothesis: Sentiment volatility can be an indicator that  
    # public news is changing rapidly about a given security. So securities  
    # with a high level of sentiment volatility may indicate a change in  
    # momentum for that stock's price.  
    inputs = [alphaone.article_sentiment]  
    window_length = 30

    def compute(self, today, assets, out, sentiment):  
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)  
import pandas as pd
import numpy as np

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector

# Sample Version available from 26 Aug 2012 - 08 Feb 2014
# from quantopian.pipeline.data.accern import alphaone_free as alphaone

# Premium Version found at https://www.quantopian.com/data/accern/alphaone
from quantopian.pipeline.data.accern import alphaone as alphaone

class WeightedSentimentByVolatility(CustomFactor):
    # Economic Hypothesis: Sentiment volatility can be an indicator that
    # public news is changing rapidly about a given security. So securities
    # with a high level of sentiment volatility may indicate a change in
    # momentum for that stock's price.
    inputs = [alphaone.article_sentiment]
    window_length = 30
    
    def compute(self, today, assets, out, sentiment):
        out[:] = np.nanstd(sentiment, axis=0) * np.nanmean(sentiment, axis=0)
    
def make_pipeline():
    # Create our pipeline
    pipe = Pipeline()
    
    # Screen out penny stocks and low liquidity securities.
    dollar_volume = AverageDollarVolume(window_length=20)
    is_liquid = dollar_volume.rank(ascending=False) < 1000
    
    # Create the mask that we will use for our percentile methods.
    base_universe = (is_liquid)

    # Filter down to stocks in the top/bottom 10% by sentiment rank
    factor = WeightedSentimentByVolatility()
    longs = factor.percentile_between(90, 100, mask=base_universe)
    shorts = factor.percentile_between(0, 10, mask=base_universe)

    # Add Accern to the Pipeline
    pipe.add(longs, "longs")
    pipe.add(shorts, "shorts")

    # Set our pipeline screens
    pipe.set_screen((longs | shorts) & (factor != 0))
    return pipe

# Put any initialization logic here. The context object will be passed to
# the other methods in your algorithm.
def initialize(context):
    attach_pipeline(make_pipeline(), name='factors')
    
    # Create our scheduled functions
    schedule_function(rebalance, date_rules.month_start())
    schedule_function(record_positions, date_rules.every_day(),
                      time_rules.market_close())

    set_commission(commission.PerShare(cost=0, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

def before_trading_start(context, data):
    # Assign long and short baskets
    results = pipeline_output('factors')
    assets_in_universe = results.index
    context.longs = assets_in_universe[results.longs]
    context.shorts = assets_in_universe[results.shorts]
    
def record_positions(context, data):
    # Record our leverage, exposure, positions, and number of open
    # orders
    record(lever=context.account.leverage,
           exposure=context.account.net_leverage,
           num_pos=len(context.portfolio.positions))
    
def rebalance(context, data):
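    # Equal-weight each basket; fall back to 0 if a basket is empty to
    # avoid dividing by zero.
    short_weight = -1.0/len(context.shorts) if len(context.shorts) else 0.0
    long_weight = 1.0/len(context.longs) if len(context.longs) else 0.0
    assets_in_universe = context.longs.union(context.shorts)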

    # Order our shorts
    for security in context.shorts:
        if data.can_trade(security):
            order_target_percent(security, short_weight)
            
    # Order our longs
    for security in context.longs:
        if data.can_trade(security):
            order_target_percent(security, long_weight)
            
    # Close positions that are no longer in the long or short baskets
    for security in context.portfolio.positions:
        if data.can_trade(security):
            if security not in assets_in_universe:
                order_target_percent(security, 0)

Factor 2: Daily article sentiment weighted by impact score

Updated for Q2:

class DailySentimentByImpactScore(CustomFactor):  
    # Economic Hypothesis: Accern reports both an `impact score`  
    # and `article sentiment`. The `impact score` is used to measure  
    # the likelihood that a security's price changes by more than 1%  
    # in the following day. The `article sentiment` is a quantified daily  
    # measure of news & blog sentiment about a given security. This combined  
    # measure of `impact score` and `article sentiment` may hold information  
    # about price changes in the following day.  
    inputs = [alphaone.article_sentiment, alphaone.impact_score]  
    window_length = 1

    def compute(self, today, assets, out, sentiment, impact_score):  
        out[:] = sentiment * impact_score  

Factor 2: Daily article sentiment weighted by impact score

Updated for Q2: In-sample template backtest

class DailySentimentByImpactScore(CustomFactor):  
    # Economic Hypothesis: Accern reports both an `impact score`  
    # and `article sentiment`. The `impact score` is used to measure  
    # the likelihood that a security's price changes by more than 1%  
    # in the following day. The `article sentiment` is a quantified daily  
    # measure of news & blog sentiment about a given security. This combined  
    # measure of `impact score` and `article sentiment` may hold information  
    # about price changes in the following day.  
    inputs = [alphaone.article_sentiment, alphaone.impact_score]  
    window_length = 1

    def compute(self, today, assets, out, sentiment, impact_score):  
        out[:] = sentiment * impact_score  
import pandas as pd
import numpy as np

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector

# Sample Version available from 26 Aug 2012 - 08 Feb 2014
from quantopian.pipeline.data.accern import alphaone_free as alphaone

# Premium Version found at https://www.quantopian.com/data/accern/alphaone
# from quantopian.pipeline.data.accern import alphaone as alphaone

class DailySentimentByImpactScore(CustomFactor):
    inputs = [alphaone.article_sentiment, alphaone.impact_score]
    window_length = 1
    
    def compute(self, today, assets, out, sentiment, impact_score):
        out[:] = sentiment * impact_score
    
def make_pipeline():
    # Create our pipeline
    pipe = Pipeline()
    
    # Screen out penny stocks and low liquidity securities.
    dollar_volume = AverageDollarVolume(window_length=20)
    is_liquid = dollar_volume.rank(ascending=False) < 1000
    
    # Create the mask that we will use for our percentile methods.
    base_universe = (is_liquid)

    # Filter down to stocks in the top/bottom 10% by sentiment rank
    factor = DailySentimentByImpactScore()
    longs = factor.percentile_between(90, 100, mask=base_universe)
    shorts = factor.percentile_between(0, 10, mask=base_universe)

    # Add Accern to the Pipeline
    pipe.add(longs, "longs")
    pipe.add(shorts, "shorts")

    # Set our pipeline screens
    pipe.set_screen((longs | shorts) & (factor != 0))
    return pipe

# Put any initialization logic here. The context object will be passed to
# the other methods in your algorithm.
def initialize(context):
    attach_pipeline(make_pipeline(), name='factors')
    
    # Create our scheduled functions
    schedule_function(rebalance, date_rules.month_start())
    schedule_function(record_positions, date_rules.every_day(),
                      time_rules.market_close())

    set_commission(commission.PerShare(cost=0, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

def before_trading_start(context, data):
    # Assign long and short baskets
    results = pipeline_output('factors')
    assets_in_universe = results.index
    context.longs = assets_in_universe[results.longs]
    context.shorts = assets_in_universe[results.shorts]
    
def record_positions(context, data):
    # Record our leverage, exposure, positions, and number of open
    # orders
    record(lever=context.account.leverage,
           exposure=context.account.net_leverage,
           num_pos=len(context.portfolio.positions))
    
def rebalance(context, data):
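    # Equal-weight each basket; fall back to 0 if a basket is empty to
    # avoid dividing by zero.
    short_weight = -1.0/len(context.shorts) if len(context.shorts) else 0.0
    long_weight = 1.0/len(context.longs) if len(context.longs) else 0.0
    assets_in_universe = context.longs.union(context.shorts)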

    # Order our shorts
    for security in context.shorts:
        if data.can_trade(security):
            order_target_percent(security, short_weight)
            
    # Order our longs
    for security in context.longs:
        if data.can_trade(security):
            order_target_percent(security, long_weight)
            
    # Close positions that are no longer in the long or short baskets
    for security in context.portfolio.positions:
        if data.can_trade(security):
            if security not in assets_in_universe:
                order_target_percent(security, 0)

Factor 2: Daily article sentiment weighted by impact score

Updated for Q2: Out-of-sample template backtest

class DailySentimentByImpactScore(CustomFactor):  
    # Economic Hypothesis: Accern reports both an `impact score`  
    # and `article sentiment`. The `impact score` is used to measure  
    # the likelihood that a security's price changes by more than 1%  
    # in the following day. The `article sentiment` is a quantified daily  
    # measure of news & blog sentiment about a given security. This combined  
    # measure of `impact score` and `article sentiment` may hold information  
    # about price changes in the following day.  
    inputs = [alphaone.article_sentiment, alphaone.impact_score]  
    window_length = 1

    def compute(self, today, assets, out, sentiment, impact_score):  
        out[:] = sentiment * impact_score  
import pandas as pd
import numpy as np

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
from quantopian.pipeline.classifiers.morningstar import Sector

# Sample Version available from 26 Aug 2012 - 08 Feb 2014
# from quantopian.pipeline.data.accern import alphaone_free as alphaone

# Premium Version found at https://www.quantopian.com/data/accern/alphaone
from quantopian.pipeline.data.accern import alphaone as alphaone

class DailySentimentByImpactScore(CustomFactor):
    inputs = [alphaone.article_sentiment, alphaone.impact_score]
    window_length = 1
    
    def compute(self, today, assets, out, sentiment, impact_score):
        out[:] = sentiment * impact_score
    
def make_pipeline():
    # Create our pipeline
    pipe = Pipeline()
    
    # Screen out penny stocks and low liquidity securities.
    dollar_volume = AverageDollarVolume(window_length=20)
    is_liquid = dollar_volume.rank(ascending=False) < 1000
    
    # Create the mask that we will use for our percentile methods.
    base_universe = (is_liquid)

    # Filter down to stocks in the top/bottom 10% by sentiment rank
    factor = DailySentimentByImpactScore()
    longs = factor.percentile_between(90, 100, mask=base_universe)
    shorts = factor.percentile_between(0, 10, mask=base_universe)

    # Add Accern to the Pipeline
    pipe.add(longs, "longs")
    pipe.add(shorts, "shorts")

    # Set our pipeline screens
    pipe.set_screen((longs | shorts) & (factor != 0))
    return pipe

# Put any initialization logic here. The context object will be passed to
# the other methods in your algorithm.
def initialize(context):
    attach_pipeline(make_pipeline(), name='factors')
    
    # Create our scheduled functions
    schedule_function(rebalance, date_rules.month_start())
    schedule_function(record_positions, date_rules.every_day(),
                      time_rules.market_close())

    set_commission(commission.PerShare(cost=0, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

def before_trading_start(context, data):
    # Assign long and short baskets
    results = pipeline_output('factors')
    assets_in_universe = results.index
    context.longs = assets_in_universe[results.longs]
    context.shorts = assets_in_universe[results.shorts]
    
def record_positions(context, data):
    # Record our leverage, exposure, positions, and number of open
    # orders
    record(lever=context.account.leverage,
           exposure=context.account.net_leverage,
           num_pos=len(context.portfolio.positions))
    
def rebalance(context, data):
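    # Equal-weight each basket; fall back to 0 if a basket is empty to
    # avoid dividing by zero.
    short_weight = -1.0/len(context.shorts) if len(context.shorts) else 0.0
    long_weight = 1.0/len(context.longs) if len(context.longs) else 0.0
    assets_in_universe = context.longs.union(context.shorts)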

    # Order our shorts
    for security in context.shorts:
        if data.can_trade(security):
            order_target_percent(security, short_weight)
            
    # Order our longs
    for security in context.longs:
        if data.can_trade(security):
            order_target_percent(security, long_weight)
            
    # Close positions that are no longer in the long or short baskets
    for security in context.portfolio.positions:
        if data.can_trade(security):
            if security not in assets_in_universe:
                order_target_percent(security, 0)

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2792559

Abstract:

This paper uses a dataset of more than 900,000 news stories to test whether news can predict stock returns. We measure sentiment with a proprietary Thomson-Reuters neural network. We find that daily news predicts stock returns for only 1 to 2 days, confirming previous research. Weekly news, however, predicts stock returns for one quarter. Positive news stories increase stock returns quickly, but negative stories have a long delayed reaction. Much of the delayed response to news occurs around the subsequent earnings announcement.

The Accern dataset has an asof_date. How would one build a filter that only lets through rows whose asof_date falls within the past X days?
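
One way to think about it, as a plain NumPy sketch rather than tested platform code: compute the age of each asof_date relative to today and keep only the rows that are recent enough. The function name, the use of calendar days, and the sample dates are all assumptions for illustration. Inside a Pipeline, the same comparison would presumably live in a CustomFactor that takes the dataset's asof_date as an input.

```python
import numpy as np


def within_last_n_days(asof_dates, today, max_age_days):
    """Boolean mask: True where asof_date is at most max_age_days
    calendar days before today. Missing dates (NaT) and dates after
    today are rejected."""
    asof_dates = np.asarray(asof_dates, dtype="datetime64[D]")
    today = np.datetime64(today, "D")
    age = today - asof_dates                       # timedelta64[D] per row
    valid = ~np.isnat(asof_dates)                  # drop missing dates
    not_future = age >= np.timedelta64(0, "D")
    recent = age <= np.timedelta64(int(max_age_days), "D")
    return valid & not_future & recent


# Hypothetical asof_dates, for illustration only.
dates = ["2016-04-08", "2016-04-01", "2016-03-01", "NaT"]
mask = within_last_n_days(dates, "2016-04-11", 10)
print(mask)
```

Here the first two dates are 3 and 10 days old and pass, while the 41-day-old date and the missing date are filtered out.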