Accern News and Blog Backtest Results using Quantopian (Link to PDF Report attached)

[Quantopian Update] - This algorithm is now outdated. While we haven't replicated this algorithm, we've provided a few other examples using Accern's Alphaone data feed in strategies. Check out this Earnings Drift strategy using Accern for example.

Hello Quantopians,

We have recently backtested over 1.5 million news and blog articles (spanning 2.5 years) with the help of Quantopian community members. The backtest produced very positive results, and I would like to share them with you all.

The news and blog dataset was designed by Accern. Accern specializes in big data media analytics. We monitor over 20 million news and blog sources each day and provide 25+ fields of analytics designed specifically for quantitative trading. Accern currently serves some of the largest multi-billion AUM hedge funds worldwide.

The fields of analytics used in this backtest are: Article Sentiment, Impact Score on Entity, and Overall Source Rank.

Article Sentiment (-1 to 1): This metric calculates the sentiment score of an article toward the relevant company.

• A positive sentiment score means that the article was written in a positive tone towards a company.
• A negative sentiment score means that the article was written in a negative tone towards a company.
• This can be used as a directional trigger.

Overall Source Rank (0-10): This metric measures the timeliness and reposting behavior of a source; it can be used as both a trust factor and a virality factor.

• A high overall source rank means that source X is usually first to release articles, and other sources usually repost the same information after source X has posted it.
• A low overall source rank means that source X is usually late to release articles, and other sources rarely repost its information.
• This can be used as a trust filter.

Impact Score on Entity (1-100): This metric estimates whether the article will have a greater-than-1% impact on the stock on the same trading day.

• A high impact score means that the article has a high probability of affecting the stock price by more than 1%.
• A low impact score means that the article has a low probability of affecting the stock price by more than 1%.
• This can be used as a decision maker to execute an order.
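Taken together, the three fields divide the work: sentiment supplies the direction, source rank acts as the trust filter, and impact score gates the execution. Below is a minimal, illustrative plain-Python sketch of that decision rule (not code from the report); the thresholds mirror the bounds set in the attached algorithm (0.40 / -0.35 for sentiment, 8 for source rank, 90 for impact score).

def trade_signal(article_sentiment, overall_source_rank, impact_score,
                 long_thresh=0.40, short_thresh=-0.35, min_rank=8, min_impact=90):
    # Illustrative decision rule: return 'long', 'short', or None for one article.
    if overall_source_rank < min_rank:    # trust filter: credible sources only
        return None
    if impact_score < min_impact:         # execution gate: likely >1% same-day move
        return None
    if article_sentiment > long_thresh:   # positive tone -> go long
        return 'long'
    if article_sentiment < short_thresh:  # negative tone -> go short
        return 'short'
    return None

print(trade_signal(0.55, 9, 94))   # 'long'
print(trade_signal(-0.60, 9, 94))  # 'short'
print(trade_signal(0.55, 3, 94))   # None: source rank too low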

The backtest report explains it in more detail. Please review the report and share it with anyone you like. If you would like access to our 2.5 years of news and blog data, send me an email and I will provide you access. We want more of the community to conduct further tests on the data to exploit its value. We have just scratched the surface.

Accern Backtest Report

Request access to over 2.5 years of news and blog history (7.5 million articles) by sending an email to [email protected].

Best,
Kumesh Aroomoogan
Co-Founder and CEO, Accern

#Code Structure Provided by: Derek Tishler ([email protected])

import numpy as np
import pandas as pd

def initialize(context):
    #Use the file from Dropbox to assign sids and sentiment values to trade on.
    set_symbol_lookup_date('2015-04-01')
    # The universe is set daily by inputs from the csv fetch, but we will set a benchmark for comparison.
    context.stocks = symbols('FOXA',
'ATVI',
'ADBE',
'AKAM',
'ALXN',
'ALTR',
'AMZN',
'AMGN',
'ADI',
'AAPL',
'AMAT',
'ADSK',
'ADP',
'AVGO',
'BIDU',
'BBBY',
'BIIB',
'BRCM',
'CHRW',
'CA',
'CTRX',
'CELG',
'CERN',
'CHTR',
'CHKP',
'CSCO',
'CTXS',
'CTSH',
#'CMCSA',
'COST',
'DTV',
#'DISCA',
#'DISCK',
'DISH',
'DLTR',
'EBAY',
'EQIX',
'EXPE',
'EXPD',
'ESRX',
'FFIV',
'FB',
'FAST',
'FISV',
'GRMN',
'GILD',
#'GOOGL',
'GOOG',
'HSIC',
'ILMN',
'INTC',
'INTU',
'ISRG',
'KLAC',
'GMCR',
'KRFT',
#'LBTYA',
#'LINTA',
'LMCK',
'LMCA',
'LLTC',
'MAR',
'MAT',
'MXIM',
'MU',
'MSFT',
'MDLZ',
'MNST',
'MYL',
'NTAP',
'NFLX',
'NVDA',
'NXPI',
'ORLY',
'PCAR',
'PAYX',
'QCOM',
'REGN',
'ROST',
'SNDK',
'SBAC',
'STX',
'SIAL',
'SIRI',
'SPLS',
'SBUX',
'SRCL',
'SYMC',
'TSLA',
'TXN',
'PCLN',
'TSCO',
'TRIP',
'VRSK',
'VRTX',
'VIAB',
'VIP',
'VOD',
'WDC',
'WFM',
'WYNN',
'XLNX',
'YHOO',
)
    set_benchmark(symbol('QQQ'))
    
    # set a more realistic commission for IB, remove both this and slippage when live trading in IB
    set_commission(commission.PerShare(cost=0.014, min_trade_cost=1.4))
    
    # Default slippage values, but here to mess with for fun.
    set_slippage(slippage.VolumeShareSlippage(volume_limit=0.25, price_impact=0.1))
    
    #Only needed in testing/debugging to ensure orders are closed like in IB
    schedule_function(end_of_day, date_rules.every_day(), time_rules.market_close(minutes=1))
    
    fetch_csv("https://copy.com/exNcekPiwb7Z6FE9",
              date_column ='harvested_at',
              symbol_column = 'entities_ticker_1',
              date_format = '%m-%d-%Y %H:%M')
    
    #Article Sentiment
    context.upper_bound = 0.40
    context.lower_bound = -0.35
    #Impact Score
    context.upper_bound_a = 90
    context.lower_bound_a = 90
    #Source Rank
    context.upper_bound_c = 8
    context.lower_bound_c = 8
    
  

# Will be called on every trade event for the securities you specify. 
def handle_data(context, data):
    #Get EST Time
    context.exchange_time = pd.Timestamp(get_datetime()).tz_convert('US/Eastern')
    
    #Check that our portfolio does not contain any invalid/external positions/securities
    check_invalid_positions(context, data)
    
    for stock in data:
        
        if 'article_sentiment' in data[stock] and 'event_impact_score_entity_1' in data[stock]:
            record(Accern_Article_Sentiment = data[stock]['article_sentiment'], upperBound = context.upper_bound, lowerBound = context.lower_bound)
            record(Event_Impact_Entity = data[stock]['event_impact_score_entity_1'], upperBound = context.upper_bound_a, lowerBound = context.lower_bound_a)
           
            
            
            # We will not place orders if a stock is already in the process of handling an order (fill time)
            if check_if_no_conflicting_orders(stock):
                try:
                    # Go long (buy), or exit a short and then buy (minute mode, so this condition can stay valid all day)
                    if (data[stock]['article_sentiment'] > context.upper_bound) and (data[stock]['event_impact_score_entity_1'] > context.upper_bound_a):

                        # If we have no positions, then we are good to buy
                        if context.portfolio.positions[stock.sid].amount == 0:
                            buy_position(context, data, stock)
                        # We have some positions, if they are short, then exit that position so we can go long.
                        else:
                            if context.portfolio.positions[stock.sid].amount < 0:
                                exit_position(context, data, stock)

                    # Go short (sell), or exit a long and then short (minute mode, so this condition can stay valid all day)
                    elif (data[stock]['article_sentiment'] < context.lower_bound) and (data[stock]['event_impact_score_entity_1'] > context.lower_bound_a):
                        # If we have no positions, then we are good to short
                        if context.portfolio.positions[stock.sid].amount == 0:
                            short_position(context, data, stock)
                        # We have some positions, if they are long, then exit that position so we can go short.                        
                        else:
                            if context.portfolio.positions[stock.sid].amount > 0:
                                exit_position(context, data, stock)
                except:
                    # Some fields may be missing from this stock's data this bar; skip it.
                    pass
                
     
def buy_position(context, data, stock):

    # Place an order, and store the ID to fetch order info
    orderId    = order_target_percent(stock, 0.05)
    # How many shares did we just order? We used a target percent of available cash, not a share count.
    shareCount = get_order(orderId).amount

    # We need to calculate our own intra-cycle portfolio snapshot, as it's not updated until the next cycle.
    value_of_open_orders(context, data)
    availibleCash = context.portfolio.cash-context.cashCommitedToBuy-context.cashCommitedToSell

    log.info("+ BUY {0:,d} of {1:s} at ${2:,.2f} for ${3:,.2f} / ${4:,.2f} @ {5:d}:{6:d}"\
             .format(shareCount,
                     stock.symbol,data[stock]['price'],
                     data[stock]['price']*shareCount, 
                     availibleCash,
                     context.exchange_time.hour,
                     context.exchange_time.minute))

def short_position(context, data, stock):
    
    #orderId    = order_target_percent(stock, -1.0/len(data))
    orderId    = order_target_percent(stock, -0.05)
    # How many shares did we just order? We used a target percent of available cash, not a share count.
    shareCount = get_order(orderId).amount

    # We need to calculate our own intra-cycle portfolio snapshot, as it's not updated until the next cycle.
    value_of_open_orders(context, data)
    availibleCash = context.portfolio.cash-context.cashCommitedToBuy+context.cashCommitedToSell

    log.info("- SHORT {0:,d} of {1:s} at ${2:,.2f} for ${3:,.2f} / ${4:,.2f} @ {5:d}:{6:d}"\
             .format(shareCount,
                     stock.symbol,data[stock]['price'],
                     data[stock]['price']*shareCount, 
                     availibleCash,
                     context.exchange_time.hour,
                     context.exchange_time.minute))

def exit_position(context, data, stock):
    order_target(stock, 0.0)
    value_of_open_orders(context, data)
    availibleCash = context.portfolio.cash-context.cashCommitedToBuy-context.cashCommitedToSell
    log.info("- Exit {0:,d} of {1:s} at ${2:,.2f} for ${3:,.2f} / ${4:,.2f} @ {5:d}:{6:d}"\
                 .format(int(context.portfolio.positions[stock.sid].amount),
                         stock.symbol,
                         data[stock]['price'],
                         data[stock]['price']*context.portfolio.positions[stock.sid].amount,
                         availibleCash,
                         context.exchange_time.hour,
                         context.exchange_time.minute))    
    
################################################################################

def check_if_no_conflicting_orders(stock):
    # Check that we are not already trying to move this stock
    open_orders = get_open_orders()
    safeToMove  = True
    if open_orders:
        for security, orders in open_orders.iteritems():
            for oo in orders:
                if oo.sid == stock.sid:
                    if oo.amount != 0:
                        safeToMove = False
    return safeToMove
    #

def check_invalid_positions(context, securities):
    # Check that the portfolio does not contain any broken positions
    # or external securities
    for sid, position in context.portfolio.positions.iteritems():
        if sid not in securities and position.amount != 0:
            errmsg = \
                "Invalid position found: {sid} amount = {amt} on {date}"\
                .format(sid=position.sid,
                        amt=position.amount,
                        date=get_datetime())
            raise Exception(errmsg)
            
def end_of_day(context, data):
    # Cancel any orders at the end of the day. We do it ourselves so we can see slow-moving stocks.
    open_orders = get_open_orders()
    
    if open_orders:# or context.portfolio.positions_value > 0.:
        #log.info("")
        log.info("*** EOD: Stoping Orders & Printing Held ***")

    # Print what positions we are holding overnight
    for stock in data:
        if context.portfolio.positions[stock.sid].amount != 0:
            log.info("{0:s} has remaining {1:,d} Positions worth ${2:,.2f}"\
                     .format(stock.symbol,
                             context.portfolio.positions[stock.sid].amount,
                             context.portfolio.positions[stock.sid].cost_basis\
                             *context.portfolio.positions[stock.sid].amount))
    # Cancel any open orders ourselves (in live trading this would be done for us, soon in backtest too)
    if open_orders:  
        for security, orders in open_orders.iteritems():
            for oo in orders:
                log.info("X CANCELED {0:s} with {1:,d} / {2:,d} filled"\
                                     .format(security.symbol,
                                             oo.filled,
                                             oo.amount))
                cancel_order(oo)
    #
    log.info('') 
            
def value_of_open_orders(context, data):
    # Current cash committed to open orders; a bit of an estimate, for logging only
    context.currentCash = context.portfolio.cash
    open_orders = get_open_orders()
    context.cashCommitedToBuy  = 0.0
    context.cashCommitedToSell = 0.0
    if open_orders:
        for security, orders in open_orders.iteritems():
            for oo in orders:
                # Estimate the value of the existing order with the current price; best to use order conditions?
                if(oo.amount>0):
                    context.cashCommitedToBuy  += oo.amount * data[oo.sid]['price']
                elif(oo.amount<0):
                    context.cashCommitedToSell += oo.amount * data[oo.sid]['price']
    #
We have migrated this algorithm to work with a new version of the Quantopian API. The code is different than the original version, but the investment rationale of the algorithm has not changed. We've put everything you need to know here on one page.
30 responses

Impressive. Thanks for sharing this.

Cheers

Lionel

Hi Kumesh,

This is exciting. Edit - Your email is in your post, my mistake!

Thanks,
Seong

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Can we safely conclude from this data that positive news releases of a company (new product development, higher than expected earnings, buy status) tend to correlate with increasing share prices after the news release?

Since I am very inexperienced in coding (I have never coded in a particular language before), does your code immediately purchase the stock after the news release? Does it also go through several criteria such as credibility of the source (greater than 5) and article sentiment (positive) before it makes purchases of the stock?

The integration of sentiment is really a great direction and the work looks very promising. Doesn't Quantopian only fetch at the start of the trading day? Your methodology releases you from that constraint, but I'm curious whether you looked at the implications of only seeing articles from the previous day and, as a corollary, whether longer delays in integrating sentiment data have a negative effect on returns. It also looks as if articles outside of trading hours (nights/weekends) don't get processed.

@New Trader: Yes, you can use the metric "first_mention" to purchase stock right after the release of a unique story which hasn't yet been exposed to millions of viewers. You can also combine various metrics such as source/author ranks and impact score to make decisions on important and credible stories. These metrics decrease your risk of trading on a false story (rumor).
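For illustration, a pre-trade filter along these lines could look like the pandas sketch below. The rows are made up, and while entities_ticker_1 matches the column used by the algorithm's fetcher, the other column names are assumptions; check the actual export for the exact field names.

import pandas as pd

# Hypothetical rows; first_mention and the rank threshold follow the discussion above.
articles = pd.DataFrame({
    'entities_ticker_1':   ['AAPL', 'AAPL', 'MSFT'],
    'first_mention':       [True, False, True],
    'overall_source_rank': [9, 2, 7],
    'article_sentiment':   [0.62, 0.10, -0.48],
})

# Keep only unique, first-mention stories from credible sources (rank > 5)
# before generating any directional signals.
candidates = articles[articles['first_mention'] & (articles['overall_source_rank'] > 5)]
print(candidates)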

@Carlie: Thank you! And yes, you are correct. We only trade on articles that were released between 9:30 AM and 4:00 PM EST, but we would be happy to discuss alternative options as well. We want to try as many options and methods as possible on this data set to understand its true value.

For those of you who have requested our historical data to conduct your own backtest, I will get back to you very soon.

What is neat about this framework is its ability to investigate many areas around sentiment. Another interesting area is the value of predictive analytics, or the "time machine scenario". For instance, marketing departments attempt to coordinate releases, which creates both buzz and follow-up articles. Often these are centered around predictable dates such as trade shows, earnings releases, and industry community events. Knowing the dates of these events and the value that predictors historically generate could support a predictive-model approach. This framework provides a simple way to measure the value of predictors on markets and individual stocks by simply subtracting days from the date column and re-running; a minimal sketch follows below.
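One minimal way to run that "time machine" test is to shift every timestamp in the source file back by N days and re-run the backtest on the shifted copy. The sketch below reuses the harvested_at column and date format from the algorithm above; the file paths are placeholders.

import pandas as pd

def shift_dates(csv_path, days, out_path):
    # Shift all article timestamps back by `days` days to simulate a
    # perfect N-day-ahead predictor, then backtest against the output file.
    fmt = '%m-%d-%Y %H:%M'  # same format the algorithm's fetch_csv expects
    df = pd.read_csv(csv_path)
    shifted = pd.to_datetime(df['harvested_at'], format=fmt) - pd.Timedelta(days=days)
    df['harvested_at'] = shifted.dt.strftime(fmt)
    df.to_csv(out_path, index=False)

# e.g. shift_dates('accern_articles.csv', 2, 'accern_articles_minus_2d.csv')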

Hi Kumesh,
Impressive work. What kind of data did you use for your backtest? Daily? Intraday? HFT? Did it include commission/slippage at all?
Also, isn't there a risk that if many people follow your algo, its edge will dissipate at some point?
Thanks for sharing.

@Carlie: That's a great observation, and it would be very interesting to see once we start testing event-driven strategies. We actually did a quick test using an event-driven strategy and found many credible rumors days before the actual release of major announcements such as mergers & acquisitions, lawsuits, etc. We're currently testing intraday and will release a report on that; then we will move on to event-driven. So stay tuned! :)

@Carl: Thank you, I appreciate it! Our data is timestamped to the second, but we traded at minute resolution and applied a trend-trading strategy. We did not close our positions at the end of the day. We only went long/short and exited positions based on the signals in our data set. It does include commission and slippage as well.

As for your last question, this is just a very simple strategy we used. Nothing special about it. You can call it a starter strategy. Many quants can exploit its value by combining it with their own strategies, combining different metrics, setting different conditions, etc. There are hundreds of combinations of metrics and condition inputs you can use with our data set, so it's very unlikely that everyone will use the same type of strategy and dissipate the returns/alpha.

There is currently a large demand for our historical data so for those of you who have requested it - I will reach out to you by this weekend. I really appreciate your patience and enthusiasm to backtest our data! This is what we wanted from the community :)

Is there any way to figure out if the juice in the sentiment is gone? I mean, the sentiment could have had its effect and it might already be in the price.

Thanks Kumesh for sharing your results. Some time back, I shared similar backtests on Quantopian using our own news and blog sentiment data generated at InfoTrie with our engine FinSentS (portal.finsents.com, or APPS FINSENTS on Bloomberg).

There are hundreds of simple and original applications to sentiment data - as a main strategy or as an add-on to mainstream strategies. Glad to see your own implementations.

A few additional remarks:

  • Sentiment quality is important, and can be boosted by relevant implementations of algorithms, whether "statistical with light linguistics", "hardcoded linguistics", or relying on machine/deep learning;
  • Beyond the simple polarity of sentiment, news flow (volume) is also quite interesting and should be put in parallel with actual trade order flows / price changes. It is also a good predictor of volatility, which opens doors to many additional applications;
  • Ranking is an interesting metric, but beyond free data, people in the field also look at what can be done with private or premium datasets (like Bloomberg or Dow Jones);
  • Look at aggregates: sentiment data is also very valuable (especially on stocks) in aggregate (for instance at the index, industry, or sector level). It is also a way to overcome some of the noise which may lie in sentiment analysis done at the single-article level (an extremely complex topic...);
  • History, history, history :-) As with all backtesting, greater depth is key. Our 15+ years of history across various datasets has proven invaluable for refining trading strategies.

Thanks @Frederic for the reply. Glad to know about your app.
I completely agree with you that there are various sentiment-data-based implementations available. It is exciting to see that modern investors are taking these factors into account. In a nutshell, the fundamentals are still human-centric; how different information is perceived and acted upon drives the market.

  • Agreed with your remark on sentiment quality. A practical insight here is to migrate to a convolutional neural network approach as soon as possible. Deep learning is the key, although the models are difficult to train, especially when you are dealing with over 100 terabytes of data. We are trying to refrain from hardcoded linguistics, and will probably drop that altogether in our next version update.

  • News flow definitely plays an important role here, but it is really difficult to identify a baseline. For example, are 100 articles about AAPL comparable to 100 articles about BP? Maybe not. That's where we start building individual models for each security. At Accern, we go beyond that and build models for how different "events" (like M&A) spread as well. Look at our "saturation" and "volume" metrics.

  • We agree, and that's why we have increased our data coverage recently. Speed is key here, which is why we are also constantly looking for direct access to information (reports, analyses) from partner banks. As these reports are sent to firms like Bloomberg at the same time, we believe that we may have an upper hand there.

  • That's a very good point. We also aggregate sentiment by entities (e.g. stocks) as well as stories. As each article contains entity-related information, such as industry, index, sector, exchange, competitors, etc., a trader may aggregate the sentiment by any of these additional attributes on the fly (see the sketch after this reply). We also provide average-day sentiment, with which you can see how the sentiment for an entity evolved over time.

  • This is one point where I have a slightly different view. I think it makes sense to run extensive backtests for strategies that utilize market data and are solely dependent on it. But when we talk about data such as social media interaction, which has evolved significantly in the last 4-5 years, the rules of the game change. These days you will find young/old traders who trade solely on news information they find on social media and forums. They are affecting the market. So, you are automatically dependent on evolving algorithms that factor in relevant information, like the rate of page views, or virtual hyperlinks/relationships between sources. That, in turn, means that you can not (should not?) backtest on 10-year-old data. That's why we favor out-of-sample or forward testing, even though we also have access to older news corpora.

All that said, I would like to emphasize that there is higher risk, technology-wise and cost-wise, involved in developing systems that not only scale well, but also take into account hidden factors -- an example would be adjusting the weights of sources that are isolated on the web, or determining the credibility of the information posted (not only the source), or whether to trust the source (domain) or the author (person) of the article, etc. There is a lot more to explore, but we are glad to be moving in the right direction.
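To make the aggregation point concrete, here is a minimal pandas sketch: since each article row carries entity attributes, sentiment can be regrouped by any of them on the fly. The rows and column names are hypothetical, and a plain mean is simply the easiest aggregate for illustration, not necessarily the one Accern uses.

import pandas as pd

articles = pd.DataFrame({
    'ticker':            ['AAPL', 'MSFT', 'XOM', 'CVX'],
    'sector':            ['Technology', 'Technology', 'Energy', 'Energy'],
    'article_sentiment': [0.62, 0.31, -0.20, -0.45],
})

# Aggregate article-level sentiment up to the sector level.
sector_sentiment = articles.groupby('sector')['article_sentiment'].mean()
print(sector_sentiment)
# Energy       -0.325
# Technology    0.465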

@Kumesh

Convolutional neural networks, why not - many algorithms are out there! Linguistics is both very useful and precise, but indeed very difficult to scale. To simplify, the tradeoff is generally the following: if you want an approach that scales fast and "cheap", you go for statistics with light linguistics; the more you want to increase precision while keeping the ability to scale, the more you go for machine learning, up to deep learning (you will then really need more servers!). But for accuracy, with the right linguists on your team you can do great things!

We have a similar approach. We build different models for various "dimensions" in the data (M&A is one), even though multiplying models infinitely is a pitfall one should avoid, as it makes things difficult to understand from an external point of view (too much "black box").

Reports and analyses are easy to get from banks. I am talking about premium real-time news flows from news agencies, which are way more difficult and expensive...

Agreed.

I agree that casual (and less casual) investors rely more and more on multiple media. For the good - for instance, social media (esp. Twitter) are great for analyzing large macro events (earthquakes, elections, etc.), company products (iWatch, iPhone 5, etc.), and... US stocks. But - and that is often a bias my American friends forget to consider :-) - the "liquidity" of relevant signals may be very poor in many markets or languages. Our policy is therefore a case-by-case integration of social-media-derived signals for things we consider meaningful. For the rest, 15+ years of backtesting on a signal derived, for instance, from Dow Jones (which we offer) or all major news agencies (BBG, Reuters, AP, AFP, ...) is both very powerful and stable. You will never convince a large portfolio manager or a large hedge fund with only a two-year paper backtest claiming 200% performance (even if well executed). They will, for good or bad reasons, see no difference from simply flipping a coin.

Nice example from a useful data source. Quick question: does your data extend back before 2008/09? It would be nice to see how the algorithm performs in a bear market. Thanks for posting this.

If I am reading this correctly - when the sentiment analysis dictates that a long position should be taken and there is already one, the algorithm simply passes. Was there a test done to see the performance if the algorithm doubles the position upon a second favorable sentiment determination?

@Bharath: The article sentiment which we used was just one of the three metrics that acted as a decision factor for our trade execution. We mainly wanted to use article sentiment as a directional trigger only. This means that if an article is very positive about a company, we would go into a long position, and vice-versa. We could try to backtest article sentiment alone against the price movement, but we would be setting ourselves up for a lot of risk exposure. In order to minimize our risk, we needed to apply overall_source_rank, which lets us know if the information itself is trustworthy; furthermore, our impact score lets us know if the information will have some sort of impact on the price of the company. That being said, we are currently working on reinventing the wheel for sentiment analysis specifically for trading. We're quite far along, and we will be sure to give you an update and show some backtest results on its performance once it's released. Stay tuned :)

@Frederic: Thanks for reflecting on some of the very important issues. Of course a linguistic approach adds to the accuracy, and you are right that it is very difficult to scale. We are just trying to automate that process. As for the premium real-time sources, we do cover them as well. They are included in the 20M+ sources we monitor, but they are fairly small compared to other low-traffic, early information sources. And for the last point, a Bernoulli distribution? :) We are sure that we perform better than that. :)

@Udhay: Thanks for replying to this post. Sorry, but we don't have pre-2008/09 coverage in the data yet. We are in the process of covering those 2-3 years of additional history to get a performance benchmark for a bear market. Stay tuned :)

@Udhay - I can offer pre-2008 data. Simply contact me.

Thanks Kumesh. Would love to see something to help quantify the half-life of news/sentiment.

Thanks Kumesh. Great work. The Accern Backtest Report link does not seem to be accessible.

@Andrew: Yes, that is correct. Also, we did perform the test based on your second statement as well. The returns and alpha were higher, but the rest of the performance metrics declined a bit.
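For anyone who wants to try that variant, below is a hedged sketch of how the long branch of the algorithm above could double up instead of passing. It assumes the Quantopian environment (order_target_percent, context.portfolio); the 0.10 cap is an assumption for illustration, not the setting that was tested.

def buy_or_add(context, stock, base_weight=0.05, max_weight=0.10):
    # Instead of doing nothing when a second favorable signal arrives for a
    # stock we already hold long, raise the target weight up to max_weight.
    held = context.portfolio.positions[stock.sid].amount
    if held == 0:
        order_target_percent(stock, base_weight)  # open a new long
    elif held > 0:
        order_target_percent(stock, max_weight)   # double up on a repeat signal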

@Qiang: Thank you! The link should work. Can you retry clicking it? If not, here is the link again. https://dl.dropboxusercontent.com/u/70792051/Accern%20Backtest/Accern%20Backtest%20Report.pdf

I apologize I haven't gotten a chance to reply to some of your emails yet. I will get to it soon but we are currently working with Quantopian to figure out the best approach to provide you all access to the 2.5 years of history to conduct your own backtest on the platform. I will update you on the progress.

Best,
Kumesh

Just released an article about an interesting finding in our data set: https://www.linkedin.com/pulse/accern-detects-major-story-103-minutes-before-media-kumesh-aroomoogan

Hey all,

Just a quick update: we've had a number of people looking to use Accern's data directly in their algorithms, especially for the contest.

You can do that now through Quantopian Data and Pipeline.

Here's a simple long/short algorithm that James Christopher put together to get you guys started: https://www.quantopian.com/posts/accern-alphaone-long-short

Let me know if you have any questions,
Seong

How do I forward test any of your systems, either in live trading or Quantopian paper trading? What do I need?

Hi Nurudeen,

We recently released a direct integration to Accern's Alphaone dataset. You can learn more about the data here: quantopian.com/data/accern/alphaone

That data can be used for out-of-sample paper trading. It is different from the data sample provided here, so we published a simple sample algorithm for you to try here: https://www.quantopian.com/posts/accern-alphaone-long-short

You'll need to purchase a subscription to the data to get the most recent updates. It is a monthly subscription that you can cancel at any time.

Hope that helps,
Josh


Hi, Kumesh

It's a really impressive strategy! I found that the alpha is low while the beta is high. Does this mean that the profit from the strategy mainly comes from the market? So I have two quick questions:
1. What's the performance in a bear market?
2. What leverage level did you use for the strategy?

Is this algo rewritten using Pipeline and Quantopian 2? I followed the link above but couldn't find a strategy on the page that replicates this performance.
https://www.quantopian.com/posts/news-and-blog-sentiment-pipeline-factors-with-accern

Hi Kiran, this algorithm is now outdated. While we haven't replicated this algorithm, we've provided a few other examples using Accern's Alphaone data feed in strategies. Check out this Earnings Drift strategy using Accern for example.

import numpy as np

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.factors import CustomFactor, AverageDollarVolume
from quantopian.pipeline.data import morningstar as mstar
from quantopian.pipeline.filters.morningstar import IsPrimaryShare

from quantopian.pipeline.data.zacks import EarningsSurprises

# The sample and full version is found through the same namespace
# https://www.quantopian.com/data/eventvestor/earnings_calendar
# Sample date ranges: 01 Jan 2007 - 10 Feb 2014
from quantopian.pipeline.data.eventvestor import EarningsCalendar
from quantopian.pipeline.factors.eventvestor import (
    BusinessDaysUntilNextEarnings,
    BusinessDaysSincePreviousEarnings
)

# from quantopian.pipeline.data.accern import alphaone_free as alphaone
# Premium version available at
# https://www.quantopian.com/data/accern/alphaone
from quantopian.pipeline.data.accern import alphaone_free as alphaone

def make_pipeline(context):
    # Create our pipeline  
    pipe = Pipeline()  

    # Instantiating our factors  
    factor = EarningsSurprises.eps_pct_diff_surp.latest

    # Filter down to stocks in the top/bottom according to
    # the earnings surprise
    longs = (factor >= context.min_surprise) & (factor <= context.max_surprise)
    shorts = (factor <= -context.min_surprise) & (factor >= -context.max_surprise)

    # Set our pipeline screens  
    # Filter down stocks using sentiment  
    article_sentiment = alphaone.article_sentiment.latest
    top_universe = universe_filters() & longs & article_sentiment.notnan() \
        & (article_sentiment > .30)
    bottom_universe = universe_filters() & shorts & article_sentiment.notnan() \
        & (article_sentiment < -.50)

    # Add long/shorts to the pipeline  
    pipe.add(top_universe, "longs")
    pipe.add(bottom_universe, "shorts")
    pipe.add(BusinessDaysSincePreviousEarnings(), 'pe')
    pipe.set_screen(factor.notnan())
    return pipe  
        
def initialize(context):
    #: Set commissions and slippage to 0 to determine pure alpha
    set_commission(commission.PerShare(cost=0, min_trade_cost=0))
    set_slippage(slippage.FixedSlippage(spread=0))

    #: Declaring the days to hold, change this to what you want
    context.days_to_hold = 3
    #: Declares which stocks we currently held and how many days we've held them dict[stock:days_held]
    context.stocks_held = {}

    #: Declares the minimum magnitude of percent surprise
    context.min_surprise = .00
    context.max_surprise = .05

    #: OPTIONAL - Initialize our Hedge
    # See order_positions for hedging logic
    # context.spy = sid(8554)
    
    # Make our pipeline
    attach_pipeline(make_pipeline(context), 'earnings')

    
    # Log our positions 30 minutes before market close
    schedule_function(func=log_positions,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_close(minutes=30))
    # Order our positions
    schedule_function(func=order_positions,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_open())

def before_trading_start(context, data):
    # Screen for securities that only have an earnings release
    # 1 business day previous and separate out the earnings surprises into
    # positive and negative 
    results = pipeline_output('earnings')
    results = results[results['pe'] == 1]
    assets_in_universe = results.index
    context.positive_surprise = assets_in_universe[results.longs]
    context.negative_surprise = assets_in_universe[results.shorts]

def log_positions(context, data):
    #: Get all positions  
    if len(context.portfolio.positions) > 0:  
        all_positions = "Current positions for %s : " % (str(get_datetime()))  
        for pos in context.portfolio.positions:  
            if context.portfolio.positions[pos].amount != 0:  
                all_positions += "%s at %s shares, " % (pos.symbol, context.portfolio.positions[pos].amount)  
        log.info(all_positions)  
        
def order_positions(context, data):
    """
    Main ordering conditions to always order an equal percentage in each position
    so it does a rolling rebalance by looking at the stocks to order today and the stocks
    we currently hold in our portfolio.
    """
    port = context.portfolio.positions
    record(leverage=context.account.leverage)

    # Check our positions for loss or profit and exit if necessary
    check_positions_for_loss_or_profit(context, data)
    
    # Check if we've exited our positions and if we haven't, exit the remaining securities
    # that we have left
    for security in port:  
        if data.can_trade(security):  
            if context.stocks_held.get(security) is not None:  
                context.stocks_held[security] += 1  
                if context.stocks_held[security] >= context.days_to_hold:  
                    order_target_percent(security, 0)  
                    del context.stocks_held[security]  
            # If we've deleted it but it still hasn't been exited, try exiting again
            else:  
                log.info("Haven't yet exited %s, ordering again" % security.symbol)  
                order_target_percent(security, 0)  

    # Check our current positions
    current_positive_pos = [pos for pos in port if (port[pos].amount > 0 and pos in context.stocks_held)]
    current_negative_pos = [pos for pos in port if (port[pos].amount < 0 and pos in context.stocks_held)]
    negative_stocks = context.negative_surprise.tolist() + current_negative_pos
    positive_stocks = context.positive_surprise.tolist() + current_positive_pos
    
    # Rebalance our negative surprise securities (existing + new)
    for security in negative_stocks:
        can_trade = context.stocks_held.get(security) <= context.days_to_hold or \
                    context.stocks_held.get(security) is None
        if data.can_trade(security) and can_trade:
            order_target_percent(security, -1.0 / len(negative_stocks))
            if context.stocks_held.get(security) is None:
                context.stocks_held[security] = 0

    # Rebalance our positive surprise securities (existing + new)                
    for security in positive_stocks:
        can_trade = context.stocks_held.get(security) <= context.days_to_hold or \
                    context.stocks_held.get(security) is None
        if data.can_trade(security) and can_trade:
            order_target_percent(security, 1.0 / len(positive_stocks))
            if context.stocks_held.get(security) is None:
                context.stocks_held[security] = 0

    #: Get the total amount ordered for the day
    # amount_ordered = 0 
    # for order in get_open_orders():
    #     for oo in get_open_orders()[order]:
    #         amount_ordered += oo.amount * data.current(oo.sid, 'price')

    #: Order our hedge
    # order_target_value(context.spy, -amount_ordered)
    # context.stocks_held[context.spy] = 0
    # log.info("We currently have a net order of $%0.2f and will hedge with SPY by ordering $%0.2f" % (amount_ordered, -amount_ordered))
    
def check_positions_for_loss_or_profit(context, data):
    # Sell our positions on longs/shorts for profit or loss
    for security in context.portfolio.positions:
        is_stock_held = context.stocks_held.get(security) >= 0
        if data.can_trade(security) and is_stock_held and not get_open_orders(security):
            current_position = context.portfolio.positions[security].amount  
            cost_basis = context.portfolio.positions[security].cost_basis  
            price = data.current(security, 'price')
            # On Long & Profit
            if price >= cost_basis * 1.10 and current_position > 0:  
                order_target_percent(security, 0)  
                log.info( str(security) + ' Sold Long for Profit')  
                del context.stocks_held[security]  
            # On Short & Profit
            if price <= cost_basis* 0.90 and current_position < 0:
                order_target_percent(security, 0)  
                log.info( str(security) + ' Sold Short for Profit')  
                del context.stocks_held[security]
            # On Long & Loss
            if price <= cost_basis * 0.90 and current_position > 0:  
                order_target_percent(security, 0)  
                log.info( str(security) + ' Sold Long for Loss')  
                del context.stocks_held[security]  
            # On Short & Loss
            if price >= cost_basis * 1.10 and current_position < 0:  
                order_target_percent(security, 0)  
                log.info( str(security) + ' Sold Short for Loss')  
                del context.stocks_held[security]  
                
# Constants that need to be global
COMMON_STOCK= 'ST00000001'

SECTOR_NAMES = {
 101: 'Basic Materials',
 102: 'Consumer Cyclical',
 103: 'Financial Services',
 104: 'Real Estate',
 205: 'Consumer Defensive',
 206: 'Healthcare',
 207: 'Utilities',
 308: 'Communication Services',
 309: 'Energy',
 310: 'Industrials',
 311: 'Technology' ,
}

# Average Dollar Volume without nanmean, so that recent IPOs are truly removed
class ADV_adj(CustomFactor):
    inputs = [USEquityPricing.close, USEquityPricing.volume]
    window_length = 252
    
    def compute(self, today, assets, out, close, volume):
        close[np.isnan(close)] = 0
        out[:] = np.mean(close * volume, 0)
                
def universe_filters():
    """
    Create a Pipeline producing Filters implementing common acceptance criteria.
    
    Returns
    -------
    zipline.Filter
        Filter to control tradeablility
    """

    # Equities with an average daily dollar volume greater than $750,000.
    high_volume = (AverageDollarVolume(window_length=252) > 750000)
    
    # Not Misc. sector:
    sector_check = Sector().notnull()
    
    # Equities that morningstar lists as primary shares.
    # NOTE: This will return False for stocks not in the morningstar database.
    primary_share = IsPrimaryShare()
    
    # Equities for which morningstar's most recent Market Cap value is above $300m.
    have_market_cap = mstar.valuation.market_cap.latest > 300000000
    
    # Equities not listed as depositary receipts by morningstar.
    # Note the inversion operator, `~`, at the start of the expression.
    not_depositary = ~mstar.share_class_reference.is_depositary_receipt.latest
    
    # Equities that listed as common stock (as opposed to, say, preferred stock).
    # This is our first string column. The .eq method used here produces a Filter returning
    # True for all asset/date pairs where security_type produced a value of 'ST00000001'.
    common_stock = mstar.share_class_reference.security_type.latest.eq(COMMON_STOCK)
    
    # Equities whose exchange id does not start with OTC (Over The Counter).
    # startswith() is a new method available only on string-dtype Classifiers.
    # It returns a Filter.
    not_otc = ~mstar.share_class_reference.exchange_id.latest.startswith('OTC')
    
    # Equities whose symbol (according to morningstar) ends with .WI
    # This generally indicates a "When Issued" offering.
    # endswith() works similarly to startswith().
    not_wi = ~mstar.share_class_reference.symbol.latest.endswith('.WI')
    
    # Equities whose company name ends with 'LP' or a similar string.
    # The .matches() method uses the standard library `re` module to match
    # against a regular expression.
    not_lp_name = ~mstar.company_reference.standard_name.latest.matches('.* L[\\. ]?P\.?$')
    
    # Equities with a null entry for the balance_sheet.limited_partnership field.
    # This is an alternative way of checking for LPs.
    not_lp_balance_sheet = mstar.balance_sheet.limited_partnership.latest.isnull()
    
    # Highly liquid assets only. Also eliminates IPOs in the past 12 months
    # Use new average dollar volume so that unrecorded days are given value 0
    # and not skipped over
    # S&P Criterion
    liquid = ADV_adj() > 250000
    
    # Add logic when global markets supported
    # S&P Criterion
    domicile = True
    
    # Keep it to liquid securities
    ranked_liquid = ADV_adj().rank(ascending=False) < 1500
    
    universe_filter = (high_volume & primary_share & have_market_cap & not_depositary &
                       common_stock & not_otc & not_wi & not_lp_name & not_lp_balance_sheet &
                       liquid & domicile & sector_check & ranked_liquid)
    
    return universe_filter