Use Accern's Sentiment Dataset with Quantopian's Optimization Package to get 2.37 Sharpe Ratio & Questions on Acceptable Avg Daily Turnover Rates

I used Accern's sentiment datasets to create trading signals/scores for S&P 500 companies and rebalance on an hourly basis. With Quantopian's optimization package, the results look solid without trading/slippage costs.

My question is: with an hourly rebalance frequency, what is the common/acceptable range of daily turnover for most institutional funds? My strategy's daily turnover is in the range of 300% to 600%, and I'd like to compare that to the industry standard.
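
For reference on the turnover figure: one common definition is total dollars traded in a day divided by portfolio value. Below is a minimal sketch of computing it from a backtest's transaction log in research; the DataFrame layout and column names ('amount', 'price') are assumptions based on what get_backtest() returns, not the exact code used here.

import pandas as pd

def daily_turnover(transactions, portfolio_value):
    # transactions: DataFrame with a DatetimeIndex, signed share 'amount',
    # and 'price' columns (assumed layout).
    # portfolio_value: Series of end-of-day portfolio values indexed by the
    # same normalized dates.
    traded_dollars = transactions['amount'].abs() * transactions['price']
    traded_per_day = traded_dollars.groupby(transactions.index.normalize()).sum()
    return (traded_per_day / portfolio_value).dropna()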

Appreciate the answers in advance.

[Attached backtest: the metrics preview (total returns, alpha, beta, Sharpe, Sortino, max drawdown, benchmark returns, volatility) did not render in this export.]

import numpy as np
import quantopian.optimize as opt
import quantopian.algorithm as algo
import pandas as pd

data_file_2014 = 'https://dl.dropboxusercontent.com/s/rtj06sorq4rftfj/5_22_hourly_modified_PA1_sum_2014.csv?dl=0'

data_file_2015 = 'https://dl.dropboxusercontent.com/s/curqa024pzaot3j/5_22_hourly_modified_PA1_sum_2015.csv?dl=0'

data_file_2016 = 'https://dl.dropboxusercontent.com/s/zc05wl3etuaocn2/5_22_hourly_modified_PA1_sum_2016.csv?dl=0'

data_file_2017 = 'https://dl.dropboxusercontent.com/s/ms7ykxsd9ihywop/5_22_hourly_modified_PA1_sum_2017.csv?dl=0'

data_file_2018 = 'https://dl.dropboxusercontent.com/s/3awxenq649bfhs0/5_22_hourly_modified_PA1_sum_2018.csv?dl=0'
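
# The yearly CSVs above are assumed to contain at least: a 'symbol' column
# (fetch_csv's default for mapping rows to assets), 'next_hour' (used below as
# the date column), 'enter_hour' (the hour at which the signal should be
# traded), and 'normed_score' (the per-asset alpha score).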


# Set maximum portfolio leverage
MAX_GROSS_EXPOSURE = 1
# Set maximum position sizes for individual longs and shorts
MAX_SHORT_POSITION_SIZE = 0.01
MAX_LONG_POSITION_SIZE = 0.01
single_metric_string = 'normed_score'

# Accumulator used by fetch_csv's post_func to stitch the yearly CSVs together.
data_take = 0
k = 0
def merge_data(df):
    # post_func for fetch_csv: concatenate each fetched yearly frame into one.
    global data_take
    global k
    if k == 0:
        data_take = df  # first file fetched starts the combined frame
        k = 1
    else:
        frames = [data_take, df]
        data_take = pd.concat(frames)  # append subsequent files
    return data_take


def initialize(context):
    set_benchmark(sid(8554))
    set_slippage(slippage.FixedSlippage(spread=0.0))
    # set_slippage(slippage.FixedBasisPointsSlippage(basis_points=0, volume_limit=1))
    set_commission(commission.PerShare(cost=0.0, min_trade_cost=0))
    # schedule_function(daily_liquidate,
    #                   date_rules.every_day(),
    #                   time_rules.market_close(hours = 0, minutes = 35))
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 0, minutes = 30)) # 10:00 am
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 1, minutes = 30)) # 11:00 am 
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 2, minutes = 30)) # 12:00 pm
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 3, minutes = 30)) # 1:00 pm
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 4, minutes = 30)) # 2:00 pm
    
    schedule_function(my_rebalance,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 5, minutes = 30)) # 3:00 pm
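
    # (The six hourly rebalance schedules above could equivalently be
    #  registered in a loop; a compact sketch of the same 10:00 am to 3:00 pm
    #  schedule:)
    #
    # for h in range(6):
    #     schedule_function(my_rebalance,
    #                       date_rules.every_day(),
    #                       time_rules.market_open(hours=h, minutes=30))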
       
    schedule_function(record_vars,
                      date_rules.every_day(),
                      time_rules.market_close(hours = 0, minutes = 1))   
    
    fetch_csv(data_file_2014,
          date_column='next_hour',  # Assigning the column label
          date_format='%m/%y/%d %H:%M:%S',  # Using date format from CSV file
          mask=False,
          post_func=merge_data,
          timezone='EST')
    
    fetch_csv(data_file_2015,
          date_column='next_hour',  # Assigning the column label
          date_format='%m/%y/%d %H:%M:%S',  # Using date format from CSV file
          mask=False,
          post_func=merge_data,
          timezone='EST')
    
    
    fetch_csv(data_file_2016,
          date_column='next_hour',  # Assigning the column label
          date_format='%m/%y/%d %H:%M:%S',  # Using date format from CSV file
          mask=False,
          post_func=merge_data,
          timezone='EST')
    
    fetch_csv(data_file_2017,
          date_column='next_hour',  # Assigning the column label
          date_format='%m/%y/%d %H:%M:%S',  # Using date format from CSV file
          mask=False,
          post_func=merge_data,
          timezone='EST')
    
    
    fetch_csv(data_file_2018,
          date_column='next_hour',  # Assigning the column label
          date_format='%m/%y/%d %H:%M:%S',  # Using date format from CSV file
          mask=False,
          post_func=merge_data,
          timezone='EST')    
    
    


def my_rebalance(context, data):
    scores = []
    stocks = []
    current_hour = get_datetime('US/Eastern')
    for stock in data.fetcher_assets:
        enter_hour = data.current(stock, 'enter_hour')
        enter_hour = pd.to_datetime(enter_hour)

        # Only trade assets whose signal is flagged for the current hour.
        if current_hour.strftime('%Y-%m-%d %H:%M:%S') == enter_hour.strftime('%Y-%m-%d %H:%M:%S'):
            stocks.append(stock)
            score = data.current(stock, single_metric_string)
            scores.append(score)

    df = pd.DataFrame(scores, columns=[single_metric_string], index=stocks)
    
    
    # Set the objective for our optimizer
    df[single_metric_string] = df[single_metric_string].astype(float)
    objective = opt.MaximizeAlpha(df[single_metric_string])
    # Set constraints
    constrain_gross_leverage = opt.MaxGrossExposure(MAX_GROSS_EXPOSURE)
    market_neutral = opt.DollarNeutral(tolerance=0.08)
    constrain_pos_size = opt.PositionConcentration.with_equal_bounds(
        -MAX_SHORT_POSITION_SIZE,
        MAX_LONG_POSITION_SIZE,
    )

    # Place orders based on our objective and constraints
    try:
        algo.order_optimal_portfolio(
            objective=objective,
            constraints=[
                constrain_gross_leverage,
                constrain_pos_size,
                market_neutral,
            ],
        )
    except Exception:
        # Skip this rebalance if the optimizer fails (e.g., no assets matched this hour).
        pass
    
def record_vars(context, data):
    long_count = 0
    short_count = 0
    for position in context.portfolio.positions.values():
        if position.amount > 0:
            long_count += 1
        elif position.amount < 0:
            short_count += 1
    record(gross_leverage=context.account.leverage)  # Plot leverage to chart
    record(net_leverage=context.account.net_leverage)
    record(num_longs = long_count)
    record(num_shorts = short_count)
    record(portfolio_size = len(context.portfolio.positions))

12 responses

@ Brad

Sorry, I don't have an answer for you. However, I have a question: how/where did you get the Accern data? Did you purchase it on their site?

This is a sweet algo. I like the low vol.

Thanks,

Luc

@ Luc

Appreciate the reply and thanks for the comment on this strategy. No, I didn't buy it; I'm actually working at Accern now, using our raw data to develop and research sentiment-driven trading strategies.

If you are interested in our dataset/strategies, you can send me a message along with your email address, or send an email to [email protected]. My email is [email protected]. Thanks!

+1 Luc, agreed: very low volatility (and drawdown).

Hi Brad, I like the 40~50 positions and that leverage stays relatively stable throughout, granted the high daily turnover across those 40~50 positions.

Would a tear sheet be forthcoming? Keen to see the risk metrics and position concentration (if you could set round_trips=True please).
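
For reference, I mean something along these lines in a research notebook (the backtest ID below is a placeholder):

bt = get_backtest('your_backtest_id')        # placeholder ID; get_backtest is available in research
bt.create_full_tear_sheet(round_trips=True)  # includes the round-trip / P&L breakdown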

Thanks

@ Karl,

Thanks for your great feedback as well. Attached is the tear sheet for your reference.

[Attached notebook: tear sheet preview unavailable.]

The lack of volatility/drawdown looks amazing. I think you'll discover, though, that if you apply slippage and commissions it quickly dives to $0 due to the high turnover rate.
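
Rough illustrative numbers (the per-trade cost level here is an assumption, not a measured figure):

daily_turnover = 4.0        # ~400% of portfolio value traded per day (midpoint of 300-600%)
cost_per_dollar = 0.0005    # assumed ~5 bps combined commission + slippage per dollar traded
daily_drag = daily_turnover * cost_per_dollar    # = 0.002, i.e. 0.2% of capital per day
annual_drag = 1 - (1 - daily_drag) ** 252        # ~0.40, i.e. roughly 40% of capital per year
print(round(annual_drag, 2))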

Nice strategy, Brad! Did you use dictionaries to come up with the sentiment data, or a deep learning technique?

Thanks for sharing!

Very impressive, Brad and thanks for sharing! Quick queries:

  • Could not see the Profit & Loss summary, Probability Win/Loss and the other parts of the tear sheet normally shown when round_trips=True - suggest you let it run a bit longer to its full conclusion and see whether they appear?
  • Is Accern part of the Quantopian Data Bundle and accessible for Pipeline using from quantopian.pipeline.data.accern import alphaone ?
  • Possible to run the algorithm & tear sheet with slippage and commission for comparison, say:
set_commission(commission.PerShare(cost=0.001, min_trade_cost=1))  
set_slippage(slippage.VolumeShareSlippage(volume_limit=0.025, price_impact=0.1))  

I can see an application in my algorithms.. pending the Profit & Loss and Probability Win/Loss distribution. Excellent effort, Brad!

@ Viridian,

You are definitely correct. I am also always suspicious of any higher-frequency (higher-turnover) strategy unless commissions and slippage are small enough.

This is showcased mainly to explore and demonstrate the Accern dataset's pure predictive power, which can potentially help make money in the financial markets. I have posted lower-frequency (daily and weekly) strategies built on Accern's DS2 data. Feel free to check here (Backtest with Accern's ML-Driven DS2 Dataset to Generate Daily Strategy with Sharpe Ratio of 3 with Trading Costs).

@ Adrian,

Thanks for the great feedback.

Our raw sentiment data is built on Accern's proprietary NLP algorithms, including the tagging and dictionary techniques you mentioned. Our DS2 daily score dataset, built specifically for trading S&P 500 companies, is derived from our raw sentiment data and involves deep learning techniques. You can find DS2-related strategies here.

@ Karl,

  • Apologies, probably due to a bad connection the first attempt above didn't show all the results. Here is the updated full tear sheet so you can see the round-trip analysis.

  • No. I believe AlphaOne was an old version of the package on Quantopian from several years ago, which can no longer be used. However, we are planning to be back on Quantopian in the near future.

  • I believe the result would be very poor because the extreme daily turnover rate would incur a huge amount of trading costs. In real-life trading, Accern's datasets are more helpful for mid/longer-term strategies, e.g. daily/weekly/monthly strategies, which are impacted much less by trading costs.

Feel free to check my reply to Adrian above about our DS2 dataset's strategies. You can find DS2 based strategies here.

[Attached notebook: updated full tear sheet preview unavailable.]

That's tremendous, Brad; very helpful, along with the DS2 datasets for that ML-based strategy.

Appreciate the full tear sheet, and thanks for your comment on AlphaOne. Also noted that Q has now pipelined your DS2 data - thanks to Josh's prompt and Jamie's work - into your algorithm to exemplify the application.

My interest in this Sentiment Dataset version is in augmenting performance, more particularly using the "sentiment" as an incremental alpha before portfolio optimisation. For that purpose:

  • Agree with you that this strategy is about the alpha-generating capacity of the sentiment datasets and that, as an increment, the slippage and cost are already taken into account in the pre-existing algorithm.
  • Yes, that's crazy daily turnover (which also fails the contest rules) - possible to run your algorithm and the tear sheet for 50~100 positions but scheduled to run twice a week? For instance:
for i in range(1, 5, 2):  # Tues & Thurs only  
    schedule_function(func=rebalance,  
        date_rule=date_rules.week_start(days_offset=i),  
        time_rule=time_rules.market_open(hours=2, minutes=1),  
        half_days=True)  
  • While Accern is "planning to be back on Quantopian in the near future", is there a way to subscribe to the sentiment data?

Thanks for sharing and your great responses!

Karl

Thanks @Karl,

Apologies for the late reply here. Exactly, Josh and Jamie have been a great help in directing me to use the self-serve data pipeline features on Quantopian! Thanks again to @Josh and @Jamie.

I like the idea of rebalancing at lower frequencies. In fact, I recently built some weekly strategies using Accern's daily DS2 dataset, which passed all contest requirements on Quantopian. Feel free to check here.

Yes, we provide subscriptions; you can shoot me an email at [email protected] or [email protected] so that I can provide you with more details.