Tips for writing robust algorithms for the hedge fund

Robustness is a key quality of an algorithm. The algorithm should continue live trading, day over day, and handle any situation. Below is a list of tips that we've developed - keep these in mind as you're creating algos for the contest and fund!

The lucky 13 tips:

  1. Use "context" instead of global variables to save state. This makes your algorithm more robust to starts/stops and correctly saves the state of variables. For example:

    import pandas as pd
    
    timeperiod = 20 # don't use this syntax
    
    def initialize(context):
        context.timeperiod = 20  # use this instead
    
  2. Use schedule_function to arrange the day/time for your orders and signals. Don't use a manual comparison. For example:

    def initialize(context):
        schedule_function(             # use this syntax
            func=myfunc,
            date_rule=date_rules.every_day(),
            time_rule=time_rules.market_open(minutes=1)
        )

    def handle_data(context, data):
        if get_datetime().time() == '9:31':  # don't use this!
            pass
    
  3. Use order_target functions instead of manually calculating the number of shares to buy. The algo shouldn't blindly buy/sell shares, but rather take existing positions into account.

  4. Check for open orders before placing new orders using the target functions.

  5. Add logging of intended orders (timing, symbols and order target size). This helps us monitor that the algo is behaving correctly, and was able to achieve its target positions.

  6. Record the leverage (using context.account.leverage) to monitor the behavior. Algorithms selected for the fund will have leverage restricted to 1.05. The leverage will be applied at the fund level.

  7. Make sure algo is 'aware' of existing portfolio positions at deploy. The algo should handle starts/stops smoothly and pick up the correct positions.

  8. Verify the algo will backfill any historical data needed to set initial parameters. The algorithm should begin trading immediately, without a warm-up period.

  9. Use sid() instead of symbol() if hard-coding a list of securities. The sid() function is more robust to securities getting acquired and delisted. If you use symbol('ABC') in a live trading algo and the stock is acquired, your algo will stop trading.

  10. Give your algorithm time to place its trades. Don't trade 1 minute before market close; try trading at 3:45PM, or use a VWAPBestEffort order to place trades over a specific time period.

  11. If your algo is pair trading, think about pending orders and failed orders. How will your algo react if one of the legs can't be ordered? Or if the position can't be fully reached? Have a check that if target positions aren't met after X time (3 min? 5 min?), close the pair.

  12. Don't use deprecated functions in your algo (i.e., use history instead of batch_transform).

  13. Protect yourself against bad data prints. Our data vendor, like all data vendors, sometimes passes us bad data. Those bad prints might cause an algorithm to place a trade that it shouldn't, or skip a trade that it would otherwise have made. Check if a price is outside of an interval - for example 10 standard deviations - before acting on the signal.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

21 responses

Alisa,

This post has excellent information! I don't think many people saw your post, otherwise I am sure there would be more discussion on these items.

I decided to try to write an algo that accomplished as many of these "best practices" as I could. This version covers best practice 1-7, 9-10, and 12-13.

I would love for Quantopian to create a similar algo that could be added to your documentation.

Please continue sharing tips like these!

Tristan

[Attached backtest, ID 55b31a999926900c6d46eecb]

How about if I want the algo to trade every other day? Can that be done using schedule_function, or do we have to hard-code each date? Thanks ;)

John,

Here's one way of trading every other day. Conceptually, you need a flag to keep track of whether trading was allowed on the prior day. If so, then don't allow it for the current day. If not, then it's o.k. to trade in the current day.

Grant

def initialize(context):
    context.stocks = [sid(8554), sid(33652)]  # SPY & BND
    schedule_function(trade,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_open(minutes=15))
    context.execute = True  # flag: is trading allowed today?

def handle_data(context, data):
    pass

def trade(context, data):
    if context.execute:
        for stock in context.stocks:
            order_target_percent(stock, 0.5)
        print("Orders submitted")
        context.execute = False  # skip the next trading day
    else:
        print("Trading day skipped")
        context.execute = True   # trade again on the next trading day

[Attached backtest, ID 55b34b2ed3db740c61c0eb75]

These tips are pretty much essential.

One of my strategies tends to go on a buying spree (leverage shoots up to >5) on several particular stocks on several particular days. I had never quite figured out why... until now! Thanks for the convenience function 'has_orders', Alisa!
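The buying spree described above is the failure mode that tip 4 guards against. Here is a minimal sketch in plain Python, not the Quantopian API: `shares_to_submit` and `simulate` are hypothetical names, and the sketch assumes a target-style order is sized from the filled position only and ignores orders that are still pending.

```python
def shares_to_submit(target, filled, open_order_shares, check_open_orders):
    """Return how many shares a target-style order would submit.

    Hypothetical stand-in for illustration only: it assumes the target
    order is sized from the filled position and ignores pending orders.
    """
    if check_open_orders and open_order_shares != 0:
        # Tip 4: an order is still working, so wait for it to resolve.
        return 0
    return target - filled


def simulate(check_open_orders, bars=3, target=100):
    """Total shares pending after `bars` calls when nothing ever fills."""
    filled = 0
    pending = 0
    for _ in range(bars):
        pending += shares_to_submit(target, filled, pending, check_open_orders)
    return pending


print(simulate(check_open_orders=False))  # 300: three stacked orders
print(simulate(check_open_orders=True))   # 100: one order, then it waits
```

If all three stacked orders eventually fill, the position ends up at three times the intended target, which is one way leverage can jump unexpectedly.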

@Alisa

I was reading through Q's Sample Algorithms, and I noticed that some of them use "batch_transform".

Isn't that a violation of tip #12?

@Tristan - that algo is fantastic! It's a great example of how to apply the best practices. Thanks for the elegant write-up and if I had internet points to dole out, you would certainly get them :)

As for your second question, batch_transform was the first function we created to return a trailing window of data. As we've grown, it's shown some holes, and we are not working to support it. The function requires warm-up (i.e., it will return 10 days of data only on the 11th day), is memory intensive, and doesn't have the convenience features of pandas. In short, it's cumbersome. We have left this in the past and encourage you to use the more nimble, powerful history() function. The sample algo still exists because history() cannot be applied to Fetcher data. It can only be applied to data in our database. However, there have been some hacks shared in the community as a workaround until we bridge the gap.

@John - If you want to trade every other day, I would recommend using multiple schedule_function calls. You can use the "week_start" option with a different offset in each call. For example:

def initialize(context):
    context.stock = sid(24)  # AAPL
    schedule_function(trade, date_rules.week_start())   # trade on Monday
    schedule_function(trade, date_rules.week_start(2))  # trade on Wednesday
    schedule_function(trade, date_rules.week_start(4))  # trade on Friday

def handle_data(context, data):
    pass

def trade(context, data):
    # your ordering logic here
    pass

@Ted - glad you were able to find the reason!

@Tristan Thanks for the template! I think I found a bug in #13 though. It computes the standard deviation including the last day and then compares it to the difference between the last two days. If the last day is off, it will already have inflated the standard deviation:

> close_prices[security][0:-1].std()  
float: 0.00680190711624

> close_prices[security].std()  
float: 11.1002873574

> context.max_std_deviation_of_bad_prints * std  
float: 111.002873574  

So I think the correct logic should be like this (first line is changed):

####################################################################  
#  Best Practice #13: Protect against bad data prints  
#####################################################################  
# Calculate the standard deviation of prices  
std = close_prices[security][0:-1].std()  
# If the current prices has changed more than 10 standard deviations,  
# we consider this a "bad data print" and don't make changes to that position  
if abs(close_prices[security][-1] - close_prices[security][-2]) > context.max_std_deviation_of_bad_prints * std:  
    print("{0}: STD = {1}  Assuming bad print.".format(security.symbol, std))
    continue  

Great catch Alex! Actually, I don't think we should include the last day in the STD calculation at all.... so the first line should be:

std = close_prices[security][0:-2].std()  

The only downside to this logic is that after any REAL drastic change, such as an overnight scandal or merger, the algorithm would not trade until the day after the big move.

Can I take this opportunity to repeat my opinion that Quantopian has the responsibility to handle bad input, before it gets to our algos?

The second index in the interval is non-inclusive, so [0:-1] means "up until the last element", not including.

Well, that's why you have the 10x coefficient when comparing. If you have some market jumps in mind, you can check whether it blocks such trades or not. In my case I was comparing a jump from 0.33 to 33, so it's way more than a 10x deviation jump, as you can see (more like 2000x).

(Regarding quotes, I agree that it would be nice if Q handled them. However, I think it's more important to have orders routed to the same exchanges which print those quotes. If IB smart routing sends orders to 5 destinations and Nanex aggregates quotes from 10, an algorithm will be trading on prices it can not get if I understand this correctly.)

Alex: You are right, of course. I always forget that Python slices are half-open.
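The slicing point in this exchange can be checked with a few lines of plain Python. This is only a sketch: `is_bad_print` is a hypothetical helper, the prices are made up to resemble the 0.33 to 33 jump mentioned above, and the standard library's `statistics.stdev` stands in for the pandas `.std()` call used in the shared algorithm.

```python
import statistics


def is_bad_print(closes, max_std=10):
    """Flag the latest close if it moved more than max_std deviations.

    The deviation is computed on closes[:-1], which excludes the latest
    close, so a bad print cannot inflate its own threshold.
    """
    baseline = statistics.stdev(closes[:-1])
    jump = abs(closes[-1] - closes[-2])
    return jump > max_std * baseline


closes = [0.33, 0.34, 0.33, 0.32, 0.33, 33.0]  # made-up data; last value is bad

print(closes[:-1])           # the slice stops before the last element
print(is_bad_print(closes))  # True

# Including the last close inflates the deviation and hides the bad print.
jump = abs(closes[-1] - closes[-2])
print(jump > 10 * statistics.stdev(closes))  # False
```

One caveat with this approach: if every earlier close is identical, the baseline deviation is zero and any change at all gets flagged.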

Hi Tristan

I believe you just used the ma as a way to document good algorithm practices, and not so much as a finished strategy, so the comments below may be superfluous to you, but they may help others thinking of using moving average strategies as the core of their approaches.

Let me repeat the praise you've already gotten above for very elegant code. I was just curious to see how the moving average itself affected the results, so I ran the ma algo against a limited universe consisting only of SPY. That's what the attached backtest shows.

(sorry for butchering the elegance of your code. I took some pretty ugly shortcuts)

My conclusions:
1) ma strategies will almost always underperform the actual security itself, but they will do so with more downside protection
2) you CAN develop ma strategies that outperform a simple long strategy, but to do so you have to go "intracandle", buying and selling just after an ma cross (or a minimum ma[-1] ma[-2] move). This in turn brings up the problem of "whipsawing", but that can be dealt with, and given the very low cost of trading stocks these days, it becomes much less expensive than waiting until "candle close" to act.

Again, thanks for sharing

[Attached backtest, ID 5691a85369ec2f11a12fcc47]

@Serge: Great points about using MA trading strategies. This shared algo was not at all about the strategy, but more about trying to implement the "robustness" tips that Alisa shared.

BTW, I have shared a notebook to test which MA values would work best over a given time-period. The problem was that those values did not provide predictive value for any other time periods!

I have had some issues with using the talib library and walk forward testing. According to item 8 above, I should look to backfill the indicator so I can test the algo immediately. I am having some trouble accomplishing this. Can someone help me with the proper order of operations for backfilling, for example with RSI?

Of all those bullet points, I am missing an example for one important one: #11. I don't know how to code it. Could I see an example?
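One possible way to sketch the check described in tip 11, in plain Python rather than the Quantopian API. Everything here is an assumption for illustration: `pair_action`, its arguments, and the 5-minute default are hypothetical, and the caller is expected to track how many minutes have passed since the pair's orders were submitted.

```python
def pair_action(targets, positions, minutes_elapsed, timeout_minutes=5):
    """Decide what to do with a pair after its orders were submitted.

    targets / positions: dicts mapping a leg name to a share count.
    Returns "hold" when every leg is at its target, "wait" while the
    timeout has not expired, and "close" once it has.
    """
    filled = all(positions.get(leg, 0) == shares
                 for leg, shares in targets.items())
    if filled:
        return "hold"
    if minutes_elapsed < timeout_minutes:
        return "wait"
    return "close"


targets = {"A": 100, "B": -100}

print(pair_action(targets, {"A": 100, "B": -100}, 1))  # hold
print(pair_action(targets, {"A": 100, "B": -40}, 3))   # wait
print(pair_action(targets, {"A": 100, "B": -40}, 5))   # close
print(pair_action(targets, {"A": 100}, 6))             # close
```

On a "close" result, the algo would cancel what is still open and flatten both legs, for example by targeting zero shares on each, so that it is never left holding one side of the pair.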

@Adam, the history() function is what you want to get a trailing window of data, you can then pass the values returned to the TA-Lib functions. Here is one of the example algos from the help docs that uses history with TA-Lib.

[Attached backtest, ID 56990e17c11f201175db5186]

Hi Alisa,

I am having difficulty implementing the tip you suggested in item 8 (reducing the warm-up period). Most of my algos require more than 7 days before placing the first trade.

Can anyone provide an easy example for me?

Cheers.

@David TA-Lib is very sensitive to NaNs in data. I wish help docs had more realistic examples which can be used directly. I am doing this for ATR but it looks very clumsy:

# imports belong at the top of the algo
import math
import numpy as np
import pandas
import talib

# one extra bar: the first true range needs the prior close
close_prices = history(context.atr_period+1, '1d', 'close_price')
highs = history(context.atr_period+1, '1d', 'high')
lows = history(context.atr_period+1, '1d', 'low')

for security in data:
    # skip any security with a NaN in its high, low or close window
    nan_present_highs = np.any(pandas.isnull(highs[security]))
    nan_present_lows = np.any(pandas.isnull(lows[security]))
    nan_present_closes = np.any(pandas.isnull(close_prices[security]))

    if nan_present_highs or nan_present_lows or nan_present_closes:
        continue

    # keep only the most recent ATR value
    atr = talib.ATR(highs[security], lows[security], close_prices[security], timeperiod = context.atr_period) [ -1 ]

    if (atr is None) or math.isnan(atr):
        continue

Is there a better, canonical way to avoid NaNs here?

@Adam All rule 8 says is that if your algorithm accumulates data as it goes and computes some internal values, it should use history() and loop through the past X days during the first time handle_data is invoked (or better yet - inside of before_trading_start). This way you will compute your values on the first run and can trade right away instead of skipping X days to warm up. If your algorithm uses history and does not save any internal calculations inside of context then there is nothing to worry about.
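To make the idea concrete, here is a minimal sketch in plain Python. It is not platform code: `RollingMean` is a hypothetical class, and the `history` argument stands in for whatever trailing window of past closes the algorithm fetches on its first run.

```python
from collections import deque


class RollingMean(object):
    """Trailing mean that is seeded from history instead of warming up."""

    def __init__(self, window, history):
        # `history` is a list of past closes, oldest first (assumed input).
        self.values = deque(history[-window:], maxlen=window)

    @property
    def ready(self):
        """True once a full window of values is available."""
        return len(self.values) == self.values.maxlen

    def update(self, price):
        """Add the newest price and return the current trailing mean."""
        self.values.append(price)
        return sum(self.values) / float(len(self.values))


past_closes = [10.0, 11.0, 12.0, 13.0]  # made-up backfill data

seeded = RollingMean(3, past_closes)
print(seeded.ready)         # True: usable on the first bar
print(seeded.update(14.0))  # 13.0, the mean of 12, 13 and 14

cold = RollingMean(3, [])
print(cold.ready)           # False: would need three bars of warm-up
```

The seeded instance can produce a signal on the first live bar, while the cold one has to wait until it has accumulated a full window on its own.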

@Alex K, unfortunately there isn't much you can do about NaNs in some cases. For example, if you need 200 days of history but one of the stocks has only been trading for 100 days you will have NaN values that you can't do anything about. Your best bet is to forward fill the missing bars and drop any stocks that still have NaNs in the resulting dataframe. It's probably safest to not trade stocks with missing data.
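The forward-fill-then-drop approach can be written compactly with pandas. This is a sketch on a made-up DataFrame; `usable_columns` is a hypothetical helper, and whether the frame returned by history() on the platform behaves identically is an assumption.

```python
import numpy as np
import pandas as pd


def usable_columns(prices):
    """Forward fill gaps, then drop securities that still contain NaNs."""
    return prices.ffill().dropna(axis=1)


prices = pd.DataFrame({
    "AAA": [10.0, np.nan, 10.4, 10.5],    # gap in the middle: filled forward
    "BBB": [np.nan, np.nan, 20.1, 20.3],  # no early history: dropped
    "CCC": [30.0, 30.2, 30.1, 30.4],      # complete
})

clean = usable_columns(prices)
print(list(clean.columns))    # ['AAA', 'CCC']
print(clean["AAA"].tolist())  # [10.0, 10.0, 10.4, 10.5]
```

Filtering once on the whole frame removes the need for a per-security NaN check inside the loop, since every column that survives is complete.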

@David I understand, I am asking more from the coding perspective - is there a more elegant way to filter out those stocks? Are there any plans to support it more directly as a part of the API?

@David and Alex K - Thank you! Will give your suggestions a try!