Tax Harvesting Long Only

I've been playing around with a random selection algo that was recently shared, mostly testing how many securities are needed to effectively diversify (I think most academic studies put the number at about 15 before you hit diminishing returns).
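
To get a feel for where the diminishing returns kick in, here is a minimal sketch of the textbook formula for the volatility of an equal-weighted portfolio. It assumes every stock has the same volatility and every pair has the same correlation; the 30% volatility and 0.25 correlation are illustrative placeholders, not estimates from the backtests.

```python
import math


def portfolio_vol(n, stock_vol=0.30, avg_corr=0.25):
    """Volatility of an equal-weighted portfolio of n stocks.

    Assumes identical per-stock volatility and identical pairwise
    correlation. Both defaults are made-up illustrative values.
    """
    variance = stock_vol ** 2 * (1.0 / n + (1.0 - 1.0 / n) * avg_corr)
    return math.sqrt(variance)


# Volatility keeps falling as names are added, but each extra name helps less.
for n in (1, 5, 15, 30, 100):
    print(n, round(portfolio_vol(n), 4))
```

Under these assumptions the volatility can never drop below stock_vol * sqrt(avg_corr), which is the part that diversification cannot remove.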

Anyway, the results from those tests got me thinking about this algo: a tax-efficient approach that tries to give you a beta of 1 while selling off losing stocks to "harvest" the tax benefit (see Investopedia on tax-loss harvesting). It starts with 30 random stocks picked from the S&P 500 (the code approximates this with the 500 largest US companies by market cap listed on NYSE or NASDAQ). Each month it checks which stocks have unrealized losses, sells them, and blacklists them for 31 days to avoid running into a wash sale. It then randomly picks a few more stocks to replace the ones that were sold and rebalances everything to equal weighting. Additionally, each year it increases the number of holdings by 5 to inject some fresh blood (otherwise the companies get stale).
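
One detail on the 31-day blacklist: the posted code counts it down in before_trading_start, which runs once per trading day, so the 31 is really 31 trading days (about six weeks). That is on the conservative side, and with a monthly check it means a sold stock sits out roughly two cycles. A rough sketch of the calendar-day alternative is below; the helper names and the "XYZ" ticker are hypothetical, not part of the algorithm.

```python
import datetime

WASH_SALE_DAYS = 31  # same cooling-off length the post uses


def blacklist_until(sale_date, days=WASH_SALE_DAYS):
    """First calendar date on which the stock may be bought again."""
    return sale_date + datetime.timedelta(days=days)


def is_blacklisted(blacklist, symbol, today):
    """True while today is still before the symbol's expiry date."""
    expiry = blacklist.get(symbol)
    return expiry is not None and today < expiry


# Hypothetical sale of "XYZ" on 2016-01-04.
blacklist = {"XYZ": blacklist_until(datetime.date(2016, 1, 4))}
print(is_blacklisted(blacklist, "XYZ", datetime.date(2016, 2, 1)))  # True
print(is_blacklisted(blacklist, "XYZ", datetime.date(2016, 2, 4)))  # False
```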

Right now, I'm measuring the tax benefit with a back-of-the-envelope calculation, (cost basis - current price) * shares, added up over every position sold at a loss, which I expect to be a reasonable proxy for the harvested losses. I would also expect splits and dividends to complicate it in ways that I haven't accounted for, but I think it should be pretty close.
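
As a worked example of that proxy, here is a small standalone function. The position values are made up for illustration, and the function ignores splits, dividends, and actual tax rates, just like the estimate described above.

```python
def harvested_loss(cost_basis, current_price, shares):
    """Loss realized by selling now, or 0.0 if the position is at a gain."""
    return max(cost_basis - current_price, 0.0) * shares


# Hypothetical position: 100 shares bought at 50.00, now trading at 42.50.
loss = harvested_loss(50.00, 42.50, 100)
print(loss)  # 750.0

# A position sitting on a gain contributes nothing to the running total.
print(harvested_loss(50.00, 55.00, 100))  # 0.0
```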

I've run the backtest 8 times over the same period to see how the random selection affects it, and this was the worst result. The other returns were 270%, 307%, 451%, 337%, 308%, 406%, and 407%. I tried getting a notebook going so I could run a batch of tests, but ran into trouble with that.
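
For a quick read on how much the seven other runs vary, the numbers quoted above can be summarized with the standard library. This only covers those seven figures; the attached worst run is not included, since its return is not stated in the text.

```python
import statistics

# Total returns (%) of the seven other runs, as quoted in the post.
other_runs = [270, 307, 451, 337, 308, 406, 407]

mean_return = statistics.mean(other_runs)
median_return = statistics.median(other_runs)
spread = max(other_runs) - min(other_runs)
sample_stdev = statistics.stdev(other_runs)

print("mean   %.1f%%" % mean_return)
print("median %.1f%%" % median_return)
print("spread %d points" % spread)
print("stdev  %.1f points" % sample_stdev)
```

Seven runs is a small sample, so the standard deviation here is only a rough indication of how much the random picks matter.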

# Random 30 from the 500 largest US stocks (NYSE/NASDAQ), with monthly tax-loss harvesting

import numpy as np
import pandas as pd
from sqlalchemy import or_
import random
import datetime
#random.seed(987654321)

def initialize(context):
    schedule_function(trade, date_rules.month_start(), time_rules.market_open(hours=1))
    schedule_function(evaluate, date_rules.month_start(), time_rules.market_open())
    context.last_month = 0
    context.tax_losses = 0
    context.securities = {}
    context.blacklist = {}
    context.extra_comps = 0
    set_commission(commission.PerTrade(cost=0))
    
def before_trading_start(context,data):
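    #count down each blacklisted stock's remaining days and release it at zero
    #(this runs once per trading day, so the count is in trading days)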
    remove = {}
    if len(context.blacklist) > 0:
        for stock in context.blacklist:
            if context.blacklist[stock] == 0:
                remove[stock] = 0
            else:
                context.blacklist[stock] -= 1
    for stock in remove:
        del context.blacklist[stock]
    
    
    #refresh fundamentals
    x = 500
    
    month = get_datetime().month
    if context.last_month == month: return
    context.last_month = month
    context.extra_comps += 1
    
    context.fundamentals = get_fundamentals(query(fundamentals.valuation.market_cap, 
                             fundamentals.company_reference.primary_exchange_id)
                        .filter(fundamentals.valuation.market_cap > 5e8)
                        .filter(fundamentals.company_reference.country_id == "USA")
                        .filter(or_(fundamentals.company_reference.primary_exchange_id == "NAS", fundamentals.company_reference.primary_exchange_id == "NYS"))
                        .order_by(fundamentals.valuation.market_cap.desc())
                        .limit(x)
    )
     
    record(leverage = context.account.leverage, tax_benefit = context.tax_losses)

def trade(context, data):
    #replace and rebalance stocks
   
    if get_open_orders(): return
    
    for stock in context.portfolio.positions:
        if stock not in context.securities and data.can_trade(stock): 
            order_target(stock, 0)
            
    for stock in context.securities:
        if data.can_trade(stock):
            order_target_percent(stock, 1.0/len(context.securities))
        
def evaluate(context, data):
    #remove stocks that have lost for tax harvesting; blacklist them for 31 days to prevent wash sale concerns
    remove = {}
    if len(context.securities) > 0:
        for stock in context.securities:
            px_0 = context.portfolio.positions[stock].cost_basis
            px = data.current(stock, 'price')
            if not data.can_trade(stock):
                remove[stock] = 0
            elif px < px_0 :
                remove[stock] = 0
                context.blacklist[stock] = 31
                context.tax_losses += (px_0 - px) * context.portfolio.positions[stock].amount
    for stock in remove:
        del context.securities[stock]
    
    x = len(context.securities)               # holdings kept after harvesting
    # target size: 30 names plus 5 for every 12 monthly refreshes;
    # // keeps this an integer under Python 3 as well
    y = 30 + (context.extra_comps // 12) * 5
    z = max(y - x, 0)                         # open slots to fill, never negative
    print(x)
    print(y)
    print(z)


    #fill the open slots with random names that are tradable, not blacklisted, and not already held
    new_comps = []

    for stock in context.fundamentals:
        if (data.can_trade(stock)) and stock not in context.blacklist and stock not in context.securities:
            new_comps.append(stock)

    random.shuffle(new_comps)
    new_comps = new_comps[:z]
    for stock in new_comps:
        context.securities[stock] = 1
    print(len(context.securities))

3 responses

@James

Just cloned and ran this algorithm; not sure yet why my run had better results. Interesting idea, and I hope to study your concept in the future.


The underlying stock selection is done randomly, so each backtest is going to be different. What's interesting to me is that I've run this backtest 10 times, and the worst one matched S&P 500 returns.

It'd be awesome if someone could create a notebook where we could run a bunch of backtests at once to see how the randomness materializes. I tried, but I'm not familiar enough with Zipline to run a backtest with dynamic stock selection.
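
In case it helps anyone attempting this, here is a rough sketch of the batch harness, independent of Zipline. It assumes you can wrap a single backtest in a function that takes a seed and returns the total return in percent; run_once is that hypothetical hook. toy_backtest is only a stand-in that compounds made-up random monthly returns so the harness can be run as-is. It is NOT the strategy. Seeding each run (the algo has a commented-out random.seed call) would make individual results reproducible.

```python
import random
import statistics


def run_batch(run_once, seeds):
    """Call run_once(seed) for each seed and summarize the total returns."""
    results = {seed: run_once(seed) for seed in seeds}
    values = list(results.values())
    summary = {
        "runs": len(values),
        "worst": min(values),
        "best": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }
    return results, summary


def toy_backtest(seed):
    """Stand-in for a real backtest: compounds 120 made-up monthly returns."""
    rng = random.Random(seed)  # private generator, so runs don't interfere
    total = 1.0
    for _ in range(120):
        total *= 1.0 + rng.gauss(0.01, 0.04)  # illustrative mean and volatility
    return (total - 1.0) * 100.0  # total return in percent


results, summary = run_batch(toy_backtest, range(20))
print(summary)
```

Swapping toy_backtest for a function that actually runs the algorithm with the given seed is the part that still needs Zipline knowledge.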

I did not realize the stocks were selected randomly. That makes testing the concept a real challenge.