Finally got this to work. The strategy is a linear mean-reversion strategy pair trading EWA and EWC, using minute data. The returns are quite consistent with the description in Ernie's book.

However, the disappointing thing is that I found it quite difficult to long or short a large amount of EWA or EWC at the close price. If I want to long or short more than 50 shares of EWA or EWC each day, I simply can't fill those shares at that day's close price. I can only achieve this performance using $1,000 of starting cash and a base size of only 20 shares. If I try to use more money, I find it difficult to get similar performance because I can't long or short EWA and EWC at the price that I want. Please comment and advise if you have better ideas to improve this algo. Thanks.

import numpy as np
import pandas as pd
from collections import deque
from pytz import timezone

window_days = 28  # window length in days
window_minutes = window_days*390  # window length in minutes; each trading day has 390 minutes

def initialize(context):
    context.max_notional = 100000
    context.min_notional = -100000

    context.stocks = [sid(14516), sid(14517)]
    context.evec = [0.943, -0.822]
    context.unit_shares = 20
    context.tickers = [int(str(e).split(' ')[0].strip("Security(")) for e in context.stocks]
    context.prices = pd.DataFrame({k: pd.Series() for k in context.tickers})
    context.previous_datetime = None
    context.new_day = None
    set_commission(commission.PerShare(cost=0.00))

def handle_data(context, data):
    # skip the tick if any orders are open or any stock did not trade
    for stock in context.stocks:
        if bool(get_open_orders(stock)) or data[stock].datetime < get_datetime():
            return

    current_datetime = get_datetime().astimezone(timezone('US/Eastern'))
    # detect a new trading day
    if context.previous_datetime is None or current_datetime.day != context.previous_datetime.day:
        context.new_day = True
    context.previous_datetime = current_datetime

    # warm-up: collect one price row per day until the lookback window is full
    if len(context.prices) < window_days and context.new_day:
        context.previous_datetime = get_datetime().astimezone(timezone('US/Eastern'))
        if intradingwindow_check(context):
            newRow = pd.DataFrame({k: float(data[s].price) for k, s in zip(context.tickers, context.stocks)}, index=[0])
            context.prices = context.prices.append(newRow, ignore_index=True)
            context.new_day = False
    else:
        if intradingwindow_check(context) and context.new_day:
            # z-score of today's combined (spread) price against the window
            comb_price_past_window = np.zeros(len(context.prices))
            for ii, k in enumerate(context.tickers):
                comb_price_past_window += context.evec[ii]*context.prices[k]

            meanPrice = np.mean(comb_price_past_window)
            stdPrice = np.std(comb_price_past_window)
            comb_price = sum([e*data[s].price for e, s in zip(context.evec, context.stocks)])
            h = (comb_price - meanPrice)/stdPrice

            # rebalance each leg to -h units of the pair
            current_amount = []
            cash_spent = []
            for ii, stock in enumerate(context.stocks):
                current_position = context.portfolio.positions[stock].amount
                new_position = context.unit_shares * (-h) * context.evec[ii]
                current_amount.append(new_position)
                cash_spent.append((new_position - current_position)*data[stock].price)
                order(stock, new_position - current_position)
            context.new_day = False

            notionals = []
            for stock in context.stocks:
                notionals.append((context.portfolio.positions[stock].amount*data[stock].price)/context.portfolio.starting_cash)

            log.info("h = {h}, comb_price = {comb_price}, notionals = {notionals}, total = {tot}, "
                     "price0 = {p0}, price1 = {p1}, cash = {cash}, amount = {amount}, new_cash = {nc}".format(
                         h=h, comb_price=comb_price, notionals=notionals,
                         tot=context.portfolio.positions_value + context.portfolio.cash,
                         p0=data[context.stocks[0]].price, p1=data[context.stocks[1]].price,
                         cash=context.portfolio.cash, amount=current_amount,
                         nc=context.portfolio.cash - sum(cash_spent)))

            # roll the window forward by one day
            newRow = pd.DataFrame({k: float(data[s].price) for k, s in zip(context.tickers, context.stocks)}, index=[0])
            context.prices = context.prices.append(newRow, ignore_index=True)
            context.prices = context.prices[1:len(context.prices)]

            record(h=h, mPri=meanPrice)
            record(comb_price=comb_price)
            record(not0=notionals[0], not1=notionals[1])

def intradingwindow_check(context):
    # convert all time zones to US Eastern to avoid confusion
    loc_dt = get_datetime().astimezone(timezone('US/Eastern'))
    if loc_dt.hour == 15 and loc_dt.minute > 0:
        return True
    else:
        return False

We have migrated this algorithm to work with a new version of the Quantopian API. The code is different from the original version, but the investment rationale of the algorithm has not changed.

Hi Huapu,

Can you explain the strategy? I'm willing to help you, but I don't have access to the book and don't quite understand your code.

Regards,
Ed

Hello Huapu,

The algorithm seems to rely on the 2008-2009 crash to be profitable. Also, my understanding is that $1K capital is not adequate if you are shorting; you'll need at least $25K, due to regulatory requirements.

Grant

Hello Huapu,

EWA and EWC trade 1M - 1.5M shares each day on average, according to Yahoo Finance. You could modify commission and slippage in initialize as shown in this example:

    set_commission(commission.PerShare(cost=0.005))
    set_slippage(slippage.FixedSlippage(spread=0.00))

The second line allows you to trade in a backtest irrespective of volume.

P.

Let me explain the algorithm first. The algo is based on pair trading EWA and EWC. I assigned pre-defined weights to EWA and EWC in context.evec, which means that if I order 0.943 shares of EWA, I order -0.822 shares of EWC, and vice versa. Then I look at the combined price of the EWA/EWC pair weighted by context.evec, which should be a semi-stationary series. I hold -h*context.unit_shares units of the EWA/EWC pair, where h is the z-score of the combined pair price. So this is basically a mean-reversion strategy trading the spread between EWA and EWC.

The problem I have is that this strategy relies on being able to buy EWA/EWC at the price I want. However, when I increase the number of shares that I long/short, the price starts to slip. For example, here I increased context.unit_shares to 200, which means I long/short on the order of 200*z-score shares of EWA/EWC every day. In the attached Full Backtest, you can see in the Transaction Details that the algo bought 125 shares of EWA at $20.38 at 2006-06-06 10:04:00, 150 shares at $20.34 at 2006-06-06 10:07:00, and 109 shares at $20.20 at 2006-06-06 10:08:00. But what I really wanted was to buy (125+150+109) shares at the same price at once. However, if I set context.unit_shares to 20, I don't have that problem, because I can buy 30 shares of EWA all at once.
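The position logic described above (weighted spread, z-score h, target of -h units of the pair) can be sketched standalone with plain NumPy, outside the Quantopian API. The function name and toy prices below are illustrative; the weights and base size are the ones from the post:

```python
import numpy as np

# Cointegrating weights and base size from the post
evec = np.array([0.943, -0.822])   # [EWA, EWC]
unit_shares = 20

def target_positions(price_window, current_prices):
    """Return target share counts for the EWA/EWC pair.

    price_window: (n_days, 2) array of past daily prices
    current_prices: length-2 array of the latest prices
    """
    # Combined (spread) price series weighted by the eigenvector
    comb = price_window @ evec
    # z-score of the current spread against the lookback window
    h = (current_prices @ evec - comb.mean()) / comb.std()
    # Hold -h units of the pair: short the spread when it is rich,
    # long it when it is cheap
    return unit_shares * (-h) * evec

# Toy usage with made-up prices: the current spread is well above its
# window mean, so the EWA leg goes short and the EWC leg goes long
window = np.column_stack([20 + np.linspace(0, 1, 28),
                          21 + np.linspace(0, 1, 28)])
pos = target_positions(window, np.array([21.5, 21.0]))
```

Because the share counts scale linearly with the z-score, a large |h| implies a large order, which is exactly where the fill problems described in the thread appear.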

This price slippage problem will basically kill most pair-trading strategies with a small price spread if you can't buy enough shares at the price you want. Can you give some suggestions on that? Of course, the regulation requiring a minimum of $25K to short would kill the algo as well. If I have to short at least $25K, I am sure the prices at which I short the stock would be all over the place.
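The partial fills in the Transaction Details are consistent with a volume-share slippage model, where each bar fills at most a fixed fraction of that bar's volume and the fill price worsens with the fraction consumed. A rough, self-contained sketch of that idea (the function name, the 25% cap, and the quadratic impact constant are illustrative assumptions, not Quantopian's actual implementation):

```python
def simulate_fill(order_shares, bar_volume, bar_price,
                  volume_limit=0.25, price_impact=0.1):
    """Rough volume-share slippage model (illustrative constants).

    Fills at most `volume_limit` of the bar's volume; the execution
    price moves against the trader quadratically in the share of bar
    volume consumed.
    """
    fillable = min(abs(order_shares), volume_limit * bar_volume)
    volume_share = fillable / bar_volume
    # Impact worsens the price in the direction of the trade
    sign = 1 if order_shares > 0 else -1
    fill_price = bar_price * (1 + sign * price_impact * volume_share ** 2)
    return sign * fillable, fill_price

# Trying to buy 384 shares in a minute bar where only 500 traded:
# only a quarter of the bar's volume (125 shares) fills, at a price
# slightly worse than the quoted 20.38 -- the rest spills into later
# bars, as seen in the Transaction Details above.
shares, price = simulate_fill(384, bar_volume=500, bar_price=20.38)
```

This is why the strategy degrades as unit_shares grows: the order size crosses the per-bar volume cap and execution stretches across several bars at drifting prices.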

Forgot to attach the Full Backtest using context.unit_shares of 200. Here it is. The Transaction Details show the slippage problem.

import numpy as np
import pandas as pd
from collections import deque
from pytz import timezone

window_days = 28 # window length in days
window_minutes = window_days*390 # window length in minutes; each trading day has 390 minutes

def initialize(context):
    context.max_notional = 100000
    context.min_notional = -100000

    context.stocks = [sid(14516), sid(14517)]
    context.evec = [0.943, -0.822]
    context.unit_shares = 200
    context.tickers = [int(str(e).split(' ')[0].strip("Security(")) for e in context.stocks]
    context.prices = pd.DataFrame({k: pd.Series() for k in context.tickers})
    context.previous_datetime = None
    context.new_day = None
    set_commission(commission.PerShare(cost=0.00))

def handle_data(context, data):
    # skip the tick if any orders are open or any stock did not trade
    for stock in context.stocks:
        if bool(get_open_orders(stock)) or data[stock].datetime < get_datetime():
            return

    current_datetime = get_datetime().astimezone(timezone('US/Eastern'))
    # detect a new trading day
    if context.previous_datetime is None or current_datetime.day != context.previous_datetime.day:
        context.new_day = True
    context.previous_datetime = current_datetime

    # warm-up: collect one price row per day until the lookback window is full
    if len(context.prices) < window_days and context.new_day:
        context.previous_datetime = get_datetime().astimezone(timezone('US/Eastern'))
        if intradingwindow_check(context):
            newRow = pd.DataFrame({k: float(data[s].price) for k, s in zip(context.tickers, context.stocks)}, index=[0])
            context.prices = context.prices.append(newRow, ignore_index=True)
            context.new_day = False
    else:
        if intradingwindow_check(context) and context.new_day:
            # z-score of today's combined (spread) price against the window
            comb_price_past_window = np.zeros(len(context.prices))
            for ii, k in enumerate(context.tickers):
                comb_price_past_window += context.evec[ii]*context.prices[k]

            meanPrice = np.mean(comb_price_past_window)
            stdPrice = np.std(comb_price_past_window)
            comb_price = sum([e*data[s].price for e, s in zip(context.evec, context.stocks)])
            h = (comb_price - meanPrice)/stdPrice

            # rebalance each leg to -h units of the pair
            current_amount = []
            cash_spent = []
            for ii, stock in enumerate(context.stocks):
                current_position = context.portfolio.positions[stock].amount
                new_position = context.unit_shares * (-h) * context.evec[ii]
                current_amount.append(new_position)
                cash_spent.append((new_position - current_position)*data[stock].price)
                order(stock, new_position - current_position)
            context.new_day = False

            notionals = []
            for stock in context.stocks:
                notionals.append((context.portfolio.positions[stock].amount*data[stock].price)/context.portfolio.starting_cash)

            log.info("h = {h}, comb_price = {comb_price}, notionals = {notionals}, total = {tot}, "
                     "price0 = {p0}, price1 = {p1}, cash = {cash}, amount = {amount}, new_cash = {nc}".format(
                         h=h, comb_price=comb_price, notionals=notionals,
                         tot=context.portfolio.positions_value + context.portfolio.cash,
                         p0=data[context.stocks[0]].price, p1=data[context.stocks[1]].price,
                         cash=context.portfolio.cash, amount=current_amount,
                         nc=context.portfolio.cash - sum(cash_spent)))

            # roll the window forward by one day
            newRow = pd.DataFrame({k: float(data[s].price) for k, s in zip(context.tickers, context.stocks)}, index=[0])
            context.prices = context.prices.append(newRow, ignore_index=True)
            context.prices = context.prices[1:len(context.prices)]

            record(h=h, mPri=meanPrice)
            record(comb_price=comb_price)
            record(not0=notionals[0], not1=notionals[1])

def intradingwindow_check(context):
    # convert all time zones to US Eastern to avoid confusion
    loc_dt = get_datetime().astimezone(timezone('US/Eastern'))
    if loc_dt.hour > 12 and loc_dt.minute > 0:
        return True
    else:
        return False


Hi peter,

Can I ask how you derived the weights for the two stocks?
Which period of data did you use to derive them?
Thanks.

I believe the Johansen test is the answer for determining the correct weights. See this blog post for an example: https://robotwealth.com/exploring-mean-reversion-and-cointegration-part-2/ and this general post on pairs trading: http://epchan.blogspot.com/2006/11/cointegration-is-not-same-as.html

@Jianwei Wang

You can use the slope of a linear regression between EWA and EWC to determine the weights of the two stocks. Another approach, rightly pointed out above, is to use the Johansen test's eigenvectors as weights. More details about these methods can be found in the course by Dr. Ernie Chan on mean reversion.

A pitfall alert: don't use the full data set to determine the weights, as you will introduce look-ahead bias into your backtesting results.
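The look-ahead point can be made concrete: estimate the hedge ratio only on data available up to each rebalance date. A minimal walk-forward sketch using an expanding OLS fit (NumPy only; the series and function names are illustrative):

```python
import numpy as np

def hedge_ratio(y, x):
    """OLS slope of y regressed on x (with intercept)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

def walk_forward_weights(ewa, ewc, min_train=100):
    """Weight used on day t is fit on days [0, t) only,
    so no future prices leak into the backtest."""
    weights = np.full(len(ewa), np.nan)
    for t in range(min_train, len(ewa)):
        weights[t] = hedge_ratio(ewa[:t], ewc[:t])
    return weights

# Toy data: ewa tracks ewc with a true slope of about 0.9
rng = np.random.default_rng(1)
ewc = np.cumsum(rng.normal(size=300)) + 50
ewa = 0.9 * ewc + rng.normal(scale=0.3, size=300)
w = walk_forward_weights(ewa, ewc)  # w[:100] stays NaN (no training data)
```

Fitting on the full series instead would give each day a weight computed partly from prices that had not yet occurred, which is exactly the bias warned about above.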