Simple Pairs Trading Strategy: BP & Shell

Regression dates: 2014 - 2018

163
Backtest from to with initial capital
Total Returns
--
Alpha
--
Beta
--
Sharpe
--
Sortino
--
Max Drawdown
--
Benchmark Returns
--
Volatility
--
 Returns 1 Month 3 Month 6 Month 12 Month
 Alpha 1 Month 3 Month 6 Month 12 Month
 Beta 1 Month 3 Month 6 Month 12 Month
 Sharpe 1 Month 3 Month 6 Month 12 Month
 Sortino 1 Month 3 Month 6 Month 12 Month
 Volatility 1 Month 3 Month 6 Month 12 Month
 Max Drawdown 1 Month 3 Month 6 Month 12 Month
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.stattools as ts
import pandas as pd

# ~~~~~~~~~~~~~~~~~~~~~~ TESTS FOR FINDING PAIR TO TRADE ON ~~~~~~~~~~~~~~~~~~~~~~
"""
Augmented Dickey–Fuller (ADF) unit root test
Source: http://www.pythonforfinance.net/2016/05/09/python-backtesting-mean-reversion-part-2/
"""

def __init__(self):
self.p_value = None
self.five_perc_stat = None
self.perc_stat = None
self.p_min = .0
self.p_max = .05
self.look_back = 63

self.p_value = model[1]
self.five_perc_stat = model[4]['5%']
self.perc_stat = model[0]

def use_P(self):
return (self.p_value > self.p_min) and (self.p_value < self.p_max)

def use_critical(self):
return abs(self.perc_stat) > abs(self.five_perc_stat)

"""
# DEPRECATED
class KPSS(object):
#Kwiatkowski-Phillips-Schmidt-Shin (KPSS) stationarity tests
def __init__(self):
Exception("Not implemented yet")
self.p_value = None
self.ten_perc_stat = None
self.perc_stat = None
self.p_min = 0.0
self.p_max = 0.2
self.look_back = 50

def apply_kpss(self, time_series):
self.five_perc_stat = ts.adfuller(time_series, 1)[4]['5%'] # possibly make this 10%

def use(self):
return (self.p_value > self.p_min) and (self.p_value < self.p_max) and (self.perc_stat > self.five_perc_stat)
"""

class Half_Life(object):
"""
Half Life test from the Ornstein-Uhlenbeck process
Source: http://www.pythonforfinance.net/2016/05/09/python-backtesting-mean-reversion-part-2/
"""

def __init__(self):
self.hl_min = 1.0
self.hl_max = 42.0
self.look_back = 43
self.half_life = None

def apply_half_life(self, time_series):
lag = np.roll(time_series, 1)
lag[0] = 0
ret = time_series - lag
ret[0] = 0

# adds intercept terms to X variable for regression

model = sm.OLS(ret, lag2)
res = model.fit()

self.half_life = -np.log(2) / res.params[1]

def use(self):
return (self.half_life < self.hl_max) and (self.half_life > self.hl_min)

class Hurst():
"""
If Hurst Exponent is under the 0.5 value of a random walk, then the series is mean reverting
Source: https://www.quantstart.com/articles/Basics-of-Statistical-Mean-Reversion-Testing
"""

def __init__(self):
self.h_min = 0.0
self.h_max = 0.4
self.look_back = 126
self.lag_max = 100
self.h_value = None

def apply_hurst(self, time_series):
"""Returns the Hurst Exponent of the time series vector ts"""
# Create the range of lag values
lags = range(2, self.lag_max)

# Calculate the array of the variances of the lagged differences
tau = [np.sqrt(np.std(np.subtract(time_series[lag:], time_series[:-lag]))) for lag in lags]

# Use a linear fit to estimate the Hurst Exponent
poly = np.polyfit(np.log10(lags), np.log10(tau), 1)

# Return the Hurst exponent from the polyfit output
self.h_value = poly[0]*2.0

def use(self):
return (self.h_value < self.h_max) and (self.h_value > self.h_min)

# ~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS FOR FILING AN ORDER ~~~~~~~~~~~~~~~~~~~~~~
def hedge_ratio(Y, X):
# Look into using Kalman Filter to calculate the hedge ratio
model = sm.OLS(Y, X).fit()
return model.params[1]

def softmax_order(stock_1_shares, stock_2_shares, stock_1_price, stock_2_price):
stock_1_cost = stock_1_shares * stock_1_price
stock_2_cost = stock_2_shares * stock_2_price
costs = np.array([stock_1_cost, stock_2_cost])
return np.exp(costs) / np.sum(np.exp(costs), axis=0)

def initialize(context):
"""
Called once at the start of the algorithm.
"""

context.asset_pairs = [[symbol('BP'), symbol('RDS.A'), {'in_short': False, 'in_long': False, 'spread': np.array([]), 'hedge_history': np.array([])}]]
#[symbol('YUM'), symbol('MCD'), {'in_short': False, 'in_long': False, 'spread': np.array([]), 'hedge_history': np.array([])}]]
context.z_back = 20
context.hedge_lag = 2
context.entry_z = 0.5

schedule_function(my_handle_data, date_rules.every_day(),
time_rules.market_close(hours=4))
# Typical slippage and commision I have seen others use and is used in templates by Quantopian
set_slippage(slippage.VolumeShareSlippage(volume_limit=0.025, price_impact=0.1))

def my_handle_data(context, data):
"""
Called every day.
"""

if get_open_orders():
return

for i in range(len(context.asset_pairs)):
pair = context.asset_pairs[i]
new_pair = process_pair(pair, context, data)
context.asset_pairs[i] = new_pair

def process_pair(pair, context, data):
"""
Main function that will execute an order for every pair.
"""

# Get stock data
stock_1 = pair[0]
stock_2 = pair[1]
prices = data.history([stock_1, stock_2], "price", 300, "1d")
stock_1_P = prices[stock_1]
stock_2_P = prices[stock_2]
in_short = pair[2]['in_short']
in_long = pair[2]['in_long']
hedge_history = pair[2]['hedge_history']

# Get hedge ratio (look into using Kalman Filter)
try:
hedge = hedge_ratio(stock_1_P, stock_2_P)
except ValueError as e:
log.error(e)

hedge_history = np.append(hedge_history, hedge)

if hedge_history.size < context.hedge_lag:
log.debug("Hedge history too short!")

hedge = hedge_history[-context.hedge_lag]
spread, stock_1_P[-1] - hedge * stock_2_P[-1])

half_life = Half_Life()
hurst = Hurst()

# Check if current window size is large enough for adf, half life, and hurst exponent

# possible "SVD did not converge" error because of OLS
try:
except:

# Check if they are in fact a stationary (or possibly trend stationary...need to avoid this) time series
# * Only cancel if all measures believe it isn't stationary
if not adf.use_P() and not adf.use_critical() and not half_life.use() and not hurst.use():
if in_short or in_long:
# Enter logic here for how to handle open positions after mean reversion
log.info('Tests have failed. Exiting open positions')
order_target(stock_1, 0)
order_target(stock_2, 0)
in_short = in_long = False

log.debug("Not Stationary!")

# Check if current window size is large enough for Z score

# Record measures
if stock_1 == sid(5061):
record(Z_tech=z_score)
record(Hedge_tech=hedge)
else:
record(Z_Score=z_score)
record(Hedge_Ratio=hedge)

# Close order logic
if in_short and z_score < 0.0:
order_target(stock_1, 0)
order_target(stock_2, 0)
in_short = False
in_long = False
elif in_long and z_score > 0.0:
order_target(stock_1, 0)
order_target(stock_2, 0)
in_short = False
in_long = False

# Open order logic
if (z_score < -context.entry_z) and (not in_long):
stock_1_shares = 1
stock_2_shares = -hedge
in_long = True
in_short = False
(stock_1_perc, stock_2_perc) = softmax_order(stock_1_shares, stock_2_shares, stock_1_P[-1], stock_2_P[-1])
order_target_percent(stock_1, stock_1_perc)
order_target_percent(stock_2, stock_2_perc)
elif z_score > context.entry_z and (not in_short):
stock_1_shares = -1
stock_2_shares = hedge
in_short = True
in_long = False
(stock_1_perc, stock_2_perc) = softmax_order(stock_1_shares, stock_2_shares, stock_1_P[-1], stock_2_P[-1])
order_target_percent(stock_1, stock_1_perc)
order_target_percent(stock_2, stock_2_perc)

return [stock_1, stock_2, {'in_short': in_short, 'in_long': in_long, 'spread': spread, 'hedge_history': hedge_history}]
There was a runtime error.
3 responses

Thanks for posting this. A good example of cointegration at work.

Having previously worked in Oil & Gas Exploration & Production (O&G E&P) myself for many years, while modeling financial markets was also my hobby outside of my "day job" before i "retired", i was quite interested to look at a few others including not only the publicly listed Major operating companies (OpCos) such as BP, XOM, CVX, some of the smaller companies such as APA, DVN, etc, as well as the oilfield service companies SLB, HAL, etc, and various oil-related ETFs, and of course the prices of the underlying commodities Oil, Gas, and refined products.

By looking at financial entities of similar size and operational profiles where many of the same factors are at work in both, these should be some very nice low-risk market-neutral trades, as you have identified. The problem that i found was that the cointegration relationships that hold so well for periods of limited duration (e.g. the last 3 years in your example certainly look great) but can also break down suddenly with little or no warning. Sometimes the unpredictable reasons are readily apparent (e.g. a platform fire or tanker oil spill for example), at other times the reason is less clear (e.g. Haliburton's involvement in various non-oilfield operations), and sometimes the reason for the breakdown in the relationship is not clear at all, perhaps related to "investor sentiment" as much as anything else. Whatever the reasons, the caveat is that the nice relationships that are so carefully found can also suddenly disappear without much warning. But nevertheless, still potentially good opportunities. Nice example. Cheers, best regards.

Nice work! For reasons Tony mentioned, most people add a good news feed as a filter to improve performance. Stocks are more likely to mean revert if there is no news.

I’m also very impressed. Thanks for sharing!