Oil Shares Pair Trade based on Kalman Filter & Mahalanobis Distance

Hi, this is my first post on Quantopian. I would welcome suggestions on my strategy and results.

My strategy attempts to take advantage of the market psychology around oil price movements in recent years. Greed and fear have led to close scrutiny of oil stocks, which has likely also led to stronger co-integration among them.

A Kalman filter is applied to estimate the beta, and hence the hedge ratio, between two oil & gas equipment and services companies, Halliburton (HAL) and Schlumberger (SLB). A co-integration test verified the pair, and it was one of the best pairs I could find. I also coded the Kalman filter myself for the sake of learning, and it ran appreciably faster than the pykalman module in my notebook.

The transition and measurement noise covariances are estimated using the sum-of-innovations chi-square test and verified with an innovation sequence chart and an autocorrelation test on the innovation residuals.
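
A minimal sketch of the two innovation checks described above, assuming the scalar innovations e_t and their variances S_t have been collected from the filter into arrays. The function names and the normal approximation to the chi-square acceptance band are assumptions made for illustration, not the notebook code used for this post.

```python
import numpy as np

def innovation_chi_square(e, S, z=1.96):
    """Sum-of-innovations chi-square check for scalar innovations.

    If Q and R are well tuned, e_t**2 / S_t is approximately chi-square
    with 1 degree of freedom, so the sum over N steps has mean N and
    variance 2N. A normal approximation gives the acceptance band.
    """
    e = np.asarray(e, dtype=float)
    S = np.asarray(S, dtype=float)
    nis_sum = float(np.sum(e ** 2 / S))
    n = e.size
    half_width = z * np.sqrt(2.0 * n)
    lower, upper = float(n - half_width), float(n + half_width)
    return nis_sum, (lower, upper), bool(lower <= nis_sum <= upper)

def lag1_autocorrelation(e):
    """Lag-1 sample autocorrelation of the innovations (near 0 if white)."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return float(np.dot(e[:-1], e[1:]) / np.dot(e, e))

# Synthetic example: white innovations with exactly the variance claimed by S.
rng = np.random.default_rng(0)
S = np.full(2000, 0.25)
e = rng.normal(0.0, np.sqrt(S))
nis_sum, band, consistent = innovation_chi_square(e, S)
rho1 = lag1_autocorrelation(e)
print(nis_sum, band, consistent, rho1)
```

A sum well above the band suggests the noise covariances are set too small (the filter is overconfident), and a sum well below suggests they are too large. A clearly non-zero autocorrelation suggests the filter is leaving structure in the residuals.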

Mahalanobis distance, which falls out of the Kalman filter mathematics, is used to compute leverage. It is an attempt to apply leverage selectively. The equation “y = A/x + B” is used to keep the leverage ratio between roughly 1 and 2, to avoid excessive leverage and meet the Q fund requirement. It performed better than sigmoid-shaped functions.
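
As a worked example of how the coefficients of y = A/x + B set the leverage range, the sketch below solves A and B from a chosen minimum distance and leverage bounds. The helper names are hypothetical, and taking x_min = 1.05 (the entry threshold, since for a scalar observation the Mahalanobis distance is |e|/sqrt(S)) is an assumption for illustration.

```python
def fit_reciprocal_leverage(x_min, lev_floor, lev_cap):
    """Return (A, B) such that y = A/x + B equals lev_cap at x = x_min
    and decays toward lev_floor as x grows."""
    B = lev_floor
    A = (lev_cap - lev_floor) * x_min
    return A, B

def reciprocal_leverage(x, A, B):
    """Leverage for Mahalanobis distance x under y = A/x + B."""
    return A / x + B

# Coefficients that cap leverage at exactly 2 for a trade entered at x = 1.05.
A, B = fit_reciprocal_leverage(x_min=1.05, lev_floor=1.0, lev_cap=2.0)

# Compare with the coefficients used in the posted algorithm (A = 1.1, B = 1.1).
for x in (1.05, 1.5, 3.0):
    print(x, round(reciprocal_leverage(x, A, B), 3),
          round(reciprocal_leverage(x, 1.1, 1.1), 3))
```

With the posted coefficients A = 1.1 and B = 1.1, a trade entered right at the 1.05 threshold gets a leverage of about 2.15, slightly above 2, and the leverage decays toward 1.1 as the distance grows.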

For strategies with a similar mean leverage ratio, I noted that leveraging shorter-distance trades generates a higher return and Sharpe, since such trades occur more frequently. However, it is riskier and could lead to a larger max drawdown. Longer-distance trades seem safer but do not occur frequently enough to beat the market.

In this backtest, the unleveraged formula is able to track the market return with 0.02 beta and a max drawdown of 9.4%. For a mean leverage of ~1.85 (similar to the backtest result below), the algorithm can be pushed to a 200% return at the cost of an 18.5% max drawdown by using the commented-out leverage formula. The strategy does not work well before 2011, except in 2008.

I also use a small margin of safety (entry at 1.05 × std. dev.) to avoid excessive trading, which erodes profit through commission fees.
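
The entry and exit bands can be sketched as a small state function on the standardized innovation z = e / sqrt(S). The function name and signature are hypothetical. The thresholds (enter beyond 1.05, exit inside 1.0) are the ones used in the algorithm code in this post.

```python
def next_position(position, z, entry=1.05, exit_=1.0):
    """Position state after observing standardized innovation z = e / sqrt(S).

    position is None, 'long' or 'short'. A trade opens only beyond the
    entry band and closes once z is back inside the exit band, so small
    oscillations around a single threshold do not trigger repeated orders.
    """
    if position == 'long':
        return None if z > -exit_ else 'long'
    if position == 'short':
        return None if z < exit_ else 'short'
    if z < -entry:
        return 'long'
    if z > entry:
        return 'short'
    return None

# Walk a short path of standardized innovations through the state function.
path = [-0.9, -1.02, -1.10, -1.03, -0.98, 1.2, 1.01, 0.5]
state, states = None, []
for z in path:
    state = next_position(state, z)
    states.append(state)
print(states)
```

Widening the gap between `entry` and `exit_` reduces turnover further, at the cost of entering fewer trades.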

Questions:

1. How can I further improve the quality of trade entries and exits? I want to learn how to squeeze more alpha/Sharpe from a pair. Would an adaptive Kalman filter or an OU process work?
2. How does code performance/speed affect trade execution, especially for minute-frequency trading?
3. Q fund worthy? :P
import numpy as np

def initialize(context):
    set_slippage(slippage.VolumeShareSlippage(volume_limit=0.025, price_impact=0.10))

    context.S1 = sid(3443)  # HAL, y
    context.S2 = sid(6928)  # SLB, x

    context.model = myKalman(init_mean=np.zeros((2, 1)),
                             init_covar=np.ones((2, 2)),
                             transit_mat=np.eye(2),
                             transit_covar=3.5e-5 * np.eye(2),
                             observe_mat=None,
                             observe_covar=3.5e-3,
                             transit_offset=0,
                             observe_offset=0)

    context.in_position = None

def handle_data(context, data):

    Model = context.model
    Model.H = np.array([data[context.S2].price, 1.0]).reshape((1, 2))
    means, covars = Model.myFilter(data[context.S1].price)

    # The observation is scalar, so pull e and sqrt(S) out of their 1x1 arrays.
    e = Model.e.item()
    sigma = np.sqrt(Model.S).item()

    # Mahalanobis distance (square root of the innovation chi-square statistic)
    maha_dist = np.sqrt(Model.e.T.dot(np.linalg.inv(Model.S)).dot(Model.e)).item()

    if context.in_position is not None:

        # Exit once the innovation is back inside the 1-sigma band.
        if (context.in_position == 'long' and e > -1.0 * sigma) or \
           (context.in_position == 'short' and e < 1.0 * sigma):

            # order(asset, 0) submits a zero-share order and leaves the position
            # open, so target a 0% weight to close both legs.
            order_target_percent(context.S1, 0.0)
            order_target_percent(context.S2, 0.0)

            context.in_position = None

            log.info(' %s ' % context.in_position)
            log.info(str(context.account.leverage) + ' 0' + ' 0')
            record(lev=context.account.leverage, s1=0, s2=0)

    else:

        # Hedge ratio: the first state component. Passing the whole 2x1 state
        # vector would turn the order percentages into arrays.
        beta = means[0, 0]

        # Requirement to enter a long position
        if e < -1.05 * sigma:

            lev1 = Dist2Leverage(maha_dist)
            percentage_of_S1, percentage_of_S2 = PairPercentOrder(1, beta,
                                                                  data[context.S1].price,
                                                                  data[context.S2].price,
                                                                  lev1)

            order_target_percent(context.S1, percentage_of_S1)
            order_target_percent(context.S2, -percentage_of_S2)

            context.in_position = 'long'

            log.info(' %s ' % context.in_position)
            log.info(str(context.account.leverage) + ' ' +
                     str(percentage_of_S1) + ' ' +
                     str(percentage_of_S2))

            record(lev=context.account.leverage, s1=percentage_of_S1, s2=percentage_of_S2)

        # Requirement to enter a short position
        elif e > 1.05 * sigma:

            lev2 = Dist2Leverage(maha_dist)
            percentage_of_S1, percentage_of_S2 = PairPercentOrder(1, beta,
                                                                  data[context.S1].price,
                                                                  data[context.S2].price,
                                                                  lev2)

            order_target_percent(context.S1, -percentage_of_S1)
            order_target_percent(context.S2, percentage_of_S2)

            context.in_position = 'short'

            log.info(' %s ' % context.in_position)
            log.info(str(context.account.leverage) + ' ' +
                     str(percentage_of_S1) + ' ' +
                     str(percentage_of_S2))

            record(lev=context.account.leverage, s1=percentage_of_S1, s2=percentage_of_S2)

# Kalman filter class
class myKalman(object):

    def __init__(self, init_mean, init_covar, transit_mat, transit_covar,
                 observe_mat, observe_covar, transit_offset, observe_offset):
        self.Uo = init_mean
        self.Po = init_covar
        self.F = transit_mat
        self.Q = transit_covar
        self.H = observe_mat
        self.R = observe_covar
        self.b = transit_offset
        self.d = observe_offset
        self.xt_t = None
        self.Pt_t = None

    def myFilter(self, Z):

        if self.xt_t is not None and self.Pt_t is not None:
            # Predict phase
            self.xt_t1 = self.F.dot(self.xt_t) + self.b
            self.Pt_t1 = self.F.dot(self.Pt_t).dot(self.F.T) + self.Q

        else:
            # Initialize
            self.xt_t1 = self.Uo
            self.Pt_t1 = self.Po

        # Innovation (the observation offset d shifts the predicted
        # observation; it does not belong in the predicted covariance)
        self.e = Z - self.H.dot(self.xt_t1) - self.d
        self.S = self.H.dot(self.Pt_t1).dot(self.H.T) + self.R

        # Kalman gain
        self.K = self.Pt_t1.dot(self.H.T).dot(np.linalg.inv(self.S))

        # Update phase
        self.xt_t = self.xt_t1 + self.K.dot(self.e)
        self.Pt_t = self.Pt_t1 - self.K.dot(self.H).dot(self.Pt_t1)

        return self.xt_t, self.Pt_t

# Calculate the percentage of the portfolio to order for each leg of the pair
def PairPercentOrder(ratio1, ratio2, price1, price2, Z):
    p1_percent = (abs(Z) * ratio1 * price1) / (abs(ratio1 * price1) + abs(price2 * ratio2))
    p2_percent = (abs(Z) * ratio2 * price2) / (abs(ratio1 * price1) + abs(price2 * ratio2))
    return p1_percent, p2_percent

# Convert Mahalanobis distance to leverage
def Dist2Leverage(x):

    #y = 1  # control
    #y = 0.55/x + 1.75  # leverage about 2, max drawdown 18.5%, return 200%, Sharpe = 1.78
    y = 1.1/x + 1.1
    return y
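
For experimenting with the noise settings outside the backtester, here is a minimal offline sketch of the same filter recursion run on synthetic prices. It assumes Python 3 with a recent NumPy. The function name, the synthetic price model, and its parameters (0.6, 5.0 and the noise levels) are all made up for illustration and are not HAL/SLB data.

```python
import numpy as np

def kalman_regression(y, x, q=3.5e-5, r=3.5e-3):
    """Track [beta, intercept] in y_t = beta_t * x_t + alpha_t + noise.

    Random-walk state (F = I), Q = q * I, scalar R = r, with the same
    initial mean and covariance as the algorithm above.
    """
    state = np.zeros((2, 1))
    P = np.ones((2, 2))
    Q = q * np.eye(2)
    states, innovations, variances = [], [], []
    for t in range(len(y)):
        if t > 0:
            P = P + Q                      # predict step with F = I
        H = np.array([[x[t], 1.0]])
        e = y[t] - (H @ state).item()      # innovation
        S = (H @ P @ H.T).item() + r       # innovation variance
        K = P @ H.T / S                    # Kalman gain, shape (2, 1)
        state = state + K * e              # update the state estimate
        P = P - K @ H @ P                  # update the covariance
        states.append(state.ravel().copy())
        innovations.append(e)
        variances.append(S)
    return np.array(states), np.array(innovations), np.array(variances)

# Synthetic pair: x is a random walk, y is a noisy linear function of x.
rng = np.random.default_rng(42)
n = 1500
x = 70.0 + np.cumsum(rng.normal(0.0, 0.5, n))
y = 0.6 * x + 5.0 + rng.normal(0.0, 0.3, n)

states, e, S = kalman_regression(y, x)
beta_hat, alpha_hat = states[-1]
print(beta_hat, alpha_hat, beta_hat * x[-1] + alpha_hat, y[-1])
```

Because prices sit far from zero, the data separate beta and the intercept only weakly, so it is the fitted value beta * x + intercept, rather than either coefficient alone, that tends to track y closely. The returned innovations and variances can be used to compare different q and r settings.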

3 responses

Maybe try checking whether a momentum or MA indicator would help to enter and exit trades with better timing?
I am willing to cooperate if you want...

Hi Gabriel,

I am happy to collaborate with anyone. Momentum is a known "anomaly" with respect to the EMH. It would be interesting if we could incorporate it into the model.

However, how are we going to measure momentum effectively? Are a long and a short MA sufficient?
I am worried about over-fitting the MA periods if we use that approach.

Hey guys,
I am thinking of migrating the code to the new syntax.
Any punters on it?