Kalman filter

Trying to code my first Python class. Should I be naming the updateState() method handle_add() instead? I got confused by the function decorators (https://github.com/quantopian/zipline/blob/master/zipline/transforms/utils.py), started googling those, but it's a bit late so I'll check it out tomorrow.

Any tips on how to fit this before I run it on out-of-sample data? I'm not too familiar with the Python modules.

Notation (sort of) follows Shumway and Stoffer's book "Time Series Analysis and Its Applications: With R Examples."

import numpy as np
from zipline.transforms.utils import EventWindow

class Kalman(EventWindow):
    def __init__(self, mu0, Sigma0, PHIfit, Qfit, Rfit):
        # initial state guesses are mu0, Sigma0
        # (assumed to be np.matrix objects, so `*` is a matrix product)
        # note we are not assuming Bayesian priors...just fixed values
        self.X = mu0
        self.P = Sigma0

        # things that should be from a fit are PHI, Q, and R
        self.PHI = PHIfit
        self.Q = Qfit
        self.R = Rfit
        # NOTE: the observation matrix self.A used below is never set in this version

    # some convenience functions
    def predictState(self, prevX):
        return self.PHI * prevX

    def predictStateCov(self, prevP):
        return self.PHI * prevP * np.transpose(self.PHI) + self.Q

    def predictObserv(self, prevX):
        return self.A * self.predictState(prevX)

    def predictObservCov(self, prevP):
        return (self.A * self.predictStateCov(prevP) * np.transpose(self.A)) + self.R

    # this is the guy that gets called every minute
    def updateState(self, observedY):
        predX = self.predictState(self.X)
        predP = self.predictStateCov(self.P)
        GAIN = predP * np.transpose(self.A) * ((self.A * predP * np.transpose(self.A)) + self.R).getI()
        innov = observedY - self.predictObserv(self.X)
        I = np.eye(self.X.shape[0])
        self.X = predX + GAIN * innov
        self.P = (I - (GAIN * self.A)) * predP

def initialize(context):
    context.stock = sid(26578)

4 responses

Hi Taylor,

The algorithm needs to have a top level handle_data method. So, you could instantiate your Kalman object in the initialize, and then pass updates into it from handle_data. Could you explain the parameters expected by the Kalman class' init and the updateState methods?

You would do something like this:

# assuming the Kalman class above is defined first in the script

def initialize(context):
    context.stock = sid(26578)
    # construct the Kalman class
    context.kalman = Kalman(...)

def handle_data(context, data):
    # pass data to Kalman to update state
    pass

It seems like updateState is expecting matrix parameters - you may be able to use a batch transform to build those parameters. The algorithm would look something like this:

# assuming the Kalman class above is defined first in the script

def initialize(context):
    context.stock = sid(26578)
    # construct the Kalman class
    context.kalman = Kalman(...)

def handle_data(context, data):
    # pass data to Kalman to update state
    update_kalman(data, context.kalman)

@batch_transform(refresh_period=1, window_length=20)
def update_kalman(datapanel, kalman):
    # do stuff to prices to calculate the parameters for the update
    pass

thanks,
fawce

:::So, you could instantiate your Kalman object in the initialize, and then pass updates into it from handle_data.

Right. Just worried about defining the class right now.

:::It seems like updateState is expecting matrix parameters

Yeah. And batch transforms return pandas panels, right? So inside update_kalman() I'd have to convert some auxiliary pandas stuff to matrices, then make the call to updateState, and then it returns more panel objects? Damn. You guys made the right choice using panels, but state space models are a bit difficult without matrices.

def batch_transform(func):
    """Decorator function to use instead of inheriting from BatchTransform.
    For an example on how to use this, see the doc string of BatchTransform.
    """

    def create_window(*args, **kwargs):
        # passes the user defined function to BatchTransform which it
        # will call instead of self.get_value()
        return BatchTransform(*args, func=func, **kwargs)

    return create_window
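
For anyone else puzzled by the decorator above, here is a minimal, self-contained sketch of the same pattern in plain Python. `Window`, `window_transform` and `mean_price` are hypothetical names made up for illustration; `Window` is only a toy stand-in and is not how zipline's BatchTransform is actually implemented.

```python
class Window(object):
    """Hypothetical stand-in for BatchTransform: keeps the last
    `window_length` values and calls the user function on them."""

    def __init__(self, window_length=1, func=None):
        self.window_length = window_length
        self.func = func
        self.rows = []

    def handle_data(self, row, *args):
        # keep only the most recent window_length rows
        self.rows.append(row)
        self.rows = self.rows[-self.window_length:]
        if len(self.rows) < self.window_length:
            return None  # window not full yet
        return self.func(list(self.rows), *args)


def window_transform(func):
    """Same shape as batch_transform above: the decorated name becomes
    a factory that builds a Window wrapping the original function."""

    def create_window(*args, **kwargs):
        return Window(*args, func=func, **kwargs)

    return create_window


@window_transform
def mean_price(rows):
    return sum(rows) / float(len(rows))


# mean_price is now the factory, so calling it builds a Window
w = mean_price(window_length=3)
results = [w.handle_data(p) for p in [10.0, 11.0, 12.0, 13.0]]
print(results)
```

The point is that after decoration the name no longer refers to the plain function: calling it with the window settings returns an object, and that object calls the original function once enough data has arrived.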


:::Could you explain the parameters expected by the Kalman class' init and the updateState methods?

State space models have a latent variable equation and an observed variable equation. X is the latent state (think of it as a filtered true price/price vector), and P is the covariance of that filtered price/price vector. Both of these are conditional on all the observations (Y) that have been seen so far. The notation is a bit deprecated. The Kalman filter algorithm updates these two quantities every minute, using the Kalman filter equations. Deriving those equations requires Bayes' rule, plus the theorem that gives the conditional distribution of one block of a jointly Gaussian vector given another block.

In the latent equation, X is assumed to be Markovian. Phi is its transition matrix, which is assumed to be fixed here. Q is the covariance of the error term.

R is the covariance of the error term in the observation equation. A would've been the matrix that transforms the X into the Y, but I think I forgot to define it.
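
To make the description above concrete, here is a minimal sketch of one predict/update cycle using plain NumPy arrays and the `@` operator rather than np.matrix. The function name `kalman_step`, the argument order, and the toy numbers in the demo are assumptions for illustration only; the symbols X, P, Phi, Q, A and R follow the description above.

```python
import numpy as np


def kalman_step(x, P, y, A, Phi, Q, R):
    """One Kalman filter cycle for the (assumed) linear Gaussian model
        x_t = Phi x_{t-1} + w_t,   w_t ~ N(0, Q)   (latent equation)
        y_t = A x_t + v_t,         v_t ~ N(0, R)   (observation equation)
    x is a (p, 1) column, y is a (q, 1) column."""
    # prediction step
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Q

    # innovation (forecast error) and its covariance
    innov = y - A @ x_pred
    S = A @ P_pred @ A.T + R

    # Kalman gain, then the filtered state and covariance
    K = P_pred @ A.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innov
    P_new = (np.eye(x.shape[0]) - K @ A) @ P_pred
    return x_new, P_new, innov, S


# Toy demo: one-dimensional random walk observed with noise.
# All numbers here are made up, not fitted to anything.
Phi = np.array([[1.0]])
A = np.array([[1.0]])
Q = np.array([[0.01]])
R = np.array([[1.0]])

x = np.array([[10.0]])  # initial state guess (mu0)
P = np.array([[1.0]])   # initial covariance guess (Sigma0)

for price in [10.0, 10.2, 9.9, 10.1]:
    y = np.array([[price]])
    x, P, innov, S = kalman_step(x, P, y, A, Phi, Q, R)
    print(price, x[0, 0], P[0, 0])
```

The innovation and its covariance are returned as well because a Gaussian likelihood for Phi, Q and R is usually built from exactly those two quantities, which is one way to approach the fitting question.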

Hi,

Batch transforms receive a pandas panel, which is a keyed set of dataframes. The dataframe has a method to convert to a numpy matrix (doco here).
Your batch can return anything you wish, so you could return the matrix, or some output from the Kalman class.
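
As a rough illustration of that conversion, the sketch below builds a small price DataFrame by hand and pulls out NumPy arrays from it. The tickers "AAA" and "BBB" and the prices are hypothetical, and the frame is only assumed to resemble the per-field frame a batch transform would hand you (rows are timestamps, columns are securities).

```python
import numpy as np
import pandas as pd

# Hypothetical window of prices: rows are timestamps, columns are securities
prices = pd.DataFrame(
    {"AAA": [10.0, 10.5, 11.0], "BBB": [20.0, 19.5, 19.0]},
    index=pd.date_range("2013-01-01", periods=3, freq="D"),
)

# .values gives the underlying 2-D NumPy array, shape (timestamps, securities)
window = prices.values

# The most recent row, reshaped into a column vector that could serve
# as the observation Y for a filter update
latest_y = window[-1].reshape(-1, 1)

print(window.shape)
print(latest_y)
```

A column vector like `latest_y` is the shape a matrix-based update would expect for a single observation.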

thanks,
fawce

maybe something like this? not sure I understand everything completely.

# Put any initialization logic here.  The context object will be passed to
# the other methods in your algorithm.
import numpy as np

class Kalman(object):
    def __init__(self, mu0, Sigma0, Aguess, PHIfit, Qfit, Rfit):
        # all arguments are assumed to be np.matrix objects,
        # so `*` is a matrix product and .getI() is the inverse
        self.X = mu0
        self.P = Sigma0
        self.A = Aguess
        self.PHI = PHIfit
        self.Q = Qfit
        self.R = Rfit

    # some convenience functions
    def predictState(self):
        return self.PHI * self.X

    def predictStateCov(self):
        return self.PHI * self.P * np.transpose(self.PHI) + self.Q

    def predictObserv(self):
        return self.A * self.predictState()

    def predictObservCov(self):
        return (self.A * self.predictStateCov() * np.transpose(self.A)) + self.R

    # called with each new observation to update X and P
    def updateState(self, observedY):
        predX = self.predictState()
        predP = self.predictStateCov()
        GAIN = predP * np.transpose(self.A) * ((self.A * predP * np.transpose(self.A)) + self.R).getI()
        innov = observedY - self.predictObserv()
        I = np.eye(self.X.shape[0])
        self.X = predX + (GAIN * innov)
        self.P = (I - (GAIN * self.A)) * predP

def initialize(context):
    context.stock = sid(26578)
    context.max_notional = 1000000.1
    context.min_notional = -1000000.0

# Will be called on every trade event for the securities you specify.
def handle_data(context, data):
    price = data[context.stock].price
    notional = context.portfolio.positions[context.stock].amount * price

# new data is coming in, and it must be used to adjust the latent state estimate
@batch_transform(refresh_period=1, window_length=1)
def update_kalman(datapanel, kalmanObj, sid):
    # index by the sid argument (not the string 'sid') and take the NumPy values
    priceNOW = datapanel['price'][sid].values    # this assumes the dimension of the Y vector
