Using a Hidden Markov Model and a Support Vector Machine to Detect Market Regimes

Here is a notebook I have created with an overview of two methods. The HMM was lifted from a post on QuantStart. The SVM is my own. I will post two backtests shortly.

6 responses

Here is a backtest with a one-class SVM (OCSVM) on SPY. The algo holds SPY when the OCSVM detects a "normal" regime and holds cash when it detects an anomaly. Note how cash is held during the 2008 and 2011 downturns and how, even with SPY as the only asset, the strategy lowers beta by 30%.

import pandas as pd
import numpy as np
import random
import quantopian.algorithm as algo
from sklearn import hmm
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import PolynomialFeatures


def initialize(context):
    set_slippage(slippage.FixedBasisPointsSlippage(basis_points=0.0, volume_limit=1.0))
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0.00))
    

    # schedule_function(my_rebalance_hmm,
    #                   date_rules.every_day(),
    #                   time_rules.market_open(hours = 0, minutes = 1))
    
    schedule_function(my_rebalance_ocsvm,
                      date_rules.week_start(),
                      time_rules.market_open(hours = 0, minutes = 1))

    # schedule_function(train_hmm, 
    #                   date_rules.month_start(),
    #                   time_rules.market_open())

    schedule_function(train_ocsvm, 
                      date_rules.month_start(),
                      time_rules.market_open())

    
    context.train_done_once = False
    context.model = None
    context.means_prior = None
    context.predictions = [1,1,1,1,1]
            
    
def before_trading_start(context, data):    
    if not context.train_done_once:
        # train_hmm(context, data)
        train_ocsvm(context, data)
        
    

def train_hmm(context, data):
    spy = sid(8554)
    df = data.history(spy, ['price', 'volume'], 1000, '1d')
    rets = df['price'].pct_change()[1:]
    volume = df['volume'][1:]

    context.model = hmm.GaussianHMM(n_components=2, covariance_type="full", n_iter=1000)
    X = np.column_stack([rets, np.log(volume)])
    context.model.fit([X])
        
    context.train_done_once = True

    
def train_ocsvm(context, data):
    spy = sid(8554)
    prices = data.history(spy, 'price', 1000, '1d')
    
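    # Keep appending new price data to the training data. The daily resample
    # collapses the overlap between successive 1000-day history windows.
    if getattr(context, 'prices', None) is None:
        context.prices = prices
    else:
        context.prices = context.prices.append(prices).resample('D').mean().dropna()
        
    prices = context.prices.tolist()
        
    # Build one 20-day window per day, each rebased to start at 1.0.
    # The first 20 days have no full window, so they get a flat placeholder.
    out = []
    for i in range(len(prices)):
        if i < 20:
            out.append(np.ones(20))
        else:
            chunk = np.array(prices[i-20:i])
            out.append(chunk / chunk[0])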
    
    X = np.array(out)
    
    context.model = OneClassSVM(kernel='rbf', nu=0.20, gamma=0.5)
    context.model.fit(PolynomialFeatures(degree=3).fit_transform(X))
    # context.model.fit(X)
    
    context.train_done_once = True

    
def my_rebalance_hmm(context, data):    
    spy = sid(8554)
    df = data.history(spy, ['price', 'volume'], 2, '1d')
    rets = df['price'].pct_change()[1:]

    volume = df['volume'][1:]
    X = np.column_stack([rets, np.log(volume)])
    state = context.model.predict(X)[0]
    record(state=state)
    
    if state == 0:
        order_target_percent(spy, 1.0)
    else:
        order_target_percent(spy, 0.0)
        
        
def my_rebalance_ocsvm(context, data):
    spy = sid(8554)
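    # Use a NumPy array, not a list: a plain list cannot be divided by a float.
    prices = data.history(spy, 'price', 20, '1d').values
    # Rebase the window to start at 1.0, matching the training features.
    X = [prices / prices[0]]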
    prediction = context.model.predict(PolynomialFeatures(degree=3).fit_transform(X))[0]
    # prediction = context.model.predict(X)[0]
    
    record(state=prediction)    
    
    if prediction == 1.0:
        order_target_percent(spy, 1.0)
    else:
        order_target_percent(spy, 0.0)

And the backtest using HMM.

import pandas as pd
import numpy as np
import random
import quantopian.algorithm as algo
from sklearn import hmm
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import PolynomialFeatures


def initialize(context):
    set_slippage(slippage.FixedBasisPointsSlippage(basis_points=0.0, volume_limit=1.0))
    set_commission(commission.PerShare(cost=0.000, min_trade_cost=0.00))
    

    schedule_function(my_rebalance_hmm,
                      date_rules.every_day(),
                      time_rules.market_open(hours = 0, minutes = 1))
    
    # schedule_function(my_rebalance_ocsvm,
    #                   date_rules.week_start(),
    #                   time_rules.market_open(hours = 0, minutes = 1))

    schedule_function(train_hmm, 
                      date_rules.month_start(),
                      time_rules.market_open())

    # schedule_function(train_ocsvm, 
    #                   date_rules.month_start(),
    #                   time_rules.market_open())

    
    context.train_done_once = False
    context.model = None
    context.means_prior = None
            
    
def before_trading_start(context, data):    
    if not context.train_done_once:
        train_hmm(context, data)
        # train_ocsvm(context, data)
        
    

def train_hmm(context, data):
    spy = sid(8554)
    df = data.history(spy, ['price', 'volume'], 1000, '1d')
    rets = df['price'].pct_change()[1:]
    volume = df['volume'][1:]

    # Use the means from the previous fit as the prior (None on the first run).
    context.model = hmm.GaussianHMM(n_components=2, covariance_type="full", n_iter=1000,
                                    means_weight=0.5, means_prior=context.means_prior)
    X = np.column_stack([rets, np.log(volume)])
    context.model.fit([X])
    
    context.means_prior = context.model.means_
    
    context.train_done_once = True

    
def train_ocsvm(context, data):
    spy = sid(8554)
    prices = data.history(spy, 'price', 1000, '1d')
    
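    # Keep appending new price data to the training data. The daily resample
    # collapses the overlap between successive 1000-day history windows.
    if getattr(context, 'prices', None) is None:
        context.prices = prices
    else:
        context.prices = context.prices.append(prices).resample('D').mean().dropna()
        
    prices = context.prices.tolist()
        
    # Build one 20-day window per day, each rebased to start at 1.0.
    # The first 20 days have no full window, so they get a flat placeholder.
    out = []
    for i in range(len(prices)):
        if i < 20:
            out.append(np.ones(20))
        else:
            chunk = np.array(prices[i-20:i])
            out.append(chunk / chunk[0])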
    
    X = np.array(out)
    
    context.model = OneClassSVM(kernel='rbf', nu=0.20, gamma=0.5)
    context.model.fit(PolynomialFeatures(degree=3).fit_transform(X))
    # context.model.fit(X)
    
    context.train_done_once = True

    
def my_rebalance_hmm(context, data):    
    spy = sid(8554)
    df = data.history(spy, ['price', 'volume'], 2, '1d')
    rets = df['price'].pct_change()[1:]

    volume = df['volume'][1:]
    X = np.column_stack([rets, np.log(volume)])
    state = context.model.predict(X)[0]
    record(state=state)
    
    if state == 0:
        order_target_percent(spy, 1.0)
    else:
        order_target_percent(spy, 0.0)
        
        
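def my_rebalance_ocsvm(context, data):
    spy = sid(8554)
    # Use a NumPy array, not a list: a plain list cannot be divided by a float.
    prices = data.history(spy, 'price', 20, '1d').values
    # Rebase the window to start at 1.0, matching the training features.
    X = [prices / prices[0]]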
    prediction = context.model.predict(PolynomialFeatures(degree=3).fit_transform(X))[0]
    # prediction = context.model.predict(X)[0]
    
    record(state=prediction)    
    
    if prediction == 1.0:
        order_target_percent(spy, 1.0)
    else:
        order_target_percent(spy, 0.0)

Hi Luc,

Thanks for sharing.
In both implementations you measure the last 20 days' returns, right? So the algo is implicitly assuming that the market is still in that regime?

That is, last 20 days = "abnormal" -> today is assumed "abnormal".

When you use the word "unclustered", what kind of behavior do you think the regime detector is picking up on, if you had to put it into words?

Thanks again,
Bjarke

The HMM uses a single day's return and volume as features, i.e. if you train it with 1000 days of data, you have a training data set of shape roughly (1000, 2). The algo uses the previous day's return and volume to determine the current day's regime.
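As a rough illustration of that feature layout, here is a minimal sketch that builds the same two columns (daily return, log volume) with NumPy. The price and volume series are synthetic and `hmm_features` is a hypothetical helper name, not something from the algorithm above.

```python
import numpy as np

def hmm_features(prices, volumes):
    """Return one row per day: [daily return, log(volume)]."""
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    # Simple daily return; the first day has no previous close, so it is dropped.
    rets = prices[1:] / prices[:-1] - 1.0
    return np.column_stack([rets, np.log(volumes[1:])])

# Synthetic stand-ins for 1000 days of history (assumption, not real SPY data).
rng = np.random.RandomState(0)
prices = 100.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, size=1000))
volumes = rng.uniform(5e7, 2e8, size=1000)

X_train = hmm_features(prices, volumes)            # what the model is fitted on
X_today = hmm_features(prices[-2:], volumes[-2:])  # what the rebalance step predicts on

print(X_train.shape)  # (999, 2): one row is lost to the return calculation
print(X_today.shape)  # (1, 2)
```

Two days of history are enough at prediction time because a single return only needs the previous close.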

The SVM uses 20-day time series. So, if 1000 days of price data are used, the training data shape is (1000, 20). The algo uses the previous 20 days' prices to predict the current regime.
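The windowing can be sketched roughly as follows. This version keeps only complete windows, so it yields fewer than 1000 rows, whereas the posted algorithm pads the first rows with ones to keep one row per day. The price series is synthetic and `normalized_windows` is a hypothetical helper name.

```python
import numpy as np

WINDOW = 20

def normalized_windows(prices, window=WINDOW):
    """Stack every complete `window`-day slice, rebased so each starts at 1.0."""
    prices = np.asarray(prices, dtype=float)
    rows = [prices[i - window:i] / prices[i - window]
            for i in range(window, len(prices) + 1)]
    return np.array(rows)

# Synthetic stand-in for 1000 days of prices (assumption, not real SPY data).
prices = np.linspace(100.0, 150.0, 1000)
X = normalized_windows(prices)

print(X.shape)  # (981, 20): 1000 - 20 + 1 complete windows
```

Rebasing each window to 1.0 makes windows from different price levels comparable, so the model sees the shape of the path and not the absolute price.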

The SVM assumes that most of the 20-day time series will cluster together. Those clustered series are taken to be the "base" or "normal" regime, and the outliers to that cluster are deemed a different, "abnormal" regime. This is something of a limitation of unsupervised learning: the learner will find clusters, but these clusters still need to be interpreted. In this case, the assumption that the base regime is a low-vol, uptrending market and that the outlier regime is a higher-vol, downtrending one seems to hold, judging just by the notebook.
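To make the "cluster versus outlier" idea concrete, here is a small sketch with scikit-learn's `OneClassSVM` on synthetic 2-D points (not the 20-day windows from the algorithm). It reuses the `nu` and `gamma` values from the posted code. Per the scikit-learn documentation, `nu` is an upper bound on the fraction of training errors, so it largely decides how much of the training data ends up flagged, whatever those points look like.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Synthetic data (assumption): one dense cluster plus two far-away points.
cluster = rng.normal(0.0, 0.5, size=(200, 2))
far_points = np.array([[6.0, 6.0], [-7.0, 5.0]])

model = OneClassSVM(kernel='rbf', nu=0.20, gamma=0.5)
model.fit(cluster)

# predict() returns +1 for points treated as inliers and -1 for outliers.
train_labels = model.predict(cluster)
far_labels = model.predict(far_points)

print("share of training points flagged:", np.mean(train_labels == -1))
print("labels for the far-away points:", far_labels)
```

The model is never told what "normal" means. It simply treats the densest part of the training data as the base case, which is why the interpretation of the two regimes has to be checked afterwards.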

I hope that answers the question.

/Luc

Yes, thanks, I understand a little better now.
For the SVM, could you say that "clustered" means the series are similar to one another? I.e. in a volatile downwards market (like we have now), many days will be up a lot or down a lot, whereas in a more tranquil market (your base case) most days will look alike (upwards tendency, but no big moves in either direction).
How do you program the SVM to see the latter as the base case? I can't suss that out from the code above.
Thanks!

Bjarke,

Good question. I am not sure. One could manually label various time series and use a classifier like SVC.
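A supervised variant along those lines might look like the sketch below. Everything here is illustrative: the windows are simulated, the labels are assigned by construction (standing in for manual labelling), and `fake_window` is a hypothetical helper.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)

def fake_window(daily_vol, drift, n=20):
    """Simulate one n-day price path rebased to start at 1.0 (synthetic data)."""
    rets = rng.normal(drift, daily_vol, size=n - 1)
    return np.concatenate([[1.0], np.cumprod(1.0 + rets)])

# Stand-ins for hand-labelled history: 0 = base regime, 1 = stressed regime.
calm = [fake_window(0.005, 0.0005) for _ in range(100)]
stressed = [fake_window(0.03, -0.002) for _ in range(100)]
X = np.array(calm + stressed)
y = np.array([0] * 100 + [1] * 100)

clf = SVC(kernel='rbf', C=1.0)
clf.fit(X, y)

# Classify a new, unseen window.
new_window = fake_window(0.03, -0.002).reshape(1, -1)
pred = clf.predict(new_window)
print(pred)
```

With explicit labels, the base case is whatever was labelled as such, so the interpretation step that the one-class approach needs goes away. The cost is that someone has to do the labelling and defend it.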

/Luc