Second Attempt at ML - Stochastic Gradient Descent Method Using Hinge Loss Function

The perceptron is still a bit too simplistic: it traverses the data set in order and treats every misclassified point equally. SGD with a hinge loss function can do a better job: it chooses training data at random, gradually decreases the learning rate, and penalizes data points that deviate significantly from the prediction. Here I used an averaged SGD method, which in my tests outperforms simply taking the last predictor value after a fixed number of iterations. All other ideas are quite similar to my previous post on the perceptron.
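For readers who want the core idea in isolation, here is a minimal sketch of averaged SGD on the hinge loss in plain NumPy. The function name and the constants (learning-rate schedule, averaging weight) are my own illustrative choices, separate from the full algo below:

```python
import numpy as np

def averaged_sgd_hinge(X, Y, lam=0.01, n_epochs=5, seed=0):
    """Averaged SGD on the hinge loss: visit points in random order,
    decay the learning rate per epoch, and return a recency-weighted
    average of theta instead of the last iterate."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    theta = np.zeros(X.shape[1])
    avg = np.zeros_like(theta)
    for epoch in range(n_epochs):
        eta = 1.0 / (1 + epoch)              # decaying learning rate
        for _ in range(len(Y)):
            i = rng.integers(len(Y))         # random sample, not an in-order traversal
            if Y[i] * (theta @ X[i]) < 1.0:  # margin violated: hinge subgradient active
                theta += eta * (Y[i] * X[i] - lam * theta)
            else:                            # margin satisfied: regularization only
                theta -= eta * lam * theta
            avg = 0.9 * avg + 0.1 * theta    # recency-weighted running average
    return avg
```

The averaging is what makes the volatile per-step iterates usable: any single theta can be thrown around by an unlucky random draw, while the running average changes slowly.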

It turned out that the randomness increases volatility but also raises the highest gain. I would love to hear comments and other experiments on how to improve this idea. Note again that it took me a very long time to run backtests and debug the algo, simply because the computational requirements are very high. Any optimization suggestions are also welcome. Thank you!

Remember to just click the "clone" button if you want to make a copy to try and edit yourself.

import numpy as np
import random

def initialize(context):
    # trade the 97th-99th dollar-volume percentile of the universe
    set_universe(universe.DollarVolumeUniverse(97, 99))
    context.bet_amount = 100000.0
    context.max_notional = 1000000.1
    context.min_notional = -100000.0

def handle_data(context, data):
    for stock in data:
        if caltheta(data, stock, 5) is None:
            continue
        (theta, historicalPrices) = caltheta(data, stock, 5)

        indicator = np.dot(theta, historicalPrices)
        # normalize by the product of the two vector norms (i.e. take the
        # cosine similarity) so that the indicator lies in [-1, 1]; note
        # the norms are the square roots of the sums of squares
        hlen = np.sqrt(sum(k * k for k in historicalPrices))
        tlen = np.sqrt(sum(j * j for j in theta))
        indicator /= float(hlen * tlen)

        current_price = data[stock].price
        notional = context.portfolio.positions[stock].amount * current_price

        if indicator >= 0 and notional < context.max_notional:
            order(stock, indicator * context.bet_amount)
            log.info("%f shares of %s bought." % (context.bet_amount * indicator, stock))

        if indicator < 0 and notional > context.min_notional:
            order(stock, indicator * context.bet_amount)
            log.info("%f shares of %s sold." % (context.bet_amount * indicator, stock))

@batch_transform(refresh_period=1, window_length=60)
def caltheta(datapanel, sid, num):
    prices = datapanel['price'][sid]
    for p in prices:
        if p is None:
            return None
    # split the 60-day window into 12 non-overlapping 5-day weeks: the first
    # 4 days of each week are the features, and the sign of the 5th day's
    # price relative to the 4-day average is the label
    testX = [[prices[i] for i in range(j, j + 4)] for j in range(0, 60, 5)]
    avg = [np.average(testX[k]) for k in range(len(testX))]
    testY = [np.sign(prices[5 * i + 4] - avg[i]) for i in range(len(testX))]
    theta = hlsgdA(testX, testY, 0.01, randomIndex, num)
    priceh = prices[-4:]  # the last four days' prices form the new feature vector
    return (theta, priceh)

# stochastic gradient descent using the hinge loss, returning an averaged predictor
def hlsgdA(X, Y, l, nextIndex, numberOfIterations):
    theta = np.zeros(len(X[0]))
    best = np.zeros(len(X[0]))
    e = 0
    omega = 1.0 / (2 * len(Y))
    while e < numberOfIterations:
        ita = 1.0 / (1 + e)  # decaying learning rate
        for i in range(len(Y)):
            index = nextIndex(len(Y))
            # L2 regularization, excluding the bias component theta[0]
            reg = l * np.append([0], theta[1:])
            # hinge-loss subgradient, active only when the margin is violated
            grad = sgn(1 - Y[index] * np.dot(theta, X[index])) * Y[index] * np.asarray(X[index])
            theta -= ita * (reg - grad)
            # a recency-weighted average of theta: exponentially decays the
            # influence of older theta values
            best = (1 - omega) * best + omega * theta
        e += 1
    return best

# indicator function: 1 when the hinge loss is active (margin violated), else 0
def sgn(k):
    if k <= 0:
        return 0
    else:
        return 1

def randomIndex(n):
    return random.randint(0, n - 1)
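As a quick sanity check of the indicator normalization: the intent is to turn the raw dot product into a cosine similarity, which requires dividing by the product of the two vectors' Euclidean norms (the square roots of the sums of squares, not the sums of squares themselves); the result is then guaranteed to lie in [-1, 1]. The numbers below are made up:

```python
import numpy as np

# hypothetical predictor weights and a 4-day price window (made-up values)
theta = np.array([0.5, -1.0, 2.0, 0.3])
prices = np.array([101.0, 99.5, 100.2, 100.8])

# cosine similarity: dot product divided by the product of the norms
indicator = theta @ prices / (np.linalg.norm(theta) * np.linalg.norm(prices))
assert -1.0 <= indicator <= 1.0
```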
The original code calls the batch transform twice per stock:

    if caltheta(data,stock,5) is None:
        continue

    (theta,historicalPrices) = caltheta(data,stock,5)

More optimal, calling it only once:

    thetaAndPrices = caltheta(data,stock,5)
    if thetaAndPrices is None:
        continue
    theta, historicalPrices = thetaAndPrices


Why did you do classification instead of regression? It could have learned to avoid big losses if you had done regression, no?

@fie offset: Thanks for your suggestions! Yeah that's more efficient since you only call the batch_transform once.

@Dennis: that's a great point! My original thought was that I could probably use a week's (5 days') trading data to predict whether the next day's price will rise or fall, because I still believe there is some kind of periodicity in the stock market. Yet I am not sure whether we have enough information to do linear regression; that is, I was worried we might run the risk of over-fitting if we use limited data for regression. What do you think? I think I can try to modify the update function to use a squared-error loss to achieve linear regression in the code.
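In case it helps, a rough sketch of what that regression variant could look like: the hinge-loss subgradient is replaced by the gradient of a squared-error loss, so the penalty grows with the size of the prediction error instead of being a 0/1 margin indicator. The function name `lssgd` and all hyperparameters are made up for illustration; this is not a drop-in replacement for the algo's `hlsgdA`:

```python
import numpy as np

def lssgd(X, Y, lam=0.01, n_epochs=50, seed=0):
    """Sketch of SGD with a squared-error (least-squares) loss: the update
    direction is residual * input, so large prediction errors are
    penalized proportionally to their size."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    theta = np.zeros(X.shape[1])
    for epoch in range(n_epochs):
        eta = 0.5 / (1 + epoch)      # decaying learning rate
        for _ in range(len(Y)):
            i = rng.integers(len(Y))  # random sample
            residual = theta @ X[i] - Y[i]
            theta -= eta * (residual * X[i] + lam * theta)
    return theta
```

With limited data the regularization term `lam` does most of the work against over-fitting, which speaks to the concern above.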

Given your information set, a first-order approximation may be good enough; you will get a single-layer neural network. But my strategy is to use some kind of step function with shrinkage. It is less likely to overfit, good enough to differentiate a one-percent change from a ten-percent one, and robust to non-linearity.
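I'm only guessing at the details here, but one common form of a "step function with shrinkage" is soft-thresholding: predicted moves smaller than a threshold are shrunk to zero, and larger moves keep only their excess. A hypothetical sketch (the name `shrink_step` and the threshold value are my own, not necessarily the strategy described above):

```python
import numpy as np

def shrink_step(pred_change, threshold=0.01):
    """Soft-threshold a predicted price change: moves below `threshold`
    are treated as noise and mapped to zero; larger moves keep only the
    part exceeding the threshold."""
    return np.sign(pred_change) * np.maximum(np.abs(pred_change) - threshold, 0.0)
```

Under this reading, a one-percent predicted move is zeroed out while a ten-percent move passes through at nine percent, which is one way to "differentiate a one percent change vs a ten percent" while staying robust to noise.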

What specific security/asset is this algo trading?

@Dennis: could you share your strategy here? I am not sure I completely understand your step function with shrinkage idea, and it'd be awesome if we all can learn from your insights. Thanks!

@Tin: This algo uses the set_universe() method to select the securities between the 97th and 99th dollar-volume percentiles (about 160 of the highest dollar-volume names), and it reselects the universe at each quarter end. You can also try other percentiles, and I bet the results can be very different; in fact, I guess that for less liquid securities, ML methods can yield better (less volatile) performance. What do you think?

12000 views.....this might be a tipping point for the site and community. Hold onto your pants.

I'm new to Quantopian,so bear with me. If you had a drawdown of 144% in November, how did your trading algorithm come back from that? That kind of loss seems fatal.

And if this were a real account, and you were down 56% in October and it skidded to minus 144% in November, wouldn't you shut the account down before you ever saw the +200-300% numbers?

Hello Hugh - I think your concerns about this algo are right on. However, this isn't a fully formed strategy yet. I think Taibo is trying to explore different methodologies and see how they work, in general. It needs a lot more work before one would put any money behind it.


Hi Hugh, thanks for your question; in fact I think it touches the very heart of trading algorithms, namely how to handle all possible situations without knowing them beforehand, by generalizing and tweaking a few known cases. In this example, one could have set bounds on the max/min return, which would make the algorithm less volatile but more likely to lose. There always seems to be a trade-off. What I am doing now is raising ideas about what we can do. I think a more "realistic" trading algorithm that one can reliably use actually has a lot of finely tuned (or even constantly updated) control signals combining some of the ideas shared in the Quantopian community. What do you think?