MA cross-over w/ RSI

I'm planning to build up this algorithm, bit-by-bit here. This is the first piece. More details to come. --Grant

import numpy as np

# globals for get_avg batch transform decorator
R_P = 0  # refresh period
W_L = 5  # window length in days (390*W_L minutes)

def initialize(context):
    context.stocks = [sid(21519),sid(8554)]

def handle_data(context, data):
    # get shifted moving average
    avg = get_avg(data,context.stocks)
    if avg is None:
        return

    mavg_0 = data[context.stocks[0]].mavg(3)
    mavg_1 = data[context.stocks[1]].mavg(3)

    window = 3*390 # minutes

    print data[context.stocks[0]].datetime
    print avg.shape
    print avg
    print avg[window-1,0]-mavg_0
    print avg[window-1,1]-mavg_1

@batch_transform(refresh_period=R_P, window_length=W_L) # set globals R_P & W_L above
def get_avg(datapanel,sids):
    p = np.flipud(datapanel['price'].as_matrix(sids))
    p_cumsum = np.cumsum(p,axis=0)
    n = np.cumsum(np.ones(p_cumsum.shape),axis=0)
    avg = np.divide(p_cumsum,n)
    return avg
This backtest was created using an older version of the backtester. Please re-run this backtest to see results using the latest backtester. Learn more about the recent changes.
There was a runtime error.
49 responses

Hello Grant,

I don't know enough numpy to interpret what you are doing. Could you explain it, please?

P.

Peter,

Sure.

@batch_transform(refresh_period=R_P, window_length=W_L) # set globals R_P & W_L above
def get_avg(datapanel,sids):
    p = np.flipud(datapanel['price'].as_matrix(sids))
    p_cumsum = np.cumsum(p,axis=0)
    n = np.cumsum(np.ones(p_cumsum.shape),axis=0)
    avg = np.divide(p_cumsum,n)
    return avg


The first line, p = np.flipud(datapanel['price'].as_matrix(sids)), creates a numpy ndarray of prices (one row per minute, one column per sid) and flips it top to bottom ("up down") so that the most recent prices come first. The next line, p_cumsum = np.cumsum(p,axis=0), takes a cumulative sum down each column, so row k holds the sum of the k+1 most recent prices for each sid. The line after that, n = np.cumsum(np.ones(p_cumsum.shape),axis=0), builds an ndarray of the matching denominators (1, 2, 3, ...). Finally, avg = np.divide(p_cumsum,n) divides elementwise to give the averages. So what is returned is an ndarray of moving averages: columns correspond to sids, and row k is the average over the trailing k+1 minutes. By picking the row and column, as I do in handle_data, the moving averages for each sid can be compared to those computed with mavg. In the example code they agree, so I declare victory.
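
For anyone without numpy to hand, here is a minimal pure-Python sketch of the same idea for a single security. It is not the code from the backtest; the function name trailing_averages and the sample prices are made up for illustration.

```python
from itertools import accumulate

def trailing_averages(prices):
    """Return the averages of the last 1, 2, ..., n prices (prices are oldest first)."""
    newest_first = list(reversed(prices))            # plays the role of np.flipud
    running_sums = list(accumulate(newest_first))    # plays the role of np.cumsum(axis=0)
    # Divide each running sum by the number of prices it contains.
    return [total / count for count, total in enumerate(running_sums, start=1)]

prices = [10.0, 11.0, 12.0, 13.0]   # hypothetical prices, oldest first
avgs = trailing_averages(prices)
print(avgs)       # one average per trailing window length
print(avgs[2])    # index k holds the average of the k+1 most recent prices
```

Selecting a different window length is then just a matter of picking a different index, which is the point of computing the whole vector in one call.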

Just let me know if this doesn't make sense, and I'll work up a better example, with comments in the code.

Grant

Hello Grant,

Thanks. I probably need to get this into iPython console to see what it's doing. So it's a minutely moving average that has the same output as mavg? Later you will apply this to other timeframes?

Why not use ta.MA() instead? You can't recreate all the indicators. Well...you could....but TA-Lib took Mario Fortier and his co-collaborators quite a few man-years.

P.

Hi Peter,

With one call to the batch transform, I get, for each sid, a vector of moving averages, up to the window length of the batch transform. So, for example, if I need a 15 day and a 5 day moving average, I can just select them from corresponding rows of the column vector. ta.MA() or mavg could be used, but I like the idea of the single call to the batch transform to get the entire set of moving averages, up to a maximum window length.

Grant

Hello Peter,

Now that I've thought about it, it would be better to use a more flexible approach than I've shown above to compute the moving average. I've started to look at http://pandas.pydata.org/pandas-docs/dev/computation.html, which has lots of "bells and whistles" including a variety of statistics, and an exponentially weighted moving average. I'll work something up as an example and post it here.

I'm kinda leery of using TA-Lib, since I've yet to find any good technical documentation.

Grant

Here's a version that uses the pandas rolling_mean to compute the averages needed for a cross-over (across multiple securities, if desired). The code executes very slowly, despite the fact that I make only one call to the batch transform, per call to handle_data. When I get the chance, I'll try to figure out the source of the sluggishness. --Grant

import numpy as np
import pandas as pd

W_L = 5  # batch transform window length in days (390*W_L minutes)
windows = [3,5] # moving average windows, days

def initialize(context):
    context.stocks = [sid(21519),sid(8554)]

def handle_data(context, data):
    # get averages
    avgs = get_avgs(data,context.stocks,windows)
    if avgs is None:
        return

    mavg_3_0 = data[context.stocks[0]].mavg(3)
    mavg_3_1 = data[context.stocks[1]].mavg(3)
    mavg_5_0 = data[context.stocks[0]].mavg(5)
    mavg_5_1 = data[context.stocks[1]].mavg(5)

    print data[context.stocks[0]].datetime
    print np.subtract(avgs[0],np.asarray([mavg_3_0,mavg_3_1]))
    print np.subtract(avgs[1],np.asarray([mavg_5_0,mavg_5_1]))

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]

    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])

    return avgs

Here's the pandas-based moving average w/ Apple as an example security.

For the RSI computation, it is not clear to me why using daily closing prices would be the best. Wouldn't it be better to have a rolling computation, that gets updated every minute, for minutely trading?

I took a look at http://en.wikipedia.org/wiki/Relative_strength_index and it is not clear why the RSI couldn't be computed every minute using TA-Lib, with an appropriately chosen value of n for the exponentially weighted moving average. In other words, the whole "timeframe" issue is handled by the exponential weighting, rather than by using daily closing prices. From a practical standpoint, as Jess has pointed out, the finite memory limitation can become a problem, but if we are only talking about a handful of securities, it should be manageable.

Grant

import numpy as np
import pandas as pd
import time

W_L = 15  # batch transform window length in days (390*W_L minutes)
windows = [5,15] # moving average windows, days

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get averages
    avgs = get_avgs(data,context.stocks,windows)
    if avgs is None:
        return

    price = data[context.stocks[0]].price

    record(price = price)
    record(avg_5 = float(avgs[0][0]))
    record(avg_15 = float(avgs[1][0]))

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]

    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])

    return avgs

Here's the latest installment. I added the RSI, per the example on the help page. I still need to code the RSI for multiple securities.

What would be typical buy/sell logic for such an algorithm, based on the moving averages and the RSI? Also, would it be better to use exponentially weighted moving averages?

Thanks,

Grant

import numpy as np
import pandas as pd

W_L = 15  # batch transform window length, days (390*W_L minutes)
windows = [5,15] # moving average windows, days

rsi = ta.RSI(timeperiod=14) # days (390*timeperiod minutes)

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get RSI
    rsi_data = rsi(data)
    rsi_0 = rsi_data[context.stocks[0]]

    # get averages
    avgs = get_avgs(data,context.stocks,windows)
    if avgs is None:
        return

    price = data[context.stocks[0]].price

    record(price = price)
    record(avg_5 = float(avgs[0][0]))
    record(avg_15 = float(avgs[1][0]))
    record(rsi_0 = rsi_0)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]

    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])

    return avgs

I'm trying to sort out how to get the TA-Lib RSI to return a value running on minute data, with a time period of 14 days. The first part of the code is:

import numpy as np
import pandas as pd

W_L = 15  # batch transform window length, days
windows = [5,15] # moving average windows, days

rsi = ta.RSI(timeperiod=5460)

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get RSI
    rsi_data = rsi(data)
    rsi_0 = rsi_data[context.stocks[0]]


But I get rsi_0 = 0 for the entire backtest.

Any idea what's going on?

Grant

import numpy as np
import pandas as pd

W_L = 15  # batch transform window length, days
windows = [5,15] # moving average windows, days

rsi = ta.RSI(timeperiod=5460)

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get RSI
    rsi_data = rsi(data)
    rsi_0 = rsi_data[context.stocks[0]]

    # get averages
    avgs = get_avgs(data,context.stocks,windows)
    if avgs is None:
        return

    price = data[context.stocks[0]].price

    record(price = price)
    record(avg_5 = float(avgs[0][0]))
    record(avg_15 = float(avgs[1][0]))
    record(rsi_0 = rsi_0)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]

    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])

    return avgs

Here's a more direct test. It appears that TA-Lib RSI can't handle too much data. If I change from timeperiod = 10 to timeperiod = 390 (1 day), I just get:

Waiting for logs...
This backtest didn't generate any logs.

Grant

import numpy as np
rsi = ta.RSI(timeperiod=10)

def initialize(context):
    context.sid = sid(8554)

def handle_data(context, data):
    rsi_result = rsi(data)[context.sid]
    if not np.isnan(rsi_result):
        log.info(get_datetime())
        log.info(rsi_result)


Hello Grant,

Brief inspection suggests that the timeperiod limit is 389. Using timeperiod = 5460 would be computationally very expensive anyway so I converted the minutes to days.

You seem to be taking each 390 minutes and computing the rolling mean for that day. I thought the MA for those days should be computed from the closing price of each day, i.e. we are not interested in the other 389 prices?

(EDIT: I know that's not what you are doing as it involves minutes 1950 and 5850 i.e. days 5 and 15.)

P.

import numpy as np
import pandas as pd
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days
windows = [5,15] # moving average windows, days

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get averages
    results = get_avgs(data,context.stocks,windows)
    if results is None:
        return

    avgs = results[0]
    daily = results[1]

    # get RSI
    rsi =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])

    price = data[context.stocks[0]].price

    record(price = price)
    record(avg_5 = float(avgs[0][0]))
    record(avg_15 = float(avgs[1][0]))
    record(rsi_0 = rsi)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]
    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])
    daily = d.resample('D', closed='right', label='left').dropna()
    return (avgs , daily)

Thanks Peter,

It turns out that there may be a problem with feeding TA-Lib RSI multiples of 390. I just tried timeperiod=5461 and it works. Quantopian is aware of the problem.

I attached my code, with timeperiod=5461. Some form of re-sampling may be required, but my sense is that it should incorporate the full 390 minutes out of every day in a summary form (e.g. mean), rather than just the closing price. More on that topic later...

Cheers,

Grant

import numpy as np
import pandas as pd

W_L = 15  # batch transform window length, days
windows = [5,15] # moving average windows, days

rsi = ta.RSI(timeperiod=5461)

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get RSI
    rsi_data = rsi(data)
    rsi_0 = rsi_data[context.stocks[0]]

    if not np.isnan(rsi_0):
        print rsi_0

    # get averages
    avgs = get_avgs(data,context.stocks,windows)
    if avgs is None:
        return

    price = data[context.stocks[0]].price

    record(price = price)
    record(avg_5 = float(avgs[0][0]))
    record(avg_15 = float(avgs[1][0]))
    record(rsi_0 = rsi_0)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids,windows):
    windows = [390*x for x in windows] # moving average windows, minutes
    avgs = []
    d = data.price[sids]

    for window in windows:
        avgs.append(pd.rolling_mean(d,window).as_matrix(sids)[-1:])

    return avgs

Hello Grant,

I would say it's universal - and has been for many decades - that calculations on OHLC bars of any duration use the first open/last close. My TA-Lib version of your algo completes in 7 minutes 26 seconds. Yours was around 8 minutes 9 seconds without any RSI calculation. It's now around 20 minutes.

What happens when you want a monthly MA in a backtest? There are 7,800 trading minutes in 4 weeks (20 days x 390)! And an MA is about the simplest indicator there is in terms of computation.

I've also just noticed that your RSI is very approximately in a range of 47 - 52 which is odd. Mine is around 4 to 94.

P.

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get daily closes
    daily = get_avgs(data,context.stocks)
    if daily is None:
        return

    # get RSI & MA
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(daily[context.stocks[0]], timeperiod=5)[-1:])
    ma_15 =  float(ztt.talib.MA(daily[context.stocks[0]], timeperiod=15)[-1:])

    price = data[context.stocks[0]].price

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids):
    d = data.price[sids]
    daily = d.resample('D', closed='right', label='left').dropna()
    return daily

Thanks Peter,

I'll have to think about what the RSI is doing (http://en.wikipedia.org/wiki/Relative_strength_index). It appears that there is just an overall scale factor for the peak-to-peak amplitude, which depends on the number of time series data points. However, I don't see the smoothing effect I'd expect from using a lot more data. So, perhaps using daily closing prices would work just as well...hmm...not intuitive.
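
To make the scale-factor observation concrete, here is a small self-contained sketch. It implements the RSI with Wilder's smoothing as described on the Wikipedia page (it is not TA-Lib's code), and runs it on a seeded random walk rather than real prices. The function name wilder_rsi, the walk itself, and the choice of returning 50 when there is no movement at all are assumptions made for illustration.

```python
import random

def wilder_rsi(prices, period):
    """RSI series with Wilder's smoothing; the first value needs `period` price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed the averages with a simple mean over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = []
    for i in range(period, len(changes) + 1):
        if i > period:
            # Wilder's smoothing: an exponential average with weight 1/period.
            avg_gain = (avg_gain * (period - 1) + gains[i - 1]) / period
            avg_loss = (avg_loss * (period - 1) + losses[i - 1]) / period
        total = avg_gain + avg_loss
        # 100 - 100/(1 + RS) rewritten to avoid dividing by a zero average loss.
        out.append(50.0 if total == 0 else 100.0 * avg_gain / total)
    return out

# Hypothetical input: a random walk with unit steps, seeded for repeatability.
random.seed(0)
prices = [1000.0]
for _ in range(5000):
    prices.append(prices[-1] + random.choice((-1.0, 1.0)))

for period in (14, 390):
    values = wilder_rsi(prices, period)
    print(period, round(min(values), 1), round(max(values), 1))
```

On a walk like this the long-period RSI stays in a much narrower band around 50 than the 14-period one, because averaging over more price changes shrinks the imbalance between average gain and average loss. That would be consistent with a 5460-minute RSI hovering near 50 while a 14-day RSI swings widely, though real prices are not a random walk.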

Grant

Hello Peter,

Here's your backtest from above, but using minute data rather than daily closings. Interestingly, it ran just fine, even though I used whole multiples of 390 minutes for the timeperiods. The RSI suffers from the same peak-to-peak range problem as my earlier algo.

Any idea how to set up the buy/sell criteria for this MA cross-over w/ RSI? Is there a standard recipe?

Grant

Hello Grant,

Did you mean to attach a backtest?

As for a strategy, I suppose you could use the MA to confirm an RSI signal. RSI below 30 suggests over-sold; crossing back above 30 from below suggests leaving the over-sold area, but look for confirmation from the MA. Vice versa for RSI 70.
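
One way this rule could be sketched in plain Python is shown below. Treating "confirmation from the MA" as the fast average sitting above the slow one (and the reverse for sells) is an assumption, as are the function name crossover_signals and the sample numbers; it is a starting point, not a tested strategy.

```python
def crossover_signals(rsi, fast_ma, slow_ma, oversold=30.0, overbought=70.0):
    """Return 'buy', 'sell', or None for each bar after the first."""
    signals = []
    for i in range(1, len(rsi)):
        # RSI crossed back above the over-sold line on this bar.
        left_oversold = rsi[i - 1] < oversold <= rsi[i]
        # RSI crossed back below the over-bought line on this bar.
        left_overbought = rsi[i - 1] > overbought >= rsi[i]
        if left_oversold and fast_ma[i] > slow_ma[i]:
            signals.append("buy")      # assumed confirmation: fast MA above slow MA
        elif left_overbought and fast_ma[i] < slow_ma[i]:
            signals.append("sell")     # assumed confirmation: fast MA below slow MA
        else:
            signals.append(None)
    return signals

# Hypothetical values, one entry per bar.
rsi = [35.0, 28.0, 31.0, 72.0, 69.0]
fast = [10.0, 10.1, 10.4, 10.9, 10.5]
slow = [10.2, 10.2, 10.3, 10.6, 10.7]
signals = crossover_signals(rsi, fast, slow)
print(signals)
```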

P.

Here's the backtest that should have been attached. Thanks for the feedback. --Grant

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get minute data (the name "daily" is kept from the earlier version)
    daily = get_avgs(data,context.stocks)
    if daily is None:
        return

    # get RSI & MA
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14*390)[-1:])
    ma_5  =  float(ztt.talib.MA(daily[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(daily[context.stocks[0]], timeperiod=15*390)[-1:])

    price = data[context.stocks[0]].price

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_avgs(data,sids):
    d = data.price[sids]
    # daily = d.resample('D', closed='right', label='left').dropna()
    # return daily
    return d

Hello Peter,

Here's an approach that uses multiple timeframes--minutely closings for the moving averages and daily closings for the RSI. I haven't yet fully sorted out and tested your pandas re-sampling line:

daily = minutely.resample('D', closed='right', label='left').dropna()


Would you mind explaining what it does? I looked at http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.resample.html and it's kinda cryptic without examples.

Thanks,

Grant

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

def handle_data(context, data):
    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        return

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    price = data[context.stocks[0]].price

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = minutely.resample('D', closed='right', label='left').dropna()
    return (minutely,daily)

Resample is used to convert time-series data from one periodicity to another -- in this case, it looks like it's converting time-series data of around a minute per bar into daily bars. It does this by the time/date stamps in the index, there's nothing fancy about it, but it's a tricky thing to get right in the presence of missing data, bar start/end alignments and so forth.

Hello Grant,

I'm no expert! This is in iPython console:

rng = pd.date_range('18/09/2013', periods=1440, freq='1Min')

ts = pd.Series(np.random.randn(len(rng)), index=rng)

ts
Out[144]:
2013-09-18 00:00:00    0.327609
2013-09-18 00:01:00   -1.822197
2013-09-18 00:02:00    0.171317
2013-09-18 00:03:00    0.691429
2013-09-18 00:04:00    0.258827
2013-09-18 00:05:00   -0.390128
2013-09-18 00:06:00   -0.006088
2013-09-18 00:07:00    0.712166
2013-09-18 00:08:00    1.270415
2013-09-18 00:09:00   -1.599647
2013-09-18 00:10:00    1.393080
2013-09-18 00:11:00   -0.080025
2013-09-18 00:12:00    0.036839
2013-09-18 00:13:00    0.659311
2013-09-18 00:14:00   -0.002810
...
2013-09-18 23:45:00   -0.334146
2013-09-18 23:46:00    0.754013
2013-09-18 23:47:00    1.269064
2013-09-18 23:48:00   -0.217482
2013-09-18 23:49:00    0.626549
2013-09-18 23:50:00    0.262481
2013-09-18 23:51:00    0.437834
2013-09-18 23:52:00    1.093870
2013-09-18 23:53:00    0.899267
2013-09-18 23:54:00    1.471195
2013-09-18 23:55:00    0.192442
2013-09-18 23:56:00    0.530187
2013-09-18 23:57:00   -0.748267
2013-09-18 23:58:00   -1.281159
2013-09-18 23:59:00    1.114334
Freq: T, Length: 1440, dtype: float64

ts.resample('60Min', label='left')
Out[145]:
2013-09-18 00:00:00    0.113956
2013-09-18 01:00:00   -0.014518
2013-09-18 02:00:00   -0.307613
2013-09-18 03:00:00   -0.113483
2013-09-18 04:00:00    0.066969
2013-09-18 05:00:00   -0.072938
2013-09-18 06:00:00   -0.116590
2013-09-18 07:00:00    0.019375
2013-09-18 08:00:00   -0.132091
2013-09-18 09:00:00    0.008221
2013-09-18 10:00:00    0.068405
2013-09-18 11:00:00    0.106199
2013-09-18 12:00:00    0.165218
2013-09-18 13:00:00   -0.001297
2013-09-18 14:00:00    0.025111
2013-09-18 15:00:00   -0.120673
2013-09-18 16:00:00   -0.002947
2013-09-18 17:00:00    0.172615
2013-09-18 18:00:00   -0.045660
2013-09-18 19:00:00    0.107736
2013-09-18 20:00:00   -0.099588
2013-09-18 21:00:00   -0.120418
2013-09-18 22:00:00   -0.052315
2013-09-18 23:00:00    0.189029
Freq: 60T, dtype: float64

ts.resample('60Min', label='right')
Out[146]:
2013-09-18 01:00:00    0.113956
2013-09-18 02:00:00   -0.014518
2013-09-18 03:00:00   -0.307613
2013-09-18 04:00:00   -0.113483
2013-09-18 05:00:00    0.066969
2013-09-18 06:00:00   -0.072938
2013-09-18 07:00:00   -0.116590
2013-09-18 08:00:00    0.019375
2013-09-18 09:00:00   -0.132091
2013-09-18 10:00:00    0.008221
2013-09-18 11:00:00    0.068405
2013-09-18 12:00:00    0.106199
2013-09-18 13:00:00    0.165218
2013-09-18 14:00:00   -0.001297
2013-09-18 15:00:00    0.025111
2013-09-18 16:00:00   -0.120673
2013-09-18 17:00:00   -0.002947
2013-09-18 18:00:00    0.172615
2013-09-18 19:00:00   -0.045660
2013-09-18 20:00:00    0.107736
2013-09-18 21:00:00   -0.099588
2013-09-18 22:00:00   -0.120418
2013-09-18 23:00:00   -0.052315
2013-09-19 00:00:00    0.189029
Freq: 60T, dtype: float64


The first parameter is the size of the resample, i.e. '30Min', 'H', '2H', 'D', where '2H' == '120Min'. I think upsampling is possible but I have not tried it. The 'closed' argument works like 'label' and specifies in which sample the boundary values are included, but I'm not qualified to discuss its subtleties.

The second example is confusing since '2013-09-19 00:00:00' is not strictly in the original time series but it makes sense for display purposes. It doesn't really help that '60Min' as an input results in '60T' in the output.
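
A tiny sketch of what the label argument appears to do in the two hourly examples, written without pandas. The helper bin_label is hypothetical, and it assumes each hourly bin includes its starting minute and excludes its ending one, which is what the output above suggests.

```python
from datetime import datetime, timedelta

def bin_label(stamp, label="left"):
    """Return the timestamp an hourly bin is labelled with for a given minute bar."""
    # Assumed bin: [start of the hour, start of the next hour).
    start = stamp.replace(minute=0, second=0, microsecond=0)
    return start if label == "left" else start + timedelta(hours=1)

last_bar = datetime(2013, 9, 18, 23, 59)
print(bin_label(last_bar, "left"))    # labelled with the start of its hour
print(bin_label(last_bar, "right"))   # labelled with the end of its hour
```

With label='right' the final bin is named after its end point, which falls on the next calendar day even though no bar in the series has that timestamp.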

The 'dropna' was included in case the resample decided to create periods that weren't in the Quantopian data as I was worried that minute data resampled to hourly would result in overnight hours populated with NaNs. That behaviour is very much TBC.

P.

Hello Peter,

I've attempted to verify that your resampling code actually captures daily closing prices. I added some code to record the daily closing prices, and unless I've made a mistake, the values don't match (see the log output). I kinda rushed through the coding, so it is possible that I made a mistake. Any ideas?

Grant

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

    context.price_prior = 0
    context.day_prior = 0

def handle_data(context, data):
    day = data[context.stocks[0]].datetime.day
    price = data[context.stocks[0]].price

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        context.day_prior = day
        context.price_prior = price
        return

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

    # on the first bar of a new day, compare yesterday's last price with the resampled value
    if day != context.day_prior:
        print "Recorded: "+str(context.price_prior)
        print "Resampled: "+str(daily[context.stocks[0]][-1])

    context.day_prior = day
    context.price_prior = price

@batch_transform(refresh_period=0, window_length=W_L) # set global W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = minutely.resample('D', closed='right', label='left').dropna()
    return (minutely,daily)

Hello Grant,

Thanks - I'll have a look. My first thoughts are that if you pick a day on your backtest with 'calculated' MAs you get the same values for the MAs on that day in my backtest with the resampled data used by TA-Lib.

P.

Hello Grant,

Also consider that batch_transform in a minutely algo has some problems. I have raised issues publicly and privately concerning this but I get little feedback.

Firstly, move the start date of this backtest away from 2013-01-01. On this day "missing" data results in a variable length DataFrame being returned by batch_transform until it catches up with the "missing" minutes.

Secondly, batch_transform initially returns one price for the first day and it then returns 390 minutely prices for each subsequent day. Until your algo addresses both of these issues your results, as well as mine, are of questionable veracity.

P.

Without knowing exactly how Quantopian generates the minute bars and daily bars, it's unlikely that you'll ever get them to match using resample. Arguably, if your strategy performance is materially different depending on precisely how bars are built, it's probably too fragile for actual trading anyway.

Hello Simon/Grant,

One issue is that DataFrame.resample() defaults to how='mean' but it needs to be specified as how='last' for my purposes here i.e.

daily = minutely.resample('D', closed='right', label='left', how='last').dropna()


Beyond that I'm still investigating, if only to enhance my limited knowledge of pandas. For example, it seems how can be 'first', 'last', 'min', 'max', 'mean' or 'sum', but that list may not be exhaustive.
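
To make the difference between the two aggregations concrete, here is a small sketch on synthetic minute bars. It is written for a recent pandas, where the how= keyword is no longer available and the aggregation is chosen with a method call such as .mean() or .last(). The timestamps and prices are made up for illustration.

```python
import numpy as np
import pandas as pd

# Two synthetic sessions of 390 minute bars each (14:31 to 21:00 UTC).
day1 = pd.date_range("2013-01-02 14:31", "2013-01-02 21:00", freq="min")
day2 = pd.date_range("2013-01-03 14:31", "2013-01-03 21:00", freq="min")
index = day1.append(day2)

# Made-up prices: 0, 1, 2, ... so each aggregation is easy to check by eye.
prices = pd.Series(np.arange(len(index), dtype=float), index=index)

# Same binning as in the post; only the aggregation differs.
daily_mean = prices.resample("D", closed="right", label="left").mean().dropna()
daily_last = prices.resample("D", closed="right", label="left").last().dropna()

print(daily_mean)  # average of each session's bars
print(daily_last)  # last bar of each session, i.e. the close
```

With 'mean' each daily value is the average of that session's bars; with 'last' it is the final bar of the session, which is the value wanted for a daily close.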

P.

For the programmers, perhaps there needs to be a rethink of how Quantopian passes data. Here is a suggestion:
1. Move from supporting two time periods, daily and minute, to three: daily, 15-minute and minute. This reduces the largest factor between adjacent periods from 390 to 26 (390/15).
2. Pass all three data frames into the algorithm, so that the user can choose which to use. Something like CalcSMA(dailydata,20) for a 20 day moving average, or CalcSMA(minutedata,20) for a 20 minute moving average. The whole concept of choosing the timeframe of an algorithm externally is a poor design that doesn't reflect how people design trading systems.
3. Update the current period of both the 15-minute and daily timeframes with the current market values each minute. This allows a user to calculate exactly what stockcharts.com shows, which is up-to-the-minute values for the larger timeframes. If you had to wait for the end of the day, no one would use stockcharts.
4. Eventually, I think having your own basic indicator libraries makes sense, because you can find out what people really use and then optimize it. For example, if you precalculate the running totals of the closes for a stock, then an SMA becomes a single subtraction and division rather than a long series of additions. In this way you can find the time-intensive areas that eat CPU, reduce server cost, and speed up the run time, which is a win for you and your customers.
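
The running-totals idea in point 4 can be sketched in a few lines of numpy. This is only an illustration of the arithmetic, not anything Quantopian provides; the function name and the sample closes are made up.

```python
import numpy as np

def sma_from_running_totals(closes, n):
    """n-period simple moving average via precomputed cumulative sums."""
    closes = np.asarray(closes, dtype=float)
    # totals[k] holds the sum of the first k closes (totals[0] == 0).
    totals = np.concatenate(([0.0], np.cumsum(closes)))
    # Each window sum is one subtraction; the average is one division.
    return (totals[n:] - totals[:-n]) / n

closes = [10.0, 11.0, 12.0, 13.0, 14.0]
sma3 = sma_from_running_totals(closes, 3)
print(sma3)
```

Once the totals exist, each new SMA value costs the same regardless of the window length. For very long price histories the cumulative sums grow large, so floating-point round-off is something to keep an eye on.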

Rich

Hello Grant,

With my addition of how='last' we do have the same daily values although your code doesn't agree!

P.

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

    context.price_prior = 0
    context.day_prior = 0

    context.first_time = True

def handle_data(context, data):

    day = data[context.stocks[0]].datetime.day
    price = data[context.stocks[0]].price

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        context.day_prior = day
        context.price_prior = price
        return

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]
    if context.first_time:
        print minutely.ix['2013-01-02 21:00:00']
        print daily.ix['2013-01-02 00:00:00']
        context.first_time = False

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

    if day != context.day_prior:
        print "Recorded: "+str(context.price_prior)
        print "Resampled: "+str(daily[context.stocks[0]][-1])

    context.day_prior = day
    context.price_prior = price

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = minutely.resample('D', closed='right', label='left', how='last').dropna()
    return (minutely,daily)

Hello Rich,

This has the three timeframes. It looks like the indicators do update each minute - for example this is the 1D MA:

2012-01-24 PRINT 2012-01-24 21:00:00+00:00 423.66
2012-01-25 PRINT 2012-01-25 14:31:00+00:00 426.583
2012-01-25 PRINT 2012-01-25 14:32:00+00:00 426.41
2012-01-25 PRINT 2012-01-25 14:33:00+00:00 426.215
2012-01-25 PRINT 2012-01-25 14:34:00+00:00 426.37
2012-01-25 PRINT 2012-01-25 14:35:00+00:00 426.208
2012-01-25 PRINT 2012-01-25 14:36:00+00:00 426.261
2012-01-25 PRINT 2012-01-25 14:37:00+00:00 426.274


P.

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # No. of days available to indicators

def initialize(context):
    context.stocks = [sid(24)]
    context.first_time = True

def handle_data(context, data):
    # get data
    bar_data = get_data(data, context.stocks)
    if bar_data is None:
        return
    data_1M  = bar_data[0]
    data_15M = bar_data[1]
    data_1D  = bar_data[2]
    if context.first_time:
        context.first_time = False

    MA_1M   =  float(ztt.talib.MA( data_1M[context.stocks[0]], timeperiod=60)[-1:])
    MA_15M  =  float(ztt.talib.MA(data_15M[context.stocks[0]], timeperiod=26)[-1:])
    MA_1D   =  float(ztt.talib.MA( data_1D[context.stocks[0]], timeperiod=10)[-1:])

    record(MA_1M=MA_1M)
    record(MA_15M=MA_15M)
    record(MA_1D=MA_1D)

@batch_transform(refresh_period=0, window_length=W_L)
def get_data(data,sids):
    data_1M = data.price[sids]
    data_15M = data_1M.resample('15Min', closed='right', label='left', how='last').dropna()
    data_1D =  data_1M.resample(    'D', closed='right', label='left', how='last').dropna()
    return (data_1M, data_15M, data_1D)

Hi Peter,

I added a line to your code posted above (associated with your comment "With my addition of how='last' we do have the same daily values although your code doesn't agree!"):

if day != context.day_prior:
    print "Recorded: "+str(context.price_prior)
    print "Resampled: "+str(daily[context.stocks[0]][-1])
    print minutely.tail(3)


By inspecting the last three values of the minutely data (for the trailing window at the new day's open), I see that I capture the prior day's closing price in context.price_prior. However, it seems that your resampling does not return the closing price from the prior day. If I am understanding correctly, it should be daily[context.stocks[0]][-1], right? I don't know how to fix it, but presumably whatever the Quantopian guys are working on will make it really easy to accumulate daily closing prices in a minutely backtest.

Grant

import pandas
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24)]

    context.price_prior = 0
    context.day_prior = 0

    context.first_time = True

def handle_data(context, data):

    day = data[context.stocks[0]].datetime.day
    price = data[context.stocks[0]].price

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        context.day_prior = day
        context.price_prior = price
        return

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]
    if context.first_time:
        print minutely.ix['2013-01-02 21:00:00']
        print daily.ix['2013-01-02 00:00:00']
        context.first_time = False

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    record(rsi_0 = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

    if day != context.day_prior:
        print "Recorded: "+str(context.price_prior)
        print "Resampled: "+str(daily[context.stocks[0]][-1])
        print minutely.tail(3)

    context.day_prior = day
    context.price_prior = price

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = minutely.resample('D', closed='right', label='left', how='last').dropna()
    return (minutely,daily)

Hello Grant,

When I was playing with three timeframes above I realised that

daily[context.stocks[0]][-1]


is the last minutely price i.e.

ts
Out[203]:
2013-09-18 14:31:00    1.425262
2013-09-18 14:32:00   -0.598867
2013-09-18 14:33:00    1.774188
2013-09-18 14:34:00    0.164616
2013-09-18 14:35:00    0.479529
2013-09-18 14:36:00    1.116959
2013-09-18 14:37:00    1.330623
2013-09-18 14:38:00   -0.464643
2013-09-18 14:39:00    1.307154
2013-09-18 14:40:00    1.603750
2013-09-18 14:41:00   -0.447813
2013-09-18 14:42:00    0.677137
2013-09-18 14:43:00   -0.451986
Freq: T, dtype: float64

ts.resample('5Min', label='right', closed='left',how='last')
Out[204]:
2013-09-18 14:35:00    0.164616
2013-09-18 14:40:00    1.307154
2013-09-18 14:45:00   -0.451986
Freq: 5T, dtype: float64


I can imagine that in some cases this would be a good thing but also that at other times it would be better to have just daily closes in the DataFrame.
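
For the second case, keeping only completed daily closes, one option is to drop the final row whenever the latest day is still in progress. The sketch below does this in recent pandas on synthetic data. It assumes a full session always has 390 minute bars (no half-days, no missing minutes), which the discussion above suggests is not always true, so treat the bar-count test as a placeholder. The function name is hypothetical.

```python
import numpy as np
import pandas as pd

def completed_daily_closes(minutely, bars_per_day=390):
    """Last price of each day, excluding a still-incomplete final day."""
    by_day = minutely.groupby(minutely.index.normalize())
    closes = by_day.last()
    counts = by_day.count()
    # Assumption: fewer than bars_per_day bars means the session is not over.
    if counts.iloc[-1] < bars_per_day:
        closes = closes.iloc[:-1]
    return closes

# Two full synthetic sessions plus the first five minutes of a third.
index = (pd.date_range("2013-09-16 14:31", periods=390, freq="min")
         .append(pd.date_range("2013-09-17 14:31", periods=390, freq="min"))
         .append(pd.date_range("2013-09-18 14:31", periods=5, freq="min")))
minutely = pd.Series(np.arange(len(index), dtype=float), index=index)

closes = completed_daily_closes(minutely)
print(closes)
```

Here the third day contributes nothing to the result until all of its bars have arrived.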

I hope the 'refactoring' of batch_transform happens soon.

P.

Hello Peter,

It seems that the sure-fire way to accumulate daily closing prices is to store a trailing minutely price. Then, when a daily market open is indicated, the stored price will be the prior day's closing price. In the example above, I just store the prices, but I think one could append a pandas timeseries, as well, to build up the trailing window of closing prices while running a minutely backtest.

For 15-minute, 5-minute, etc. bars, it's perhaps more tricky, since one has to deal with missing data, etc.
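
A minimal sketch of that bookkeeping, written as plain Python and pandas rather than against the Quantopian API: the class and method names are hypothetical, and the bars are made up. It detects a new day by a change of calendar date, so it assumes the timestamps never straddle midnight within one session.

```python
import pandas as pd

class DailyCloseAccumulator(object):
    """Collects one closing price per day from a stream of minute bars."""

    def __init__(self):
        self.prior_time = None
        self.prior_price = None
        self.dates = []
        self.closes = []

    def update(self, timestamp, price):
        # A change of date means the stored bar was the prior day's last one.
        if (self.prior_time is not None
                and timestamp.date() != self.prior_time.date()):
            self.dates.append(self.prior_time.normalize())
            self.closes.append(self.prior_price)
        self.prior_time = timestamp
        self.prior_price = price

    def as_series(self):
        return pd.Series(self.closes, index=pd.DatetimeIndex(self.dates),
                         dtype=float)

acc = DailyCloseAccumulator()
bars = [
    (pd.Timestamp("2013-01-02 20:59"), 10.0),
    (pd.Timestamp("2013-01-02 21:00"), 11.0),  # last bar of 2 Jan
    (pd.Timestamp("2013-01-03 14:31"), 12.0),
    (pd.Timestamp("2013-01-03 21:00"), 13.0),  # last bar of 3 Jan
    (pd.Timestamp("2013-01-04 14:31"), 14.0),
]
for ts, px in bars:
    acc.update(ts, px)

closes = acc.as_series()
print(closes)
```

Because the close is only recorded when the next day's first bar arrives, a missing final minute simply means the last available bar of the day is used.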

Grant

Hello Peter,

Here's my latest effort. The batch transform is now:

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = pd.rolling_mean(minutely,390)[389:][::390]
    return (minutely,daily)


The idea is to use a rolling mean to generate a trailing window of 390-minute averages, sampled once per day, that can be fed to the TA-Lib RSI. This way, if a trading decision is to be made every minute, all of the data up to that point are used. This makes more sense to me than using daily closing prices, which incorporate only one point per day into the decision rather than all of the data.
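
Here is the same slicing spelled out on synthetic data, as a sketch for recent pandas, where pd.rolling_mean has been replaced by the .rolling(...).mean() method. The prices are made up, and the window is assumed to hold a whole number of 390-bar days.

```python
import numpy as np
import pandas as pd

BARS_PER_DAY = 390
n_days = 3

# Made-up prices 0, 1, 2, ... over three full "days" of minute bars.
minutely = pd.Series(np.arange(n_days * BARS_PER_DAY, dtype=float))

# Trailing 390-bar mean at every minute (NaN for the first 389 bars).
rolling = minutely.rolling(BARS_PER_DAY).mean()

# Equivalent of [389:][::390]: start at the first valid mean, then take
# every 390th value, giving one smoothed point per day.
daily = rolling.iloc[BARS_PER_DAY - 1::BARS_PER_DAY]
print(daily)
```

When the window length is an exact multiple of 390 bars, the last sampled point is the mean of the most recent 390 bars, ending at the current minute. If the window is shorter or longer than a whole number of days, the samples no longer line up with the latest bar.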

Grant

import pandas as pd

W_L = 14  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(8554),sid(24)]

def handle_data(context, data):

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        return

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]

    print minutely[context.stocks[1]].tail(5)
    print daily[context.stocks[1]]

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = pd.rolling_mean(minutely,390)[389:][::390]
    return (minutely,daily)

Here's the updated code, with the MA & RSI plots. --Grant

import pandas as pd
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24),sid(8554)]

def handle_data(context, data):

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        return

    price = data[context.stocks[0]].price

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]

    # print minutely
    # print daily

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    record(rsi = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = pd.rolling_mean(minutely,390)[389:][::390]
    return (minutely,daily)

Here's the algorithm with trading rules:

if ma_5 > ma_15 and rsi > 70 and notional < context.max_notional:
    order(context.stocks[0],100)
elif ma_5 < ma_15 and rsi < 30 and notional > context.min_notional:
    order(context.stocks[0],-100)
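
The same rules can be pulled out into a plain function so they can be checked without running a backtest. This is a sketch only: the function name and the lot argument are hypothetical, and the default limits are the values set in initialize() in the full listing for this post.

```python
def order_size(ma_fast, ma_slow, rsi, notional,
               max_notional=1000000.1, min_notional=-1000000.0, lot=100):
    """Return the number of shares to order under the rules above."""
    if ma_fast > ma_slow and rsi > 70 and notional < max_notional:
        return lot    # fast MA above slow MA and strong RSI: buy
    if ma_fast < ma_slow and rsi < 30 and notional > min_notional:
        return -lot   # fast MA below slow MA and weak RSI: sell
    return 0          # otherwise do nothing

print(order_size(101.0, 100.0, 75.0, 0.0))          # buy signal
print(order_size(99.0, 100.0, 25.0, 0.0))           # sell signal
print(order_size(101.0, 100.0, float("nan"), 0.0))  # RSI not yet defined
```

Comparisons against NaN are always False in Python, so while the RSI is still NaN during warm-up neither branch fires and no order is placed.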


It's cheating to use Apple, but might as well pick a winner for testing!

Grant

import pandas as pd
import zipline.transforms.ta as ztt

W_L = 15  # batch transform window length, days

def initialize(context):
    context.stocks = [sid(24),sid(8554)]

    context.max_notional = 1000000.1
    context.min_notional = -1000000.0

def handle_data(context, data):

    # get data
    data_minutely_daily = get_data(data,context.stocks)
    if data_minutely_daily is None:
        return

    price = data[context.stocks[0]].price

    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]

    # print minutely
    # print daily

    # get RSI & MAs
    rsi   =  float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(ztt.talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])

    record(rsi = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

    notional = context.portfolio.positions[context.stocks[0]].amount*price

    if ma_5 > ma_15 and rsi > 70 and notional < context.max_notional:
        order(context.stocks[0],100)
    elif ma_5 < ma_15 and rsi < 30 and notional > context.min_notional:
        order(context.stocks[0],-100)

@batch_transform(refresh_period=0, window_length=W_L) # set globals R_P & W_L above
def get_data(data,sids):
    minutely = data.price[sids]
    daily = pd.rolling_mean(minutely,390)[389:][::390]
    return (minutely,daily)

Hi Grant or anyone,

I am trying to run this algo and get this error:

There was a runtime error.
TypeError: Argument 'real' has incorrect type (expected numpy.ndarray, got Series)
... USER ALGORITHM:28, in handle_data
rsi = float(ztt.talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])

Any help debugging? I really want to use this algo since it is daily MACD and RSI using minute data! What I've been looking for!

I believe this is not running because it is not using 'history' and 'batch transform' is deprecated. Can someone update this a bit to get it working again? I really want to work on it, but I don't yet know enough about getting updated data feeds or debugging to be able to fix it properly.

thanks,
Andrew

Hi Andrew,

I'd definitely move away from the batch_transform in favor of history. Also, you shouldn't need to do the zipline import to use TA-LIB.
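
On the TypeError quoted above: the message says the TA-Lib call expected a numpy.ndarray and received a pandas Series. One thing to try is converting the Series to a float array before the call. The sketch below shows only the conversion, with made-up prices; the TA-Lib call itself is left as a comment because it needs the library installed, and whether a given TA-Lib wrapper version accepts a Series directly is something to check rather than assume.

```python
import numpy as np
import pandas as pd

# Made-up "daily" prices standing in for daily[context.stocks[0]].
daily = pd.Series([100.0, 101.5, 99.8, 102.3])

# Convert to the plain float ndarray that the error message asks for.
real = np.asarray(daily, dtype=float)
print(type(real), real.dtype)

# With TA-Lib available, the failing line would then become something like:
#   rsi = talib.RSI(real, timeperiod=14)[-1]
```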

I don't have time now to dig into it, but have a look at the help page, https://www.quantopian.com/help#ide-history. You could do:

minutely = history(15*390, '1m', 'price')


I need to check it, but I think you can just copy the code for smoothed "daily" prices:

daily = pd.rolling_mean(minutely,390)[389:][::390]


Grant

I did all of your suggestions and got it to run without errors, but for some reason no trades are being initiated. I made some edits to make sure there are no undefined variables, but since I'm new to this, I might be doing something stupid. Perhaps you could take a peek and see if anything glaring is wrong?

Thanks,
Andrew

I suggest posting the code to this discussion thread, since I can't access the link you shared. --Grant

Sorry I thought those links were public. Here's the code:


import pandas as pd
import talib

def initialize(context):
    context.stocks = [sid(24),sid(8554)]

def handle_data(context, data):
    minutely = history(15*390, '1m', 'price')
    daily = pd.rolling_mean(minutely,390)[389:][::390]
    return (minutely,daily)
    # get data
    data_minutely_daily = handle_data(data,context.stocks)
    if data_minutely_daily is None:
        return
    price = data[context.stocks[0]].price
    minutely = data_minutely_daily[0]
    daily = data_minutely_daily[1]
    # print minutely
    # print daily
    # get RSI & MAs
    rsi   =  float(talib.RSI(daily[context.stocks[0]], timeperiod=14)[-1:])
    ma_5  =  float(talib.MA(minutely[context.stocks[0]], timeperiod=5*390)[-1:])
    ma_15 =  float(talib.MA(minutely[context.stocks[0]], timeperiod=15*390)[-1:])
    record(rsi = rsi)
    record(ma_5=ma_5)
    record(ma_15=ma_15)
    record(price = price)

    if ma_5 > ma_15 and rsi > 70:
        order(context.stocks[0],100)
    elif ma_5 < ma_15 and rsi < 30:
        order(context.stocks[0],-100)
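
One thing that stands out when reading this listing (an observation, not something confirmed elsewhere in the thread): the body of get_data was pasted into the top of handle_data, so the function hits return (minutely,daily) on its third line and nothing after it ever runs, including the order() calls. A stripped-down illustration of the same pattern, with hypothetical names:

```python
orders = []

def handle_data_with_early_return(price):
    window = [price]
    return window          # execution stops here on every call
    orders.append(price)   # never reached, like the order() calls above

result = handle_data_with_early_return(100.0)
print(result, orders)
```

That would explain a backtest that runs without errors and places no trades.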


Here's some code that runs. No guarantee that it is correct, but it is a start. --Grant

import pandas as pd
import talib

def initialize(context):

    context.stock = symbol('AAPL')
    context.spy = symbol('SPY')

def handle_data(context, data):

    minutely = history(15*390, '1m', 'price')[context.stock]
    daily = pd.rolling_mean(minutely,390)[389:][::390]

    rsi   =  talib.RSI(daily, timeperiod=14)[-1]
    ma_5  =  talib.MA(minutely, timeperiod=5*390)[-1]
    ma_15 =  talib.MA(minutely, timeperiod=15*390)[-1]

    if get_open_orders():
        return

    if ma_5 > ma_15 and rsi > 70:
        order_target_percent(context.stock,1)
    elif ma_5 < ma_15 and rsi < 30:
        order_target_percent(context.stock,-1)
    else:
        order_target_percent(context.stock,0)

    record(leverage = context.account.leverage)
    record(rsi = rsi)
    record(ma_5 = ma_5)
    record(ma_15 = ma_15)

Awesome thanks!! this seems a lot simpler.

A

Have a look at schedule_function() on the help page. Rather than making a trading decision minutely, you could consider picking a fixed time every day. --Grant

I was able to tweak the MAs and RSI thresholds a bit and generate 80-100% yearly returns, even in downmarket environments!

Why would a fixed time be more accurate? A trade made at 10:32 AM versus 2:42 PM could mean a massive price difference. The reason I find this so important is that I used to design trading algorithms in a desktop program called Prodigio RTS. They are still around, but my main issues were the lack of minutely data and the lack of more than 3 years of daily data. I found that two things killed any seemingly good momentum quant strategy, and the main culprit was daily data. If a trigger was created by MAs crossing in the morning, the trade for that movement would go in the next trading day. That delay eliminated most of the profit from the trade, sometimes all of it, or even turned it into a loss of capital, where the trade would have been profitable if executed on the exact minute the trigger was pulled. Though calculating daily MAs with minutely data is very compute-intensive and slower to test, it is the only way to achieve good timing, and with trading, timing is everything.

The second culprit I alluded to is commissions, which can be solved using Quantopian's integration with Robinhood!

I will check out the schedule_function()

Is it possible to write an algorithm in Quantopian that optimizes over time using a feed of updated historical data? Is it possible to calculate values based on backtests made on Quantopian? While this seems programmatically hard, I believe self-optimizing algos which keep current and leverage the results of other algo backtests could be super cool and actually work a lot better. For example, an algo in live trading could rebalance MA period values every week based on running new backtests to determine which periods were the best fit for the stock or sector.

Why would a fixed time be more accurate?

It is a crude way of limiting the amount of trading. As it stands, there's nothing to prevent going in and out of the stock multiple times a day, since the trailing window from minutely = history(15*390, '1m', 'price')[context.stock] gets updated every minute. You might look through the backtest results to see if there is much "churn" in the trading.
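
One way to check for churn, sketched in recent pandas on made-up data: given a minute-by-minute position series (however you extract it from the backtest results, which is not shown here), count how many times the position changes within each day. The function name is hypothetical.

```python
import pandas as pd

def position_changes_per_day(position):
    """Count minute-to-minute position changes, grouped by calendar day."""
    changed = position.diff().fillna(0.0) != 0
    return changed.groupby(position.index.normalize()).sum()

# Made-up positions: in and out on the first day, one entry on the second.
index = pd.to_datetime([
    "2015-01-05 14:31", "2015-01-05 14:32", "2015-01-05 14:33",
    "2015-01-06 14:31", "2015-01-06 14:32",
])
position = pd.Series([0.0, 1.0, 0.0, 0.0, 1.0], index=index)

churn = position_changes_per_day(position)
print(churn)
```

Days with more than a couple of changes are the ones worth a closer look.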

Is it possible to write an algorithm in Quantopian that optimizes over time using a feed of updated historical data?

I'm not sure what you mean. The data feed is minutely, and you can do computations every minute.

Is it possible to calculate values based on backtests made on quantopian?

I'm not aware of a way to automate this sort of thing in the trading platform (e.g. there's no way to call a backtest function from within the backtester itself), though you might be able to simulate it in the Quantopian research platform. You could do it manually: for example, every week/month/quarter, re-optimize and tweak your algo parameters and then restart the algo (or figure out how to feed them in using fetcher).

Cool. So rebalancing has to be done externally and re-applied.

I meant an algo that can self-optimize over time by recalculating and comparing different parameters based on newer history (i.e. current real-time data) since the last calculation. Basically, while a 5 day MA is calculated every minute for trades, what if the fast MA was a range from 3-15 and the slow MA was a range of 20-200, and you had something that automatically calculated the best performing MA crossover combination (e.g. 5/50, 3/10, 9/26, 14/100, etc.) based on historical data, and redid that calculation every week or so based on new historical data?

I did not see any churn or intraday trading in the backtest. Most positions were held for a few days, few weeks, or few months before being sold. This is ideal for commissions and having a 'low frequency' quant algo.

I'm not aware of any way to do the automatic optimization you describe, but it might be possible if you wrote your own efficient mini-backtester function. The Quantopian backtester is event-driven; however, if your algo could be vectorized (without introducing bias), then you might be able to pull it off. The time-out for handle_data plus any scheduled functions is about 50 seconds. For before_trading_start the time-out is 5 minutes (for example, run the code I posted on https://www.quantopian.com/posts/test-of-before-trading-start-time-out ).
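
To give a flavor of what such a mini-backtester might look like, here is a rough sketch in recent pandas that scores every fast/slow MA pair on a price series and picks the best one. Everything in it is an assumption made for illustration: the function names, the long-or-flat rule, the one-bar delay between signal and trade, the absence of commissions and slippage, and the synthetic prices. Picking parameters by in-sample return like this also overfits easily.

```python
import itertools
import numpy as np
import pandas as pd

def crossover_return(prices, fast, slow):
    """Total return of a long-or-flat MA crossover rule, no costs."""
    fast_ma = prices.rolling(fast).mean()
    slow_ma = prices.rolling(slow).mean()
    # Hold 1 unit when fast > slow; shift one bar so each signal is only
    # acted on in the following bar (avoids look-ahead bias).
    position = (fast_ma > slow_ma).astype(float).shift(1).fillna(0.0)
    returns = prices.pct_change().fillna(0.0)
    return float((1.0 + position * returns).prod() - 1.0)

def best_crossover(prices, fast_range, slow_range):
    """Score every (fast, slow) pair with fast < slow; return the best."""
    results = {}
    for fast, slow in itertools.product(fast_range, slow_range):
        if fast < slow:
            results[(fast, slow)] = crossover_return(prices, fast, slow)
    best = max(results, key=results.get)
    return best, results

# Made-up, steadily rising prices just to exercise the code.
prices = pd.Series(np.linspace(100.0, 200.0, 300))
best, results = best_crossover(prices, range(3, 6), range(10, 13))
print(best, results[best])
```

Each pair is evaluated with whole-array operations rather than a bar-by-bar loop, which is what keeps a sweep like this within a fixed time budget.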

Your best bet, I think, would be to get up the learning curve for the Quantopian research platform, if you want to explore the parameter space. It used to be that you'd need to request access. I'm not sure if that is still the case. Note that you can manually launch as many backtests as you want (they run in parallel), and then pull the results into the research platform for analyses. Doing your study ("fast MA was a range from 3-15 and slow MA was a range of 20-200") would be tedious, but possible.

Finally back on the algo after a hiatus. How would one vectorize an algo and what would that do? Your post about before_trading_start mentioned being able to run things 5 mins before the trading day starts. Could that theoretically contain a whole mini backtester rebalancing function? Is the research platform you refer to a separate platform from the quantopian community like docs or a wiki? I didn't realize I could run multiple backtests concurrently. So basically I could clone it a bunch of times and put different numbers in each and have them run them all at the same time? Is there a rate limit on how many backtests you can run simultaneously? The next thing I want to test is swapping out the fast/slow daily MA for a daily MACD with minute data.

How would one vectorize an algo and what would that do?

Certain operations can be done very quickly in numpy (similar to MATLAB). Pandas supports this sort of thing, too. Rather than looping over computations, everything is done at once, essentially. If you do a search on "vectorized backtesting" it should give you an idea of how this might apply. As an example, say you wanted to simulate buying and holding SPY & TLT in various ratios over a 20 day trailing window of OHLCV minute bars, computing performance statistics of hypothetical portfolios. This should be doable in a vectorized fashion, so that computations would be highly efficient.
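
A rough numpy sketch of that SPY/TLT example: the function name is hypothetical, the two return series are toy numbers rather than real SPY or TLT data, and the only statistics computed are total return and maximum drawdown (measured from the first bar onward). All the weightings are evaluated at once through broadcasting, with no loop over portfolios.

```python
import numpy as np

def buy_and_hold_grid(ret_a, ret_b, weights_a):
    """Buy-and-hold results for several initial A/B weightings at once."""
    growth_a = np.cumprod(1.0 + np.asarray(ret_a, dtype=float))
    growth_b = np.cumprod(1.0 + np.asarray(ret_b, dtype=float))
    w = np.asarray(weights_a, dtype=float)[:, None]   # shape (n_weights, 1)
    # Portfolio value per weighting and per bar, starting from capital 1.
    values = w * growth_a + (1.0 - w) * growth_b
    total_return = values[:, -1] - 1.0
    running_max = np.maximum.accumulate(values, axis=1)
    max_drawdown = np.max(1.0 - values / running_max, axis=1)
    return total_return, max_drawdown

# Toy per-bar returns for two assets, and three initial weights in asset A.
ret_a = [0.10, -0.50]
ret_b = [0.00, 0.00]
total_return, max_drawdown = buy_and_hold_grid(ret_a, ret_b, [0.0, 0.5, 1.0])
print(total_return)
print(max_drawdown)
```

The same pattern scales to a 20 day window of minute bars and a fine grid of ratios, since the work is a handful of array operations.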

Is the research platform you refer to a separate platform from the quantopian community like docs or a wiki?

It is a separate platform, but backtest results can be imported (see the help docs). Also, just try My Code --> Notebooks, which should take you to the research platform.

I didn't realize I could run multiple backtests concurrently.

Yes, but really only from the backtester. In the research platform, I think you can have multiple notebooks running simultaneously, but there is an overall memory limit, which I don't think exists for the backtester. Also in the research platform, I'm not sure that one notebook can "talk" to another one, so comparing backtests might not be feasible. In any case, if you are interested in running a whole bunch of backtests, just fire them off. You can pull the results into the research platform, once they finish. I don't think there is any limit to the number you can run simultaneously.