
I set refresh_period=0 and it either hangs in the quick daily backtest or outputs unrepeatable results. It seems to hang consistently in the full backtest.

I realize that refresh_period=0 may not be a valid setting. If so, the code should report an error to the user.


@Grant, great find!

Fawce,

The full backtest (01/03/2012 to 01/06/2012) actually completes (slowly) with:

def initialize(context):
    context.stock = sid(16841)

def handle_data(context, data):
    avg = get_avg(data, context.stock)
    event_time = data[context.stock].datetime
    log.debug(event_time)
    log.debug(avg)

@batch_transform(refresh_period=0, window_length=1)
def get_avg(datapanel, sid):
    prices = datapanel['price']
    avg = prices[sid].mean()
    return avg

A sample of the log output:

2012-01-03 handle_data:8 DEBUG 2012-01-03 20:59:00+00:00
2012-01-03 handle_data:9 DEBUG None
2012-01-03 handle_data:8 DEBUG 2012-01-03 21:00:00+00:00
2012-01-03 handle_data:9 DEBUG None
2012-01-04 handle_data:8 DEBUG 2012-01-04 14:31:00+00:00
2012-01-04 handle_data:9 DEBUG 177.443721228
2012-01-04 handle_data:8 DEBUG 2012-01-04 14:32:00+00:00
2012-01-04 handle_data:9 DEBUG 177.454245524
2012-01-04 handle_data:8 DEBUG 2012-01-04 14:33:00+00:00
2012-01-04 handle_data:9 DEBUG 177.461867008
2012-01-04 handle_data:8 DEBUG 2012-01-04 14:34:00+00:00
2012-01-04 handle_data:9 DEBUG 177.470076726

Note that avg changes every minute. So, what's happening? Is it an N-tick moving price average that gets updated every minute? If so, what is N? Is N a fixed value regardless of the datetime?

See https://www.quantopian.com/posts/mavg-days-transform-details for more of the confusion around computing moving averages for minutely trading.

@Grant,

For the mavg transform available on the data parameter in handle_data, the calculation is always done over all the available bars within the trailing trading-day range, regardless of the duration of the bar. So, a 3-day mavg over daily bars averages 3 daily bars, while a 3-day mavg over minute bars averages 3 * 390 = 1170 minute bars. What I hear from all the confusion is the user voice saying: let me stipulate the granularity of the bars used in these moving calculations. In fact, I hear the request for more direct control over the granularity of bars in many areas. I'm starting to think we should add a magic method for initialize to stipulate the granularity desired.
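To make the bar-count arithmetic above concrete, here is a small sketch (using pandas, not the actual Quantopian internals) of how a "3 day" trailing mean translates into bar counts in minute mode, assuming 390 regular-session minute bars per trading day:

```python
import numpy as np
import pandas as pd

MINUTES_PER_DAY = 390  # regular US equity session: 6.5 hours

# Simulated minute-bar prices for 5 trading days (illustrative data only)
rng = np.random.default_rng(0)
prices = pd.Series(100 + np.cumsum(rng.standard_normal(5 * MINUTES_PER_DAY)) * 0.01)

# A "3 day" mavg over minute bars averages the last 3 * 390 = 1170 bars
window = 3 * MINUTES_PER_DAY
mavg_3d = prices.rolling(window).mean()

print(window)            # 1170 bars in the trailing window
print(mavg_3d.iloc[-1])  # mean of the most recent 1170 minute bars
```

The same "3 day" window over daily bars would be `prices.rolling(3).mean()`, i.e. just 3 bars.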

In the code above, the value of zero for refresh_period is an edge case that should raise an exception. My hunch is that zero is triggering a run of the batch transform on every single event, which would explain why the test took so long to run. I think the right thing here is to raise an exception to the user, or better, a validation warning.
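The validation Fawce describes could look something like the following sketch. This is a hypothetical stand-in, not the actual Quantopian implementation; the real decorator would wrap the function with windowing logic, where here it just passes it through once the parameters are checked:

```python
def validated_batch_transform(refresh_period, window_length):
    """Reject edge-case parameters up front (sketch only)."""
    if refresh_period < 1:
        raise ValueError(
            f"refresh_period must be >= 1, got {refresh_period}: "
            "0 would re-run the batch transform on every event")
    if window_length < 1:
        raise ValueError(f"window_length must be >= 1, got {window_length}")

    def decorator(func):
        # A real implementation would attach windowing/refresh machinery;
        # this sketch only validates and passes the function through.
        return func
    return decorator

# Valid parameters decorate normally:
@validated_batch_transform(refresh_period=1, window_length=1)
def get_avg(datapanel, sid):
    return datapanel['price'][sid].mean()

# refresh_period=0 is rejected immediately, before any backtest runs:
try:
    validated_batch_transform(refresh_period=0, window_length=1)
except ValueError as e:
    print(e)
```

Failing fast at decoration time is cheaper than discovering the problem as a multi-hour backtest.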

Thanks Fawce,

I'll stay tuned...

I think what is required for a true minutely moving average is to recalculate the average every minute, right? This is what I'd like to do--turning out to be a bit tricky! If it takes a long time to run a backtest, that's o.k. (so long as I have a sense for the approximate total time it'll take). For live/paper trading, would you expect the computation each minute to take long? So long as it is well under a minute, trades can be executed, correct?
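On the cost question: a minutely moving average does not have to re-scan the whole window each minute. An iterative implementation (the kind Fawce mentions the built-in transforms use) updates in constant time per bar, so the per-minute cost is trivially under a minute. A minimal sketch:

```python
from collections import deque

class MinuteMovingAverage:
    """Trailing mean over the last n_bars minute bars, O(1) per update (sketch)."""

    def __init__(self, n_bars):
        self.n_bars = n_bars
        self.window = deque()
        self.total = 0.0

    def update(self, price):
        # Add the newest bar, drop the oldest once the window is full,
        # and keep a running sum so no re-scan of the window is needed.
        self.window.append(price)
        self.total += price
        if len(self.window) > self.n_bars:
            self.total -= self.window.popleft()
        return self.total / len(self.window)

mavg = MinuteMovingAverage(n_bars=3)
for px in [10.0, 11.0, 12.0, 13.0]:
    print(mavg.update(px))
# final value is the mean of [11.0, 12.0, 13.0] = 12.0
```

Each minute's update is two arithmetic operations and a deque append/pop, regardless of window length.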

For the batch transform decorator, it would seem to make more sense to specify refresh_period and window_length in minutes, rather than in days, since the full backtest data are down to the minute. As a first step, just give the user access to the raw data going back as far as he wants and as frequently as he wants. You can build up the tools from there.

@Grant, you are right! The transforms, such as mavg, should update as frequently as the data is sent to handle_data. The transforms used from handle_data have reasonably efficient iterative implementations, so they are not free but they are less expensive than a batch.

Specifying the window length in terms of the number of bars (be they minute or day) makes some sense. I'm not totally sold because there are a lot of details we handle under the hood with respect to trading days and ensuring sequential batches are comparable. I'm pretty sure the refresh period should always have the same units as the window_length. Maybe we have the unit default to days, but allow a new parameter to specify an override to minute?
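The unit-override idea above might reduce to a small conversion step. The sketch below is purely illustrative (the `units` parameter is hypothetical, not an existing API), showing how both window_length and refresh_period could share one unit, defaulting to days:

```python
MINUTES_PER_TRADING_DAY = 390  # regular US equity session

def window_size_in_bars(window_length, refresh_period, units='days'):
    """Convert window_length and refresh_period into minute-bar counts.

    Both parameters are kept in the same units, as suggested above;
    'days' is the default, with 'minutes' as an explicit override.
    """
    if units == 'days':
        scale = MINUTES_PER_TRADING_DAY
    elif units == 'minutes':
        scale = 1
    else:
        raise ValueError(f"unknown units: {units!r}")
    return window_length * scale, refresh_period * scale

print(window_size_in_bars(3, 1))                    # (1170, 390)
print(window_size_in_bars(30, 5, units='minutes'))  # (30, 5)
```

Keeping refresh_period and window_length in the same unit, as Fawce suggests, avoids mixed-unit surprises like the original refresh_period=0 case.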

Thanks for explaining so clearly what you'd like to see, helps immensely!

Fawce,

Well, you could leave it as is, but let refresh_period = 0 go down to the minute (if that's not what it's doing now...I can't tell). Then the user would have to write logic if a longer refresh period were desired.

In Matlab, there is a way of determining the number of arguments to a function (see http://www.mathworks.com/help/matlab/ref/nargin.html). This is a standard approach to extending the functionality of a Matlab function. Would something like this work in Python with the decorator (e.g. @batch_transform(refresh_period=0, window_length=30, refresh_period_minutes=5))?
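Python handles this differently from Matlab's nargin: optional keyword arguments default to a sentinel (commonly None), and the function simply checks whether the caller supplied a value. A toy sketch of the decorator signature Grant proposes (the name `refresh_period_minutes` is his hypothetical parameter, and this is not the real batch_transform):

```python
def batch_transform_sketch(refresh_period=1, window_length=1,
                           refresh_period_minutes=None):
    """Toy decorator factory illustrating an optional minute-level override."""
    if refresh_period_minutes is not None:
        # Caller supplied the override, analogous to Matlab checking nargin.
        effective_minutes = refresh_period_minutes
    else:
        # Default: interpret refresh_period in trading days (390 min each).
        effective_minutes = refresh_period * 390

    def decorator(func):
        # Expose the resolved refresh interval for inspection in this sketch.
        func.refresh_minutes = effective_minutes
        return func
    return decorator

@batch_transform_sketch(refresh_period=0, window_length=30,
                        refresh_period_minutes=5)
def get_avg(datapanel, sid):
    return datapanel['price'][sid].mean()

print(get_avg.refresh_minutes)  # 5
```

Because unsupplied keywords just take their defaults, no explicit argument counting is needed, though as Grant notes, interactions between overlapping switches still need careful documentation.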

Controlling the functionality with the number of arguments (in this case "switches") has to be well-documented, since it can get confusing and become mistake-prone.
