An example of how to compute a moving price average on multiple securities (sids). I welcome questions and suggestions to improve the code. Thanks to all who helped with various elements.

Have a look at the quick and full backtest logs to understand what is going on.

Sometimes the log output hangs under the quick backtest, so it'd be helpful if folks cloned the code to see if this problem persists.

I just ran the code, and it executed under both the quick and full backtests. I then went back to the quick backtest and ran it again. The log output was "End of logs." Upon running it again, I get "Waiting for logs..." indefinitely. So, the code posted above may be flagging a bug/weakness in the backtester.

Seems to work consistently under the full backtester, but when I flipped back to the quick one, I got a log output of:

2012-01-06 handle_data:16 DEBUG [ 48.72 73.75 62.735 22.945 28.6275 59.195 31.37 73.455 54.86 46.515 ]
End of logs.

Hello Grant. I was able to reproduce this one and have filed a bug on the logging behavior. Thanks for the pointer. Dan

Grant,

As a quick suggestion, you should be able to vectorize the mean computation and reduce the code to

@batch_transform(refresh_period=R_P, window_length=W_L)  
def get_avg(datapanel, sids):  
    return datapanel.price.mean()  

Thanks Thomas,

Some questions/comments:

  1. What data type/format is returned by the function you provide above? Is it a vector?
  2. It seems that the second get_avg argument, sids, is not used. Isn't it necessary to specify what datapanel data to operate on? Or does the function only work if the datapanel contains sid data (as I understand, in the future, other data/indicators could be present)?
  3. Is mean() actually the numpy mean? Or something else?
  4. I'd prefer the flexibility of numpy average, since weights can be incorporated.

Grant,

  1. A pandas series (1D vector).
  2. The SID information is already present in the data panel as the column names. No need to supply it twice. That should be added to the documentation now that I think of it.
  3. It's the pandas mean function. It might call numpy.mean() under the hood, but I don't think it does; I'm not 100% sure.
  4. You can check if pandas has a weighted mean function -- it should.
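To make answers 1 and 2 concrete, here is a minimal sketch in plain pandas, outside the backtester. It assumes that datapanel.price behaves like a DataFrame with one row per bar in the trailing window and one column per sid; the sid numbers and prices below are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for datapanel.price:
# rows are bars in the trailing window, columns are sids.
prices = pd.DataFrame(
    {24: [10.0, 11.0, 12.0], 5061: [30.0, 32.0, 34.0]},
    index=pd.date_range("2012-01-03", periods=3, freq="D"),
)

# DataFrame.mean() averages down each column by default (axis=0),
# so the result is a Series with one entry per sid.
avgs = prices.mean()

print(type(avgs).__name__)   # Series
print(list(avgs.index))      # the sids, taken from the column names
print(avgs.loc[24], avgs.loc[5061])
```

Because the sids come back as the index of the result, there is no need to pass them in separately.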

Hello Thomas,

If you have a reference to the pandas mean function, please pass it along. I did some poking around on the web, and nothing came up easily.

Hi Grant,

It seems weighted mean is not yet implemented in pandas (https://github.com/pydata/pandas/issues/886). However, you can still use vectorization with the numpy function if you prefer. E.g.:

@batch_transform(refresh_period=R_P, window_length=W_L)  
def get_avg(datapanel):  
    avg = np.average(datapanel.price.values, axis=0)  
    return pd.Series(avg, index=datapanel.minor_axis)  

This will keep the sids as the index to the Series. Of course, you can also just return the numpy array and work with that.

Thomas,

In case another Python newbie reads this, your last example above requires:

import numpy as np  
import pandas as pd  

Took me some googling to sort out that "pd" is conventional shorthand for "pandas".

Hello Thomas,

Thanks for your tips. Attached, please find an update to my "multi-sid batch transform moving average" effort. I'm trying to sort out what these functions return and why I'd want to use one of them versus returning a vector of the average sid prices:

@batch_transform(refresh_period=R_P, window_length=W_L)  
def get_avgs_3(datapanel, sids):  
    return datapanel.price.mean()

@batch_transform(refresh_period=R_P, window_length=W_L)  
def get_avgs_4(datapanel):  
    avgs = np.average(datapanel.price.values, axis=0)  
    return pd.Series(avgs, index=datapanel.minor_axis)  

As I mentioned above, once I get the basic framework down for doing the moving average via the batch transform, I plan to add weights.
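For reference, a rough sketch of how weights might be added with np.average, outside the backtester. The function name weighted_avgs, the sample prices, and the linear weights are all hypothetical; the only assumption about the real data is that the price table has one row per bar and one column per sid, as in get_avgs_4.

```python
import numpy as np
import pandas as pd

def weighted_avgs(prices, weights):
    """Weighted average price per sid over the trailing window.

    prices  : DataFrame, rows are bars (oldest first), columns are sids.
    weights : 1D array with one weight per row of prices.
    """
    # With axis=0, np.average applies one weight to each row (bar).
    avgs = np.average(prices.values, axis=0, weights=weights)
    # Re-attach the sids so the result can be indexed by sid.
    return pd.Series(avgs, index=prices.columns)

# Hypothetical data: three bars, two sids.
prices = pd.DataFrame({24: [10.0, 11.0, 12.0], 5061: [30.0, 32.0, 34.0]})

# Hypothetical linear weights that favor the most recent bar.
weights = np.array([1.0, 2.0, 3.0])

result = weighted_avgs(prices, weights)
print(result)
```

The weights do not need to sum to one, since np.average normalizes by their sum. Inside a batch transform the body would presumably operate on datapanel.price and datapanel.minor_axis instead of a hand-built DataFrame.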

Also, I'm still sorta confused about the Quantopian thinking regarding moving averages and the batch transform. It seems that the trailing window length is not actually a fixed number of ticks under the full/minutely backtest.
