Tearsheet Analysis of Algo Performance in our Research Environment

With our recent release enabling cloning of notebooks from the Quantopian Research environment, we can now more easily share some of our in-house analytics tools with our community of users. What I’m sharing today is a notebook that you can clone into your own Research environment to analyze and visualize your backtest’s performance across a variety of dimensions.

The notebook contains all of the functions needed to compute the performance statistics, so it is fully self-contained, which also means a tearsheet can be computed for any timeseries you pass to it. So, you can upload a CSV of the daily returns of your favorite mutual fund and see how it looks, or simply pass in the timeseries of a few stocks. An example of how to accomplish this using a single stock is included at the end of this notebook.

### Performance Statistics

• Similar to what is reported in the Quantopian Backtester, plus a few additional metrics.
• Ability to specify an “out-of-sample” or “live trading” date. So, say you’ve started to live papertrade an algo, or started trading it with real money, and you’d like to easily perform a side-by-side comparison of the algo’s performance -- this is now as simple as specifying the date, which will slice your backtest into 2 pieces and generate performance stats for each.
• If you don’t specify an out-of-sample date, by default the function will split your backtest into 2 equally sized time periods, letting you see how consistently the 2nd timeframe matched the 1st.
• Similarity between the distributions of your backtest vs. live trading daily returns. E.g. how closely does the out-of-sample trading of your algo match the expectations set by the backtest?
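The backtest/out-of-sample split described above can be sketched as follows. This is an illustrative sketch, not the notebook's actual API; the function names `split_returns` and `summary_stats` are invented here, and the annual-return formula matches the mean-compounding approach discussed later in this thread.

```python
import numpy as np
import pandas as pd

def split_returns(returns, live_start_date=None):
    """Split a daily-returns Series at live_start_date.

    If no date is given, split into two equally sized halves,
    mirroring the notebook's default behavior. The boundary day
    is counted as part of the live period.
    """
    if live_start_date is None:
        live_start_date = returns.index[len(returns) // 2]
    backtest = returns.loc[:live_start_date].iloc[:-1]
    live = returns.loc[live_start_date:]
    return backtest, live

def summary_stats(returns, periods=252):
    """A few of the side-by-side stats (risk-free rate assumed zero)."""
    return {
        'annual_return': (1 + returns.mean()) ** periods - 1,
        'annual_volatility': returns.std() * np.sqrt(periods),
        'sharpe_ratio': np.sqrt(periods) * returns.mean() / returns.std(),
    }

# synthetic daily returns standing in for a backtest result
idx = pd.bdate_range('2012-01-02', periods=500)
rets = pd.Series(np.random.default_rng(0).normal(0.0005, 0.01, 500), index=idx)
backtest, live = split_returns(rets)  # no date given -> equal halves
```

Computing `summary_stats(backtest)` and `summary_stats(live)` side by side gives the two-column comparison the bullet points describe.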

### VaR (Value-At-Risk) Table

• Simple 2-stdev VaR calculation at 1-day and 1-week horizons
• We are working on building out more extensive VaR metrics
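A minimal sketch of the 2-standard-deviation parametric VaR described above, assuming approximately normal daily returns and square-root-of-time scaling for the weekly horizon; the notebook's exact implementation may differ.

```python
import numpy as np

def var_2std(daily_returns, horizon_days=1):
    """Two-sigma Value-at-Risk over a horizon, via sqrt-time scaling."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    return 2.0 * daily_returns.std() * np.sqrt(horizon_days)

rng = np.random.default_rng(1)
rets = rng.normal(0.0, 0.01, 1000)   # ~1% daily volatility
one_day = var_2std(rets, 1)          # roughly 2% for 1% daily vol
one_week = var_2std(rets, 5)         # one trading week = 5 days
```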

### Stress Events Table

• How your algo performed during specific macroeconomic periods of “stress”
• The columns mean/min/max refer to the average daily return of the algo (mean), the worst daily return of your algo (min), and the best daily return of your algo (max)

### Cumulative Returns Plot

• Green: Backtest performance
• Red: Live, out-of-sample, performance
• A couple of broad indexes for comparison purposes.
• S&P500 (does not include dividends paid)
• Intermediate Term Bonds ETF (does not include dividends paid)
• These should not necessarily be used as benchmarks for performance so much as they should be used to see when large market/macro shock events occurred and how your algo fared.
• “Cone Charts” (within the Cumulative Returns Plot)
• The cone is meant to serve as a guide post for how we might expect an algorithm to perform in the future based solely upon its backtest.
• RED cone: The live papertrading period (e.g. out-of-sample period). The cone drawn around this live period was based solely on the daily returns data from the backtest period.
• BLUE cone: A blue cone is drawn going out one year into the future showing the 1.0 stdev expected range of the algo's performance if we were to start trading it today. This blue cone, too, is computed solely from the backtest (GREEN) daily returns/volatility data.
• Math Behind the Cone
1. Compute Algo Volatility Expectation from the Backtest.
1. Uses only the backtest data to compute the daily volatility of the algo.
2. Compute Algo Profit Expectation from the Backtest
1. From the backtest data only, calculate the linear trend of the algo's profit (e.g. the slope of how it goes up-and-to-the-right as it generates profits)
2. Use this linear trend established from the backtest daily returns: if the algo performs similarly in live trading, we should expect the live performance to fall within the cone drawn around this linear trend +/- the volatility computed in #1. (The linear trends are shown as dashed lines in the above plot.)
3. Compute the Cone based on Backtest Returns & Volatility
1. The midpoint line of the cone (e.g. average profit expectation) is the linear trend established in #2 above.
2. Then draw the cone using the daily volatility computed in #1 above, scaling the daily value by the number of days the algo has been live, using the conventional square-root-of-time rule. E.g., to scale a daily volatility to an annual volatility, multiply the daily value by sqrt(252).
3. So, for example, an algo with a 1% daily volatility has a ~16% annual volatility: 1% * sqrt(252). This results in a cone with a width of +/-16% at a point in time 1 year from the start of the cone. Similarly, the cone width at month 6 would be +/-11%, since 1% * sqrt(126) = 11%. This calculation is simply done every N days from the start of the cone to set the cone's width at each N-day mark as time progresses forward.
• This is another area where we are spending more time researching advanced methods for modeling the future returns and variance expectations of an algo based solely on its backtest data. The more accurately we model this, the greater our confidence in detecting when an algorithm has stopped working after we have invested in it, triggering a warning that the algorithm may have to be turned off.
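The three cone steps above can be sketched as below. This is an illustrative reconstruction, not the notebook's exact code: the linear trend is fit to simple summed daily returns, and the width reproduces the worked example (1% daily vol gives a width of about 16% one year out).

```python
import numpy as np

def cone_bounds(backtest_daily_returns, horizon_days, num_std=1.0):
    rets = np.asarray(backtest_daily_returns, dtype=float)
    daily_vol = rets.std()                              # step 1: daily volatility
    cum = np.cumsum(rets)                               # cumulative (summed) P&L
    slope = np.polyfit(np.arange(len(cum)), cum, 1)[0]  # step 2: linear trend
    days = np.arange(1, horizon_days + 1)
    mid = slope * days                                  # step 3a: cone midpoint
    width = num_std * daily_vol * np.sqrt(days)         # step 3b: sqrt-time scaling
    return mid - width, mid, mid + width

rng = np.random.default_rng(2)
bt_rets = rng.normal(0.001, 0.01, 252)     # one year of backtest returns
lower, mid, upper = cone_bounds(bt_rets, horizon_days=252)
```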

### Rolling Beta Plot

• Your algo’s beta computed over a rolling 6-month window of time
• See how your algo’s exposure to the SP500 market fluctuates over time or if it is steadily market-neutral.
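A rolling 6-month beta (roughly 126 trading days) can be sketched with pandas rolling covariance and variance; a sketch under the assumption that beta is estimated as cov(algo, benchmark) / var(benchmark), which is the standard single-factor definition, not necessarily the notebook's exact code.

```python
import numpy as np
import pandas as pd

def rolling_beta(algo_rets, bench_rets, window=126):
    """Beta of algo vs. benchmark over a rolling window of daily returns."""
    cov = algo_rets.rolling(window).cov(bench_rets)
    var = bench_rets.rolling(window).var()
    return cov / var

idx = pd.bdate_range('2013-01-02', periods=300)
rng = np.random.default_rng(3)
bench = pd.Series(rng.normal(0, 0.01, 300), index=idx)
# synthetic algo with a true beta of 0.5 plus idiosyncratic noise
algo = 0.5 * bench + pd.Series(rng.normal(0, 0.002, 300), index=idx)
beta = rolling_beta(algo, bench)
```

A flat line near zero in this plot is the signature of a market-neutral strategy.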

### Rolling Sharpe Ratio

• Allows you to see how your algo’s Sharpe Ratio fluctuates over time or if it is very steady and earns profit consistently over time.
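The rolling Sharpe calculation can be sketched the same way; this assumes a zero risk-free rate and annualization by sqrt(252), which matches the other formulas in this thread but is not necessarily the notebook's exact code.

```python
import numpy as np
import pandas as pd

def rolling_sharpe(daily_rets, window=126, periods=252):
    """Annualized Sharpe ratio over a rolling window (risk-free rate = 0)."""
    mean = daily_rets.rolling(window).mean()
    std = daily_rets.rolling(window).std()
    return np.sqrt(periods) * mean / std

idx = pd.bdate_range('2013-01-02', periods=260)
rets = pd.Series(np.random.default_rng(4).normal(0.001, 0.01, 260), index=idx)
rs = rolling_sharpe(rets)   # NaN for the first window-1 days
```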

### Heatmap of Monthly Returns

• Each month’s returns of your algo, heatmapped so you can see the months you made the most or lost the most money.
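The year-by-month table behind such a heatmap can be sketched by compounding daily returns within each calendar month (function name illustrative); the resulting frame is what you would hand to something like sns.heatmap, as discussed later in this thread.

```python
import numpy as np
import pandas as pd

def monthly_returns_table(daily_rets):
    """Compound daily returns within each calendar month; rows=years, cols=months."""
    compounded = (1 + daily_rets).groupby(
        [daily_rets.index.year, daily_rets.index.month]).prod() - 1
    compounded.index.names = ['year', 'month']
    return compounded.unstack(level='month')

# tiny example: two +1% days in January, one -2% day in February
idx = pd.to_datetime(['2014-01-02', '2014-01-03', '2014-02-03'])
rets = pd.Series([0.01, 0.01, -0.02], index=idx)
tbl = monthly_returns_table(rets)
```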

### Annual Returns Bar Chart

• Allows easy visualization of how steadily your algo performs on a yearly basis

### Distribution of Monthly Returns

• Easy inspection of whether your algo has a consistent mean profit each month and whether the algo experienced any large “fat-tailed” events

### Daily Returns Similarity

• The distribution of your live, out-of-sample, daily returns overlaid upon the distribution of daily returns from your backtest. If they perfectly align, the result would be 100% similarity.
• This allows you to quickly see if your live out-of-sample performance is aligned with the expectations of your backtest.
• These distributions are determined after scaling the observations underlying each distribution and using a Gaussian kernel density estimator to estimate the probability density function.
• The reason behind scaling the returns observations is to account for the fact that there are most likely many more observations in your backtest than in your live out-of-sample period.
• To see how the scaling was done check out the function in the notebook.
• We are doing much more work in this area since it is a very important aspect of choosing algorithms for the hedge fund.
• We would love to hear others’ thoughts about ways to improve or revise this current methodology – if you have ideas leave them in the comments!
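One way to sketch such a similarity score: z-score both samples, fit a Gaussian kernel density estimator to each, and measure the overlap area of the two densities. This is an assumption-laden stand-in (the scaling here is plain z-scoring; the notebook's actual scaling differs, as noted above -- see its source for the real function).

```python
import numpy as np
from scipy.stats import gaussian_kde

def similarity_pct(backtest_rets, live_rets, grid_points=500):
    """Overlap (0-100%) of the KDE-estimated densities of two z-scored samples."""
    def zscore(x):
        x = np.asarray(x, dtype=float)
        return (x - x.mean()) / x.std()
    a, b = zscore(backtest_rets), zscore(live_rets)
    grid = np.linspace(-5, 5, grid_points)
    pdf_a = gaussian_kde(a)(grid)
    pdf_b = gaussian_kde(b)(grid)
    # overlap area of the two densities (1.0 = identical, 0.0 = disjoint)
    overlap = np.minimum(pdf_a, pdf_b).sum() * (grid[1] - grid[0])
    return round(100 * overlap, 1)

rng = np.random.default_rng(5)
# two samples from the same process should score high
sim_same = similarity_pct(rng.normal(0, 0.01, 1000), rng.normal(0, 0.01, 200))
# a strongly bimodal sample vs. a normal one should score lower
bimodal = np.concatenate([rng.normal(-3, 0.3, 100), rng.normal(3, 0.3, 100)])
sim_diff = similarity_pct(rng.normal(0, 1, 1000), bimodal)
```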

### Box and Whisker Plots

• Show the average, 25th, 75th percentile returns + outliers across three different timeframes: daily, weekly, monthly

### “Stress Event” (Grid of Small Plots)

• How well did your algo perform during large stress events.
• Gray: SP500
Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.


Here is the algo/backtest that the above notebook references so that you can have access to it when cloning the notebook.

-Justin

[Attached backtest result widget: Total Returns, Alpha, Beta, Sharpe, Sortino, Max Drawdown, Benchmark Returns, and Volatility, each also broken out over 1 Month / 3 Month / 6 Month / 12 Month windows; the values themselves are not preserved here.]
import numpy as np
import statsmodels.api as sm
import pandas as pd
import pytz

def initialize(context):
    # Quantopian backtester specific variables
    context.y = symbol('USO')
    context.X = symbol('GLD')

    # just disregard this
    context.useHRlag = True
    context.HRlag = 2

    # strategy specific variables
    # based on 'offline' research
    context.lookback = 20  # used for regression
    context.z_window = 20  # used for zscore calculation, must be <= lookback

    context.spread = np.array([])
    context.hedgeRatioTS = np.array([])
    context.inLong = False
    context.inShort = False

    # based on 'offline' research
    context.entryZ = 1.0
    context.exitZ = 0.0

    if not context.useHRlag:
        # a lag of 1 means no-lag, this is used for np.array[-1] indexing
        context.HRlag = 1

# Will be called on every trade event for the securities you specify.
def handle_data(context, data):
    if get_open_orders():
        return

    now = get_datetime()
    exchange_time = now.astimezone(pytz.timezone('US/Eastern'))

    # only trade once per day, shortly before the close
    if not (exchange_time.hour == 15 and exchange_time.minute == 30):
        return

    prices = history(35, '1d', 'price').iloc[-context.lookback::]

    y = prices[context.y]
    X = prices[context.X]

    try:
        hedge = hedge_ratio(y, X, add_const=True)
    except ValueError as e:
        log.debug(e)
        return

    context.hedgeRatioTS = np.append(context.hedgeRatioTS, hedge)
    if context.hedgeRatioTS.size < context.HRlag:
        return
    # Grab the previous day's hedgeRatio
    hedge = context.hedgeRatioTS[-context.HRlag]

    # Calculate the current day's spread and add it to the running tally
    context.spread = np.append(context.spread, y[-1] - hedge * X[-1])
    if context.spread.size < context.z_window:
        return

    # Keep only the z-score lookback period
    spreads = context.spread[-context.z_window:]
    zscore = (spreads[-1] - spreads.mean()) / spreads.std()

    if context.inShort and zscore < context.exitZ:
        order_target(context.y, 0)
        order_target(context.X, 0)
        context.inShort = False
        context.inLong = False
        record(stock_Y_pct=0, stock_X_pct=0)
        return

    if context.inLong and zscore > context.exitZ:
        order_target(context.y, 0)
        order_target(context.X, 0)
        context.inShort = False
        context.inLong = False
        record(stock_Y_pct=0, stock_X_pct=0)
        return

    if zscore < -context.entryZ and (not context.inLong):
        y_target_shares = 1
        X_target_shares = -hedge
        context.inLong = True
        context.inShort = False

        (y_target_pct, x_target_pct) = computeHoldingsPct(y_target_shares, X_target_shares, y[-1], X[-1])
        order_target_percent(context.y, y_target_pct)
        order_target_percent(context.X, x_target_pct)
        record(stock_Y_pct=y_target_pct, stock_X_pct=x_target_pct)
        return

    if zscore > context.entryZ and (not context.inShort):
        y_target_shares = -1
        X_target_shares = hedge
        context.inShort = True
        context.inLong = False

        (y_target_pct, x_target_pct) = computeHoldingsPct(y_target_shares, X_target_shares, y[-1], X[-1])
        order_target_percent(context.y, y_target_pct)
        order_target_percent(context.X, x_target_pct)
        record(stock_Y_pct=y_target_pct, stock_X_pct=x_target_pct)

def hedge_ratio(y, X, add_const=True):
    # regress y on X; the slope is the hedge ratio
    if add_const:
        X = sm.add_constant(X)
        model = sm.OLS(y, X).fit()
        return model.params[1]
    model = sm.OLS(y, X).fit()
    return model.params.values

def computeHoldingsPct(yShares, xShares, yPrice, xPrice):
    # convert share counts into percent-of-notional portfolio weights
    yDol = yShares * yPrice
    xDol = xShares * xPrice
    notionalDol = abs(yDol) + abs(xDol)
    y_target_pct = yDol / notionalDol
    x_target_pct = xDol / notionalDol
    return (y_target_pct, x_target_pct)


There was a runtime error.

I wonder if it would be helpful to augment the in/out-of-sample distribution comparisons with a Kolmogorov-Smirnov test? I don't know much about it, but I recall seeing a presentation wherein it seemed useful.

@Simon: Thanks for proposing the KS test. I've just done a little digging around to learn about it, and it certainly feels like it could be a useful addition to our testing procedure for analyzing backtest vs. out-of-sample performance.
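For reference, the two-sample Kolmogorov-Smirnov test Simon mentions is available in SciPy as `scipy.stats.ks_2samp`: it tests whether the backtest and out-of-sample daily returns could plausibly come from the same distribution (a small p-value suggests they do not). The returns below are synthetic illustrations, not real algo data.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(7)
backtest = rng.normal(0.0005, 0.01, 750)    # in-sample daily returns
live_same = rng.normal(0.0005, 0.01, 120)   # drawn from the same process
live_shift = rng.normal(-0.004, 0.02, 120)  # a simulated regime change

stat_same, p_same = ks_2samp(backtest, live_same)     # high p: consistent
stat_shift, p_shift = ks_2samp(backtest, live_shift)  # low p: inconsistent
```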

It's a fantastic research notebook!
I wonder whether the notebook could analyze a Zipline algorithm backtest result.
Any suggestion for how to do it on the Zipline platform?

I cloned the notebook and modified the single-stock tearsheet to use 'EEM'. Why did I get the errors below?

You can also easily run a tearsheet on a single stock's or ETF's daily returns timeseries
In [15]:

stock = securities_panel.loc['price']['EEM'].dropna()
stock_rets = stock.pct_change().dropna()
In [16]:

analyze_single_algo( df_rets=stock_rets, algo_live_date='2014-1-1', cone_std=1.0 )
Entire data start date: 2003-04-14 00:00:00+00:00
Entire data end date: 2015-06-11 00:00:00+00:00
Out-of-Sample Months: 17
Backtest Months: 128
Backtest Out_of_Sample All_History
max_drawdown -0.67 -0.18 -0.67
calmar_ratio 0.26 0.07 0.23
annual_return 0.18 -0.01 0.15
stability 0.57 0.00 0.53
sharpe_ratio 0.54 -0.08 0.50
annual_volatility 0.33 0.16 0.31
alpha 0.05 -0.11 0.02
beta 1.46 0.95 1.44

82% :Similarity between Backtest vs. Out-of-Sample (daily returns distribution)

TypeError                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 analyze_single_algo( df_rets=stock_rets, algo_live_date='2014-1-1', cone_std=1.0 )

<ipython-input> in analyze_single_algo(df_rets, algo_live_date, cone_std)
    138     plt.title("Daily Returns Similarity = " + str(consistency_pct) + "%" )
    139
--> 140     dd_table = gen_drawdown_table(df_rets, top=10)
    141
    142     dd_table['peak date'] = map( extract_date, dd_table['peak date'])

<ipython-input> in gen_drawdown_table(df_rets, top)
     48 def gen_drawdown_table(df_rets, top=10):
     49     df_cum = cum_returns(df_rets, 1.0)
---> 50     drawdown_periods = get_top_draw_downs(df_rets, top=top)
     51     df_drawdowns = pd.DataFrame(index=range(top), columns=['net drawdown in %',
     52                                 'peak date',

<ipython-input> in get_top_draw_downs(df_rets, top)
     11     #if not np.isnan(recovery):
     12     underwater = pd.concat(
---> 13         [underwater.loc[:peak].iloc[:-1], underwater.loc[recovery:].iloc[1:]])
     14     #else:
     15     #    drawdown has not ended yet

[... pandas slicing internals elided (pandas/core/indexing.py, pandas/core/index.py, pandas/tseries/index.py) ...]

TypeError: cannot do slice indexing with these indexers [nan] of <type 'float'>

Hi Novice TAI,
Looks like I accidentally left an old function definition for get_top_draw_downs() at the bottom of the notebook when I was testing. It's in the cell directly above the "APPENDIX." The correct version of this get_top_draw_downs() function is at the beginning of the notebook, which is why the notebook probably worked up until that point. Can you try deleting that single cell with this function definition at the bottom, then "Run All..." the notebook to see if this fixes it?

Hi Justin,

Yes, following your instructions, it works now. Thanks for your kind help.

Hi Justin,

The notebook contains all of the functions to compute all performance
statistics, so it is very self contained, and thus also means a
tearsheet can be computed for any timeseries you pass to it. So, you
can upload a CSV of the daily returns of your favorite mutual fund and
see how it looks, or simply pass in the timeseries of a few stocks. An
example of how to accomplish this using a single stock is also
included at the end of this notebook.

I wonder whether I could upload a CSV of a Zipline backtest result to the notebook to see how it looks?

Hi Novice TAI,
It's very easy to see the Quantopian/Zipline backtest performance statistics simply by scrolling up this thread to where I attached the backtest (it's available directly after the original post). Just click on the "Risk Metrics" tab of the backtest and you can view the Zipline results.

Justin - I played with it, and this is just amazing work you did that helps a lot. If you can soon improve it so that analyze_single_algo can also accept parameters to feed in our backtests, this could be a real game changer for our algo optimization.

Hi Justin ,

sorry for my poor English.
What I mean is: is there a way to upload a strategy backtest result from the https://github.com/quantopian/zipline project's output into the "Tearsheet Analysis" notebook?

Hi Novice TAI,

If you're using a zipline object in the Quantopian Research environment, the TradingAlgorithm object has a value in it called "daily_stats" as well as a "returns" attribute. This "returns" field is just a Pandas Series that you can use to pass to the analyze_single_algo() function that makes the tearsheet in the same way that you ran the example tearsheet previously using the stock EEM. In this case you go to the bottom of the tearsheet notebook that I shared where I show how to pass in the daily returns of a single stock -- I save the daily returns of the stock to the variable 'stock_rets.' If you want to use your Zipline algo's daily returns you could just change that to:
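The snippet that followed isn't preserved in this copy of the thread, but based on the description above it would presumably be along these lines, with `perf` standing in for the performance DataFrame that zipline's TradingAlgorithm.run() returns (the frame built below is synthetic, not real zipline output):

```python
import numpy as np
import pandas as pd

# stand-in for the DataFrame returned by TradingAlgorithm.run(),
# which carries a daily 'returns' column
perf = pd.DataFrame(
    {'returns': np.random.default_rng(8).normal(0.0005, 0.01, 100)},
    index=pd.bdate_range('2014-01-02', periods=100))

# this Series is what you'd pass to analyze_single_algo(df_rets=...)
stock_rets = perf['returns']
```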

hi Justin,

Please see my suggestion on the consistency calculation. I posted in another thread, but perhaps this is a better place. I will continue here:

I understand nothing is perfect and there will be trial and error.
Using daily return distribution is a start. However
- Practically, consistency should also be about the magnitude of the total return. The 0.957778084 consistency score for 4.5% vs 100.8% is a good example of why magnitude matters. Maybe the distributions match, but if the magnitudes are way off, it is not a consistent algo.
- Why daily? Why not hourly or weekly? What if the algo only rebalances weekly or monthly? In those cases, daily returns might not reflect the fundamental characteristics of the algo.

Plus, with the limited number of data points for the out-of-sample period, which is the situation that we face now, perhaps a different approach would be more appropriate than comparing return distributions.

How about ranking the percentage difference between backtest vs. out-of-sample metrics (return, drawdown, Sharpe, ...) and averaging out the rankings, just like the overall algo scoring process? Maybe give different weights to metrics that are better/worse out of sample? I think this would work better with a smaller set of data.

How do I find my backtest ID?

When you're on the backtest results page, the ID is the last part of the URL (everything after the last slash -- 24 hexadecimal digits).

Thank you, MVK!

I was just genuinely wondering why you decided to use Python? I can tell the language is Python (if I'm wrong, correct me). Or was it just your preferred language? Thanks.

Madona: There are many reasons for why we chose Python. I won't list them all but here are a few key ones:

• it is completely open and free (unlike e.g. matlab);
• it is a scripting language (unlike C# or Java);
• (related to the above point) it is relatively easy to learn (unlike C++);
• and, there is a large and healthy ecosystem of support libraries for numerical computing and data analysis.

Thomas: thanks for the answer, but what if I wanted to do it in C or C++ because I understand those languages? Am I limited, or should I just learn Python? I want to get into this, and I enjoy C++, C, and Java, but I don't know Python. I know this might not be the best place to ask, but I'd appreciate your feedback. So, is Python that fast? Thanks, Thomas.

I don't think you will regret learning Python, and it's very easy to get going. There are many good tutorials out there. Here's one: http://learnpythonthehardway.org/

Wow, I'm really glad I tried this; it took less than 10 minutes start to finish, which was great. I really wish there were a better way to track dividend distributions. I've got one distribution in 2012 that is especially difficult to understand.

I'm wondering about the compounded annual returns, shouldn't

return pow((1 + ts.mean()), 252) - 1

actually be

return pow((ts.iloc[-1]/ts.iloc[0]), (252 / len(ts))) - 1

?

The variable 'ts' contains daily percent returns, not portfolio values. This is why the compounding formula implemented is: return pow((1 + ts.mean()), 252) - 1

Okay I see, I just don't understand why I get different results between the two methods. Probably your calculation is right but this is driving me nuts and I'd be happy to understand what's going on :/

symbol_list = ['SPY']

securities_panel = get_pricing(symbol_list, fields=['price']
, start_date='2000-01-01', end_date='2015-06-11')
securities_panel.minor_axis = map(lambda x: x.symbol, securities_panel.minor_axis)

stock = securities_panel.loc['price']['SPY'].dropna()
stock_rets = stock.pct_change().dropna()

v1 = (pow((1 + stock_rets.mean()), 252) - 1)

v2 = pow((stock.iloc[-1]/stock.iloc[0]), (252 / len(stock))) - 1

assert v1 == v2, '{} != {}'.format(v1,v2)


I get AssertionError: 0.0664323133186 != 0.0460561004688 which is a big difference

Marco,

The difference seems to be the frequency over which the compounding is occurring. v1 compounds returns daily (e.g. 252 periods per year), whereas v2 compounds what appears to be annually; e.g. if len(stock) = 252, then the base of the expression (i.e. the % change between day 1 and day 252) is only raised to the power of 1.

Both approaches are mathematically sound; just one incorporates daily compounded growth, which can add up significantly over time when an asset is rising.
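A small synthetic demonstration of why the two formulas in this exchange disagree: compounding the arithmetic mean daily return ignores volatility drag, so it always exceeds the geometric (endpoint-based) figure whenever returns fluctuate. This is an illustration of the general effect, not a claim about which number the notebook should report.

```python
import numpy as np

rng = np.random.default_rng(9)
daily = rng.normal(0.0005, 0.01, 252 * 5)   # five years of daily returns
years = len(daily) / 252
total_growth = np.prod(1 + daily)           # ending value / beginning value

v1 = (1 + daily.mean()) ** 252 - 1          # compounds the arithmetic mean
v2 = total_growth ** (1 / years) - 1        # geometric, CAGR-style

# v1 > v2 whenever the daily returns are not all equal (Jensen / AM-GM)
```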

Hi Justin,

thanks a lot!

Okay, but if v2 calculates the compounded returns, shouldn't something like the following reproduce the final price?

price = stock[0]

for i in list(stock_rets):
    price *= (1 + (v1 / 252))
print(price)

Unfortunately I think this is an example of where there are many different mathematical approaches to accomplish a financial calculation. You can calculate returns either arithmetically, geometrically (compounded), or in log-returns space. If taking the geometric approach then you have to choose a compounding frequency, or continuous frequency (e.g. future value = present value * e^rt). Each of the approaches will yield a different result when computing something like annual return. The important thing is to just choose one approach and use it consistently if your intention is to compare 2 different trading strategies, 2 stocks, etc. I've chosen 1 method. If you feel more comfortable with a different approach please feel free to swap it in as the body of the function in the tearsheet.

Ok, thanks Justin, that helps!

one little improvement I'd like to suggest:

If you use fmt=".1f" in the heat map settings you get one decimal shown in the monthly results heat map. Otherwise it gets rounded to full integers (sometimes) which can be misleading.

The complete line would be:

sns.heatmap(monthly_ret_table.fillna(0), annot=True, fmt=".1f",annot_kws={"size": 12}, alpha=1.0, center=0.0, cbar=False, cmap='RdYlGn')


Thanks Marco for that formatting tip in sns.heatmap(). Seems much cleaner for sure. We definitely plan on continuing to release new/updated/increased-functionality tearsheet notebooks in the future, so if you continue to have ideas/suggestions feel free to keep posting them in this thread.

Justin,

The performance metric annualized return should not be calculated in different ways depending on the mathematical approach to compounding.

Yahoo Finance defines

Annualized Return=(Period Ending Price/Period Beginning Price)^(1/t) - 1

http://ycharts.com/glossary/terms/annualized_returns

In financial industry it is known as Compound annual return.

Compound annual return = (Ending Value / Beginning Value)^(1 / n) - 1

http://financial-dictionary.thefreedictionary.com/CAGR

This is the only way we should calculate Annualized Returns.
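The quoted formula, written as a small helper (with `years` as the number of years spanned; the numbers below are just an illustration):

```python
def cagr(begin_value, end_value, years):
    """Compound annual growth rate: (end / begin) ** (1 / years) - 1."""
    return (end_value / begin_value) ** (1.0 / years) - 1

# an asset doubling over ten years compounds at roughly 7.2% per year
growth = cagr(100.0, 200.0, 10)
```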

Vladimir: That's what I thought too, and it's the formula I'm using now. This way it's comparable to results outside of Quantopian, and not only this but also the other metrics that depend on it, like the Sharpe ratio.

The work that Justin originally shared here has been launched as an open source project, Pyfolio. It has also been incorporated into the research environment for easy backtest analysis.

Take a look at the attached notebook to get the details.


Hello,

Has anyone uploaded a csv to analyze mutual fund returns? I'm trying to figure out how to do this--via local_csv, I assume.

Thanks,
Jamie

@Marco: Thank you for your work on the CAGR and the Heat Maps in Notebook. Solid.

I notice that they still have not incorporated CAGR or decimals in Heat Maps in the Notebook Stats. If you have a version that does this would you mind sharing it? This generosity would be greatly appreciated...!

Also, I notice the Sharpe ratio produced in Notebooks differs quite a bit from the one in the Backtest Result Summary. Do you know why this is the case? Please advise.

@Vladimir: You are correct to point out that all investment professionals use CAGR in their overview analysis of competing investments (a convention whose utility is perhaps simply its ubiquity) and the use of this stat in the Notebook and the Backtest Result Summary should not be considered optional but an important feature.

Also, we should lobby for inclusion of Bayesian T-Sharpes as well...! (video: http://blog.quantopian.com/probabilistic-programming-for-non-statisticians/)

Cheers gentlemen!

Hi John,

I'm using my own backtesting environment that I've built for myself, since I'm mostly trading futures. So I've just taken some parts of pyfolio that I need; therefore I can't give you complete code, but you should be able to figure it out on your own, it's fairly easy. If not, let me know.

for the heat map, it's fairly simple, just add
fmt=".1f"
to the sns.heatmap() Parameters, my version looks like this:

sns.heatmap(monthly.fillna(0), annot=True, fmt=".1f",annot_kws={"size": 12}, alpha=1.0, center=0.0, cbar=False, cmap='RdYlGn')



Regarding CAGR, this works for me (and in the end you just need to stick with something to be able to compare your models). I measure everything relative to volatility instead of using USD only anyway, but here's the USD version:

# Calc annualized % results (CAGR)
annualized_result_pct_usd = ((end_net_balance_usd / start_net_balance_usd) ** (1 / years) - 1) * 100



To figure out the years I use

years = (equity.index[-1] - equity.index[0]).total_seconds() / 31557600  # 31,557,600 s = 365.25 days



So you'll need to figure out what these variables are called in the world of pyfolio. My guess is that there's an equity series somewhere; you'd simply take its first value (be careful here -- the first value may already be affected by an open trade) and its last. I personally keep two such series: a gross balance that only considers closed trades, without margin or open positions, and a net balance that includes everything.

I don't know about the Sharpe discrepancy; I'm using the "simple" version of it:


import numpy as np

annual_volatility_pct_usd = net_balance_usd.pct_change().dropna().std() * np.sqrt(252) * 100
sharpe_pct_usd = annualized_result_pct_usd / annual_volatility_pct_usd

which does the job for me.
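Putting the three snippets above together, a minimal self-contained sketch -- the function name and variable names are illustrative, not pyfolio's, and it assumes a date-indexed net-balance series:

```python
import numpy as np
import pandas as pd

SECONDS_PER_YEAR = 31557600  # 365.25 days

def simple_stats(net_balance, periods_per_year=252):
    """Return (CAGR %, annualized volatility %, 'simple' Sharpe = CAGR / vol)."""
    # Elapsed years between the first and last timestamps of the series.
    years = (net_balance.index[-1] - net_balance.index[0]).total_seconds() / SECONDS_PER_YEAR
    # Geometric annualized growth of the balance, in percent.
    cagr_pct = ((net_balance.iloc[-1] / net_balance.iloc[0]) ** (1.0 / years) - 1) * 100
    # Annualized standard deviation of period returns, in percent.
    vol_pct = net_balance.pct_change().dropna().std() * np.sqrt(periods_per_year) * 100
    return cagr_pct, vol_pct, cagr_pct / vol_pct
```

Note that this "simple" Sharpe divides total-period CAGR by annualized volatility rather than using the mean-excess-daily-return convention most backtesters use, so it won't match the backtester's number exactly -- which may account for part of the variance noted above.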

I also have a RAR% version of it, google around and you'll find out what it is.

Hope this helps you somehow!

Marco


@John Jay,
I know we exchanged emails earlier today, but I just noticed you posted your questions here in this thread as well. I wanted to make sure you are using the most up-to-date version of pyfolio, which is now fully integrated into our research environment, so you don't have to reuse the notebook I shared as the original message in this thread.

To this reply I've attached a notebook showing the simple way to generate a pyfolio tear sheet (always using the most recent stable release of pyfolio) with just two lines of code. Hope this helps! I think it addresses some of your questions, as well as some of the color/formatting items you mentioned (e.g. heatmap decimal places).

Thanks,
Justin


Justin,
this is an awesome tool - even better now.

I get several errors running this now that I didn't use to; I just tried the latest version above but still get them.
Paul

/usr/local/lib/python2.7/dist-packages/pyfolio/utils.py:157: UserWarning: Could not update cache /usr/local/lib/python2.7/dist-packages/pyfolio/data/factors.csv. Exception: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/pyfolio/data/factors.csv'
/usr/local/lib/python2.7/dist-packages/matplotlib/cbook.py:133: MatplotlibDeprecationWarning: The "loc" positional argument to legend is deprecated. Please use the "loc" keyword instead.
/usr/local/lib/python2.7/dist-packages/pyfolio/tears.py:459: UserWarning: Unable to generate turnover plot.

Justin,

Are there any updates on CAGR as a main feature in pyfolio and the backtester?

PaulB: Those are harmless warnings.

@Vladimir, yes we'll move to making CAGR the default. It's already in pyfolio master as an option but I agree that we should move this to default behaviour that is propagated to all the plots.

Are there any updates on CAGR as a main feature in pyfolio and the backtester?

It was added here: https://github.com/quantopian/empyrical/pull/19 and will be on Quantopian when we next ship updates.

When will the next ship be?

I think it will be wrapped up in the upgrade to pandas 18: https://www.quantopian.com/posts/soon-upgrade-to-pandas-0-dot-18


I can't get a tear sheet for my algo:

bt = get_backtest('58526ebeb57b354818da6b17')

# Create all tear sheets
bt.create_full_tear_sheet()
11% ETA:  0:15:34|######                                                     |

It loads only part of the backtest and hangs. I waited for hours and nothing. Maybe someone knows a workaround? The backtest runs from 2010-01-01 to 2016-11-30 -- it's long, but it should still load, right?

Volodymyr: Should definitely load much faster. Can you rerun the backtest and see if that helps?

Tried it; same result. Research memory goes to around 80%, then the kernel reloads and the process stops. I'll try it on a different PC, though I don't expect any change, since the one I'm running it on now is quite powerful.