Back to Community
Pyfolio tear sheet with my own return

Hi everyone,

I was wondering how I can use my own list of stock's return with pyfolio tear sheet.
I tried to do somethings but I think I am missing something.

import pandas as pd  
import pyfolio as pf

# load dataset  
prices = pd.read_excel("SPY_pyfolio.xlsx", skiprows=1)

# add returns  
stock_rets  = prices.pct_change()  
stock_rets = prices.dropna()

pf.create_returns_tear_sheet(stock_rets, live_start_date='2016-1-12')  

Thank you in advance.
Chris

12 responses

Hi Chris,

Are you trying this from quantopian's notebook? I don't think they allow you to read local files. My guess is that you could use fetcher or better still use local installation of pyfolio as explained here

Hi Aqua,

Thank you for your response.
I am trying to use local data to create a tear sheet with pyfolio on jupyter notebook. I have install pyfolio and all the dependencies but I don't know if it is even possible.
Any idea ?

Thank you.

Hi Chris,

Its definitely possible to run pyfolio with algo returns in an excel spreadsheet! As Aqua Rooster said, the easiest thing to do would be to run a local installation of pyfolio (see Installation and Usage in the pyfolio readme) and to run create_full_tear_sheet with the returns you read in from your excel file.

The code snippet you posted looks correct! Just a word of caution though: in your code, make sure that stock_rets is a pandas DataFrame formatted as described in the pyfolio docstrings.

Hope that helps!

Thank you George.

But I am still getting this error:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='ichart.finance.yahoo.com', port=80): Max retries exceeded with url: /table.csv?s=SPY&a=0&b=1&c=1970&d=7&e=8&f=2017&g=d&ignore=.csv (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11001] getaddrinfo failed',))

Any idea ?

Does anyone tried pyfolio with is own return ?

Thanks

Try this:

create_full_tear_sheet(my_returns, benchmark_rets=my_bmark)  

Thank you Aqua.

Adding the benchmark resolved the issue with HTTPConnection but I am still getting some error.
Here is my code:

import pandas as pd  
import pyfolio as pf  
from datetime import datetime

# load dataset  
my_returns = pd.read_excel("NVDA_pyfolio.xlsx")  
bmark = pd.read_excel("SPY_pyfolio.xlsx")

# create tear sheet  
pf.create_full_tear_sheet(my_returns, benchmark_rets=bmark)  

My input data are in this format:

        Date    Return  
2016-01-04 -0.013979  
2016-01-05  0.001691  
2016-01-06 -0.012614  
2016-01-07 -0.023992  
2016-01-08 -0.010977  

But now I get this error:

print("Entire data start date: %s" % returns.index[0].strftime('%Y-%m-%d'))  
AttributeError: 'int' object has no attribute 'strftime'  

After some research, it appears that it's a string to date conversion issue.
So I tried to change the date format with different solution like the one below:

my_returns['Date'] = pd.to_datetime(my_returns.Date)  
bmark['Date'] = pd.to_datetime(bmark.Date)  

But nothing seems to work...

@Chris,

I don't know how pandas is reading your excel. Can you try to print the type of index after you read from excel?

Best regards,
Aqua

I was getting same error. with following code:

pd.read_csv('../pyfolio.test.csv')  

Error:

print("Entire data start date: %s" % returns.index[0].strftime('%Y-%m-%d'))  
AttributeError: 'numpy.int64' object has no attribute 'strftime'  

On debugging tears.py, I found the reason was that returns.index contains array indices instead of date (which is first column of my csv), when I changed it to

pd.read_csv('../pyfolio.test.csv', index_col=0)  

I got error:

print("Entire data start date: %s" % returns.index[0].strftime('%Y-%m-%d'))  
AttributeError: 'str' object has no attribute 'strftime'  

Finally I decided to comment out that print statement in tears.py and then I got this:

ValueError: Shape of passed values is (1, 13), indices imply (1, 12)  

My csv has 14 rows with no header. Any clue what is wrong now?

It looks like your CSV file's date column is in a format pandas isn't recognizing well — I think that's an issue with pandas, not pyfolio.

I would make sure the date column is in the correct format, and probably also add headers to your CSV. For example, it should look like this:

date,returns  
2017-09-05, 0.01  
2017-09-06, -0.005  
2017-09-07, -0.007  
...

You would then read that into pandas as a Series with returns = pd.read_csv('path_to_file').returns.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Hi all, I am new to the Quantopian environment and programming in general..I stumbled upon this thread while trying to do something similar...

I created a dummy pandas series with dummy dates as the index:

rng = pd.date_range('1/1/2011', periods=72, freq='H')  
ds = pd.Series(np.random.randn(len(rng)),index=rng)

pf.create_full_tear_sheet(ds)  

And this was the traceback I got:
Traceback (most recent call last):
File "C:\Users\Admin\Desktop\vfsdvf - Copy.py", line 21, in
pf.create_full_tear_sheet(ds)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python35\lib\site-packages\pyfolio\tears.py", line 181, in create_full_tear_sheet
set_context=set_context)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python35\lib\site-packages\pyfolio\plotting.py", line 52, in call_w_context
return func(*args, **kwargs)
File "C:\Users\Admin\AppData\Local\Programs\Python\Python35\lib\site-packages\pyfolio\tears.py", line 448, in create_returns_tear_sheet
print("Entire data start date: %s" % returns.index[0].strftime('%Y-%m-%d'))
File "C:\Users\Admin\AppData\Roaming\Python\Python35\site-packages\pandas\core\indexes\datetimelike.py", line 271, in getitem
val = getitem(key)
File "C:\Users\Admin\AppData\Roaming\Python\Python35\site-packages\pandas\core\indexes\datetimelike.py", line 271, in getitem
val = getitem(key)
IndexError: index 0 is out of bounds for axis 0 with size 0

For more additional information:
ds.shape returns (72,)

ds.values returns:
array([ 0.08757005, -1.64412178, 1.15709941, -1.40537281, -0.80301942,
-0.29192278, 0.79053923, 0.55517532, 1.31618401, 0.03126079,
1.78018401, 0.09569176, 1.69468017, -0.15662507, 1.485865 ,
1.569107 , -1.09783896, 0.69884955, -0.20707769, 1.66601113,
-0.81318518, 1.15245775, 0.71799803, -1.48395817, 0.01577633,
0.1703878 , 0.37311332, -1.34285804, 0.46429777, 0.46173365,
0.23048052, 0.92245259, 1.21407195, 0.60010088, -0.53904102,
0.45358215, -0.62620881, -0.18269723, 0.23558362, -0.54613381,
1.09732697, 1.16196203, 1.19318663, -0.76692525, 0.09429309,
2.69763578, 1.4291762 , -0.64370414, -1.2974987 , 0.6832096 ,
2.59073877, 0.62590452, -0.88126909, -0.35983966, -0.92860179,
1.91045484, 0.35842119, -0.64363602, -0.74889649, 0.18456287,
0.5578629 , -0.01115962, -1.59089233, 0.87541127, 0.2891369 ,
-0.97425595, -0.01576838, -0.95861673, 0.42297133, 0.55638081,
0.52022179, 2.21036512])

ds.index returns:

DatetimeIndex(['2011-01-01 00:00:00', '2011-01-01 01:00:00',
'2011-01-01 02:00:00', '2011-01-01 03:00:00',
'2011-01-01 04:00:00', '2011-01-01 05:00:00',
'2011-01-01 06:00:00', '2011-01-01 07:00:00',
'2011-01-01 08:00:00', '2011-01-01 09:00:00',
'2011-01-01 10:00:00', '2011-01-01 11:00:00',
'2011-01-01 12:00:00', '2011-01-01 13:00:00',
'2011-01-01 14:00:00', '2011-01-01 15:00:00',
'2011-01-01 16:00:00', '2011-01-01 17:00:00',
'2011-01-01 18:00:00', '2011-01-01 19:00:00',
'2011-01-01 20:00:00', '2011-01-01 21:00:00',
'2011-01-01 22:00:00', '2011-01-01 23:00:00',
'2011-01-02 00:00:00', '2011-01-02 01:00:00',
'2011-01-02 02:00:00', '2011-01-02 03:00:00',
'2011-01-02 04:00:00', '2011-01-02 05:00:00',
'2011-01-02 06:00:00', '2011-01-02 07:00:00',
'2011-01-02 08:00:00', '2011-01-02 09:00:00',
'2011-01-02 10:00:00', '2011-01-02 11:00:00',
'2011-01-02 12:00:00', '2011-01-02 13:00:00',
'2011-01-02 14:00:00', '2011-01-02 15:00:00',
'2011-01-02 16:00:00', '2011-01-02 17:00:00',
'2011-01-02 18:00:00', '2011-01-02 19:00:00',
'2011-01-02 20:00:00', '2011-01-02 21:00:00',
'2011-01-02 22:00:00', '2011-01-02 23:00:00',
'2011-01-03 00:00:00', '2011-01-03 01:00:00',
'2011-01-03 02:00:00', '2011-01-03 03:00:00',
'2011-01-03 04:00:00', '2011-01-03 05:00:00',
'2011-01-03 06:00:00', '2011-01-03 07:00:00',
'2011-01-03 08:00:00', '2011-01-03 09:00:00',
'2011-01-03 10:00:00', '2011-01-03 11:00:00',
'2011-01-03 12:00:00', '2011-01-03 13:00:00',
'2011-01-03 14:00:00', '2011-01-03 15:00:00',
'2011-01-03 16:00:00', '2011-01-03 17:00:00',
'2011-01-03 18:00:00', '2011-01-03 19:00:00',
'2011-01-03 20:00:00', '2011-01-03 21:00:00',
'2011-01-03 22:00:00', '2011-01-03 23:00:00'],
dtype='datetime64[ns]', freq='H')
```

The series above behaves in more or less the same way as if I use
pf.utils.get_symbol_rets('SPY') which generates the tear sheet. So I'm not really sure what I am doing different here..will appreciate any inputs and discussion! =)

Sorry for wakening up an old thread here, but also struggling with the time format.
Doing create_simple_tear_sheet with my own return and index return series works just fine and it plots:

  1. Cumulative returns
  2. Rolling portfolio beta to Index
  3. Rolling Sharpe ratio (6-month)
  4. Underwater plot

It's mapping the dates fine to the timeline and does the X axis conversion from input Y-M-D to just Y-M.

EDIT: Seems to be working OK in unit = 'ms'...