Campbell, Hilscher, Szilagyi (CHS) Model - Probability of corporate failure

Implementation of the method described in the paper "In Search of Distress Risk" by John Campbell, Jens Hilscher, and Jan Szilagyi, in which they comprehensively explore the determinants of corporate failure.

Link to the updated Paper: http://scholar.harvard.edu/campbell/publications/search-distress-risk

(Notebook attached; preview currently unavailable.)

Hello everyone,

In the notebook above, I implemented the algorithm described in the paper "In Search of Distress Risk" by John Campbell, Jens Hilscher, and Jan Szilagyi.

As outlined in the book "Quantitative Value" and in their blog post
http://blog.alphaarchitect.com/2011/07/23/stop-using-altman-z-score/
this algorithm is considerably better at predicting bankruptcy than earlier models such as the Altman Z-score.
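For readers who want the gist before opening the notebook: the model is a logit over eight accounting and market inputs. Below is a minimal sketch; the coefficient values are the commonly cited 12-month-horizon estimates, so please verify them against Table IV of the paper before relying on them:

import numpy as np

# Minimal sketch of the CHS failure logit; coefficients are the commonly
# cited 12-month-horizon estimates (verify against the paper before use)
CHS_COEFFS = {'NIMTAAVG': -20.26, 'TLMTA': 1.42, 'EXRETAVG': -7.13,
              'SIGMA': 1.41, 'RSIZE': -0.045, 'CASHMTA': -2.13,
              'MB': 0.075, 'PRICE': -0.058}
CHS_INTERCEPT = -9.16

def chs_probability(inputs):
    # inputs: dict mapping each of the eight variable names to its value
    score = CHS_INTERCEPT + sum(CHS_COEFFS[k] * inputs[k] for k in CHS_COEFFS)
    return 1.0 / (1.0 + np.exp(-score))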

Maybe I did something wrong, because my implementation gives Apple a probability of bankruptcy equal to 100%.
Could you help me verify and improve my implementation?

The research notebook is attached above.

Comments and feedback are very welcome!
Costantino

Hi Costantino,
You probably have a scaling issue on one of the variables. You can check your summary stats and benchmark them against the stats in the paper... that way you'll know you are scaling them correctly. Best of luck, and awesome work.
Wes

Hi Wes!

Nice to meet you! I'm a reader and admirer of your blog! I got the idea to implement this algorithm while reading your excellent book "Quantitative Value"!

Thanks a lot for the useful suggestion. I computed the descriptive statistics for my variables:

          NIMTA     TLMTA      EXRET       RSIZE      SIGMA   CASHMTA        MB     PRICE  
mean  -0.006445  0.175122  -0.036809  -12.331102  57.132348  0.098740  1.784606  2.238480  
50%    0.003908  0.110371  -0.013899  -12.314694   5.262007  0.058600  1.475452  2.708050  

and compared them with those reported in Table II of the paper:

         NIMTA   TLMTA    EXRET     RSIZE   SIGMA  CASHMTA     MB  PRICE
Mean     0.000   0.445   -0.011   -10.456   0.562    0.084  2.041  2.019
Median   0.006   0.427   -0.009   -10.570   0.471    0.045  1.557  2.474

The problem was Sigma! I redefined the computations as follows:

# Daily simple returns from the close-price history
returns_daily = ((price_history - price_history.shift(1, axis=0)) / price_history.shift(1, axis=0)).iloc[1:]
n = len(returns_daily)
# SIGMA: annualized standard deviation of daily returns (252 trading days, sample variance)
sigma = returns_daily.sub(returns_daily.mean()).pow(2).sum().multiply(252.0/(n-1.0)).pow(0.5)
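Equivalently, since the sum of squared deviations divided by (n-1) is just the sample variance, the same annualized sigma can be written more compactly (a sketch, assuming returns_daily as computed above):

import numpy as np

# Sample standard deviation per column (ddof=1 matches the n-1 above), annualized
sigma = returns_daily.std(ddof=1) * np.sqrt(252.0)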

and the new summary statistics are now in line with the paper:

mean        0.405274  
std         0.286608  
min         0.000000  
50%         0.338579  
max         5.577249  

and... most importantly, AAPL's probability of failure is now only 0.000056 ;-)

Thanks a lot!

P.S.: as of July 2014, this is the list of bankruptcy candidates:
- COCO CORINTHIAN COLLEGES INC 99.99%
- CRMB 57TH STREET GENERAL ACQUISITION CORP 99.93%
- KIOR KIOR INC 99.91%
- SPEX SPHERIX INC 99.85%
- BODY BODY CENTRAL CORP 99.59%
- NIHD NII HOLDINGS INC 98.01%
- EDMC EDUCATION MANAGEMENT CORP 97.90%
- MELA MELA SCIENCES INC 92.36%
- UNTK UNITEK GLOBAL SERVICES INC 91.90%
- END ENDEAVOUR INTERNATIONAL CORP 84.23%
- RSH RADIOSHACK CORP 81.16%
- BAXS BAXANO SURGICAL INC 67.35%
- CACH CACHE INC 65.92%
- CERE CERES INC 59.12%
- ARO AEROPOSTALE INC 52.30%

I checked some of the companies above and have to say that this algorithm is impressive.
The list is based on data as of July 1st, 2014 (one year ago), and almost all of those companies later entered Chapter 11 protection!

  • COCO in May 2015
  • CRMB in July 2014
  • KIOR in Nov 2014
  • BODY in Jan 2015
  • NIHD in Sep 2014
  • EDMC in Jan 2015
  • UNTK in Oct 2014
  • END in Oct 2014
  • RSH in Feb 2015
  • BAXS in Nov 2014
  • CACH in Feb 2015

Wes, thanks again for letting me discover this "magic formula" through your book ;-)

No problem. Happy to share, and glad you like our blog.

Costantino,

Do you mind sharing with us the corrected notebook?

Is this notebook for Quantopian, or for another Python platform?

Does this also work in zipline, or do we still have to modify it? Thanks. :)

I don't think it works in zipline... there is no support for fundamentals data.

Morningstar has fundamentals for other markets, like emerging markets; however, only the US market seems to work. The query is restricted with # Only NYSE, AMEX and Nasdaq .filter(fundamentals.company_reference.primary_exchange_id.in_(['NYSE', 'NAS', 'AMEX'])). I'd like to try it out in markets other than these three. Cheers ;)
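For reference, here is roughly how that filter plugs into a research get_fundamentals query (a sketch from memory of the API; the market_cap field is just illustrative):

# Sketch: the exchange filter above inside a research get_fundamentals call
fund_df = get_fundamentals(
    query(
        fundamentals.valuation.market_cap,
    ).filter(
        fundamentals.company_reference.primary_exchange_id.in_(['NYSE', 'NAS', 'AMEX'])
    ),
    '2014-07-01',  # as-of date
)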

Hi John,

Our license with Morningstar is restricted to North American Equities. We give you all the companies we can.

We'll keep this in mind as uses for the data have expanded - I can definitely understand the desire for other geographic regions.

Thanks
Josh


The updated version doesn't work out of the box. Try to run it; maybe it's just me.

In [5]:

symbols = fundamental_data.minor_axis

price_history = get_pricing(symbols, fields='close_price', start_date=t1, end_date=t0)

price_history


NameError                                 Traceback (most recent call last)
<ipython-input-5-...> in <module>()
      1 symbols = fundamental_data.minor_axis
----> 2 price_history = get_pricing(symbols, fields='close_price', start_date=t1, end_date=t0)
      3 price_history

NameError: name 't1' is not defined

Looks like you're using Research. Be sure to execute the previous cells to populate values like t1 and t0.

I reviewed the notebook and made some minor changes.
Here are the current results:
lag = 1

Companies whose Probability of Financial Distress is > 50.0 % as of 2014-07-01 

KIOR  KIOR INC                                 99.88%  
CRMB  57TH STREET GENERAL ACQUISITION CORP     99.81%  
SPEX  SPHERIX INC                              99.76%  
COCO  CORINTHIAN COLLEGES INC                  99.62%  
BODY  BODY CENTRAL CORP                        97.57%  
EDMC  EDUCATION MANAGEMENT CORP                93.93%  
UNTK  UNITEK GLOBAL SERVICES INC               93.19%  
NIHD  NII HOLDINGS INC                         76.85%  
END   ENDEAVOUR INTERNATIONAL CORP             71.73%  
RSH   RADIOSHACK CORP                          69.86%  
MELA  MELA SCIENCES INC                        67.83%  
CACH  CACHE INC                                63.72%  
HDY   HYPERDYNAMICS CORP                       62.87%  
BAXS  BAXANO SURGICAL INC                      59.49%  

lag = 0

Companies whose Probability of Financial Distress is > 50.0 % as of 2015-07-01 

LOCM  LOCAL CORP                               100.00%  
DXM   DEX MEDIA INC                            99.48%  
LOOK  LOOKSMART LTD                            99.44%  
USEG  U S ENERGY CORP                          98.88%  
RLJE  RLJ ENTERTAINMENT INC                    96.75%  
VPCO  VAPOR CORP                               96.32%  
ASTI  ASCENT SOLAR TECHNOLOGIES INC            92.99%  
XGTI  XG TECHNOLOGY INC                        87.95%  
INPH  INTERPHASE CORP                          85.50%  
HERO  HERCULES OFFSHORE INC                    84.87%  
VRNG  VRINGO INC                               84.63%  
WRES  WARREN RESOURCES INC                     80.55%  
FREE  FREESEAS INC                             77.51%  
JOEZ  JOE'S JEANS INC                          67.25%  
VGGL  VIGGLE INC                               61.30%  
SPEX  SPHERIX INC                              54.33%  
NETE  NET ELEMENT INC                          53.09%  
ESCR  ESCALERA RESOURCES CO                    50.61%  

How can we substitute, say, a local CSV for this? I haven't seen a sample research notebook fetching from Quandl, and these stocks aren't in the Quantopian database, since the query is restricted to # Only NYSE, AMEX and Nasdaq .filter(fundamentals.company_reference.primary_exchange_id.in_(['NYSE', 'NAS', 'AMEX'])). Thanks ;)

Josh, are there any plans to add the OTCBB and OTCPK exchanges in the near future? Morningstar has OTCPK available on their site. Thanks.

John, OTC is not high on the list right now. Our focus for expanded market coverage is currently on futures.

Thanks
Josh

Altman's original five-ratio model was designed for manufacturers and other sectors with high capital intensity, such as mining.

The problem is that it uses the sales/total-assets ratio, which can skew the result in sectors that are not capital intensive. A low total-assets figure distorts this ratio, and with it the resulting Z-score, so the model can generate a number that misrepresents the firm's actual level of financial distress. Low-capital-intensity sectors include many service firms, where people, rather than physical assets, are the main source of added value.
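To see the distortion concretely, here is a toy sketch of Altman's original (1968) Z with made-up numbers; the two firms below are identical except for total assets, yet the sales/total-assets term swings the score dramatically:

# Altman (1968): Z = 1.2*WC/TA + 1.4*RE/TA + 3.3*EBIT/TA + 0.6*MVE/TL + 1.0*Sales/TA
def altman_z(wc, re, ebit, mve, sales, ta, tl):
    return (1.2 * wc / ta + 1.4 * re / ta + 3.3 * ebit / ta
            + 0.6 * mve / tl + 1.0 * sales / ta)

# Hypothetical firms, identical except for capital intensity (total assets)
print(altman_z(wc=10, re=20, ebit=15, mve=100, sales=200, ta=200, tl=80))  # ~2.20
print(altman_z(wc=10, re=20, ebit=15, mve=100, sales=200, ta=40, tl=80))   # ~7.99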

The paper was updated in January 2010 with new weights.

I find this comment a bit odd - it makes the numbers look as though they were obtained via data mining/curve fitting. Does anyone have any comments on this? Has anyone quantified how effective the score has been over the past few years?

Has anyone modeled this as a strategy yet? It would be really interesting to see results of shorting the most probable failures.

I've tried dedicated shorting strategies based on this sort of algorithm in the past. A good idea in theory, but not necessarily in practice. You get destroyed on rebate costs and operational risks. I think the idea works in expectation, but you need an iron will and the ability to fund the short when things go against you. Not "easy money"... but in the actual market there is no such thing...

I'm not trying to trade this as a sole strategy. I'm basically looking to offset my other, mainly long, strategies, so I would use this as a market-negative hedge that I would activate with some weight at certain points to reduce drawdown (or leave always on, depending on how the curve correlates with my other strategies). I think the curve would be quite a lot better than just shorting momentum, etc., but seeing the curve in combination with the S&P 500 would really help.

I mainly asked because the code is quite complicated, so I wouldn't have to rewrite everything if someone has already implemented this with Pipeline and wants to share the code.

@Wes Gray -- big fan of your work
@Mikko Mattila

In this article: https://dash.harvard.edu/bitstream/handle/1/9887619/JOIM_predicting_financial_11.pdf?sequence=2
Prof. Campbell even writes, "Although probably quite accurate, it may not be useful to predict a heart attack with a person clutching their hand to their chest".

Also, I removed the bugs from the program. Attached is the working one.

(Notebook attached; preview currently unavailable.)

On page 2911 of In Search of Distress Risk, the EXRETAVG factor has different weights than NIMTAAVG. Instead of the same 0.5333, 0.2666, 0.1333, etc., I think they would be 0.22, 0.11, and so on for 12 months (halving the weight each month). The paper shows:

NIMTAAVG = (1-φ^3) / (1 - φ^12)...

EXRETAVG = (1 - φ) / (1 - φ^12)...

where φ = 2^(-1/3)
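For context, the full definitions in the paper (as I read them) apply geometrically declining weights, quarterly for NIMTA and monthly for EXRET, each normalized so the weights sum to one:

NIMTAAVG = (1-φ^3)/(1-φ^12) · (NIMTA(t-1,t-3) + φ^3·NIMTA(t-4,t-6) + φ^6·NIMTA(t-7,t-9) + φ^9·NIMTA(t-10,t-12))

EXRETAVG = (1-φ)/(1-φ^12) · (EXRET(t-1) + φ·EXRET(t-2) + ... + φ^11·EXRET(t-12))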

Hello,

Quick question, if someone can help: how can I change the date manually in the algo to find, for example, the companies at risk of corporate failure as of September 2015? I guess I have to change this here:

In cell (2):

return datetime.date(y, m, 1)
lag = 0
today = datetime.datetime.now()
t0 = datetime.date(today.year - lag, today.month, 1)

Could you please tell me exactly what to change? Do I need to change other lines in the code, or is the date I want to choose defined only in cell (2)?

Thank you very much for your time
Elliott

@Elliott - if you want to change the date, you change t0 in the snippet you posted.

t0 = datetime.date(2015, 9, 1)

@Matt - you are correct, although I'm not sure how important it is. Both calculations give a weighted average over the course of the year. Using monthly data might capture companies that are melting down a little faster. If I get time tonight I'll try it and report back.

Hi @Xvzf @Tar, thank you very much for the answer.
After replacing t0, I tried to run the notebook but ran into some problems at cell (19): all securities show NaN at the end of the notebook.

I was wondering: at the beginning of the notebook there is:
today = datetime.datetime.now()
t0 = datetime.date(today.year - lag, today.month, 1)

Shouldn't I replace "today = datetime.datetime.now()" with "today = datetime.date(2015,9,1)" and t0 with something else?
This time I tried to run it with "today = datetime.date(2016,9,1)
t0 = datetime.date(today.year - lag, today.month, 1)"
but that doesn't seem to work either.

This is really a fantastic tool; if someone has a solution to this, I would be really grateful!

@Elliott the notebook doesn't take into account that some days don't have pricing data. If you set t0 to (2015, 9, 1), it attempts to get pricing data for 2015-09-01, 2015-06-01, 2015-03-01, 2014-12-01, and 2014-09-01. 2014-09-01 was Labor Day and the markets were closed, so all the prices for that day are NaN, which affects all the downstream calculations.

There are a couple of places in the code that need to be changed to account for this.

Okay, I worked through the date problems. You should be able to just set t0 to the end date you want using this one.

(Notebook attached; preview currently unavailable.)

Wow, huge thanks @Xvzf @Tar (I'm not sure which of the two to use, haha). You have no idea how much you're helping me here, really amazing contribution, thanks!

I'm learning Python, but since I have to hand in my thesis in less than two months, I wanted to ask you @Tar @Xvzf (or anyone who could help :) ): do you think it would be possible to include this notebook in a backtest? That is, take the symbols from the result (for example, for September 1, 2015: SPEX, PRSN, XGTI... only the symbols whose probability of failure is > 80%), create a backtest that opens a short-sell order for all of these securities, and cover the position, say, one year later or once the price has dropped more than 70% (to be sure we can cover the position).
So, really, to create a backtest that verifies whether the model is reliable and whether profits can be made over the long term (even if, in the real world, staying short on a security for several months would incur huge commission costs).

Hey @Elliott (you can call me Tar, btw),

There are a couple of problems. The first is a technical problem with Quantopian: Q is still really slow with fundamental data, and this notebook is really fundamentals-heavy. I have a research notebook with a Pipeline version of this, and it's quite a bit slower; I suspect it would just time out in the backtester. I originally wrote it to be pipelined, then rewrote the section that calls run_pipeline to call get_fundamentals instead.

The second problem is that the couple of people who've written about this paper, including, I believe, one of the authors, have said that it isn't really as good for trading as it is for avoiding mistakes when buying. I think Wes Gray is one of the people who wrote that. He wrote most of a chapter of his book Quantitative Value about this paper. You might contact him and ask what resources he has. He's a very approachable guy. He helped Costantino with the original version of this notebook.

@Matt Jensen

On page 2911 of In Search of Distress Risk, the EXRETAVG factor has different weights than NIMTAAVG. Instead of the same 0.5333, 0.2666, 0.1333, etc., I think they would be 0.22, 0.11, and so on for 12 months (halving the weight each month). The paper shows:

NIMTAAVG = (1-φ^3) / (1 - φ^12)...

EXRETAVG = (1 - φ) / (1 - φ^12)...

where φ = 2^(-1/3)

I'm better at programming than math, so I can't figure this out. 0.22 + 0.11 + ... over 12 months adds up to a number that's less than 0.5, and for this to work it needs to add up to 1. If I'm going to implement this for EXRETAVG, I need someone else to figure out the multipliers for me.

@Tar
Thank you very much for your answer. I see the issue here; I will e-mail Wes Gray tomorrow and ask for his opinion. I'm also trying to change the notebook so you can enter the ticker you want and the algo gives you the firm's probability of bankruptcy at the end. But I'm not sure what to change, whether it only takes a few changes, or whether it's too complicated to do. Does anyone have ideas about that?
It would be really terrific to know whether a company you are willing to buy shows financial distress, and to know its exact probability of failure, thanks to this notebook.

Also, a quick question: in the last cell of the notebook I changed >0.50 to <0.50, i.e. "distressed_companies = pfd[pfd < 0.50].order(ascending=False)", which gives me a big list of companies (capped at $15) with probabilities of bankruptcy below 50%. So my question: do you think the companies that appear on the list with 0.01% probability, or some other very small percentage, show a sign of a very strong financial structure? Can the notebook also be used in the reverse way?

I didn't have time to look at the list of companies in detail, but I definitely will.

Can the notebook be used in the reverse way?

I think what the score tells you is whether or not the company is on the brink of disaster. Companies that score 0.00001 are not likely to go bankrupt within the next year, but that's all it really says about them. If you look at the largest components of the regression, the companies that score the worst have negative income and a stock price that just dropped through the bottom of the chart. Positive income says something about the strength of the company, but a quickly rising stock price may or may not say anything about the soundness of the company's finances.

Could you guys please show me how I can get the fundamental data for the CHS model (risk of financial distress) and test it on this list of stocks:
SPEX SPHERIX INC 100.00%
PRSN PERSEON CORP 100.00%
XGTI XG TECHNOLOGY INC 100.00%
ASTI ASCENT SOLAR TECHNOLOGIES INC 100.00%
DRYS DRYSHIPS INC 99.98%
ESCR ESCALERA RESOURCES CO 99.88%
VPCO VAPOR CORP 99.75%
INPH INTERPHASE CORP 99.70%
UNXL UNI-PIXEL INC 99.60%
DXM DEX MEDIA INC 99.33%
USEG U S ENERGY CORP 99.02%
AQXP AQUINOX PHARMACEUTICALS INC 98.83%
CNIT CHINA INFORMATION TECHNOLOGY I 98.70%
FNCX FUNCTION(X) INC 98.11%
LINC LINCOLN EDUCATIONAL SERVICES CORP 93.73%
WRES WARREN RESOURCES INC 91.21%
SWSH SWISHER HYGIENE INC 90.38%
LNCO LINN CO LLC 89.25%
XOMA XOMA CORP 88.76%
SFXE SFX ENTERTAINMENT INC 86.57%
NETE NET ELEMENT INC 83.78%
ZAZA ZAZA ENERGY CORP 83.24%
VTL VITAL THERAPIES INC 81.90%
ADAT AUTHENTIDATE HOLDING CORP 79.91%
RJET REPUBLIC AIRWAYS HOLDINGS INC 77.62%
AMCF ANDATEE CHINA MARINE FUEL SERVICES CORP 76.29%
LINE LINN ENERGY LLC 75.20%
INTX INTERSECTIONS INC 74.69%
ETRM ENTEROMEDICS INC 72.46%
PRKR PARKERVISION INC 62.73%
TRCH TORCHLIGHT ENERGY RESOURCES INC 61.84%
MPET MAGELLAN PETROLEUM CORP 55.14%
ESSX ESSEX RENTAL CORP 53.95%

Thank you so much!


I ran this notebook for a few specific dates by making the following alteration (for example):

today = datetime.datetime(2014,7,1,12,1,0)

and then running the rest of the notebook as-is.

This worked fine for a number of time periods, but not for any where I used January 1st.
For example, when I ran it using today = datetime.datetime(2014,1,1,12,1,0), I got the error below:

Could the OP or someone else investigate this? I apologize, but I know almost nothing about Python, so I can't fix this myself.
Thanks so much!

exret = (np.log(returns.add(1))).sub(np.log(returns_sp.add(1)).as_matrix())
exret


ValueError                                Traceback (most recent call last)
<ipython-input-...> in <module>()
----> 1 exret = (np.log(returns.add(1))).sub(np.log(returns_sp.add(1)).as_matrix())
      2 exret

/usr/local/lib/python2.7/dist-packages/pandas/core/ops.pyc in f(self, other, axis, level, fill_value)
   1079             # casted = self._constructor_sliced(other,
   1080             #                                    index=self.columns)
-> 1081             casted = pd.Series(other, index=self.columns)
   1082             return self._combine_series(casted, na_op, fill_value, axis,
   1083                                         level)

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
    227                                        raise_cast_failure=True)
    228
--> 229         data = SingleBlockManager(data, index, fastpath=True)
    230
    231         generic.NDFrame.__init__(self, data, fastpath=True)

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
   3815         if not isinstance(block, Block):
   3816             block = make_block(block, placement=slice(0, len(axis)), ndim=1,
-> 3817                                fastpath=True)
   3818
   3819         self.blocks = [block]

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
   2516                        placement=placement, dtype=dtype)
   2517
-> 2518     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2519
   2520 # TODO: flexible with index=None and/or items=None

/usr/local/lib/python2.7/dist-packages/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
     88             raise ValueError('Wrong number of items passed %d, placement '
     89                              'implies %d' % (len(self.values),
---> 90                                              len(self.mgr_locs)))
     91
     92     @property

ValueError: Wrong number of items passed 3, placement implies 4

I haven't worked with this notebook in a while, but if I recall, you have to pick a date when the market is open, or it throws an error.

Thanks for your note.

I have identified the following dates as year- and quarter-start dates when the markets are open (at least they are not New Year's Day, and the other dates are weekdays). But when I try to hard-code these, the program still fails. I tried to define them as datetimes, but that errored out, so then I tried to define them as dates, but I don't know how to define a date instead of a datetime in Python and get it to accept it. Could I ask you to look into why the program doesn't work when the following dates are used?

t0 = datetime.date(2014,1,2)
t1 = datetime.date(2013,10,1)
t2 = datetime.date(2013,7,1)
t3 = datetime.date(2013,4,1)
t4 = datetime.date(2013,1,2)

Does anyone have an algorithm with this as an exclusion filter in pipeline?

One of our Ph.D. students (Zhouwang Li, who knows A LOT more about Python and analysis of financial data than I do) suggested that if you change the first line of the 6th block of code to

prices_sp_all = get_pricing('SPY', fields='close_price',start_date=t4+datetime.timedelta(days=-5),end_date=t0+datetime.timedelta(days=5))

then this gives the program some flexibility to look for price data on days when the market is open, and the program will run. I did this, changed the date parameters at the top of the program back to the original but hard-coded t0 as Jan 1, 2014, and the notebook runs fine.
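Putting the pieces together, here is a sketch of that guard (get_pricing, t0, and t4 come from the notebook): fetch a padded window, then forward-fill onto a continuous calendar so quarter-start dates always have a price.

import datetime
import pandas as pd

# Pad the window by a few days so holidays at the endpoints still resolve
prices_sp_all = get_pricing('SPY', fields='close_price',
                            start_date=t4 + datetime.timedelta(days=-5),
                            end_date=t0 + datetime.timedelta(days=5))
# Forward-fill onto a continuous daily index: non-trading days inherit
# the last available close instead of coming back NaN
full_index = pd.date_range(prices_sp_all.index[0], prices_sp_all.index[-1])
prices_sp_all = prices_sp_all.reindex(full_index, method='ffill')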

Got an error:

KeyError                                  Traceback (most recent call last)
<ipython-input-...> in <module>()
      7     date_str = date_str + "-01"
      8     dates.append(pd.Timestamp(date_str))
----> 9 prices_sp = 10.0*prices_sp_all.loc[dates]
     10 print prices_sp
     11

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in __getitem__(self, key)
   1294             return self._getitem_tuple(key)
   1295         else:
-> 1296             return self._getitem_axis(key, axis=0)
   1297
   1298     def _getitem_axis(self, key, axis=0):

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in _getitem_axis(self, key, axis)
   1454                     raise ValueError('Cannot index with multidimensional key')
   1455
-> 1456                 return self._getitem_iterable(key, axis=axis)
   1457
   1458             # nested tuple slicing

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in _getitem_iterable(self, key, axis)
   1020     def _getitem_iterable(self, key, axis=0):
   1021         if self._should_validate_iterable(axis):
-> 1022             self._has_valid_type(key, axis)
   1023
   1024         labels = self.obj._get_axis(axis)

/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.pyc in _has_valid_type(self, key, axis)
   1377
   1378                 raise KeyError("None of [%s] are in the [%s]" %
-> 1379                                (key, self.obj._get_axis_name(axis)))
   1380
   1381         return True

KeyError: "None of [[Timestamp('2015-04-01 00:00:00'), Timestamp('2015-07-01 00:00:00'), Timestamp('2015-10-01 00:00:00'), Timestamp('2016-01-01 00:00:00'), Timestamp('2016-04-01 00:00:00')]] are in the [index]"

Anyone have a working version?

Anyone?

I'm just trying to run the notebook without any changes. I restarted it and ran the cells in order until I got to the cell with prices_sp_all.

prices_sp_all = get_pricing('SPY', fields='close_price', start_date=t4, end_date=t0)  
prices_sp_index = pd.date_range(prices_sp_all.index[0], prices_sp_all.index[-1])  
prices_sp_all = prices_sp_all.reindex(prices_sp_index, method='ffill')  
dates = []  
items = fundamental_data.items  
for date_str in items[-5:]:  
    date_str = date_str + "-01"  
    dates.append(pd.Timestamp(date_str))  
prices_sp = 10.0*prices_sp_all.loc[dates]  
print prices_sp

returns_sp = ((prices_sp - prices_sp.shift(1)) / prices_sp.shift(1))[1:]  
returns_sp  

Can anyone figure out what's wrong?

Looks like I just hit the same problem.

I am getting the error

KeyError: "None of [[Timestamp('2015-04-01 00:00:00'), Timestamp('2015-07-01 00:00:00'), Timestamp('2015-10-01 00:00:00'), Timestamp('2016-01-01 00:00:00'), Timestamp('2016-04-01 00:00:00')]] are in the [index]"

because there is a problem with timezones.
Removing the timezone information from the prices_sp_all index fixed the problem.
Add the line:
prices_sp_all.index.tz = None
before:
prices_sp = 10.0*prices_sp_all.loc[dates]
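An equivalent fix, if you prefer not to assign to .tz directly (assumes the index is tz-aware):

# Strip the timezone so the naive Timestamps in `dates` match the index
prices_sp_all.index = prices_sp_all.index.tz_localize(None)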

@Matt Jensen

On page 2911 of In Search of Distress Risk, the EXRETAVG factor has different weights than NIMTAAVG. Instead of the same 0.5333, 0.2666, 0.1333, etc., I think they would be 0.22, 0.11, and so on for 12 months (halving the weight each month). The paper shows:

NIMTAAVG = (1-φ^3) / (1 - φ^12)...

EXRETAVG = (1 - φ) / (1 - φ^12)...

where φ = 2^(-1/3)

It is not halved for EXRETAVG. Anyway, the weights are 0.220052772, 0.174656001, 0.13862456, 0.110026386, 0.087328001, 0.06931228, 0.055013193, 0.043664, 0.03465614, 0.027506597, 0.021832, and 0.01732807. These all add up to 1.
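A quick sketch that reproduces these weights from φ and checks that they sum to one:

# EXRETAVG weights: w_k = (1-φ)/(1-φ^12) * φ^(k-1), for k = 1..12
phi = 2.0 ** (-1.0 / 3.0)
weights = [(1 - phi) / (1 - phi ** 12) * phi ** k for k in range(12)]
print(weights)       # 0.220052..., 0.174656..., 0.138624..., ...
print(sum(weights))  # 1.0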

Best,

Cheema

Hi all,
Sorry for bringing an old thread up. I have not been on here in a while.

Is there an easy way to convert this notebook to Pipeline, or would I have to rewrite the whole thing?