Question on Lecture 15: Multiple Linear Regression

Hello,

When I run the following cell from Lecture 15, the output is nan. I could not figure out why, so I would appreciate it if anyone could explain what is happening and what needs to change.

# Load pricing data for two arbitrarily-chosen assets and SPY

start = '2014-01-01'
end = '2015-01-01'
asset1 = get_pricing('DTV', fields='price', start_date=start, end_date=end)
asset2 = get_pricing('FISV', fields='price', start_date=start, end_date=end)
benchmark = get_pricing('SPY', fields='price', start_date=start, end_date=end)

# First, run a linear regression on the two assets

print 'SLR beta of asset2:', slr.params[1]

1 response

Never a dull moment when it comes to stock symbols. The NaN output comes from the following call:

asset1 = get_pricing('DTV', fields='price', start_date=start, end_date=end)



It returns NaN for every price. The stock symbol 'DTV' is the culprit. This is one of those interesting (i.e., frustrating) examples of the challenges we quants face in backtesting when stock symbols change.
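To see why an all-NaN price series ends up as a nan beta, here is a minimal sketch using NumPy only. The data are synthetic stand-ins, and the slope helper is a hypothetical hand-rolled version of the single-variable regression (it assumes asset1 is regressed on asset2, as the print label suggests). It is not the lecture's actual code.

```python
import numpy as np

# Synthetic stand-ins: asset2 has real prices, asset1 is all NaN
# (like the series returned for 'DTV' in this thread).
rng = np.random.RandomState(0)
asset2 = 100 + np.cumsum(rng.normal(size=250))
asset1 = np.full(250, np.nan)

def slope(y, x):
    """OLS slope of y on x: cov(x, y) / var(x). Hypothetical helper."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return (x_c * y_c).sum() / (x_c ** 2).sum()

# Every arithmetic step that touches a NaN yields NaN, so the slope is NaN.
beta = slope(asset1, asset2)
print(np.isnan(beta))
```

NaN propagates through every sum and product, so one unusable input series is enough to make the fitted coefficient nan.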

Here's the story. In early 2015, the ticker DTV represented the publicly traded company DirecTV. However, as of 2015-07-24, DirecTV was purchased by AT&T (https://investors.att.com/stockholder-services/cost-basis-guide/acquisitions/directv), the stock was delisted, and the ticker DTV was retired. Normally that doesn't cause issues. Quantopian would pull all the prices it has, and there just wouldn't be any after 2015-07-24.

However, DTE Energy Co. Un subsequently began using the ticker 'DTV'. By default, Quantopian associates a ticker with the most recent company to use it. DTE Energy Co. Un doesn't have any pricing before 2016-09-01, so the series that get_pricing('DTV') returned (as of 2019-06-07) is all NaNs: that company had no prices in the requested 2014-01-01 to 2015-01-01 window.

Probably more information than needed here, but I did want to illustrate the problem. The solution is very simple but often overlooked. Most methods that accept a ticker also accept the optional parameter symbol_reference_date. On any given date only one company uses a ticker, even though different companies may use it over time. Use this parameter to pin the ticker to the company that held it on that date. So, update the above call to

asset1 = get_pricing('DTV', fields='price', start_date=start, end_date=end, symbol_reference_date=end)



Everything will work fine (and the way it ran back in 2015). It's probably a 'best practice' to always pass symbol_reference_date, though most of the time it's not necessary.
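A cheap guard before running any regression is to count the missing prices first. The sketch below uses plain pandas on a stand-in Series, because get_pricing is only available inside the Quantopian research environment. The summarize_missing helper is a hypothetical name for illustration, not part of any Quantopian API.

```python
import numpy as np
import pandas as pd

# Stand-in for what get_pricing('DTV', ...) returned: one NaN per business day.
dates = pd.date_range('2014-01-01', '2015-01-01', freq='B')
asset1 = pd.Series(np.nan, index=dates, name='DTV')

def summarize_missing(prices):
    """Return (NaN count, row count, True if every row is NaN). Hypothetical helper."""
    n_missing = int(prices.isnull().sum())
    return n_missing, len(prices), bool(prices.isnull().all())

n_missing, n_rows, all_missing = summarize_missing(asset1)
print('%d of %d prices are NaN' % (n_missing, n_rows))
if all_missing:
    print('No usable prices: check which company the ticker resolved to.')
```

If the check reports that every price is missing, the ticker most likely resolved to a different company than intended, and passing symbol_reference_date is the first thing to try.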

@Michael Severance. Thank you for pointing this out. We'll update that notebook. In the meantime, attached is one which has the added parameter.
