Trying to Understand Cross Sectional Momentum Factor

Can anyone help me understand how this CrossSectionalMomentum factor works? The shift(100) should leave the first 100 entries in R as NaN, right? Also, shouldn't R.T - R.T.mean() result in zero?

class CrossSectionalMomentum(CustomFactor):
    inputs = [USEquityPricing.close]
    window_length = 252

    def compute(self, today, assets, out, prices):
        prices = pd.DataFrame(prices)
        R = (prices / prices.shift(100))
        out[:] = (R.T - R.T.mean()).T.mean()

6 responses

The easiest and best way to troubleshoot or figure out Python is to use the interactive IPython research environment. I've attached a notebook which steps through the compute function of the custom factor.

Yes, you are correct that shift(100) results in the first 100 entries in R being NaN. However, the pandas mean method defaults to skipna=True, so it ignores those (see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.mean.html). R.T - R.T.mean() won't necessarily be zero: R.T.mean() is the mean across assets for each day, so the difference is each asset's deviation from that day's average. It will be NaN where R.T is NaN, and a value (not necessarily zero) otherwise.
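To make the NaN handling concrete, here is a minimal sketch on made-up data. The toy prices and the lag of 2 (standing in for the 100 in the factor) are assumptions chosen for illustration, not values from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 6 days (rows, earliest first) x 3 assets (columns).
prices = pd.DataFrame(
    [[10.0, 20.0, 30.0],
     [11.0, 20.0, 29.0],
     [12.0, 21.0, 28.0],
     [13.0, 21.0, 27.0],
     [14.0, 22.0, 26.0],
     [15.0, 22.0, 25.0]]
)

lag = 2  # stands in for the 100 used in the factor
R = prices / prices.shift(lag)    # the first `lag` rows are NaN

daily_mean = R.T.mean()           # one mean per day, taken across assets
demeaned = (R.T - daily_mean).T   # NaN in the first `lag` rows, values elsewhere

result = demeaned.mean()          # per-asset average over days; skipna=True drops the NaN rows

print(R)
print(demeaned)
print(result)
```

On each non-NaN day the demeaned values sum to zero across assets, but the individual entries are generally not zero, which is why the factor produces a usable ranking.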

The notebook should explain each step. The real power of using an interactive IPython notebook is that you can see the results of each step along the way. I'd strongly recommend using this environment to develop and debug factors before copying them into an algorithm.
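As a rough illustration of that workflow, the body of compute can be lifted into a plain function and run on synthetic data, with no pipeline involved. The function name momentum_steps and the random-walk prices are hypothetical, and this is only a sketch of the idea, not the contents of the attached notebook.

```python
import numpy as np
import pandas as pd

def momentum_steps(prices, lag=100):
    """Mirror the factor's compute body on a plain (days x assets) array."""
    prices = pd.DataFrame(prices)
    R = prices / prices.shift(lag)
    demeaned = (R.T - R.T.mean()).T
    return R, demeaned, demeaned.mean()

# Synthetic random-walk prices: 252 days x 4 assets (made-up data).
rng = np.random.default_rng(0)
fake_prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(252, 4)), axis=0))

R, demeaned, out = momentum_steps(fake_prices)

# Inspect each intermediate result, as you would cell by cell in a notebook.
print(R.shape, demeaned.shape, out.shape)
print(int(R.isna().all(axis=1).sum()), "all-NaN rows in R")
```

With a 252-day window and a lag of 100, the first 100 rows of R are NaN and the final mean is taken over the remaining 152 days.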

Thank you so much for your help!!!

Thanks a lot great explanation!!!!!

Can I clarify a few things? I'm a bit slow. I'm assuming that day 0 is the most recent day.

I would've thought momentum (M) would be P(n)/P(n-x). So in your shift 5 example M(0) = P(0) / P(5) = 1 / 1.5 = 0.67

However, your code seems to do it the other way around, whereby M(5) = P(n-5) / P(n) = P(5) / P(0) = 1.5 / 1.0 = 1.5.

This confuses me because surely we want to 'look back' to see what prices were in the past relative to today to gauge momentum.

I would have thought for a 5-day momentum it would be P[-1] / P[-5].

I'm not sure why we're shifting prices backwards instead of simply referencing a historic price?

First off, the data at index 0 is the EARLIEST day's data and the data at the last index (i.e. index -1) is the MOST RECENT day's data. The data is indexed in ascending date order (as the index increases, so do the dates). This is the convention in Quantopian for the custom factor data as well as the data returned by 'data.history' and other methods.

The explanation in the notebook may not have been clear (and may even have been misleading) about shifting and the data ordering. You are correct in everything you say except the assumption that day 0 is the most recent day. It's actually the earliest day. Once that is reversed, your logic holds and the shifting is correct.
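A small sketch may help show why shifting and "referencing a historic price" give the same answer once the ordering is ascending. The six prices below are made up, arranged so the earliest price is 1.5 and the most recent is 1.0.

```python
import pandas as pd

# Hypothetical prices for one asset: index 0 is the earliest day,
# index -1 is the most recent day.
p = pd.Series([1.5, 1.4, 1.3, 1.2, 1.1, 1.0])

lag = 5
ratio = p / p.shift(lag)   # row i holds p[i] / p[i - lag]; the first `lag` rows are NaN

# The last row is the most recent price over the price `lag` rows earlier.
latest = ratio.iloc[-1]

# The same number, computed by indexing the historic price directly.
direct = p.iloc[-1] / p.iloc[-1 - lag]

print(ratio)
print(latest, direct)
```

Both expressions evaluate to 1.0 / 1.5, so shift simply computes the "look back" ratio for every row at once. Note that a lag of 5 rows pairs index -1 with index -6, not index -5.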

Ok, day 0 being the earliest day clarifies everything. Thanks