How to create a function that returns "time since last crossover"?

I want to find out how long it has been since a cross-over of two moving averages. Inputs would be (data, context, ma_short, ma_long) where the last two are each a series of moving averages. Here is an example of what I want to accomplish, but I think it can be written a whole lot better.

    import talib

    def time_since_last_crossover(data, context, ma_short, ma_long):
        # Walk backwards from the latest bar, comparing each bar with the one before it
        for i in range(1, len(ma_short)):
            if ma_short[-i] > ma_long[-i] and ma_short[-(i+1)] < ma_long[-(i+1)]:
                # Short MA crossed above: bars from the latest bar back to the crossover
                return i - 1
            elif ma_short[-i] < ma_long[-i] and ma_short[-(i+1)] > ma_long[-(i+1)]:
                # Short MA crossed below: bars from the latest bar back to the crossover
                return i - 1
        # No crossover found in the window
        return None

    # Gather historical prices (last 400 days)
    prices = history(400, '1d', 'price')
    # Calculate moving averages
    ma_short = talib.EMA(prices[context.stock], timeperiod=20)
    ma_long = talib.EMA(prices[context.stock], timeperiod=50)

    # Determine how many days since the last cross-over
    last_crossover_time_difference = time_since_last_crossover(data, context, ma_short, ma_long)


BTW, with the help of James, I created a similar time-based function that returned "time since price was higher".

15 responses

Interesting hmm. Lemme try.

    sign_changes = np.sign(ma_short - ma_long).diff()
    only_sign_changes = sign_changes[sign_changes != 0]
    time_since = sign_changes.index[-1] - only_sign_changes.index[-1]


There may yet be a better way, but that is probably quite a lot quicker than looping, which I try to avoid...

I discovered what "sign()" does: it simply returns -1, 0, or 1 depending on the sign of the original number.

"diff()" is kinda neat, in that it returns the difference between consecutive numbers in a series.

How does Python decide the order in which these operations are done? Is it left to right, "sign() first, then diff()"? That must be the case, since your function does this logic:

1) Find the difference between the two moving averages, on all periods in the series
2) Use the sign() function to only save the sign of the difference (positive, negative, or 0)
3) Use the diff() function to compare the signs (1,-1,0). If the signs stay the same, the moving averages didn't swap positions, and the difference is 0.
4) Create a new series that only contains periods when the moving averages swap position (difference != 0)
5) Find the time difference from the current time to the time of the latest moving average cross-over.
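
The five steps above can be traced on a small made-up series. This is only a sketch: the dates and moving-average values are invented for illustration, and the extra `notna()` filter is my addition so the leading NaN from `diff()` is not counted as a crossover. On ordering: Python evaluates the `np.sign(...)` call first (and the subtraction inside it before that), then calls `.diff()` on whatever `np.sign` returned, so a chain like this reads left to right.

```python
import numpy as np
import pandas as pd

# Invented moving-average values on six consecutive business days
dates = pd.date_range("2015-01-05", periods=6, freq="B")
ma_short = pd.Series([10.0, 10.5, 11.0, 10.8, 10.2, 9.9], index=dates)
ma_long = pd.Series([10.4, 10.6, 10.7, 10.7, 10.5, 10.3], index=dates)

# Steps 1-2: the subtraction runs first, then np.sign() is applied to the result
only_signs = np.sign(ma_short - ma_long)
# Step 3: .diff() is called on the Series that np.sign() returned
sign_changes = only_signs.diff()
# Step 4: keep only rows where the sign flipped (and drop the leading NaN)
only_sign_changes = sign_changes[sign_changes.notna() & (sign_changes != 0)]
# Step 5: time from the latest crossover to the last row
time_since = only_signs.index[-1] - only_sign_changes.index[-1]

print(only_sign_changes)
print(time_since)
```

Here the signs flip on 2015-01-07 and again on 2015-01-09, and the last row is 2015-01-12, so `time_since` is a 3-day Timedelta (calendar days, not trading days).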

@Simon: This solution is 100x more elegant than mine, so thanks for teaching me by example!

@Simon: I think we need to troubleshoot your function (or at least my implementation of it)

1) I got a "numpy.ndarray object has no attribute .diff()" error, so I split the sign() and diff() calls into two steps, which seemed to work fine:

    only_signs = np.sign(ma_short - ma_long)
    sign_changes = np.diff(only_signs)

    # Calculate how long since the latest crossover
    # NOTE: This is the step that breaks
    time_since = only_signs.index[-1] - only_sign_changes.index[-1]


2) But then I get a similar error that "numpy.ndarray" object has no attribute "index". Did we flatten and lose the date information when we ran sign() or diff()?

I tested it in isolation assuming ma_short/ma_long were pandas Series/DataFrames. If they are numpy arrays, you could put them into pandas structures with an index.
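
A minimal sketch of that wrapping step, assuming the moving averages came back as plain numpy arrays with the same length as the price series. The arrays here are invented stand-ins for the talib output, and `prices` stands in for `prices[context.stock]`; the `notna()` filter is my addition to skip the NaN warm-up rows.

```python
import numpy as np
import pandas as pd

# Stand-in data: a dated price series plus two plain numpy arrays that play
# the role of the moving averages (NaN during the warm-up period)
dates = pd.date_range("2015-01-05", periods=5, freq="B")
prices = pd.Series([10.0, 10.2, 10.1, 10.4, 10.6], index=dates)
ma_short_values = np.array([np.nan, 10.1, 10.15, 10.3, 10.5])
ma_long_values = np.array([np.nan, np.nan, 10.2, 10.25, 10.3])

# Re-attach the date index so .diff() and .index are available again
ma_short = pd.Series(ma_short_values, index=prices.index)
ma_long = pd.Series(ma_long_values, index=prices.index)

sign_changes = np.sign(ma_short - ma_long).diff()
only_sign_changes = sign_changes[sign_changes.notna() & (sign_changes != 0)]
time_since = sign_changes.index[-1] - only_sign_changes.index[-1]
print(time_since)
```

Because the index is copied from the price series, the dates line up row for row with the moving-average values.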

This should work for you, just pass in the series/dataframe that's already differenced (centered around 0), and it returns the timedeltas from one cross to the next.

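    import numpy as np
    import pandas as pd

    def get_sign_change_timedeltas(timeseries):
        # The first row of diff() is NaN, not a crossover, so drop it
        sign_changes = np.sign(timeseries).diff().dropna()
        only_sign_changes = sign_changes[sign_changes != 0]
        # Difference between consecutive crossover timestamps
        return pd.DataFrame(only_sign_changes.index).diff()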


I figured out how to do it for all stocks at once as well:

    # Calculate moving averages
    # operate on all prices at once, and stay in pandas so we don't have to keep
    # wrapping talib numpy arrays into pandas stuff
    # (pd.ewma was removed from newer pandas; .ewm(...).mean() is the replacement)
    ma_short = prices.ewm(span=20).mean()
    ma_long = prices.ewm(span=50).mean()

    # Find the difference between moving averages
    # Only save the sign() of these differences (-1,1)
    only_signs = np.sign(ma_short - ma_long)
    sign_changes = only_signs.diff().abs()
    # floating point test
    true_if_sign_changed = sign_changes > 0.5
    # this applies the function to each column (stock) in turn
    # the little function's return value is the last element of the index intersected
    # with the truth of whether that date had a sign change
    times_of_last_sign_change = true_if_sign_changed.apply(lambda s: s.index[s][-1])
    last_row_time = prices.index[-1]
    time_since = last_row_time - times_of_last_sign_change


If anyone can get rid of the .apply, that would speed things up.

Hmm, this is a bit odd, I don't know how you'd get rid of the apply. The function I posted above should be changed to return a series because it doesn't work for DataFrames. The fact that the size of the frame changes is what throws it off.
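
One possible reading of "return a series", as a sketch only: the function name `sign_change_timedeltas_series` is hypothetical, it expects a single already-differenced Series (one stock), and the sample values are invented.

```python
import numpy as np
import pandas as pd

def sign_change_timedeltas_series(timeseries):
    """Hypothetical Series-returning variant: time between consecutive sign changes."""
    sign_changes = np.sign(timeseries).diff()
    # Keep real sign flips only (the first row of diff() is NaN)
    flips = sign_changes[sign_changes.notna() & (sign_changes != 0)]
    # Turn the flip timestamps into a Series so .diff() yields timedeltas
    flip_times = flips.index.to_series()
    return flip_times.diff().dropna()

# Invented (ma_short - ma_long) values on six business days
dates = pd.date_range("2015-01-05", periods=6, freq="B")
spread = pd.Series([-0.4, -0.1, 0.3, 0.1, -0.3, -0.4], index=dates)
gaps = sign_change_timedeltas_series(spread)
print(gaps)
```

For a DataFrame with several stocks, this could be called once per column, since each column generally has a different number of crossovers.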

You can eliminate the ".apply" by replacing that line with this one:

    times_of_last_sign_change = true_if_sign_changed[::-1].idxmax()
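
To see why the reversed `idxmax()` matches the `.apply` version, here is a small check on invented flags (the tickers and dates are made up). `idxmax()` returns the label of the first maximum, and on a boolean column the maximum is `True`, so on the reversed frame that is the last `True` in the original order.

```python
import pandas as pd

dates = pd.date_range("2015-01-05", periods=5, freq="B")
# Invented flags: True on dates where a stock's moving averages crossed
true_if_sign_changed = pd.DataFrame(
    {"AAA": [False, True, False, True, False],
     "BBB": [False, False, True, False, False]},
    index=dates,
)

# With .apply: last index label where each column is True
via_apply = true_if_sign_changed.apply(lambda s: s.index[s.values][-1])
# Without .apply: reverse the rows, then take the first True per column
via_idxmax = true_if_sign_changed[::-1].idxmax()

time_since = dates[-1] - via_idxmax
print(via_idxmax)
print(time_since)
```

One difference to keep in mind: for a column with no `True` at all, the reversed `idxmax()` returns the last row's label (so `time_since` comes out as zero), whereas the `.apply` version raises an IndexError.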



I'd like to calculate "days_since_cross" for all stocks in my universe. Can anyone help to adapt this for use in a pipeline / dataframe?

There may be a faster and more concise solution, but here is a custom factor which returns the trading days since the last time the 20-day ewma crossed over the 50-day ewma. I didn't check what would happen if there never was a crossover (my hunch is it will return 252 days ago). Anyway, it's a place to start.

    import numpy as np
    import pandas as pd
    from quantopian.pipeline import CustomFactor
    from quantopian.pipeline.data.builtin import USEquityPricing

    class Last_Crossover(CustomFactor):
        # Define inputs
        inputs = [USEquityPricing.close]

        # Set window_length to the lookback days. 252 (a year) may be a good value.
        window_length = 252

        def compute(self, today, assets, out, close):
            # First let's turn the numpy array 'close' into a pandas dataframe
            # We can then use the pandas ewm method
            closes_df = pd.DataFrame(
                data=close,
                columns=assets)

            # Make two new dataframes with short and long ewma
            ma_short = closes_df.ewm(span=20).mean()
            ma_long = closes_df.ewm(span=50).mean()

            # Find where the short ewma is greater than the long ewma
            # All we really care about is the sign (ie is the short ewma above or below the long)
            # Need to go back to numpy for an easy sign method
            delta = np.sign(ma_short - ma_long)

            # Almost there.. let's just find the changes or crossovers. This is where the diff is != 0
            crossovers = delta.diff(axis=0)

            # A crossover from below (-1) to above (+1) will have a diff of +2 (+1 - -1)
            # Let's find the last index where this occurs (ie find the last +2)
            # Use the fact that +2 is the max value so use the idxmax method
            # However, this finds the first occurrence, so first reverse our data using a loc trick
            crossovers_rev = crossovers.loc[::-1].reset_index(drop=True)
            last_crossover_up_index = crossovers_rev.idxmax(axis=0)

            # Output the indexes. 0 will be yesterday. 1 will be the day before yesterday etc.
            # If you want yesterday to be 1 then simply add one below (out[:] = last_crossover_up_index+1)
            out[:] = last_crossover_up_index
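
The body of compute() can be tried outside the pipeline on synthetic prices, which also gives a way to look at the no-crossover case. This is a sketch under stated assumptions: the function name `days_since_cross_up` is hypothetical, and the two price columns are invented (one falls for 150 rows and then rises for 102, the other falls for all 252 rows).

```python
import numpy as np
import pandas as pd

def days_since_cross_up(close, short_span=20, long_span=50):
    """Same steps as the compute() body above, applied to a plain 2-D array."""
    closes_df = pd.DataFrame(close)
    ma_short = closes_df.ewm(span=short_span).mean()
    ma_long = closes_df.ewm(span=long_span).mean()
    delta = np.sign(ma_short - ma_long)
    crossovers = delta.diff(axis=0)
    # Reverse the rows so idxmax finds the most recent maximum first
    crossovers_rev = crossovers.loc[::-1].reset_index(drop=True)
    return crossovers_rev.idxmax(axis=0).values

falling = np.arange(300.0, 150.0, -1.0)          # 150 rows: 300 down to 151
rising = np.arange(152.0, 254.0, 1.0)            # 102 rows: 152 up to 253
v_shaped = np.concatenate([falling, rising])     # one upward cross after the low
always_falling = np.arange(400.0, 148.0, -1.0)   # 252 rows, never crosses up

close = np.column_stack([v_shaped, always_falling])
result = days_since_cross_up(close)
print(result)
```

In this check the always-falling column comes back as 0 rather than 252: with no +2 anywhere, the maximum diff is 0 and `idxmax` picks its first occurrence in the reversed data. So a 0 can mean either "crossed on the most recent row" or "no upward cross in the window", and it may be worth masking columns whose maximum diff is below 2 if that distinction matters.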



I've also attached a notebook with the code in action.


Dan - thank you for the response on this. I have been stuck for weeks trying to figure out how (and why) the close prices needed to be turned into a dataframe. Great notes in the explanation as well. I suspect I'll be back with more questions.

Thanks again.

I am having trouble using this CustomFactor in additional calculations. My beginner Python skills are exposed here..

This is what I am trying to accomplish, working off the code that Dan W. provided above:

1. Identify the closing price on the day of the moving average cross
2. Calculate the price change from the day of the cross until the most recent close

I have attached my current notebook, with some changes I made to expand the universe of stocks and notes where I am stuck. Most of this code looks pretty similar to what Dan W. presented, but I have been through a lot of unsuccessful efforts in the meantime.


One handy feature of CustomFactors is they can return multiple outputs. It sounds like you want not only the days since a crossover but the price at the crossover and then the price change since. Since you're doing a lot of the calculations inside the existing CustomFactor it makes sense to tack on these values as additional outputs.

See the attached notebook. It looks like the main problem was how to get the price at the crossover. This can be neatly done using the pandas 'lookup' method (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.lookup.html). Something like this.

    # Assume the crossovers are in a pandas Series called 'last_crossover_index'
    # (indexed by asset) and the prices are in 'closes_df'.
    # Now lookup all the close prices at each crossover
    # One little issue is that the crossovers are indexed from the end.
    # To lookup the close prices we need to get the index from the beginning.
    # This is easily done by simply subtracting those indexes from the length.
    # Then subtract 1 because of 0 based indexing
    length = closes_df.shape[0]
    last_crossover_index_from_begin = length - last_crossover_index - 1

    # Now simply 'lookup' the close price at each index.
    # This handy lookup method returns a numpy array, one value per asset.
    crossing_close = closes_df.lookup(last_crossover_index_from_begin.values, last_crossover_index_from_begin.index)
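
A note for anyone on a newer pandas: `DataFrame.lookup` was deprecated in pandas 1.2 and removed in 2.0. A sketch of the same idea with numpy integer indexing follows, which also covers the price change since the cross. The tickers, prices and crossover offsets are invented, and `pct_change_since_cross` is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Invented closes: 5 rows (oldest first) for two hypothetical assets
closes_df = pd.DataFrame({"AAA": [10.0, 11.0, 12.0, 13.0, 14.0],
                          "BBB": [50.0, 48.0, 46.0, 47.0, 49.0]})
# Rows counted back from the end (0 = last row), one entry per asset
last_crossover_index = pd.Series({"AAA": 3, "BBB": 1})

# Convert "rows from the end" into "rows from the beginning"
length = closes_df.shape[0]
rows_from_begin = length - last_crossover_index - 1

# Pick one (row, column) pair per asset with numpy integer indexing
col_positions = closes_df.columns.get_indexer(rows_from_begin.index)
crossing_close = closes_df.to_numpy()[rows_from_begin.to_numpy(), col_positions]

# Price change from the crossover close to the most recent close
latest_close = closes_df.iloc[-1].to_numpy()
pct_change_since_cross = latest_close / crossing_close - 1
print(crossing_close, pct_change_since_cross)
```

Inside a CustomFactor, these arrays could be written to additional outputs alongside the days-since-cross values.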


For some reason the notebook didn't attach. Here it is.
