How to use factors as inputs to a CustomFactor calculation?

Hi, in pipeline we can use the built-in factors to calculate two factors like these:
FastMA = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=8)
SlowMA = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=34)
I want to continue calculating the moving average of these two results:
Diff = np.abs(FastMA - SlowMA)
AvgDiff = SimpleMovingAverage(inputs=[Diff], window_length=110)
# Clearly Diff isn't an MxN array (it has no time dimension, just the result for one day), so it can't be averaged over a window_length of 110 days.

So, how can I calculate a factor based on the result of other factors?

8 responses

Hi Michael,

You're on the right track. In fact, you got the syntax right and you're thinking about the shape of the data, which is good. When you pass a factor as input to a CustomFactor with a window_length = n, the factor value as it would have been computed over the last n days is populated into the MxN array. Essentially, the CustomFactor knows how to turn the input term into the MxN array that you need.

However, this is only true for pipeline terms that are 'window safe'. A pipeline term that is 'window safe' is a term that is robust to pricing adjustments from splits or dividends - a value that will be the same no matter what day you are looking back from. This is true for normalized values such as returns. SimpleMovingAverage is not a window_safe factor because the result can change depending on whether you are applying an adjustment or not. This is important because the MxN matrix is generated using data as it would have been seen on each day in the lookback window.

As an example, say we have a stock XXX with the following unadjusted price history:

$5, $5, $1, $1

and let's say that on the third price, the stock dropped to $1 because of a 1-to-5 split. If we do a lookback after the 1st day of the data, the adjusted history would look like this:

$5

after the 2nd day it would look like this:

$5, $5

after the 3rd day (1-to-5 split occurred) it would look like this:

$1, $1, $1

after the 4th day:

$1, $1, $1, $1

If I ask for the rolling 2-day SMA, you can see how it would change based on the day I'm asking from, because the split was applied later on. This is what the 2-day SMA would look like as an input to a CustomFactor:

$5, $5, $1, $1

Note how it looks like there is a lot of volatility in the price, when in reality, the adjusted price didn't change at all. However, returns are normalized. This is what the 2-day returns would look like as computed each day:

$0, $0, $0, $0

regardless of what day I'm asking from.
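
The split example can be reproduced in a few lines of plain Python. This is only an illustrative sketch: the helper names (adjusted_history, sma, pct_return) are hypothetical and not part of any Quantopian API, and the adjustment logic is a simplified stand-in for how split adjustments are applied as of each lookback day.

```python
def adjusted_history(raw_prices, split_day, ratio, as_of_day):
    """Prices for days 0..as_of_day as they would be seen from as_of_day.

    Once the split has happened, every pre-split price is divided by the
    split ratio; before that, prices are left unadjusted.
    """
    window = raw_prices[: as_of_day + 1]
    if as_of_day < split_day:
        return list(window)
    return [p / ratio if day < split_day else p for day, p in enumerate(window)]


def sma(values, n):
    """Mean of the last n values (or of all of them if fewer are available)."""
    tail = values[-n:]
    return sum(tail) / len(tail)


def pct_return(values, n):
    """Return from the first to the last of the trailing n values.

    Evaluates to 0.0 when only one price is available.
    """
    tail = values[-n:]
    return tail[-1] / tail[0] - 1.0


raw = [5.0, 5.0, 1.0, 1.0]   # unadjusted closes from the example
SPLIT_DAY, RATIO = 2, 5.0    # the split takes effect on the third price

# One adjusted view of the history per "as of" day.
views = [adjusted_history(raw, SPLIT_DAY, RATIO, d) for d in range(len(raw))]

# What a downstream window would receive: each day's value as computed that day.
sma_as_computed_each_day = [sma(v, 2) for v in views]
returns_as_computed_each_day = [pct_return(v, 2) for v in views]
```

The SMA series jumps from 5 to 1 even though the adjusted price never moved, while the return series stays flat at 0, which is the difference between a term that is not window safe and one that is.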

Does this make sense?

What are you looking to use the Diff_StdDev computation for? There may be a better way to compute a similar statistic that is 'window safe'.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Hi Jamie,
Thanks so much for your reply. It's very clear and reasonable. In fact, I want to use Diff_StdDev to generate some boundary signal like:
RM_Lww = SlowMA + (AvgDiff + Diff_StdDev * 1.3)
RM_Upp = SlowMA + (AvgDiff + Diff_StdDev * 2.3)
Is there a way to do that?
Thank you so much :)

Hi Michael,

Because taking the SMA of the price is not 'window safe', you will need to perform all of the computations in a single CustomFactor so that everything gets computed from the same reference date.

You will need to define the moving average logic in the custom factor as well. I'd recommend using something like pandas.DataFrame.rolling to get the rolling slow and fast MAs.
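
As a rough sketch of what "everything in a single CustomFactor" could look like, the function below takes the (days, assets) array of closes that a factor's compute method receives and derives the two boundary levels from the thread. It uses NumPy cumulative sums for the rolling means rather than pandas.DataFrame.rolling, purely to keep the example dependency-light. The names rolling_mean, band_levels, FAST, SLOW, LOOKBACK and WINDOW_LENGTH are hypothetical, and NaN prices are not handled (they would propagate through the cumulative sum).

```python
import numpy as np

FAST, SLOW, LOOKBACK = 8, 34, 110
# 34 rows are needed for the first SlowMA value, then 109 more for 110 Diff values.
WINDOW_LENGTH = SLOW + LOOKBACK - 1


def rolling_mean(values, n):
    """Trailing n-row mean down axis 0 of a 2-D array.

    The result has values.shape[0] - n + 1 rows (no partial windows).
    """
    csum = np.cumsum(values, axis=0)
    csum = np.vstack([np.zeros((1,) + values.shape[1:]), csum])
    return (csum[n:] - csum[:-n]) / n


def band_levels(close, low_mult=1.3, upp_mult=2.3):
    """Return (RM_Lww, RM_Upp), one value per column of a (days, assets) array."""
    close = np.asarray(close, dtype=float)
    fast = rolling_mean(close, FAST)
    slow = rolling_mean(close, SLOW)
    fast = fast[-len(slow):]                 # align both series on the same end dates
    diff = np.abs(fast - slow)[-LOOKBACK:]   # last 110 daily differences
    avg_diff = diff.mean(axis=0)
    diff_std = diff.std(axis=0)              # population std, like np.std
    latest_slow = slow[-1]
    rm_lww = latest_slow + (avg_diff + diff_std * low_mult)
    rm_upp = latest_slow + (avg_diff + diff_std * upp_mult)
    return rm_lww, rm_upp
```

Inside a CustomFactor declared with inputs=[USEquityPricing.close] and window_length=WINDOW_LENGTH, the compute(self, today, assets, out, close) method could call band_levels(close) and write one of the two results to out[:]. Because the whole window is adjusted as of the same date, the intermediate moving averages never mix differently adjusted prices.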

Hi Jamie,

Thanks for your reply. I am trying to use pandas.DataFrame.rolling to calculate FastMA and SlowMA. However, the data I want to average is itself a computed value: Px = spy.close / (stock price). Right now I use a custom factor, "class SPY_Close_Price(CustomFactor):", to get the close price of SPY each day, but it only gives me a single value, which I can't use to calculate a rolling moving average. Do you have a suggestion for solving this problem? Thanks!


Hi Michael,

You will need to get a trailing price history in your CustomFactor instead of just a single value. Once you have the series of prices, you can perform the rolling computation over that data. Does that make sense?
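
One way to picture this: with window_length = n, the close input to compute is already an n-row trailing history, with one column per asset, so a ratio against SPY can be formed for the whole window at once. The sketch below is a minimal illustration under two assumptions that should be checked against your own setup: that the assets argument is an array of integer ids aligned with the columns of close, and that SPY (sid 8554 in this thread) is among the assets the factor is computed over. The function name relative_price_history is hypothetical.

```python
import numpy as np


def relative_price_history(close, assets, benchmark_sid):
    """Trailing history of benchmark close / asset close for every column.

    close  : (window_length, n_assets) array of closes
    assets : length n_assets sequence of integer asset ids, one per column
    """
    close = np.asarray(close, dtype=float)
    assets = np.asarray(assets)
    matches = np.flatnonzero(assets == benchmark_sid)
    if matches.size == 0:
        raise ValueError("benchmark is not among the assets being computed")
    benchmark = close[:, matches[0]]
    # Broadcast the benchmark column across all asset columns.
    return benchmark[:, np.newaxis] / close
```

The returned array has the same shape as close, so rolling computations such as the fast and slow moving averages can then be run down its rows.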

Hi Jamie, thank you.
Yes, I need to get a trailing price history in my CustomFactor, but I don't know how to get these trailing prices in pipeline (notebooks). In the Algorithms environment, for example, I can use "history":
Asset_Px = history(500, "1d", "close_price")
Asset = np.log(Asset_Px)
spy = history(sid(8554), 500, "1d", "close_price")
Underlying = np.log(spy)
Px = Underlying/Asset  # Px is a trailing history, so FastMA computed from Px is also a trailing history
FastMA = pd.rolling_mean(Px, 8)
SlowMA = pd.rolling_mean(Px, 34)
Diff = np.abs(FastMA - SlowMA)
AvgDiff = np.mean(Diff[-110:], axis=0)  # only the latest value is needed, so no rolling_mean
stddev = np.std(Diff[-110:], axis=0)

So the problem is that in notebooks (pipeline) the price data source is USEquityPricing, and I don't know how to use it to calculate FastMA as a series of trailing values so that I can then take the moving average of FastMA.
Thanks so much for your help!

Hi Michael,
Thanks for this thread. I was just asking Ernesto Perez a very similar type of question and he directed me here.

Hi Jamie,
Thanks for your good explanation. Certainly makes sense.

I'm new to Python and most of the time I still don't have much idea about basics like what type of brackets go where, but hopefully that problem will pass soon enough.

Ernesto made the point to me about needing to specify window_safe = True.
He also suggested that a possible workaround in my case would be to rewrite my code to combine the two individual factors into a single CustomFactor class, so all the data is queried and adjusted at the same time.

With regard to calculating a factor based on the result of other factors, can you provide any general guidelines about when it is preferable to use a combined_factor = Factor_1(inputs=[Factor_2()]) type of approach, and when it is preferable to just re-write the whole thing?

Thanks, best regards,
Tony

Hi Tony,
It seems the "combined_factor = Factor_1(inputs=[Factor_2()])" type of approach does not work well in pipeline, so I just use one CustomFactor that combines everything to get the final factor.