Conditional Z-Score

I am looking for a way to create a conditional z-score in the following context:

def make_ml_pipeline(factors, universe):
    factors_pipe = OrderedDict()
    # factors maps a column name to a function that builds the factor
    for name, f in factors.items():
        factors_pipe[name] = f().zscore()
    pipe = Pipeline(screen=universe, columns=factors_pipe)
    return pipe

Now the z-score subtracts the mean and divides by the standard deviation. When the standard deviation is 0 for a particular date, the z-score fills the column with np.NaN for all assets on that date.
I would like to have the condition:
If standard deviation == 0 then use original value, else calculate the zscore across all assets for that particular date.

For those who want to know why one would need this: if you do any automated machine learning in which you create a null-handling indicator column of zeros and ones, you occasionally get a column where no data is missing (i.e. the whole column is filled with zeros). For that column the standard deviation is 0 as well, and therefore the z-score is np.NaN. Since algorithms cannot calculate with np.NaN, I want it to write 0 instead in that particular situation, in an automated way.
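To make the condition above concrete, here is a minimal sketch in plain NumPy, outside of Pipeline. The function name conditional_zscore and the sample array are hypothetical, and the sketch assumes a population standard deviation (ddof=0) computed per row, with one row per date and one column per asset.

```python
import numpy as np


def conditional_zscore(values):
    """Z-score each row (one date) across its columns (assets).

    Rows whose standard deviation is 0 are returned unchanged
    instead of being turned into NaN.
    """
    values = np.asarray(values, dtype=float)
    mean = np.nanmean(values, axis=1, keepdims=True)
    std = np.nanstd(values, axis=1, keepdims=True)  # ddof=0 is an assumption
    # Swap a zero std for 1 so the division is safe; those rows are
    # replaced by the original values in the final np.where anyway.
    safe_std = np.where(std == 0, 1.0, std)
    zscores = (values - mean) / safe_std
    return np.where(std == 0, values, zscores)


# Hypothetical null-indicator column on two dates, four assets each.
indicator = np.array([
    [0.0, 1.0, 0.0, 1.0],  # some data missing on this date
    [0.0, 0.0, 0.0, 0.0],  # nothing missing: std is 0
])
scaled = conditional_zscore(indicator)
```

The first row is z-scored as usual, while the all-zero row comes back as zeros rather than np.NaN. The same row-wise logic could in principle sit inside a custom factor's compute method, but that part is not shown here.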

Any ideas?

3 responses

I have now tried to use sklearn's StandardScaler() and MinMaxScaler() inside a custom factor so that I can impose if-conditions.

However, as can be seen in the attached notebook, for both formulas I cannot get the correct scaling with the StandardScaler() and MinMaxScaler().

I would appreciate it if somebody could tell me where my formula is wrong. Again I am trying to scale across assets per timestamp.


The issue is that the final screen is not supplied as a mask to the factors. Without a mask, ALL securities are passed to a factor. The enterprise_value_minmaxscaled factor therefore scales across all securities; then, when the pipeline is run, some are excluded from the results by the QTradableStocksUS screen. These excluded securities happen to be the ones with the min values.

So, add something like this

# Add the same factor mask as the screen to ensure the same 'universe' for both  
ml_pipe=make_ml_pipeline(make_factors(QTradableStocksUS()), QTradableStocksUS())

# Add a mask parameter when creating each of the factors  
def make_factors(mask):  
    def enterprise_value_original():  
        return Fundamentals.enterprise_value.latest  
    def enterprise_value_minmaxscaled():  
        return Nullfilling_minmaxscaled([Fundamentals.enterprise_value], mask=mask)  
    def enterprise_value_standardscaled():  
        return Nullfilling_standardscaled([Fundamentals.enterprise_value], mask=mask)  
    def enterprise_value_nulls():  
        return Nullindicator([Fundamentals.enterprise_value], mask=mask)  
    def enterprise_value_nulls_minmaxscaled():  
        return Nullindicator_minmaxscaled([Fundamentals.enterprise_value], mask=mask)

    def enterprise_value_nulls_standardscaled():  
        return Nullindicator_standardscaled([Fundamentals.enterprise_value], mask=mask)

    return all_factors

See attached notebook. The enterprise_value_minmaxscaled min value is now zero after applying the mask.
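The effect of the mask can also be reproduced in plain NumPy, without Pipeline. This is only a rough sketch: minmax_scale, the enterprise values, and the universe membership below are all made up for illustration.

```python
import numpy as np


def minmax_scale(values):
    """Scale a 1-D array to [0, 1] using its own min and max."""
    lo, hi = np.nanmin(values), np.nanmax(values)
    return (values - lo) / (hi - lo)


# Hypothetical enterprise values for five securities on one date.
enterprise_value = np.array([10.0, 20.0, 50.0, 80.0, 100.0])
# Hypothetical universe membership: the smallest security is excluded.
in_universe = np.array([False, True, True, True, True])

# Without a mask: scale over every security, then apply the screen.
scaled_then_screened = minmax_scale(enterprise_value)[in_universe]

# With a mask: restrict to the universe first, then scale.
screened_then_scaled = minmax_scale(enterprise_value[in_universe])

print(scaled_then_screened.min())  # greater than 0: the true min was screened out
print(screened_then_scaled.min())  # 0.0
```

In the first case the security that defined the minimum is dropped after scaling, so the smallest value left in the results is above zero. In the second case the scaling only ever sees the securities that end up in the results.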

Hope that helps.


Thank you Dan, easy and elegant solution!