Piotroski F-Score Alphalens notebook

Feedback and improvements welcome.

I grabbed the Pipeline custom factor posted by Praveen Bhushan here:

https://www.quantopian.com/posts/piotroskis-f-score-algorithm

Note that the output of the factor is put through this function, with WIN_LIMIT = 0.0:

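import numpy as np
from scipy.stats.mstats import winsorize
from sklearn import preprocessing

WIN_LIMIT = 0.0

def preprocess(a):
    a = np.nan_to_num(a - np.nanmean(a))  # demean; NaNs become 0
    a = winsorize(a, limits=[WIN_LIMIT, WIN_LIMIT])  # clip the tails
    return preprocessing.scale(a)  # convert to z-scores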
8 responses

Hi Grant, what is the purpose of the preprocess function?
Also your Piotroski score isn't 1-9, have you normalised it?

Hello Kaya -

The preprocess function allows for the possibility of winsorizing, which limits the effect of outliers by clipping extreme values to a chosen percentile (see https://en.wikipedia.org/wiki/Winsorizing); with WIN_LIMIT = 0.0, nothing is clipped. It also converts the data to z-scores via sklearn.preprocessing.scale, which is why the output is not on the raw F-Score scale. For a single factor, this does not change the ranking, but when combining factors, all of the factors need to be similarly scaled in some fashion.
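As a rough illustration of both points, here is a minimal sketch in plain Python (standard library only, not the Pipeline code). The scores and the second factor, `momentum`, are made-up values, and `zscore` and `ranking` are hypothetical helpers; `zscore` divides by the population standard deviation, as sklearn.preprocessing.scale does by default.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: every z-score is zero
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def ranking(values):
    """Positions of the values ordered from highest to lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Hypothetical data for five stocks
f_scores = [7, 3, 9, 5, 1]                  # raw F-Scores (0 to 9)
momentum = [0.02, 0.10, -0.05, 0.01, 0.04]  # made-up second factor

z_f = zscore(f_scores)
z_m = zscore(momentum)

# Single factor: z-scoring leaves the ordering unchanged
same_order = ranking(f_scores) == ranking(z_f)

# Combined factors: the raw sum is dominated by the F-Score's larger scale,
# while the z-scored sum lets both factors contribute
raw_combo = [f + m for f, m in zip(f_scores, momentum)]
z_combo = [a + b for a, b in zip(z_f, z_m)]

print(same_order)
print(ranking(raw_combo))
print(ranking(z_combo))
```

In this toy data the raw sum ranks the stocks exactly as the F-Score alone does, while the z-scored sum produces a different ordering because the second factor now carries comparable weight.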

Thanks Grant.
I made it into an algo, and it doesn't seem to perform anywhere near what others have reported with the F-Score or what is blogged about. Can you see what I am doing wrong?

# import pipeline methods 
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline

# import built in factors and filters
import quantopian.pipeline.factors as Factors
import quantopian.pipeline.filters as Filters
from quantopian.pipeline.filters import QTradableStocksUS
from quantopian.pipeline.factors import CustomFactor

# import any datasets we need
from quantopian.pipeline.data.builtin import USEquityPricing 
from quantopian.pipeline.data import Fundamentals

from scipy.stats.mstats import winsorize
from sklearn import preprocessing
# import numpy and pandas just in case
import numpy as np
import pandas as pd

WIN_LIMIT = 0.0

def preprocess(a):
    a = np.nan_to_num(a - np.nanmean(a))
    a = winsorize(a, limits=[WIN_LIMIT,WIN_LIMIT])
    return preprocessing.scale(a)

class Piotroski(CustomFactor):
    inputs = [
        Fundamentals.roa,
        Fundamentals.operating_cash_flow,
        Fundamentals.cash_flow_from_continuing_operating_activities,
        Fundamentals.long_term_debt_equity_ratio,
        Fundamentals.current_ratio,
        Fundamentals.shares_outstanding,
        Fundamentals.gross_margin,
        Fundamentals.assets_turnover,
    ]

    # Row [-1] is the latest value; row [0] is the oldest value in the
    # 100-day window.
    window_length = 100

    def compute(self, today, assets, out, roa, cash_flow, cash_flow_from_ops,
                long_term_debt_ratio, current_ratio, shares_outstanding,
                gross_margin, assets_turnover):

        # Profitability: four binary checks
        profit = (
            (roa[-1] > 0).astype(int) +
            (cash_flow[-1] > 0).astype(int) +
            (roa[-1] > roa[0]).astype(int) +
            # Note: as posted, this compares a cash flow amount with ROA,
            # which is a ratio
            (cash_flow_from_ops[-1] > roa[-1]).astype(int)
        )

        # Leverage, liquidity and dilution: three binary checks
        leverage = (
            (long_term_debt_ratio[-1] < long_term_debt_ratio[0]).astype(int) +
            (current_ratio[-1] > current_ratio[0]).astype(int) +
            (shares_outstanding[-1] <= shares_outstanding[0]).astype(int)
        )

        # Operating efficiency: two binary checks
        operating = (
            (gross_margin[-1] > gross_margin[0]).astype(int) +
            (assets_turnover[-1] > assets_turnover[0]).astype(int)
        )

        # The raw score is 0 to 9; preprocess() converts it to a z-score
        out[:] = preprocess(profit + leverage + operating)
 
def initialize(context):
    set_benchmark(symbol('SPY'))
    # Create and attach pipeline to get data
    attach_pipeline(my_pipeline(context), name='my_pipe')
    
    # Rebalance monthly on the first day of the month at market open
    schedule_function(rebalance,                      
                      #date_rule=date_rules.week_start(),
                      date_rule=date_rules.month_start(),
                      time_rule=time_rules.market_open())
    
def my_pipeline(context):
    base_universe = QTradableStocksUS()
    piotroski = Piotroski()

    return Pipeline(
        columns={
            'piotroski': piotroski
        },

        screen=(
           piotroski.top(15, mask=base_universe)
        )
    )

def before_trading_start(context, data): 
    context.output = pipeline_output('my_pipe')
    record(cash=context.portfolio.cash, asset=context.portfolio.portfolio_value)
 
def rebalance(context, data):    
    # Exit positions that are no longer in the pipeline output
    for stock in context.portfolio.positions:
        if stock not in context.output.index:
            order_target(stock, 0)
                
    # Create weights for each stock
    weight = create_weights(context, context.output.index, data)

    # Rebalance all stocks to target weights
    for stock in context.output.index:
        if data.can_trade(stock):
            order_target_percent(stock, weight)
                
    
def create_weights(context, stocks, data):
    """
    Takes in a list of securities and weights them all equally.
    Pipeline only returns securities that can trade, so there is
    no need to check data.can_trade here.
    """
    cantradecount = len(stocks)

    if cantradecount == 0:
        return 0
    else:
        # Invest only 0.99 of portfolio value to avoid borrowing
        weight = 0.99 / cantradecount
        return weight
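
To see what raw score the factor produces before preprocess is applied, here is a minimal sketch of the same nine checks for a single stock in plain Python. The dictionary keys and all numbers are hypothetical. The checks mirror the factor as posted, including the comparison of operating cash flow with ROA, so this is not a reference implementation of Piotroski's original definition.

```python
def f_score(now, past):
    """Count how many of the nine binary checks pass (0 to 9)."""
    checks = [
        # Profitability
        now["roa"] > 0,
        now["operating_cash_flow"] > 0,
        now["roa"] > past["roa"],
        now["cash_flow_from_ops"] > now["roa"],  # mirrors the posted factor
        # Leverage, liquidity and dilution
        now["lt_debt_ratio"] < past["lt_debt_ratio"],
        now["current_ratio"] > past["current_ratio"],
        now["shares_outstanding"] <= past["shares_outstanding"],
        # Operating efficiency
        now["gross_margin"] > past["gross_margin"],
        now["assets_turnover"] > past["assets_turnover"],
    ]
    return sum(checks)


# Hypothetical fundamentals at the start and end of the lookback window
past = {
    "roa": 0.04, "operating_cash_flow": 100.0, "cash_flow_from_ops": 90.0,
    "lt_debt_ratio": 0.50, "current_ratio": 1.2, "shares_outstanding": 1000,
    "gross_margin": 0.30, "assets_turnover": 0.80,
}
now = {
    "roa": 0.05, "operating_cash_flow": 120.0, "cash_flow_from_ops": 110.0,
    "lt_debt_ratio": 0.45, "current_ratio": 1.1, "shares_outstanding": 1000,
    "gross_margin": 0.32, "assets_turnover": 0.75,
}

print(f_score(now, past))  # 7
```

Here the current ratio and asset turnover checks fail, so the raw score is 7 out of 9. The Pipeline factor then passes these raw scores through preprocess, which is why its output is not an integer in that range.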

Not sure. Whatever you are using as your benchmarks ("it doesn't seem to perform anywhere near what some others have got with f-score and what is blogged about"), you'll need to do a detailed comparison to make sure you've implemented your algo in the same fashion. Keep in mind that, inevitably, anything in the public domain claiming to provide good performance is somehow biased (sometimes intentionally!). So, to make a fair comparison, you need to apply the same bias.

Here are a couple of benchmarks that got me interested, although I want to use it as part of a larger strategy.

Piotroski1

Piotroski sorted by ebita

Piotroski 3

The winsorize is very cool too, thanks for that.

As part of a larger strategy, if the factor has even a little bit of "alpha" you should be able to combine it with other factors.

thanks for the model @Leo