Magic Formula

The magic formula has been discussed in this forum in the past, but not many backtest results have been shared so far. Here I've implemented Greenblatt's strategy, with minor modifications such as filtering out mining and pharmaceutical companies. I've run backtests in segments with different market-cap ranges, which showed that eliminating small-cap stocks under a billion-dollar cap improves the overall return. Small-cap baskets tend to get destroyed by a number of companies losing more than 30 percent of their value. As Greenblatt remarks, the strategy has periods of underperformance relative to the S&P 500, but in the long run it does seem to come out slightly ahead.

I'm now interested in comparing the predictive power of fundamental ratios. The original formula weights return on capital (ROIC) and earnings yield (EY) equally, but I found some discussions arguing EY should be weighted more heavily. Ideally I would like to run some regression analysis on these two ratios and other fundamental metrics, but it's difficult to find free historical fundamental data to test this thesis, so I'm wondering if someone here with some experience can chime in.
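The combined ranking described above can be sketched outside Quantopian with plain pandas. This is an illustrative toy, not the algorithm below: the tickers and numbers are invented, and the `ey_weight`/`roic_weight` parameters are my own addition to show how tilting toward EY changes the ordering.

```python
# Standalone sketch (plain pandas) of the Magic Formula's combined rank:
# rank 1 is best on each metric, the two ranks are summed, and an optional
# weight lets earnings yield count for more. Data is made up for illustration.
import pandas as pd

df = pd.DataFrame({
    'earnings_yield': [0.15, 0.12, 0.08, 0.05],
    'roic':           [0.20, 0.40, 0.10, 0.30],
}, index=['AAA', 'BBB', 'CCC', 'DDD'])

def magic_formula_rank(df, ey_weight=1.0, roic_weight=1.0):
    ey_rank   = df['earnings_yield'].rank(ascending=False)  # 1 = highest EY
    roic_rank = df['roic'].rank(ascending=False)            # 1 = highest ROIC
    # lower combined score = better
    return (ey_weight * ey_rank + roic_weight * roic_rank).sort_values()

equal   = magic_formula_rank(df)                 # Greenblatt's equal weighting
ey_tilt = magic_formula_rank(df, ey_weight=3.0)  # EY counted three times

print(equal.index.tolist())    # ['BBB', 'AAA', 'DDD', 'CCC']
print(ey_tilt.index.tolist())  # ['AAA', 'BBB', 'CCC', 'DDD']
```

Note how AAA (great EY, mediocre ROIC) jumps from second to first once EY is over-weighted; that is exactly the kind of sensitivity a regression on the two ratios would try to quantify.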

[Backtest results attached.]
"""
This is a template algorithm on Quantopian for you to adapt and fill in.
"""
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data import morningstar
import pandas as pd
import numpy as np

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Rebalance every day, 1 hour after market open.
    #schedule_function(my_rebalance, date_rules.every_day(), time_rules.market_open(hours=1))

    # Record tracking variables at the end of each day.
    #schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close())

    # Create our dynamic stock selector.
    context.capacity = 25.0
    context.weight = 1.0 / context.capacity
    set_long_only()

    # schedule for buying a week after the year start
    schedule_function(func=schedule_task_a,
                      date_rule=date_rules.month_start(4),
                      time_rule=time_rules.market_open())
    # schedule for selling losers a week before the year start
    schedule_function(func=schedule_task_b,
                      date_rule=date_rules.month_end(4),
                      time_rule=time_rules.market_open())
    # schedule for selling winners on the 7th day of year start
    schedule_function(func=schedule_task_c,
                      date_rule=date_rules.month_start(3),
                      time_rule=time_rules.market_close())

def schedule_task_a(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            print stock
        for stock in context.stocks.index:
            order_target_percent(stock, context.weight)

# selling losers
def schedule_task_b(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 12 and context.portfolio.positions_value != 0:
        for stock in context.portfolio.positions:
            if context.portfolio.positions[stock].cost_basis > data[stock].price:
                order_target_percent(stock, 0)
        print today, 'losers sold'

# selling winners
def schedule_task_c(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            order_target_percent(stock, 0)
"""
A function to create our dynamic stock selector (pipeline). Documentation on
pipeline can be found here: https://www.quantopian.com/help#pipeline-title
"""

fundamental_df = get_fundamentals(

query(
#min market cap at 50 mil, finance and foreign stocks excluded,
#earnings yield, return on capital
fundamentals.asset_classification.morningstar_sector_code,
fundamentals.income_statement.ebit,
fundamentals.valuation.enterprise_value,
fundamentals.operation_ratios.roic,
fundamentals.income_statement.ebitda
)

#    .filter(fundamentals.valuation.market_cap > 50000000)
#    .filter(fundamentals.valuation.market_cap < 500000000)

.filter(fundamentals.valuation.market_cap > 1000000000)
.filter(fundamentals.valuation.market_cap < 10000000000)
#.filter(fundamentals.valuation_ratios.ev_to_ebitda > 0)
.filter(fundamentals.asset_classification.morningstar_sector_code != 103)
.filter(fundamentals.asset_classification.morningstar_sector_code != 207)
.filter(fundamentals.asset_classification.morningstar_sector_code != 206)
.filter(fundamentals.asset_classification.morningstar_sector_code != 309)
.filter(fundamentals.asset_classification.morningstar_industry_code != 20533080)
.filter(fundamentals.asset_classification.morningstar_industry_code != 10217033)
.filter(fundamentals.asset_classification.morningstar_industry_group_code != 10106)
.filter(fundamentals.asset_classification.morningstar_industry_group_code != 10104)
.filter(fundamentals.valuation.shares_outstanding != None)
.filter(fundamentals.valuation.market_cap != None)
.filter(fundamentals.valuation.shares_outstanding != None)
.filter(fundamentals.company_reference.primary_exchange_id != "OTCPK") # no pink sheets
.filter(fundamentals.company_reference.primary_exchange_id != "OTCBB") # no pink sheets
.filter(fundamentals.company_reference.country_id == "USA")
.filter(fundamentals.asset_classification.morningstar_sector_code != None) # require sector
.filter(fundamentals.share_class_reference.is_primary_share == True) # remove ancillary classes
.filter(((fundamentals.valuation.market_cap*1.0) / (fundamentals.valuation.shares_outstanding*1.0)) > 10.0)  # stock price > $1 .filter(fundamentals.share_class_reference.is_depositary_receipt == False) # !ADR/GDR .filter(~fundamentals.company_reference.standard_name.contains(' LP')) # exclude LPs .filter(~fundamentals.company_reference.standard_name.contains(' L P')) .filter(~fundamentals.company_reference.standard_name.contains(' L.P')) .filter(fundamentals.balance_sheet.limited_partnership == None) # exclude LPs #.order_by(fundamentals.valuation_ratios.ev_to_ebitda.asc()) ) fundamental_df.loc['earnings_yield'] = fundamental_df.loc['ebit']/fundamental_df.loc['enterprise_value'] #print fundamental_df.loc['ebit'], fundamental_df.loc['enterprise_value'], fundamental_df.loc['earnings_yield'] #rank the companies based on their earnings yield earnings_yield = fundamental_df ey = earnings_yield.loc['earnings_yield'] rank_ey = ey.rank(ascending = 0) #rank the companies based on the return on capital rank_roic = fundamental_df.loc['roic'].rank(ascending = 0) total_rank = rank_ey + rank_roic sorted_rank = total_rank.sort_values() print ey, rank_ey print rank_roic, total_rank print sorted_rank #get 100 best stocks context.stocks = sorted_rank[0:int(context.capacity)]  There was a runtime error. 14 responses Hi, Jonh. Thank you for sharing your code. It was the great starting point for me in learning Quantopian. But, It seems that we can't use any more get_fundamentals() which is in your code. I tried to modify your code that uses pipeline and I backtested it for same period. I just modified 3 lines and add make_pipeline() in your code. modified below. 1. for stock in context.stocks.index: to for stock in context.output.index: (line 52 in your code) (To use pipeline's output.) 2. data[stock].price: to data.current(stock, 'price'): (line 60 in your code) ( Because it looks that data[stock].price was deprecated.) 3. 
context.output = pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=True).head(int(context.capacity))

(in before_trading_start() in your code)

Added:

1. my_pipe = make_pipeline()
2. attach_pipeline(my_pipe, 'my_pipeline') (both in initialize() in your code)
3. make_pipeline(): ...

But the results were quite different: my version got lower returns than the benchmark and a higher max drawdown. Could you please look at my code and help me?

[Backtest results attached.]

"""
This is a template algorithm on Quantopian for you to adapt and fill in.
"""
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data import morningstar, Fundamentals
from quantopian.pipeline.filters.morningstar import IsPrimaryShare, IsDepositaryReceipt
import pandas as pd
import numpy as np

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Create our dynamic stock selector.
    context.capacity = 25.0
    context.weight = 1.0 / context.capacity
    context.buy = True

    my_pipe = make_pipeline()
    attach_pipeline(my_pipe, 'my_pipeline')

    set_slippage(slippage.FixedSlippage(spread=0.02))
    set_long_only()

    # schedule for buying a week after the year start
    schedule_function(func=schedule_task_a,
                      date_rule=date_rules.month_start(4),
                      time_rule=time_rules.market_open())
    # schedule for selling losers a week before the year start
    schedule_function(func=schedule_task_b,
                      date_rule=date_rules.month_end(4),
                      time_rule=time_rules.market_open())
    # schedule for selling winners on the 7th day of year start
    schedule_function(func=schedule_task_c,
                      date_rule=date_rules.month_start(3),
                      time_rule=time_rules.market_close())

def schedule_task_a(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            print stock
        for stock in context.output.index:
            order_target_percent(stock, context.weight)

# selling losers
def schedule_task_b(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 12 and context.portfolio.positions_value != 0:
        for stock in context.portfolio.positions:
            if context.portfolio.positions[stock].cost_basis > data.current(stock, 'price'):
                order_target_percent(stock, 0)
        print today, 'losers sold'

# selling winners
def schedule_task_c(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            order_target_percent(stock, 0)

def make_pipeline():
    not_lp_name = ~Fundamentals.standard_name.latest.matches('.* L[. ]?P.?$')
    is_primary_share = IsPrimaryShare()
    is_not_depositary_receipt = ~IsDepositaryReceipt()

    filter_market_cap = ((Fundamentals.market_cap.latest > 1000000000) &
                         (Fundamentals.market_cap.latest < 10000000000))

    filter_sectors = (
        (Fundamentals.morningstar_sector_code.latest != 103) &
        (Fundamentals.morningstar_sector_code.latest != 207) &
        (Fundamentals.morningstar_sector_code.latest != 206) &
        (Fundamentals.morningstar_sector_code.latest != 309) &
        (Fundamentals.morningstar_industry_code.latest != 20533080) &
        (Fundamentals.morningstar_industry_code.latest != 10217033) &
        (Fundamentals.morningstar_industry_group_code.latest != 10106) &
        (Fundamentals.morningstar_industry_group_code.latest != 10104)
    ) & filter_market_cap

    filter_plus = (
        Fundamentals.shares_outstanding.latest.notnull() &
        Fundamentals.market_cap.latest.notnull() &
        (Fundamentals.primary_exchange_id.latest != "OTCPK") &
        (Fundamentals.primary_exchange_id.latest != "OTCBB") &
        Fundamentals.country_id.latest.matches("USA") &
        Fundamentals.morningstar_sector_code.latest.notnull() &
        (USEquityPricing.close.latest > 10.0) &
        not_lp_name &
        is_primary_share &
        is_not_depositary_receipt
    ) & filter_sectors

    earnings_yield = Fundamentals.ebit.latest / Fundamentals.enterprise_value.latest
    EY_rank = earnings_yield.rank(ascending=False)
    roic = Fundamentals.roic.latest
    roic_rank = roic.rank(ascending=False)
    MF_rank = EY_rank + roic_rank

    pipe = Pipeline(columns={
        'earnings_yield': earnings_yield,
        'roic': roic,
        'MF_rank': MF_rank,
    }, screen=filter_plus)
    return pipe

"""
A function to create our dynamic stock selector (pipeline). Documentation on
pipeline can be found here: https://www.quantopian.com/help#pipeline-title
"""
# context.stocks = sorted_rank[0:int(context.capacity)]
There was a runtime error.

Best to rank using the mask:

EY_rank = earnings_yield.rank(ascending=False, mask=filter_plus)
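The reason the mask matters can be shown with a standalone pandas sketch (not pipeline code; tickers and values are invented): ranking the full universe and screening afterwards lets excluded names consume rank slots, which shifts every surviving rank and distorts the combined score.

```python
# Rank-then-screen vs screen-then-rank on a toy earnings-yield series.
import pandas as pd

ey = pd.Series({'AAA': 0.15, 'BBB': 0.12, 'CCC': 0.08, 'DDD': 0.05})
passes_screen = pd.Series({'AAA': False, 'BBB': True, 'CCC': True, 'DDD': True})

# rank over everything, then screen: AAA still consumed rank 1
rank_then_screen = ey.rank(ascending=False)[passes_screen]
# screen first (what mask= does in pipeline): ranks run 1..3 over survivors
screen_then_rank = ey[passes_screen].rank(ascending=False)

print(rank_then_screen.to_dict())  # {'BBB': 2.0, 'CCC': 3.0, 'DDD': 4.0}
print(screen_then_rank.to_dict())  # {'BBB': 1.0, 'CCC': 2.0, 'DDD': 3.0}
```

With only one factor the ordering survives, but once two differently distorted rank series are summed into MF_rank, the selection itself can change.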


Remove the close.latest filter; it's not in the original.

This is a little different.

[Backtest results attached.]
from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.filters import Q500US, Q1500US, Q3000US, QTradableStocksUS
from quantopian.pipeline.data import morningstar, Fundamentals
from quantopian.pipeline.filters.morningstar import IsPrimaryShare, IsDepositaryReceipt
import pandas as pd
import numpy as np

def initialize(context):
    # Create our dynamic stock selector.
    context.capacity = 25

    my_pipe = make_pipeline()
    attach_pipeline(my_pipe, 'my_pipeline')

    set_long_only()

    # schedule for buying a week after the year start
    schedule_function(func=schedule_task_a,
                      date_rule=date_rules.month_start(4),
                      time_rule=time_rules.market_open())
    # schedule for selling losers a week before the year start
    schedule_function(func=schedule_task_b,
                      date_rule=date_rules.month_end(4),
                      time_rule=time_rules.market_open())
    # schedule for selling winners on the 7th day of year start
    schedule_function(func=schedule_task_c,
                      date_rule=date_rules.month_start(3),
                      time_rule=time_rules.market_close())

def schedule_task_a(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            print stock
        for stock in context.output.index:
            order_target_percent(stock, context.output.T[stock]['weight'])  # .T is transpose; .loc would be cleaner
            #order_target_percent(stock, context.weight)

# selling losers
def schedule_task_b(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 12 and context.portfolio.positions_value != 0:
        for stock in context.portfolio.positions:
            if context.portfolio.positions[stock].cost_basis > data.current(stock, 'price'):
                order_target_percent(stock, 0)
        print today, 'losers sold'

# selling winners
def schedule_task_c(context, data):
    today = get_datetime('US/Eastern')
    if today.month == 1:
        for stock in context.portfolio.positions:
            order_target_percent(stock, 0)

def make_pipeline():
    m  = (Fundamentals.market_cap.latest > 1000000000)
    m &= (Fundamentals.market_cap.latest < 10000000000)
    m &= (
        (Fundamentals.morningstar_sector_code.latest != 103) &
        (Fundamentals.morningstar_sector_code.latest != 207) &
        (Fundamentals.morningstar_sector_code.latest != 206) &
        (Fundamentals.morningstar_sector_code.latest != 309) &
        (Fundamentals.morningstar_industry_code.latest != 20533080) &
        (Fundamentals.morningstar_industry_code.latest != 10217033) &
        (Fundamentals.morningstar_industry_group_code.latest != 10106) &
        (Fundamentals.morningstar_industry_group_code.latest != 10104)
    )

    earnings_yield = Fundamentals.ebit.latest / Fundamentals.enterprise_value.latest
    roic      = Fundamentals.roic.latest
    EY_rank   = earnings_yield.rank(ascending=False, mask=m)
    roic_rank = roic.rank(ascending=False, mask=m)
    MF_rank   = EY_rank + roic_rank

    pipe = Pipeline(columns = {
        'earnings_yield': earnings_yield,
        'roic'          : roic,
        'MF_rank'       : MF_rank,
    }, screen = m )
    return pipe

def before_trading_start(context, data):
    context.output = pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=True).head(int(context.capacity))

    # weight as rank, normalized so weights sum to 1
    context.output['weight'] = context.output['MF_rank'] / context.output['MF_rank'].sum()

    #context.weight = 1.0 / len(context.output)
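The rank-to-weight step above has a subtlety worth flagging, sketched here in standalone pandas (toy data, not Quantopian code): because a lower MF_rank is better, `weight = rank / rank.sum()` gives the largest weight to the worst of the selected names. An inverted rank, one of several possible fixes, gives more to the best; both versions still sum to 1.

```python
import pandas as pd

mf_rank = pd.Series({'AAA': 4.0, 'BBB': 6.0, 'CCC': 10.0})  # lower = better

w_direct = mf_rank / mf_rank.sum()        # AAA (the best) gets the smallest weight
inverted = (mf_rank.max() + 1) - mf_rank  # flip so the best name is largest
w_best_heavy = inverted / inverted.sum()  # AAA now gets the largest weight

print(w_direct.round(2).to_dict())      # {'AAA': 0.2, 'BBB': 0.3, 'CCC': 0.5}
print(w_best_heavy.round(2).to_dict())  # {'AAA': 0.54, 'BBB': 0.38, 'CCC': 0.08}
```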

Thanks, Blue Seahawk.

@K, thank you for mentioning it, and here are some more possibilities too.
Mainly, run this and take a look at the log window for visibility into the pipeline values.
This is just a start toward adding shorting if you wish; it would need some work.

The focus here is to provide options, tools and flexibility. For example:
- class Wild() for quick development when trying things out,
- normalization of positive and negative weights separately, to make it possible to add shorting if you wish (that's where things went south with the last-minute addition of norm()),
- logging of pipeline min, mean, max and some highs and lows,
- forward-filling of NaNs (adding a CustomFactor class was necessary there, to have a window to work with rather than just latest),
- an example of percentile_between you might want to try some numbers in,
- examples of zscore and demean (one way to obtain some negative values for short shares),
- a slightly more efficient route for 'today',
- an efficient PnL determination for long and short simultaneously, which makes adding shorting easier to work with if you're interested in, say, qualifying for the contest.

Returns were not the point so this backtest is only a few days. Rather than going with this algo, you could use it to copy/paste various bits over to yours in trying some things.

[Backtest results attached.]
''' https://www.quantopian.com/posts/magic-formula
Bit of a mess made this time and yet some things to think about, raw materials to work with.
'''

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline  import CustomFactor
from quantopian.pipeline  import Pipeline
from quantopian.pipeline.data         import Fundamentals
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters      import Q500US, Q1500US, Q3000US, QTradableStocksUS
from quantopian.pipeline.classifiers.fundamentals import Sector
import numpy  as np
import pandas as pd

def initialize(context):
    # Record tracking variables at the end of each day.
    #schedule_function(records, date_rules.every_day(), time_rules.market_close())

    # Create dynamic stock selector.
    context.capacity = 125

    pipe = make_pipeline()
    attach_pipeline(pipe, 'pipeline')

    #set_long_only()

    # Buying a week after the year start
    schedule_function(opens,         date_rules.month_start(4), time_rules.market_open())
    # Selling losers a week before the year start
    schedule_function(close_losers,  date_rules.month_end(4),   time_rules.market_open())
    # Selling winners on the nth day of year start
    schedule_function(close_winners, date_rules.month_start(3), time_rules.market_close())

def opens(context, data):            # opening positions
    if context.today.month != 1: return

    log.info('current: {}'.format([s.symbol for s in context.portfolio.positions]))
    opening = []
    for s in context.output.index:
        # with norm() some names can be missing from the weights index;
        # uncomment and investigate with the debugger if curious.
        if s not in context.weights.index: continue
        order_target_percent(s, context.weights[s])
        #order_target_percent(s, context.output.T[s]['weight'])
        #order_target_percent(s, context.weight)
        opening.append(s.symbol)
    log.info('opening: {}'.format(str(opening)))

def close_losers(context, data):     # selling losers
    if context.today.month != 12 or context.portfolio.positions_value == 0:
        return

    # A way to handle the cost-basis vs price relationship for both
    # short & long at the same time:
    #   pnl = pos[s].amount * (data.current(s, 'price') - pos[s].cost_basis)
    pos = context.portfolio.positions
    for s in pos:
        #if pos[s].cost_basis > data.current(s, 'price'):
        #    order_target_percent(s, 0)
        pnl = pos[s].amount * (data.current(s, 'price') - pos[s].cost_basis)
        if pnl < 0:
            order_target(s, 0)
    print context.today, 'losers sold'

def close_winners(context, data):    # selling winners
    if context.today.month != 1: return

    #for s in context.portfolio.positions:
    #    order_target_percent(s, 0)
    pos = context.portfolio.positions
    for s in pos:
        pnl = pos[s].amount * (data.current(s, 'price') - pos[s].cost_basis)
        if pnl > 0:
            order_target(s, 0)

def make_pipeline():
    m  = (Fundamentals.market_cap.latest > 1000000000)
    m &= (Fundamentals.market_cap.latest < 10000000000)
    m &= (
        (Fundamentals.morningstar_sector_code.latest   != 103) &
        (Fundamentals.morningstar_sector_code.latest   != 207) &
        (Fundamentals.morningstar_sector_code.latest   != 206) &
        (Fundamentals.morningstar_sector_code.latest   != 309) &
        (Fundamentals.morningstar_industry_code.latest != 20533080) &
        (Fundamentals.morningstar_industry_code.latest != 10217033) &
        (Fundamentals.morningstar_industry_group_code.latest != 10106) &
        (Fundamentals.morningstar_industry_group_code.latest != 10104)
    )

    #earnings_yield = Fundamentals.ebit.latest / Fundamentals.enterprise_value.latest
    #earnings_yield = EBITPerEV(mask=m) ; m &= (earnings_yield > 0)                        # 124
    earnings_yield = EBITPerEV(mask=m)  #; m &= (earnings_yield.percentile_between(70, 95)) # 141
    #earnings_yield = Fundamentals.earning_yield.latest.zscore(mask=m) ; m &= (earnings_yield.percentile_between(70, 95))  # 114
    roic      = ROIC(mask=m)   # CustomFactor below; Fundamentals.roic.latest would also work
    EY_rank   = earnings_yield.rank(ascending=False, mask=m)
    roic_rank = roic.rank(ascending=False, mask=m)
    #MF_rank  = (EY_rank + roic_rank).rank(ascending=False, mask=m)
    MF_rank   = (EY_rank + roic_rank).rank(ascending=False, mask=m).demean()  # use with norm()

    ''' Original before MF_rank re-rank ...
                                 min                mean                 max
    MF_rank                      7.0               79.76               129.0
    earnings_yield   0.0291078522234     0.0569397883902      0.108915379496
    roic                    0.051941          0.12467516            0.414196
    weight          0.00351053159478                0.04     0.0646940822467
    '''

    pipe = Pipeline(columns = {
        'earnings_yield': earnings_yield,
        'roic'          : roic,
        'MF_rank'       : MF_rank,
    }, screen = m )
    return pipe

def before_trading_start(context, data):
    context.today = get_datetime('US/Eastern')

    context.output = pipeline_output('pipeline')

    # weight as rank, normalized 0 to 1
    #context.output['weight'] = context.output['MF_rank'] / context.output['MF_rank'].sum()

    # Adding norm() was a last-minute thing with disastrous consequences.
    #   You might want to go back to see if you can rescue it, or
    #     go back to context.output['weight'] = ... above.
    # demean moves values down so the middle is around zero,
    #   giving norm() both positives and negatives to chew on.
    context.weights = norm(context, context.output['MF_rank'])

    #context.weight = 1.0 / len(context.output)

    if 'log_pipe_done' not in context:    # show pipe info once
        log_pipe(context, data, context.output, 4)
        #log_pipe(context, data, context.output, 4, filter=['alpha', 'beta', ... or what-have-you])

def norm(c, d):    # d: a Series; normalize positives & negatives separately
    # A normalization method that handles pos, neg separately for long, short weights
    if d.min() >= 0:
        d -= d.mean()
    pos = d[ d > 0 ]
    neg = d[ d < 0 ]
    if   not len(pos) and len(neg):
        d = neg - neg.mean()
    elif not len(neg) and len(pos):
        d = pos - pos.mean()
    pos  = d[ d > 0 ]
    neg  = d[ d < 0 ]
    num  = min(len(pos), len(neg))
    neg  = neg.sort_values(ascending=False).tail(num)
    pos /= pos.sum()
    neg  = -(neg / neg.sum())
    return pos.append(neg)
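The core idea behind norm() can be exercised in isolation with modern pandas (this simplified sketch drops the tail-trimming step and uses pd.concat, since Series.append was removed in pandas 2.0; data is invented):

```python
# Demean a series, then normalize the positive side (longs) and negative
# side (shorts) separately, so longs sum to +1 and shorts sum to -1.
import pandas as pd

def norm_long_short(d):
    d = d - d.mean()          # center so there are positives and negatives
    pos = d[d > 0]
    neg = d[d < 0]
    pos = pos / pos.sum()     # long weights sum to +1
    neg = -(neg / neg.sum())  # short weights sum to -1
    return pd.concat([pos, neg])

ranks = pd.Series({'AAA': 1.0, 'BBB': 2.0, 'CCC': 3.0, 'DDD': 6.0})
w = norm_long_short(ranks)
print(w.round(3).to_dict())  # {'DDD': 1.0, 'AAA': -0.667, 'BBB': -0.333}
```

Note that names sitting exactly at the mean (CCC here) fall out of the result entirely, which is why opens() above has to check membership in context.weights.index before ordering.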

class ROIC(CustomFactor):
    inputs = [Fundamentals.roic] ; window_length = 252
    def compute(self, today, assets, out, roic):
        roic = nanfill(roic)
        out[:] = np.mean(roic, axis=0)

class Wild(CustomFactor):
    # Intended for the default input (roic) to be overridden with any other
    # fundamental, e.g.:
    #   fcf = Wild(inputs=[Fundamentals.fcf_yield], window_length=88, mask=m)
    inputs = [Fundamentals.roic] ; window_length = 252
    def compute(self, today, assets, out, z):
        out[:] = np.mean(nanfill(z), axis=0)        # mean over the window

class EBITPerEV(CustomFactor):
    inputs = [Fundamentals.ebit, Fundamentals.enterprise_value]; window_length = 144
    def compute(self, today, assets, out, ebit, ev):
        ebit = nanfill(ebit)
        ev   = nanfill(ev)
        out[:] = np.mean(ebit, axis=0) / np.mean(ev, axis=0)
        #out[:] = ebit[-1] / ev[-1]

def nanfill(_in):    # https://stackoverflow.com/questions/41190852/most-efficient-way-to-forward-fill-nan-values-in-numpy-array
    do_nanfill = 1        # set to 0 for an interesting test of the difference or no diff.
    if not do_nanfill:
        return _in
    # Forward-fill missing values along each row
    mask = np.isnan(_in)
    idx  = np.where(~mask, np.arange(mask.shape[1]), 0)
    np.maximum.accumulate(idx, axis=1, out=idx)
    _in[mask] = _in[np.nonzero(mask)[0], idx[mask]]
    return _in
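The forward-fill trick from the Stack Overflow answer linked above works by carrying the last valid column index forward with np.maximum.accumulate; here is a standalone demo of the same mechanism (toy array, not pipeline data):

```python
import numpy as np

def ffill_rows(a):
    mask = np.isnan(a)
    idx = np.where(~mask, np.arange(a.shape[1]), 0)  # column index where valid, else 0
    np.maximum.accumulate(idx, axis=1, out=idx)      # carry last valid index forward
    return a[np.arange(a.shape[0])[:, None], idx]    # gather: each cell's last valid value

a = np.array([[1.0, np.nan, np.nan, 4.0],
              [np.nan, 2.0, np.nan, np.nan]])
print(ffill_rows(a))  # row 0: [1, 1, 1, 4]; row 1: [nan, 2, 2, 2]
```

A leading NaN has no earlier value to pull from, so it survives (row 1 above), which is worth remembering when a factor window starts before a security's first fundamental print.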

def log_pipe(context, data, z, num, filter=None):
    # Options
    log_nan_only = 0          # Only log if NaNs are present
    show_sectors = 0          # If sectors, do you want to see them or not
    show_sorted_details = 1   # [num] high & low securities sorted, each column

    if 'log_init_done' not in context:
        log.info('${}    {} to {}'.format('%.0e' % (context.portfolio.starting_cash),
                get_environment('start').date(), get_environment('end').date()))
        context.log_init_done = 1

    if not len(z):
        log.info('Empty')
        return

    context.log_pipe_done = 1

    # Series ......
    if 'Series' in str(type(z)):    # is Series, not DataFrame
        nan_count = len(z[z != z])
        nan_count = 'NaNs {}/{}'.format(nan_count, len(z)) if nan_count else ''
        if (log_nan_only and nan_count) or not log_nan_only:
            log.info('Series  len {}   min {}   mean {}   max {}   {}'.format(
                    len(z), z.min(), z.mean(), z.max(), nan_count))
        return

    # DataFrame ......
    content = 'Rows: {}  Columns: {}'.format(z.shape[0], z.shape[1])
    if len(z.columns) == 1: content = 'Rows: {}'.format(z.shape[0])
    for col in z.columns:
        if col == 'sector' and not show_sectors: continue
        nan_count = len(z[col][z[col] != z[col]])
        nan_count = 'NaNs {}/{}'.format(nan_count, len(z)) if nan_count else ''
        if (log_nan_only and nan_count) or not log_nan_only:
            content += '\n{}   min {}   mean {}   max {}   {}'.format(
                    col, z[col].min(), z[col].mean(), z[col].max(), nan_count)
    log.info(content)

    if not show_sorted_details: return
    if len(z.columns) == 1:     return     # skip detail if only 1 column
    details = z.columns if filter is None else filter
    for detail in details:
        if detail == 'sector': continue
        hi = z[details].sort_values(by=detail, ascending=False).head(num)
        lo = z[details].sort_values(by=detail, ascending=False).tail(num)
        if log_nan_only and not len(lo[lo[detail] != lo[detail]]):
            continue  # skip if no NaNs
        content  = '_ _ _   {}   _ _ _'  .format(detail)
        content += '\n\t... {} highs\n{}'.format(detail, str(hi))
        content += '\n\t... {} lows \n{}'.format(detail, str(lo))
        log.info(content)


Why are you ascending=False?
Wouldn't the higher earnings yield be better value?

This strategy is so weird. If you change the "ascending" setting on line 98 of Blue's code as below, you are selecting the worst companies in the output list (~750 companies) to go long. The performance is still pretty good. However, if you go short, it performs terribly.

context.output=pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=True).head(int(context.capacity)).dropna()


to

context.output=pipeline_output('my_pipeline').sort_values(by='MF_rank', ascending=False).head(int(context.capacity)).dropna()
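To keep the rank directions straight, here is a standalone pandas illustration (toy values): with EY_rank and roic_rank built via rank(ascending=False), rank 1 is the best on each metric, so a lower MF_rank sum is better. Sorting MF_rank ascending and taking head() therefore selects the best names, and flipping to ascending=False selects the worst, which is what the experiment above does.

```python
import pandas as pd

mf_rank = pd.Series({'GOOD': 3.0, 'OK': 7.0, 'BAD': 11.0})  # lower = better

best  = mf_rank.sort_values(ascending=True).head(2).index.tolist()
worst = mf_rank.sort_values(ascending=False).head(2).index.tolist()

print(best)   # ['GOOD', 'OK']
print(worst)  # ['BAD', 'OK']
```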

[Backtest results attached.]
from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.filters import Q500US, Q1500US, Q3000US, QTradableStocksUS
from quantopian.pipeline.data import morningstar, Fundamentals
from quantopian.pipeline.filters.morningstar import IsPrimaryShare, IsDepositaryReceipt
import pandas as pd
import numpy as np

def initialize(context):
# Rebalance every day, 1 hour after market open.
#schedule_function(my_rebalance, date_rules.every_day(), time_rules.market_open(hours=1))

# Record tracking variables at the end of each day.
#schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close())

# Create our dynamic stock selector.
context.capacity = 25

my_pipe = make_pipeline()
attach_pipeline(my_pipe, 'my_pipeline')

set_long_only()

#schedule for buying a week after the year start
date_rule=date_rules.month_start(4),
time_rule=time_rules.market_open())
#schedule for selling losers a week before the year start
date_rule=date_rules.month_end(4),
time_rule=time_rules.market_open())
#schedule for selling winners on the 7th day of year start
date_rule=date_rules.month_start(3),
time_rule=time_rules.market_close())

today = get_datetime('US/Eastern')
if today.month == 1:
for stock in context.portfolio.positions:
#print stock
print stock
for stock in context.output.index:
order_target_percent(stock, context.output.T[stock]['weight']) # T is Transform, there is a better way.  .iloc or .ix or something
#order_target_percent(stock, context.weight)

#selling losers
today = get_datetime('US/Eastern')
if today.month == 12 and context.portfolio.positions_value != 0:
for stock in context.portfolio.positions:
if context.portfolio.positions[stock].cost_basis > data.current(stock, 'price'):
order_target_percent(stock, 0)
print today, 'losers sold'

#selling winners
today = get_datetime('US/Eastern')
if today.month == 1:
for stock in context.portfolio.positions:
order_target_percent(stock, 0)

def make_pipeline():
m &= (Fundamentals.market_cap.latest > 1000000000)
m &= (Fundamentals.market_cap.latest < 10000000000)
m &= (
(Fundamentals.morningstar_sector_code.latest != 103) &
(Fundamentals.morningstar_sector_code.latest != 207) &
(Fundamentals.morningstar_sector_code.latest != 206) &
(Fundamentals.morningstar_sector_code.latest != 309) &
(Fundamentals.morningstar_industry_code.latest != 20533080) &
(Fundamentals.morningstar_industry_code.latest != 10217033) &
(Fundamentals.morningstar_industry_group_code != 10106) &
(Fundamentals.morningstar_industry_group_code != 10104)
)

earnings_yield = Fundamentals.ebit.latest/Fundamentals.enterprise_value.latest
roic      = Fundamentals.roic.latest
MF_rank   = EY_rank + roic_rank

pipe = Pipeline(columns = {
'earnings_yield': earnings_yield,
'roic'          : roic,
'MF_rank'       : MF_rank,
}, screen = m )
return pipe

# context.stocks = sorted_rank[0:int(context.capacity)]

# weight as rank normalize 0 to 1
context.output['weight'] = context.output['MF_rank'] / context.output['MF_rank'].sum()

#context.weight = 1.0 / len(context.output)
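One thing worth double-checking in the weighting above: dividing each MF_rank by the column sum gives the largest weight to the name with the largest rank value. If lower MF_rank means better (rank 1 = best), that weights the worst pick most heavily; inverse-rank weighting flips it. A toy sketch with three hypothetical tickers:

```python
import pandas as pd

# three hypothetical tickers; assume lower MF_rank = better (rank 1 = best)
ranks = pd.Series({'AAA': 1.0, 'BBB': 2.0, 'CCC': 3.0})

w_direct = ranks / ranks.sum()                    # best name gets the SMALLEST weight
w_inverse = (1.0 / ranks) / (1.0 / ranks).sum()   # best name gets the largest weight

print(w_direct.round(3).tolist())   # AAA ends up underweighted
print(w_inverse.round(3).tolist())  # AAA ends up overweighted
```

Which direction is intended depends on how the ranks were constructed upstream, so it is worth printing the weights once before trusting them.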

It outperforms here largely due to higher beta. An alpha of <=1% is something, but I imagine it can be improved with a few tweaks.

If you need free historical fundamentals data, I run TenQuant.io. Feel free to check it out; it retrieves data in real time from EDGAR.

Hi guys,
I worked for a few days on the magic formula starting from this post (really appreciated).
After many modifications and backtests, I found a really weird behaviour that depends on the month in which the stocks are rotated:
the strategy's results change heavily if the buys/sells are made in a month other than January (as proposed in the examples in this discussion).
I took the code of Blue Seahawk and backtested it using all 12 months; here are the results:

Moreover, I have made many changes to the algorithm (using the interval 2003-2020 instead of 2011-2017, using FCF yield instead of EBIT/EV, using a custom formula to extract ROCE instead of ROIC, and other minor changes), but the results show the same behaviour: really good in January, good in the closing months of the year, and really bad if the rotation is made in spring/summer:

I can't find an explanation for this behaviour (it doesn't seem random). How can this happen?

@Luca Wiegand Low-frequency trading is very susceptible to initial state values. Think of trying to fly a plane from New York to Los Angeles. If the pilot adjusts the course every minute, then chances are the flight path will be quite straight, with a lot of tiny corrections. On the other hand, if the pilot only adjusts the course every hour, the flight path will be much more erratic and potentially quite far off a 'straight line' at times. Moreover, if the direction calculations are off a bit, they have a much greater impact when adjusting only hourly.

Conventional wisdom is that the markets don't perform well in the spring and summer. There is the adage "sell in May and go away." The trading in spring and summer isn't reflective of the other parts of the year. So, similar to the flight example above, if one bases trading 'direction decisions' only on data from those months, then results may be significantly different from a 'straight line'.


Hi Dan, thanks for the response.
I can agree that low-frequency trading can be more susceptible, but in that case the results should not be biased towards one specific month each year; they should be random.
If you check the year-by-year results for choosing the stocks in January versus July:

The January strategy beats the July one in 13 of the 16 years.

Regarding the underperformance in spring and summer, even if it holds only in some years, all the data used in the strategy are annual (LTM), except for the items that come from the balance sheet. So I don't see why selecting stocks in an underperforming period should lead to underperformance for the whole year.

I attach a backtest if someone wants to have a look.
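For what it's worth, a one-sided sign test suggests a 13-of-16 split is unlikely to be pure chance (assuming the 16 yearly outcomes are independent):

```python
from math import comb

# one-sided sign test: if January vs. July were a fair coin flip each year,
# how likely is January winning at least 13 of 16 years?
wins = sum(comb(16, k) for k in range(13, 17))  # ways to get 13, 14, 15, or 16 wins
p = wins / 2 ** 16
print(round(p, 4))  # → 0.0106
```

About a 1% probability under the null, so the month effect Luca describes does look systematic rather than random.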

from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.filters import Q500US, Q1500US, Q3000US, QTradableStocksUS
from quantopian.pipeline.data.morningstar import Fundamentals
from quantopian.pipeline.filters.morningstar import IsPrimaryShare, IsDepositaryReceipt
import pandas as pd
import numpy as np

# A custom factor summing the trailing twelve months of a quarterly field
class TrailingTwelveMonths(CustomFactor):
    window_length = 315  # ~15 months of trading days, enough to cover 4 quarters
    window_safe = True

    def compute(self, today, assets, out, values, dates):
        # for each asset: keep rows dated within the trailing 52 weeks,
        # de-duplicate by as-of date, and sum the remaining quarterly values
        out[:] = [
            (v[d + np.timedelta64(52, 'W') > d[-1]])[
                np.unique(
                    d[d + np.timedelta64(52, 'W') > d[-1]],
                    return_index=True
                )[1]
            ].sum()
            for v, d in zip(values.T, dates.T)
        ]
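The factor above is dense; the same select-within-52-weeks, de-duplicate-by-date, then sum logic can be traced on a toy single-asset column (quarterly values below are made up):

```python
import numpy as np

# toy single-asset column: daily rows repeat each quarterly EBIT value,
# and the date column holds the quarter's as-of date (values are made up)
dates = np.array(['2019-03-31', '2019-03-31', '2019-06-30', '2019-09-30',
                  '2019-12-31', '2019-12-31'], dtype='datetime64[D]')
values = np.array([10., 10., 12., 11., 13., 13.])

recent = dates + np.timedelta64(52, 'W') > dates[-1]  # within trailing ~52 weeks
v, d = values[recent], dates[recent]
# np.unique(..., return_index=True)[1] keeps one row per distinct as-of date
ttm = v[np.unique(d, return_index=True)[1]].sum()
print(ttm)  # → 46.0 (10 + 12 + 11 + 13, duplicates dropped)
```

The de-duplication is what prevents a quarter whose value repeats across many daily rows from being counted more than once.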

def initialize(context):
    # Rebalance every day, 1 hour after market open.
    #schedule_function(my_rebalance, date_rules.every_day(), time_rules.market_open(hours=1))

    # Record tracking variables at the end of each day.
    #schedule_function(my_record_vars, date_rules.every_day(), time_rules.market_close())

    # Create our dynamic stock selector.
    context.capacity = 30
    context.weight = 1.0 / context.capacity

    my_pipe = make_pipeline()
    attach_pipeline(my_pipe, 'my_pipeline')

    set_commission(commission.PerShare(cost=0))
    set_long_only()

    # schedule for buying a week after the year start
    schedule_function(buy_stocks,
                      date_rule=date_rules.month_start(2),
                      time_rule=time_rules.market_open())
    # schedule for selling winners on the 7th day of year start
    schedule_function(sell_winners,
                      date_rule=date_rules.month_start(1),
                      time_rule=time_rules.market_close())

def buy_stocks(context, data):
    today = get_datetime('US/Eastern')
    str_stock = ''
    if today.month == 7:
        for stock in context.output.index:
            str_stock += '- {0} ({1})'.format(stock.asset_name, stock.symbol)
            order_target_percent(stock, context.weight)

#selling winners
def sell_winners(context, data):
    str_stock = ''
    today = get_datetime('US/Eastern')
    if today.month == 7:
        for stock in context.portfolio.positions:
            str_stock += '- {0} ({1})'.format(stock.asset_name, stock.symbol)
            order_target_percent(stock, 0)
        print('selling winners: {0}'.format(str_stock))

def make_pipeline():
    filter_base = Q3000US() & Sector().notnull()

    filter_market_cap = (Fundamentals.market_cap.latest > 1000000000) & (filter_base) #& (Fundamentals.market_cap.latest < 10000000000)

    filter_sectors = (
        (Fundamentals.morningstar_sector_code.latest != 103) & # Financial Services
        (Fundamentals.morningstar_sector_code.latest != 207) & # Utilities
        (Fundamentals.morningstar_sector_code.latest != 104) & # Real Estate / REITs
        (Fundamentals.morningstar_industry_code.latest != 30910060) # Oil & Gas
        #(Fundamentals.morningstar_sector_code.latest != 206) & # Healthcare
        #(Fundamentals.morningstar_industry_code.latest != 20533080) & # Pharmaceutical Retailers
        #(Fundamentals.morningstar_industry_code.latest != 10217033) & # Apparel Stores
        #(Fundamentals.morningstar_industry_group_code.latest != 10106) & # Metals & Mining
        #(Fundamentals.morningstar_industry_group_code.latest != 10104) # Coal
    ) & (filter_market_cap)

    filter_mf = (
        (Fundamentals.net_margin.latest > 0) &
        (Fundamentals.tangible_book_value_per_share.latest > 0) &
        (Fundamentals.book_value_per_share.latest > 0)
    ) & (filter_sectors)

    # NEW: trailing-twelve-month EBIT is more accurate and becomes available earlier
    ebit = TrailingTwelveMonths(inputs=[Fundamentals.ebit, Fundamentals.ebit_asof_date], mask=filter_mf)
    earnings_yield = ebit / Fundamentals.enterprise_value.latest

    total_assets = Fundamentals.total_assets.latest
    current_liabilities = Fundamentals.current_liabilities.latest
    current_assets = Fundamentals.current_assets.latest - Fundamentals.cash_cash_equivalents_and_marketable_securities.latest
    intang = Fundamentals.goodwill_and_other_intangible_assets.latest
    cash = Fundamentals.cash_cash_equivalents_and_marketable_securities.latest

    pipe = Pipeline(columns = {
        'earnings_yield': earnings_yield,
        'EBIT': ebit,
        'total_assets': total_assets,
        'current_liabilities': current_liabilities,
        'current_assets': current_assets,
        'intang': intang,
        'cash': cash,
    }, screen = filter_mf)
    return pipe

"""
A function to create our dynamic stock selector (pipeline). Documentation on
pipeline can be found here: https://www.quantopian.com/help#pipeline-title
"""
data = pipeline_output('my_pipeline')
data['current_liabilities_mod'] = data[['current_liabilities','current_assets']].min(axis=1)
data['intang'] = data['intang'].fillna(0)
data['CapitalEmployed'] = data['total_assets']-data['current_liabilities_mod']-data['intang']-data['cash']
data['ROCE'] = data['EBIT'] / data['CapitalEmployed']
data['EY_rank'] = data['earnings_yield'].rank(ascending=False)
data['ROCE_rank'] = data['ROCE'].rank(ascending=False)
data['MF_rank'] = data['EY_rank'] + data['ROCE_rank']

context.output=data.sort_values(by='MF_rank', ascending=True).head(int(context.capacity)).dropna()
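The capital-employed construction in the attached algorithm can be sanity-checked on toy numbers (tickers and figures below are made up):

```python
import pandas as pd

# made-up fundamentals for three hypothetical tickers
df = pd.DataFrame({
    'EBIT':                [120.0,  80.0,  50.0],
    'total_assets':        [1000.0, 600.0, 400.0],
    'current_liabilities': [200.0,  150.0,  90.0],
    'current_assets':      [150.0,  180.0, 100.0],  # already net of cash, as in the pipeline
    'intang':              [100.0,    0.0,  50.0],
    'cash':                [50.0,    60.0,  30.0],
}, index=['AAA', 'BBB', 'CCC'])

# cap current liabilities at current assets, as in the algorithm
df['current_liabilities_mod'] = df[['current_liabilities', 'current_assets']].min(axis=1)
df['CapitalEmployed'] = (df['total_assets'] - df['current_liabilities_mod']
                         - df['intang'] - df['cash'])
df['ROCE'] = df['EBIT'] / df['CapitalEmployed']
print(df['CapitalEmployed'].tolist())  # → [700.0, 390.0, 230.0]
```

The min(current liabilities, current assets) step keeps a company with unusually large payables from showing an inflated ROCE through a tiny capital base.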

The conventional wisdom of "sell in May and go away" is sort of premised on the fact that most companies' fiscal years end in December. Results come out in January, and the markets digest those results in February. Companies which had done well are rewarded with higher share prices while those which didn't have lower share prices. Conventional wisdom is that good stocks are often at their peak by May while poorer-performing stocks are at their low. Hence, 'sell in May' assumes you had some good stocks, so sell at their peak.

Consider what this strategy does. It buys 'good stocks' and sells 'bad stocks'. After earnings season, during the months of May through the summer, the 'good stocks' are at their highs while 'bad stocks' are at their lows. By rebalancing during these months the algo effectively buys 'high' and sells 'low'. Not the formula for a winning strategy.

However, consider moving that rebalancing forward before earnings season. The 'good stocks' haven't been run up and the 'bad stocks' haven't been dragged down. The algo, as noted, uses annual fundamental data for the most part. The picks for 'good' and 'bad' stocks therefore won't change a lot whether before or after earnings season. The algo will probably be trading about the same stocks. That hasn't changed. What has changed? The price. By trading before earnings season one has a fighting chance at buying low and selling high. That is a much better formula for a winning strategy.

All of this is of course a generality, and there are many exceptions. But stock valuations, on average, are not random, with the highest and lowest valuations clustering from December through February. Check out the attached notebook showing the number of companies reaching their min and max PE by month over the 10 years 2010-2019.
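Dan's notebook isn't reproduced here, but the min/max-PE-by-month tally can be sketched along these lines (random data for illustration, so no seasonal pattern should appear):

```python
import numpy as np
import pandas as pd

# toy monthly P/E series for five hypothetical tickers (random data)
idx = pd.date_range('2010-01-01', periods=120, freq='MS')  # 2010-2019, monthly
rng = np.random.default_rng(0)
pe = pd.DataFrame(rng.lognormal(3.0, 0.3, (len(idx), 5)),
                  index=idx, columns=list('ABCDE'))

# calendar month (1-12) in which each ticker hit its min / max P/E
min_month = pe.idxmin().dt.month
max_month = pe.idxmax().dt.month

# count how many tickers bottomed in each month; one bar per calendar month
counts = min_month.value_counts().reindex(range(1, 13), fill_value=0)
print(counts.to_dict())
```

With real price and earnings data, plotting `counts` as a 12-bar histogram should reproduce the seasonal clustering the notebook shows.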


@Dan,

Why are there only 10 bars on the histogram?

@Vladimir Good catch regarding the number of bars. I got lazy and used the default bins=10 for the hist method. Setting bins=12 is more appropriate. Doing that also changes the graph so it doesn't show such explicit peaks in January. It still shows definite variation, just not so pronounced. Again, good catch.

Attached is an updated notebook.
