The Quantitative Value Algorithm in Pipelines: Dreams and Nightmares

I've been implementing the algorithm from Quantitative Value on Quantopian for some time now, with some success and much aggravation. The implementation runs through pipelines. For those unfamiliar with the book, here's an Amazon link:

https://www.amazon.com/Quantitative-Value-Web-Site-Practitioners/dp/1118328078

I'm not sure if I want to share the implementation yet, given that it is still in progress, but here's a picture of the partial pipeline for the algorithm:

Edit: The image is ridiculously large; it's better to download it from here: https://dl.dropboxusercontent.com/u/61360917/index.png

This pipeline covers only one year of history for the fundamental factors (that's two time periods one year apart), whereas the text recommends 8 years. With two years of history, the pipeline execution dies from running out of memory. I haven't been able to run a backtest either, again because the pipeline runs out of memory. I suspect there might be a memory leak, since the pipeline runs once, starts the backtest, then dies on the next rebalance.
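For a rough sense of why longer lookbacks get expensive, here is a back-of-the-envelope sketch. It assumes each fundamental input is held as a dense float64 window with one row per trading day; the 252-day year, the 8,000-asset universe, and the 20 inputs are illustrative guesses, not measurements from Quantopian. The `year_over_year` helper is a hypothetical name showing how "two time periods one year apart" can be read off the first and last rows of such a window.

```python
import numpy as np

TRADING_DAYS_PER_YEAR = 252  # assumption: typical US trading calendar


def window_bytes(years, n_assets, n_inputs, dtype=np.float64):
    """Estimate bytes needed to hold dense daily windows for all inputs."""
    rows = years * TRADING_DAYS_PER_YEAR
    return rows * n_assets * n_inputs * np.dtype(dtype).itemsize


def year_over_year(window):
    """Change between the oldest and the latest row of a (days, assets) window."""
    return window[-1] - window[0]


# Illustrative numbers only: 8,000 assets and 20 fundamental inputs.
for years in (1, 2, 8):
    mb = window_bytes(years, n_assets=8000, n_inputs=20) / 1e6
    print(f"{years} year(s) of history: about {mb:,.0f} MB")
```

The estimate grows linearly with the lookback, so under these assumptions an 8-year window costs eight times the one-year case before any intermediate factors are counted.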

## What's missing

Financial stability. This is almost implemented. I can't figure out how to implement a computation over a factor such that f(x) = 1 if x > 0 else 0. If anyone can suggest an implementation of this function for pipeline factors, I will be grateful. I tried writing a custom factor, but that fails with an error. It might have to do with factor inputs being restricted to bound columns or to "factors which are deemed safe for use as inputs to other factors". Safe transformations are hardcoded into zipline, and it doesn't appear that you can declare custom factors to be safe.

class PositiveToOne(CustomFactor):
    window_length = 1

    def compute(self, today, assets, out, data):
        out[:] = data[-1].apply(lambda x: 1.0 if x > 0 else 0.0)


Edit: Positive-to-one is implemented as (x/|x| + 1)/2. This will not work if x is 0, but that should be an unlikely circumstance for the particular factors being thresholded.
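To make the x = 0 caveat concrete, here is a small plain-NumPy sketch comparing the (x/|x| + 1)/2 trick with a comparison-based threshold. It runs outside of pipeline, and the function names are made up for illustration; it says nothing about which operations zipline accepts on factors.

```python
import numpy as np


def positive_to_one_algebraic(x):
    """(x/|x| + 1)/2: gives 1 for x > 0, 0 for x < 0, and NaN for x == 0."""
    x = np.asarray(x, dtype=float)
    # Silence the 0/0 warning; the result for x == 0 is still NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (x / np.abs(x) + 1.0) / 2.0


def positive_to_one_where(x):
    """Comparison-based threshold: 1 for x > 0, otherwise 0 (including x == 0)."""
    x = np.asarray(x, dtype=float)
    # Note: NaN > 0 is False, so missing values also map to 0 here.
    return np.where(x > 0, 1.0, 0.0)


sample = np.array([-2.5, 0.0, 3.0])
print(positive_to_one_algebraic(sample))
print(positive_to_one_where(sample))
```

The two versions agree for strictly positive and strictly negative inputs and differ only at zero, where the algebraic form returns NaN.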

Potential Fraud Detection. In principle this should be possible; I just haven't done it. It's just a long algebraic expression.

## Output for Franchise Power, Computed Over One Year

Here's what the research environment spits out:

pout[pout['longs_filter']].sort('quality')[:32].index.get_level_values(1)

Index([   Equity(28450 [AGX]),   Equity(13698 [MYGN]),    Equity(8613 [CHDN]),
Equity(6458 [RGR]),    Equity(45771 [MMI]),    Equity(26563 [WLK]),
Equity(11645 [GBX]),     Equity(3660 [HRB]),   Equity(37869 [VRTS]),
Equity(18917 [HZO]),    Equity(20284 [SKX]),      Equity(812 [BEN]),
Equity(27413 [NSR]),    Equity(36372 [SNI]),   Equity(32860 [CPLA]),
Equity(27997 [WNR]),      Equity(41 [ARCB]),   Equity(36735 [IILG]),
Equity(6736 [SMG]),    Equity(24799 [MTN]),     Equity(9514 [BWA]),
Equity(4654 [MAN]),     Equity(7990 [VLO]),   Equity(45506 [PINC]),
Equity(39546 [LYB]),   Equity(15397 [STRA]),   Equity(33016 [ALGT]),
Equity(13718 [POOL]),      Equity(915 [BKE]), Equity(36930 [DISC_A]),
Equity(21697 [NTRI]),     Equity(1287 [CBI])],
dtype='object')
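For anyone who wants to try the selection expression outside the research environment, here is a sketch on a toy frame. The tickers and values are invented, and the (date, asset) MultiIndex is an assumption about how the pipeline output is laid out. It uses `sort_values`, since `DataFrame.sort` was removed in later pandas releases.

```python
import pandas as pd

# Toy stand-in for the pipeline output: rows indexed by (date, asset).
dates = pd.to_datetime(["2016-01-04"])
symbols = ["AAA", "BBB", "CCC", "DDD"]  # hypothetical tickers
index = pd.MultiIndex.from_product([dates, symbols], names=["date", "asset"])
pout = pd.DataFrame(
    {
        "longs_filter": [True, False, True, True],
        "quality": [0.7, 0.1, 0.2, 0.5],
    },
    index=index,
)


def top_longs(frame, n):
    """Keep rows passing the longs filter, sort ascending by quality, take n assets."""
    selected = frame[frame["longs_filter"]].sort_values("quality")
    return selected[:n].index.get_level_values(1)


print(list(top_longs(pout, 2)))
```

As in the original expression, the sort is ascending, so the slice keeps the smallest `quality` values; whether small means better depends on how that column was built.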

5 responses

Here's the updated long positions after fixing some bugs in ebit/ev computation:

Index([ Equity(8655 [INTU]), Equity(20306 [UTHR]),     Equity(755 [BC]),
Equity(8383 [FL]),  Equity(39111 [PPC]),  Equity(26439 [VVI]),
Equity(34997 [IQNT]),  Equity(3212 [GILD]), Equity(35040 [APEI]),
Equity(43713 [PBF]), Equity(38633 [ASPS]),  Equity(1219 [CATO]),
Equity(3735 [HPQ]),  Equity(18508 [WDR]),    Equity(734 [BAX]),
Equity(22766 [CVI]), Equity(19985 [HSII]), Equity(18595 [NHTC]),
Equity(13450 [KFRC]),  Equity(28450 [AGX]), Equity(13698 [MYGN]),
Equity(8613 [CHDN]),   Equity(6458 [RGR]),  Equity(45771 [MMI]),
Equity(26563 [WLK]),  Equity(11645 [GBX]),   Equity(3660 [HRB]),
Equity(37869 [VRTS]),  Equity(18917 [HZO]),    Equity(812 [BEN]),
Equity(27413 [NSR]),  Equity(36372 [SNI])],
dtype='object')


And the shorts selected by the algorithm are:

Index([  Equity(48384 [QRVO]),    Equity(1016 [BOBE]),      Equity(913 [BKH]),
Equity(9096 [BDN]),     Equity(3642 [HOT]),    Equity(46205 [SFR]),
Equity(21401 [HSTM]),    Equity(38446 [GOV]),     Equity(2169 [CVA]),
Equity(46671 [RUBI]),   Equity(21457 [VDSI]),   Equity(41271 [STAG]),
Equity(10535 [SUI]),   Equity(16307 [VSAT]),    Equity(3210 [GIII]),
Equity(32770 [DEI]),    Equity(20696 [CIR]),     Equity(27822 [UA]),
Equity(16841 [AMZN]),   Equity(34972 [ROIC]),   Equity(47912 [ZAYO]),
Equity(23709 [NFLX]),    Equity(21713 [WPC]),    Equity(45617 [QTS]),
Equity(15365 [PEGA]),    Equity(10984 [MAC]),   Equity(42919 [WAGE]),
Equity(46569 [PCTY]),    Equity(10027 [REG]),   Equity(45526 [PEGI]),
Equity(36896 [ASCM_A]),   Equity(46742 [ZOES])],
dtype='object')


Worryingly, it has decided to short AMZN and NFLX. Still going through the rest of the picks to see whether the algo makes sense.

Sunil

The implementation of the pipeline is now complete, but untestable. The original post has been updated with additional information and an updated pipeline image. The long equities selected now are:

Index([Equity(19985 [HSII]),  Equity(8655 [INTU]),  Equity(28450 [AGX]),
Equity(755 [BC]), Equity(35040 [APEI]),  Equity(1219 [CATO]),
Equity(13698 [MYGN]), Equity(22336 [RECN]), Equity(20306 [UTHR]),
Equity(24799 [MTN]),  Equity(26439 [VVI]),    Equity(300 [ALK]),
Equity(24791 [OUTR]),   Equity(3735 [HPQ]), Equity(44778 [PGEM]),
Equity(523 [AAN]),   Equity(3660 [HRB]),  Equity(18917 [HZO]),
Equity(4010 [IR]),  Equity(4246 [KLAC]), Equity(33016 [ALGT]),
Equity(8383 [FL]),  Equity(33729 [DAL]),   Equity(162 [AEPI]),
Equity(40749 [FRP]),   Equity(2263 [DOW]), Equity(24519 [SWHC]),
Equity(23599 [JBLU]),  Equity(18508 [WDR]),    Equity(734 [BAX]),
Equity(8613 [CHDN]), Equity(39563 [PLOW])],
dtype='object')


And the shorts are:

Index([   Equity(6455 [RGLD]),    Equity(10027 [REG]),   Equity(48543 [SHAK]),
Equity(10254 [PTEN]),   Equity(45526 [PEGI]), Equity(36896 [ASCM_A]),
Equity(5213 [NBL]),   Equity(26733 [CUBE]),    Equity(26562 [KRG]),
Equity(44990 [HDS]),     Equity(553 [ASEI]),     Equity(8214 [WMB]),
Equity(24482 [EQIX]),      Equity(448 [APA]),   Equity(25307 [BRKR]),
Equity(40547 [TRGP]),    Equity(42784 [FET]),     Equity(6900 [SJI]),
Equity(20394 [MDRX]),    Equity(2505 [ELRC]),   Equity(18010 [DEPO]),
Equity(110 [ACXM]),      Equity(35330 [N]),   Equity(39782 [BSFT]),
Equity(47126 [MRD]),    Equity(46205 [SFR]),     Equity(6904 [JOE]),
Equity(7244 [SWN]),     Equity(3436 [HAE]),    Equity(45521 [RNG]),
dtype='object')


Given that the backtester can't handle a pipeline this size, and that it is impossible to extend the lookback for fundamental data beyond one year, the algorithm is, well, useless.

Sunil

I am having the same problem. Since fundamental data does not change every day, a way to specify the data frequency would save memory. The ability to get data corresponding to both financial and calendar years would also help. (E.g. the last quarterly update will not necessarily be available one calendar quarter into the past.)

Hi Sunil, I'm working on the same thing and used np.where to create the financial stability factors; see below:

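import numpy as np
from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data import morningstar

class ROA(CustomFactor):
    inputs = [morningstar.operation_ratios.roa]
    window_length = 1
    def compute(self, today, assets, out, roa):
        # 1 where the latest ROA is positive, otherwise 0
        out[:] = np.where(roa[-1] > 0, 1, 0)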
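Once each signal is a 0/1 flag like the one above, they can be added up into a single score. Here is a plain-NumPy sketch of that idea; the three signals and their values are hypothetical placeholders, not the book's exact list of financial strength measures.

```python
import numpy as np


def binary_flag(values):
    """1 where the value is strictly positive, otherwise 0."""
    return np.where(np.asarray(values, dtype=float) > 0, 1, 0)


def stability_score(*signals):
    """Sum the 0/1 flags of several signals, asset by asset."""
    return np.sum([binary_flag(s) for s in signals], axis=0)


# Hypothetical values for three assets.
roa = np.array([0.05, -0.02, 0.10])
cash_flow = np.array([1.2, 0.4, -0.3])
roa_change = np.array([0.01, 0.03, -0.02])

print(stability_score(roa, cash_flow, roa_change))
```

Each asset's score is simply the count of signals that came out positive, so adding more flags only requires passing more arrays.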