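Here is a simple example of a pattern recognition algorithm based on zlib, a general-purpose compression library, applied here to text strings. It computes two compression-based measures: NCD (normalized compression distance) and CDM (compression-based dissimilarity measure). Thanks to James Jack for useful discussions and for coding the NCD and CDM functions, and to Quantopian for enabling zlib.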

For those interested in doing an intellectual "deep dive" into the topic, a starting point is:

M. Li, X. Chen, X. Li, B. Ma, and P. M. B. Vitanyi, "The similarity metric," IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3250-3264, Dec. 2004.
http://homepages.cwi.nl/~paulv/papers/similarity.pdf
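
For readers who want to experiment outside the backtester, here is a minimal sketch of the two measures. It assumes the commonly used definitions, NCD = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)) and CDM = C(xy) / (C(x) + C(y)), where C is the zlib-compressed length in bytes. It is not the code from the attached algorithm; the function names are my own, and it is written for Python 3.

```python
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed ASCII string."""
    return len(zlib.compress(text.encode("ascii")))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance (assumed standard definition)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def cdm(x: str, y: str) -> float:
    """Compression-based dissimilarity measure (assumed standard definition)."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return cxy / (cx + cy)


if __name__ == "__main__":
    # Coded strings copied from the sample log output in this post.
    x = "000000000000000011111111111111"
    y = "111111111111111100000000000000"
    print("NCD:", ncd(x, y))
    print("CDM:", cdm(x, y))
```

With these definitions, lower values mean more similar inputs. The exact numbers depend on the zlib version and compression level, so they may not match the log output exactly.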

I ran the algorithm on SPY & SH (an S&P 500 ETF & an S&P 500 short ETF, respectively). Here is some sample output, with X & Y representing the coded prices (relative to their respective moving averages) over a 30-day trailing window:

2012-02-23 handle_data:53 INFO ----------------------------------  
2012-02-23 handle_data:54 INFO X: 000000000000000011111111111111  
2012-02-23 handle_data:55 INFO Y: 111111111111111100000000000000  
2012-02-23 handle_data:56 INFO ----------------------------------  
2012-02-23 handle_data:57 INFO NCD: 0.142857142857  
2012-02-23 handle_data:58 INFO CDM: 0.571428571429  
2012-02-23 handle_data:59 INFO ----------------------------------  
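
The coding step that produces strings like X and Y above might look roughly like the sketch below. The exact rule is an assumption on my part: each day is coded "1" if the price is above its trailing moving average and "0" otherwise. The function names and the `span` parameter are hypothetical and not taken from the attached algorithm.

```python
def trailing_means(prices, span):
    """Mean of the last `span` prices at each step (fewer at the start)."""
    means = []
    for i in range(len(prices)):
        window = prices[max(0, i - span + 1): i + 1]
        means.append(sum(window) / len(window))
    return means


def encode(prices, span, window=30):
    """Code each price as '1' (above its moving average) or '0' (otherwise),
    then keep only the last `window` symbols."""
    means = trailing_means(prices, span)
    bits = ["1" if p > m else "0" for p, m in zip(prices, means)]
    return "".join(bits[-window:])


if __name__ == "__main__":
    rising = [1.0, 2.0, 3.0, 4.0, 5.0]
    falling = [5.0, 4.0, 3.0, 2.0, 1.0]
    print(encode(rising, span=3))
    print(encode(falling, span=3))
```

A two-symbol alphabet throws away the size of each move and keeps only its direction relative to the average, which is why an ETF and its inverse can end up as mirror-image strings.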

Since SPY & SH move in opposite directions, one might have expected NCD & CDM both to be ~ 1 (indicating a high degree of dissimilarity). Instead, both indicate a relatively high similarity between SPY & SH (NCD << 1 & CDM ~ 0.5; under the usual definition of CDM, values near 0.5 sit at the most-similar end of its range). The interpretation, I think, is that SPY & SH (as coded) have the same information content. For example, if I am given the SPY time series, I can predict the SH time series (so long as I know that it moves in the opposite direction). I obtain a similar result for the pair SPY & IVV, which move in the same direction.
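
One way to probe the "same information content" interpretation is to see how many extra compressed bytes it costs to append a second string to X. The sketch below compares the logged Y against a pseudo-random string of the same length. The random string R is a hypothetical stand-in for an unrelated series, and the helper names are my own. With strings this short, zlib's fixed header and checksum overhead make up much of each compressed size, so the comparison is only indicative.

```python
import random
import zlib


def csize(text):
    """Length in bytes of the zlib-compressed ASCII string."""
    return len(zlib.compress(text.encode("ascii")))


def complement(bits):
    """Swap every '0' and '1' in a binary string."""
    return bits.translate(str.maketrans("01", "10"))


if __name__ == "__main__":
    # Coded strings copied from the sample log output in this post.
    x = "000000000000000011111111111111"
    y = "111111111111111100000000000000"

    # Hypothetical unrelated series: seeded pseudo-random bits.
    rng = random.Random(0)
    r = "".join(rng.choice("01") for _ in range(len(x)))

    print("Y is the bitwise complement of X:", complement(x) == y)
    for label, other in (("Y", y), ("R", r)):
        extra = csize(x + other) - csize(x)
        print("extra bytes to append %s to X: %d" % (label, extra))
```

If appending Y costs noticeably fewer extra bytes than appending R, that supports the reading that the compressor treats Y as almost fully described by X.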
