Let's see if Technical Analysis works

By "works" I mean that it can be applied to a broad range of securities and give consistent, statistically significant results.

15 responses

Hi Luca,

Here is a tearsheet of Trading QQQ-IEF with Peculiarities of Volatility (CBOE_VIX).


Here is a tearsheet of Z-Score Adaptive Asset Allocation (SPY-SH), my version of Grant Kiehne's SPY & SH, minute data.


Here is a tearsheet of a version of Andreas Clenow's Stocks On The Move by Guy Fleury, Charles Pare and Vladimir Yevtushenko.
@Luca, it would be right to remove "if" from the header.


Vladimir, great numbers. Here is my version of Stocks On The Move by Andreas Clenow over the same trading interval. Hope you find something useful in it.


Interesting topic, particularly in light of the recent Quantopian call for the use of so-called "alternative data" as a path toward "new alpha" that could be blended into the 1337 Street Fund. We now also have the risk model, which contains several relevant "style risk" factors:

momentum
short_term_reversal
volatility


Presumably, these risk factors are expressed as Pipeline factors that take in daily OHLCV bars only, thus making them technical indicators. A large number of technical indicators could be added to the risk model straightforwardly. For example, all of the TA-Lib functions in the Q API plus all of the 101 Alphas that don't use fundamentals could be cranked into the risk model. Any other known technical indicators, such as the ones in the ML example, could be captured, as well (in addition to the numerous other technical indicators used in Q examples/lectures/etc.). Just roll them all into a technical_indicator risk factor and call it a day.

One vague, abstract thought I have is that a technical indicator can be defined as any black-box algorithm that takes in a window of trailing OHLCV bar data (or, I suppose, tick data) and returns values for stocks that may be predictive of returns. It seems that the input data need to have certain characteristics for this recipe to have any chance of success whatsoever. In other words, rather than testing a gazillion technical indicators, it might be more efficient and productive to consider how to detect persistent market regimes when any technical indicator would have a reasonable probability of working. I'm thinking along the lines of autocorrelation, Hurst exponent, etc.--tests to see if the OHLCV data consists of more than just a gobbledygook random walk, which wouldn't lend itself to any technical indicator.
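A minimal sketch of that kind of base-level check (my own illustration, not code from this thread): the lag-1 autocorrelation of log returns, which should sit near zero for a pure random walk and away from zero when there is persistence that an indicator could exploit.

```python
import numpy as np

def lag1_autocorr(prices):
    """Lag-1 autocorrelation of log returns; near zero for a random walk."""
    r = np.diff(np.log(prices))
    r = r - r.mean()
    return np.dot(r[:-1], r[1:]) / np.dot(r, r)

rng = np.random.default_rng(0)

# Geometric random walk: no structure for any indicator to exploit
walk = np.exp(np.cumsum(rng.normal(0.0, 0.01, 2000)))

# AR(1) returns with persistence 0.5: the kind of regime where a
# trend-following indicator at least has a chance
eps = rng.normal(0.0, 0.01, 2000)
ar = np.zeros(2000)
for t in range(1, 2000):
    ar[t] = 0.5 * ar[t - 1] + eps[t]
persistent = np.exp(np.cumsum(ar))

print(lag1_autocorr(walk))        # close to 0
print(lag1_autocorr(persistent))  # close to 0.5
```

The same idea extends to the Hurst exponent or a variance-ratio test; the point is to measure the wind before flying the kite.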

The other angle is that the definition of "works" may depend on how the technical indicator is applied. My understanding is that the right ML technique should be able to combine transient, but statistically significant "working" technical indicators such that any one of them wouldn't "work" alone but would be valuable as a transient feature in the ML algorithm. I've yet to see an existence proof of this idea on Q, but one can imagine that technical indicators would go in and out of "workiness" in a predictable fashion, and that the right ML algorithm would work to derive "new alpha" for The Man and get paid for one's efforts. I wonder if The Man's 1337 Street Fund is being approached in this fashion? Or if it will simply be a weighted fund-of-funds approach?

@ Vladimir - glad you picked up the SPY & SH, minute data algo from the Quantopian Dark Ages (well before the Hedge Fund Enlightenment and our current Post-Broker Integration Era). Perhaps it could be re-formulated to work on individual stocks, instead of ETFs, which I guess Q doesn't want in the 1337 Street Fund (at least ETFs from users' algos).

Grant, when you say: “...OHLCV data consists of more than just a gobbledygook random walk, which wouldn't lend itself to any technical indicator.”

Well, ...yes. And this would also apply to any price series approaching randomness from whatever side. The closer price series get to randomness, the less effective all those trading methods based on trying to predict what is coming next become, and the more they have to rely on uncertain statistical probabilities.

A lot of the tests seen here on Q are just that: attempts to predict random-like price series. And we can see a lot of strategies breaking down in their walk forward, which in itself is an indication of either over-fitting on randomness, or not taking care of return degradation, or relying on statistical properties that are not really there or not predictable with any accuracy.

Recently, I did a one-year walk-forward on my favorite trading strategy, its third such test. The objective was to show that the strategy would have maintained its CAGR level, even though the strategy had last been modified in November 2015. So, this made it a 2-year out-of-sample test on data that the strategy could not have known anything about. Trading interval: 21+ years. Over 100,000 trades.

Well, see the simulation results for yourself:

It came with my analysis of the strategy's properties:

The particularity of this trading strategy is that it enters and exits all its trades according to biased random-like functions (random() > x). So evidently, there is no need for indicators. It has no lag either, since it is based on the flip of a coin. And yet, even under those “uncertain” trading conditions, it achieved outstanding long-term results.
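For readers wondering what "biased random-like functions" means mechanically, here is a toy sketch (my own illustration with made-up thresholds, not the strategy discussed above): entries and exits fire whenever random() > x.

```python
import random

def coin_flip_equity(prices, x_entry=0.98, x_exit=0.95, seed=42):
    """Trade a price series with biased coin-flip entries/exits: act on random() > x."""
    rng = random.Random(seed)  # seeded so the toy run is repeatable
    cash, units, invested = 1.0, 0.0, False
    for price in prices[:-1]:
        if not invested and rng.random() > x_entry:
            units, cash, invested = cash / price, 0.0, True    # enter long
        elif invested and rng.random() > x_exit:
            cash, units, invested = units * price, 0.0, False  # exit to cash
    return cash + units * prices[-1]  # mark any open position to market

prices = [100.0 + 0.2 * i for i in range(500)]  # synthetic drifting series
print(coin_flip_equity(prices))
```

No indicator and no lag, as described: the only edge, if any, comes from the bias in the thresholds and from what the underlying series does while a position is held.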

There are other avenues than trying to predict future prices. You could simply set a methodology and give your trading procedures a long-term vision of things, where your strategy can evolve into whatever you want it to be, while you also gain control over it.

...from the Quantopian Dark Ages (well before the Hedge Fund Enlightenment ...

LOL :D

I lost the NBs with the results, but if anyone fancies seeing the Alphalens tear sheets, here is some other technical-indicator code, plus Andreas Clenow's actual momentum factor. I remember I wasn't impressed by the results.

import numpy as np
from scipy import stats

from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data import USEquityPricing


class UlcerIndex(CustomFactor):
    # Root-mean-square of the percentage drawdown from the running high
    inputs = [USEquityPricing.close]

    def compute(self, today, asset_ids, out, close):
        running_max = np.maximum.accumulate(close, axis=0)
        dd = (running_max - close) / running_max
        out[:] = np.sqrt(np.mean(dd ** 2.0, axis=0))


class HurstExp(CustomFactor):
    inputs = [USEquityPricing.open]
    window_length = int(252 * 0.5)

    def Hurst(self, ts):
        # Create a range of lag values
        lags = np.arange(2, 20)
        # Calculate variance array of the lagged differences
        tau = [np.sqrt(np.std(np.subtract(ts[lag:], ts[:-lag]))) for lag in lags]

        # Use a linear fit to estimate the slope of log(tau) vs log(lags)
        # poly = np.polyfit(np.log(lags), np.log(tau), 1)[0]

        # 1st degree polynomial approximation for speed
        # source: http://stackoverflow.com/questions/28237428/fully-vectorise-numpy-polyfit
        n = len(lags)
        x = np.log(lags)
        y = np.log(tau)
        poly = (n * (x * y).sum() - x.sum() * y.sum()) / (n * (x * x).sum() - x.sum() * x.sum())

        # Return the Hurst exponent
        return poly * 2.0

    def compute(self, today, assets, out, OPEN):
        SERIES = np.nan_to_num(OPEN)
        out[:] = map(self.Hurst,
                     [SERIES[:, col_id].flatten() for col_id in np.arange(SERIES.shape[1])])


def _slope(ts, x):
    # Annualized exponential regression slope, weighted by R^2
    log_ts = np.log(ts)
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, log_ts)
    annualized_slope = (np.power(np.exp(slope), 250) - 1) * 100
    return annualized_slope * (r_value ** 2)


class Momentum(CustomFactor):
    inputs = [USEquityPricing.close]
    params = ('exclude_days',)

    def compute(self, today, assets, out, close, exclude_days):
        ts = close[:-exclude_days]
        x = np.arange(len(ts))
        out[:] = np.apply_along_axis(_slope, 0, ts, x.T)
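For anyone wanting to sanity-check the Ulcer Index math outside of Pipeline, here is a standalone numpy version (my own sketch; the array is shaped like the (days, assets) window Pipeline passes to compute):

```python
import numpy as np

def ulcer_index(close):
    """Ulcer Index per column: RMS of percentage drawdown from the running high."""
    running_max = np.maximum.accumulate(close, axis=0)
    drawdown = (running_max - close) / running_max
    return np.sqrt(np.mean(drawdown ** 2, axis=0))

# Two toy assets over three days: one only rises, one halves and recovers
close = np.array([[100.0, 100.0],
                  [110.0,  50.0],
                  [120.0, 100.0]])
print(ulcer_index(close))  # first asset: 0.0 (never below its running high)
```

The second asset spends one of three days at a 50% drawdown, so its Ulcer Index is sqrt(0.25 / 3), roughly 0.289.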


@ Luca -

Thanks for the Hurst exponent factor. I guess my thinking at this point is to ponder how to characterize the OHLCV data at a base level--sorta like seeing if the wind is blowing at all before going outside to fly a kite. If the Q team is listening to this, I'd think there would be an opportunity to crunch your nice minute bar data back to 2002, and publish the result to users, to give some sense for how the market behaves at a gross level. Might be interesting.

I don't believe it would be that easy to catch the market regime and behave accordingly, but if you manage to do that, let me know.

P.S. I believe the Hurst exponent factor comes from the community, but I can't remember precisely.

Luca,

The tearsheet above corresponds to this version of the Clenow Momentum.
BTW, there was a similar discussion here.

Another thought here is to attempt to combine technical indicators. For example, with 3 technical indicators, one can combine the alphas like this:

alpha = b0 + b1*alpha1 + b2*alpha2 + b3*alpha3 + b12*alpha1*alpha2 + b13*alpha1*alpha3 + b23*alpha2*alpha3 + b123*alpha1*alpha2*alpha3

There are 2^k terms, where k = 3 in this case, so there are 8 terms. The point is that technical indicators may work better in combination than stand-alone, and there is a systematic way to combine them.
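A sketch of that systematic combination (the alphas and weights here are made up for illustration): itertools.combinations enumerates all 2^k subsets of the k alphas, and each subset contributes one product term with its own weight.

```python
from itertools import combinations

import numpy as np

def combine_alphas(alphas, betas):
    """Combine k alpha arrays using the 2^k-term expansion above.

    betas maps a tuple of alpha indices to its weight, e.g. () -> b0,
    (0,) -> b1, (0, 1) -> b12. Missing weights default to zero.
    """
    k = len(alphas)
    total = np.zeros_like(alphas[0], dtype=float)
    for r in range(k + 1):
        for subset in combinations(range(k), r):
            term = np.ones_like(total)
            for i in subset:
                term = term * alphas[i]   # product of the alphas in this subset
            total += betas.get(subset, 0.0) * term
    return total

# Hypothetical weights for k = 2: alpha = 1 + 2*a1 + 3*a2 + 4*a1*a2
a1, a2 = np.array([1.0, 0.5]), np.array([2.0, -1.0])
betas = {(): 1.0, (0,): 2.0, (1,): 3.0, (0, 1): 4.0}
print(combine_alphas([a1, a2], betas))  # [17. -3.]
```

In practice the weights would be fitted (e.g. by regression on forward returns) rather than hand-picked, and the interaction terms are exactly what lets two individually weak indicators "work" together.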

@Vladimir, the factor I coded is from Clenow's updated version. I posted it here in case there are people curious to see the actual alpha factor without the additional "noise" (algorithm-specific details). Also, thanks for the link to the TA discussion, I had forgotten about that.

Hi Luca (or anybody listening who is knowledgeable enough to help)

I don't know if it is because some of the functions or datasets you're using in this notebook have been deprecated, but I get the following message after cloning and trying to run it:

construct factor history
Get pricing for 1095 entries
Alphalens
Dropping inf or -inf values from factor

/usr/local/lib/python2.7/dist-packages/ipykernel_launcher.py:16: DeprecationWarning: get_clean_factor_and_forward_returns: 'by_group' argument is now deprecated and replaced by 'binning_by_group'
  app.launch_new_instance()

Dropped 0.1% entries from factor data: 0.1% in forward returns computation and 0.0% in binning phase (set max_loss=0 to see potentially suppressed Exceptions).
max_loss is 35.0%, not exceeded: OK!

KeyErrorTraceback (most recent call last)
 in ()
     14     filter_zscore = filter_zscore,
     15     long_short = long_short,
---> 16     prices_cache = prices_cache)

 in run_tear_sheet(factor, factor_name, start_date, end_date, top_liquid, show_sector_plots, avgretplot, periods, quantiles, bins, filter_zscore, long_short, prices_cache)
    188     periods_before=days_before,
    189     periods_after=days_after,
--> 190     demeaned=long_short)
    191 alphalens.plotting.plot_quantile_average_cumulative_return(avgcumret, by_quantile=False,
    192     std_bar=False)

/usr/local/lib/python2.7/dist-packages/alphalens/performance.pyc in average_cumulative_return_by_quantile(factor_data, prices, periods_before, periods_after, demeaned, group_adjust, by_group)
   1009         return q_returns.unstack(level=1).stack(level=0)
   1010     elif demeaned:
-> 1011         fq = factor_data['factor_quantile']
   1012         return fq.groupby(fq).apply(average_cumulative_return, fq)
   1013     else:

/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in __getitem__(self, key)
    581         key = com._apply_if_callable(key, self)
    582         try:
--> 583             result = self.index.get_value(self, key)
    584
    585         if not lib.isscalar(result):

/usr/local/lib/python2.7/dist-packages/pandas/indexes/multi.pyc in get_value(self, series, key)
    624             raise InvalidIndexError(key)
    625         else:
--> 626             raise e1
    627         except Exception:  # pragma: no cover
    628             raise e1

KeyError: 'factor_quantile'

Do you know what the issue could be?
Great and potentially very useful notebook, by the way (i.e., if it works).
Thanks a lot.

Hey Josh,

Apologies if you already figured it out, but I believe it would work if you removed ['factor_quantiles'] and just used factor_data.

Hope it works.