Alpha Vertex PreCog test

From the looks of this chart, it appears to be overfit data. Very disappointing performance in 2017. Is anyone else noticing such degradation in 2017?

23 responses

I noticed the same, and that was my main concern with this dataset.

Another attempt, slightly better, but the drawdown period is still more than 6 months.


Yep. If you see the discussion on https://www.quantopian.com/posts/alpha-vertex-precog-dataset, Michael Bishop (one of the Alpha Vertex guys) claims that they are "hyper sensitive to both overfit and lookahead bias," but didn't offer up any evidence for why we should believe there isn't a problem. Your results show a big gnarly inflection point out of sample, suggesting they need to sharpen their pencils.

I should mention that I have a filter for profitable stocks only (ebitda > 0). Maybe that is affecting the results. I will post a new backtest without the filter shortly.

Actually, there could be a simpler test: take the top 500 stocks, measure the prediction score in-sample and out-of-sample, and compare.
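The comparison Pravin proposes can be sketched in a few lines. This is a minimal illustration with synthetic data standing in for the PreCog predictions and realized returns (the function name `sign_hit_rate` and all numbers are assumptions, not anything from the dataset): an overfit signal shows a high sign hit rate in-sample and roughly coin-flip accuracy out-of-sample.

```python
import numpy as np

def sign_hit_rate(predicted, realized):
    """Fraction of observations where the predicted return sign
    matches the realized return sign (zeros excluded)."""
    mask = (predicted != 0) & (realized != 0)
    return float((np.sign(predicted[mask]) == np.sign(realized[mask])).mean())

rng = np.random.default_rng(0)
n = 500
realized = rng.normal(0, 0.02, n)

# In-sample: predictions correlated with realized returns,
# as an overfit model would appear on its training period.
pred_in = realized + rng.normal(0, 0.01, n)

# Out-of-sample: pure noise, unrelated to realized returns.
pred_out = rng.normal(0, 0.02, n)

in_sample = sign_hit_rate(pred_in, realized)
out_of_sample = sign_hit_rate(pred_out, realized)
# A large gap between the two hit rates is the overfitting signature.
```

On real data, `predicted` would come from `precog.predicted_five_day_log_return` aligned against realized 5-day returns, split at the dataset's release date.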

Is this really surprising?

Well, it is a surprise worth $135/month ;) I'd kinda thunk they would have done 6-12 months of paper trading before going live on Q, but maybe they rushed into things. It is kinda surprising that they'd go public without being pretty darn sure they'd latched onto something. It'll be interesting to see the results in a year or two.

I accept @Pravin's test results and other members' comments to the effect that the PreCog dataset might not be that predictive. I came to the same conclusion. But here is the thing: even if there are no predictive powers behind the PreCog dataset, it might not matter. For instance, I could not differentiate the dataset's advantage from what is available from the market. But it did provide me with an excuse to get in and out of trades, and for me, that was sufficient. See my latest posts on that subject here: https://www.quantopian.com/posts/alpha-vertex-precog-dataset

The outcome of the trading strategy has even more merit if the dataset is not the reason for the alpha generation. The alpha generated is due to the trading mechanics, the methodology used in this particular trading strategy. And since most of the program has been changed, I cannot say that the original program is responsible for the outcome either. I changed the strategy's trading philosophy by controlling its entire payoff matrix, step by step. It is like any kind of software development: you cannot do it all at once. You need to debug and test your code as you go along. And the first test is always: does the program crash or not? And then, let's see the results. Did the program do what was requested?

One issue I see here is that there's no visibility into changes to the "algo" used by Alpha Vertex. I wonder if Q, as part of their arrangement with the vendor, gets any heads-up, or if Alpha Vertex can make changes as they see fit, without notification? In other words, say we wait a year to see how things play out.
Will the Alpha Vertex team have been fiddling with the strategy all the while, so that it will be hard to sort out out-of-sample performance relative to a backtest? I sent an email asking them these questions, but they never replied :(

Grant, Pravin demonstrated in his first post that the PreCog dataset had no alpha. He also demonstrated that it appeared to be breaking down near the end, after the dataset's release. Since then, Alpha Vertex has not come out to defend their approach or their data. Shouldn't that alone answer your question?

After modifying the program in the other thread, I found I could extract some alpha by first ignoring most of its trading procedures and the composition of the dataset itself. I used the program's structure to generate trading activity that appeared more as a statistical excuse to trade than as any kind of forecasting tool. So this raises a funny question: if you totally ignore the predictive abilities of a dataset and still use part of it as an excuse to trade, are you in fact making predictions on that dataset? Even if it is just one in a gazillion possible subsets of the Q1500US?

Just doing a head-scratch on how such derived signals supplied as data feeds fit within the Q-sphere. It's kinda like plugging into some Q user's algo that he could change at will, without notice. I guess if things are done well and consistently, it's all good, but if not, then there's no way to know what's going on. There's no prospectus for this kind of beast. It is surprising that it is even legal (the same could be said of some other signal feeds, as well). Caveat emptor, I guess. Interesting that they are not regulated.

@Grant, are you addressing the legality of what a data vendor provides, or the legitimacy of Q using it? That dataset, or another, might not matter much. @Pravin has already shown that this one might not have any alpha. I concurred with his findings, saying that the trading strategy might be trading on market noise.
And, if trading on market noise, you are trading with no available alpha coming from the predictions made. If you make trades following someone else's predictions and you do not make any money, then the answer is very simple: those predictions were no good, no better than the general market. If the predictions had value, you would inevitably outperform just by using that dataset. And again, @Pravin showed that this was not the case. I found a slight difference in the dataset from what the market had to offer. The dataset's advantage was $0.27 on an $8,889 bet. Not enough to even say there was a difference between the general market noise and the dataset. But I do not mind. I see my job as a strategy developer as extracting some alpha from all the data, no matter what. If I succeed, I can only attribute it to the trading methodology used, since the data itself was of no real help (no predictive powers). And this alone makes the trading procedures used to generate the alpha valuable.

Updated comparison of in-sample vs. out-of-sample performance. Disappointing.

IN SAMPLE: start_date = '2010-01-01', end_date = '2017-02-01'
OUT OF SAMPLE: start_date = '2017-02-01', end_date = '2017-09-10'

The question in my mind is how one might determine if the innards of the Alpha Vertex PreCog black box are the same today as they were before, in-sample compared to out-of-sample (or, more generally, versus time: simply attempt to detect changes, ignoring any knowledge of in-sample and out-of-sample time periods). It is not entirely obvious to me that the most recent 7-month period is anomalous. In other words, at a high statistical confidence level, can we say that the most recent 7-month period differs from any other 7-month period picked at random from the data set? Or, considering the data set as a time series, is there a test that could be applied that would suss out statistically significant anomalies?
And is there a major anomaly associated with the transition from in-sample to out-of-sample periods?

The problem I see with this sort of high-level black-box signal feed is that it is then on the quant to develop protection against changes to the inner workings of the black box (whether due to human fiddling, a lack of robust algo design, or whatever). As I understand it, it is completely unregulated; there is no obligation to notify anyone of anything. So it seems one needs code to check periodically whether something changed within the black box, based on its output time series.

@Grant, I understand your point, but what can we do about it? Trying to evaluate how far you can trust a black box is what Quantopian has been doing for some years, where the black boxes are the algorithms built by the users. Sure, Quantopian knows the algorithms don't change once they have them; this is different from the datasets. But even so, if an algorithm uses machine learning, the model evolves even without any change in the code, and you need to decide somehow when to stop "trusting" an algorithm and when to keep using it. So the problem of trusting a dataset is indeed similar to the evaluation process Quantopian performs on algorithms. It's not an easy task, I am pretty sure of that, and Quantopian could add some information, but the final answer is that either you invest lots of time building your own "dataset evaluation engine", or you simply use a dataset that performs well and stop using it when it doesn't anymore. Also, given that live trading is no longer possible on Quantopian, the problem of trusting a dataset is now a problem for the Q hedge fund only. As the performance of a dataset is inherited by the algorithm's performance, and as Quantopian already has a process in place for out-of-sample evaluation of their algorithms, I believe they have already solved their problem.

"the problem of trusting a dataset is now a problem for the Q hedge fund only" -- Not really, in my opinion.
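Grant's question above (does the most recent 7-month window differ, at high confidence, from same-length windows picked at random?) maps naturally onto a permutation test. The sketch below is entirely synthetic and illustrative: the daily sign-accuracy series, the 0.65/0.40 regime levels, and the function name `window_anomaly_pvalue` are assumptions, not anything measured from PreCog.

```python
import numpy as np

def window_anomaly_pvalue(series, window, n_perm=2000, seed=1):
    """Permutation test: compare the mean of the last `window` points
    against the means of randomly placed windows of the same length.
    Returns a two-sided p-value."""
    rng = np.random.default_rng(seed)
    observed = series[-window:].mean()
    center = series.mean()
    starts = rng.integers(0, len(series) - window, n_perm)
    perm_means = np.array([series[s:s + window].mean() for s in starts])
    # fraction of random windows at least as far from the overall mean
    return float(np.mean(np.abs(perm_means - center) >= abs(observed - center)))

rng = np.random.default_rng(0)
# ~7 years of daily hit/miss outcomes, then a degraded final ~7 months
series = np.concatenate([rng.binomial(1, 0.65, 1750),
                         rng.binomial(1, 0.40, 150)]).astype(float)

p_value = window_anomaly_pvalue(series, window=150)
# a small p-value flags the final window as anomalous
```

On the real feed, `series` could be the daily fraction of correct-sign predictions across the PreCog universe; the same test then needs no knowledge of where in-sample ends and out-of-sample begins.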
The case in point brought to mind how it would be nice to have a "black box change detector" that would do better than waiting 6-12 months for out-of-sample data. If our intuition is correct, there is something fishy going on with the PreCog data set. It is also reasonable to think that the fishiness was an event in time (i.e. a regime change in the time series). So I'm wondering if the right technique (e.g. something from http://scikit-learn.org/stable/) could be used to detect changes earlier. I'd heard that Quantopian would like to offer thousands of data sets. Such a "regime change detector" might be useful, particularly if they include black-box data such as the PreCog data set. The idea would be that if a regime change is detected, simply drop the data set, on the notion that something in its black box was changed dramatically, and so it needs to be re-vetted.

Here's an update to Luca's notebook above. I'm not so experienced in interpreting these things, but it seems that the Alpha Vertex ML algo still hasn't sorted out how to make money.

Here's an updated backtest of Jamie's from https://www.quantopian.com/posts/alpha-vertex-precog-dataset (Backtest ID: 58bde10db3fab35e38fef4cc). I only changed the end date of the backtest. I'll post a tear sheet next.
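A minimal version of the "regime change detector" idea above can be sketched without any machine-learning machinery: estimate a baseline accuracy from early data, then flag the first day whose trailing-window mean falls far below it. Everything here is an illustrative assumption (the synthetic series, the baseline length, the window, the z-threshold, and the name `detect_drop`); a production detector would need tuning and a proper false-alarm-rate analysis.

```python
import numpy as np

def detect_drop(series, baseline_days=250, window=100, z_thresh=3.5):
    """Return the first day index whose trailing `window`-day mean falls
    more than `z_thresh` standard errors below the baseline mean, else -1."""
    base = series[:baseline_days]
    mu = base.mean()
    se = base.std(ddof=1) / np.sqrt(window)  # std error of a window mean
    # rolling means: means[i] covers series[i : i + window]
    means = np.convolve(series, np.ones(window) / window, mode='valid')
    for i, m in enumerate(means):
        end_day = i + window - 1
        if end_day < baseline_days:
            continue  # never alarm inside the baseline period
        if m < mu - z_thresh * se:
            return end_day
    return -1

rng = np.random.default_rng(0)
# daily hit/miss outcomes: a stable regime, then a sharp break after day 500
series = np.concatenate([rng.binomial(1, 0.62, 500),
                         rng.binomial(1, 0.20, 200)]).astype(float)

alarm_day = detect_drop(series)
# the alarm fires within roughly one window of the break,
# rather than after 6-12 months of out-of-sample data
```

The design trade-off is the usual one: a shorter window reacts faster but raises the false-alarm rate, which matters if "alarm" means dropping a $135/month data set.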
```python
"""
Alpha Vertex combines data science and machine learning technologies to
deliver cognitive systems that provide advanced analytical capabilities to
the investment community.

PreCog is an analytical service built on top of the AV Knowledge Graph which
uses machine learning models to forecast returns at multiple horizons.
PreCog leverages thousands of artificial intelligence models working in
unison to analyze high-dimensional financial datasets in order to produce
predictive signals that can be incorporated into trading strategies.

Disclaimer:
-----------
Alpha Vertex Inc. is not a provider of trading algorithms or strategies. It
is the responsibility of the user to incorporate the signals provided by
Alpha Vertex into a trading strategy of his or her design. Any example
strategies provided by Alpha Vertex are for illustrative purposes only and
are not intended to represent, in whole or in part, a robust trading
strategy that should be used without modification in a live trading
scenario. Past performance is not indicative of future results and Alpha
Vertex does not guarantee any specific outcome or profit. Before acting on
any information in the strategy below, you should consider whether it is
suitable for your particular circumstances and strongly consider seeking
advice from your own financial or investment adviser.
"""

#############################################
# Import Modules
#############################################

# Get the PreCog data.
# Premium version available at
# https://www.quantopian.com/data/alphavertex
from quantopian.pipeline.data.alpha_vertex import precog_top_500 as precog

from quantopian.algorithm import attach_pipeline, pipeline_output
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import AverageDollarVolume, CustomFactor
from quantopian.pipeline.filters.morningstar import Q1500US

import numpy as np
import pandas as pd
from datetime import datetime
import pytz


#############################################
# Strategy Parameters and Definition
#############################################

def initialize(context):
    """
    Initialize the trading strategy and set key parameters.
    Called once at the start of the algorithm.

    Strategy Background:
    --------------------
    Create a long/short trading strategy which utilizes the Alpha Vertex
    5-day forecast of stock log returns. Create two custom factors:
      1) normalize the projected log returns for each stock by the
         historical volatility of those returns;
      2) select stocks whose recent prediction quality is high from the
         universe of over 500 candidate names.
    Go long stocks with a high normalized expected return and go short
    stocks with a negative expected return. Hold positions for a minimum
    of 2 days and a maximum of 5 days.
    """
    # define position limits
    context.max_longs = 1.0
    context.max_shorts = -1.0

    # min/max weight for a long position
    context.long_weight_min = 0.02
    context.long_weight_max = 0.10

    # min/max weight for a short position
    context.short_weight_min = -0.02
    context.short_weight_max = -0.04

    # zscore of returns:
    # select stocks with above-average expected returns
    context.ret_zscore_thresh = 0.25

    # position holding periods:
    # hold stocks for at least 2 days and at most 5 days
    context.min_hold_period = 2.0
    context.max_hold_period = 5.0
    context.rebal_dict = {}

    # prediction quality threshold
    context.pred_quality_thresh = 0.65

    # liquidity threshold
    context.min_dollar_volume = 5e5

    # Rebalance every day, 1 hour after market open.
    schedule_function(trade_rule,
                      date_rules.every_day(),
                      time_rules.market_open(hours=1))

    # Record tracking variables at the end of each day.
    schedule_function(my_record_vars,
                      date_rules.every_day(),
                      time_rules.market_close())

    # Create our dynamic stock selector.
    attach_pipeline(make_pipeline(context), 'av_pipe')


def make_pipeline(context, pred_quality_thresh=None):
    """
    Dynamically apply the custom factors defined below to select
    candidate stocks from the PreCog universe.
    """
    if pred_quality_thresh is None:
        pred_quality_thresh = context.pred_quality_thresh

    # Use the Alpha Vertex PreCog 500 universe only.
    universe = Q1500US() & precog.predicted_five_day_log_return.latest.notnan()
    # liquidity screen (disabled):
    # av_universe = liq_screen > context.min_dollar_volume

    # Filter on prediction quality: select stocks with pred_quality > 65%.
    # This can be changed in the initialize function above.
    prediction_quality = PredictionQuality(mask=universe)
    quality = prediction_quality > pred_quality_thresh

    # Filter on normalized returns: exclude stocks that are predicted
    # to have very large, outsized/irregular returns.
    normalize_return = NormalizedReturn(mask=quality)
    ret_filter = (normalize_return > -10) & (normalize_return < 10)

    # create the pipeline
    columns = {'NormalizedReturn': normalize_return,
               'PredictionAcc': prediction_quality}
    pipe = Pipeline(columns=columns, screen=ret_filter)
    return pipe


def get_longs_and_shorts(context, data, ret_zscore_thresh=None):
    """
    Get stocks that satisfy the pipeline filtering criteria and calculate
    the target portfolio weights for each stock.
    """
    # Use the default return zscore threshold if none is passed;
    # change this in the initialize function if desired.
    if ret_zscore_thresh is None:
        ret_zscore_thresh = context.ret_zscore_thresh

    # stocks that satisfy the screening criteria specified in the pipeline above
    pipe_output = context.filtered_df

    # replace NaNs with zero
    pipe_output['NormalizedReturn'] = pipe_output['NormalizedReturn'].replace(np.nan, 0)

    # Candidate long positions: stocks with a return zscore above the threshold.
    long_df = pipe_output[pipe_output['NormalizedReturn'] > ret_zscore_thresh]
    # Calculate weights for long candidates (avoid divide-by-zero errors).
    if len(long_df) > 0:
        long_df['weight'] = long_df['NormalizedReturn'] / long_df['NormalizedReturn'].sum()

    # Candidate short positions: stocks with a return zscore below the negative threshold.
    short_df = pipe_output[pipe_output['NormalizedReturn'] < -ret_zscore_thresh] \
        .sort_values(by='NormalizedReturn', ascending=True)
    # Calculate weights for short candidates (avoid divide-by-zero errors).
    if len(short_df) > 0:
        short_df['weight'] = short_df['NormalizedReturn'] / short_df['NormalizedReturn'].sum()

    return long_df, short_df


def before_trading_start(context, data):
    """
    Get the day's buy and sell lists.
    """
    # stocks that meet the pipeline filtering criteria
    context.filtered_df = pipeline_output('av_pipe')

    # long and short positions along with target portfolio weights
    context.long_df, context.short_df = get_longs_and_shorts(context, data)

    # Lower the return threshold if too few stocks are returned.
    # (The original posted code assigned to a stray `context.short_df_` here,
    # leaving the short list stale; fixed below.)
    if (len(context.long_df) < 20) & (len(context.long_df) > 0):
        context.long_df, context.short_df = get_longs_and_shorts(
            context, data,
            ret_zscore_thresh=context.ret_zscore_thresh * 0.9)

    # Lower the threshold one more time, but stop after this.
    if (len(context.long_df) < 20) & (len(context.long_df) > 0):
        context.long_df, context.short_df = get_longs_and_shorts(
            context, data,
            ret_zscore_thresh=context.ret_zscore_thresh * 0.8)

    # get lists of tickers to go long and short
    context.long_assets = []
    context.short_assets = []
    if len(context.long_df) > 0:
        context.long_assets = context.long_df.index
    if len(context.short_df) > 0:
        context.short_assets = context.short_df.index


def trade_rule(context, data):
    """
    Execute the trading strategy every day.

    Trade Rule:
    -----------
    BUY stocks that are predicted to go up over the next 5 days.
    Hold the position for at least 2 days and at most 5.
    SELL stocks that are predicted to go down over the next 5 days.
    Hold the position for at least 2 days and at most 5.
    """
    today = pd.to_datetime(get_datetime())
    long_weight = 0
    short_weight = 0

    # don't buy more than the max long weight set in the initialize function
    if len(context.long_df) > 0:
        long_weight = min(context.max_longs / len(context.long_df),
                          context.long_weight_max)

    # don't sell more than the max short weight set in the initialize function
    if len(context.short_df) > 0:
        short_weight = max(context.max_shorts / len(context.short_df),
                           context.short_weight_max)

    # combine the long and short lists
    context.universe_df = context.long_df.append(context.short_df)
    context.universe_assets = context.universe_df.index

    # loop through the tradeable universe and place orders that meet the criteria
    for security in context.universe_assets:
        if security not in context.rebal_dict.keys():
            context.rebal_dict[security] = datetime(1901, 1, 1, tzinfo=pytz.utc)

        ###### BUY LOGIC ######
        # Buy if the prediction is positive and we don't have an existing
        # long position in the stock, or the prior position was short.
        if data.can_trade(security) & (security in context.long_assets):
            curr_pos = context.portfolio.positions[security].amount
            days_held = np.busday_count(context.rebal_dict[security], today)
            # track when the stock last traded (used for the min holding period)
            if (days_held >= context.min_hold_period) & (curr_pos <= 0):
                context.rebal_dict[security] = today
                order_target_percent(security, long_weight)

        ###### SELL LOGIC ######
        # Sell if the prediction is negative and we don't have an existing
        # short position in the stock, or the prior position was long.
        if data.can_trade(security) & (security in context.short_assets):
            curr_pos = context.portfolio.positions[security].amount
            days_held = np.busday_count(context.rebal_dict[security], today)
            # track when the stock last traded (used for the min holding period)
            if (days_held >= context.min_hold_period) & (curr_pos >= 0):
                context.rebal_dict[security] = today
                order_target_percent(security, short_weight)

    # close positions we have held for more than 5 days
    for security in context.rebal_dict.keys():
        if data.can_trade(security):
            curr_pos = context.portfolio.positions[security].amount
            days_held = np.busday_count(context.rebal_dict[security], today)
            if (days_held > context.max_hold_period) & (curr_pos != 0):
                # log.info('close: {}'.format(security))
                order_target_percent(security, 0.0)


#############################################
# Custom Factors
#############################################

class PredictionQuality(CustomFactor):
    """
    Custom factor that calculates the prediction quality for each stock in
    the universe: the percentage of predictions with the correct sign over
    a rolling window (3 weeks) for each stock.
    """
    # data used to create the custom factor
    inputs = [precog.predicted_five_day_log_return, USEquityPricing.close]
    # change this to what you want
    window_length = 15

    def compute(self, today, assets, out, pred_ret, px_close):
        # actual 5-day log returns
        px_close_df = pd.DataFrame(data=px_close)
        pred_ret_df = pd.DataFrame(data=pred_ret)
        log_ret5_df = np.log(px_close_df) - np.log(px_close_df.shift(5))
        log_ret5_df = log_ret5_df.iloc[5:].reset_index(drop=True)
        n = len(log_ret5_df)

        # predicted returns, aligned to the realized returns
        pred_ret_df = pred_ret_df.iloc[:n]

        # 1 where the predicted sign was wrong, 0 where it was right
        err_df = (np.sign(log_ret5_df) - np.sign(pred_ret_df)).abs() / 2.0

        # custom quality measure: exponentially weighted hit rate
        # (pd.ewma is the legacy pandas API available on Quantopian)
        pred_quality = (1 - pd.ewma(err_df, min_periods=n, com=n)).iloc[-1].values
        out[:] = pred_quality


class NormalizedReturn(CustomFactor):
    """
    Custom factor that calculates the normalized forward return: scales the
    forward return expectation by the cross-sectional volatility of the
    predicted returns.
    """
    # data used to create the custom factor
    inputs = [precog.predicted_five_day_log_return, USEquityPricing.close]
    # change this to what you want
    window_length = 10

    def compute(self, today, assets, out, pred_ret, px_close):
        # cross-sectional mean and standard deviation of predicted returns
        avg_ret = np.nanmean(pred_ret[-1], axis=0)
        std_ret = np.nanstd(pred_ret[-1], axis=0)
        # normalized returns
        norm_ret = (pred_ret[-1] - avg_ret) / std_ret
        out[:] = norm_ret


#############################################
# Helper Functions
#############################################

def my_record_vars(context, data):
    """
    Plot variables at the end of each day.
    """
    record(leverage=context.account.leverage,
           net_leverage=context.account.net_leverage,
           values=context.portfolio.portfolio_value / 1e6)
```

There was a runtime error.

The tear sheet for the backtest above.

It is worth noting that Q did add a "warning" of sorts on their data store pages:

"Note: Quantopian started collecting this dataset live on March 6, 2017."

Why this matters: https://www.quantopian.com/posts/quantopian-partner-data-how-is-it-collected-processed-and-surfaced

@Grant, that does make the point that the PreCog dataset might have been somehow "massaged", if not "doctored". This should raise a lot of other questions. Just as @Pravin had shown before that there might not be much alpha there, so does your tear sheet. Note that the strategy is making less than $3.00 net profit per trade on 62,851 trades. A slight change in fee structure, and apparently a little more time, and that might have disappeared too. Note also that the gross leverage came in at 3.46, and all it got was $2.94 a trade to pay leveraging fees, which were not accounted for.
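The per-trade figures quoted above imply a very thin edge, which a quick back-of-envelope check makes concrete. The $2.94 per-trade profit and 62,851-trade count come from the post; the total and the fee-sensitivity number are derived here and were not stated in the thread.

```python
# numbers as quoted in the post
net_per_trade = 2.94           # net profit per trade, in dollars
n_trades = 62_851              # total number of trades

# implied total net profit (derived, not from the thread)
total_net = net_per_trade * n_trades

# hypothetical sensitivity: add $3 of per-trade costs
# (commissions, borrow/leveraging fees, slippage)
extra_fee = 3.00
post_fee_per_trade = net_per_trade - extra_fee
# a $3 swing in per-trade costs turns the edge negative
```

At 3.46x gross leverage, unmodeled financing costs alone could plausibly consume that margin, which is the point Guy is making.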