Simplest Machine Learning With KNN, Benchmark QQQ

This algorithm uses a very simple machine learning technique, KNN, as a regressor and tries to predict the 3-day return from look-back historical data.
It selects stocks from the QQQ holdings and targets the 3-day return. For each stock, it loads historical price data from 2013-2016 and generates a dataframe containing the 1-, 2-, 3-, 5-, and 10-day changes in adjusted close price.

    df['T1'] = df['price'] / df['price'].shift(1)    # 1-day change
    df['T2'] = df['price'] / df['price'].shift(2)    # 2-day change
    df['T3'] = df['price'] / df['price'].shift(3)    # 3-day change
    df['T5'] = df['price'] / df['price'].shift(5)    # 5-day change
    df['T10'] = df['price'] / df['price'].shift(10)  # 10-day change


Next, add a column (y) containing the 3-day return to this dataframe (look-ahead).

    df['y'] = df['price'].shift(-target_days) / df['price']  # N-day return from today

For each symbol, create this dataframe and build a regressor with it as training data.
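The steps so far can be put together in a short sketch. This is only an illustration under assumptions, not the original backtest code: the helper names `build_training_frame` and `fit_regressor`, the default of 5 neighbors, and the synthetic random-walk prices are all made up for the example. It uses scikit-learn's `KNeighborsRegressor`.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

FEATURES = ['T1', 'T2', 'T3', 'T5', 'T10']
LAGS = [1, 2, 3, 5, 10]


def build_training_frame(prices, target_days=3):
    """Return lagged price ratios plus the look-ahead target y (hypothetical helper)."""
    df = pd.DataFrame({'price': prices})
    for name, lag in zip(FEATURES, LAGS):
        df[name] = df['price'] / df['price'].shift(lag)
    # Look ahead: price target_days from now divided by today's price.
    df['y'] = df['price'].shift(-target_days) / df['price']
    # The first 10 rows lack a full look-back and the last target_days rows
    # lack a target, so they are dropped before training.
    return df.dropna()


def fit_regressor(df, n_neighbors=5, metric='chebyshev'):
    """Fit one KNN regressor on a single symbol's frame (parameters are assumptions)."""
    model = KNeighborsRegressor(n_neighbors=n_neighbors, metric=metric)
    model.fit(df[FEATURES].values, df['y'].values)
    return model


# Synthetic random-walk prices stand in for real adjusted closes.
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))
train = build_training_frame(prices)
model = fit_regressor(train)
```

In a real run, `prices` would be one stock's adjusted close series, and one model would be fitted per symbol.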
When backtesting, every day after market open, calculate the 1-, 2-, 3-, 5-, and 10-day changes for each stock and make a prediction from this data point [T1, T2, T3, T5, T10]. If the prediction (y) exceeds a 1% gain (a value above 1.01, since y is a price ratio), buy the stock and hold it for 3 days. If the prediction drops below a certain threshold on any of those 3 days, sell it.
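A minimal sketch of that daily decision rule, for one stock. The function name `decide`, the action labels, and the default sell threshold of 1.0 are assumptions for illustration; the post does not say which sell threshold was actually used.

```python
def decide(prediction, days_held, buy_threshold=1.01, sell_threshold=1.0, hold_days=3):
    """Return 'buy', 'sell', 'hold', or 'wait' for one stock on one day.

    prediction: predicted price ratio (1.01 means a 1% gain).
    days_held:  None when no position is open, else days since entry.
    sell_threshold is a hypothetical value, not taken from the post.
    """
    if days_held is None:
        # No position yet: enter only on a strong enough prediction.
        return 'buy' if prediction > buy_threshold else 'wait'
    if days_held >= hold_days or prediction < sell_threshold:
        # Holding period is over, or the prediction has deteriorated.
        return 'sell'
    return 'hold'
```

For example, `decide(1.02, None)` opens a position, while `decide(0.99, 1)` closes it early because the prediction fell below the sell threshold.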
Again, this is a very simple algorithm, but I think it is a good starting point, or maybe a good weak learner for building more complex models. When backtesting, make sure the training data period does not overlap the backtesting period, to avoid look-ahead bias.
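One way to keep the two periods apart is to split by date and also drop the last few training rows, whose look-ahead target would reach into the backtest period. The sketch below is an assumption about how this could be done, not something taken from the original algorithm; `split_train_backtest` is a hypothetical helper.

```python
from datetime import date, timedelta


def split_train_backtest(dates, backtest_start, target_days=3):
    """Return (train_idx, backtest_idx) for chronologically sorted dates.

    Row i has a target built from the price at row i + target_days, so the
    last target_days rows before the backtest start are left out of training.
    """
    first_bt = next((i for i, d in enumerate(dates) if d >= backtest_start),
                    len(dates))
    train_end = max(first_bt - target_days, 0)
    return list(range(train_end)), list(range(first_bt, len(dates)))


# Ten consecutive dates straddling the hypothetical backtest start.
dates = [date(2016, 12, 26) + timedelta(days=k) for k in range(10)]
train_idx, backtest_idx = split_train_backtest(dates, date(2017, 1, 1))
```

Here the rows for the last three days of 2016 are excluded from training, because their 3-day targets would be computed from 2017 prices.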
For the KNN model, the number of neighbors and the distance metric are two parameters that require fine-tuning, and I feel that Chebyshev distance is almost always better than Euclidean distance when it comes to time series.
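Rather than relying on a feeling, the two parameters can be compared with a time-ordered cross-validation. The following is a sketch using scikit-learn's `GridSearchCV` and `TimeSeriesSplit`; the candidate values, the number of splits, the helper name `tune_knn`, and the random data are assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neighbors import KNeighborsRegressor


def tune_knn(X, y, neighbors=(3, 5, 10, 20),
             metrics=('euclidean', 'chebyshev'), n_splits=5):
    """Grid-search n_neighbors and metric with chronological folds."""
    search = GridSearchCV(
        KNeighborsRegressor(),
        param_grid={'n_neighbors': list(neighbors), 'metric': list(metrics)},
        # TimeSeriesSplit always validates on data later than the training fold.
        cv=TimeSeriesSplit(n_splits=n_splits),
        scoring='neg_mean_squared_error',
    )
    search.fit(X, y)
    return search.best_params_, search.best_score_


# Random placeholder data shaped like [T1, T2, T3, T5, T10] and y.
rng = np.random.default_rng(1)
X = rng.normal(1.0, 0.02, size=(300, 5))
y = rng.normal(1.0, 0.02, size=300)
best_params, best_score = tune_knn(X, y)
```

On real features, `best_params` shows which metric and neighbor count gave the lowest validation error for that symbol, which may or may not agree with the Chebyshev preference above.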
This is actually my first algorithm, so please feel free to point out any errors. Thanks!

1 response

Here is another algorithm based on K nearest neighbors that uses a number of technical indicators as features in addition to price difference lags. The KNN regressor is retrained weekly to predict the next day's return. The sign of the prediction is used to construct a long-only portfolio of pre-selected funds. The algorithm trades daily, with a prediction threshold that attempts to reduce the number of transactions.
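The portfolio step might look roughly like the sketch below. It is a guess at the idea, not the attached algorithm: the threshold value, the equal weighting, the rule of keeping an existing position when the signal is weak, and the fund tickers are all assumptions for illustration.

```python
def select_longs(predictions, held, threshold=0.001):
    """Pick the funds to hold long from predicted next-day returns.

    predictions: dict of symbol -> predicted return (0.002 means +0.2%).
    held:        set of symbols currently held.
    """
    longs = set()
    for symbol, pred in predictions.items():
        if pred > threshold:
            longs.add(symbol)       # clearly positive: hold long
        elif pred >= -threshold and symbol in held:
            longs.add(symbol)       # weak signal: keep the position, no trade
    return longs


def equal_weights(longs):
    """Spread the portfolio equally over the selected funds."""
    return {s: 1.0 / len(longs) for s in sorted(longs)} if longs else {}


# Hypothetical predictions for four placeholder funds.
predictions = {'SPY': 0.004, 'TLT': 0.0005, 'GLD': -0.003, 'EFA': -0.0002}
longs = select_longs(predictions, held={'TLT', 'GLD'})
weights = equal_weights(longs)
```

With these inputs, SPY is bought, TLT is kept despite its weak signal, GLD is sold, and EFA is never opened, so small predictions near zero do not trigger trades.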
