Market/Security prediction using Machine Learning classifier and Google Trend data

I have created a notebook that uses a machine learning classifier to look for patterns between keyword search volume, as provided by Google Trends, and a selected security, to see whether an alpha signal can be generated.

The data files used by the notebook can be downloaded from:

Please provide your comments.



9 responses

Here is a backtest result using multiple securities.

import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import random

# Called once per simulation.
def initialize(context):
    # The prediction CSV produced by the notebook is loaded with fetch_csv;
    # the URL was omitted in the original post.
    # fetch_csv(<csv_url>,
    #           date_column = 'Date',
    #           date_format = '%y-%m-%d')
    context.leverage_factor = 1.0
    # Fraction of fetched assets to trade each week.
    context.number_of_positions_ratio = 0.8
    context.long_short_ratios = []
    schedule_function(weekly_market_open,
                      date_rule=date_rules.week_start(),
                      time_rule=time_rules.market_open(minutes=60))

def weekly_market_open(context, data):
    long_buys = []
    short_buys = []
    df = data.current(data.fetcher_assets,
                      ['probability', 'predictiveness', 'prediction'])
    df.sort_values(by='probability', ascending=True, inplace=True)
    # Keep the top fraction of assets by probability.
    df_high_predictiveness = df[-1*int(context.number_of_positions_ratio*len(df)):]
    for stock in df_high_predictiveness.index:
        if df_high_predictiveness.loc[stock, 'prediction'] > 0:
            long_buys.append(stock)
        else:
            short_buys.append(stock)
    # Remove untradable stocks.
    long_buys = [stock for stock in long_buys if data.can_trade(stock)]
    short_buys = [stock for stock in short_buys if data.can_trade(stock)]
    # Sell stocks that are in the portfolio but were not selected for this coming week.
    for stock in context.portfolio.positions:
        if stock not in long_buys and stock not in short_buys:
            order_target_percent(stock, 0)
    long_calls = len(long_buys)
    short_calls = len(short_buys)
    if long_calls + short_calls == 0:
        return
    for stock in long_buys:
        order_target_percent(stock, context.leverage_factor/(long_calls + short_calls))
    for stock in short_buys:
        order_target_percent(stock, -1*context.leverage_factor/(long_calls + short_calls))
    long_short_ratio = float(long_calls - short_calls)/(long_calls + short_calls)
    # Beta-hedge the whole thing with SPY when the book is heavily net long.
    if long_short_ratio > 0.5:
        order_target_percent(sid(8554), -1.1*long_short_ratio*context.leverage_factor)

@luc, I just came across your post. This is great work! Would you be willing to share your code on how you generated the data file?



This is really really cool!! Wondering if it can be fitted to a futures strategy..

@Jonathan, @Nicky => Thanks.

Here is a link to the Python notebook, the keyword list, and some historical GT data. All the data files contain are the Google Trends values as received from Google.

The notebook can't run in Research, but will run fine on a local machine. Run the notebook locally and save the resulting DataFrame as a CSV; that CSV is in turn used by the Q algo to run the backtest.
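For reference, a minimal sketch of what that intermediate CSV could look like. The column names 'probability', 'predictiveness', and 'prediction' come from the algo's data.current call; the 'symbol' column, the example values, and the file name are assumptions, not from the original post.

```python
import pandas as pd

# Hypothetical sketch: save the notebook's prediction DataFrame in a shape
# the backtest's fetch_csv can consume (one row per symbol per date).
predictions = pd.DataFrame({
    'Date':           ['2017-01-02', '2017-01-02'],  # made-up dates
    'symbol':         ['AAPL', 'SPY'],               # made-up symbols
    'probability':    [0.71, 0.55],
    'predictiveness': [0.60, 0.48],
    'prediction':     [1, -1],  # > 0 means long, otherwise short
})
predictions.to_csv('gt_predictions.csv', index=False)
```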

The main issue with this algo at this point is that there are not enough triggered shorts to have a statistically significant backtest.

Futures may be interesting to test, I'll have to look into it.

@luc.. Great idea to relate Google Trends and stock performance. As a newbie, I am trying to understand how the following works.

df_high_predictiveness = df[-1*int(context.number_of_positions_ratio*len(df)):]


Disregard this line of code. I was trying to rank predicted securities by their past "predictiveness", and that simply does not work.


Again (as Dropbox killed the public folder for 1TB users :( ):

Link to latest Google trend data:


Luc, can you share a link for users who are not 1TB users? Thanks.

Instead of trading a SPY ETF directly, I applied the algorithm's result to a simple momentum algo (I just used Q's simple momentum algo); i.e., when the ML on GT data indicates calm waters ahead, the momentum algo is biased long, and when the ML indicates a potential market reversal, the momentum algo goes a little short. Here is the backtest. So maybe the initially proposed algo does not stand on its own, but could instead be used as a long/short ratio signal.
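A minimal sketch of that idea: mapping the classifier's long/short ratio to a net-exposure bias for a momentum algo. The function name, thresholds, and scaling factors here are all hypothetical, not from the post.

```python
def momentum_bias(long_short_ratio, leverage=1.0):
    """Map the GT classifier's long/short ratio to a target net exposure
    for a momentum algo. Thresholds and scaling are illustrative only."""
    if long_short_ratio > 0.5:       # calm waters ahead: bias long
        return leverage
    elif long_short_ratio < -0.5:    # potential reversal: go a little short
        return -0.3 * leverage
    return 0.5 * leverage            # mixed signal: modest long bias

print(momentum_bias(0.8))   # -> 1.0
print(momentum_bias(-0.7))  # -> -0.3
```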

