How to pass a Pandas Series to the next day?

Hi,

I want to keep a cumulative Pandas Series, indexed by stock, from day to day (say there are 3 stocks in the list on day 1, and the next day another stock is added to make 4 in total). I was only able to carry over a list object, not a Pandas Series or DataFrame. It seems that a Series or DataFrame cannot be appended to for some reason. Any suggestions? Thank you so much!

In the following code, all_fetched_stocks (a Python list) grows as expected, but portfolio_series (a Pandas Series object) does not...

import numpy as np  
import pandas as pd

portfolio_series = pd.Series(['b'], index=[symbol('SPY')])  
all_fetched_stocks = list()

def preview(df):  
    log.info(' \n %s ' % df)  
    return df

def initialize(context):  
    fetch_csv('https://docs.google.com/spreadsheets/d/1zuFXXidaZQHdz8tZrEwAFxmxWQf0YRkBfQENlUbH2as/pub?output=csv',  
             date_column='Date',  
             pre_func = preview,  
             date_format='%Y/%m/%d')  
    schedule_function(show_fetched_assets, date_rules.every_day(), time_rules.market_open(minutes=1))

def show_fetched_assets(context, data):  
    for stock in data.fetcher_assets:  
        all_fetched_stocks.append(stock)  
        temp_series = pd.Series(['n'], index=[stock])  
        portfolio_series.append(temp_series)  
    log.info(len(portfolio_series))  
    log.info(len(all_fetched_stocks))  

Many thanks! It is driving me crazy...

4 responses

Instead of a Pandas Series, I believe you want to store the values in a Pandas DataFrame. Therefore, simply change 'pd.Series' to 'pd.DataFrame' and you are almost there.

You should also make a few other changes:
1) use the 'columns' parameter when making a DataFrame to give the data you are storing a label. It makes it easier to access those values later.
2) explicitly declare any globals as 'global' within a function IF you are re-assigning them (Python is picky about that). Better yet, put any 'globals' in the 'context' object (i.e. context.portfolio_series).
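As a rough illustration of point 2, the sketch below shows why re-assigning a name inside a function does not reach the caller's object, and how keeping the DataFrame on a context-like object avoids the problem. The `SimpleNamespace` is only a stand-in for Quantopian's `context`, the string tickers stand in for asset objects, and the function names are hypothetical.

```python
import pandas as pd
from types import SimpleNamespace


def rebind_locally(series, label):
    # pd.concat returns a new Series; assigning it to `series` only rebinds
    # the local name, so the caller's object is left unchanged.
    series = pd.concat([series, pd.Series(['n'], index=[label])])
    return len(series)


def add_to_context(context, label):
    # Mutating an attribute of a shared object is visible to every caller,
    # so no 'global' declaration is needed.
    context.portfolio_df.loc[label, 'my_label'] = 'n'


# Assumption: string tickers instead of Quantopian asset objects.
outer_series = pd.Series(['b'], index=['SPY'])
length_inside = rebind_locally(outer_series, 'AAPL')

# Assumption: SimpleNamespace plays the role of the algorithm's context.
context = SimpleNamespace()
context.portfolio_df = pd.DataFrame({'my_label': ['b']}, index=['SPY'])
add_to_context(context, 'AAPL')
add_to_context(context, 'AAPL')  # second call updates the row, no duplicate
```

After this runs, `outer_series` still has one entry even though the function saw two, while `context.portfolio_df` has grown to two rows.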

Also, note that the 'append' method does not modify the original object. It returns a new one, so the result has to be assigned back (which is why the length of portfolio_series never changed). Even then, 'append' will keep appending rows and happily create duplicate indexes. In this case, it will keep adding duplicate rows for each stock in the fetcher file. If one doesn't want duplicate rows, then use either the '.loc' or the '.set_value' method. Both of these will create a row if needed but otherwise simply update an existing row.
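A minimal sketch of the difference, outside of Quantopian, is shown here. It uses `pd.concat` in place of `append` and `.loc` in place of `.set_value`, because `append` and `set_value` were removed in later pandas releases (2.0 and 1.0 respectively); the string tickers are an assumption standing in for asset objects.

```python
import pandas as pd

# Assumption: plain string tickers instead of Quantopian asset objects.
portfolio = pd.Series(['b'], index=['SPY'])
new_row = pd.Series(['n'], index=['AAPL'])

# Like the older Series.append, pd.concat returns a NEW Series and leaves
# the original untouched if the result is not assigned.
pd.concat([portfolio, new_row])
length_without_assignment = len(portfolio)

# Assigning the result back is what actually grows the Series.
portfolio = pd.concat([portfolio, new_row])

# Concatenating the same label again produces a duplicate index entry.
with_duplicate = pd.concat([portfolio, new_row])
has_duplicates = bool(with_duplicate.index.duplicated().any())

# .loc creates the label if it is missing and overwrites it otherwise.
portfolio.loc['MSFT'] = 'n'   # new label: the Series grows
portfolio.loc['AAPL'] = 'x'   # existing label: updated in place
```

Because `.loc` assignment is keyed by label, running it every day for the same stock keeps one row per stock rather than piling up copies.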

Some updated code is below, but also a version is in the attached backtest.

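import numpy as np
import pandas as pd


def preview(df):
    # Log the fetched CSV for inspection, then pass it through unchanged
    # (same helper as in the original post; fetch_csv below refers to it)
    log.info(' \n %s ' % df)
    return df

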
def initialize(context):  
    # Create our stock list as a dataframe  
    # Specify the column name to make it easier when referencing the values  
    context.portfolio_df = pd.DataFrame(['b'], index=[symbol('SPY')], columns=['my_label']) 

    fetch_csv('https://docs.google.com/spreadsheets/d/1zuFXXidaZQHdz8tZrEwAFxmxWQf0YRkBfQENlUbH2as/pub?output=csv',  
             date_column='Date',  
             pre_func = preview,  
             date_format='%Y/%m/%d')  


    schedule_function(show_fetched_assets, date_rules.every_day(), time_rules.market_open(minutes=1))  


def show_fetched_assets(context, data):  
    for stock in data.fetcher_assets:  
        # the 'set_value' method will create a new row if needed  
        # otherwise it simply updates any previous data (ie won't create duplicates)  
        context.portfolio_df.set_value(stock, 'my_label',  'n')


[Attached backtest, ID: 5970f56c53dac551c3a74e3c]

Hi Dan,

You are my hero!!! Thanks so much!

I was actually looking to pass a DataFrame but I could not get it done... which even made me wonder if Quantopian's 'context' bans passing a DataFrame (just like write_csv was not allowed in the Research Notebook).

Then I tried the second option, which is the Series. But still, I could not get it to 'append' properly, and I could not tell if this was a pandas issue or a Quantopian issue.

I could not thank you enough.

I have another stupid question: where would you recommend I look if I want to learn or read a little more in depth about pandas and numpy, like your knowledge of set_value() vs append()?

Any advice will be much appreciated!!! Many thanks!!!

Glad to help.

As far as learning pandas and numpy methods (and tricks), all I ever do is search Google. Something like "how do I append a row to a pandas dataframe". You'll typically get a lot of results but the only ones I look at (usually) are from "stackoverflow" and "pandas.pydata.org". The latter is the official documentation.

I then usually test things out in a notebook first. It's easy to see what's going on in real time, in an interactive way. Also, when writing code in the IDE, use the debug window ( https://www.quantopian.com/help#debugger ) to inspect objects/variables as they run. One thing I do first is to verify the type of object being set or returned. Simply type type(my_variable) in the debug console and it will display the type of 'my_variable' (insert your desired variable name). It's very helpful to know if it's a Pandas Series or a DataFrame or a simple list, to then know what methods are available.
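For example, a quick type check along these lines can be run in a notebook or typed into the debug console. This is only a sketch: `type_name` is a hypothetical helper, and the objects are made up for illustration.

```python
import pandas as pd


def type_name(obj):
    # Return the short class name, e.g. 'Series', 'DataFrame' or 'list'.
    return type(obj).__name__


# Hypothetical objects, similar to the ones discussed in this thread.
series = pd.Series(['b'], index=['SPY'])
frame = series.to_frame('my_label')
stocks = ['SPY']

names = [type_name(series), type_name(frame), type_name(stocks)]

# Knowing the type tells you which methods exist: a DataFrame has
# columns, a Series does not.
frame_has_columns = hasattr(frame, 'columns')
series_has_columns = hasattr(series, 'columns')
```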

Good luck.

Hi Dan,

Thanks a lot! It is very helpful! I did not use the debugger in Quantopian much before... This definitely helps a lot for debugging in the IDE.

Thanks again and happy coding!