KeyError symbol error -- date MUST be in %y-%m-%d format

Q-Tribe,

I'm trying to load a signal file from a dropbox location. I get 5 of the rows and then -- fffttt -- choke. There's another 10 rows or so after that error.
And the error in the runtime does not reflect this logged error.
You might just fetch the file to see how simple it is.

1970-01-01 DataReview:76 INFO  
EntryDate Symbol Side  Quantity  EntryPrice  NetAmount  ExitDate  
0   20150317    ADP  Buy        57       86.91    4953.87  20150407  
1   20150317    ARE  Buy        51       97.26    4960.26  20150407  
2   20150317   BRCM  Buy       111       44.56    4946.16  20150407  
3   20150317    CAT  Buy        63       79.35    4999.05  20150407  
4   20150317     CR  Buy        78       63.39    4944.42  20150407  
1970-01-01 null:null WARN requests/packages/urllib3/util/ssl_.py:79:  
InsecurePlatformWarning: A true SSLContext object is not available.  
This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail.  
For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.  
End of logs.  

Runtime error:

KeyError: 'symbol'  
There was a runtime error on line 35.  

And this is line 35: universe_func = SetUniverse)

Here's the simple code that I was hoping would work (pulled mainly from the help file).

def initialize(context):  
    fetch_csv('https://dl.dropboxusercontent.com/u/217910322/Finance/TestSignals.csv',  
              date_column   = 'EntryDate',  
              date_format   = 'yyyymmdd',  
              pre_func      = DataReview,  
              post_func     = DataReview,  
              universe_func = SetUniverse)

def SetUniverse(context, fetcher_data):  
    my_stocks = set(fetcher_data['Symbol'])  
    context.count = len(my_stocks)  
    print 'total universe size: {c}'.format(c=context.count)  
    return my_stocks

def DataReview(df):  
    log.info(' %s ' % df.head())  
    return df  

Your pre_func / post_func DataReview(df) only shows the first five rows because you're using the .head() method:

log.info(' %s ' % df.head())  

If you change it to:

log.info(' %s ' % df)  

it will print everything. I think Quantopian is aware of the SSL warning, but it shouldn't affect you.

You get a key error because you try to access a key that doesn't exist:

my_stocks = set(fetcher_data['Symbol'])  

I don't know why you changed it, but if you check the example again it should be:

my_stocks = set(fetcher_data['sid'])

@Epic T., df.head() right. Cleared that up just fine thanks.

No help on the error though.

my_stocks = set(fetcher_data['sid'])

There's no "sid" in my data. There is however "Symbol"
(Columns: EntryDate Symbol Side Quantity EntryPrice NetAmount ExitDate)

So I figured fetcher_data would contain a column called "Symbol". But I can't even get into SetUniverse in debug mode -- it's never called.

Maybe the help doesn't really illustrate everything I need to know about fetcher foibles. Off to examine other use cases...

Update:

"Symbol" won't work -- but "symbol" will...

"For Security Info, your CSV file must have a column with header of 'symbol' which represents the symbol of that security on the date of that row. Internally, Fetcher maps the symbol to the Quantopian security id (sid). You can have many securities in a single CSV file."

Yes, it's case sensitive in this case. Hope it works now.
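
If you would rather not edit the file by hand, here is a small sketch in plain Python of the header fix. The helper name normalize_header is made up for illustration and is not part of Fetcher; inside a pre_func the same idea can be expressed with df.rename(columns=...).

```python
def normalize_header(columns):
    """Return the column names with any case variant of 'symbol' lower-cased."""
    return ["symbol" if name.strip().lower() == "symbol" else name
            for name in columns]


# Column names as they appear in the signal file from this thread.
header = ["EntryDate", "Symbol", "Side", "Quantity",
          "EntryPrice", "NetAmount", "ExitDate"]
print(normalize_header(header))
```

Only the 'symbol' header is touched; every other column keeps its original name.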

Still not quite there...

Clues,

On download these are the columns -- I renamed and moved the symbol column figuring that fetcher was incapable of renaming the one.

During pre_func:

1970-01-01 DataReview:81 INFO
symbol EntryDate Side Quantity EntryPrice NetAmount ExitDate
0 ADP 20150317 Buy 57 86.91 4953.87 20150407
...

Then during post_func

1970-01-01 DataReview:81 INFO Empty DataFrame
Columns: [Side, Quantity, EntryPrice, NetAmount, ExitDate, sid]
Index: []

So I'm not sure what fetcher is doing in between...

Same error about "KeyError: symbol"

Epic T., you know, never mind. This is not worth my time (nor yours).

Peculiar problem, but it turns out it's the date format that makes it crash (I think). I added dashes to the dates, e.g. "2015-03-17", and then it ran properly. I guess the date format for this is date_format='%y-%m-%d', but it seems to run without specifying that. I couldn't get your format to work, but I'm not that good with dates in Python. Is it easy for you to supply the dates in another format?
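
For anyone who needs to convert an existing file, here is a minimal sketch in plain Python, meant to be run outside Quantopian before uploading the CSV. The helper name to_dashed is hypothetical. Note that in standard Python a four-digit year is %Y; %y is the two-digit form, so the dashed dates correspond to '%Y-%m-%d' as far as strptime is concerned. How Fetcher itself interprets date_format is not confirmed in this thread.

```python
from datetime import datetime


def to_dashed(compact):
    """Convert a 'YYYYMMDD' string such as '20150317' to 'YYYY-MM-DD'."""
    # %Y = four-digit year, %m = month, %d = day of the month.
    return datetime.strptime(compact, "%Y%m%d").strftime("%Y-%m-%d")


print(to_dashed("20150317"))
print(to_dashed("20150407"))
```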

@ET, tested, works. Thanks for your efforts in this. I suppose I'm too used to other language formats for things like date strings.

EDIT: Theoretically one of these "should" work:
date_format = '%Y%m%d',
date_format = '%Y%M%d',
date_format = '%Y%M%D',
But none do.
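
A quick check in plain Python (outside Quantopian) of what those three strings mean to datetime.strptime may help here. This only shows standard-library behaviour; whether Fetcher applies date_format the same way, or sees the column as an integer rather than a string, is an open assumption.

```python
from datetime import datetime

raw = "20150317"

# %Y%m%d: four-digit year, month, day. This parses as intended.
print(datetime.strptime(raw, "%Y%m%d"))

# %M means minutes, not month, so "03" lands in the minute field
# and the month silently defaults to January.
print(datetime.strptime(raw, "%Y%M%d"))

# %D is not among the documented strptime directives; where it is
# unsupported, Python raises ValueError instead of returning a date.
try:
    print(datetime.strptime(raw, "%Y%M%D"))
except ValueError as err:
    print("ValueError:", err)
```

So only the first of the three is a valid spelling of "yyyymmdd" in Python; the other two either misread the month or are rejected.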

Thanks for your help Epic T. To warp it over, I mean, to wrap it up, I had to add a dummy date column to the data as this code:

data[stock]['dt']  

always returned the same as data[stock].datetime and NOT the date that I had supplied.
If "mask = True/False" has anything to do with that, it didn't matter; both had the same effect.

But, in the end I got it to work with your help. Now I can load a signal file outside of Q and examine the approximate IB P&L.

Edit: do we have to forgo the use of data[stock].close_price when we use the fetcher? I got errors when I tried to use close_price. Is it not available on the SIDData object?

See: https://www.quantopian.com/posts/runtime-error-siddata-object-has-no-attribute-price

import pandas as pd

def initialize(context):
    set_slippage(slippage.FixedSlippage(spread=0.00))
    set_commission(commission.PerShare(cost=0, min_trade_cost=None))
    schedule_function(HandleEntry, date_rules.every_day(), time_rules.market_open())
    schedule_function(HandleExit,  date_rules.every_day(), time_rules.market_close())
    
    fetch_csv('https://dl.dropboxusercontent.com/u/217910322/Finance/TestSignals.csv', 
              date_column   = 'DummyDate', 
              date_format   = '%y-%m-%d',
              pre_func      = RenameColumns,
              post_func     = DataReview,
              universe_func = SetUniverse,
              mask          = False)
    
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def handle_data(context,data):
    record(Leverage = context.account.leverage)
    record(PosValue = context.portfolio.positions_value)
    
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def HandleEntry(context,data):
    exchange_time = pd.Timestamp(get_datetime()).tz_convert('US/Eastern')
    for stock in data:
        entryDate = pd.Timestamp(data[stock]['EntryDate'])
        if (entryDate.date() != exchange_time.date()):
            continue
        if (data[stock]['Side'].strip() == "Buy"):
            quantity = data[stock]["Quantity"]
        else:
            quantity = -data[stock]["Quantity"]
            
        order(stock, quantity)
        print(">>> {0:<5} qty {1:>5}  exp$ {2:>7.2f}".format(
               stock.symbol, quantity, data[stock]["EntryPrice"]))
                
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def HandleExit(context,data):
    exchange_time = pd.Timestamp(get_datetime()).tz_convert('US/Eastern')
    for stock in data:
        exitDate = pd.Timestamp(data[stock]['ExitDate'])
        if (exitDate.date() != exchange_time.date()):
            continue
        order_target(stock, 0)  # close the position; order(stock, 0) would order zero shares
        print("<<< {0:<5} qty {1:>5}  exp$ {2:>7.2f}".format(
               stock.symbol, context.portfolio.positions[stock].amount, data[stock]["EntryPrice"]))
            
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def SetUniverse(context, fetcher_data):
    my_stocks = set(fetcher_data['sid'])
    context.count = len(my_stocks)
    print 'total universe size: {c}'.format(c=context.count)
    return my_stocks

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def RenameColumns(df):
    df = df.rename(columns={'Symbol': 'symbol'})
    return df

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
def DataReview(df):
    log.info(' %s ' % df)
    return df
There was a runtime error.

Yep, that's intentional:

The backtester will pull in only the data that is available on the current bar, based on the 'date_column' specified in fetch_csv to prevent look-ahead bias. This column will be overwritten with the algorithm datetime and be fed into the backtest once the data becomes available each day.
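
The dummy-column workaround described earlier in the thread can be sketched in plain Python, run on the CSV before it is uploaded. This is only an illustration under assumptions: add_dummy_date is a hypothetical helper, and the column names EntryDate and DummyDate are the ones used in this thread.

```python
import csv
import io
from datetime import datetime


def add_dummy_date(csv_text, source="EntryDate", dummy="DummyDate"):
    """Return CSV text with an extra dashed-date column copied from `source`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = [dummy] + reader.fieldnames
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # The copy is the column handed to date_column (and overwritten);
        # the original date column stays readable from the algorithm.
        parsed = datetime.strptime(row[source], "%Y%m%d")
        row[dummy] = parsed.strftime("%Y-%m-%d")
        writer.writerow(row)
    return out.getvalue()


sample = "EntryDate,symbol,Side,Quantity\n20150317,ADP,Buy,57\n"
print(add_dummy_date(sample))
```

With date_column = 'DummyDate', only the copy gets replaced by the algorithm datetime, and data[stock]['EntryDate'] still carries the date from the file.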

No, you don't have to forgo using close_price. I think this is a general problem and not specific to fetcher. You just have to make the check that Alisa describes in her response in your link before attempting to use close_price:

for stock in data:  
    if 'price' in data[stock]:  
        close = data[stock].close_price  # safe: this bar has price data