Research Updates - get_pricing and Jupyter Notebook Upgrade

Yesterday, we shipped two major upgrades to Quantopian Research that should help with the research workflow.

get_pricing now uses the same pricing and volume data as the rest of the platform, so it shares a data source with Pipeline and the backtester. It also means that the data loaded by get_pricing is point-in-time split- and dividend-adjusted. For these adjustments, the reference date (the "date you are looking back from") is always the end_date of the get_pricing call. get_pricing is also much faster: up to 50x faster on some queries. Here are some performance test results (blue bars are the new version):

As a reminder, now that the data returned by get_pricing is a bit different, your existing notebooks may have different results. This does not affect any data used in backtesting or live trading.
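To make the reference-date behaviour concrete, here is a minimal sketch of split adjustment "as of" a reference date. It is a toy illustration in plain Python, not Quantopian's implementation: the function name `adjust_for_splits`, the prices, and the 2-for-1 split are all made up, and dividends are left out for brevity.

```python
from datetime import date


def adjust_for_splits(prices, splits, reference_date):
    """Adjust raw closes for every split known as of reference_date.

    prices: list of (date, raw_close) tuples (hypothetical data)
    splits: list of (effective_date, ratio) tuples; 0.5 means a 2-for-1 split
    """
    adjusted = []
    for day, raw in prices:
        factor = 1.0
        for effective, ratio in splits:
            # A split rescales only bars dated before it, and only when the
            # split has already happened as of the reference date.
            if day < effective <= reference_date:
                factor *= ratio
        adjusted.append((day, raw * factor))
    return adjusted


# Made-up raw closes around a made-up 2-for-1 split effective 2017-01-05.
raw_prices = [
    (date(2017, 1, 3), 100.0),
    (date(2017, 1, 4), 102.0),
    (date(2017, 1, 5), 51.5),
]
splits = [(date(2017, 1, 5), 0.5)]

# Looking back from the day before the split: nothing is adjusted yet.
early = adjust_for_splits(raw_prices[:2], splits, reference_date=date(2017, 1, 4))

# Looking back from the split date: earlier bars are halved.
late = adjust_for_splits(raw_prices, splits, reference_date=date(2017, 1, 5))
```

The same two bars come back with different values depending on the reference date, which is why a notebook that changes its end_date can see different historical prices for the same days.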

Research was upgraded to use the latest version of Jupyter Notebook (4.3.1). This upgrade adds some new features that should make working in a notebook a little bit easier.

  • The “command palette” lets you search for a command by name (bring it up with Cmd/Ctrl-Shift-P).

  • There’s a new search-and-replace dialog, which can be brought up with F while you’re not editing a cell (aka “command mode”) or via the ‘Edit’ dropdown menu.

  • You can now select multiple cells with Shift-Up/Down or Shift-K/J. You can apply actions such as execute, cut/copy/paste, and cell type conversions to a group of selected cells.

We hope these changes help you when researching new ideas. Please let us know if you encounter any issues with the changes.

Note: We recommend using the latest stable version of Chrome, Safari or Firefox. Some community members have encountered issues when using IE.

18 responses

Ooooh, awesome!

I had been working around this price slowness by instead loading the entire Q1500US prices FIRST and storing all of that in memory just once, but then of course working with that was much slower if your real universe was also much smaller, and then changing dates was still a pain. Much needed update, works great!

Hi James - new updates seem pretty impressive and really appreciate how Q keeps moving forward with updates and improvements.

That being said, how can users keep their previous research tools from going obsolete when updates are made? For example, this update seems to include a lot of other changes that have caused quite a few of my research/analysis notebooks to stop working.

An example is attached. It is a 'deeper' tear sheet than the regular one available, and since this upgrade it is riddled with security violations. Is there a solution for me other than spending another double-digit number of hours getting it back up and running?

cross posted: https://www.quantopian.com/posts/pd-dot-options-deeper-tear-sheet-algo-analysis-changes-to-research

The time to run get_pricing for 3000 stocks for a year went from 240 seconds to 11 seconds per run. It ran so fast that at first I thought I had a new bug.

THANK YOU!

Does the fact that get_pricing returns adjusted data mean that the code below would be exposed to look-ahead bias, since it would use price data adjusted for future dividends and splits?

data = get_pricing(...)  
algo = TradingAlgorithm(...)  
input_data = data.transpose(2, 1, 0)  
algo.run(input_data)

It seems get_pricing doesn't return recent data.

Here's my code to get the last 365 days of closing prices. As you can see in the printout, I see good data for the early dates when I print head, but no data (NaN) for recent dates when I print tail. Is it a bug?

import datetime
from datetime import timedelta

# Request one year of daily closes ending on a fixed date
today = datetime.date(2017, 2, 28)
end_date = today.strftime('%Y-%m-%d')
oneyear = today - timedelta(days=365)
start_date = oneyear.strftime('%Y-%m-%d')
asset_list = ['xlb', 'xle', 'xlf', 'xli', 'xlk', 'xlp', 'xlu', 'xlv', 'xly', 'vnq', 'efa', 'eem', 'gld', 'hyg', 'lqd', 'tlt']
prices = get_pricing(asset_list, start_date=start_date, end_date=end_date, fields=['close_price'])
print prices['close_price'].head()
print prices['close_price'].tail()

                       Equity(19654 [XLB])  Equity(19655 [XLE])  \  

2016-02-29 00:00:00+00:00 40.954 55.272
2016-03-01 00:00:00+00:00 42.003 56.608
2016-03-02 00:00:00+00:00 41.856 58.012
2016-03-03 00:00:00+00:00 42.101 58.910
2016-03-04 00:00:00+00:00 42.571 59.466

                       Equity(19656 [XLF])  Equity(19657 [XLI])  \  

2016-02-29 00:00:00+00:00 16.785 50.963
2016-03-01 00:00:00+00:00 17.374 51.985
2016-03-02 00:00:00+00:00 17.533 52.024
2016-03-03 00:00:00+00:00 17.652 52.386
2016-03-04 00:00:00+00:00 17.728 52.582

                       Equity(19658 [XLK])  Equity(19659 [XLP])  \  

2016-02-29 00:00:00+00:00 40.236 49.666
2016-03-01 00:00:00+00:00 41.410 50.174
2016-03-02 00:00:00+00:00 41.527 50.286
2016-03-03 00:00:00+00:00 41.527 50.540
2016-03-04 00:00:00+00:00 41.626 50.764

                       Equity(19660 [XLU])  Equity(19661 [XLV])  \  

2016-02-29 00:00:00+00:00 44.763 65.170
2016-03-01 00:00:00+00:00 44.526 66.587
2016-03-02 00:00:00+00:00 44.763 66.695
2016-03-03 00:00:00+00:00 45.044 66.484
2016-03-04 00:00:00+00:00 45.551 66.351

                       Equity(19662 [XLY])  Equity(26669 [VNQ])  \  

2016-02-29 00:00:00+00:00 73.169 73.153
2016-03-01 00:00:00+00:00 75.095 75.184
2016-03-02 00:00:00+00:00 75.017 75.746
2016-03-03 00:00:00+00:00 75.469 76.128
2016-03-04 00:00:00+00:00 75.420 76.128

                       Equity(22972 [EFA])  Equity(24705 [EEM])  \  

2016-02-29 00:00:00+00:00 51.983 29.733
2016-03-01 00:00:00+00:00 53.360 30.803
2016-03-02 00:00:00+00:00 53.748 31.215
2016-03-03 00:00:00+00:00 54.281 31.562
2016-03-04 00:00:00+00:00 54.640 32.186

                       Equity(26807 [GLD])  Equity(33655 [HYG])  \  

2016-02-29 00:00:00+00:00 118.630 75.874
2016-03-01 00:00:00+00:00 117.775 77.073
2016-03-02 00:00:00+00:00 118.670 76.539
2016-03-03 00:00:00+00:00 120.720 76.825
2016-03-04 00:00:00+00:00 120.560 77.044

                       Equity(23881 [LQD])  Equity(23921 [TLT])  

2016-02-29 00:00:00+00:00 111.370 127.744
2016-03-01 00:00:00+00:00 110.729 125.650
2016-03-02 00:00:00+00:00 110.928 126.134
2016-03-03 00:00:00+00:00 111.462 126.608
2016-03-04 00:00:00+00:00 111.443 125.806

                       Equity(19654 [XLB])  Equity(19655 [XLE])  \
2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(19656 [XLF])  Equity(19657 [XLI])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(19658 [XLK])  Equity(19659 [XLP])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(19660 [XLU])  Equity(19661 [XLV])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(19662 [XLY])  Equity(26669 [VNQ])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(22972 [EFA])  Equity(24705 [EEM])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(26807 [GLD])  Equity(33655 [HYG])  \  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

                       Equity(23881 [LQD])  Equity(23921 [TLT])  

2017-02-22 00:00:00+00:00 NaN NaN
2017-02-23 00:00:00+00:00 NaN NaN
2017-02-24 00:00:00+00:00 NaN NaN
2017-02-27 00:00:00+00:00 NaN NaN
2017-02-28 00:00:00+00:00 NaN NaN

Yes, it looks like data is missing from Feb 22 until now.

Same for me, not getting pricing data after 22 Feb 2017

I'm getting a 500 error in a notebook when using get_pricing().

I have the same question as junwi c.

get_pricing() doesn't work again

import time
curday = time.strftime('%Y-%m-%d', time.localtime(time.time()))
l20day = time.strftime('%Y-%m-%d', time.localtime(time.time() - 20*24*3600))

baba_day_closes = get_pricing(
    'BABA',
    fields='close_price',  # modify to price, open_price, high, low or volume to change the field
    start_date=l20day,     # customize your pricing date range
    end_date=curday,
    frequency='daily',     # change to minute for minute pricing
)

# matplotlib is installed for easy plotting

print baba_day_closes

=======================
2017-02-13 00:00:00+00:00 103.10
2017-02-14 00:00:00+00:00 101.56
2017-02-15 00:00:00+00:00 101.56
2017-02-16 00:00:00+00:00 100.79
2017-02-17 00:00:00+00:00 100.49
2017-02-21 00:00:00+00:00 102.14
2017-02-22 00:00:00+00:00 NaN
2017-02-23 00:00:00+00:00 NaN
2017-02-24 00:00:00+00:00 NaN
2017-02-27 00:00:00+00:00 NaN
2017-02-28 00:00:00+00:00 NaN
2017-03-01 00:00:00+00:00 NaN
2017-03-02 00:00:00+00:00 NaN
2017-03-03 00:00:00+00:00 NaN
Name: Equity(47740 [BABA]), dtype: float64
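For anyone wanting to check quickly where the data stops, here is a small sketch that finds the last date with a real value. It uses plain Python and a few rows copied from the output above; the helper name `last_valid_date` is made up for illustration.

```python
import math
from datetime import date


def last_valid_date(rows):
    """Return the latest date whose value is not NaN, or None if every value is NaN."""
    valid = [day for day, value in rows if not math.isnan(value)]
    return max(valid) if valid else None


# A few rows taken from the BABA output above.
closes = [
    (date(2017, 2, 17), 100.49),
    (date(2017, 2, 21), 102.14),
    (date(2017, 2, 22), float("nan")),
    (date(2017, 2, 23), float("nan")),
]

cutoff = last_valid_date(closes)  # last day that still has a close price
```

In a notebook the result of get_pricing is a pandas object, so calling `last_valid_index()` on the returned Series should give the same answer directly.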

get_pricing() doesn't work again for me also

Ditto what the folks above said. NaNs after 2/22. Welp!

Given that get_pricing now uses "the same pricing and volume data as the rest of the platform" this behavior would seem odd. I'd think that it would be pulling directly from the same database(s). Maybe there is some intermediate plumbing that has a bug? Are there NaNs output from the backtester, as well?

We have a fix for the 500 errors that will be going out this morning. We're also looking into the fact that there's no data after 2/22. Sorry for the inconvenience.

I'll post back here when the fix has been shipped.

Hello,

I have tested get_pricing and Pipeline using SPY data.
From the test, I can see that the Pipeline data is shifted by one day relative to the get_pricing data.

For example, the get_pricing "open_price" value as of 2017-11-30 matches the pipeline "latest_open" value as of 2017-12-01, as you can see in the test.

The 1 day 'shift' between pipeline output and 'get_pricing' output is really a matter of semantics.

The 'get_pricing' method returns prices (and volume) for specific dates and times, at either daily or minutely frequency. The values returned are the same as would be returned by the 'history' method in an algorithm. If one runs the 'history' method in an algorithm at noon on 2017-12-01, one could get prices as of noon 2017-12-01. In other words, the dates in the 'history' and 'get_pricing' output label the data for those same dates.

The pipeline output, however, returns prices (and a lot of other data) as of market close on the previous day. A pipeline is typically run in the "before_trading_start" method of an algorithm, but it can be run at any time during the day. The thing to note is that if a pipeline is run at any time on 2017-12-01, it will always return data as of the previous close on 2017-11-30.

The notebook environment maintains the same framework as algorithms, so the one-day 'shift' is semantic. When running a pipeline, the dates reflect the dates on which the pipeline is run, not the dates of the data. A pipeline always returns data as of the previous trading day, which means the run date is one trading day later than the data date. The data returned by a pipeline is the data that an algorithm would see as of that date (i.e. the previous day's data), not the data for that date.
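A rough sketch of that relabelling, in plain Python with made-up numbers (the open prices and the helper name `pipeline_view` are hypothetical, and real trading calendars also skip weekends and holidays):

```python
from datetime import date


def pipeline_view(daily_values, trading_days):
    """Relabel daily values the way a pipeline would: each run date sees
    the value from the previous trading day."""
    view = {}
    for previous_day, run_day in zip(trading_days, trading_days[1:]):
        view[run_day] = daily_values[previous_day]
    return view


# Three consecutive trading days and made-up open prices,
# labelled by the date of the data (the get_pricing convention).
trading_days = [date(2017, 11, 29), date(2017, 11, 30), date(2017, 12, 1)]
open_prices = {
    date(2017, 11, 29): 263.0,
    date(2017, 11, 30): 263.8,
    date(2017, 12, 1): 264.8,
}

# Labelled by run date (the pipeline convention): the row for 2017-12-01
# carries the 2017-11-30 value, and the first day has no earlier data.
shifted = pipeline_view(open_prices, trading_days)
```

So to compare the two outputs side by side, one would shift one of them by a trading day rather than by a calendar day.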

Hope that clears it up?

Hi Dan,
Thank you! It's clear now.