Minute-bar data in Research

Please help me with some basic questions. I want to use the Research environment to do some analysis on a pair of securities. My task requires that I examine performance differences between these two highly uncorrelated securities during historical market swoons, corrections and crashes. I desire to examine historical data over relatively short date ranges, from a few days to perhaps six weeks. I want to catalog and statistically characterize the relative behavior of this pair of securities during 15 to 20 such market dips. Especially when studying events of only a few days duration, I will want to generate moving averages using data bars of less than one day. Probably 15 minute to 30 minute bars would capture the behavior I hope to observe.

Note that the object of this study is preliminary research, for the purpose of setting some buy and sell signal parameters to be used in a trading algorithm, eventually. Several factors make the Quantopian Research environment potentially a good place to do this work, if it is possible, namely:

  • The availability of the pandas library means that at least some of the statistical number crunching - as opposed to simple data extraction - could probably be performed in the Research environment.

  • Quantopian prominently states, "We have minute-bar historical data for US equities and US futures since 2002 up to the most recently completed trading day." I need some of that data for my analysis to be meaningful.

  • The research environment's ability to make charts, critical for visually verifying the date ranges to be studied, appears to be up to the task.

Here's the rub: I see no evidence that the data.history object is available in the Research environment. Please tell me this is not so! I can find no working example of code using data.history that runs in the Research environment. If there is one, please direct me to it and that will be all I need for now. As far as I can tell, get_pricing is the main available tool in the Research environment, and that offers only daily close prices. Please tell me, how am I to access the 15- and 30-minute data I need?

On July 27, 2016, in a response to a question by "Adam," Nathan Wolfe said, "Algorithm code can't be run directly in Research." (https://www.quantopian.com/posts/name-context-is-not-defined-in-research-notebook). What does this mean, precisely? If the "initialize" and "handle_data" functions are unavailable in the Research environment, then I would like to know some other way of accessing minute-bar data. I don't see my current research naturally taking the form of a trading algorithm, so I am a bit stuck. Is there a way to access minute-bar data in the Research environment using a pipeline? That would interest me. Could you point me to examples? Thank you.

3 responses

Minute data is available in Research using the "get_pricing" method. Specify frequency='minute', something like this:

prices = get_pricing(['MSFT'], start_date="2012-1-5", end_date="2015-6-1", fields='price', frequency='minute')  

The 'get_pricing' method returns a pandas Panel, DataFrame, or Series, depending on the parameters selected. See the docs: https://www.quantopian.com/help#quantopian_research_get_pricing
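
For the 15- and 30-minute bars asked about above, one option is to resample the minute series with pandas once it has been fetched. The following is only a sketch: get_pricing exists only inside Quantopian Research, so a synthetic minute series stands in for its output here, and the names minute_close, bars_15 and sma_4 are made up for illustration. It also assumes each bar is stamped with the time at which it ends.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for minute closes (assumption: real data would come from
# get_pricing(..., frequency='minute') inside Quantopian Research).
# One regular session, 09:31 through 16:00, is 390 one-minute bars.
index = pd.date_range("2015-06-01 09:31", "2015-06-01 16:00", freq="min")
rng = np.random.default_rng(0)
minute_close = pd.Series(
    100 + rng.normal(0, 0.05, len(index)).cumsum(), index=index, name="close"
)

# Build 15-minute bars. Each bin covers (start, end] and is labeled by its end
# time, so the bar stamped 09:45 holds the close of the 09:45 minute.
bars_15 = (
    minute_close.resample("15min", closed="right", label="right").last().dropna()
)

# Simple moving average over 4 bars, i.e. one hour of 15-minute bars.
sma_4 = bars_15.rolling(window=4).mean()
```

Changing "15min" to "30min" gives 30-minute bars; the first window - 1 values of the moving average are NaN because the window is not yet full.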

Attached is an example notebook. My experience with Quantopian has been very positive and it sounds like your use case should work well.

Good luck!

Dan,

Thank you for the link to your notebook. I was indeed able to get minute-bar data! Just for reference, the last line in your notebook, "my_prices.xsclose_price", threw an error, interpreting xsclose_price as an undefined attribute. I found I had to use "my_prices.xs(key='close_price',axis=0)" to get the desired results. I have no idea why, but the axis argument (equal to zero) was necessary to avoid an error. This seemed to me not very Python-like. I found some documentation (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.xs.html) for pandas 0.22.0 naming four arguments to DataFrame.xs, namely:

  • key : object - Some label contained in the index, or partially in a MultiIndex
  • axis : int, default 0 - Axis to retrieve cross-section on
  • level : object, defaults to first n levels (n=1 or len(key)) - In case of a key partially contained in a MultiIndex, indicate which levels are used. Levels can be referred by label or position.
  • drop_level : boolean, default True - If False, returns object with same levels as self.
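
A possible explanation for the axis argument, offered as a guess rather than a confirmed fact: if get_pricing returned a Panel here (several symbols and several fields), then Panel.xs, as far as I recall, defaulted to axis=1 (the major axis, i.e. the timestamps), so selecting a field required axis=0 (the items). Panel has since been removed from pandas, so the sketch below uses a MultiIndex DataFrame as a stand-in to show how key, axis and level interact. The prices and the names frame, close, wide and wide_close are invented for illustration.

```python
import pandas as pd

# Stand-in data: rows are indexed by (field, time), columns are symbols.
fields = ["open_price", "close_price"]
times = pd.to_datetime(["2015-06-01 09:31", "2015-06-01 09:32"])
index = pd.MultiIndex.from_product([fields, times], names=["field", "time"])
frame = pd.DataFrame(
    {"MSFT": [46.0, 46.1, 46.05, 46.2], "AAPL": [130.0, 130.2, 130.1, 130.3]},
    index=index,
)

# The field labels live in the row index, so the cross-section is on axis=0.
close = frame.xs(key="close_price", axis=0, level="field")

# Move the field level into the columns. The same key must now be looked up
# on axis=1; the default axis=0 would search the rows and raise a KeyError.
wide = frame.unstack("field")
wide_close = wide.xs(key="close_price", axis=1, level="field")
```

Both close and wide_close end up as time-by-symbol tables of closing prices; only the axis on which the key is searched differs.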

Clearly one may pack a lot of data into a dataframe.
Do you know, does Quantopian use pandas 0.22.0 at this point?

Thanks,

Tom