In Research, data can be accessed in several ways. This section explores each of them and when you might want to use it.
One of the simplest data lookups you can do in Research is with the symbols() method, which lets you look up a particular asset by supplying a ticker symbol or SID. symbols() returns an Equity, which contains various properties of the asset, such as its exchange. Certain functions in the Quantopian API require an Equity as input, and symbols() is a way to create an Equity instance for a particular equity in Research.
    from quantopian.pipeline import Pipeline
    from quantopian.research import run_pipeline

    def make_pipeline():
        # Empty Pipeline
        return Pipeline()

    run_pipeline(make_pipeline(), '2017-01-01', '2017-05-01')
This will return an empty DataFrame with a MultiIndex (where level 0 is dates, and level 1 is assets). For example, the following might be the first few rows of this empty pipeline:
Running pipelines in Research can take time, especially for very computationally expensive calculations. Because of this, a progress bar is displayed underneath a cell whenever you run a pipeline to give you a sense of how long it will take.
Beyond pipeline, there are other ways to retrieve pricing-related data in Research. All of the following convenience methods retrieve data for assets in a date range, with the particular data varying by method:
prices(): Retrieves close prices.
returns(): Retrieves returns.
volumes(): Retrieves trading volumes.
log_prices(): Retrieves logarithmic prices.
log_returns(): Retrieves logarithmic returns.
get_pricing(): Retrieves open/high/low/close prices and volume data.
See the Research API Reference for full documentation.
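To make the relationship between these series concrete, here is a minimal plain-Python sketch of how simple returns, logarithmic prices, and logarithmic returns are conventionally derived from a series of close prices. It does not call the Research API: the helper names and the sample prices are hypothetical, and it assumes the methods above follow the standard definitions.

```python
import math


def simple_returns(closes):
    """Period-over-period change: p[t] / p[t-1] - 1."""
    return [curr / prev - 1.0 for prev, curr in zip(closes, closes[1:])]


def log_prices(closes):
    """Natural logarithm of each close price."""
    return [math.log(p) for p in closes]


def log_returns(closes):
    """Difference of consecutive log prices, i.e. log(p[t] / p[t-1])."""
    logs = log_prices(closes)
    return [curr - prev for prev, curr in zip(logs, logs[1:])]


# Hypothetical close prices for three consecutive sessions.
closes = [100.0, 102.0, 99.96]

r = simple_returns(closes)
lr = log_returns(closes)

# Log returns add up across periods, which simple returns do not.
total_log_return = sum(lr)
```

One reason to prefer logarithmic returns in analysis is visible in the last line: summing them over a window gives the log of the total price change over that window.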
You can get open/high/low/close pricing and volume data in pipeline via the EquityPricing dataset. The methods mentioned here can sometimes be easier to use if you just want to look up a price series for a single asset or a small set of assets. You can also use these convenience functions to get minute frequency pricing and volume data, which is not currently supported in pipeline.
An important feature of the price-retrieving Research API methods is that all returned prices are adjusted as of the end_date supplied to the method. This differs from how pipeline and the backtester apply adjustments, where all prices are adjusted as of the simulation date.
Another way to think about adjustments in Research API methods is that the end_date argument also represents a "perspective date". Any corporate actions that occurred before the end_date will be applied to pricing and volume data retrieved by a Research API method.
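The effect of the perspective date can be illustrated with a small plain-Python sketch. It is not the Research API implementation: the adjust_as_of helper, the sample prices, and the 2-for-1 split are all hypothetical, only splits are modeled, and treating a split that takes effect exactly on the perspective date as visible is an assumption made for the example.

```python
from datetime import date


def adjust_as_of(raw_closes, splits, perspective_date):
    """Return closes up to perspective_date, adjusted for splits known by then.

    raw_closes: dict mapping date -> unadjusted close price
    splits: list of (effective_date, ratio); ratio 2.0 means a 2-for-1 split
    """
    adjusted = {}
    for day, close in raw_closes.items():
        if day > perspective_date:
            continue  # data after the perspective date is not returned
        factor = 1.0
        for effective, ratio in splits:
            # Apply a split only if it took effect after this bar and is
            # visible from the perspective date (boundary is an assumption).
            if day < effective <= perspective_date:
                factor *= ratio
        adjusted[day] = close / factor
    return adjusted


# Hypothetical raw closes around a 2-for-1 split effective 2017-01-05.
raw = {
    date(2017, 1, 3): 100.0,
    date(2017, 1, 4): 102.0,
    date(2017, 1, 5): 51.5,
}
splits = [(date(2017, 1, 5), 2.0)]

before_split = adjust_as_of(raw, splits, date(2017, 1, 4))
after_split = adjust_as_of(raw, splits, date(2017, 1, 5))
```

In this sketch, the close for 2017-01-03 is 100.0 when viewed from 2017-01-04 but 50.0 when viewed from 2017-01-05, because the later perspective date makes the split visible. The same historical bar can therefore differ between two queries that use different end dates.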
In many cases, it may be useful to explore risk loadings and risk factor returns data in Research. Functions to import risk data are available in the quantopian.research.experimental namespace; for a full list of functions and documentation, see the API Reference.
Monitoring Custom Dataset Loads
After uploading a custom dataset via Self Serve, you can monitor the status of your dataset's historical loads and live updates in Research using Load Metrics.
Load Metrics are accessible via quantopian.interactive.data. For example, to monitor a dataset called my_dataset:
    from quantopian.interactive.data import user_<your user ID>

    load_metrics = user_<your user ID>.load_metrics
    load_metrics[load_metrics.dataset == 'my_dataset']
To learn more about the information you can get from Load Metrics, see the Self-Serve Data documentation.