Global equities price data

G'day, desperately need some help.

A broad overview of what I am trying to achieve:
1. Get pricing data for a hand-picked list of ETFs that are listed on the ASX
2. Compute a Spearman's rank correlation coefficient matrix of those ETFs

For each ETF, I am trying to add a pandas Series of closing prices (with a datetime index) as a column of a pandas DataFrame.
I am getting NaNs for the second column (probably because the index of the first contains two levels: datetime and zipline equity).

Any help would be much appreciated.

2 responses

Great questions!

The biggest difference (challenge) when working with non-US securities is the current need to select securities by SID and not symbol. One typically first gets a pipeline of all securities, then searches by symbol to find the SID.

So, first, 1. get pricing data for all securities. This can be done in the pipeline: prices can be returned as a pipeline column, so there is no need for a second step to add them.

Next, 2a. select a subset of securities by symbol. This can be done by adding a column of symbols, then using the query method to select only certain symbols. There are of course other ways to do this too.
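Here is a minimal sketch of that symbol filter in plain pandas. It does not call the pipeline API; the `results` frame is a made-up stand-in for pipeline output indexed by (date, sid), and the sids, symbols, and prices are all hypothetical.

```python
import pandas as pd

# Synthetic stand-in for pipeline output: one row per (date, sid).
# The sids, symbols, and prices are invented for illustration only.
dates = pd.date_range("2020-01-01", periods=3, freq="D")
sids = [1001, 1002, 1003]
index = pd.MultiIndex.from_product([dates, sids], names=["date", "sid"])
symbol_by_sid = {1001: "AAA", 1002: "BBB", 1003: "CCC"}

results = pd.DataFrame(
    {
        # A plain column of symbols alongside the prices.
        "symbol": [symbol_by_sid[sid] for _, sid in index],
        "close": [10.0, 20.0, 30.0, 10.5, 19.5, 31.0, 11.0, 19.0, 32.0],
    },
    index=index,
)

# Keep only the hand-picked symbols; @wanted refers to the Python variable.
wanted = ["AAA", "CCC"]
subset = results.query("symbol in @wanted")
```

The same selection could be written with `results[results["symbol"].isin(wanted)]`; `query` just reads a little more cleanly.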

Finally, 2b. get the Spearman's rank correlation coefficient matrix of the securities. This can be done with a single call to the pandas corr method, passing method='spearman' (the default is Pearson). However, this method expects the securities to be in columns. Use the unstack method first to move the securities index level into columns.
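A small sketch of the unstack-then-corr step, again on synthetic data rather than real pipeline output. The symbols and values are assumptions chosen so the result is easy to check by eye.

```python
import pandas as pd

# Long-format data indexed by (date, symbol), similar in shape to a
# pipeline result. All values here are invented for illustration.
dates = pd.date_range("2020-01-01", periods=5, freq="D")
symbols = ["AAA", "BBB", "CCC"]
index = pd.MultiIndex.from_product([dates, symbols], names=["date", "symbol"])

values = []
for a, b, c in zip([1, 2, 3, 4, 5], [10, 20, 40, 80, 160], [5, 4, 3, 2, 1]):
    values.extend([a, b, c])  # one row per symbol for each date
long_values = pd.Series(values, index=index, name="value", dtype=float)

# Move the symbol level into columns: one row per date, one column per symbol.
wide = long_values.unstack("symbol")

# Rank correlation between every pair of columns.
spearman = wide.corr(method="spearman")
```

AAA and BBB both rise on every date, so their rank correlation is 1 even though the relationship is not linear; CCC falls on every date, so it is -1 against both.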

That's it!

There are a couple of issues with this, however. One doesn't typically want to find correlations of prices but rather of returns. See this post for more insight and a notebook which highlights the pitfall of looking for correlations in prices (https://www.quantopian.com/posts/correlation-between-prices-or-returns). The other issue is that a pipeline returns prices which are adjusted as of each pipeline date. If there were a stock split, the pipeline's price time series would show it as a large price change, which is probably not what you want. Returns, however, aren't affected by splits. Using returns instead of prices fixes both of these issues.
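To make the prices-versus-returns point concrete, here is a rough sketch that converts a wide price table to daily returns with pandas before correlating. It only illustrates the conversion on made-up prices; it does not reproduce how the pipeline adjusts for splits, and in the pipeline itself one would request returns directly rather than derive them from prices.

```python
import pandas as pd

# Hypothetical closing prices, one column per symbol (values invented).
dates = pd.date_range("2020-01-01", periods=6, freq="D")
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 101.0, 105.0, 107.0, 106.0],
        "BBB": [50.0, 51.0, 50.5, 52.5, 53.5, 53.0],  # always half of AAA
        "CCC": [20.0, 19.0, 21.0, 20.5, 22.0, 21.0],
    },
    index=dates,
)

# Simple daily returns; the first row has no previous price, so drop it.
returns = prices.pct_change().dropna()

# Spearman rank correlation of returns rather than of price levels.
returns_corr = returns.corr(method="spearman")
```

Because BBB is a constant multiple of AAA, the two have identical returns and a rank correlation of 1, regardless of the difference in price level.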

See the attached notebook, which also shows an easy way to visualize the output as a heatmap using seaborn.
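Since notebook previews don't always load, here is a rough sketch of what a seaborn heatmap of a correlation matrix can look like. The matrix values and labels are placeholders, not results from the notebook, and the styling choices (colormap, annotations) are just one reasonable option.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Placeholder correlation matrix; in practice this would be the output of
# DataFrame.corr(method="spearman"). Labels and values are invented.
labels = ["AAA", "BBB", "CCC"]
corr = pd.DataFrame(
    [[1.0, 0.8, -0.2],
     [0.8, 1.0, 0.1],
     [-0.2, 0.1, 1.0]],
    index=labels,
    columns=labels,
)

fig, ax = plt.subplots(figsize=(4, 3))
# Fix the color scale to [-1, 1] so colors are comparable across runs.
sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm", square=True, ax=ax)
ax.set_title("Spearman correlation of returns")
fig.tight_layout()
```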
