research platform - how to get a nice heatmap?

How can I generate a color heatmap? I was able to get this drab gray-scale one. Any way to add color? Also, when I tried the seaborn approach (commented out in the code), I got just a blank, white heatmap.

Once I get the heatmap working properly, I'll re-post, with comments in the code, since it may be of general interest how to visualize relationships versus trading minute (horizontal axis) and trading day (vertical axis).

13 responses
import matplotlib.pyplot as plt

plt.pcolor(ht_map, cmap='hot')  # 'hot' is one of matplotlib's built-in colour maps
plt.colorbar()
plt.clim(ht_map.min().min(), ht_map.max().max())  # scale the colours to the data range

The list of colour maps is in the matplotlib documentation.

Looks like an interesting event between minutes 30 and 40 on a lot of days?

With that many cells, the seaborn heatmap will show only cell borders, which is why you got a blank white plot. You need to set linewidths=0 as a kwarg.
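
A minimal sketch of that fix, assuming ht_map is the days-by-minutes DataFrame from the notebook:

import seaborn as sns
import matplotlib.pyplot as plt

# With thousands of cells, the default white cell borders cover the colours
# entirely, producing the blank white heatmap described above.
sns.heatmap(ht_map, linewidths=0, cmap='hot',
            xticklabels=False, yticklabels=False)
plt.show()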


Grant, would you mind explaining briefly what this is a map of? It looks like it might be implemented faster without the lambda function, using just matrix operations?

I have a modified notebook with vectorized Z calculations and an alternate heatmap, but how do I share it?

-- I see now, there's a button that appears when hovering over a cell.

Thanks all for the tips.

@ Simon,

I'm computing the difference in the volume z-scores for SPY & SH, with a trailing window of 390 minutes. I'll have to think about how that might be done with matrix operations. The problem I see is that the computation is done on a rolling basis, so for every minute, a new current z-score needs to be computed. However, perhaps the mean and standard deviation computations required to compute the z-score could be sped up, using an online algorithm (see http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance). In other words, replace stats.zmap with a custom z-score computation that is of the online type.
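
As a sketch of that idea (an illustration, not code from the notebook): keep running sums over the trailing window, so each new minute is an O(1) update instead of a full zmap call over 390 points. The Wikipedia page above recommends Welford's update for numerical stability; the running-sums version below is the simplest form.

from collections import deque

class RollingZScore:
    """Online z-score over a fixed trailing window. Updates running sums
    in O(1) per observation instead of recomputing over all 390 minutes.
    (Illustrative names; the running-sums form can lose precision for
    large values, where Welford's update is the stable alternative.)"""

    def __init__(self, window=390):
        self.window = window
        self.values = deque()
        self.total = 0.0
        self.total_sq = 0.0

    def update(self, x):
        self.values.append(x)
        self.total += x
        self.total_sq += x * x
        if len(self.values) > self.window:
            old = self.values.popleft()
            self.total -= old
            self.total_sq -= old * old

    def zscore(self, x):
        n = len(self.values)
        if n < 2:
            return 0.0
        mean = self.total / n
        var = self.total_sq / n - mean * mean  # ddof=0, matching stats.zmap's default
        if var <= 0.0:
            return 0.0
        return (x - mean) / var ** 0.5

# After rz.update(x), rz.zscore(x) plays the role of stats.zmap(x, window):
# rz = RollingZScore(390)
# for x in volumes: rz.update(x); z = rz.zscore(x)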

The calculation can use rolling_mean and rolling_std from Pandas, as in the notebook that I shared.
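
Something along these lines, presumably (a sketch with stand-in data; the rolling_mean and rolling_std of that era of pandas are spelled .rolling(...).mean() and .std() in current versions):

import numpy as np
import pandas as pd

def rolling_zscore(s, window=390):
    # z-score of each point relative to its trailing window
    mean = s.rolling(window).mean()
    std = s.rolling(window).std(ddof=0)  # ddof=0 matches stats.zmap's default
    return (s - mean) / std

# Illustrative stand-ins for the minutely SPY and SH volume series.
idx = pd.date_range('2015-01-02 09:31', periods=390 * 5, freq='min')
spy_vol = pd.Series(np.random.lognormal(10, 1, len(idx)), index=idx)
sh_vol = pd.Series(np.random.lognormal(7, 1, len(idx)), index=idx)

z_diff = rolling_zscore(sh_vol) - rolling_zscore(spy_vol)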

I see. I am playing around with groupby to try to get the z-scores relative only to their minute buckets, since that would likely better highlight abnormal conditions.
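
For what it's worth, the minute-bucket version might look like this (a sketch with stand-in data; vol stands for a minutely volume series):

import numpy as np
import pandas as pd

# Stand-in minutely volume series (crude: ignores the 9:31-16:00 session).
idx = pd.date_range('2015-01-02 09:31', periods=390 * 5, freq='min')
vol = pd.Series(np.random.lognormal(10, 1, len(idx)), index=idx)

# z-score each observation against the same clock minute on other days,
# rather than against the trailing 390 minutes.
minute_of_day = vol.index.strftime('%H:%M')
z_by_minute = vol.groupby(minute_of_day).transform(
    lambda s: (s - s.mean()) / s.std(ddof=0))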

@ Simon, I'm not sure what you mean by a minute bucket. By definition, a z-score is computed using a trailing window of data, in this context.

Is this code already doing what you want?

from scipy import stats
stats.zmap(y[-1], y) - stats.zmap(x[-1], x)  # z-score of the latest value in each series

See http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.zmap.html#scipy.stats.zmap. Since the 'scores' argument is a scalar here, a single value is returned: the z-score for the current minute relative to its trailing window.

But aren't you comparing volumes against the historical volumes over the last 390 trading minutes? The morning volumes are always going to be higher because of the intraday seasonality of volume. I am personally more interested in seeing truly abnormal data, for instance by comparing the volume of today's 9:40 against 9:40 of the previous six months. If that has a significant deviation, then something happened that doesn't usually happen at 9:40, which to me is more interesting.

We are probably looking for different things. :)

Alternatively, we could take the ratio of SH/SPY volume and compute some stats from that raw ratio data, which might also be interesting.
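
For example (reusing stand-in volume series like the ones sketched above):

import numpy as np
import pandas as pd

# Stand-in minutely volume series, as in the earlier sketches.
idx = pd.date_range('2015-01-02 09:31', periods=390 * 5, freq='min')
spy_vol = pd.Series(np.random.lognormal(10, 1, len(idx)), index=idx)
sh_vol = pd.Series(np.random.lognormal(7, 1, len(idx)), index=idx)

ratio = sh_vol / spy_vol
print(ratio.describe())  # location and spread of the raw ratio
# ...or flag anomalies with a rolling z-score of the ratio itself:
ratio_z = (ratio - ratio.rolling(390).mean()) / ratio.rolling(390).std(ddof=0)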

The heatmap should flag anomalies for a given time of the trading day. Each column represents a trading minute, so if you see a 9:40 blip (or a blip decaying horizontally to the right), then something happened at 9:40. In other words, just look down the 9:40 column for 6 months. If there are no blips, then nothing special ever happens at 9:40.

But aren't you calculating the z-score of each minute vs. the previous 390 minutes, not vs. the previous N days of that minute?

For instance, you have a blip at 10am, but there is always a blip at 10am. So to my way of thinking, it's not a blip; it's just the way 10am is every day...

I kinda see what you're saying. One could examine changes at each point in time, across N days. Perhaps certain times of the day, there is an anomalous volume relationship between SPY & SH, and then one could dig further into the data to see if there is any predictive power in prices due to the anomaly.

In my likely misguided mind, SPY & SH should be in a leader-follower relationship. I have the view that SPY is just an algorithm that has an overall objective of replicating an investment in the S&P 500 index. So, at every instant of time, it is collecting information about the S&P 500 index, and making adjustments to minimize the tracking error, including issuing new shares or pulling shares off of the market. Additionally, shares of SPY are bouncing around the globe, from individual to individual and from institution to institution, in a chaotic process to achieve collective planetary financial nirvana. And then some clever financial engineers and programmers come along and devise SH, which as an algorithm also uses the S&P 500 as its input, and tries to achieve the inverse return at the end of the trading day. It magically accomplishes this feat by investing in some voodoo financial instruments, rather than by directly shorting the individual S&P 500 constituent stocks. And just like SPY, there are individuals and institutions buying and selling shares of SH throughout the day, hoping to achieve some benefit to themselves or humankind. In this context, when I say "individuals and institutions" I figure we are really talking about automated trading robots. One interesting figure is that SPY is considerably cheaper than SH, with respective expense ratios of 0.1% versus 0.9%, so the voodoo comes at a price.

Since SPY and SH are both keyed off the S&P 500 index, naively one would think that at any instant of time they would be in perfect harmony by any statistical measure. Maybe if all of the individuals and institutions stopped trading SPY and SH (the zero-volume limit), the respective ETF algos would bring everything into ideal synchronization?

Weird house of cards. What if, one day, Standard & Poor's just decided to stop publishing their index? It's their index, right? They could just stop. The financial world would probably come to a standstill! It would be hilarious.