Looking back at the original question ("I need some historical VIX data to generate alpha signal to trade stocks. How would I go about it?"), this is exactly the workflow/model that pipeline supports:
- Fetch historical data
- Manipulate data
- Output a factor which associates an alpha or signal value with each security
The key is to manipulate the data inside the pipeline (in this case). Don't focus on getting the historical VIX out of pipeline. Rather, focus on moving the code which generates the alpha signal into pipeline. Maybe just a paradigm shift?
For example, a simple alpha signal may be "go long on stocks when the current VIX is less than the 10 day moving average of VIX, otherwise go short". This may be based upon the premise that VIX is generally inversely correlated with the market. How to code this signal inside of pipeline? Something like this:
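from quantopian.pipeline.data.quandl import cboe_vix
from quantopian.pipeline.factors import SimpleMovingAverage

# Most recent VIX close
vix_current = cboe_vix.vix_close.latest
# 10 day simple moving average of the VIX close
vix_ma_10 = SimpleMovingAverage(inputs=[cboe_vix.vix_close], window_length=10)
# True (go long) when the current VIX is below its moving average, otherwise False (go short)
long = vix_current < vix_ma_10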
A couple of things to notice. While the VIX dataset, like many of the macroeconomic datasets, is technically a 'slice' in that there isn't a value associated with each asset, one can still combine it with regular datasets. Its values are broadcast to each asset behind the scenes. Additionally, one doesn't always need a custom factor. In this case, 'vix_close' works fine as an input to a built-in factor such as 'SimpleMovingAverage'.
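To make the rule and the broadcasting concrete outside of pipeline, here is a minimal plain-Python sketch of the same comparison. The VIX closes and tickers are made-up illustration values, and the helper name `vix_long_signal` is hypothetical. It only mimics how a single market-wide value ends up attached to every asset; it is not how the pipeline engine is implemented.

```python
def vix_long_signal(vix_closes, assets, window=10):
    """Return {asset: bool}, where True means 'go long'.

    vix_closes: VIX closing values, oldest first (made-up data in this sketch).
    assets: identifiers to attach the market-wide signal to.
    """
    if len(vix_closes) < window:
        raise ValueError("need at least `window` VIX closes")
    latest = vix_closes[-1]
    # The moving average window includes the latest close.
    moving_average = sum(vix_closes[-window:]) / window
    go_long = latest < moving_average
    # The VIX has one value per day, not one per asset, so the same
    # boolean is simply repeated ("broadcast") for every asset.
    return {asset: go_long for asset in assets}


# Hypothetical values for illustration only.
vix = [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 24.0, 23.0, 22.0, 18.0]
signal = vix_long_signal(vix, ["AAA", "BBB", "CCC"])
```

Because the signal depends only on the VIX, every asset gets the same value on a given day. In practice one would likely combine it with a per-asset factor or filter so that the longs and shorts differ across the universe.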
If more complex logic is needed beyond the built-in factors, one can always define a custom factor and put that logic into the 'compute' method. Generally, any logic that could be applied to a dataframe returned by something like 'data.history' can just as easily (and much more quickly) be done inside a custom factor. Something like this:
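from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.builtin import USEquityPricing

class VixSignal(CustomFactor):  # placeholder name, call it whatever fits
    # Get our inputs
    inputs = [cboe_vix.vix_close, USEquityPricing.close]
    # Set a window length to look back. In this case 400 trading days (roughly 1.6 years)
    window_length = 400

    def compute(self, today, assets, out, vix, close):
        # Apply any logic over the input data
        signal = ...some logic....
        out[:] = signal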
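As one illustration of the kind of logic that could sit inside 'compute', here is a standalone NumPy sketch that scores each asset by how its daily returns correlate with daily VIX changes over the window. It is only an example of "some logic", not a recommended signal. It assumes 'vix' and 'close' both arrive as (window_length, number of assets) arrays with the VIX column repeated for every asset, and it ignores NaN handling. The function name `vix_return_correlation` and the sample prices are hypothetical, and 'self' is dropped so it can run outside of pipeline.

```python
import numpy as np


def vix_return_correlation(today, assets, out, vix, close):
    # Daily percentage changes along the time axis (rows).
    vix_change = np.diff(vix, axis=0) / vix[:-1]
    returns = np.diff(close, axis=0) / close[:-1]
    # Pearson correlation per column (per asset).
    vix_dm = vix_change - vix_change.mean(axis=0)
    ret_dm = returns - returns.mean(axis=0)
    cov = (vix_dm * ret_dm).sum(axis=0)
    scale = np.sqrt((vix_dm ** 2).sum(axis=0) * (ret_dm ** 2).sum(axis=0))
    # Write one value per asset into the output array, as compute() expects.
    out[:] = cov / scale


# Made-up data: one asset moving against the VIX, one moving with it.
vix_column = np.array([20.0, 22.0, 21.0, 25.0, 24.0])
vix_pct = vix_column[1:] / vix_column[:-1] - 1.0
inverse = 100.0 * np.cumprod(np.concatenate(([1.0], 1.0 - vix_pct)))
aligned = 50.0 * np.cumprod(np.concatenate(([1.0], 1.0 + 0.5 * vix_pct)))
close = np.column_stack([inverse, aligned])
vix = np.column_stack([vix_column, vix_column])  # VIX repeated per asset

out = np.empty(2)
vix_return_correlation(None, ["AAA", "BBB"], out, vix, close)
```

Here the first asset ends up with a correlation close to -1 and the second close to +1, so the output differs per asset even though the VIX input is the same for all of them.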
If you need help moving your logic into a custom factor, maybe post a more specific example.
Another comment made in the above posts was: "My problem is that I need years of data, not just days. This will likely impact the back test or any future processes in which I have to wait to collect enough data to get started". One doesn't need to wait to collect data to get started. The pipeline engine automatically fetches whatever history a factor's window length requires.
The key is the paradigm shift: bring the logic into pipeline when large quantities of historical data are needed. That is typically much simpler than trying to move the data out.
See the attached notebook for an example.