Guide for porting your algorithms to a local Zipline Research Environment

If you are running into memory limitations on the platform, want to use libraries that Q does not support, or just prefer to develop locally, here's a brief guide to setting up a local Research environment with the open-source Zipline that powers Q.

Guide
1. Follow the Zipline installation documentation, ideally installing it into a separate environment along with the other libraries you need. Note that only Python versions up to 3.5 are supported.
2. Activate the environment. With the Anaconda distribution, this would be:

    $ conda activate env_zipline

3. Run the following to make sure everything is installed correctly. You should see quandl and quantopian-quandl with no ingestions.

    $ zipline bundles

4. Ingest the quantopian-quandl data bundle. This will serve as the replacement for USEquityPricing on Q. Note, however, that only end-of-day prices are built into Zipline. For higher-granularity data, you would need a subscription to the Quandl bundle SEP (~$30/month).

    $ zipline ingest -b quantopian-quandl

5. In your algorithm, the following library references need to be changed:

    from zipline.api import *
    from zipline.pipeline import CustomFactor, Pipeline
    from zipline.pipeline.data import USEquityPricing
    from zipline.pipeline.factors import _  # Placeholder: replace _ with the built-in factors you need
    from zipline.pipeline.engine import PipelineEngine


This should allow you to run a pipeline locally on your machine, develop a model, and so on, and then upload the outputs through Q's Custom Data functionality.
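
As a rough sketch of that last step, the snippet below writes locally computed signals to a CSV file ready for upload. The column names (date, symbol, signal), the helper name, and the sample rows are assumptions made for illustration; check Q's Custom Data documentation for the exact format it expects.

```python
import csv


def write_signal_csv(rows, path):
    """Write (date, symbol, value) tuples to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["date", "symbol", "signal"])  # assumed column names
        for date, symbol, value in rows:
            # Fixed precision keeps the file stable between runs.
            writer.writerow([date, symbol, "{:.6f}".format(value)])
    return path


# Made-up sample values standing in for the output of a local pipeline.
sample_rows = [
    ("2018-01-02", "AAPL", 0.42),
    ("2018-01-02", "MSFT", -0.17),
]
```

Calling write_signal_csv(sample_rows, "signals.csv") produces a file with one header row and one line per (date, symbol) pair.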

(Optional) For Fundamentals data
1. Register on Quandl and find your API Access Key. You would also need a subscription to the Sharadar/SF1 dataset (~$30/month).
2. Clone or download the alpha-compiler repo. Install the libraries quandl and zipline into a new environment with Python 2.X.

    $ conda create --name env_alphacompile python=2.7
    $ conda activate env_alphacompile

3. Navigate to where you unzipped the file and launch setup.py. Then open load_quandl_sf1.py in ..\envs\env_alphacompile\Lib\site-packages\alphacompiler\data
4. Add your API key somewhere near the top as an environment variable:

    os.environ['QUANDL_API_KEY'] = 'abc'  # Put your API key here

5. Add a start_date you would like to query from, and change the bottom lines of code into the following. Then fill the fields list with the fundamental data you need. See the Quandl dataset documentation for valid values.

    if __name__ == '__main__':
        BUNDLE_NAME = 'quantopian-quandl'
        fields = []  # List of the fields you want to query
        num_tickers = all_tickers_for_bundle(fields, BUNDLE_NAME)
        tot_tickers = num_tkrs_in_bundle(BUNDLE_NAME)
        pack_sparse_data(tot_tickers + 1,  # number of tickers in bundle + 1
                         os.path.join(BASE, RAW_FLDR),
                         fields,
                         os.path.join(BASE, FN))

6. Open .\alpha-compiler-master\alpha-compiler-master\alphacompiler\util\sparse_data.py. Under __init__(), change self.data_path='SF1.npy'. Under pack_sparse_data(), replace the df statement with:

    dateparse = (lambda x: pd.datetime.strptime(x, '%Y-%m-%d'))
    df = pd.read_csv(os.path.join(rawpath, fn),
                     index_col="Date",
                     parse_dates=['Date'],
                     date_parser=dateparse)

7. Launch load_quandl_sf1.py. This will start fetching data via API calls and will take some time.

    $ cd .\alpha-compiler-master\alpha-compiler-master\alphacompiler\data
    $ python load_quandl_sf1.py

8. Open sf1_fundamentals.py and change the fields to match the ones used in step 5. Copy and paste the whole alphacompiler folder with the processed dataset into env_zipline\Lib\site-packages (Python 3 is fine).
9. Reactivate the env_zipline environment.
10. If everything has worked as it should, import the library and Fundamentals should be accessible.

    from alphacompiler.data.sf1_fundamentals import Fundamentals
    fd = Fundamentals()
    # Replace all calls to Fundamentals with fd
    # Example: fd.capex

11. Make sure you update the fundamentals references in your algorithm (e.g. PE_RATIO) to the same names used in fields in load_quandl_sf1.py. If fundamentals are used in a CustomFactor, make sure to manually mark them as window safe.

    class MyFactor(CustomFactor):
        inputs = [fd.currentratio]
        window_length = 1
        fd.currentratio.window_safe = True

        def compute(self, today, assets, out, value):
            out[:] = value[-1]  # Placeholder: replace with your own computation

Note: There are certain proprietary Q features, such as the universe filter QTradableStocksUS, the Risk Model, and the Optimize API. You could attempt to replicate these locally, but otherwise I would recommend moving only the Research component to your machine to develop your model, then feeding its outputs back to Q for the actual backtesting.

13 responses

Do any of the documents in the Zipline docs folder on GitHub help at all? I haven't tried a local install.

They've certainly helped me install Zipline through Conda and ingest the quantopian-quandl data bundle. However, I'm not sure how to reference the data and convert the rest of the code. I only need to set it up for Pipeline computations, not the actual backtesting.

Thanks for the input - after some trial and error I've managed to get USEquityPricing and the common functions loaded:

    from zipline.api import *
    # zipline Data & Factors
    from zipline.pipeline import CustomFactor, Pipeline
    from zipline.pipeline.data import USEquityPricing#, Fundamentals
    #from zipline.pipeline.experimental import QTradableStocksUS
    from zipline.pipeline.factors import (Returns, VWAP, AverageDollarVolume,
                                          SimpleMovingAverage, AnnualizedVolatility,
                                          SimpleBeta)
    from zipline.pipeline.engine import PipelineEngine

Seems that Fundamentals isn't built into Zipline, so one would need a Quandl license first (~$30/month).

Managed to get everything working. Updated the original post with steps for anyone else attempting this.

That’s pretty awesome, thank you! I might give this a go as well.

Updated the guide for setting up Fundamentals - it should be significantly easier now. There are still some things I'd like to sort out with the set-up, for example ingesting custom CSVs (e.g. for other asset classes) and some errors with the TradingCalendar class, so if anyone wants to collaborate or has questions, don't hesitate to ask here or by message.

Hi Adam, this is great work.
I have one question for you. When I try to run a simple example on my local installation, I get the following error. It seems it cannot download the SPY data? I could use a local CSV instead by replacing benchmarks.py in the Zipline files, but how can I fix this? Thanks

Traceback (most recent call last):
  File "/anaconda3/envs/env_zipline/bin/zipline", line 11, in <module>
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/core.py", line 1137, in invoke
    return process_result(sub_ctx.command.invoke(sub_ctx))
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/__main__.py", line 107, in _
    return f(*args, **kwargs)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/__main__.py", line 276, in run
    blotter=blotter,
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/utils/run_algo.py", line 159, in _run
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/finance/trading.py", line 103, in __init__
    self.bm_symbol,
    environ,
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/data/loader.py", line 216, in ensure_benchmark_data
    data = get_benchmark_returns(symbol)
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/zipline/data/benchmarks.py", line 35, in get_benchmark_returns
    data = r.json()
  File "/anaconda3/envs/env_zipline/lib/python2.7/site-packages/requests/models.py", line 897, in json
  File "/anaconda3/envs/env_zipline/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/anaconda3/envs/env_zipline/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/anaconda3/envs/env_zipline/lib/python2.7/json/decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

@hamed
IEX SPY data is sorta gone/changed.
Fixes at:
https://github.com/quantopian/zipline/issues/2480
alan
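
For context, the ValueError at the bottom of the traceback is what the json decoder raises whenever a response body is not valid JSON, which is consistent with the data endpoint returning an error page instead of prices. Here is a minimal standard-library sketch of that failure mode, written for Python 3. The function name and both sample bodies are made up for illustration and are not part of Zipline.

```python
import json


def parse_benchmark_payload(body):
    """Decode a response body as JSON, raising a clearer error if it is not JSON."""
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError subclasses ValueError on Python 3
        preview = body[:60]
        raise ValueError(
            "Benchmark response was not JSON; it starts with: {!r}".format(preview)
        )


# Hypothetical response bodies: one well-formed, one plain-text error page.
good_body = '[{"date": "2007-12-26", "close": 149.46}]'
bad_body = "Forbidden"
```

Passing good_body returns a list of dicts, while bad_body raises a ValueError that shows the start of the unexpected body, which makes this kind of upstream change easier to spot.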

@ Alan
Thanks Alan. Appreciated.

@Adam I am also interested in ingesting custom CSV files. Is anyone already doing this? I get a lot of errors and run into issues when trying to ingest local CSV data.

@Dave I have also started looking into how to ingest custom CSV files. I understand that creating our own bundles is best, but as I am not an advanced programmer it is difficult to work out how to do this. Anyway, I found several posts on this topic in the Google group https://groups.google.com/forum/#!topic/zipline/-XT2pbnbz7s and here https://github.com/quantopian/zipline/pull/1860. I also found an interesting video this morning: https://www.youtube.com/watch?v=vh42tQDDC1U.

Happy to share ideas if you wish.
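
One possible starting point is Zipline's built-in csvdir bundle, which reads one CSV file per symbol. The sketch below only prepares such a file with the standard library. The column order (date, open, high, low, close, volume, dividend, split) and the daily subfolder follow the csvdir documentation as far as I can tell, so check them against your Zipline version. The helper name, the symbol, and the price rows are made up for illustration.

```python
import csv
import os

# Assumed csvdir column layout; verify against the Zipline docs for your version.
CSVDIR_COLUMNS = ["date", "open", "high", "low", "close", "volume", "dividend", "split"]


def write_csvdir_file(root, symbol, bars):
    """Write daily bars for one symbol to <root>/daily/<symbol>.csv."""
    daily_dir = os.path.join(root, "daily")
    os.makedirs(daily_dir, exist_ok=True)
    path = os.path.join(daily_dir, symbol + ".csv")
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=CSVDIR_COLUMNS)
        writer.writeheader()
        for bar in bars:
            # Default to no dividend and no split when a bar omits them.
            row = {"dividend": 0.0, "split": 1.0}
            row.update(bar)
            writer.writerow(row)
    return path


# Made-up bars for a hypothetical symbol.
example_bars = [
    {"date": "2018-01-02", "open": 10.0, "high": 10.5, "low": 9.8, "close": 10.2, "volume": 1000},
    {"date": "2018-01-03", "open": 10.2, "high": 10.9, "low": 10.1, "close": 10.8, "volume": 1500},
]
```

With files laid out like this, the documented route is to point the CSVDIR environment variable at the root folder and run $ zipline ingest -b csvdir.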

@ Dave
Hope this will solve your problem.

Go through these steps:
1- Get SPY data from somewhere and store it as a CSV covering the whole period you want, in this format:
date close
2007-12-26 149.46
2007-12-27 147.52
...

Save the file to your hard disk. The file name should be SPY.csv.
If you use another benchmark, save it the same way and change the name to 'yourBM'.csv.
2- Go to .../site-packages/zipline/data in your Zipline installation folder and replace the existing benchmarks.py with the following file of the same name. This will read SPY data from your local drive rather than downloading it:

import numpy as np
import pandas as pd
import os

def get_benchmark_returns(symbol):
    current_dir = os.getcwd()