New book on Quantopian/Zipline backtesting and modeling

Hi guys,

My third book was released today. Trading Evolved is an in-depth guide to the world of Python-based backtesting. Starting with the assumption of little to no prior knowledge, I'll take you on a ride which will eventually show you how to construct advanced trading models for equities and futures.

I started writing this book a year ago after finding a lack of this type of explanatory documentation in the Python world. The tools available to us are incredibly powerful, but it is not easy for most people to really get into this way of working. It can be a daunting process to learn about Python and backtesting, both from a technical and financial perspective.

My new book is highly practical and full of source code and detailed explanations. It's my intention that by the end of this book, you will be well on your way to becoming a professional systematic trader.

You can find the new book here: https://amzn.to/31tDkwn

A considerable amount of work has gone into writing this book and I hope that the community out there will have use for it. I would greatly appreciate reviews, comments and feedback.

Thank you,

Andreas

123 responses

I just ordered the book for delivery at the start of September. I really look forward to reading it and promise to come back with feedback.

Thanks!
Fredrik

Much appreciated, Fredrik! I hope you'll like it.

Hi Andreas,

That looks awesome! I’ll most likely order it either way, but I’m curious if there’s a table of contents available, and also if the book covers alpha research and how to best avoid/limit overfitting?

Best regards,
Joakim

Hi Joakim,

I don't cover those topics though. I find that Rob Carver is better than I at explaining those things, and would very much recommend his books.

ac

Thanks!

That looks like just what I need. Looking forward to going through it! I might pick up one of Rob's books too while I'm at it. :)

I already ordered the book and am looking forward to it.

Quick question - for the code in the book, will there be .py or .ipynb files available for download?

Either way, I can't wait to get my hands on this! Thanks again Andreas!

Both, but mostly Notebook files. Most of the models and demos are in Jupyter, with things like bundles in .py.

All the code is in the book, with explanations, and also downloadable with inline comments in the source.

@Andreas, just ordered too, and looking forward to reading your new book. It feels like it is coming out just when I have a use for such a book. Thanks for putting it out.

Andreas - your site seems to have stopped serving https. I can access the site with http, and although the browser keeps spinning I can see most of the content. Also, where is the code? I can't seem to find it on your site.

Thanks, Geoff. You're absolutely right, and I'm getting the techies on it. It's either a total coincidence, or the release of the book a few days ago actually generated so much traffic that my site broke. A self-DoS...

Once the site is operational again, and hopefully that's within a couple of hours, you'll find the code as well as random sample data here: https://www.followingthetrend.com/trading-evolved/

Oh, and as you rightly point out, http still works even if it is really slow. So you can get the code here http://www.followingthetrend.com/trading-evolved/ under the headline Downloads.

Thanks Andreas - managed to get the code. Going through your book and examples now. BTW - appreciate the insights in your books and good to have you in the python community

I will keep a running log here of errors, updates and changes. As this list grows large enough, I'll push out an updated version of the book.

A curious new error to highlight: it seems that Zipline won't install on Conda 4.7.10. The solution is either to downgrade Conda, as shown in the article (thanks to Richard Dale), or to use pip. The latter is a bit more messy though, so downgrading Conda is easiest for now.

I just bought it and am looking forward to reading it.

@Andreas, just finished your book. Much appreciated. Great job.

I bought your book with the anticipation that you would cover the stuff I needed now, and it is exactly what you provided. Your book, for me, will be a great time-saver by opening doors to greater possibilities.

Having Zipline on my machine also lets me keep all my work on my machine (total and guaranteed program privacy).

Thank you for all the code examples and explanations. Be assured I will find good use of them.

Again, congratulations on a job well done.

Thank you, Guy! I'm glad you liked it.

Book writing is a hobby for me, and I'll keep doing it as long as it's fun. And getting positive feedback is the fun part. :)

I am brand new here. This is one of the first posts that I read and I am happy about that. I just bought your book!

Does the book cover how to use proprietary/own data with Zipline?

Sure does. Two chapters are dedicated to hooking up your own data source. I also show how to set up a local MySQL securities database, populate it with data and use that data for Zipline. Equities and futures are covered.

I just purchased this book. Looks like it will provide some great information to learn from. Thank you, and I will let you know what I think of it soon!

I have Python 3.6 on my PC and I don't want to try and install Conda or Zipline. Let's say right now I'm too lazy/wary of installing issues.
Is it possible to follow your book here on, and only on, Quantopian, as the title of the post suggests?

The models in the book should be possible to replicate and run on Quantopian with minor modifications.

Also, I'm sure you're aware, but you can install a Python 3.5 environment without risking any issues with your current 3.6 environment.

Bought today! Thank you very much for your great work Andreas. It seems really good, especially for people like me who are not IT programmers. The chapters on custom data with Zipline and MySQL seem very interesting!

Hi Andreas,

I just purchased the book, thank you.

I wish I had seen the updates page earlier. I spent quite some time creating a Dockerfile to install zipline with conda, and ended up just using pip.

Looking forward to the read, thanks again!

Can we access the source code if we got the book on amazon?

Great book, it looks like it was a great deal of work

@Ryan Thanks, it took a lot of time and effort to put together. An Amazon review is a good way to vote for more books like this. :)

Hi Andreas, I have downloaded the code and all the data, and processed both the stocks and futures data. I've run the stock simulations and have had a productive time learning the thought process you are taking us through.
I am now on Chapter 15 and am having a few issues running the futures backtest.
1) In the book and code you have 'BL' in agriculture, but that is not in the data.zip. I've just commented that out as an easy workaround.
2) In the Zipline backtest you start from start = datetime(2001, 1, 1, 8, 15, 12, 0, pytz.UTC), but the futures in the data.zip file seem to start at about 2015 or so (they seem to start at different dates). I've tried changing the start date to 2015 and so on, but then I get:

~/anaconda3/envs/Zipline/lib/python3.5/site-packages/zipline/data/dispatch_bar_reader.py in load_raw_arrays(self, fields, start_dt, end_dt, sids)
110 for i, asset in enumerate(assets):
111 t = type(asset)
--> 112 sid_groups[t].append(asset)
113 out_pos[t].append(i)
114

KeyError:

So I'm wondering if there is some missing data? If not, I can upload the full error logs, but I'm hoping that it's just missing data.

Geoff

Hi Geoff,

The data is just random anyhow, and not even very realistic random. I merely provided that so that those who really have no access to any data can still play with Zipline. I generated this data myself with a simple Python based random walk script.
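For readers curious what such a generator might look like, here is a minimal sketch of a random walk price series. It is not the script used for the book's sample data; the function name `random_walk_prices`, the volatility, the start date and the column layout are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def random_walk_prices(n_days=250, start_price=100.0, daily_vol=0.01, seed=42):
    """Build a synthetic daily OHLCV frame from a geometric random walk.

    Hypothetical helper: every parameter here is an illustrative assumption.
    """
    rng = np.random.RandomState(seed)
    # Daily log returns drawn from a normal distribution (an assumption).
    log_returns = rng.normal(loc=0.0, scale=daily_vol, size=n_days)
    close = start_price * np.exp(np.cumsum(log_returns))
    # Each bar opens at the previous close; the first bar opens at start_price.
    open_ = np.concatenate(([start_price], close[:-1]))
    # Push high and low slightly beyond the open/close range.
    spread = np.abs(rng.normal(loc=0.0, scale=daily_vol / 2, size=n_days))
    high = np.maximum(open_, close) * (1 + spread)
    low = np.minimum(open_, close) * (1 - spread)
    volume = rng.randint(1000, 100000, size=n_days)
    dates = pd.bdate_range("2015-01-01", periods=n_days)
    return pd.DataFrame(
        {"open": open_, "high": high, "low": low, "close": close, "volume": volume},
        index=dates,
    )


df = random_walk_prices()
print(df.head())
```

Data like this is only good for checking that the plumbing works. As noted above, it says nothing about how a model would behave on real markets.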

With the futures, it would be too much data for an easy download if I generate too far back in time. For my actual backtests, I used about 20,000 individual futures contracts.

My recommendation is this:

• Read through the book first, until chapter 23.
• If you want to replicate the models locally, find and subscribe to a data source for that type of data.
• Construct your own bundle, as outlined in chapters 23 and 24.
• Run the models on your own data, and update bundle name and instrument universe as needed.

One welcome development is that a data provider is soon releasing their own Zipline interface to their data, which would make things much easier. I have been testing Norgate's new interface, and it seems really good so far.

Thanks for taking time over your weekend to answer these questions and give advice. Norgate looks to be reasonably priced, however it appears to have a Windows-only installer, which is not practical for the likes of me (you'll find with Python there are a lot of Linux people out there). I can get my data from other sources and it's not too hard to script.

Hi,

I bought the book and have read through about 1/3 of it and am really enjoying how the author takes you through learning Python and the related libraries. I downloaded the sample code and data and have been able to get the samples up through Chapter 6 working. For context, I've worked in the software industry for 30+ years and have lost track of how many computer languages I know, and I have used Python in the past.

I was enjoying the book and moved on to backtesting... and then Zipline. I have spent the better part of an afternoon trying every possible workaround posted online to get Zipline working on my system. Yes, I've set up a Python 3.5 environment with Anaconda and tried all the variations on the errata page. I've had the most success with "pip install zipline" and while it seems to make it pretty far it eventually fails with a series of "Failed building the wheel..." errors that all end with "error: Microsoft Visual C++ 14.0 is required. Get it with "Microsoft Visual C++ Build Tools": https://visualstudio.microsoft.com/downloads/"

I have MS VS 2019 Community edition installed. I still get the error. I found an article on this at https://www.scivision.co/python-windows-visual-c-14-required/ and it basically implied installing MS VS 2015 or 2017 might help, but I don't have a license for these so at this point I am SOL and calling it quits for today.

I humbly request that the author (or someone) try installing the latest version of Python, Anaconda 3 and VS 2019 Community Edition on the latest patch of Windows 10 and publish how they made Zipline work. I'd like to know!

I'm looking forward to solving the zipline installation issue and moving on in the book :-)

Regards,

Laura P.

Hi Laura, I had similar issues installing Zipline, but downgrading Conda to 4.6.11 solved the problem. Also make sure you're running Python 3.5 and not the latest 3.7. Hope it works for you as well.
conda install conda=4.6.11

Hi Maxim,
I managed to (sort of) fix the zipline installation issue by 1) finding a VS 2014 link on Stack Overflow (it's a MS download, but I could not find the link through the MS website) at http://go.microsoft.com/fwlink/?LinkId=691126&fixForIE=.exe. 2) fixing the VS 2014 installation as it was missing "rc.exe" on the PATH - instructions here: https://stackoverflow.com/questions/14372706/visual-studio-cant-build-due-to-rc-exe After 1) + 2) I was able to "conda install -c quantopian zipline" and zipline and all of its required packages appeared to install. However, when I go to use zipline in Jupyter Notebook I get an error about LRU not being available when I import anything from zipline.
I did try your suggestion of downgrading Conda, but that generated a new set of errors.

I may just clean install python+anaconda+VS14 and try over. Tomorrow.

Again, it would be nice to get a set of instructions for a fresh install of all the latest software on the latest OS. The book, errata, etc. are a bit ad hoc in their suggested approach to getting Zipline working.

Regards, Laura P

A very good book! I tried for a few days and finally got Zipline 1.3 installed. :)

For those who use Win10 and Anaconda, you can try this:
https://anaconda.org/robinszeto/env_zipline

### For Zipline

I haven't read the book by Andreas, but these might help.

Python
I always install Anaconda to a directory with short name, like C:\Python36.
Then I don't use any of the special Conda aspects of it, I treat it like straight Python.
I'm using Anaconda 3 version 5.2 from https://repo.continuum.io/archive/, that is, Anaconda3-5.2.0-Windows-x86_64.exe

Compiler
Laura, correct, that link/solution you posted already, in my experience has been the simplest, quickest and most surefire way to take care of the LRU problem with a C compiler on Windows 10: http://go.microsoft.com/fwlink/?LinkId=691126&fixForIE=.exe. So I'm surprised there were further problems and would check these ...

Path
The Python-related elements of my Windows path, which works fine with zipline:
C:\Python36;C:\Python36\DLLs;C:\Python36\Lib;C:\Python36\Library\bin;C:\Python36\Library\mingw-w64\bin;C:\Python36\Scripts;C:\Python36\bin

Dependencies
Another thing that can sometimes help, checking dependencies:

python.exe -m pip install --upgrade pip
pip install pipdeptree
pipdeptree -p zipline


Supposing pipdeptree reports any conflicts, let's say, a module named zoo with a message looking something like this ...

C:\> pipdeptree -p zipline
Warning!!! Possibly conflicting dependencies found:
- zoo [required: >=2.0,<3.0, installed: 1.5.10]


... that can be resolved like:

pip install "zoo>=2.0"

Debugger
Once past all those, seems to me zipline is remarkably reliable.
If someone wants to dig into zipline with breakpoints, to my knowledge VS Code cannot, but PyCharm can.
(the name must be from snake charming since python is a snake)

The instructions above on how to get Anaconda 3 version 5.2 combined with Robin's YML files at https://anaconda.org/robinszeto/env_zipline got me past zipline installation issues. Thank you!

I created a quandl account and successfully got zipline to ingest the quandl bundle (with some warnings). Now trying to write a backtest using zipline with quandl data and am having what appears to be some errors parsing the quandl data. So progress, but still not quite there...

I did eventually get zipline working :-)

I had to apply the patches to the zipline files as outlined in “Patching the Framework” on Mr Clenow's errata page as the final fix. So the full install that worked for me was

1. Uninstall all prior copies of Python and Anaconda
2. Install the VS14 build tools from http://go.microsoft.com/fwlink/?LinkId=691126&fixForIE=.exe and patch the VS 2014 installation to add the missing "rc.exe" on the PATH - instructions here: https://stackoverflow.com/questions/14372706/visual-studio-cant-build-due-to-rc-exe
3. Run the installer Anaconda3-5.2.0-Windows-x86_64.exe from https://repo.continuum.io/archive/
4. Launch Anaconda Navigator and open a terminal window per the instructions in the book
5. Use conda to create the Python environment "env_zipline" using Robin Szeto's YML https://anaconda.org/robinszeto/env_zipline
6. Install matplotlib 3.0.0 using Anaconda Navigator
7. Follow the book instructions to get a Quandl account and ingest the quandl bundle using zipline
8. Follow the instructions for “Patching the Framework” on Mr Clenow's errata page
9. Use "env_zipline" when launching Jupyter Notebook

At that point zipline should work in Python examples per the book on a Win10 machine.

Good luck!

Update: Anaconda-3 2019.07 (the latest version) works just as well as Anaconda-3 version 5.2 in the above instructions.

Dear all,

Great to see that some of you have used my "env" to get Zipline installed.
Here is an updated one for your reference:

Thanks and regards,
Robin

I'm glad it worked out, Laura.

Issues like these are the main reason why everyone advised me against writing this book. The problem is that any number of things can go wrong depending on local environments and unexpected changes in software or API calls, and much of it can't be predicted or preempted.

Luckily, there are kind and helpful people like Robin and Mr. Seahawk out there to assist! Thanks guys!

Andreas

And yet you wrote the book - thank you, I'm enjoying it - so here we are!

The problems are very normal (albeit icky) software dev problems and are largely addressable with a systematic and published approach to tool versioning and package management and documenting of the install process on a clean machine.

You have a good start in your book. It would be nice to have a repository for instructions and conda YML for the top couple OS variants across Mac/Windows/Linux, maybe linked to on your website? :-) Just an idea.

Laura P

Ha, something tells me you've got a far deeper software engineering background than I do.

Although I've been programming since the 80s and started my first IT firm in the 90s, I really have no formal background in tech. My actual background is in finance and business, and on the tech side I've always taken a 'fly by the seat of your pants' approach. Which by the way is one of my favorite Anglo-Saxon idioms since it makes no sense whatsoever. My approach is completely result focused and that often leads to doing things less by the book than a proper software engineer would have done.

Well, if you or anyone else out there would like to contribute to such a project, I'd be happy to help out, host it and give it visibility. :)

Funny thing though. So far, the book has sold extremely well. Surprisingly well. I got a large amount of feedback, but only two really negative comments. One reader was quite upset because the book was far too difficult. The other was equally upset over the book being far too simplistic.

Hi,

The "Momentum Model" example in Ch 12 uses the bundle "ac_equities_db" (which, of course, we don't have) and changing it to "quandl" generates errors due to the symbol names you use in the index universe. An AC blessed fix for this would be appreciated.

Related, is there a better place to post errata than this thread?

I managed to get zipline working on a 2nd Win10 machine without any issues by following the instructions I posted (and updated). Yes, I have a software engineering background. BTW - "flying by the seat of your pants" is rooted in pilot lingo, in reference to how accurately your ass tells you some things about your flying that instruments don't capture. Modern F1 drivers talk about the same phenomenon and if you've ever raced a car around a track you quickly understand the term.

Regards, Laura

Ah, yes, I've realized that I was perhaps not clear enough on that one. I got a few questions like that already.

From chapter 12 on, you will need better data than you'll find for free. That means that you'll need to get your own data and build your own bundle to hook it up. That part is explained in chapters 23 and 24.

Initially I had those chapters up front, explaining how to make bundles and hook up your own data early on. I moved it to the back on advice from a fellow author, who pointed out that those technical chapters will make many casual readers drop off without getting to the models and trading stuff.

This is a good place to discuss, but I'm following the threads on other places as well. There's some activity around these things on my own site, followingthetrend.com, and my errata page there. If you find errors, and hopefully solutions, please let me know and I'll update the official errata page.

Thanks for the background on that phrase! Next time my wife tells me to slow down my BMW, I'll use that. "But honey, I'm flying by the seat of my pants, just like pilots and F1 drivers do!"

Hi, I am having some problems running the file Portfolio Backtest.ipynb from chapter 8. Even after modifying the file Benchmarks.py as recommended by Andreas, the code is still unable to execute correctly on my PC. I am running zipline 1.3.0 with Python 3.5.6 in an Anaconda virtual environment. All previous code from the book worked correctly. The error occurs when running the "run_algorithm" function. See below. Is anyone having a similar issue?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-b68d206bfb01> in <module>()
103     capital_base=10000,
104     data_frequency = 'daily',
--> 105     bundle='quandl'
106 )

C:\Anaconda3\envs\quantopian\lib\site-packages\zipline\utils\run_algo.py in run_algorithm(start, end, initialize, capital_base, handle_data, before_trading_start, analyze, data_frequency, data, bundle, bundle_timestamp, trading_calendar, metrics_set, default_extension, extensions, strict_extensions, environ, blotter)
428         local_namespace=False,
429         environ=environ,
--> 430         blotter=blotter,
431     )

C:\Anaconda3\envs\quantopian\lib\site-packages\zipline\utils\run_algo.py in _run(handle_data, initialize, before_trading_start, analyze, algofile, algotext, defines, data_frequency, capital_base, data, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, environ, blotter)
227     ).run(
228         data,
--> 229         overwrite_sim_params=False,
230     )
231

C:\Anaconda3\envs\quantopian\lib\site-packages\zipline\algorithm.py in run(self, data, overwrite_sim_params)
760             daily_stats = self._create_daily_stats(perfs)
761
--> 762             self.analyze(daily_stats)
763         finally:
764             self.data_portal = None

C:\Anaconda3\envs\quantopian\lib\site-packages\zipline\algorithm.py in analyze(self, perf)
474
475         with ZiplineAPI(self):
--> 476             self._analyze(self, perf)
477
478     def __repr__(self):

<ipython-input-2-b68d206bfb01> in analyze(context, perf)
87 def analyze(context, perf):
88     # Use PyFolio to generate a performance report
---> 89     returns, positions, transactions = pf.utils.extract_rets_pos_txn_from_zipline(perf)
90     pf.create_returns_tear_sheet(returns, benchmark_rets=None)
91

ValueError: too many values to unpack (expected 3)



I am not an expert but "too many values to unpack" may imply a version mismatch in the libraries. Check your install vs. Robin Szeto's YML and pyfolio 0.9.2

Yes, my thoughts as well. You might have accidentally installed an earlier version of PyFolio.

Note that, at least last time I checked, the PyFolio package on the conda channel isn't updated. Installing through conda will give you an old version.

Thanks a lot for your help Laura & Andreas! I run pyfolio 0.5.1 so that's probably why... I confirm it is impossible to upgrade using conda update, as you mentioned earlier, Andreas: Conda Installation Version Inconsistency

That's definitely the issue. And I think I know why...

The PyFolio package on conda isn't updated for some reason, so if you install using conda you'll get 0.5.1. You need to use pip to install this package.

I raised this a long time ago, but it still hasn't been updated. I was hoping to avoid using pip for the book, but for this package I had to.

I'd uninstall PyFolio and then install it again, using pip.
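Before reinstalling, it can help to confirm which version is actually in the environment, for example with `pip show pyfolio` or `conda list`. As a rough sketch, the comparison itself could look like the following. The helper names are made up for illustration, and the 0.9.2 threshold is simply the version mentioned earlier in this thread, not a verified minimum.

```python
def version_tuple(version):
    """Turn a plain dotted version string such as '0.9.2' into (0, 9, 2).

    Only handles purely numeric versions; suffixes like 'rc1' are not supported.
    """
    return tuple(int(part) for part in version.split("."))


def is_too_old(installed, minimum="0.9.2"):
    """Return True when the installed version is older than the minimum.

    The default minimum is the pyfolio version named in this thread (an assumption).
    """
    return version_tuple(installed) < version_tuple(minimum)


# 0.5.1 is the version the conda channel delivered according to this thread.
print(is_too_old("0.5.1"))  # True
print(is_too_old("0.9.2"))  # False
```

Comparing tuples rather than raw strings avoids the classic trap where "0.10.0" sorts before "0.9.2" alphabetically.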

Just to confuse matters: it appears conda will run pip to install packages if you provide a YML with a pip section. So it's possible to get all the right libraries with a single YML and conda.

Andreas, I saw your post about Ch23/24. I kind of understand why you might put these at the end of the book. IMHO, this is a technical book for a technical audience, so before Chap 12 would have been fine for me. Anyway, I set up a trial Norgate Data account and am working through modifying your sample code to get zipline to ingest the data. I believe you need a line of code after you read the CSV file for zipline to work with Norgate:

df.rename(columns={"Close":"close","Open":"open","High":"high","Low":"low","Volume":"volume"},inplace=True)


Also, some validity checking of the dataframe returned by pd.read_csv() will help filter out corrupt data files (I had one in my download for some reason). I added a line to ensure the df returned from read_csv() had at least 5 columns, which is pretty minimal error checking but got me past the corrupt file. So I now have a Norgate trial data bundle working and I'm debating whether to go through Ch24 and get MySQL working or skip that step and go back to Ch12, which leads me to my question.

Will zipline run_algorithm() run any faster if I take the extra step of importing the data into MySQL and creating a MySQL based bundle? It looks like the ingest process will be faster, but the actual use in Python of handle_data() and data.history() will run at the same speed, as they map internally in zipline/pandas to bcolz and there's no direct linkage to MySQL after the ingest. Is this correct?

Have a great weekend :-)

Laura P
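The column rename and the minimal column-count check described in the post above can be sketched roughly as follows. This is not the book's bundle code: the function name `load_price_csv`, the assumption that the first CSV column holds the date, and the inline sample data are all illustrative.

```python
import io

import pandas as pd

REQUIRED = ["open", "high", "low", "close", "volume"]


def load_price_csv(source):
    """Read one price CSV and return a frame with lower-case OHLCV columns.

    Returns None when the file does not hold the expected columns, so the
    caller can skip a corrupt file instead of crashing mid-ingest.
    Hypothetical helper; assumes the first column is the date.
    """
    df = pd.read_csv(source, index_col=0, parse_dates=True)
    # Minimal sanity check from the post: at least five data columns.
    if df.shape[1] < 5:
        return None
    df = df.rename(columns={"Open": "open", "High": "high", "Low": "low",
                            "Close": "close", "Volume": "volume"})
    # Slightly stricter than the post: make sure the renamed columns exist.
    if any(col not in df.columns for col in REQUIRED):
        return None
    return df


# Inline sample data standing in for downloaded files.
good = "Date,Open,High,Low,Close,Volume\n2019-01-02,1.0,2.0,0.5,1.5,100\n"
truncated = "Date,Open,High\n2019-01-02,1.0,2.0\n"

print(load_price_csv(io.StringIO(good)))
print(load_price_csv(io.StringIO(truncated)))  # None
```

Returning None rather than raising keeps the ingest loop simple: skip the symbol, log it, and carry on with the rest.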

My problem has now been solved. Here is what I did.

1. I first created a new anaconda virtual environment for zipline using line "conda env create -f env_zipline_20190904b_office.yml" into Anaconda Prompt. zipline yml file here
2. As it seemed that pyfolio was not included, I installed the latest version of this library using command "pip install pyfolio"
3. I then ran the file "Backtest Analysis.ipynb" again in Jupyter Notebook and everything is fine now :-)

Thanks so much for your help Laura & Andreas!

Cyril

I have tried installing and uninstalling and reinstalling in various iterations, but I seem to keep ending up with a variation of the same error.

Usually I can do an ingest in zipline as indicated in the book and don't get a "ImportError: cannot import name 'load_prices_from_csv'" until running the example code available on the book website.

I have tried the environment import method described above and seem to be getting "ImportError: cannot import name 'load_prices_from_csv'" when I try to ingest the quandl data.

(env_zipline) C:\Users\Jason>zipline ingest -b quandl
Traceback (most recent call last):
File "C:\Users\Jason\Anaconda3\envs\env_zipline\Scripts\zipline-script.py", line 11, in <module>
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\pkg_resources\__init__.py", line 484, in load_entry_point
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\pkg_resources\__init__.py", line 2707, in load_entry_point
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\pkg_resources\__init__.py", line 2325, in load
return self.resolve()
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\pkg_resources\__init__.py", line 2331, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\zipline\__init__.py", line 23, in <module>
from . import data
File "C:\Users\Jason\Anaconda3\envs\env_zipline\lib\site-packages\zipline\data\__init__.py", line 2, in <module>


And the error in JupyterLab that has been consistent across various installation attempts:

ImportError                               Traceback (most recent call last)
<ipython-input-2-d870b0556631> in <module>()
1 # Import a few libraries we need
2 get_ipython().run_line_magic('matplotlib', 'inline')
----> 3 from zipline import run_algorithm
4 from zipline.api import order_target_percent, symbol,      schedule_function, date_rules, time_rules
5 from datetime import datetime

~\Anaconda3\envs\env_zipline\lib\site-packages\zipline\__init__.py in <module>()
22
---> 23 from . import data
24 from . import finance
25 from . import gens

~\Anaconda3\envs\env_zipline\lib\site-packages\zipline\data\__init__.py in <module>()
----> 2 from .loader import (
5 )



Any help would be appreciated!

@Andreas:
I've just read the index of your book. Is there an introduction on how to connect your program/algo to IB for live trading?

There is nothing about live trading in the book. Mostly because I have an allergy against law suits and I believe that including a chapter on live trading would give me quite a rash. Publishing a book with code for live trading would be a legal nightmare.

I am not sure if there is a misunderstanding here. OK, let me rephrase my question as follows:
Does your book describe how to connect the Zipline platform or the algo program to an IB paper account?

Besides, as for backtesting, I wonder if there is a description of how to do auto-optimization. Formerly one could do this with the Quantopian notebook. But Quantopian later switched off a component, and it is no longer possible to do auto-optimization of backtests.

Does your book describe how to connect the Zipline platform or the algo program to an IB paper account?

No.

I wonder if there is a description of how to do auto-optimization.

I'm not familiar with that term.

By "auto-optimization" I mean:
Assume my algo is based on SMA1 crossing over SMA2, but I am not sure what values are "best" for SMA1 and SMA2. So I set SMA1 to 10 and SMA2 to 50 first and run a backtest. Then I change SMA1 and SMA2 to other values, and after each change I run the backtest again. Surely I can do this manually, but it is quite boring. If I use a for loop, I only need to start the backtesting once (in fact there will be hundreds of backtests running one after the other) and at the end select the "best" combination of SMA1 and SMA2.

I hope you understand what I mean. One can also call this parameter optimization for backtesting. NinjaTrader has such a function. One could formerly do this on Quantopian, but Quantopian turned it off later.

Surely this could lead to over-fitting, but this is another theme.
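The for loop described above might look something like this in plain pandas. It is only a sketch: the "backtest" is a toy long-only return calculation on synthetic prices rather than a Zipline run, and the function name `sma_crossover_return` and the parameter grid are assumptions chosen for illustration.

```python
import itertools

import numpy as np
import pandas as pd


def sma_crossover_return(prices, fast, slow):
    """Total return of a toy long-only fast/slow SMA crossover on one series.

    Hypothetical stand-in for a full backtest; ignores costs and slippage.
    """
    fast_sma = prices.rolling(fast).mean()
    slow_sma = prices.rolling(slow).mean()
    # Hold the asset on the day after the fast average is above the slow one.
    position = (fast_sma > slow_sma).astype(float).shift(1).fillna(0.0)
    daily_returns = prices.pct_change().fillna(0.0)
    return float((1.0 + position * daily_returns).prod() - 1.0)


# Synthetic prices so the sketch runs without any data bundle.
rng = np.random.RandomState(0)
prices = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 1000))))

# Run one "backtest" per parameter combination and collect the results.
results = []
for fast, slow in itertools.product([10, 20, 30], [50, 100, 200]):
    results.append((fast, slow, sma_crossover_return(prices, fast, slow)))

table = pd.DataFrame(results, columns=["sma1", "sma2", "total_return"])
print(table.sort_values("total_return", ascending=False))
```

With a real backtester, the body of the loop would call the engine once per combination and store a summary statistic instead. The top row of such a table is exactly where the over-fitting risk mentioned above comes from.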

On the question of connecting Zipline to an IB paper account:
Formerly one could do live trading here on Quantopian. Since Quantopian turned off this function, many people have been looking for alternatives. Me too. I've heard Zipline is one of them, but at that time Zipline was still in a development phase. I am not sure if Zipline is mature enough now. I thought you knew Zipline well. It does not matter if you haven't described this in your book. I just want to know if one can use Zipline for live trading.

Thomas, Have you looked at IBridgePy http://www.ibridgepy.com/ ?

Hi, I've enjoyed the book to the last page and am going back through and working with the examples. The symbols CU and NE are listed as currencies and TW is shown as an equity in the book, but I don't seem to have these in the couple of data feeds I'm using. What are they? In general, some comments in the code next to the futures symbols would make them easier to map to other data feeds.

Best, LP

@Lee
Yes, I am now using IBridgePy. But I find it is not stable and robust enough. My algo breaks quite often because of this or that problem. The documentation for backtesting is poor.

But the IBridgePy is easy to use and it is free.

CU is the Euro currency futures and NE is New Zealand currency futures. TW is MSCI Taiwan index futures.

The ticker symbols may appear slightly differently for some data providers.

Yes, mapping the codes is quite a joy :-p Two more: BL and LR?

Are you aware of any sort of Zipline upgrade that does incremental data updates instead of the full ingest?

BL is milling wheat. LR is Robusta.

I'm not sure, but I think the upcoming Norgate Data plugin for Zipline works with incremental updates. I don't know how they solved it though.
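For anyone mapping these codes to another data feed, the descriptions given in this thread could be kept in a small lookup table like the sketch below. The dictionary name and helper function are made up for illustration, and the descriptions are only as precise as the replies above. Any provider-specific ticker would have to be added by hand.

```python
# Descriptions exactly as given in this thread; not an official symbol list.
FUTURES_SYMBOLS = {
    "CU": "Euro currency futures",
    "NE": "New Zealand currency futures",
    "TW": "MSCI Taiwan index futures",
    "BL": "Milling wheat",
    "LR": "Robusta",
}


def describe(symbol):
    """Look up a root symbol, ignoring case; hypothetical helper."""
    return FUTURES_SYMBOLS.get(symbol.upper(), "unknown symbol")


print(describe("bl"))  # Milling wheat
print(describe("ZZ"))  # unknown symbol
```

Keeping the table next to the instrument universe in the model code doubles as the inline documentation requested earlier in the thread.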

A nice FT article on hedge funds following the trend. Unfortunately, it's paywalled.

I completed my first reading. Congratulations again on this great book, Andreas. I created my first custom bundle and it seems OK. Without your help it would have taken months! Of course there is still a lot of homework to do to use and customize all the valuable source code from the book.
I am also new to MySQL and found this part a bit light for a beginner. I have issues running the code from chapter 24, probably because of my low level. Any book or materials to recommend for learning MySQL combined with Python?
It would also have been great to focus on how to use pipeline with custom bundles. Everything I read on this topic is very technical and difficult to understand for someone with an intermediate level in Python. Maybe Andreas will need to write another book :-)
Any interest in building a community around this book? I am not sure the Q forum is the best place to discuss all these topics, but maybe it is...
Feedback welcome.

Andreas, thank you. LR - of course - thank you. I still can't map BL to Norgate Data futures - the closest seems to be KE, "KC HRW Wheat" but I am not confident it's the right mapping. Any comments?

Also, great news about the potential ND plugin to zipline. Have been contemplating finding a way to do fast and incremental "zipline ingests", more like "zipline snack" (or maybe nibble?) but would be happy if someone else solves the problem.

Cyril - you are in the deep end of the pool now! 1) as far as I can see, you are not required to use MySQL to work with the samples in the book, it's just a very useful tool for all the reasons Andreas lists in Ch24. I'm using MySQL for a variety of reasons beyond what the book covers for stock/futures metadata for example. 2) keep in mind that MySQL is a universe of learning unto itself and just about any book on MySQL will help you because you need to learn how to set up the database and run queries, all of which have little to do with Python or zipline. So find a book (or website) on MySQL that you like and just start experimenting. I should mention that with all things Python, getting the right versions of the MySQL libraries installed can be a challenge.

The idea to start a community/forum around the book other than this Q thread is good. But where?

-Laura P

Greetings all !

Andreas, with Laura Peterson's help I was finally able to install Zipline and get it working. She came out of nowhere and responded to one of my questions on Anaconda Community giving me the missing piece of the puzzle. Laura, again, thanks very much for your help.

I seem to be part of a very small group using Python on a Mac. Most of the fixes noted here and elsewhere are not relevant to the Mac OS, and some of the issues I experienced on my Mac cannot be answered by Windows users. Andreas, I continue to believe that posting the YML's you used to test your code on Windows, Mac OS and Linux machines would be very helpful to your readers.

I am an accountant, not a coder. So much of the code displayed in the book is directionally understandable, but I doubt I will be able to generate my own code after reading this book. Nevertheless, I am enjoying reading the book and the ideas presented.

I do have a few immediate questions:

1. Do the returns using the strategies in the book include dividends?

2. If so, how are the dividend declaration and payment dates captured and used in determining total returns?

3. Is there a way to separate dividends from Total Return?

Ed

Thanks Laura! You are right, I must learn to swim by myself, but the depth of the pool is a bit frightening :-) This book was the best I could get because it is written by a senior manager coming from the finance industry and sharing the same language. Happy to discuss here, as the main topic is Zipline / Quantopian, unless Andreas finds a better way to communicate. I found a couple of libraries on GitHub to improve communication between Zipline and local CSV files but will have to test them.

The MySql chapter nearly got cut. I had a couple of technical reviewers of the manuscript telling me that it's irrelevant to the core topic and would only confuse people. They may very well be right, and it's certainly not required. But I did take their advice and moved the chapter to the end of the book, instead of the middle where it originally was.

I find MySql really useful for maintaining a local securities database. You don't really need to go very deep into this topic to have use for it. You might want to look into the Head First book series on topics like this. I find that series really great for getting into a totally new technical subject. It's not written for software engineers, but rather people with a more casual background.

Discussion forum: I'd be happy to take suggestions. I had a similar thought, and a few people suggesting it.

At first, I considered setting up a vBulletin or similar on my own server, but I had some bad experiences in the past trying to police a site like that. Second, on suggestion I looked into using GitHub for the code sharing and discussion, but it doesn't seem like a great place for this kind of purpose.

I'm sure there must be a great existing site somewhere, allowing creation of sub communities with all kinds of built in functionality. I normally try my best to stay far away from the whole social media scene, and I've lost track of the players in the space. If anyone knows of a good site for this kind of purpose, please let me know.

Thank you very much for your answer Andreas. Writing a chapter on MySql was an excellent idea and we can easily feel that your initial plan was to spend more time on this topic. Thank you for recommending the Head First book!
Regarding the forum, the best would be to use this thread, but I am afraid it will become complex to update and read. I am not sure talking about Zipline is the main goal of this community. It could be better to use a Google Group dedicated to Zipline. I would love to have the opinion of the Quantopian team!
Moderation is a full-time job without any added value and with potential legal hurdles, so if you wish to create an independent forum, it is probably best to keep it private and small. Once again, Google Groups seems to be a good option, but I will ask my children tonight as they are my best social network advisors :-)

Hi Andreas

Just to let you know, I solved the technical problem myself. This little tip could be helpful to any Python beginner, so I will share it here. You need to patch benchmark.py and loader.py as the book says to make "Your First Zipline Backtest" work. However, you can't patch them and then run the test while Python is still running. You need to exit everything first, then patch, then restart Python and Zipline, set the Quandl API key, ingest the quandl bundle, and then run the test. Then it will work.

Also I notice that I need to set quandl API key and ingest quandl bundle EACH TIME I start python zipline backtesting. Once I quit Anaconda, everything is lost.

Now I am onto Portfolio Backtest. Hopefully everything will work fine from here onwards. Fingers Crossed.

Andreas, since yesterday, I cannot load/access your website: www.followingthetrend.com and your book Errata and Updates page. It says "The connection has timed out. The server at www.followingthetrend.com is taking too long to respond." Could you please have a look at that? Thank you.

Henry

Ordered the book. Thank you!

For the market data part, I am wondering if you have come across some good source of fundamental data apart from just market data.

The best free fundamental data sources I've found must be screen scraped with something like the Python BeautifulSoup package (which I certainly can't recommend as a practice).

The book website is slow and does timeout at times. Patience is a virtue.

The quandl API key issue that Henri Luk had could be any number of things. One possible solution is to set the environment variable at the system level (how you do this depends on whether you are running Windows/Mac/Linux).
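As a rough sketch of the environment-variable idea: a value set from inside Python only lives for that Python process (and anything it launches), which is one possible reason a key seems to be "lost" after quitting Anaconda. The helper name `require_env` and the placeholder key below are made up for illustration. On Windows, `setx QUANDL_API_KEY your_own_api_key` from a command prompt is one way to store the variable permanently for your user.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            "{} is not set; set it at the system level or in this session".format(name)
        )
    return value


# Only fills in the value if the variable is not already set.
# "your_own_api_key" is a placeholder, not a real key.
os.environ.setdefault("QUANDL_API_KEY", "your_own_api_key")

api_key = require_env("QUANDL_API_KEY")
```

Running a check like this at the top of a notebook makes a missing key fail early with a readable message instead of somewhere deep inside the ingest.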

Personally, it seems like I've spent way too much time on data source issues, but I have mostly solved them. I have a nice automated system that runs overnight to obtain equity, futures, and fundamental data, cache it, and insert it into MySQL and Zipline, plus a minimal logging facility so I can check the correctness of the data. It's a work in progress and the samples in the book were helpful in getting me started.

Hi Laura

Thank you for your advice. I am running Windows 10 Pro 64-bit. I am a beginner with Python. Would you mind elaborating on how I can set the environment variable at the system level, please?

YES! What you are doing with your data capturing is exactly how I want to do it for myself. I am still slowly working through the book at the moment (up to around page 200), so I am nowhere near that yet, but when I try to do that later, may I ask you for help if I get stuck somewhere?

Thank you for your help Laura.

Henry

Hi Andreas, I just bought your book and read all the comments, so I am excited to see what's in store :)

Henri, as with many things the internet is a better source of "how to" than I can type here and that is true of setting a system environment var.

Pulling together a custom overnight job is not too difficult. You make sure all the python applets you want to execute run cleanly from the command line, put them all in a single batch (.bat) file - including the call to anaconda "activate" to set the Python environment - and schedule it using the Windows "Task Scheduler". You may have to play around with the batch file and the task scheduler job properties a bit to get it to work the way you want. The process on a Mac or Linux is a bit different but well documented on the interwebs.
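The same "run everything in order" idea can also be sketched in Python instead of a batch file, which keeps it the same on Windows, Mac and Linux. This is only an illustration: the script names and the bundle name `my_bundle` are hypothetical, and the scheduler (Task Scheduler, cron, etc.) would still be what starts this file from the right Python environment.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each command in order and stop at the first failure.

    Returns a list of (name, return code) pairs for the steps that ran.
    """
    results = []
    for name, command in steps:
        completed = subprocess.run(command)
        results.append((name, completed.returncode))
        if completed.returncode != 0:
            # Do not ingest if an earlier download or database step failed.
            break
    return results


# Hypothetical script and bundle names; replace them with your own.
NIGHTLY_STEPS = [
    ("download", [sys.executable, "download_prices.py"]),
    ("database", [sys.executable, "update_mysql.py"]),
    ("ingest", ["zipline", "ingest", "-b", "my_bundle"]),
]

if __name__ == "__main__":
    for step_name, return_code in run_steps(NIGHTLY_STEPS):
        print("{}: exit code {}".format(step_name, return_code))
```

Stopping at the first non-zero exit code is a design choice: it avoids ingesting half-updated data, at the cost of having to rerun the whole job after fixing the failed step.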

Hi Laura (and anyone else who has faced this issue!), Did you have to modify the "random futures data" bundle (or whatever bundle you used to load your futures contracts) to ensure that the futures contracts loaded in the correct order to ensure the proper rolling of continuous futures contracts? Right now my continuous futures contracts are rolling on a yearly basis (from one "F" contract" to the next "F", for example) and that seems to be a result of the order in which the contracts were ingested. Can you share how you were able to overcome this issue? (Caleb, this might be related to your issue above too...? Unless you have already encountered/addressed the issue I am describing?) Thanks for your input!

Huge thanks to Laura Peterson and Robin Szeto. I used their resources and fixed my installation which I wouldn't have been able to do without them

Has anyone managed to make this setup work with minute based equity data?

if yes, who is your data provider? Any sample bundle specs for minute based data?

Thanks!

Just read it a week ago. A nice book covering many trading strategy ideas, especially in Python.

I've not had that SQLite error. I will say that the work Norgate Data have done to make their feed work well with Mr. Clenow's book is very helpful. They provide full Zipline integration and I saw they have even published the book examples using their Zipline plugin. It's good to see the reinvigorated interest in Zipline due to the book!

Hey @Andreas Clenow great book just finished it, do you have any work coming out on optimisation of portfolio/parameters soon?

Hey everyone. Just wondering whether anyone has had an ingest issue where a given futures contract is mapped to the month ahead. For example I have ingested a chain of ES contracts - and zipline has mapped the price/volume data of my ESZ15 to its ESH16. I have tried and checked and everything appears to adhere to all formatting and ordering rules, and am not getting any annoying errors. I feel success is close once this is resolved..

Anyone seen this before?

---Problem Solved---
I foolishly removed the part that inserts into the metadata table, because I had my database doing that instead. As a result, the SIDs for the ingested contracts began at 1 while the index of the metadata table began at 0, which naturally produced an offset of one contract. This matters, for anyone who runs into this issue at some point.
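A toy illustration of that off-by-one, in plain Python with no Zipline involved. The contract list and the function name `label_prices` are made up; the point is only that price data gets labelled by whatever metadata row shares its SID.

```python
# Hypothetical chain of contracts, in ingest order.
contracts = ["ESU15", "ESZ15", "ESH16"]

# Metadata table: row index 0, 1, 2, ... mapped to the contract symbol.
metadata = {row: symbol for row, symbol in enumerate(contracts)}


def label_prices(first_sid):
    """Show which metadata row each contract's price data ends up under."""
    labelled = {}
    for sid, symbol in enumerate(contracts, start=first_sid):
        # The price data for `symbol` is written under `sid`;
        # the metadata row with that same number decides its label.
        labelled[symbol] = metadata.get(sid)
    return labelled


aligned = label_prices(first_sid=0)
shifted = label_prices(first_sid=1)

print(aligned)  # every contract maps to itself
print(shifted)  # ESZ15 prices appear under ESH16, last contract has no row
```

With SIDs starting at 1, the ESZ15 data lands on the ESH16 row, which is the symptom described above.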

...just ordered book! Looks excellent. I am learning this, don't want to reinvent the wheel, and I am encouraged by all the previous positive posts. Thanks, Andrew!

Andreas,
I have been in technology for over 30 years and have recently stumbled upon trend following and algo trading which have quickly become my new passions. I finished your two books, Following the Trend and Stocks on the Move and am now in the middle of Trading Evolved. Your work is outstanding and I greatly appreciate the trading strategies you describe and explain how to backtest in Python/zipline. I am new to Python and have closely followed the great support provided by Laura P and Robin S. I have quandl correctly importing a bundle and I have Pyfolio 0.9.1 installed. However, when I run the First Zipline Backtest, I get the following error:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)


I have gone through all the suggestions in this thread and have googled the error message without being able to resolve the issue. Does anyone have an idea where I might be doing something wrong? Thanks in advance.

Here is the full error message:

JSONDecodeError Traceback (most recent call last)
in ()
71 handle_data=handle_data,
72 capital_base=10000,
---> 73 data_frequency = 'daily', bundle='quandl'
74 )

~\Anaconda3\envs\env_zipline\lib\site-packages\zipline\utils\run_algo.py in run_algorithm(start, end, initialize, capital_base, handle_data, before_trading_start, analyze, data_frequency, data, bundle, bundle_timestamp, trading_calendar, metrics_set, default_extension, extensions, strict_extensions, environ, blotter) 428 local_namespace=False,
429 environ=environ,
--> 430 blotter=blotter,
431 )

~\Anaconda3\envs\env_zipline\lib\site-packages\zipline\utils\run_algo.py in _run(handle_data, initialize, before_trading_start, analyze, algofile, algotext, defines, data_frequency, capital_base, data, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, environ, blotter) 157 trading_calendar=trading_calendar,
160 )

--> 103 self.bm_symbol,
104 )
105

--> 149 environ,
150 )
151 tc = ensure_treasury_data(

217 try:
--> 218 data = get_benchmark_returns(symbol)
219 data.to_csv(get_data_filepath(filename, environ))
220 except (OSError, IOError, HTTPError):

47 )
---> 48 data = r.json()
49
50 df = pd.DataFrame(data)

~\Anaconda3\envs\env_zipline\lib\site-packages\requests\models.py in json(self, **kwargs) 895 # used.
896 pass
898
899 @property

~\Anaconda3\envs\env_zipline\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw) 317 parse_int is None and parse_float is None and
318 parse_constant is None and object_pairs_hook is None and not kw):
--> 319 return _default_decoder.decode(s)
320 if cls is None:
321 cls = JSONDecoder

~\Anaconda3\envs\env_zipline\lib\json\decoder.py in decode(self, s, _w) 337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):

~\Anaconda3\envs\env_zipline\lib\json\decoder.py in raw_decode(self, s, idx) 355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

@Andreas

I have difficulties running "Populating Database".

I have:
- set up the SQL community server
- created the database model as described in the book
- used some CSV files from random stocks
- a working connection to the database

but it looks like it is not writing to it correctly; I'm quite sure the db is empty afterwards.
"Populating Database" has a small typo: "query = insert_init + vals + insert". I guess it should be "...insert_end".

Do you have more information on where I could have set up the SQL server wrong? (You said someone deleted the pages from your book; maybe store a PDF on your homepage.)
Could you check whether the version in the book/on your homepage works with your SQL? There might be one more typo...

Thanks
Carsten

the message I get:
ProgrammingError: (mysql.connector.errors.ProgrammingError) 1146 (42S02): Table 'dbstocksprice.equity_history' doesn't exist [SQL: "insert into equity_history \n (trade_date, ticker, open, high, low, close, volume, dividend, in_sp500)\n values \n ('1999-11-18 00:00:00', 'A', 50.24309847871317, 50.24309847871317, 50.24309847871317, 50.24309847871317, 58425783.68, 0.0, 0.0),('1999-11-19 00:00:00', 'A', 51.092206843003424, 51.092206843003424, 51.092206843003424, 51.092206843003424, 19084157.7, 0.0, 0.0),('1999-11-22 00:00:00', 'A', 50.39224360925426, 50.39224360925426, 50.39224360925426, 50.39224360925426, 8310053.22, 0.0, 0.0),('1999-11-23 00:00:00', 'A', 4

Hi Andreas,
I just bought your new book on Amazon and now waiting for it to arrive here in a few days. I look forward to reading it and chatting with you again.
All the best, from Tony M.

@Edward: Looks like you ran into the IEX API issue. Zipline fetches data in the background from IEX, which is somewhat odd as that data isn't really used for anything meaningful anyhow. Well, recently IEX decided to kill their API, bricking Zipline.

The good news is that the fix is easy. Just disable the benchmark data fetching. You won't need it.

More details: https://github.com/quantopian/zipline/issues/2480

@Carsten: Seems like you don't have a table with the name equity_history. Your query tries to insert values into a table which does not exist. Create the table if you haven't done so, and check the spelling if you already have it.

@Tony, I hope you'll like the book. I was speaking at a conference in Singapore a couple of weeks ago, but didn't see you there. Perhaps next time around, if you're still based in spor.

Ordering the book as well! Looks great, thanks for sharing!

I'm running a futures backtest and trying to get a more complete tear sheet, but am running into a few issues.
I have:

def analyze(context, perf):
    returns, positions, transactions = pf.utils.extract_rets_pos_txn_from_zipline(perf)
    pf.create_full_tear_sheet(returns, positions=positions, transactions=transactions, live_start_date='2001-02-01', benchmark_rets=None)


This does return quite a few more helpful stats for the backtest, but when it lists the "Top 10 Positions" (long, short, overall), it's not calculating their percentage return accurately. Does this have to do with the cost basis? What metrics are other folks using for full tear sheets on futures backtests?
Also, it is not returning the benchmark. The code in the book has "benchmark_rets=None". Is that correct? I patched the benchmark file, and everything else seems to be running smoothly.

The patch you made to not get the benchmark means you will not get benchmark results from the backtest.

Thanks, Paul. That's sort of what I had thought, but the other sample backtests (at least in the book) still seemed to include benchmark results.
So what are other folks doing as a workaround, or are people just not incorporating benchmarks in their futures backtests on zipline now?

I give up. I have spent hours trying to install this thing in my Mac with no success...

Could anyone recommend a quality data provider that supports macOS? Norgate is Windows only. Thanks very much.

Raphael, I use Interactive brokers data for generating trade signals. Works well on a Mac and free for non professionals too.

Anyone had an issue installing nb_conda package in Python 3.5 environment?

I get the following message when attempting to install the package:

"UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:
Specifications:
- nb_conda -> python[version='>=2.7,=3.6,=3.7, python[version='>=3.6']

Bernie,

I (and others) had the same problem. Our workaround was to install nb_conda using the terminal rather than through the GUI of Anaconda Navigator. From your environment in Anaconda Navigator, open up the terminal. Then paste in:

conda install -c conda-forge nb_conda

Cheers!

Tom

@Tom

I got the same issue, but

conda install -c conda-forge nb_conda

allowed me to install the nb_conda package.

Unfortunately, Jupyter Notebook gave an error when opening the conda tab. Does it work in your environment? Do you use Mac or Windows? (I'm on a Mac)

@Carsten

I'm using both Linux (Ubuntu 18.04) and Windows 10 on a dual-booted machine. I got it to work on both. I did not have problems opening the environment in Jupyter Notebook. One idea: try opening it in Jupyter Lab instead of Jupyter Notebook? (Jupyter Lab is just the newer version of Jupyter Notebook and is available in Anaconda Navigator.)

You can see the exact sequence I followed in my long post at the bottom of this thread: https://www.followingthetrend.com/2019/08/trading-evolved-errata-and-updates/

I hope that helps. I'm afraid I can't help you with any Mac-specific problems.

Any suggestions would be greatly appreciated!

With the help of comments here and elsewhere, and a great deal of fiddling, I have managed to get through the first 7 Chapters. Now, with no errors. And thanks to Tom above got nb_conda installed.

I am in Chapter 8, simply opened the code given for the chapter in a new Jupyter Notebook and receive the error message below for the first cell:

ValueError Traceback (most recent call last)
in ()
5 from datetime import datetime
6 import pytz
----> 7 import pyfolio as pf
8
9 def initialize(context):
D:\Chip\Documents\Anaconda3\envs\Z35_Test\lib\site-packages\pyfolio\__init__.py in ()
2
3 from . import utils
----> 4 from . import timeseries
5 from . import pos
6 from . import txn
D:\Chip\Documents\Anaconda3\envs\Z35_Test\lib\site-packages\pyfolio\timeseries.py in ()
23 import scipy as sp
24 import scipy.stats as stats
---> 25 from sklearn import linear_model
26
27 from .deprecate import deprecated
D:\Chip\Documents\Anaconda3\envs\Z35_Test\lib\site-packages\sklearn\__init__.py in ()
72 else:
73 from . import __check_build
---> 74 from .base import clone
75 from .utils._show_versions import show_versions
76
D:\Chip\Documents\Anaconda3\envs\Z35_Test\lib\site-packages\sklearn\base.py in ()
18
19 from . import __version__
---> 20 from .utils import IS_32BIT
21
22 _DEFAULT_TAGS = {
D:\Chip\Documents\Anaconda3\envs\Z35_Test\lib\site-packages\sklearn\utils\__init__.py in ()
18 from scipy.sparse import issparse
19
---> 20 from .murmurhash import murmurhash3_32
21 from .class_weight import compute_class_weight, compute_sample_weight
22 from . import _joblib
__init__.pxd in init sklearn.utils.murmurhash()
ValueError: numpy.ufunc size changed, may indicate binary incompatibility. Expected 216 from C header, got 192 from PyObject

I am on a Win10/64 system and have reinstalled Anaconda twice and created a fresh Python3.5 environment multiple times all with the same result.

Any suggestions will be greatly appreciated.

Chip

Hello Andreas,

I am reading your book and got stuck with ingesting Quandl data. The data can be ingested, but the date format cannot be parsed by pandas. Maybe my pandas version is wrong (0.22.0). Or maybe Quandl updated their date format. Can you share what pandas version are you using? The error occurs at

Error parsing datetime string "2018_11_25T05:24:33.144739" at position 4


Full backtrace

results = run_algorithm(
start=start_date,
end=end_date,
initialize=initialize,
analyze=analyze,
handle_data=handle_data,
capital_base=10000,
data_frequency='daily',
bundle='quandl'
)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
pandas/_libs/tslib.pyx in pandas._libs.tslib.convert_str_to_tsobject()

pandas/_libs/src/datetime.pxd in datetime._string_to_dts()

ValueError: Error parsing datetime string "2018_11_25T05:24:33.144739" at position 4

During handling of the above exception, another exception occurred:

ParserError                               Traceback (most recent call last)
pandas/_libs/tslib.pyx in pandas._libs.tslib.convert_str_to_tsobject()

pandas/_libs/tslibs/parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/dateutil/parser/_parser.py in parse(timestr, parserinfo, **kwargs)
1373     else:
-> 1374         return DEFAULTPARSER.parse(timestr, **kwargs)
1375

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/dateutil/parser/_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
648         if res is None:
--> 649             raise ParserError("Unknown string format: %s", timestr)
650

ParserError: Unknown string format: 2018_11_25T05:24:33.144739

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
/opt/anaconda3/envs/quant/lib/python3.5/site-packages/zipline/data/bundles/core.py in most_recent_data(bundle_name, timestamp, environ)
485                      filter(complement(pth.hidden), candidates),
--> 486                      key=from_bundle_ingest_dirname,
487                  )],

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/zipline/data/bundles/core.py in from_bundle_ingest_dirname(cs)
122     """
--> 123     return pd.Timestamp(cs.replace(';', ':'))
124

pandas/_libs/tslib.pyx in pandas._libs.tslib.Timestamp.__new__()

pandas/_libs/tslib.pyx in pandas._libs.tslib.convert_to_tsobject()

pandas/_libs/tslib.pyx in pandas._libs.tslib.convert_str_to_tsobject()

ValueError: could not convert string to Timestamp

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-9-ed12f86c5de4> in <module>()
7     capital_base=10000,
8     data_frequency='daily',
----> 9     bundle='quandl'
10 )

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/zipline/utils/run_algo.py in run_algorithm(start, end, initialize, capital_base, handle_data, before_trading_start, analyze, data_frequency, data, bundle, bundle_timestamp, trading_calendar, metrics_set, default_extension, extensions, strict_extensions, environ, blotter)
428         local_namespace=False,
429         environ=environ,
--> 430         blotter=blotter,
431     )

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/zipline/utils/run_algo.py in _run(handle_data, initialize, before_trading_start, analyze, algofile, algotext, defines, data_frequency, capital_base, data, bundle, bundle_timestamp, start, end, output, trading_calendar, print_algo, metrics_set, local_namespace, environ, blotter)
139             bundle,
140             environ,
--> 141             bundle_timestamp,
142         )
143

519         if timestamp is None:
520             timestamp = pd.Timestamp.utcnow()
--> 521         timestr = most_recent_data(name, timestamp, environ=environ)
522         return BundleData(
523             asset_finder=AssetFinder(

/opt/anaconda3/envs/quant/lib/python3.5/site-packages/zipline/data/bundles/core.py in most_recent_data(bundle_name, timestamp, environ)
495                 'maybe you need to run: $ zipline ingest -b {bundle}'.format(
496                     bundle=bundle_name,
--> 497                     timestamp=timestamp,
498                 ),
499             )

ValueError: no data for bundle 'quandl' on or before 2019-12-30 02:43:25.867976+00:00
maybe you need to run: $ zipline ingest -b quandl


I know the data have been successfully ingested, because I can execute the following:

import quandl
data = quandl.get("WIKI/AAPL", trim_start = "2000-12-12", trim_end="2019-12-30", authtoken=quandl.ApiConfig.api_key)


@Andreas

I managed to populate the data base (Chapter 24)

Just 3 issues

• there is a small typo: query = insert_init + vals + insert should be query = insert_init + vals + insert_end

• For Mac it reads always a hidden file, my solution is: -> while '.DS_S' in symbols: symbols.remove('.DS_S')
for Mac this should look like this:

def process_symbols():
    # Remember slicing? Let's slice away the last four
    # characters, which will be '.csv'
    # Using [] to make a list of all the symbols
    symbols = [s[:-4] for s in os.listdir(data_location)]
    # On a Mac, the hidden '.DS_Store' file becomes '.DS_S' after slicing
    while '.DS_S' in symbols: symbols.remove('.DS_S')
    for symbol in tqdm_notebook(symbols, desc='Importing...'):
        import_file(symbol)

• For the database table you used the attribute 'Unique' for trade_date and ticker (Figure 24‑2 in the book).
If I use this, I can transfer exactly one row for every ticker to the database.
If I remove the attribute for ticker, I get all the data, but just for one ticker.
The other tickers don't load for dates that already exist from the first ticker.
I have to remove unique for both to fill the database, which is no problem if the data source is clean.
I think you wanted to avoid having the same trade_date several times for one ticker, but with the setting shown in your book, the result is as explained above.
If one could set unique for the combination of ticker AND trade_date, this should do the job, but I have not found the setting so far.
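A unique constraint on the combination of the two columns can be sketched as follows. This uses Python's built-in sqlite3 module rather than MySQL, purely so it runs without a server, and the table is cut down to three columns for illustration. In MySQL, one equivalent would be `ALTER TABLE equity_history ADD UNIQUE (trade_date, ticker)`, i.e. a single unique index containing both columns rather than a separate 'Unique' flag on each column.

```python
import sqlite3

# In-memory database, only for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table equity_history (
        trade_date text not null,
        ticker text not null,
        close real,
        unique (trade_date, ticker)  -- the pair must be unique, not each column
    )
    """
)

rows = [
    ("1999-11-18", "A", 50.24),
    ("1999-11-19", "A", 51.09),  # same ticker, new date: allowed
    ("1999-11-18", "B", 10.00),  # same date, other ticker: allowed
]
conn.executemany("insert into equity_history values (?, ?, ?)", rows)

# A second row for the same ticker on the same date is rejected.
try:
    conn.execute(
        "insert into equity_history values (?, ?, ?)", ("1999-11-18", "A", 99.0)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("select count(*) from equity_history").fetchone()[0]
print(duplicate_rejected, row_count)
```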

Thankx
Carsten

@Tom

I have tried the install sequence suggested in your link above. I have done it several times, the last time from a complete uninstall/reinstall of Anaconda. I continue to get a Kernel Error when I open Jupyter Notebook. The last couple of lines of the error are:

File "D:\Chip\Documents\Anaconda\envs\pyt35\lib\site-packages\jupyter_client\connect.py", line 100, in secure_write
win32_restrict_file_to_user(fname)
File "D:\Chip\Documents\Anaconda\envs\pyt35\lib\site-packages\jupyter_client\connect.py", line 53, in win32_restrict_file_to_user
import win32api
ImportError: No module named 'win32api'

I am on a Windows 10 64bit laptop.

Thanks for any help.

Chip

Chip,

Try the following command in the terminal (make sure you go into the terminal from the environment you created):

conda install pywin32

After this installs, get out of everything. Out of the terminal. Out of Anaconda Navigator. Then go back in and see if you can get it to work.

I had this exact problem and error code, and (searching for "ImportError: No module named 'win32api'" in Google) I found that solution as the top suggested solution on Google. In general, if you get an error code, try Google. It's crazy how many times the top search result or two will be what you need.

I hope this helps, and kudos to you for your persistence.

@ Chip

I have been working with a good Windows 10 installation of Python 3.5 and Zipline for the past month. Unfortunately, I recently upgraded a package and the environment no longer worked with Zipline backtest tear sheets. As such, I removed and tried to re-install Python and Zipline. I spent over two days trying to get a working environment again. I tried Lauren's steps (9/3/2019) and Robin Szeto's YML configuration above, which was the original configuration I got working in early December. I could no longer get it to work. I tried ideas on https://github.com/quantopian/zipline/issues/2514 and ran into other errors, including the Kernel error you encountered. Finally, I tried Tom's configuration steps at the bottom of https://www.followingthetrend.com/2019/08/trading-evolved-errata-and-updates/. I ran into errors and had to try a few times before I was finally able to get my environment working. Here are the specific steps I took:

1. Uninstall all prior copies of python, anaconda
a. NOTE: After doing this multiple times over the past few days, I opted to just delete the zipline environment within Anaconda. I then removed the corresponding subdirectory on my home folder . This process was successful when I finally got the configuration steps in a working order. Also, I had followed Lauren's recommendation to install VS14 build tools and add "RC.exe" to the path.
2. Install the latest Anaconda Distribution –
a. “Anaconda3-2019.10-Windows-x86_64”
3. Update conda:
a. (base) conda update conda
4. Create new zipline env:
a. (base) conda create -n yourenvname python=3.5 anaconda
b. NOTE: I used (env_zip35)
5. Activate env:
a. (base) conda activate yourenvname
6. Upgrade pip:
a. (env_zip35) python -m pip install --upgrade pip
7. Install modules using pip and conda (specific method is important):
a. (env_zip35) pip install msgpack
b. (env_zip35) pip install pyfolio
c. (env_zip35) conda install -c Quantopian zipline
d. (env_zip35) conda install -c conda-forge nb_conda
e. (env_zip35) conda install -c conda-forge matplotlib
8. Uninstall gevent:
a. (env_zip35) pip uninstall gevent
b. NOTE: I did encounter the RLOCK issue, so I initially used conda uninstall gevent, but that removed needed components. One of them was win32api.
9. Per Tom’s suggestion, re-run zipline install
a. (env_zip35) conda install -c Quantopian zipline
10. Verify dependencies:
a. (env_zip35) pip install pipdeptree
b. (env_zip35) pipdeptree -p zipline
c. NOTE: Look for any issues that might be at top of list. I saw none following these steps.
11. Import quandl bundle:
a. (env_zip35) set QUANDL_API_KEY=your_own_api_key
b. (env_zip35) zipline ingest -b quandl
12. Install Norgate and Norgate-zipline integration (optional)
a. (env_zip35) pip install norgatedata zipline-norgatedata

After most of the individual installs/updates, I would open Jupyter Notebook and run a couple of quick tests to make sure things were progressing properly. I had identified Kernel errors and other issues during previous attempts.
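A quick test of this kind could look something like the sketch below. It is only an illustration: the helper name check_imports and the module list are assumptions, not the exact cells that were run here. It tries each import in turn and reports either the version or the error, so a broken package shows up right after the step that broke it.

```python
import importlib

# Hypothetical module list; adjust it to whatever your setup needs.
MODULES = ["zipline", "pyfolio", "matplotlib", "numpy", "pandas"]


def check_imports(names):
    """Try to import each module and return {name: version or error text}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "imported (no __version__)")
        except Exception as exc:  # ImportError, binary incompatibility errors, etc.
            report[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
    return report


if __name__ == "__main__":
    for name, status in check_imports(MODULES).items():
        print("{:12s} {}".format(name, status))
```

Running it in a notebook cell started from the new environment after each install step makes it easier to see which step introduced a problem.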

I hope this helps!

Ed H

Ed,

Thanks for putting up your detailed steps. We are all getting through this thing together!

Tom

Tom & Ed;

Thanks for the help. Installing pywin32 solved the Kernel error. However, I am now back to the same error I have had with each of the different installation methods I have tried. (I posted the error message about 4 posts above.)

I can run the code for Chapter 7 with no problems. When I load the code from Chapter 8 and run the first cell, it throws the error:
“ValueError: numpy.ufunc size changed, may indicate binary incompatibility. Expected 216 from C header, got 192 from PyObject”

I have searched for answers but with no luck.

Chip

Chip,

I didn't have any problems with the first few steps of Chapter 8 so can't help on that one. I'm taking a break from the book and playing with algos directly on Quantopian for a bit -- as I felt the "setting up the environment" stuff was getting in the way of my actually learning the bigger picture of backtesting, writing the code, etc. That said, I'll circle back relatively soon.

Tom

@Chip,

I consistently ran into that same ValueError: numpy.ufunc size changed. The only way I got around it was following the detailed steps I used in my post from yesterday. Please give it a try and see if you have success.

Ed

Ed;

I just, very quickly, built a new environment following your steps. It appears to have worked, as I am able to run the code from Chapter 8, though I do get one error from the next-to-last cell, which I believe is only a typo or something similar. I will research it later, after doing a complete reinstall of Anaconda and the environment.

Thanks for your step by step guide and encouragement.

Chip

@Laura

I think an SQL server is a nice idea, as you mentioned, so I set up MySQL (Community Edition) as explained in the book.

It finally runs fine and does what it should, but it seems very slow.

Here is a comparison:
- Reading the CSV files directly from disk and ingesting them took 1:35 minutes (no SQL involved).
- First sending them to the database (12:30 minutes) and then ingesting them into Zipline (50 minutes!)
I always used the random stocks.

It looks like I'm doing something wrong here, or are SQL servers really that slow?
If I'm doing something wrong, where can I start to look?
I just installed it with its standard installer and used the Workbench to set it up.
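One possible place to start is to measure where the time actually goes, stage by stage. The sketch below is only a generic timing helper and not a diagnosis of the setup described here; the name timed and the stage label are made up for illustration.

```python
import time
from contextlib import contextmanager

# Collected timings in seconds, keyed by stage label.
timings = {}


@contextmanager
def timed(label):
    """Record how long the wrapped block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Stand-in workload; in practice you would wrap a database query or a
# per-symbol read inside the bundle code instead.
with timed("example stage"):
    total = sum(range(100000))

for label, seconds in timings.items():
    print("{}: {:.4f} s".format(label, seconds))
```

Wrapping the query and the per-symbol processing separately would show whether the database reads or the rest of the ingest dominate the 50 minutes.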

Thanks,
Carsten

@Andreas, and All

I'm playing a bit with the momentum model in Chapter 12 and noticed some strange behavior.

The number of positions can dramatically overshoot the set portfolio size, especially with a small value such as portfolio_size = 5.
In May 2005 it goes above 50!

I checked the individual steps: the new portfolio has the correct size, and everything looks fine in the code.
At first it starts with just 1 additional stock, which seems to come back from Zipline.
If I reduce the weight passed to order_target_percent, the effect starts to dampen.
With half of the portfolio invested, I get back to 5 stocks in May 2005 and occasionally 6 in other periods.

To keep this simple, I used 'weight = 0.5/portfolio_size' instead of 'weight = vola_target_weights[security]'.
I used the random stock sample to be able to compare results, and
I monitored the size of kept_positions with 'print(len(kept_positions))'.

Has anybody else seen this behavior?
Or do I have a unique problem with my Zipline installation?
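For anyone trying to reproduce this: len(kept_positions) counts the positions the model intends to keep, which is not necessarily what the account actually holds. A minimal sketch for counting actual holdings is below. The helper names are made up, and the stand-in objects only mimic the amount attribute that Zipline position objects carry; inside a backtest you would pass context.portfolio.positions instead.

```python
from types import SimpleNamespace


def count_open_positions(positions):
    """Count holdings with a non-zero share amount (long or short)."""
    return sum(1 for position in positions.values() if position.amount != 0)


def excess_positions(positions, portfolio_size):
    """How many holdings exceed the intended portfolio size (0 if none)."""
    return max(0, count_open_positions(positions) - portfolio_size)


# Stand-in data for illustration only.
fake_positions = {
    "AAA": SimpleNamespace(amount=100),
    "BBB": SimpleNamespace(amount=0),
    "CCC": SimpleNamespace(amount=-50),
}

print(count_open_positions(fake_positions))  # 2
print(excess_positions(fake_positions, 1))   # 1
```

Logging both numbers on each rebalance day would show whether the extra positions are actually held in the account or only appear in the model's own lists.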

Best
Carsten

I've just started messing around with Quantopian and I'm looking forward to learning more.

@chip,

Issue: “ValueError: numpy.ufunc size changed, may indicate binary incompatibility. Expected 216 from C header, got 192 from PyObject”

It looks like I had a similar problem in Chapter 8, which is now resolved. If it is still an issue, you may want to look at https://github.com/scikit-image/scikit-image/issues/3655.
My steps included copying the environment (so that I could revert if necessary) and running 'pip install numpy==1.16.1' in the new environment.
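This error message generally points to a compiled package that was built against a different numpy than the one installed. A small sketch for checking the installed version against the 1.16.1 pin mentioned above follows; the helper names are made up for illustration, and the version parsing is deliberately simple.

```python
def version_tuple(version):
    """Turn '1.16.1' into (1, 16, 1); trailing text such as 'rc1' is ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad '1.16' to (1, 16, 0)
        parts.append(0)
    return tuple(parts)


def is_older(installed, required):
    """True if the installed version is lower than the required one."""
    return version_tuple(installed) < version_tuple(required)


if __name__ == "__main__":
    try:
        import numpy
        print("numpy", numpy.__version__,
              "older than 1.16.1:", is_older(numpy.__version__, "1.16.1"))
    except ImportError:
        print("numpy is not installed in this environment")
```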

thanks,
Stephen

I'm having trouble creating a tear sheet for the combined portfolio. When you store the results of the combined models, do you just run a tear sheet the same way you did for a single strategy? That doesn't seem to be working for me.

Hello all,
Could you please help? I'm having similar issues with Zipline.
I too am working through Trading Evolved and am stuck on the Zipline backtest after seemingly getting past the installation.

So far I have:
1. Created a zip35 environment in Anaconda
2. Installed Zipline
3. Failed to install 'nb_conda' v2.2.1, getting an UnsatisfiableError
4. Installed nb_conda via the terminal (conda install -c conda-forge nb_conda), which seemed to work
6. Launched Jupyter Notebook
7. Created a new notebook from conda_env:zip35
8. Started writing the first lines of code:

# Import Zipline functions that we need

from zipline import run_algorithm
from zipline.api import order_target_percent, symbol

...but nothing happens. Also, when I tried renaming the notebook, it said "settings is null".

Can you advise on next steps? Is there an issue with the conda install?
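One check that may help narrow this down is confirming which Python interpreter the notebook kernel is really running. The sketch below uses only the standard library; the function name describe_interpreter is made up, and the zip35 default simply mirrors the environment name used above.

```python
import os
import sys


def describe_interpreter(expected_env="zip35"):
    """Report which Python this kernel runs and whether it sits in the expected env."""
    prefix = sys.prefix
    env_name = os.path.basename(os.path.normpath(prefix))
    return {
        "executable": sys.executable,
        "prefix": prefix,
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "in_expected_env": env_name == expected_env,
    }


for key, value in describe_interpreter().items():
    print("{}: {}".format(key, value))
```

If the cell prints nothing at all, the kernel itself is probably not running, which would point at the Jupyter setup rather than at Zipline.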

@Boris

I experienced similar problems and solved them by starting from scratch and following the detailed post (from about a week ago) by Edward Hayman above.

Chip

I followed the great instructions from Edward Hayman and it works! Thank you, Edward Hayman.