Thoughts on the Goldman announcement

Hi everyone,

Goldman had some pretty interesting news today -- they announced that some of their "secret sauce" would be open sourced.
The original article is here: http://www.wsj.com/articles/goldman-sachs-to-give-out-secret-sauce-on-trading-1439371800
I wrote some of my reactions in a blog post: http://blog.quantopian.com/secret-sauce-vs-open-source/

I'd love to hear what the community thinks.

thanks,
fawce

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

20 responses

First thought: I can't read the WSJ article.

Second thought: Based on your post and others, it doesn't sound like Goldman will be providing open source software, which makes this not very interesting to me. I have been involved with open source communities as a user, tester, documentation writer, bug finder, and marketer. I find it incredibly beautiful to be improving something that the entire world can benefit from.

That said, I will take this time to thank you for Zipline!

Hi Fawce,

Recently, I spoke with a guy from Red Hat (met him on an airplane). He said that their software runs a lot of financial services servers. I suspect that under the hood, there is lots of open-source code in the industry.

I guess if you want to lead the charge, open source all of the Quantopian software. The problem is you'd end up giving away the business; somebody could just set up another Quantopian if you revealed everything. I've always been curious how you manage the backend stuff (bar construction from Nanex Nxcore feed, IB communications, timing/clocking, etc.). Any plans to open source it? Or is there too much value in the intellectual property/competitive advantage?

Cheers,

Grant

As an ex-GS'er and also someone who worked in HFT/algo trading for 11 years, I can never forgive them for what they did to Sergei Aleynikov. To think that I could be harassed and put in jail by my ex-employer for doing what was a very mundane thing at the time is a very scary thought, if for no other reason than it is so real.

I tend to think this decision is more an attempt to stem the tide of people (like myself) choosing Silicon Valley over the financial industry than any real desire to be open.

Grant, running a website like Quantopian is about more than just the source; it's about the data, the operations side of it, the connectivity, etc. The actual source code in and of itself is not nearly enough to "set up another Quantopian." Even if you did manage to do that initially, the platforms are being worked on daily, so eventually you would fall behind and become an also-ran.

But yes, since 2006 or so, Linux has been the platform of choice for every trading system I have seen. There is tons of open source in the financial world... it's the foundation on which it is built! And not just platforms, but software libraries as well. It's not even necessarily about cost; it's just so much easier to download something off the web that is completely accessible than to go through the "sales process" and wait for the bill to be paid, etc.

At Citi it turned out that all the glossy interfaces boiled down to some processes running on a Linux box that churned out prices to all their clients, like a broker would. A lot of other pricing models were database-driven (options) but adjusted in real time by traders on the floor. The real technology was in how (e.g. at MS) algos could be tested on the fly with real market data, real-time debugging, etc.

The banks look at what their rivals are about to offer, so expect the others to counter if it's successful. I rather get the feeling that what they are offering is still behind closed curtains, though.

Regarding being able to "just set up another Quantopian" with access to all of the code, I realize more details, talented people, etc. would be needed. The point is that Quantopian has made a decision not to reveal some code used in their operation, even though seemingly it would be straightforward to have it all out in the open on GitHub, just like Zipline. In his blog post, Fawce says "Financial services companies need to do more than simply use open source; they need to lead open source projects; they need to contribute code" but only a portion of Quantopian's code is public (for good reason, since they are trying to build a profitable business, just like everybody else). So, hey, why not open-source the code to generate the real-time OHLCV minute bars and trailing data window used in the Quantopian platform? I imagine it is the same sort of business decision at the Goldmans of the world. "Hmm? Wouldn't we be giving away the farm? What is the business case? Would we have legal liabilities? What about customer support costs? Forget it. Too complicated and murky."

By the way, I just came across an article about IBM open-sourcing code:

http://www.infoq.com/news/2015/07/ibm-developerworks-open

It sounds like the real deal, in contrast to the misleading Goldman article (I read nothing about source code being put up on GitHub or elsewhere).

Hi Fawce,

I agree with the idea that the GS announcement doesn’t mean much. It’s really a marketing spin on a new product offering - or rather, an existing product being advertised to a broader audience.

While I appreciate the call for more open-source and “giving back” in the financial industry, I will also echo Grant’s concerns about Quantopian specifically. Yes, there’s Zipline and Quantopian is light years ahead of the curve in this sector. And yet, if we’re having a discussion about fundamental values, I wonder how much else Quantopian could (and does?) contribute back to the open-source community. In practice, of course, this question turns into pure speculation about economics. Would the open-sourcing of other projects benefit Quantopian in the long run or rather consume resources unnecessarily and potentially erode the competitive edge?

I don’t see a clear, generic answer to this question. IBM is notorious for open-sourcing projects that primarily promote its proprietary services. Even Red Hat, the “open-source company,” has its secret sauce. Heck, I myself follow a similarly balanced approach between transparency and non-disclosure in my area of business. In this sense, “calling” for transparency is a good thing, but “calling others out” does not seem productive.

As others have pointed out, Quantopian is trying to build a profitable, sustainable business, which means that our decisions about what to open-source need to balance the benefit to the community against the cost to Quantopian. If we give away so much of our work that we enable competitors to leapfrog us and put us out of business, that benefits neither us -- because our efforts to build a profitable business will have failed -- nor the open-source community -- because we will no longer be around to maintain and support the software we open-sourced.

Furthermore, when open-sourcing is done properly, it involves a great deal of time and effort, over and above just changing the status of the GitHub repository from "private" to "public." For an open-source project to be successful, it needs to be well-documented and well-supported. Also, though we obviously try to maintain high coding standards across our entire application code-base, we rely extensively on shared knowledge among our engineering team; to be perfectly frank, I think a lot of private code would require substantial clean-up before we would be comfortable making it public. :-/

There are certainly other pieces of our code-base in addition to Zipline that we will eventually open-source, when we have the resources to do the necessary work. There are also pieces that we won't open-source, either because they're too valuable for us to give away, or because they're not valuable enough. By that, I mean that if a particular piece of our application wouldn't be of interest to a large enough group of people, then the benefit of open-sourcing it isn't worth the required effort. Our real-time market-feed ingester that produces the minute bars we feed into live algorithms is probably a good example of that. The NxCore data feed is really expensive. Most of the people who can afford it are large enough trading shops that they're going to want to roll their own ingesters that are tailored to their environments.
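
To illustrate what a bar-building ingester does at its simplest, here is a minimal Python sketch. It is not Quantopian's ingester and it does not use the NxCore API; the `minute_bars` function, the `(timestamp, price, size)` tick layout, and the sample ticks are all assumptions made up for illustration.

```python
from collections import OrderedDict
from datetime import datetime


def minute_bars(ticks):
    """Aggregate (timestamp, price, size) ticks into OHLCV bars keyed by minute.

    Hypothetical helper: assumes ticks arrive in time order. A real feed
    handler would also have to deal with late prints, corrections, and
    trade condition codes, none of which are handled here.
    """
    bars = OrderedDict()
    for ts, price, size in ticks:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First trade of the minute sets open, high, low, and close.
            bars[minute] = {
                "open": price,
                "high": price,
                "low": price,
                "close": price,
                "volume": size,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # the latest trade is the running close
            bar["volume"] += size
    return bars


# Made-up sample ticks for a single symbol.
ticks = [
    (datetime(2015, 8, 12, 9, 30, 1), 100.0, 200),
    (datetime(2015, 8, 12, 9, 30, 40), 100.5, 100),
    (datetime(2015, 8, 12, 9, 31, 5), 100.2, 300),
]

for minute, bar in minute_bars(ticks).items():
    print(minute, bar)
```

The aggregation itself is the easy part; the cost and the environment-specific work sit in the feed connectivity and data cleaning around it.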

On the specific question of "how much else Quantopian could (and does?) contribute back to the open-source community," we are active contributors to a number of open-source projects, most notably IPython, JupyterHub, MongoDBProxy (though that project was started by a third party, it might not be an exaggeration to say that at this point the majority of code in it was written by Quantopian), and nose_gevented_multiprocess; we sponsor Boston Python meetups on a regular basis; we frequently give useful presentations at Boston Python meetups that are unrelated to Quantopian; we organize and sponsor regular, free Algorithmic Trading meetups all over the U.S. and occasionally in other countries as well; we organize and sponsor a regular, free lecture series about quantitative trading which uses the Quantopian platform as a teaching tool but is very much relevant even to people who do not use Quantopian; and we've open-sourced a number of other projects in addition to Zipline, including pyfolio, qgrid, pgcontents, coal-mine, qdb, and metautils.

I think, frankly, that we have every right to be proud of our efforts to support and grow the open-source and fintech communities. We have nothing to prove in this area, and our efforts to support and grow these communities will also continue to grow.

NxCore data feed is really expensive.

How much does Quantopian pay for it? I figured there would be no harm in asking, although I understand you may need to keep the information confidential.

I've asked before, but never really came to an understanding--why is financial data so expensive? What is Nanex doing that is so special that they can charge more than a handling fee for having data pass through their hardware on its way to Quantopian? Aren't they just aggregating feeds from various electronic markets and piping the data to the Quantopian custom ingester? I've gathered that it is very loosely a real-time system anyway, so we aren't talking about fancy links and gear. Or is it just the scale at which Quantopian is operating? Say 40,000 users, times $100 per year per user, and you are up to $4M per year.
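
As a quick check of the back-of-the-envelope figure above, here are a couple of lines of Python. The $100 per user per year is the hypothetical number from this post, not a known price, and the variable names are made up for illustration.

```python
# Hypothetical inputs taken from the estimate above, not actual pricing.
users = 40_000
fee_per_user_per_year = 100  # dollars, assumed

total_per_year = users * fee_per_user_per_year
print(f"${total_per_year:,} per year")
```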

Similarly for the historical data used by Quantopian--why so expensive and inaccessible? If the financial industry had some sense, collectively, it would provide free clean data for the masses so that its customers could use financial services more effectively (or maybe they don't want customers in-the-know?). And just imagine if Quantopian didn't have to go through so much trouble keeping data from leaking out ("Oh no! Precious minutely OHLCV bars have escaped the Q fortress! Sound the alarm! Contact the lawyers!").

If we're talking about openness, the discussion should include the data, which is kinda where the real problem lies, since analytical software is basically a free commodity. Take your pick--Excel, Matlab, Python, R, whatever--not an impediment. But for your average Joe, not working in the industry, access to data is a blocking issue (which Quantopian has sorta solved, as elegantly as you could).

Maybe there is a regulatory/transparency argument that data emanating from public markets should be free (or at least offered at-cost)?

Maybe there is a regulatory/transparency argument that data emanating from public markets should be free (or at least offered at-cost)?

I think this is why there's a SIP feed at all. That said, even just the infrastructure for real-time feeds is expensive: multiple T3s when I last checked. If you want the fastest possible feed, you need direct lines to each of the market centers and have to pay whatever they charge. If you want the cheapest feed, perhaps Nanex is the low-budget choice?

So is Nanex just plugging into this SIP thingy? Or are they actually aggregating data from various sources?

Regarding the market-->Nanex-->Quantopian-->IB-->market system, I'm wondering if Nanex is doing all of the heavy lifting, and the ingester is located there? It would make sense, so that then the data stream to Quantopian (hosted by Amazon, I think I heard) would be more manageable, and less expensive.

Grant,
You asked: "Similarly for the historical data used by Quantopian--why so expensive and inaccessible?"

Market data and data for use in creating investment strategies are valuable. It's valuable because people can make a lot of money using it. The markets and vendors charge a lot of money for data because organizations will pay it; it's what the market will bear. This is over and above the cost of the necessary infrastructure.

This model has been successful for companies -- they found a product (data) that has a market (institutions, funds, banks, etc). And so they scale up their organizations to execute on that model. They invest in sales people and their marketing and optimize their engineering for sales to big customers.

Quantopian community members represent a change to that model. The individual Quantopian user is a foreign concept to these data organizations and these organizations are not good at selling to individuals. The smart data vendors realize that this community collectively represents an opportunity for growth. And it's up to us at Quantopian to help them adapt and make the necessary changes. It won't happen overnight. Because so many of you have gathered in one place, I'm optimistic you'll start to see expanded access to data no individuals have ever had before.

Well, I get that there is a status quo, but Morningstar (for example) has just re-packaged public domain information for you, right? I figure it is all required to be reported out to the public, and Morningstar has just made it compatible with your system (or maybe you do all of the work in this area?). Fundamentally, it doesn't sound that hard. So why would it be expensive if there should be lots of competition? Or am I missing something?

By the way, last I heard, Quantopian is a hedge fund, so the idea that you are getting data in the hands of individuals is kinda murky. You're just another hedge fund, except you are trying to get R&D on the cheap by crowd-sourcing (with the hope that your product will be better, too). In other words, you've out-sourced your R&D to the public. The fact that individuals can trade using Quantopian is kind of a carry-over from your abandoned original business plan (at least that's my take). If at some point it doesn't make business sense, you'll shut it down, I have to think.

A further point is that I have to think that the Morningstars of the world have to deal with the issue of data leakage/piracy when they sell to big institutions, and also with who within the organization can use the data per the license. For example, the organization may have 30,000 employees, but only 500 can pull data from a database accessible on the global company intranet. So, the organization has to have controls in place that Morningstar can audit. It seems that Quantopian is just an extreme case, where the distinction between a closed intranet and the open internet has been blurred. And there's also no risk of being fired as an abusive Quantopian user, as there is if an employee of an institution steals data for personal use or re-sale.

I would argue that there's no "selling to individuals" in the exchange here. It is more analogous to putting data out on a giant corporate intranet, with more risk of piracy (which Q has likely addressed by blocking bulk downloads of data and other monitoring/controls).

Hi Jonathan,

Cool. Sounds like there are lots of open-source contributions from Quantopian besides Zipline. I think this could be publicized quite a bit more. I'm a close follower of Quantopian news and that's the first time I've seen a writeup of all your open-source activity. In fact, googling for "Quantopian open source" only delivers hits to Zipline and Fawce's recent post.

And while open-source software is great, what really stands out about Quantopian is open access to information. The open meetups, the lecture series (!), the research platform and the online back-tester with a full real-world execution engine are nothing short of revolutionary in the fintech space. Yes, it's your business model, but then again, some time ago I used to spend quite a bit of time, money and effort on getting access to this kind of data and education, and what I ended up with was pretty mediocre. Quantopian now offers these things for free in good quality, with the added benefits of a polished interface and a growing community. I bet there are quite a few more challenges to overcome, such as the data leakage issue alluded to, but I'm rooting for Quantopian. I hope you'll be able to remain on the open-access path and make it really big, and then others will follow your path anyway.

Goldman Sachs did release their Java collections code as open source: https://github.com/goldmansachs/gs-collections.

Fawce, Jonathan,

One way to lead the open-source charge would be to get more of your algo and research platform notebook content (both user-contributed and Q-originated) up on github (or an equivalent) and integrated with your platform and this forum (or an improved forum). I know that this is on your roadmap--anybody working on it? When do you see a solution emerging and being implemented? Or do you have bigger fish to fry these days?

Grant

Reasons to open source software:

  • You are an altruist, a proletarian who thinks, like Richard Stallman, that all software should be free.

  • You are an egotist and are yearning for accolades because you think your code is exemplary and should be copied and fawned over.

  • You are a student who wants to share your code in order to get feedback and corrections.

  • Your software is no longer viable or useful so you decide to use it as a marketing ploy and announce to the world that you are releasing important intellectual property into the wild as a gesture of love and peace and goodwill.

  • You are no longer viable and rather than die and take potentially useful code with you, you gift it to the world (different from being a true altruist, as no self-sacrifice was required).

  • You are not capable of completing or expanding a viable code base and so expose the core, to date, in the hopes that others can see the potential and arrive in droves to help you build a more complete solution.

  • You want to intentionally leverage the millions of existing and developing programmers who have fewer and fewer opportunities to contribute to important or critical software projects, so you release enough supporting software in order to spur the creation of community and expand the spirit of contribution.

Take your pick.

Red Hat and other open-source companies like Typesafe basically took off because of early adoption from banks like Goldman and Morgan Stanley.

As others have pointed out, Quantopian is trying to build a profitable, sustainable business, which means that our decisions about what to open-source need to balance the benefit to the community against the cost to Quantopian. If we give away so much of our work that we enable competitors to leapfrog us and put us out of business.

That is a completely wrong idea. I think you guys are missing where the value in your business is.