Quantopian Live Integration - Frequency Of Data Available To Algorithms

Hello All,

(I'm hesitant to start a new thread because I probably haven't read all that I should.)

My understanding so far:

U1. Backtesting in Quantopian is based on streaming 'events' - let's say prices for simplicity - with each event being provided sequentially to the algo. In backtesting this avoids 'look ahead' bias.

U2. Backtest data comprises high quality 1 minute data also aggregated to daily. Any further aggregation or indeed interpolation would be done by the user.

U3. I believe the data is from Thomson Reuters. This has been stated by both Thomson Reuters and Quantopian.

U4. Broker integration will be offered initially via Interactive Brokers.

U5. The same streaming model will be used for backtesting and live trading.

My assumptions so far:

A1. Only 1 minute data will be streamed to live running algorithms.

A2. The reason for A1 above is that Quantopian are targeting algo writers working above the intraday level.

A3. Running live or backtesting algos that hold positions for several days through several months means that sub-minute data is felt to not be a requirement.

Regards,

Peter

22 responses

Peter (and all),

Here's my input:

U1. The events effectively stream, but they do not come at random times. The ticks are on the historical minute (and at midnight, I think, for daily ticks).
U2. Quantopian, to my knowledge, has not backed up their claim of high-quality data (this is not meant to imply that it is not excellent data for backtesting; they just haven't provided any empirical evidence, which one would assume their vendor could readily provide).
U3. This is the first I've heard that the Quantopian data are from Thomson Reuters. Dan Dunn recently commented that they do not yet have permission to reveal their source (see https://www.quantopian.com/posts/quality-of-quantopians-data). How do you know it's Thomson Reuters?
U5. I don't think that Quantopian has described, in detail, how streaming will work under live trading with Interactive Brokers, although it is implied since they talk about a seamless transition to going live with an algorithm that has been backtested on Quantopian. If the "seamless" part is true, then one will be able to use market data available during minute 0 to submit an order during minute 0, to then be filled during minute 1 (but I've never seen this spelled out explicitly by Quantopian for their live trading at Interactive Brokers).

Grant

Hello Grant,

WRT data: is my understanding wrong? I'm used to thinking of Forex tick data - are equities prices fundamentally different? In Forex there is a closing bar price - based on whatever bar you happen to be viewing, which could be 1 minute, but I would not call that a 'minute tick'. A tick is a new price, with many occurring each minute or second. Obviously Quantopian events stream by the minute, by design.

WRT data source: I thought Thomson Reuters were mentioned in a webinar the other day. Then I saw this article: http://pandodaily.com/2013/04/02/want-to-take-on-wall-street-quantopians-algorithmic-trading-platform-now-accepts-outside-data-sets/ so 0.1 + 0.1 == 2.....or maybe not!

Regards,

Peter

Peter,

I figure that the Quantopian data (provided by their vendor) are somehow aggregated from actual high-frequency market ticks. So, OHLCV values are computed for each minute (either on-the-fly or based on stored data).
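
A minimal sketch of that kind of aggregation, purely for illustration. The tick format `(timestamp, price, size)` and the function name are assumptions; nothing here reflects how Quantopian or its vendor actually builds its bars.

```python
from collections import OrderedDict
from datetime import datetime


def aggregate_to_minute_bars(ticks):
    """Aggregate (timestamp, price, size) ticks into one OHLCV bar per minute.

    `ticks` must already be sorted by timestamp. Bars are keyed by the start
    of the minute in which the trades occurred.
    """
    bars = OrderedDict()
    for ts, price, size in ticks:
        minute = ts.replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First trade of the minute sets open, high, low and close.
            bars[minute] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far in this minute
            bar["volume"] += size
    return bars


# Made-up ticks for one symbol.
ticks = [
    (datetime(2013, 4, 2, 9, 30, 1), 100.0, 200),
    (datetime(2013, 4, 2, 9, 30, 20), 100.5, 100),
    (datetime(2013, 4, 2, 9, 30, 59), 99.8, 300),
    (datetime(2013, 4, 2, 9, 31, 5), 99.9, 150),
]
bars = aggregate_to_minute_bars(ticks)
```

Note that an algorithm fed only these bars never sees the 100.5 print as it happens; it learns the high only once the 9:30 bar is complete.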

Grant

Hi Peter,

A3: I'm not really comfortable with this assumption, and it seems like I'm not the only one. Markets have the tendency to, every so often, move unexpectedly and violently in direction and speed. Even if your desired holding period is several days there may come a situation where you would want to exit your position immediately. A minute later might be too late to save your invested capital from taking a big hit. If Quantopians start trading leveraged instruments like futures, then they could potentially wipe out their account.

Backtests are great and all for getting an idea of how a trading strategy is "expected" to perform, but they cannot account for all "unexpected" events. The point is that it only takes one such adverse event to wipe out many weeks or months of good trades. Personally, I believe it's very risky to believe that something bad will never-ever happen, and that's why I don't feel comfortable with this assumption.

Best regards,
David

Hello David,

Agreed. It's not my assumption - it's my assumption about a (possible) Quantopian assumption. I should have worded it a little better.

Regards,

Peter

@David: I can definitely appreciate your concerns regarding one-minute granularity feeling too long to deal with unexpected events. Two things to consider:

(1) There are many large and very successful quant funds managing $20 billion+ who rebalance daily, weekly, or monthly and have no automated ability to react to sudden market events intraday. Not at sub-minute frequency, not at minute frequency, not even at hourly frequency. That may sound crazy, but it's probably not as big of a liability as you might suspect. Again, I'm talking about automated reaction; obviously in case of armageddon, fund managers can always manually intervene, just as you would presumably be able to do at any point with a live Quantopian strategy, with or without the one-minute granularity restriction.

(2) Your concern is related to sudden, unexpected, violent market events. It's tempting to think about these events like real-life emergency situations where the alarm bell rings and alerts you that you have 20 seconds to get to safety before the reactor blows. In this kind of situation, you clearly don't want to be asleep for the next 60 seconds. But in a sharp single-stock or market-wide price change, it's less obvious what to do next. As an example, your strategy suddenly wakes up on an alert event that a position you're long has dropped 10% in the last 6 seconds. What next? Liquidate? Suspend trading that name? Increase your position? Hunker down and see what happens next? Designing automated safeguards into your strategy to quickly and automatically react to extreme events is actually pretty difficult to do; backtesting it on sparse event data is even more difficult.

I'm not trying to be a curmudgeon here: you have every right in the world to say that you want the ability to code automated reactions to extreme events in real-time. I'm just trying to point out that if you lacked such a capability, you'd be in very good company among professional quant and algo traders. For a typical lower-frequency strategy, I really wouldn't consider waking up every minute to be "flying blind" or taking a crazy risk.

There is a whole separate discussion topic on how to build appropriate safeguards into algo trading platforms. Typical safety features would often include a combination of "big red buttons" that allow human beings monitoring algos to suspend trading / cancel trading when bad things happen, and low-level controls that prevent algorithms from going off the rails: limits on shares/orders/dollars/messages per second, capital limits, risk limits, limits on the number of open orders/positions, etc. These are the types of safeguards that you'd be crazy to omit.

@Tomas We do not have $20 billion to manage. Many of us will manage a few thousand of our own money, with the possibility of taking in other money from other Quantopians and Quantopian's sponsors. Whatever frequency we rebalance at, we need to know what is happening in the market for the sake of:

• Systematic stop loss
• Implementation shortfall
• Utilise exchange rebates to manage transaction cost

If we do not do this, sometimes successful backtest strategies will make losses in real life. You cannot expect us to trade blindly, with the algo only able to monitor the market for a flash every minute when the trade bar comes in.

The other issue is that all bars are aligned to actual minutes. This will create crowding in similar strategies, which will create a burst of trades. If Quantopian gets enough traction this will give an opportunity to game the trades from Quantopian for a profit. The traders will lose from TC. We need the ability to create bars offset from the actual minute boundaries.

Big fund managers get their management fees regardless of the petty losses. If we trade on our money any loss is very real. So we need to have a good handle on Transaction Cost management.

I just quickly read through http://en.wikipedia.org/wiki/Direct_market_access. I'm way down on the learning curve, so perhaps someone can explain if Quantopian/Interactive Brokers will be offering direct market access, or if there will be a middle man?

I do not think IB does DMA, at least for retail clients.

@suminda: first, in response to comments like "you cannot expect us to trade blindly..." etc, I just want to reiterate that I'm not a Quantopian representative, just an ordinary guy like you. I'm not questioning why people might want real time data, I get all that. I'm simply pointing out that there are many categories of quant investment strategies where minute-level data is totally sufficient.

I don't think it's accurate to say that big fund managers don't need to care about transaction costs. Between career risk and the pressure from clients, boards, and compliance officers to deliver best execution, I'd argue that these guys actually have a lot more reason to care. Moreover they have the resources to pay for data and professional-grade cost management solutions. So if they have good reason to care a lot about costs and best execution and the resources to pay for whatever they need, and they determine that this feature is overkill .... and yet you are adamant that this feature is a must-have for a free platform, then it feels like there is a disconnect.

My post was in response to David, who was talking about the need to code automated response logic to sudden price moves. You cited a number of other reasons why you need tick data, including implementation shortfall and rebate capture. Could you be more specific about why minute bars

[sorry, hit send too soon...]

...would be a requirement to help you with these things for a lower-frequency strategy? And the question about minute granularity and gaming is very interesting -- why don't you start a new thread since it's really a separate topic.

If you are to recapture part of the TC using rebates you need to provide or take liquidity. This means you have to quote to better the bid/ask. Without knowing the bid/ask you cannot do this. In certain exchanges the bid and ask cannot cross without filling a limit order.

You can utilise this for TC / IS only if you have the bid/ask. I am adamant because I can make use of this if it is provided, and the impact of not having it in real trading is large. GETCO backs the site, if I am not mistaken. They will not get backing if there is no viable business plan. I suspect that they will invest in successful strategies and also open up the ability to allocate money to successful strategies. If there is a commercial angle it makes sense to provide the best.

I've seen folks refer to the "order book" several times here on Quantopian. Seems like that's where the action is...will it be available, at least in a "read-only format," to Quantopian algorithms? Also, does anyone have links that would explain how all of this works? I had the naive notion that retail electronic markets were direct, in the sense that participants would be buying/selling more-or-less directly with one another (sort of peer-to-peer). Seems that's not the case...there are intermediaries.

Thanks,

Grant

Grant, there are I believe 11 different venues who publish streaming quotes for US equities. Each maintains an order book, representing all of the limit orders currently resting on each venue at various price levels. Some of the liquidity is hidden, meaning that limit orders have been designated with some or all shares "invisible" to all but the market clearing mechanism at that venue. The rest of the liquidity is considered "lit" and that is made available via quote feeds. So potentially you could be subscribing to each individual feed directly and then building their order books yourself. But each message you receive from each venue is merely an update that is only meaningful if you have been listening to all previous updates and faithfully maintaining the state of each book after each update. If you know the individual order books then you can also construct a consolidated order book that combines all venues into a master list. This is very valuable info.

The hard part is managing all of the message traffic. Right now, if your strategy tracks the top 100 US stocks then Zipline has to be able to process 100 bar messages per minute. In order to have real time access to the consolidated order book in these 100 securities, message traffic might increase to 10 million messages per minute or even more. This puts an incredible tax on your backtester from a computational perspective. Live data is also very expensive and a feed like this can easily set you back $3000 per month or more.
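
The bookkeeping described above (apply every update per venue, then merge venues into one consolidated book) can be sketched roughly as follows. The message format, venue names, and class are hypothetical; real exchange feeds use their own, more involved, protocols.

```python
from collections import defaultdict


class VenueBook:
    """Displayed (lit) size per price level for one venue, built from updates."""

    def __init__(self):
        self.bids = {}  # price -> displayed size
        self.asks = {}  # price -> displayed size

    def apply(self, side, price, size):
        """Set the displayed size at a price level; size 0 removes the level."""
        levels = self.bids if side == "bid" else self.asks
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size


def consolidate(books):
    """Sum size at each price across venues; return (bids, asks), best first."""
    bids, asks = defaultdict(int), defaultdict(int)
    for book in books.values():
        for price, size in book.bids.items():
            bids[price] += size
        for price, size in book.asks.items():
            asks[price] += size
    return sorted(bids.items(), reverse=True), sorted(asks.items())


books = {"VENUE_A": VenueBook(), "VENUE_B": VenueBook()}
# Assumed update format: (venue, side, price, new size at that level).
updates = [
    ("VENUE_A", "bid", 10.00, 300),
    ("VENUE_A", "ask", 10.02, 200),
    ("VENUE_B", "bid", 10.00, 100),
    ("VENUE_B", "bid", 10.01, 500),
    ("VENUE_B", "ask", 10.02, 400),
    ("VENUE_B", "bid", 10.01, 0),  # level cancelled or fully traded
]
for venue, side, price, size in updates:
    books[venue].apply(side, price, size)

bids, asks = consolidate(books)
best_bid, best_ask = bids[0], asks[0]
```

Dropping the last update would leave a stale 10.01 bid in the consolidated book, which is why each message only makes sense if every earlier one has been applied.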

Probably from a cost management perspective you would want to stick with minute bars and rely on your execution strategy (presumably a broker algo offered by IB) to manage your access to the market rather than trying to trade directly. (Note that you can't actually trade directly without being a B/D but you can get something very close to direct access, using a broker's pipes.)

Basically if you want real time access to the order book, be prepared to pay a lot and have your backtests slow to a crawl. But for certain kinds of algos you have no choice but to pay up.

Thanks Tomas,

So, when an online broker like IB provides a data feed for retail algorithmic trading, how is it derived? Would they typically report OHLCV for the prior minute? What about latency? I'm trying to gauge the relevance of the Quantopian backtester to a real-world trading scenario. Is it a good model? Where are the holes?

I realize that on a minutely basis, this may just be an academic exercise, since you've pointed out that intraday, minute-level trading by individuals may be a losing proposition anyway (due to costs and capital requirements), but I figure it might still provide some insight.

By the way, how's Quantopian gonna make any money if folks are holding onto their positions for days/weeks/months?

Grant

Back tester seems pretty solid for many quant strategies. We need the ability to bridge the gap in implementing the strategy for real as TC can eat your profits. This has a TC model also.

In the real world we have to worry about overfills / underfills etc.

Waiting to see what tooling Quantopian provides for this.

@Suminda, just keep in mind that there is an order to the Quantopian feature delivery. When we deliver live trading as a feature (oh, how I'm looking forward to that day!), it will be trading on minute bar data. There are a lot of interesting ideas in this thread. We'll continue to listen closely to what our members want and prioritize accordingly.

I think Tom's point about what is and isn't possible in a fast-moving market is right on. In some ways, this thread is discussing the long-running tension between algorithmic trading and human trading. On one extreme is the trader who sets up their algo and lets it run for months without interference. On the other is the trader who manually approves every trade before it happens.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

I am not happy with the 1 minute for live trading. At the moment it is OK for back testing. In the future you can add this to simulate IS / TC management strategies.

Initially let the backtesting be 1 minute and live trading be tick with order book access. It is good to have this in back testing but not needed to start with, but for live trading please provide this.

Suminda, I am guessing that you would be comfortable as long as your quantopian algo could leverage a good broker execution algo to handle your trades. I think the things you mention like capturing rebates are details you should be able to leave to the broker's execution algo or smart router. You certainly could write your own execution algos (which I did at my recent fund out of necessity since we were running a very cost-sensitive intraday strategy), but for a low-freq strategy I think this would be overkill. Just use a good broker execution algo and let it manage all the low-level details. This frees you to focus on developing your alpha. And if you should find that your low frequency alpha is only profitable under optimistic assumptions about rebates and transaction costs then personally I would focus on improving your alpha, not micromanaging execution.

Let's assume a strategy's return is proportional to market volatility v, and that the market follows a Wiener process. (The strategy's volatility is a * v, where a is a constant; this is for ease of mathematics.)

Let r be the expected return in a given time period, which is proportional to the strategy's volatility, which in turn is proportional to market volatility: r = b * a * v

Let c be the transaction cost.

In order to increase our profits we try to trade at a higher frequency by subdividing the period into n smaller windows.

Now the TC is n * c

Over each smaller window the strategy's volatility is a * v / sqrt(n), so the return per window is b * a * v / sqrt(n), and over n windows the total return is b * a * v / sqrt(n) * n = b * a * v * sqrt(n)

Expected gain by increasing trading frequency,

b * a * v * sqrt(n) - n * c = sqrt(n) * (b * a * v - sqrt(n) * c)

Every time you try to increase the strategy's alpha by increasing the frequency by m, you have a motivation to reduce TC such that sqrt(n + m) * (b * a * v - sqrt(n + m) * c) > sqrt(n) * (b * a * v - sqrt(n) * c)

So when you have an alpha-generating strategy there is a motivation to balance it at a higher frequency, with a view to milking it, if you can reduce TC in appropriate proportion; otherwise a non-market-making style strategy will have diminishing returns beyond a certain frequency. If a strategy has proven alpha, the "milking" approach makes more sense, as reducing TC is less complicated initially, until this too experiences diminishing returns.
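
Taking the gain expression above at face value, its maximum over n can be worked out directly: setting the derivative of b * a * v * sqrt(n) - n * c to zero gives sqrt(n) = b * a * v / (2 * c). The sketch below just evaluates that; `edge` is shorthand for the product b * a * v, and the numbers are made up rather than estimated from any market.

```python
import math


def net_gain(n, edge, cost):
    """Net gain from the model above: edge * sqrt(n) - n * cost.

    `edge` stands for b * a * v and `cost` for the per-trade cost c.
    """
    return edge * math.sqrt(n) - n * cost


def best_frequency(edge, cost):
    """Continuous maximiser of net_gain: n* = (edge / (2 * cost)) ** 2."""
    return (edge / (2.0 * cost)) ** 2


edge, cost = 1.0, 0.05  # illustrative values only
n_star = best_frequency(edge, cost)

# Gain rises up to n_star and falls away beyond it.
for n in (1, 25, 100, 225, 400):
    print(n, round(net_gain(n, edge, cost), 4))

# Halving the transaction cost quadruples the frequency worth trading at.
n_star_cheaper = best_frequency(edge, cost / 2.0)
```

Under these assumptions the gain at the optimum is (b * a * v)^2 / (4 * c), so in this model lower TC both raises the best frequency and the gain available there.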

A strategy's knowledge of how it is trading can be used to get better efficiency than broker strategies, which are generic.

Also a lot can happen in 1 minute (like a flash crash). If a strategy has built-in circuit breakers then we need it to continuously look at the market to know what happens.
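
A toy sketch of that last point, under loudly stated assumptions: the `CircuitBreaker` class, the 5% threshold, and the price path are all hypothetical, and this is not a Quantopian or broker feature. It only compares when the same breaker trips if it sees every price versus only the minute boundaries.

```python
class CircuitBreaker:
    """Trips once price falls more than `max_drop` (a fraction) below a reference."""

    def __init__(self, reference_price, max_drop):
        self.reference_price = reference_price
        self.max_drop = max_drop
        self.tripped_at = None  # time (seconds) of the observation that tripped it

    def observe(self, t, price):
        """Feed one observation; return True once the breaker has tripped."""
        drop = 1.0 - price / self.reference_price
        if self.tripped_at is None and drop > self.max_drop:
            self.tripped_at = t
        return self.tripped_at is not None


# Hypothetical path: (seconds since the last minute boundary, price).
path = [(0, 100.0), (6, 97.0), (12, 93.0), (30, 89.0), (60, 88.0)]

# Breaker fed with every price change.
tick_breaker = CircuitBreaker(reference_price=100.0, max_drop=0.05)
for t, price in path:
    tick_breaker.observe(t, price)

# Breaker fed only at minute boundaries, as a minute-bar algo would be.
bar_breaker = CircuitBreaker(reference_price=100.0, max_drop=0.05)
for t, price in path:
    if t % 60 == 0:
        bar_breaker.observe(t, price)
```

On this made-up path the tick-fed breaker trips at 12 seconds with the price at 93, while the bar-fed one trips at 60 seconds with the price at 88.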