Possible backtester bug with reverse splits?

Dear all,

In looking over the performance of one of my algorithms in contests 18 and 19, I noticed some wild swings caused by two positions (short and long) that showed unusual percent gains. Looking at these more carefully, both underwent reverse splits around the same time, and it seems the backtester does not account for this correctly. The two positions in question are EEQ and GNK.

I'm not sure if this has been brought up yet. See the attached backtest and algorithm that show this. Please let me know if I'm making a mistake in my understanding.

Cheers,
Richard

[Attached backtest, ID 57ad1c8644ecd41008b52d3d]
16 responses

Bump. Sorry, but this is pretty bothersome; it makes me wonder about the validity of the backtester and the live trading system.

I believe that reverse splits are being treated as splits with volume, and hence the results are tremendously off.

Interested also.

Regular forward splits can do this too; this is BBD on 4/18/2016. It really makes me lose confidence in their system.

[Attached backtest, ID 57ae013b1f27010ffede6cbd]

Another broken reverse split...

[Attached backtest, ID 57ae0bb87d65071028fdbc7e]

FU doesn't seem to have any price movement in the short time that it traded.

[Attached backtest, ID 57ae117e7d65071028fdbd3d]

PPD also has a broken split. It went through a simultaneous 100:1 forward/reverse split, but Quantopian's data only seems to have one of them recorded.

[Attached backtest, ID 57ae11d68129b50ffcfe1891]

There are other types of data problems too. MX, for instance, has its EBIT updated in the Morningstar data a few days after the earnings were actually announced. This can throw off any backtest that looks at EBIT when making decisions.

One final broken chart for today... SXCL seems to have gone in and out of bankruptcy, but is shown as tradable throughout, right up to 2016.

Update: I forgot to mention that this backtest took a very long time to run. Execution slowed down tremendously when SXCL flatlined.

[Attached backtest, ID 57ae10ff1d0b171004cd3100]

@Sunil: This happens when you call data.current to get the price of an asset that hasn't traded in a long time. Quantopian seems to be making a database query for the price at the end of the previous minute bar; if not found, the bar before it; and so on. They could maintain a database with prices forward-filled as needed. (There's no need to waste storage on forward-fills of prices nobody has requested, and this forward-filled data could be deleted when nobody has needed it for some time.) Or you could request the current price only when you know that the asset has traded, e.g. by making sure the current bar's volume was nonzero.
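The volume check suggested above might look roughly like the sketch below. It assumes the `data` object accepts `data.current(asset, "volume")` and `data.current(asset, "price")`; `StubVolumeData` and `prices_of_traded_assets` are hypothetical names made up for this illustration so that it runs outside the platform, and the bar values are invented.

```python
class StubVolumeData:
    """Hypothetical stand-in for the backtester's `data` object."""

    def __init__(self, bars):
        # bars: {asset: {"price": float, "volume": int}} for the current minute
        self._bars = bars
        self.price_lookups = 0  # counts how often a price was requested

    def current(self, asset, field):
        if field == "price":
            self.price_lookups += 1
        return self._bars[asset][field]


def prices_of_traded_assets(data, assets):
    """Return {asset: price}, asking for a price only if the bar had volume."""
    prices = {}
    for asset in assets:
        # Skip assets with no trades this bar, so no price lookup is made.
        if data.current(asset, "volume") > 0:
            prices[asset] = data.current(asset, "price")
    return prices


data = StubVolumeData({
    "AAA": {"price": 10.0, "volume": 500},
    "SXCL": {"price": 0.01, "volume": 0},
})
traded_prices = prices_of_traded_assets(data, ["AAA", "SXCL"])
```

Whether this actually avoids the slowdown depends on how the volume lookup itself is served, which the thread does not establish.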

I am hoping someone from Q can comment on these issues, as they basically put my trust in strategy testing on hold.

The concern I have is that this is something that should have been correct since the first release of the product. The data is what makes this approach work. There have been numerous complaints (in the forum), and their response has been to fix them one at a time as they come in. This approach is not going to work. It looks like Q does not have a solution, since if they did, it would be fixed by now.

What other alternatives are there? Is the data from EODDATA clean?

My take on this data error business is that Q needs to consider seriously a database that would be public-facing, and ideally accessible from within algos. If there are errors, they should be transparent, period. If I were working as a quant within a hedge fund, and my co-workers refused to publish a list of errors in the data, it would be intolerable and unproductive. I've mentioned this concept several times, including in the context of the under-development https://www.quantopian.com/posts/the-tradeable500us-is-almost-here. It continues to be a mystery why it is not addressed in a transparent and accessible fashion. Really, kinda weird. There will be errors and they should be straightforward to avoid, if the information is made available. End of rant.

Well, not quite:

In the absence of a database, here are some that I know about:

https://www.quantopian.com/posts/missing-split-nke
https://www.quantopian.com/posts/missing-split-adjustments-for-lbty-b-and-lbty-k (actually turned out to be LBTY_A & LBTY_B)

Presumably, they have not yet been fixed, otherwise there would have been postings to the forum threads (rather than changes made to a database...sigh).

Hey guys, thanks for reporting all of these data issues. We have a fix in progress for the mishandled splits/reverse splits with each of the following:

EEQ, GNK, HRI, BBD, PPD, BWNG, NKE, LBTY_A, LBTY_B

Richard, to clarify, the issue was the data itself (as opposed to the handling of the data). Of course, bad data has the same symptom: the big jumps in your performance plot that aren't indicative of what actually happened. To fix problems like these, we manually patch the data (usually done in batches).
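To make the symptom concrete, here is a minimal arithmetic sketch of how a missing split record can show up as a jump in position value. The function name, the ratio convention (new shares per old share), and the numbers are all assumptions for illustration, not a description of how the backtester handles splits internally.

```python
def position_value_after_split(shares, price_before, split_ratio, split_recorded):
    """Value of a position right after a split.

    split_ratio is new shares per old share (assumed convention):
    0.1 for a 1-for-10 reverse split, 2.0 for a 2-for-1 forward split.
    """
    # The market price always reflects the split.
    price_after = price_before / split_ratio
    # The share count is only adjusted if the split is in the data.
    shares_after = shares * split_ratio if split_recorded else shares
    return shares_after * price_after


# 1,000 shares at $2.00 going through a 1-for-10 reverse split.
correct = position_value_after_split(1000, 2.0, 0.1, split_recorded=True)
broken = position_value_after_split(1000, 2.0, 0.1, split_recorded=False)
```

With the split recorded, the position value is unchanged; without it, the same position appears to be worth ten times as much overnight.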

Sunil, fundamental data issues are a little trickier to fix, but we will look into the issue with MX. The issue with SXCL is a little different. We only store a start_date and an end_date for each security, so there's no way to specify when a security goes in and out of trading. It's a limitation of the design that we're aware of, but don't yet have a good workaround for. If you use data.is_stale() to determine whether a price value was forward-filled, you can avoid trading a security like this, as it will have stale (forward-filled) data when it's not trading on a major exchange.
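A rough sketch of the data.is_stale() filter described above. `StubStaleData` and `drop_stale` are hypothetical names used only so the example runs on its own; inside an algorithm the real `data` object would be passed in instead of the stub.

```python
class StubStaleData:
    """Hypothetical stand-in for `data`, exposing only is_stale()."""

    def __init__(self, stale_assets):
        self._stale = set(stale_assets)

    def is_stale(self, asset):
        # True when the latest price is forward-filled rather than traded.
        return asset in self._stale


def drop_stale(data, assets):
    """Keep only assets whose latest price is not forward-filled."""
    return [asset for asset in assets if not data.is_stale(asset)]


data = StubStaleData(stale_assets=["SXCL"])
candidates = drop_stale(data, ["AAA", "SXCL", "BBB"])
```

Orders would then be placed only for the names left in `candidates`.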

These data issues affect us just as they do you. We're trading our own money on the same infrastructure and the same data. We also evaluate strategies for allocations on the same data. We share the interest of having accurate data and we really appreciate it when the community points out errors. We are looking into solutions to catch these data issues earlier.

Sorry for the trouble. I'm expecting the fixes to the bad splits I listed above to be up in the next week or so.

Hi Jamie,

Thanks for your reply. Rather than addressing splits case by case, have you all considered having at least a flagging algorithm that signals that a possible split has occurred? I could imagine a monitoring algorithm that notices drastic percent changes in price and suggests these to someone to check up on. This seems like a more robust method than waiting for the Quantopian community to stumble upon splits and report them.
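As a minimal sketch of the kind of flagging I have in mind (the function name and the 1.5x threshold are arbitrary assumptions, and the closes are made-up numbers), one could scan daily closes for day-over-day ratios that are too large to be ordinary moves:

```python
def flag_possible_splits(closes, max_ratio=1.5):
    """Return (index, ratio) pairs where the close-to-close ratio is extreme.

    A ratio above max_ratio or below 1 / max_ratio is flagged for a human
    to check against a corporate-actions source.
    """
    flagged = []
    for i in range(1, len(closes)):
        prev, cur = closes[i - 1], closes[i]
        if prev <= 0 or cur <= 0:
            continue  # skip missing or invalid prices
        ratio = cur / prev
        if ratio > max_ratio or ratio < 1.0 / max_ratio:
            flagged.append((i, ratio))
    return flagged


closes = [10.0, 10.2, 5.1, 5.0, 50.0]
suspects = flag_possible_splits(closes)
```

This only raises candidates. Real crashes and spikes would be flagged too, so someone would still need to confirm each one.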

Thanks Jamie,

Any thoughts on how to handle various errors going forward? At least in my little pea brain, it seems like a natural fit for a public database, along with tools within the research platform & algos to access the database. Listing the offending vendor would be good, too--nothing like global, full transparency to improve quality control. It would be a service to everyone using the data, including the vendors (since presumably they want users to know about errors). Regarding Morningstar, how do they handle this? Have they given you access to a database of all of the unpatched errors that they have on the books (reported by both you and other customers)?

Also, at one point, someone at Quantopian commented that you do checks against independent sources. Could you describe what is done? Or what could be done better? It seems like missing splits would be fairly easy to catch, but maybe I'm missing something?

Hi Jamie,

The fundamental data problem is definitely a lot more challenging; bugs don't show up as obviously in the backtest as a price adjustment does. The only reason I spotted this is that I went digging into the backtest to see which position was losing the most money, and why.

At least for earnings-related data in the backtest, a simple sanity check would be to see whether all the earnings-related data for a given stock get updated on the same date. There's no reason, for instance, for the income to change without the EBIT changing, and vice versa, as far as I can tell. You should then be able to at least date-validate a large number of these factors using earnings calendars, which I imagine would be a lot easier to get than another source of fundamental data.
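Here is a rough sketch of that same-date check. The field names, the dates, and the input layout (a dict mapping each field to the set of dates on which its value changed) are assumptions for illustration, not the actual Morningstar schema.

```python
from datetime import date


def mismatched_update_dates(update_dates_by_field):
    """Return {date: [fields that did NOT update on that date]}.

    Dates on which every field updated together are left out.
    """
    all_dates = set().union(*update_dates_by_field.values())
    mismatches = {}
    for day in sorted(all_dates):
        missing = sorted(
            field
            for field, days in update_dates_by_field.items()
            if day not in days
        )
        if missing:
            mismatches[day] = missing
    return mismatches


# Made-up example: EBIT shows up three days after net income.
updates = {
    "net_income": {date(2016, 5, 2), date(2016, 8, 4)},
    "ebit": {date(2016, 5, 5), date(2016, 8, 4)},
}
suspicious = mismatched_update_dates(updates)
```

Any date that appears in the result could then be compared against an earnings calendar to see which field has the right timestamp.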

Grant, I imagine you have in mind a database with all the corrections over the vendor provided data? That is a great idea.

Sunil

I ran into the same issues as mentioned by others. I am trying to build strategies using at least 100 stocks, and it is virtually impossible for me to identify whether my model fails because of corrupt data. I have begun to question the thousands of hours I have spent on Quantopian. Who knows, I have probably thrown away some good models because of corrupt data. Can Q look at this with top priority? If the data is corrupt, then it's garbage in, garbage out.