
What is the relative priority and timing of adding plotting and universe selection (> 10 sids accessible to algorithm)? Just wanted to make sure that they did not fall off the radar screen, since I think users will see a lot of benefit.

I realize that enabling paper/live trading with a broker with real money is a push now, but there are some basic features that are still lacking.

@Grant,

Thanks for asking! I agree that universe selection is "essential basic functionality". Universe selection has been in the works for a long time, and shortly we will be asking for volunteers to alpha test the new functionality.

Plotting is still on the radar, but not in active development. One idea is to add the ability to overlay new time-series on the line chart that shows algorithm vs. benchmark performance. While this is not completely ad hoc plotting, we think it might cover a large number of useful cases: looking at spreads over time, plotting a signal, etc.

As always, I'd love to hear your thoughts and ideas.

thanks,
fawce

I ran my first algo with a universe selection last night on staging ;). It's not ready yet, but we hit a major development milestone last night.

Thanks Fawce & Dan,

Thumbs up on the universe selection! Looking forward to giving it a try.

Regarding plotting, ideally, you could just let users write out files and download the data, to be plotted on their PCs. I realize that this presents a licensing problem for you. Would it be feasible for you to generate the plots for users (in a graphics file format), to be downloaded or displayed in another browser tab? Overlaying new time-series, as you describe above, would be a start, but my sense is that users will want to generate all sorts of visualizations of the data. Frankly, it seems kinda backwards to be launching a tool for doing data analysis and developing algorithms without full plotting capability as a high priority. Is there a particular technical challenge? As I understand it, this can be done in Python through matplotlib, right?

I'd be interested in other users' opinions. Am I off base here?

@Grant: Thanks for the input. I agree that plotting is a critical feature. I thought about this issue a little more but still don't feel like I have the perfect solution. Essentially, there are two ways:

  1. As fawce described, allowing you to add time-series plots in the existing backend
    Pros:
    -well integrated
    -covers quite some use cases
    -history directly accessible
    Cons:
    -not very general; many ways of plotting do not fit into standard time-series line plots.

  2. As you described, giving you access to matplotlib.
    Pros:
    -extremely general and feature rich right off the bat
    Cons:
    -can't easily look at previous plots (unless the user handles that case), since by default each new plot overwrites the last one.
    -not easy to integrate into our current backtesting UI.

One idea would be to add functionality similar to batch_transform that gets called periodically (say, weekly, but user-defined). Instead of computing a transform, the function would create a matplotlib plot. We would save this figure to a file (with a timestamp) and display it in some way in the backend (maybe in a new tab, as you suggested; it could also be saved). Critically, you would be able to look at plots from a previous point in time.
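To make the idea concrete, here is a minimal sketch of what such a periodic plotting hook might look like. It is not Quantopian code: PeriodicPlotter, handle, and plot_signal are hypothetical names, and the weekly default and the filename pattern are assumptions chosen for illustration. Only the matplotlib calls are real library API.

```python
import os
from datetime import datetime, timedelta


class PeriodicPlotter:
    """Hypothetical helper: collects values and calls a user-supplied
    plotting function once per interval, one timestamped file per call."""

    def __init__(self, plot_func, out_dir, every=timedelta(days=7)):
        self.plot_func = plot_func  # user-defined: (times, values, path) -> None
        self.out_dir = out_dir
        self.every = every          # assumed default: weekly
        self.times = []
        self.values = []
        self.last_plot = None
        self.saved = []             # paths of every plot written so far

    def handle(self, dt, value):
        """Feed one observation; returns the new file path if a plot was made."""
        self.times.append(dt)
        self.values.append(value)
        if self.last_plot is None:
            self.last_plot = dt     # start the clock at the first observation
            return None
        if dt - self.last_plot < self.every:
            return None
        self.last_plot = dt
        # The timestamp in the filename keeps earlier plots from being overwritten.
        name = "plot_%s.png" % dt.strftime("%Y%m%dT%H%M%S")
        path = os.path.join(self.out_dir, name)
        self.plot_func(self.times, self.values, path)
        self.saved.append(path)
        return path


def plot_signal(times, values, path):
    """Example user-defined plot function (requires matplotlib)."""
    # Imported here so the class above can be used without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")           # render to files, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(times, values)
    ax.set_title("Signal up to %s" % times[-1])
    fig.autofmt_xdate()
    fig.savefig(path)
    plt.close(fig)
```

A backtest loop would call plotter.handle(dt, value) on every bar with plotter = PeriodicPlotter(plot_signal, "plots"). Because each figure gets its own timestamped file, the list in plotter.saved can be browsed later to see what the plot looked like at any earlier point.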

I'm not sure, am I overemphasizing the point that previous plots would get lost? To my mind, that's really important.

Interested to hear your thoughts!

It is worth noting that these two approaches are not mutually exclusive. So maybe the solution is to implement them both.

Thomas and others,

A third approach would be to leverage the present logging functionality so that users could download the data to their PCs and then plot it locally. The file could be in a compact zip or binary format with a file size limit. You could also enforce a significant delay between downloads by an individual user. This would preclude efficient scraping of data from your site (which is a valid legal concern, as I understand it).
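A rough sketch of how those two limits could fit together, using only the Python standard library. Everything here is hypothetical (LogExporter, its parameters, and the default limits are made up for illustration), and gzip simply stands in for "a compact zip or binary format".

```python
import gzip
import time


class LogExporter:
    """Hypothetical sketch: compressed log downloads with a size cap
    and a minimum delay between downloads per user."""

    def __init__(self, max_bytes=1_000_000, min_interval_s=3600.0,
                 clock=time.monotonic):
        self.max_bytes = max_bytes            # assumed cap on compressed size
        self.min_interval_s = min_interval_s  # assumed delay between downloads
        self.clock = clock                    # injectable so it can be tested
        self._last = {}                       # user_id -> time of last download

    def export(self, user_id, log_lines):
        """Return the gzip-compressed log, or raise if a limit is hit."""
        now = self.clock()
        last = self._last.get(user_id)
        if last is not None and now - last < self.min_interval_s:
            wait = self.min_interval_s - (now - last)
            raise PermissionError("next download allowed in %.0f s" % wait)

        payload = gzip.compress("\n".join(log_lines).encode("utf-8"))
        if len(payload) > self.max_bytes:
            raise ValueError("compressed log exceeds the size limit")

        # Only successful downloads count against the user's delay.
        self._last[user_id] = now
        return payload
```

On the user's side, gzip.decompress(payload).decode("utf-8") recovers the log text, which can then be parsed and plotted locally with whatever tool they like.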

As a note along these lines, I ran a 2-week minutely backtest and tinkered around with copy-and-paste from the log output. I found that with one copy-and-paste operation, I could capture 600 lines of log output (not 512 as you specify on your help page). It appears, however, that with some awkward scrolling, I can actually access the entire 2-week log data. So...naturally I wondered how hard it would be to write a script to automatically download the entire log output dataset for a backtest. Eventually, someone will figure out how to do this (and perhaps post the solution), so you might as well beat them to the punch and add a "Download Log Output" button (I'll post a feature request).

All,

When you have your meetups, I encourage you to get feedback on plotting, in all three contexts: discovery, backtesting, and paper/live trading. Visualization is really important (note my use of both bold and italics for emphasis). Particularly for discovery, I advise that you re-think your whole GUI approach. To me, for discovery, something more of a remote desktop would be vastly superior to your browser-based backtester. I am not saying that the backtester should be replaced...it has its role. One way to look at it is that the big players presumably have high-end workstations with flexible "desktop" analytical tools and full access to data. So, if you can figure out how to emulate that scenario (and offer it freely), you'll have a nice tool for discovery (and full-fledged plotting will come along for the ride).

@Grant: I completely agree and it is something we are actively planning and working on currently. I understand that it must be frustrating to use a platform with some critical features missing. It is a priority for us to add those features and I'm really excited about the shape that is taking so stay tuned!

Thomas,

No problem...you guys are doing a fine job...keep up the good work, and let me know if there are ways I can contribute. I plan to keep tinkering on the site, as time allows.

Hi Thomas & all,

A few more thoughts...potentially even better than a remote desktop would be some sort of efficient API that would give users flexible access to the data. If you could do it in a fashion that would be independent of operating system/platform and programming language, that would be best. Part of the API could be to submit jobs to the backtester and have results returned. Supposedly, one of the fundamental economic/legal constraints is that you license the tick data and can't afford to provide it to 7 billion people. I'm still kinda perplexed why tick data is expensive and restricted. The fact that you can now offer daily updates suggests that generating it is probably fully automated and in the end, it should be a dirt-cheap commodity.
