ARMA Timing Out & R2Py

Hello,

I've recently been playing around with statsmodels a lot. Of particular interest are ARMA models. I was quite eager to test them out here, except I've run into two problems:

1) ARMA computations are quite expensive, especially if you are trying multiple models. Even relatively basic ARMA computations appear to time out (or just run indefinitely) on the Quantopian platform. Any chance that the timeout limit could be raised?

2) This is more a feature request than anything, but it would be very nice if rpy2 or something similar could be added as a module to the platform: http://rpy.sourceforge.net/rpy2/doc-2.3/html/high-level.html

This would effectively allow us to use R, rather than Python exclusively. I'm not certain what the requirements for Quantopian would be on the back end, or if this is even possible.

Thoughts?

Thanks
Alex

15 responses

Hi,

This would be interesting to have, but I'm not sure what the performance penalty would be. As of now, running an algo in Quantopian with many time series transformations is extremely slow.

Suminda

A secondary source of relative performance: http://julialang.org/.

Hi Alex,

Yes, ARMA is quite slow and causes a timeout. I ran into this problem myself. Certainly we can increase the time limit for computations -- I'll let you know. One related trick is to increase the refresh_period of your batch_transform to recompute only every couple of days.
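To illustrate the idea behind a longer refresh period, here is a minimal sketch in plain Python. It does not use the Quantopian or Zipline API: PeriodicRefit and slow_model are hypothetical names made up for this example, and slow_model is only a cheap stand-in for an expensive ARMA fit. The point is just that the costly step runs once every refresh_period bars and the cached result is reused in between.

```python
class PeriodicRefit:
    """Cache an expensive computation and redo it only every `refresh_period` calls.

    Hypothetical helper for illustration; not part of Quantopian or Zipline.
    """

    def __init__(self, fit_func, refresh_period=5):
        self.fit_func = fit_func
        self.refresh_period = refresh_period
        self.calls_since_fit = None  # None means nothing has been fitted yet
        self.cached = None
        self.fit_count = 0

    def update(self, window):
        # Refit on the first call, then again once refresh_period calls have passed.
        if self.calls_since_fit is None or self.calls_since_fit >= self.refresh_period:
            self.cached = self.fit_func(window)
            self.calls_since_fit = 0
            self.fit_count += 1
        self.calls_since_fit += 1
        return self.cached


def slow_model(window):
    # Stand-in for an expensive model fit (e.g. ARMA); here just the window mean.
    return sum(window) / len(window)


refit = PeriodicRefit(slow_model, refresh_period=5)
prices = [100.0 + i for i in range(20)]

# One call per bar, each with the data seen so far.
results = [refit.update(prices[: i + 1]) for i in range(len(prices))]
```

With 20 bars and a refresh period of 5, the stand-in model is fitted on bars 0, 5, 10 and 15 only; every other bar returns the cached value. The trade-off is that the result can be up to refresh_period bars stale.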

rpy2 is an interesting project. Integrating it would be quite straightforward, so I think this is a very real possibility. One problem is that it might be a security nightmare, but we'll take a closer look at that.

@Suminda: The batch_transform will get a 100x speed boost pretty soon, so you can look forward to that. There are some other bottlenecks, but we have already made a lot of progress and are starting to hit the limits of what's possible in some areas. Julia is an interesting language but still very experimental, with almost none of the modules that make Python so powerful.

There are many ways to speed up Python (Cython, Numba, etc.), but those mostly work if you have pretty tight loops. Zipline's bottlenecks are in other areas, however, so we can't just drop them in.
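For anyone wondering what a "tight loop" means here, the following is a rough sketch of the kind of code those tools handle well: a small numeric loop over an array. The ema function is a made-up example, not anything from Zipline, and the fallback decorator is only there so the snippet still runs when Numba is not installed (an assumption about your environment).

```python
import numpy as np

try:
    from numba import njit  # compiles the decorated function to machine code
except ImportError:
    def njit(func):
        # Fallback for illustration: run the plain Python function unchanged.
        return func


@njit
def ema(values, alpha):
    """Exponential moving average computed with an explicit element-wise loop."""
    out = np.empty(values.shape[0])
    out[0] = values[0]
    for i in range(1, values.shape[0]):
        out[i] = alpha * values[i] + (1.0 - alpha) * out[i - 1]
    return out


smoothed = ema(np.array([1.0, 2.0, 3.0, 4.0]), 0.5)
```

A loop like this is dominated by arithmetic on array elements, which is where compilation helps. Code that spends its time in object creation, dictionary lookups or event dispatch generally does not benefit in the same way.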

Thomas

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Is it possible to provide further information on the following:

What would the batch transform look like if a batch transform function is called with complex data structures containing maps, lists and objects? More colour on how the passed data structures are transformed would be helpful.

Could you use LLVM to run the subset of Python that is used, perhaps through a modified PyPy interpreter applying different optimisations as needed? Easier said than done.

If you want to pass non-scalar items to the batch_transform I would do that using an argument or keyword argument of the batch_transform, e.g.:

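@batch_transform
def bt(data, color='red'):
    # color arrives here as a regular keyword argument
    ...

...
# pass the extra argument along when calling the transform
bt.handle_data(data, color='blue')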


Could you give some more elaborate examples in the docs, as this is a major feature? I still do not have a 100% understanding of this.

Hello Suminda,

Some more examples are definitely required.

Hello Thomas,

How about a 'technical' webinar on batch transform? Maybe show a couple of example algos where it is used correctly and a couple where it is not required? Let us have access to the algos in advance so we can be armed with questions!

Regards,

Peter

Great idea!

Perhaps also upload the past webinars so we can watch them later (for people in other time zones), and also update the documentation.

I like that idea. I've been thinking about other webinars to put on, and that's a good topic.

Glad this sparked a conversation!

Thomas, thanks for the tip regarding the refresh_period! I can definitely see rpy2 being a security issue, but if it were possible, that would be great! Let us know when you have a chance to look at it.

Also try to give it a performance boost when added.

Indeed, a webinar on that is a great idea. This might also coincide nicely with the batch_transform speed-up that was merged into Zipline yesterday (it should be on Quantopian soon) and the talib transforms.

Hello Thomas,

That's good news about the talib transforms. In case anyone wants to investigate the original library by Mario Fortier and others, it is here: http://ta-lib.org/index.html and the Python wrapper by John Benediktsson is here: https://github.com/mrjbq7/ta-lib

I was looking at TA-Lib options for Excel today, for which there is a commercial package available, but I found this, which I haven't seen before: https://github.com/stoni/ta-lib and which looks interesting. I've not got further than running the https://github.com/stoni/ta-lib/blob/master/excel/example.xls example. This works on 32-bit Excel 2010 but not 64-bit. I tried to build for 64-bit with VS2012 Ultimate, but it defeated me.

Regards,

Peter

Hi Peter,

Thanks for the references. We are in fact using the Python wrapper by John (wrapped again to turn the functions into batch_transforms). I'm really happy with our talib interface -- it'll make it very easy to use every function that's available (which is quite a few) with a single line.

Thomas