Statistical arbitrage using Gaussian Copula [deleted post]

My first crack at this. Unfortunately Python has very poor support for Copula models (unlike R).

7 responses

Hey there,

That looks very interesting, can you explain the gist of the strategy, or link to the paper it's based on (if there's one)?
I've had a go at copula type trading before, without much success unfortunately.

Hey Pravin,
Sharing is caring, thanks :-)

And your explanation helps a lot too.

Regarding improvements, better men than I have proposed ditching the copula altogether:

Another widely used method for multivariate dependence analysis in the financial sector is the use of copulas [7]. However, copula-based methods also suffer from major limitations in practice; for example, computation of copula functions involves calculating multiple moments as well as integration of joint distributions, which require use of numerical methods and hence become computationally complex [8]. Copula-based methods suffer from other major limitations as well, namely, the difficulties in accurate estimation of the copula functions, the empirical choice of the type of copulas, and problems in the design and use of time-dependent copulas

and also PCA, favoring ICA instead:

We hypothesise that the observed multivariate financial data may hence be generated as a result of linear combination of some hidden (latent) variables [15, 16]. This process can be quantitatively described by using a linear generative model, such as principal component analysis (PCA), factor analysis (FA), or ICA. As financial returns have non-Gaussian distributions with heavy tails, therefore PCA and FA are not suitable for modelling multivariate financial data, as both these second-order approaches are based on the assumption of Gaussianity [17]. ICA, in contrast, takes into account non-Gaussian nature of the data being analysed by making use of higher-order statistics. ICA has proven applicability for multivariate financial data analysis; some interesting applications are presented in [15, 16, 18]. These, and other similar studies, make use of ICA primarily to extract the underlying latent source signals. However, all relevant information about the source mixing process is contained in the ICA unmixing matrix, which hence encodes dependencies. Therefore, in our analysis we only make use of the ICA unmixing matrix (without extracting the independent components) to measure information coupling.

This paper describes the steps to compute N non-Gaussian factors and gives a formula for the conditional probability of a return given the unmixing matrix. So up to paragraph four it is relevant to your methodology, I think (after that they generalize the result).

Since FastICA implementations exist in the Python community, as does singular value decomposition, it would not be impossible to implement in Q, though mastering all the concepts of the paper would be quite overwhelming. Anyway, I still need to improve my algebra and signal processing skills, so there is still a lot of work to get there.
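As a rough starting point for the FastICA step only, here is a minimal sketch, assuming scikit-learn is available (I have not checked whether it can be imported in Q). The two-asset "returns" are synthetic Laplace draws pushed through a made-up mixing matrix, and all variable names are hypothetical. It only shows how to get at the unmixing matrix; it does not implement the paper's information coupling measure.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Synthetic stand-in for two return series: heavy-tailed (Laplace) latent
# sources combined by a made-up mixing matrix. Purely illustrative data.
rng = np.random.default_rng(0)
n_obs = 2000
sources = rng.laplace(size=(n_obs, 2))
mixing = np.array([[1.0, 0.5],
                   [0.3, 1.0]])
returns = sources @ mixing.T

# Fit FastICA. components_ is the linear operator that maps centered data
# to the estimated sources, i.e. the unmixing matrix (including whitening).
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
recovered = ica.fit_transform(returns)
unmixing = ica.components_

# Applying the unmixing matrix by hand reproduces the recovered sources,
# so the matrix alone carries the dependency structure.
reconstructed = (returns - ica.mean_) @ unmixing.T

print(unmixing)
print(np.allclose(recovered, reconstructed))
```

Keep in mind that ICA recovers the sources only up to sign, scale and ordering, so the rows of the unmixing matrix are not directly comparable to the matrix used to generate the data.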

I just wanted to share this document with the community. Tell me what you think.

@Lionel, Thanks for the reference paper. Looks interesting.

I wouldn't discount copula methods yet. There has been recent interest in using copula methods for pair trading with some success.
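For readers who have not seen the idea, here is a minimal sketch of the conditional-probability signal that Gaussian copula pair-trading write-ups usually rely on. It is not the code from the original post. It uses only the Python standard library, the two return series are synthetic, and every function name is hypothetical. Each series is rank-transformed to normal scores, the copula correlation is estimated from those scores, and the conditional probability of one leg given the other is read off; values near 0 or 1 would suggest relative mispricing.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def to_normal_scores(values):
    """Rank-transform values to (0, 1), then map them to N(0, 1) scores."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the pseudo-observation strictly inside (0, 1)
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def conditional_prob(z_x, z_y, rho):
    """P(U_y <= u_y | U_x = u_x) under a Gaussian copula, in normal scores."""
    return STD_NORMAL.cdf((z_y - rho * z_x) / math.sqrt(1.0 - rho ** 2))


# Synthetic, positively dependent "returns" for two hypothetical assets.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(500)]
y = [0.8 * xi + 0.6 * random.gauss(0.0, 1.0) for xi in x]

z_x, z_y = to_normal_scores(x), to_normal_scores(y)
rho = pearson(z_x, z_y)

# Conditional probability for the most recent observation of y given x.
p_y_given_x = conditional_prob(z_x[-1], z_y[-1], rho)
print(rho, p_y_given_x)
```

Where to put the entry and exit thresholds on that probability is a separate choice that this sketch leaves open.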

Hi Aqua -

So what's the story? Did you decide the code has some value, and no longer want it in the public domain?

Yes, I decided that there is not much collaboration or exchange of ideas anyway, and it feels like a one-way track.

O.K. Thanks for the feedback. --Grant

@Aqua thanks for your contributions to this community.
Please don't quit posting. I have learned from your posts but I don't always have time to write.