Introduction to the Quantopian Risk Model in Research

Earlier today we announced the release of the Quantopian Risk Model.

In this notebook, we give an introduction to the Quantopian Risk Model and introduce new APIs for interacting with the risk model on the Quantopian Research Platform.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

25 responses

This is the algorithm that is analyzed in the section on performance attribution. It's an updated version of the Optimize API announcement post algorithm that uses the new QTradableStocksUS (announced in this post) and our newest APIs for accessing fundamentals.

[Backtest attached. Backtest ID: 5a0317326279aa458c825cad]

Good stuff. Performance attribution is definitely a good addition.

I have one question/request for clarification on a few of the style factors. It seems that the size and value style factors are defined opposite to the typical way they are used in the academic literature. More specifically:

  • Size factor Definitions

    • Academics: Small minus Big
    • Quantopian: Big minus Small
  • Value Factor Definitions

    • Academics: Cheap Minus Expensive
    • Quantopian: Expensive Minus Cheap

I'm sure this wasn't a trivial decision on Quantopian's part. I just want to make sure I understand the definitions, particularly with regard to the value factor. My thinking is, "If I have positive exposure to the value factor, I have positive exposure to cheap stocks." However, it doesn't look like this is correct.

The other issue that may confuse users is that the rolling_fama_french plots in Pyfolio appear to use the traditional academic definitions (small minus big, and high book to market (cheap) minus low book to market (expensive)). On the other hand, these plots are labeled specifically, so that may clear up any confusion on the matter.

Any clarification is appreciated.

Also, there is a minor typo in the box plot function in case anyone wants to reuse that function. If I'm not mistaken, the data parameter passed into sns.boxplot() should read exposures.dropna()

import seaborn as sns

def visualize_exposures_distribution(exposures):
    # Drop rows with missing exposures before plotting.
    ax = sns.boxplot(data=exposures.dropna(), orient='h')
    ax.set_title('Distribution of Daily Factor Exposures')
    ax.set_xlabel('Daily Exposure')
    ax.set_ylabel('Factor')
    return ax

@Michael Matthews Thanks for your questions. Under the Fama-French framework from the early 1990s, the size and value factors are defined as the relative performance of small-cap stocks minus big-cap stocks and of cheap stocks minus expensive stocks. It is true that academic papers often define them this way, but it does not have to be this way in practice. For example, if you define the size factor as the relative performance of big-cap stocks minus small-cap stocks, the factor exposures and the corresponding factor returns simply change sign. The exposure-weighted factor returns (how much gain or loss comes from the size factor) are the same. For example, suppose the SMB factor return on one day is 0.1 and AAPL's factor exposure is -0.25; its exposure-weighted return is 0.1 * (-0.25) = -0.025. If we use BMS instead, the BMS factor return would be -0.1, AAPL's factor exposure would be 0.25, and its exposure-weighted return is still -0.1 * 0.25 = -0.025.

The way we define the style factors is closer to commercial risk models. To be specific:
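The sign-flip argument can be checked with a few lines of arithmetic. This is only a sketch using the numbers from the example; the variable names are made up for illustration and are not part of any Quantopian API.

```python
# Numbers from the example: SMB = small minus big, BMS = big minus small.
smb_factor_return = 0.1
aapl_smb_exposure = -0.25

# Flipping the factor definition flips the sign of both the factor return
# and the exposure.
bms_factor_return = -smb_factor_return
aapl_bms_exposure = -aapl_smb_exposure

# Exposure-weighted return under each convention.
smb_contribution = smb_factor_return * aapl_smb_exposure
bms_contribution = bms_factor_return * aapl_bms_exposure

print(smb_contribution, bms_contribution)
```

Both conventions attribute the same gain or loss to the size factor, so the choice only affects how the exposure itself is read.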

  • The size factor definition in the Quantopian risk model is Big minus Small (measured in the log of market capitalization)
  • The value factor definition in the Quantopian risk model is cheap minus expensive (measured in the ratio between stockholder's equity and market cap)

We will update our post to more explicitly explain style factors soon.

As far as interpreting exposures, the way you explain them sounds perfect to me. If a portfolio has positive exposure to the Quantopian value factor, its return is positively influenced by the performance of “cheap companies (high-value companies)” in the market. If a portfolio has positive exposure to the Quantopian size factor, its return is positively influenced by the performance of “big-cap companies” in the market.

Since this risk model is not based on the Fama-French three-factor model, it could give you more, and possibly different, insights than the rolling_fama_french plots in Pyfolio. I encourage you to use both to learn more about your algo.

@Rene Zhang, thanks for the clarification.

Hi Scott/Rene -

I think I'm following this:

each asset's factor loadings are calculated by running a multiple linear regression of the asset's returns against the factor returns.

However, please clarify the actual model. Generally, I recall that OLS multiple regression is applicable to a general multivariate polynomial. For example, one could use multiple regression to find the coefficients of:

y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2

From a mathematical standpoint, even though the model is nonlinear in the inputs, it is linear in the coefficients, so they can be found using linear algebra alone (no iterative solver is needed).

Presumably, the model you are using is of the simplest form:

y = b0 + b1*x1 + b2*x2

Or are you including factor interactions, up to first order? For example:

y = b0 + b1*x1 + b2*x2 + b12*x1*x2

Generally, I'd think you'd want to account for potential factor interactions, unless there is a reason to think that they would be negligible a priori.
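The point about fitting polynomial and interaction terms with purely linear methods can be sketched as follows. This is an illustration on synthetic data, assuming NumPy is available; the "true" coefficients are hypothetical and it says nothing about which form the Quantopian model actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Hypothetical "true" coefficients, chosen only for this illustration.
# Order: b0, b1, b2, b12, b11, b22
true_b = np.array([0.5, 1.0, -2.0, 0.3, 0.1, -0.4])

# Design matrix: each column is a (possibly nonlinear) function of x1 and x2.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2, x1**2, x2**2])
y = X @ true_b + 0.01 * rng.normal(size=n)

# Ordinary least squares: a single linear solve, no iterative optimizer.
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b_hat, 2))
```

Dropping the last three columns of X gives the simplest form, and keeping only the x1 * x2 column gives the first-order interaction model.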

It would be nice if US$ (UUP), long-term bonds (TLT & LQD), and gold (GLD) were added too.

Hi Scott -

Will the risk model factors be made available as Pipeline factors? I'd be curious how they look in Alphalens.

@Grant

They added the feature as experimental: https://www.quantopian.com/help#risk-model-experimental

@Grant -

Like Luca said, the risk model factors are now available via pipeline. We have a function called risk_loading_pipeline which is an easy way to get access to all the risk model factors. Alternatively, there are some examples on https://www.quantopian.com/posts/quantopian-risk-model-in-algorithms of how to get access to a single factor (along with how to use the risk data in an algorithm).

Thanks Luca & Abhi -

Just curious - has the underlying code for any of this been published on Github yet? For example, I'm curious if the short_term_reversal is actually simply the 14-day RSI. One thing I'd like to see is if the risk factors make sense at all as alpha factors. I guess I'm still confused regarding style risk factors that aren't really decent alpha factors anyway. If they are just noise, then that would say any random factor that is just noise would be a risk.

@Grant, I don't know if the code is publicly available, but it could be that Q buys the data for those risk factors and so there might be no code at all (the same as Morningstar data).

Each risk factor is available as standalone pipeline factor, so you can run your analysis on each one if you like.

Like you, I am puzzled about risk factors that don't show any alpha. What's the point of having them in the analysis?

@ Luca -

I hadn't thought of Q buying the data for risk factors. I'm also curious how the risk model would be used to avoid over-weighting of legitimate factors in the fund (e.g. grants_awesome_algo). I guess some proxy would be developed for grants_awesome_algo that could be added to the risk model? Or maybe the risk model is only intended to regulate exposure at the individual algo level, and bears no relation to the composition of the fund (e.g. only regulating well-known "common" risk factors)? I also wonder if the risk model is purely linear or if risk factor interactions are included? Lots of unanswered questions. I guess I should list them in a stand-alone post and see if I can get them answered.

What is the distinction between a risk model loading and a built-in factor? Are they essentially the same beast? For example, can I run ShortTermReversal as a factor and compare it to the built-in factor RSI? How would I run Alphalens on ShortTermReversal as a factor?

How would I run Alphalens on ShortTermReversal as a factor?

Here you can see there are Pipeline built-in factors for each risk factor (e.g. ShortTermReversal, Momentum, Size, etc.), so you can run them in Alphalens as you would any other Pipeline factor. This would answer the question "Is a specific risk factor an indicator of future returns?" It would be interesting to know whether there is a correlation between exposure to a certain risk factor and future returns. Let us know if you discover something :)

@ Luca -

Thanks, but I'm still confused. The help page makes a distinction between Pipeline Built-In Factors and Risk Model Loadings. In the discussion of the Risk Model, the term 'risk factors' was being used, not loadings, so naturally I thought new Pipeline built-in factors would be released, but we have loadings. Are factors and loadings two names for exactly the same thing (i.e. a loading is just another name for a factor)? But then why don't the loadings have the (*args, **kwargs) on them? I guess they don't take any arguments? But then wouldn't they still have an empty ( )?

I guess I'm still confused. If Pipeline Built-In Factors and Risk Model Loadings are exactly the same, then why not introduce a set of additional Built-In Factors to be used in the Risk Model? Also, adding Sector Loadings would also seem to be redundant, since we already have Fundamentals to classify by industry sector. Are the Risk Model Sector loadings just a wrapper for classification by sector using Fundamentals? If so, why release the loadings?

By the way, I recall that it was you (Luca) who first showed how to run pipeline in chunks. I'd be interested in running Alphalens on the risk factors back to 2002. Did the pipeline chunking make it into the Quantopian API? Or would I need custom code?

@Grant -

The pipeline chunking mode is officially available. Have a look at this post.

I believe a factor loading expresses how much an asset is "exposed" to a risk (via a regression, I guess). So when you use the pipeline built-ins ShortTermReversal, Momentum, Size, etc., you get the exposure of each asset to each risk factor (the factor loading). We don't have access to the risk factor computation itself (as I said earlier, it is possible that the data is not even computed in pipeline but comes from an external source), but we can know how much we are exposed to those risks. In any case, it is possible to access the risk factor returns via the experimental API get_factor_returns and run some computations on them. We still don't know how those risk factors are computed, but we can access their values.
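To make the idea of a loading as a regression coefficient concrete, here is a minimal sketch that regresses one asset's daily returns on a set of factor returns. Everything here is synthetic and assumed for illustration (NumPy only, made-up factor returns and "true" loadings); it is not Quantopian's implementation, which is not shown in this thread.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days, n_factors = 250, 3

# Synthetic daily factor returns (columns are hypothetical factors).
factor_returns = rng.normal(scale=0.01, size=(n_days, n_factors))

# Hypothetical "true" loadings of one asset, used only to build test data.
true_loadings = np.array([0.8, -0.3, 0.5])
asset_returns = (factor_returns @ true_loadings
                 + rng.normal(scale=0.002, size=n_days))

# Regress the asset's returns on the factor returns (with an intercept).
X = np.column_stack([np.ones(n_days), factor_returns])
coef, *_ = np.linalg.lstsq(X, asset_returns, rcond=None)
intercept, est_loadings = coef[0], coef[1:]

print(np.round(est_loadings, 2))
```

The estimated slopes play the role of the loadings: how much the asset's return moves per unit of each factor's return.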

Thanks Luca -

For the style risk factors, it would be interesting to see them run through Alphalens over a long period. I guess you are saying that since the individual risk factor returns are available via get_factor_returns they can somehow be analyzed using Alphalens, without actually having access to the risk factors themselves? The question in my mind is whether the style risk factors are legitimate in the first place, or if they are just noise. If they look like gobbledygook in Alphalens, then they aren't really factors anyway, so attributing risk to them doesn't make sense (if I'm thinking about this correctly).

Grant, is this the performance you are interested in?

Thanks Luca -

Maybe. I'll take a look.

Grant, I don't believe it makes sense to use Alphalens to evaluate the performance of the risk factor return streams. Maybe you want to run the risk factor returns through pyfolio? That would work, and it would tell you if the risk factor is legitimate or just noise (even though the cumulative returns plots in my previous NB should be enough to answer the question).

If you used the risk factors available in Pipeline, you would get as output the risk loading for each asset in the universe, and if you ran that through Alphalens you would have the answer to the question: "Can I use the asset risk factor exposure as an alpha signal?" That wouldn't tell you whether the risk factor itself is legitimate or just noise; it would tell you whether you can use that information to generate alpha.
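One simple way to ask "can the loading be used as an alpha signal?" is a rank correlation between loadings and forward returns on a given day, which is the kind of information coefficient Alphalens reports. The sketch below is a stand-in on synthetic data, assuming NumPy only; the helper names are hypothetical and it does not call Alphalens itself.

```python
import numpy as np

def rank(values):
    """Return 0-based ranks (no tie handling; fine for continuous data)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def information_coefficient(loadings, forward_returns):
    """Spearman-style rank correlation between loadings and forward returns."""
    return np.corrcoef(rank(loadings), rank(forward_returns))[0, 1]

rng = np.random.default_rng(2)
n_assets = 1000
loadings = rng.normal(size=n_assets)        # synthetic risk loadings

# Forward returns that partly depend on the loading, and ones that do not.
predictive = 0.2 * loadings + rng.normal(size=n_assets)
unrelated = rng.normal(size=n_assets)

ic_predictive = information_coefficient(loadings, predictive)
ic_unrelated = information_coefficient(loadings, unrelated)
print(ic_predictive, ic_unrelated)
```

A loading whose correlation with forward returns stays near zero over many days may still be a perfectly valid risk factor; it just isn't a useful alpha signal.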

I guess the sector risk factor returns are long-only, but the style risk factor returns are long-short (presumably the underlying alpha factor is demeaned, with equal long-short)? Also, I guess this is for all stocks, and would need to be reduced to only the QTradableStocksUS.
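The guess about demeaning can be sketched as follows: subtracting the cross-sectional mean from a factor and normalizing gives weights with equal long and short sides. This assumes NumPy and made-up factor scores; whether the risk model's style portfolios are actually built this way is not confirmed anywhere in this thread.

```python
import numpy as np

def demeaned_weights(factor_values):
    """Turn raw factor values into dollar-neutral weights with gross leverage 1."""
    demeaned = factor_values - factor_values.mean()
    return demeaned / np.abs(demeaned).sum()

factor_values = np.array([2.0, 1.0, 0.5, -0.5, -1.0])  # made-up factor scores
weights = demeaned_weights(factor_values)

long_side = weights[weights > 0].sum()
short_side = weights[weights < 0].sum()
print(weights, long_side, short_side)
```

Because the weights sum to zero, the long and short sides are equal in size, which is what "equal long-short" means here.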

This may be a naive understanding of the risk model but, based on a review of the white paper, are the "factor loadings" that you provide for style estimated based on the regression of the epsilon as estimated by sector exposure against z-scored style_k returns ("sub model (1b)" in the white paper)? Or is it security returns against style returns? Or does it matter/produce the same result for trying to estimate alpha unattributed to those risks? Please clarify, thanks.

@dyaz, In the regression for the style factor, I believe the parameters that are being estimated/solved for are the style factor returns themselves. In other words, they regress the sector residuals (dependent variable) against the standardized style metrics (independent variables). For example, for firm size, you would calculate the cross-sectional z-score of the log(Market Cap) for each day. This would be your independent variable for size, which would be used to calculate the coefficient in the regression, that represents the return for the size style factor.
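The cross-sectional step described above can be sketched for a single day and a single style factor (size). This is my reading of the description, not Quantopian's code: the data is synthetic, the "true" factor return is hypothetical, and only one style factor is included for brevity.

```python
import numpy as np

rng = np.random.default_rng(3)
n_stocks = 2000

# Synthetic market caps for one day (log-normal, purely illustrative).
market_cap = np.exp(rng.normal(loc=22.0, scale=1.5, size=n_stocks))

# Cross-sectional z-score of log market cap: the size exposure on this day.
log_cap = np.log(market_cap)
size_z = (log_cap - log_cap.mean()) / log_cap.std()

# Hypothetical size factor return for the day, used to build sector residuals.
true_size_return = 0.004
sector_residuals = (true_size_return * size_z
                    + rng.normal(scale=0.02, size=n_stocks))

# Regress residuals (dependent) on the z-scores (independent); the slope is
# the estimated size factor return for this day.
X = np.column_stack([np.ones(n_stocks), size_z])
coef, *_ = np.linalg.lstsq(X, sector_residuals, rcond=None)
est_size_return = coef[1]
print(est_size_return)
```

Repeating this for every day would give a time series of estimated factor returns, one coefficient per day.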

Note that the above applies to stocks in the estimation universe. For the complementary stocks, the opposite is true: it is the style factor exposures (loadings) that are being estimated, given the factor returns. (See the Style Factor Calculation subheading in the Methodology section, where it talks about the difference between the estimation and complementary universes.)

This is only my interpretation, so I would look to see if others confirm this.

I'm a little confused after reviewing the notebook posted above and then the white paper (especially with the Fama-French model being mentioned in this thread).

(1) Are the style factor returns (as provided by Quantopian in research) for each day estimated from a cross-sectional regression of the sector epsilon of the securities in the universe against the z-scored style factors? Or are they something like an average return of the factor, as in a Fama-French model?

(2) Are the style factor loadings (as provided by Quantopian) for each security estimated from a regression of the sector epsilon against the average portfolio of returns for each style factor (as one would in the Fama-French model)? Or are they the day-t z-scores for the estimation universe? Or are they a time-series regression of the sector epsilon for each security against the estimated factor returns (as discussed in question 1 and similar to the process outlined for "complementary stocks" in the white paper)?

Thanks for the help/clarification!