Risk Model Example: Detecting High Short Term Reversal Risk

Quantopian's risk model allows you to disentangle alpha (specific returns) from risk (common returns). This is done by defining a set of market effects that are known risk factors, and then seeing how much of a model's or strategy's returns can be explained by those risk factors. Whatever is left over is return specific to that model/strategy, a.k.a. alpha. We'll be releasing a lecture on this soon. For more info now, see our risk model page: https://www.quantopian.com/risk-model

In this example we’ll show an algorithm that was built to have high exposure to a known market risk, short term reversal. Short term reversal is a specific form of mean reversion that bets on short term deviations in price reverting to the mean, and it uses price data exclusively. It is considered common risk due to its widespread knowledge and use, so it would not be considered real alpha. All of the performance, positive or negative, obtained from investing in this kind of mean reversion would be attributed to common risk.

Mean reversion more generally is just the notion of modeling a quantity such that bets can be placed on deviations reverting: you can do mean reversion on alternative data, mean reversion on sentiment, mean reversion on specific phenomena. The issue here is specifically the very simplistic, price-only form of mean reversion that short term reversal represents.
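As a rough illustration (a generic sketch of the idea, not the risk model's definition of the factor), a short term reversal style signal can be built by z-scoring the latest price against its trailing mean and taking the negative, so that a price far above its mean produces a short signal and vice versa:

```python
# Generic mean-reversion sketch (illustrative only). The window length
# and the plain z-score are arbitrary choices for this example.
from statistics import mean, pstdev

def reversion_signal(prices, window=20):
    """Negative z-score of the latest price vs. its trailing window."""
    trailing = prices[-window:]
    mu = mean(trailing)
    sigma = pstdev(trailing)
    if sigma == 0:
        return 0.0  # flat series: no deviation to bet on
    return -(prices[-1] - mu) / sigma

# A price series that has spiked upward recently produces a negative
# (short) signal, betting the price falls back toward its mean.
prices = [100.0] * 19 + [110.0]
print(reversion_signal(prices))  # negative => short
```

Because the signal uses price data alone, anyone can compute it, which is exactly why returns from it are treated as compensation for a common risk rather than alpha.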

Also, keep in mind that managed, intentional risk exposure can be okay, insofar as there is a clear explanation for it and it's additive to the strategy. If you just have random exposure to a factor, that's not good. If you have consistent exposure to a factor, that's also not good. If your exposure turns on and off over time depending on intelligent decisions, the portfolio is well diversified, and the on-off toggling actually nets you positive returns, that may be acceptable depending on context. A case we likely don't want is a timing strategy on a single risk factor, but an algorithm that takes on some risk that changes over time can be okay.

We’ll show a performance attribution breakdown for the strategy to give you a sense of how to use it on your own.

For more info, see our lecture on controlling risk exposure during portfolio optimization.
https://www.quantopian.com/lectures/risk-constrained-portfolio-optimization

Note: This post has been edited for clarity.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.


And here's the algorithm I used to run my example.

"""This algorithm is designed for validating risk model.
It can be configured to be:
* Momentum or Mean-reversion
* Sector-Neutral or not
"""

from quantopian.algorithm import attach_pipeline, pipeline_output, order_optimal_portfolio
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, SimpleMovingAverage, AverageDollarVolume, RollingLinearRegressionOfReturns, Returns
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.data import morningstar
from quantopian.pipeline.filters.morningstar import IsPrimaryShare
from quantopian.pipeline.classifiers.morningstar import Sector

import numpy as np
import pandas as pd

import quantopian.optimize as opt

STYLE = 'mean_reversion'
SECTOR_NEUTRAL = True

# Risk Exposures
if SECTOR_NEUTRAL:
MAX_SECTOR_EXPOSURE = 0.0010
else:
MAX_SECTOR_EXPOSURE = 1.0

MAX_BETA_EXPOSURE = 0.0020

# Constraint Parameters
MAX_GROSS_EXPOSURE = 1.0 # Can only be 1x levered
NUM_LONG_POSITIONS = 300
NUM_SHORT_POSITIONS = 300

MAX_SHORT_POSITION_SIZE = 2*1.0 / (NUM_LONG_POSITIONS + NUM_SHORT_POSITIONS)
MAX_LONG_POSITION_SIZE = 2*1.0 / (NUM_LONG_POSITIONS + NUM_SHORT_POSITIONS)

class MeanReversion1M(CustomFactor):
inputs = (Returns(window_length=21),)
window_length = 252

def compute(self, today, assets, out, monthly_rets):
np.divide(
-(monthly_rets[-1] - np.nanmean(monthly_rets, axis=0)),
np.nanstd(monthly_rets, axis=0),
out=out,
)

def make_pipeline():
if STYLE == 'momentum':
alpha_factor = Returns(window_length=252) - Returns(window_length=20)
elif STYLE == 'mean_reversion':
alpha_factor = MeanReversion1M()
# Classify all securities by sector so that we can enforce sector neutrality later
sector = Sector()

# Screen out non-desirable securities by defining our universe.
# Removes ADRs, OTCs, non-primary shares, LP, etc.
# Also sets a minimum $500MM market cap filter and$5 price filter
mkt_cap_filter = morningstar.valuation.market_cap.latest >= 500000000
price_filter = USEquityPricing.close.latest >= 5
universe = QTradableStocksUS() & price_filter & mkt_cap_filter

combined_rank = (
)

# Build Filters representing the top and bottom 150 stocks by our combined ranking system.
# We'll use these as our tradeable universe each day.
longs = combined_rank.top(NUM_LONG_POSITIONS)
shorts = combined_rank.bottom(NUM_SHORT_POSITIONS)

# The final output of our pipeline should only include
# the top/bottom 300 stocks by our criteria
long_short_screen = (longs | shorts)

beta = 0.66*RollingLinearRegressionOfReturns(
target=sid(8554),
returns_length=5,
regression_length=260,
).beta + 0.33*1.0

# Create pipeline
pipe = Pipeline(columns={
'longs': longs,
'shorts': shorts,
'combined_rank': combined_rank,
'alpha_factor': alpha_factor,
'sector': sector,
'market_beta': beta
},
screen = long_short_screen)
return pipe

def initialize(context):
# Here we set our slippage and commisions. Set slippage
# and commission to zero to evaulate the signal-generating
# ability of the algorithm independent of these additional
# costs.
set_slippage(slippage.VolumeShareSlippage(volume_limit=1, price_impact=0))
context.spy = sid(8554)

attach_pipeline(make_pipeline(), 'long_short_equity_template')

# Schedule my rebalance function
schedule_function(func=rebalance,
date_rule=date_rules.month_start(),
time_rule=time_rules.market_open(hours=0,minutes=30),
half_days=True)
# record my portfolio variables at the end of day
schedule_function(func=recording_statements,
date_rule=date_rules.every_day(),
time_rule=time_rules.market_close(),
half_days=True)

# Call pipeline_output to get the output
# Note: this is a dataframe where the index is the SIDs for all
# securities to pass my screen and the columns are the factors
# added to the pipeline object above
context.pipeline_data = pipeline_output('long_short_equity_template')

def recording_statements(context, data):
# Plot the number of positions over time.
record(num_positions=len(context.portfolio.positions))

# Called at the start of every month in order to rebalance
# the longs and shorts lists
def rebalance(context, data):
### Optimize API
pipeline_data = context.pipeline_data

### Extract from pipeline any specific risk factors you want
# to neutralize that you have already calculated
risk_factor_exposures = pd.DataFrame({
'market_beta': pipeline_data.market_beta.fillna(1.0)
})

objective = opt.MaximizeAlpha(pipeline_data.combined_rank)

### Define the list of constraints
constraints = []
# Constrain our maximum gross leverage
constraints.append(opt.MaxGrossExposure(MAX_GROSS_EXPOSURE))
# Require our algorithm to remain dollar neutral
constraints.append(opt.DollarNeutral())
# Add a sector neutrality constraint using the sector
# classifier that we included in pipeline
constraints.append(
opt.NetGroupExposure.with_equal_bounds(
labels=pipeline_data.sector,
min=-MAX_SECTOR_EXPOSURE,
max=MAX_SECTOR_EXPOSURE,
))
# Take the risk factors that you extracted above and
# list your desired max/min exposures to them -
# Here we selection +/- 0.01 to remain near 0.
neutralize_risk_factors = opt.FactorExposure(
min_exposures={'market_beta':-MAX_BETA_EXPOSURE},
max_exposures={'market_beta':MAX_BETA_EXPOSURE}
)
constraints.append(neutralize_risk_factors)

constraints.append(
opt.PositionConcentration.with_equal_bounds(
min=-MAX_SHORT_POSITION_SIZE,
max=MAX_LONG_POSITION_SIZE
))

order_optimal_portfolio(
objective=objective,
constraints=constraints
)

Great post! Can I just check: aren't the additional filters applied to QTradableStocksUS() redundant? For example, the price filter is built in:

Hi Delaney -

Well, now I'm really confused, but admittedly, I don't understand this new risk analysis thingy yet and the rationale for it. For example, you say:

Mean reversion bets on short term deviations in price reverting to the mean, and uses price data exclusively. It is considered common risk now due to its widespread knowledge and use...

How does this make it a risk? Is it a risk for the Quantopian fund, in terms of diversification? Or is there a risk that the effect will disappear (but if it is well-known and profitable, then wouldn't it be gone already)? Would it still be considered a risk, even if it were super-profitable?

Within the analysis tool, how are you attributing the source of returns to mean reversion? You must have a model for what it smells like, and then be able to sniff it out somehow.

I think this is just terminology. I’ve seen Andrew Ang use the same. Risk is anything you will be compensated for. Generally speaking you get either

• steady returns and occasional crashes, like momentum - negative skewed risk

• flat or negative mean returns but then huge spikes up, like buying VIX and other insurance - positive skewed risk

Hmm? So if returns due to mean reversion based on analysis of price data alone is a "risk" and well-compensated, I gather it is still something to be avoided (like cobras, broken glass, and the plague)? Sorry, still confused.

I guess it’s like butter in French food. Tasty, but there is such a thing as too much!

So it is more to do with diversification? Anyway, we'll see what Delaney & Co. have to say. It is very confusing; it seems like they are wanting every algo to be highly scalable and diversified, but I thought this would come from the cobbling together of a large number of algos from their 160,000 users...

@BurritoDan, @Grant, as this involves the metaphor of tasty French food to help us understand risk, perhaps we should pass it to @Max for clarification [intended as friendly joke] ;-))

@Grant, Q may have 160k users, but it would be interesting to know how many of them are actually generating viable algos. My guess would be probably somewhere between about 500 & 1k of those users at most. Anyone able to tell us? Whatever the answer, that's still potentially a lot of algos, and way more diversity of algo writers than most other institutions will ever have!

These are great questions, and get at the core of what "risk" means. It's an often thrown around term, but rarely defined clearly. I'll do my best here.

TL;DR: Some risk is fine, but investors have limits in their portfolios. Your algorithm is only valuable if an investor chooses it over alternative investments. As such, it's an arms race to see who can come up with alpha with the least risk exposure.

Investing is placing bets: a bet says that you will be compensated based on the behavior of some real-world process. If you place a bet on a sports game, that's taking risk exposure to the outcome of the sports game. Similarly in finance, if you place a bet on the market going up by buying the SPY ETF, you are taking on risk exposure to the market. If you place a bet on small companies doing well by longing small cap stocks and shorting big cap stocks, you are taking on exposure to SMB risk. The reason it's "risk" is that it is a factor you cannot control: if the market decides to go down, then you were incorrect in your prediction and you lose your bet. If large companies outperform small, you lose your bet, and so on.

You can think of risk as a dependency. Why is a risk model valuable? Because without one you are unaware of your dependencies. What if you were unknowingly taking on a large amount of volatility risk, and volatility suddenly switched to a different regime? A risk model lets you know where your returns are coming from. In general, risk is classified into a few categories: returns volatility, market and sector exposure, and style risk. Returns volatility is just the standard deviation of your returns, and our model estimates how much exposure you have to sector and style risks.

Now for alpha vs. risk. If you come to me with some algo that just places long market bets, why should I be interested in paying you fees for that algo? I can buy into long market exposure simply and cheaply. Even if you’re planning on trading it yourself, you’ll likely just pay extra transaction fees over buying an ETF. There's no additional work you're doing. If you come to me with an algo that produces returns which are not dependent on the risk factors in our model, that's pure alpha (in our model) and the goal. In real life you'll produce some alpha and some risk. The cheapest risk to purchase is market and sector risk; slightly more expensive is style factor risk like value. The most expensive type of risk is fast-moving factors like short term reversal, momentum, and volatility. People who currently lack high quality exposure to those risks may be willing to pay you for high quality risk exposure, but the absolute most valuable thing you can produce is returns which are not dependent on the known risks in our model.

It's by and large impossible to get returns without risk, but everybody tries as best they can to get a good returns/risk ratio (hello, Sharpe ratio). Two concepts should help in understanding how to deal with risk: portfolio-level risk budgets, and alternatives. Every investor maintains a portfolio, and every portfolio will have a variety of risk exposures. Investors want to minimize and diversify their risk. If you come to an investor with an algorithm that has some alpha, but also short term reversal and value risk exposures, the investor will consult their portfolio. They definitely want alpha, but perhaps they already have high exposure to short term reversal in their portfolio; on the other hand, they have low exposure to value, so they have some budget there. They would tell you that they currently cannot invest, but that if you can reduce your short term reversal exposure and maintain some of your alpha, they are interested. The second concept is alternatives: investors will attempt to maximize returns subject to a risk budget. If someone else comes to them with an algorithm with similar alpha and strictly lower risk exposures, they will pick that one every time. If someone comes to them with similar alpha and different but similar-magnitude (Pareto-efficient) risk exposures, then different investors may be interested depending on the existing risk exposures of their portfolios.

The reason it's not cut and dried is that alphas can only be contorted so much. If you have a great alpha it may come with some risk exposure, and totally getting rid of that risk exposure may ruin the alpha. Of course, you should try as much as possible to reduce your risk exposures, and fully understand why you cannot lower them beyond a certain point without eliminating alpha. Not being able to explain why you can’t reduce risk implies you don’t understand the risk. Hedging is an imperfect solution with estimation risk, trading and borrow costs, and can only go so far. In general adding more independent alphas will help diversify and even your exposures out, but at the end of the day it's a game of alternatives. Quantopian is looking for strategies which allow it to stay within its overall portfolio risk budget. Depending on what else has received an allocation, we may have different budgets for different risk exposures, and the best that you can do as a user is try to get them low across the board without diluting your alpha too much. We'll attempt to give as much clarity and feedback as possible about what we're looking for in the future. Remember, if you can produce alpha with little to no risk, someone will pay you for that.

I urge people to follow up in our lectures to understand more of the mechanics behind this. My understanding of risk was greatly helped by seeing how the actual systems are built.
https://www.quantopian.com/lectures/risk-constrained-portfolio-optimization
https://www.quantopian.com/lectures/factor-risk-exposure

This is good info, and I don't want to minimize the educational value of the above (the lectures are great, btw). However, let's not forget that the quants of the summer of 2007 (the so-called Quant Quake) also had lots of pretty formulas for risk, but they didn't really understand that all their "risk-neutral" long/short portfolios were very similar to each other. When they had to unwind some of those portfolios as part of the deleveraging they were forced to undertake due to "external" circumstances, they found that they were mostly trying to unwind the same positions, since they had all discovered the same factors and created similar portfolios. Cue the race for the exits, a spiraling deleveraging, and large losses on these supposedly risk-constrained portfolios.

This is the danger (risk?) of becoming too enamored with models and over-leveraging "low-risk" portfolios. They work great, until they don't. Quant models in general are still rather lacking in their ability to account for systemic risk and it would be great to see this issue get more focus from the Q folks.

That is an excellent point, our risk model just focuses on some major and well known sources of risk. There are many other sources of risk out there. In general it is very hard to know what the industry at large is doing and how cross-correlated you are with other folks. What's interesting is we have a birds-eye view of cross algo correlation at Quantopian, so we can adjust for within-platform systemic correlation risk. However, we can't know what other quant firms are doing.

One way to improve there is that allocators become more sophisticated and start doing more cross correlations between potential investments and their existing holdings. At the end of the day there aren't a ton of tools that exist to help with this that I know of, but it's a really interesting problem. At the end of the day folks have to be less hubristic and not over-lever.

Thanks Delaney -

Hmm? It almost sounds like if there is no explanation for the returns, it is best. Must be low risk if it is something new and unexplained. Unless customers know exactly what you are doing, it seems you could end up pulling the wool over their eyes, no? Are you expecting them to dig into your risk factor specifications and the code? Seems overly complicated.

How do you reduce mean reversion risk? Implement some momentum trades in the same algo?

If I understand correctly, the risk model ideally would detect both the short_term_reversal and momentum factors and compute the residual return after removing their effects. My naive understanding at this point: imagine constructing a multi-factor pipeline algo which combines the alpha factors as a simple linear weighted combination of N (orthogonal) alpha factors (e.g. alpha = w1*alpha1 + w2*alpha2 + ... + wN*alphaN). If alpha1 is short_term_reversal and alpha2 is momentum (defined exactly as they are in the risk model), then the risk model would find both and remove them, leaving the remaining alpha factors (not included in the risk model) to be sold to the customer as "pure alpha" (which seems naive, but what do I know).

I suspect that the risk model is purely linear (e.g. https://en.wikipedia.org/wiki/Linear_model), given that the analysis is described as a series of cascading regressions ("The risk model consists of a series of cascading linear regressions on each asset. In each step in the cascade, we calculate a regression, and pass the residual returns for each asset to the next step."). One question for Delaney is, does the risk model include interactions (e.g. https://en.wikipedia.org/wiki/Interaction_(statistics))?

In theory, I suppose one could exactly cancel one factor with another. For example, if in alpha = w1*alpha1 + w2*alpha2 + ... + wN*alphaN, w1*alpha1 = - w2*alpha2, then effectively by adding momentum one could cancel short_term_reversal. Then, the risk model would not detect any effects due to short_term_reversal. However, this would say that momentum and short_term_reversal are not orthogonal, so I'm confused on this point...
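To make the mechanics concrete, here is a toy sketch of a cascading regression as I understand the description: regress returns on the first factor, keep the residuals, regress those on the next factor, and whatever survives the cascade is the "specific" return. This is purely my own illustration in plain Python, not Quantopian's actual risk model:

```python
# Toy cascading linear-regression attribution (illustrative only).
# At each step we regress the current residual returns on one factor's
# returns and pass the new residuals to the next step.

def beta_and_residual(returns, factor):
    """Univariate OLS slope of returns on factor, plus residuals."""
    n = len(returns)
    mr = sum(returns) / n
    mf = sum(factor) / n
    cov = sum((r - mr) * (f - mf) for r, f in zip(returns, factor))
    var = sum((f - mf) ** 2 for f in factor)
    beta = cov / var
    resid = [r - beta * f for r, f in zip(returns, factor)]
    return beta, resid

def cascade(returns, factors):
    """Run the cascade; return per-factor exposures and specific returns."""
    exposures = []
    resid = list(returns)
    for f in factors:
        b, resid = beta_and_residual(resid, f)
        exposures.append(b)
    return exposures, resid

# Toy example: returns = 0.5 * factor1 plus a constant 0.1% "alpha" drift.
factor1 = [0.01, -0.02, 0.03, 0.00, -0.01]
returns = [0.5 * f + 0.001 for f in factor1]
exposures, specific = cascade(returns, [factor1])
print(exposures)  # ~[0.5]: the factor exposure is recovered
print(specific)   # ~0.001 each period: the residual (specific) return
```

If two factor bets exactly canceled (w1*alpha1 = -w2*alpha2), the regression would indeed see zero net exposure, which is exactly the non-orthogonality puzzle raised above.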

Another question for Delaney would be, is short_term_reversal a whole collection of factors representing this risk? Or a single one? Or perhaps it is identical to his factor above:

class MeanReversion1M(CustomFactor):
    inputs = (Returns(window_length=21),)
    window_length = 252

    def compute(self, today, assets, out, monthly_rets):
        np.divide(
            -(monthly_rets[-1] - np.nanmean(monthly_rets, axis=0)),
            np.nanstd(monthly_rets, axis=0),
            out=out,
        )


Here's an example mean-reversion algo. I'll post the tear sheet next.

import numpy as np
import pandas as pd
from quantopian.algorithm import attach_pipeline, pipeline_output, order_optimal_portfolio
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data import morningstar as mstar
from quantopian.pipeline.filters import Q1500US
import quantopian.experimental.optimize as opt
from quantopian.pipeline.factors import Latest
from quantopian.pipeline.data.builtin import USEquityPricing

def initialize(context):

    # parameters
    # --------------------------
    context.n_stocks = 300  # number of stocks
    context.N = 6           # trailing window size, days
    context.eps = 1.0       # optimization model parameter
    # --------------------------

    context.bad_data = []   # securities to exclude from trading

    schedule_function(housekeep, date_rules.every_day(), time_rules.market_open())
    schedule_function(get_weights, date_rules.every_day(), time_rules.market_open(minutes=30))

    # Attach our pipeline.
    attach_pipeline(make_pipeline(context), 'my_pipe')

def make_pipeline(context):

    # Define universe
    # ===============================================
    pricing = USEquityPricing.close.latest
    base_universe = (Q1500US() & (pricing > 5))
    ev = Latest(inputs=[mstar.valuation.enterprise_value], mask=base_universe)
    ebitda = Latest(inputs=[mstar.income_statement.ebitda], mask=base_universe)
    ev_positive = ev > 0
    ebitda_positive = ebitda > 0
    market_cap = Latest(inputs=[mstar.valuation.market_cap], mask=ebitda_positive)
    universe = market_cap.top(context.n_stocks)

    return Pipeline(screen=universe)

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = list(pipeline_output('my_pipe').index.values)

    context.stocks = [stock for stock in context.output if stock not in context.bad_data]

def housekeep(context, data):

    leverage = context.account.leverage

    if leverage >= 1.1:
        print "Leverage >= 1.1"

    record(leverage=leverage)

    # Drop leveraged ETFs (filter into a new list rather than removing
    # items while iterating over the same list)
    etfs = security_lists.leveraged_etf_list.current_securities(get_datetime())
    context.stocks = [stock for stock in context.stocks if stock not in etfs]

    # Drop securities that cannot currently be traded
    context.stocks = [stock for stock in context.stocks if data.can_trade(stock)]

    num_secs = 0

    for stock in context.portfolio.positions.keys():
        if context.portfolio.positions[stock].amount != 0:
            num_secs += 1

    record(num_secs=num_secs)

def get_allocation(context, data, prices, b_t):

    m = len(context.stocks)
    x_tilde = np.zeros(m)
    d = np.ones(m)

    for i, stock in enumerate(context.stocks):
        mean_price = np.mean(prices[:, i])
        price_rel = mean_price/prices[-1, i]
        d[i] = price_rel
        if price_rel < 1.0:
            price_rel = 1.0/price_rel
            d[i] = -price_rel
        x_tilde[i] = price_rel

    ###########################
    # Inside of OLMAR (algo 2)

    x_bar = x_tilde.mean()

    # Calculate terms for lambda (lam)
    dot_prod = np.dot(b_t, x_tilde)
    num = context.eps - dot_prod
    denom = (np.linalg.norm((x_tilde - x_bar)))**2

    # test for divide-by-zero case
    if denom == 0.0:
        lam = 0  # no portfolio update
    else:
        lam = max(0, num/denom)

    b = b_t + lam*(x_tilde - x_bar)

    b_norm = simplex_projection(b)

    if lam != 0:
        weight = np.dot(b_norm, x_tilde)
    else:
        weight = 1.0

    return (d*b_norm, weight)

def get_weights(context, data):

    prices = data.history(context.stocks, 'price', 390*context.N, '1m').dropna(axis=1)
    context.stocks = list(prices.columns.values)

    m = len(context.stocks)

    # Accumulate a weighted average of allocations over a range of
    # trailing window lengths
    a_sum = np.zeros(m)
    w_sum = 0

    b_t = 1.0*np.ones(m)/m

    for n in range(5, 5*context.N + 1):
        (a, w) = get_allocation(context, data, prices.tail(n*78).as_matrix(context.stocks), b_t)
        a_sum += w*a
        w_sum += w

    a = a_sum/w_sum

    # Normalize to unit gross exposure, demean to dollar neutral,
    # then renormalize
    a = a/np.sum(np.absolute(a))
    a = a - np.mean(a)
    a = a/np.sum(np.absolute(a))

    allocate(context, data, a)

def allocate(context, data, a):

    weights = {}
    for i, stock in enumerate(context.stocks):
        weights[stock] = a[i]

    for stock in context.portfolio.positions.keys():
        if stock not in context.stocks:
            weights[stock] = 0

    objective = opt.TargetPortfolioWeights(weights)
    constraints = []

    order_optimal_portfolio(objective, constraints)

def simplex_projection(v, b=1):
    """Projection vectors to the simplex domain

    Implemented according to the paper: Efficient projections onto the
    l1-ball for learning in high dimensions, John Duchi, et al. ICML 2008.
    Implementation Time: 2011 June 17 by [email protected] AT pmail.ntu.edu.sg
    Optimization Problem: min_{w}\| w - v \|_{2}^{2}
    s.t. \sum_{i=1}^{m} w_{i} = z, w_{i} \geq 0

    Input: A vector v \in R^{m}, and a scalar z > 0 (default=1)
    Output: Projection vector w

    :Example:
    >>> proj = simplex_projection([.4 ,.3, -.4, .5])
    >>> print proj
    array([ 0.33333333, 0.23333333, 0. , 0.43333333])
    >>> print proj.sum()
    1.0

    Original matlab implementation: John Duchi ([email protected])
    Python-port: Copyright 2012 by Thomas Wiecki ([email protected]).
    """

    v = np.asarray(v)
    p = len(v)

    # Zero out negative entries, then sort into descending order
    v = (v > 0) * v
    u = np.sort(v)[::-1]
    sv = np.cumsum(u)

    rho = np.where(u > (sv - b) / np.arange(1, p + 1))[0][-1]
    theta = np.max([0, (sv[rho] - b) / (rho + 1)])
    w = (v - theta)
    w[w < 0] = 0
    return w

I guess this is a dud, due to the concentration in short_term_reversal?


Hi Delaney -

Quantopian is looking for strategies which allow it to stay within its overall portfolio risk budget. Depending on what else has received an allocation, we may have different budgets for different risk exposures, and the best that you can do as a user is try to get them low across the board without diluting your alpha too much. We'll attempt to give as much clarity and feedback as possible about what we're looking for in the future.

If I understand correctly, certain factors you are considering to be "risks" today may simply be categorized as such because you have enough of them in your fund already. Is this a correct assessment? Or are you saying that if we had access to all of your algos funded to date and ran the tool, we'd find that the risks would be well managed on an individual algo basis (i.e. we wouldn't find any with the kind of short_term_reversal risk I show above)?

@Delaney, thanks for a very detailed explanation on risk, your answers are always very insightful and helpful.

I'm working on getting answers to these questions from our risk expert, Rene. Sorry for the wait and thanks for the questions.

Hi @Delaney,

If you have low total returns but quite high specific returns, what would you do to get total returns up to that high figure? Hedge, or add/subtract a factor from the ranking?

@Grant Kiehne Thanks for your question!

Hmm? It almost sounds like if there is no explanation for the returns, it is best.

About the above question, my answer is "not really". You would want to know what your return sources are. You would want to investigate whether your algo behaves as you expected, whether its returns come mainly from your novel idea, and whether the specific returns represent your novel idea well. This is part of the value of having an economic hypothesis in the first place: it helps you reason your way toward algorithm improvements.

I think the main goal of the Q risk model and performance attribution is to help Q users understand their algos and improve the quality and consistency of their returns. It can help you to:

1. Identify your return sources. For example, suppose I designed a value-based strategy, but all of my returns come from a consistent positive exposure to the energy sector. I would consider that dangerous, since I never thought about how to bet on the energy sector intelligently. If something bad happens to the energy sector in the future, my algo could lose a lot of money, and after the fact is not a good time to figure out why. You can also use attribution to check whether your algo behaves as you expected. The factors the Q risk model uses are the ones typically used in industry, and you can also use customized factors, which is very convenient. Our lecture shows how to use the Fama-French factors, for example.
2. Identify your risk sources. Understand what risk you take on from each factor. Knowing what risk you are taking and what returns you can expect for it is also important.
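As a concrete illustration of point 1 (with made-up weights and sector labels, nothing Quantopian-specific), net sector exposure is just the sum of signed position weights within each sector:

```python
# Net sector exposure = sum of signed position weights per sector.
# Tickers, weights, and sector labels below are hypothetical.
from collections import defaultdict

def net_sector_exposure(weights, sectors):
    """Sum signed portfolio weights by sector label."""
    exposure = defaultdict(float)
    for asset, w in weights.items():
        exposure[sectors[asset]] += w
    return dict(exposure)

weights = {'XOM': 0.30, 'CVX': 0.20, 'AAPL': -0.25, 'MSFT': -0.25}
sectors = {'XOM': 'energy', 'CVX': 'energy', 'AAPL': 'tech', 'MSFT': 'tech'}
print(net_sector_exposure(weights, sectors))
# The portfolio is dollar neutral overall, yet it carries a large
# long-energy / short-tech bet that a sector constraint would flag.
```

A "value" strategy whose energy exposure looks like this is really an energy-sector bet in disguise, which is exactly the situation attribution is meant to surface.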

From my view, if I wanted to sell an algo to investors, I would want to let them know how I came up with the new idea, how I tested it, what risk my algo is taking, what its return sources are, etc., to persuade people that it is a good algo and that I did my research thoroughly and carefully.


Thanks Rene,

I sorta understand the sector returns as a proxy for broad diversification. It is kinda belt-and-suspenders, since if one is using the Optimize API and enforcing sector allocations across a broad universe, it should be in check. In the context of the 1337 Street Fund, I don't really understand the need for broad diversification on an individual algo basis, since in the end, the diversification could come from the large user base of worker bees. However, there is probably a tendency to pile into recent profitable trends, so you'd just end up with a lot of the same strategies. In my lifetime, the dot-com bubble and the housing bubble stand out, where at least some of the associated risk could have been mitigated with diversification. I'd be curious how the Q risk model would address the 2007 quant crisis. My understanding is that it was due to lack of diversification in strategies, across the industry--everyone was doing the same basic thing reportedly (which makes me skeptical of the effectiveness of your approach, as described by "The factors Q risk model are using are the ones typically used in industry").

As far as the Style risk is concerned, my assumption is that you will keep adding (and perhaps removing) factors in this category to manage risk within the 1337 Street Fund. Maybe I'm misinterpreting its dynamic usage, but I can imagine a whole set of unwanted factors, simply because you have them already (but then you'd have to publish them, so perhaps this concept is a non-starter...so I'm not sure how you'll provide this feedback to the crowd).

Regarding economic hypotheses, presumably your Style risks work to some extent. If they were just gibberish, then what would be the point of including them in your risk model? So what are the Quantopian economic hypotheses behind them? If they are valid factors, then there should be some hypothesis behind each, right? Or did you just take them as received knowledge?

It also seems that an economic hypothesis is not necessary, particularly for approaches like the 101 Alphas, Thomas Wiecki's ML example, and your Alpha Vertex PreCog data sets--all, as I understand, amalgamating data from lots of sources. The economic hypothesis ends up being that given enough data, the right algorithm, and a powerful computer, profits can be had.

Hi @Delaney, Q has obviously put a lot of thought & effort into developing this Risk Model. While I find it "interesting" from an academic perspective, I'm still struggling (like @Grant is also) to see exactly how to apply it in practice to improve my algos. Here are some example-type questions to help me understand better:

1). Let's say I have an algo which has cumulative Total returns that rise steadily over time and consist of about a 50/50 mix of "Common" returns and "Specific" returns. Is that good / bad / indifferent, or not a meaningful question?

2) Should I be endeavoring to reduce the RATIO of "Common" returns to either Total or Specific returns without reducing the Total returns?

3) I see in Grant's demo algo above that he has Specific returns (which you equated with alpha in the original post) almost completely equal to his Total returns, while his Common returns (which you equated to risk) are very low, in fact almost zero and even slightly negative. Presumably this is excellent and would be MUCH better than an algo as per my example above in which Common & Specific returns are almost equal. Why then would Grant conclude "I guess this (i.e. his algo) is a dud?"

4). If I have an algo example output showing Cumulative Common Sector returns and I see that a) the largest Positive contribution is coming from the Technology sector, b) very little contribution (close to zero) is coming from the Basic Materials sector, and c) the largest Negative contribution is coming from Real Estate, and moreover these relative contributions are all very consistent over time, what does this mean? Should I then be trying to:
- Increase the proportion of Tech stocks that my algo is trading because this is where most returns are coming from?, or
- Decrease the proportion of Tech stocks because they are already over-contributing compared to other sectors?
- Increase the proportion of Basic Materials stocks because currently they don't contribute enough to returns?, or
- Decrease the proportion of Basic Materials stocks because currently they don't usefully contribute to returns?
- Remove Real Estate stocks because their contribution to returns is consistently negative?

5). If I note that the 3 largest contributors to my Cumulative Common Style returns are Short-term reversal, Momentum, and Volatility, all of which are consistently positive, whereas Size & Value are consistently contributing almost nothing. Does this mean that I should then:
- Increase my focus on Reversal, Momentum & Volatility because these are where my positive returns are coming from,? or
- Decrease my focus on Reversal, Momentum & Volatility because they already contribute disproportionately large amounts? and then
- Increase my focus on Size & Value because these are under-represented?, or
- Decrease my focus on Size & Value because they don't add much return anyway?

6) Currently I feel like you have provided a powerful instrument, but I still don't yet know what exactly I should be trying to do with it. If you personally could design your own algos in such a way as to make the various response plots look like anything you wanted, then what would they look like? i.e., in terms of the specific sets of plots available, what are the "ideal" responses that we should be aiming for in our algo designs?

7) In the context of the risk evaluation tool, how exactly have you mathematically defined "Momentum" and "Short_Term_Reversal"?

My apologies if some of these questions are very naive or might have already been answered elsewhere, but your assistance in clarification will be a great help in allowing me (and hopefully others) to make better use of these new Q Risk tools.

In the initial notebook posted by Delaney, he states "Notice how nearly all of the returns are explained by the common risk returns, while the specific returns are much smaller. You can see this expressed in the initial table, and also in the first plot."

Based on the 1st plot, it actually looks as though nearly all of the returns are explained by specific returns (the green line in the plot). Cumulative common returns (the red line) hovers below 0 for most of the period and then barely moves above 0 at the end of the period. Based on the "Summary Statistics" table:

• total return ~0.79%
• specific return ~0.56% (70% of total return)
• common return ~0.24% ( 30% of total return)

It appears to me that the majority of the return is made up of returns specific to the strategy (which is not to say there isn't significant exposure to the short-term reversal factor, as there clearly is).
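The split described above can be checked with two divisions (numbers read off the summary table; over short horizons the common and specific components combine approximately additively, which is why 0.56 + 0.24 slightly overshoots 0.79):

```python
total = 0.0079      # total return, ~0.79%
specific = 0.0056   # specific return, ~0.56%
common = 0.0024     # common return, ~0.24%

# Shares of total return; the small mismatch comes from
# compounding and rounding in the summary table
specific_share = specific / total
common_share = common / total

print(round(specific_share, 2))  # ~0.71, i.e. roughly 70%
print(round(common_share, 2))    # ~0.30, i.e. roughly 30%
```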

Am I misunderstanding something?

Hey all, I wanted to try to help clear up some confusion. I made a mistake when conflating short term reversal and mean reversion earlier; I've since updated the text in the main post to reflect a more accurate explanation. I welcome any additional feedback.

Also, Matthew is correct: the way I explained how returns are attributed in the model was confusing and basically a typo. What I should have said is that short term reversal seems to be a massive effect in the algorithm that is not well controlled. Generally, a risk model will never pick up all the effects of risk factors, as factors are only approximations and can't explain everything that's going on. Things like intraday trading, for instance, will confuse ours. What we see here is the returns largely being attributed to specific risk, but just because that's the case we shouldn't assume they are really alpha. We need to look at the exposure data to understand how our algorithm is exposed to factors and which exposures need to be controlled better. Only when both the returns attribution and the exposure charts look good would I be comfortable saying the algorithm is likely producing uncorrelated returns.
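The point that low attributed common returns can hide a large standing exposure can be illustrated with synthetic data: a strategy can load heavily on a factor whose own return happened to be near zero over the sample, so attribution looks harmless while the risk is very real. A sketch (the 63-day window and all numbers are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(1)
n_days = 500
idx = pd.date_range("2016-01-04", periods=n_days, freq="B")

# A factor whose cumulative return over the sample is exactly zero
factor = pd.Series(rng.normal(0.0, 0.01, n_days), index=idx)
factor -= factor.mean()

# A strategy heavily exposed to the factor (beta ~ 1) plus a little alpha
strategy = factor + rng.normal(0.0004, 0.002, n_days)

# Rolling 63-day exposure estimate: cov(strategy, factor) / var(factor)
rolling_beta = strategy.rolling(63).cov(factor) / factor.rolling(63).var()

# Attribution looks harmless: the factor contributed ~nothing in-sample...
print("factor cumulative return:", factor.sum())
# ...but the exposure is large the whole time, which is the real risk
print("median rolling exposure:", rolling_beta.median())
```

If the factor had instead sold off sharply, this strategy would have followed it down, even though the historical attribution assigned it almost no common return.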

That makes sense. Thanks for the clarification, Delaney.

Hi Delaney -

I fear that you and your Q colleagues will struggle to explain and to justify this risk model jazz without writing a white paper that explains the theory at some fundamental level (using math rather than commented code), the Q implementation, and the justification based on the performance of actual long-short equity hedge funds that define and manage risk in a similar fashion. Such a document would probably be of interest to your prospective customers, as well.

Above, you say:

Short term reversal is a specific form of mean reversion that bets on short term deviations in price reverting to the mean; and uses price data exclusively. It is considered common risk due to its widespread knowledge and use, so it would be considered [not] to have any real alpha. All the performance, positive or negative, obtained from investing in mean reversion would be considered common risk attributed.

This is really confusing. First off, I think this is the first time I've heard a distinction between "real alpha" and (presumably) "fake alpha" (personally, I prefer "genuine alpha" and "faux alpha"). I guess by "real" you mean likely to persist consistently into the future? If algorithmic trading based on short-term reversal is well-known and widely used, then if the market is at all efficient, I'd expect it not to work (which is kinda what your algo shows, with a SR = 0.28). So, in the end, your risk model is trying to filter out strategies that don't work. But if they fundamentally don't work, then what is the point in filtering them out, and creating confusion and churn? If short-term reversal doesn't work, then the algo performance will suck (based on SR, risk-adjusted returns, whatever) and you can just reject it on that basis. Generally, there is a long list of factors that don't work; is the end game to add them to the risk model as they are identified (e.g. you could go through the 101 Alphas and your vast collection of other Pipeline factors and add the ones that are silly to your risk model)? But then what would be the point, if they don't work in the first place? Very confusing. I suppose if you ever publish the details of your risk model, we can see which factors are simply junk in the first place, and aren't really risks at all.

On the other hand, if short-term reversal is well-known and widely used, but regardless, is an anomaly that is still profitable, then why not join the party? I guess the risk is that we are in the midst of a short-term reversal bubble that will burst without warning? But then, if short-term reversal works, wouldn't you want a sprinkling of it, to capture returns during such bubbles? Again, confusing.

The guidance I'm hearing is that if one wants to attempt a short-term reversal algo, then it needs to have some "secret sauce" beyond pure price data, it needs to be fancified in some fashion, even if the dressing up doesn't make it any more profitable than a plain vanilla short-term reversal strategy. In practice, this means driving down the short_term_reversal risk factor attribution metric to +/- 40% or less (presumably lower is better?).
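One generic way to drive a style exposure down, independent of any particular platform API, is to orthogonalize the candidate portfolio weights against the factor's cross-sectional scores before ordering. This is an illustrative, platform-agnostic sketch, not Quantopian's Optimize API; the `neutralize` helper and all numbers are assumptions for the example:

```python
import numpy as np

def neutralize(weights, factor_scores):
    """Remove the component of `weights` that lies along `factor_scores`
    (both demeaned), so the portfolio's linear exposure to the factor is ~0."""
    w = weights - weights.mean()
    f = factor_scores - factor_scores.mean()
    beta = np.dot(w, f) / np.dot(f, f)  # exposure of the weights to the factor
    return w - beta * f

rng = np.random.RandomState(2)
scores = rng.normal(size=200)                   # e.g. a short-term reversal score
weights = 0.6 * scores + rng.normal(size=200)   # raw weights, heavily exposed

neutral = neutralize(weights, scores)
exposure_before = np.corrcoef(weights, scores)[0, 1]
exposure_after = np.corrcoef(neutral, scores)[0, 1]
print(exposure_before, exposure_after)  # large before, ~0 after
```

In practice you would re-impose gross and per-position limits after the projection, since subtracting the factor component changes the weight magnitudes.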

Perhaps you could run Alphalens (back to 2002) on your short_term_reversal risk factor and post it? Maybe that would help clarify things, in terms of its viability as a factor in the first place?

Hi @Delaney,
I think @Grant and I are both struggling with a lot of the same things. While he is requesting explanations to help him understand general concepts, I'm coming at it a different way and asking 7 very specific questions related to examples in my post above. Could you please address those specific questions? Thanks in advance, TonyM.

Here's another example for folks to chew on. Will post tear sheets next.

from quantopian.algorithm import attach_pipeline, pipeline_output, order_optimal_portfolio
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import CustomFactor, RollingLinearRegressionOfReturns
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.data import Fundamentals, psychsignal
from quantopian.pipeline.classifiers.fundamentals import Sector
from quantopian.pipeline.factors import Latest, Returns, AverageDollarVolume
from quantopian.pipeline.filters import QTradableStocksUS
import quantopian.optimize as opt
from sklearn import preprocessing
from scipy.stats.mstats import winsorize

import numpy as np
import pandas as pd

# Liquidity screen parameters
LIQUIDITY_LOOKBACK_LENGTH = 100
UNIVERSE_SIZE = 1200

# Constraint parameters
MAX_GROSS_EXPOSURE = 1.0
NUM_TOTAL_POSITIONS = 600
NUM_LONG_POSITIONS = NUM_TOTAL_POSITIONS / 2
NUM_SHORT_POSITIONS = NUM_LONG_POSITIONS

MAX_LONG_POSITION_SIZE = 2.0 / NUM_TOTAL_POSITIONS
MAX_SHORT_POSITION_SIZE = MAX_LONG_POSITION_SIZE

# Risk exposures
MAX_SECTOR_EXPOSURE = 0.005
MAX_BETA_EXPOSURE = 0.05

# Factor preprocessing settings
WIN_LIMIT = 0.01  # winsorize limit used in preprocess()


def make_factors():

    class mean_rev(CustomFactor):
        # Short-term mean reversion: deviation of the recent average price
        # from the latest close, weighted by the normalized daily range
        inputs = [USEquityPricing.open, USEquityPricing.high,
                  USEquityPricing.low, USEquityPricing.close]
        window_length = 5

        def compute(self, today, assets, out, open, high, low, close):
            p = (open + high + low + close) / 4.0  # typical price
            rng = (high - low) / close             # normalized daily range

            a = np.zeros(p.shape[1])
            for k in range(1, len(p) + 1):
                b = preprocess(np.mean(p[-k:, :], axis=0) / close[-1, :])
                w = np.nanmean(rng[-k:, :], axis=0)
                a += w * b

            out[:] = preprocess(a)

    class fcf(CustomFactor):
        inputs = [Fundamentals.fcf_yield]
        window_length = 1

        def compute(self, today, assets, out, fcf_yield):
            out[:] = preprocess(fcf_yield[-1])

    class earn_yield(CustomFactor):
        inputs = [Fundamentals.earning_yield]
        window_length = 1

        def compute(self, today, assets, out, earn_yield):
            out[:] = preprocess(earn_yield[-1])

    class sentiment(CustomFactor):
        inputs = [psychsignal.stocktwits.bull_minus_bear]
        window_length = 1

        def compute(self, today, assets, out, sentiment):
            out[:] = preprocess(sentiment[-1])

    class MessageSum(CustomFactor):
        inputs = [psychsignal.stocktwits.bull_scored_messages,
                  psychsignal.stocktwits.bear_scored_messages]
        window_length = 5

        def compute(self, today, assets, out, bull, bear):
            out[:] = preprocess(np.nansum(bear - bull, axis=0))

    class Volatility(CustomFactor):
        inputs = [USEquityPricing.high, USEquityPricing.low, USEquityPricing.close]
        window_length = 21

        def compute(self, today, assets, out, high, low, close):
            p = (high - low) / close
            out[:] = preprocess(-np.nansum(p, axis=0))  # favor low-range stocks

    class Direction(CustomFactor):
        inputs = [USEquityPricing.open, USEquityPricing.close]
        window_length = 21

        def compute(self, today, assets, out, open, close):
            p = (close - open) / close
            out[:] = preprocess(-np.nansum(p, axis=0))  # bet against recent intraday direction

    return {
        'MeanRev':    mean_rev,
        'FCF':        fcf,
        'Yield':      earn_yield,
        'Sentiment':  sentiment,
        'MessageSum': MessageSum,
        'Volatility': Volatility,
        'Direction':  Direction,
    }


def make_pipeline():

    # Universe: tradable stocks above $5, ranked by dollar volume
    pricing = USEquityPricing.close.latest
    base_universe = QTradableStocksUS() & (pricing > 5)
    universe = AverageDollarVolume(
        window_length=LIQUIDITY_LOOKBACK_LENGTH,
        mask=base_universe,
    ).top(UNIVERSE_SIZE)

    sector = Sector(mask=universe)  # sector needed to construct portfolio
    # ===============================================

    factors = make_factors()

    # Equal-weighted combination of the (already normalized) factors
    combined_alpha = None
    for name, f in factors.iteritems():
        if combined_alpha is None:
            combined_alpha = f(mask=universe)
        else:
            combined_alpha += f(mask=universe)

    longs = combined_alpha.top(NUM_LONG_POSITIONS)
    shorts = combined_alpha.bottom(NUM_SHORT_POSITIONS)

    long_short_screen = (longs | shorts)

    # Shrunk beta estimate versus SPY (sid 8554)
    beta = 0.66 * RollingLinearRegressionOfReturns(
        target=sid(8554),
        returns_length=5,
        regression_length=260,
        mask=long_short_screen,
    ).beta + 0.33 * 1.0

    # Create pipeline
    pipe = Pipeline(
        columns={
            'combined_alpha': combined_alpha,
            'sector': sector,
            'market_beta': beta,
        },
        screen=long_short_screen,
    )
    return pipe


def initialize(context):

    context.spy = sid(8554)

    attach_pipeline(make_pipeline(), 'long_short_equity_template')

    # Schedule my rebalance function
    schedule_function(func=rebalance,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_open(minutes=60),
                      half_days=True)
    # Record my portfolio variables at the end of the day
    schedule_function(func=recording_statements,
                      date_rule=date_rules.every_day(),
                      time_rule=time_rules.market_close(),
                      half_days=True)


def before_trading_start(context, data):
    # Pipeline output must be fetched here, not in initialize
    context.pipeline_data = pipeline_output('long_short_equity_template')


def recording_statements(context, data):
    record(num_positions=len(context.portfolio.positions))


def rebalance(context, data):

    pipeline_data = context.pipeline_data

    risk_factor_exposures = pd.DataFrame({
        'market_beta': pipeline_data.market_beta.fillna(1.0)
    })

    # Scale the combined alpha so the weights sum to 1 in absolute value
    denom = np.nansum(np.absolute(pipeline_data.combined_alpha.values))
    objective = opt.MaximizeAlpha(pipeline_data.combined_alpha / denom)

    constraints = []

    constraints.append(opt.MaxGrossExposure(MAX_GROSS_EXPOSURE))
    constraints.append(opt.DollarNeutral())
    constraints.append(
        opt.NetGroupExposure.with_equal_bounds(
            labels=pipeline_data.sector,
            min=-MAX_SECTOR_EXPOSURE,
            max=MAX_SECTOR_EXPOSURE,
        ))
    neutralize_risk_factors = opt.FactorExposure(
        loadings=risk_factor_exposures,
        min_exposures={'market_beta': -MAX_BETA_EXPOSURE},
        max_exposures={'market_beta': MAX_BETA_EXPOSURE},
    )
    constraints.append(neutralize_risk_factors)
    constraints.append(
        opt.PositionConcentration.with_equal_bounds(
            min=-MAX_SHORT_POSITION_SIZE,
            max=MAX_LONG_POSITION_SIZE,
        ))

    try:
        order_optimal_portfolio(
            objective=objective,
            constraints=constraints,
        )
    except Exception:
        # Skip this rebalance if the optimizer fails (e.g. infeasible constraints)
        return


def preprocess(a):
    # Demean, winsorize, L1-normalize, then standardize the factor values
    a = np.nan_to_num(a - np.nanmean(a))
    a = winsorize(a, limits=(WIN_LIMIT, WIN_LIMIT))
    a = a / np.sum(np.absolute(a))

    return preprocessing.scale(a)

Not so good, I guess, due to the consistent concentration in short_term_reversal? But then, does the tear sheet analysis suggest that after the risk factors are regressed away, it might be o.k.?


@Grant, @Tony, @Karl, those are all valid arguments. However, they underscore the randomness built into all those processes. It is as if we were trying to give meaning to some randomly generated short-term phenomenon, and there the whole rhetoric does not hold up.

For instance, over the long term a randomly generated price series should tend to its mean. Nonetheless, one could find all kinds of patterns in it, even if the price series obeyed a Gaussian distribution, which real prices do not. And that is where the assumptions become convoluted.

I can take 100 randomly generated price series, provide them with external factors, pass them through the Optimize API under those constraints and get some results. But whatever factor I would have used, none of them would be relevant going forward. So, you could say I am not surprised by most of the systems presented on Q that follow those criteria, especially if they try to extract short-term trends or reversals that might or might not be detectable going forward.

We design trading systems for forward operation and try to see over past data whether some of the procedures we implemented would have survived over extended periods of time. Because if they do not, we have nothing in our hands to rely on.

The more randomness there is in a price series, the more the expected outcome has a zero mean. And this holds whatever trade mechanics we implement going forward, whether we add factors or not. All you will find are temporary coincidental occurrences that have no probability measure for being there tomorrow.

@Karl, points well taken. The US market has had an upside bias of about 0.52 over the past 240 years or so. Just positioning oneself all over the place will tend to catch this upward drift. And because of this upward drift, we cannot say the market is random; however, we can still say it is random-like. If you detrend a price series, you are left with its random component. The same goes if you detrend a portfolio.

One part of Modern Portfolio Theory deals with µ and σ, giving you, for instance, a SR measure. These are the same values you will find in the price model dp(t) = µ·dt + σ·dW. There is no technique to win on σ·dW, whereas you can always win on µ·dt, on average, simply by holding a group of stocks long enough.
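The µ/σ decomposition referenced above is the standard drift-plus-noise price model; written out (a textbook sketch, with W(t) a Wiener process so that E[dW] = 0):

```latex
dp(t) = \mu\,dt + \sigma\,dW(t)
\qquad\Rightarrow\qquad
\mathbb{E}\!\left[p(T)\right] = p(0) + \mu T .
```

Because the noise term has zero expectation, only the drift µT can be harvested on average, which is the sense in which one can "win on µ·dt" but not on "σ·dW".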

People look at Δt over the short-term, in a Markowitz manner, when the real problem is T viewed as a trading strategy's termination time. There is no theoretical system that can give you a reasonable estimate unless you start looking at how far a trading strategy can go. Our guesstimates should be on more solid foundations.

So, no, I do not see the market as random, but I should add one word. I do not see it as totally random, just random-like. Enough to have σdW → 0.

I have not seen anyone come out with a trade vector for 100 trades to be taken tomorrow that came out right, which is what "not random" would imply; up to now, no one has delivered it.

Also note that to make a million bucks with the numbers you presented you would have to make some 800,000 trades. And if each of those trades cost you a minimum of $1.00, you would be left with$200k and not $1M. To get your$1M under those conditions, you would need some 4,000,000 trades having the same strategy signature. If you wanted more than $1M, then the number of trades would go up proportionally. What I am trying to do is to elicit clear answers to 7 specific questions regarding practical application of output from Q's Risk Model. @Guy, i wasn't aware that i had actually presented any argument at all, valid or otherwise! ;-) @Delaney, @Rene, @Ernesto, @Grant, @Karl, @anyone else who can help answer my questions, here they are again: 1). If an algo has cumulative Total returns that rise steadily over time and consist of about a 50/50 mix of "Common" returns and "Specific" returns, is that considered good / bad / indifferent, or not a meaningful question? 2) Should our aim be to reduce the ratio of "Common" returns to either Total or Specific returns without reducing the Total returns? 3) In the demo algo above by @Grant, he has Specific returns (which were equated with alpha in the original post) almost identical to his Total returns, whereas his Common returns (which were equated to risk) are almost zero and even slightly negative. I thought this was probably good, but why then would Grant conclude: "I guess this [algo] is a dud?" 4). If I an algo shows consistent Cumulative Common Sector returns with the largest Positive contribution from the Technology sector, almost zero contribution from the Basic Materials sector, and a large Negative contribution from Real Estate, does that imply that we should be trying to: - Increase the proportion of Tech stocks the algo is trading because this is where most returns are coming from?, or - Decrease the proportion of Tech stocks because they are already over-contributing compared to other sectors? 
- Increase the proportion of Basic Materials stocks because currently they don't contribute enough to returns?, or - Decrease the proportion of Basic Materials stocks because currently they don't usefully contribute to returns? - Remove Real Estate stocks because their contribution to returns is consistently negative? 5). If an algo has the 3 largest positive contributors to Cumulative Common Style returns from Short-term reversal, Momentum, and Volatility, whereas Size & Value are consistently contributing almost nothing, does this mean we should: - Increase our focus on Reversal, Momentum & Volatility because these are where my positive returns are coming from,? or - Decrease our focus on Reversal, Momentum & Volatility because they already contribute disproportionately large amounts? and then - Increase our focus on Size & Value because these are under-represented?, or - Decrease our focus on Size & Value because they don't add much return anyway? 6) If you could design your own algos in such a way as to make the Q Risk Model response plots look like anything you wanted, then what would they look like? i.e. in terms of the specific sets of plots available in Q's Risk Model, what are the "ideal" responses that we should be aiming for in our algo designs? 7) In the context of Q's risk evaluation tool, how exactly are "Momentum" and "Short_Term_Reversal" defined mathematically? Looking forward to your help answering these questions @anyone, please.... Hi Tony - Regarding the algo I posted later (5a2524b7ea6d2140f870e698), if I'm understanding correctly, its concentration in short_term_reversal of 0.6 is too high. However, I'm not clear if this is misleading, since the cumulative common returns are consistently low; if I understand correctly, short_term_reversal is contributing very little to the overall return. 
These numbers look really good:

Summary Statistics
• Annualized Specific Return: 6.22%
• Annualized Common Return: 0.59%
• Annualized Total Return: 6.85%
• Specific Sharpe Ratio: 2.49

But then, for the contest (and presumably the Q fund), the algo would be rejected on the basis of short_term_reversal being too high. The game plan, I gather, is that Quantopian would provide an unwanted-factor extraction tool as part of the Optimize API, so that if it works perfectly, the common returns would be reduced to zero and all that would remain is specific returns (the valuable "new alpha").

Of course, the other problem with the algo is that I just cobbled together a bunch of factors, hoping for the best. The backtest really needs to be run back to the earliest possible date (it bogs down my pc, and I'm not sure that the research platform has enough memory), and each factor examined individually in Alphalens. And on, and on. In the end, I'd also need to formulate an economic rationale of some sort, so it sounds like I know what I'm doing (which I don't). To get an allocation, the specific returns actually need to be explained. My approach, I think, would be to just allow Quantopian to look at the code and figure it out for themselves.

In the end, for the Quantopian team, I think what authors really need is to be able to answer the question "Am I done? If I set the algo aside for six months, and let it percolate, and it is not over-fit, will I have a decent shot at an allocation?" Taking my backtest above (5a2524b7ea6d2140f870e698) as a working example, what is my punch list?

@Karl Thanks for your question. Your question is more about evaluating whether a factor should be invested in, which is not really what a risk model is designed to do. A risk model and performance attribution help you know what your return sources and risk sources are, so you can check whether they match what your algo was designed to do.
A book that helped me a lot in understanding risk management in quant finance, and which I hope helps you as well, is "Inside the Black Box" by Rishi Narang; the relevant pages are in Chapter 4 (Google Books): https://books.google.com/books?id=aYA0LnecyTgC&pg=PT64&source=gbs_selected_pages&cad=3#v=onepage&q=jogn%20maynard%20keynes&f=false

About evaluating a factor, I would look at returns and volatilities, but not only at them. I would also ask myself whether I believe the factor can provide consistent returns and what its drawdown durations are. I personally would not invest in only a short_term_reversal factor as defined in the risk model, because I do not think it could provide consistent returns.

@Grant Kiehne Thanks for your question. We are working on a whitepaper. As I mentioned, the risk model and performance attribution are tools for you to understand your algo and, with that new understanding, improve the quality and consistency of its returns. By "improve the quality and consistency of returns," I mean "are your algos doing what they are designed to do?" Whether you should invest in short_term_reversal or some other common factor defined in the Q risk model is your decision, and not a question a risk model can answer. If you believe these factors can provide consistent returns and they cannot be invested in a cheaper way, then you should write an algo and invest in them. However, I suggest you re-think it - I don't personally believe that factor is a good stand-alone choice for most purposes.

@Tony Morland Thanks for your questions! Let me try to answer them one by one.

1) Let's say I have an algo which has cumulative Total returns that rise steadily over time and consist of about a 50/50 mix of "Common" returns and "Specific" returns. Is that good / bad / indifferent, or not a meaningful question?

If I were you, I would ask myself why this happens and whether it matches my understanding of the strategy. Imagine a strategy that was designed to be an arbitrage strategy, but ends up making a significant technology sector bet in addition to the intentional bet on the algo author's arbitrage idea. Even if it made tons of money from the consistent positive exposure to technology, that does not mean it is fine. This algo is risky, because the algo author may never have thought about how to intelligently bet on the technology sector. If something bad happens to the technology sector in the future, the algo could lose tons of money and the algo author would have a very unpleasant surprise.

It's also dangerous to make up 'just so' stories that in retrospect explain what's going on. It's important to let your fundamental understanding of the algo explain what's going on, and use the risk model to look for deviations from that. An algo that made money for unknown reasons is more risky than an algo that lost money for an unknown reason, because most people pay more attention to figuring out why they lost money than why they made money.

2) Should I be endeavoring to reduce the RATIO of "Common" returns to either Total or Specific returns without reducing the Total returns?

Before answering this question, let me explain a little about why the algo author needs a risk model. The risk model helps the algo author improve the quality and consistency of returns. My hope is that performance attribution can help you understand your algo in a deeper and more comprehensive way. If you think your algo is taking exposures to some unexpected factors, you could try to lower the corresponding factor exposures. The goal is to let your algo do what it is designed to do, and what it is good at, as much as possible. So, if my algo were an arbitrage algo, I would hope that it focuses on making money from its arbitrage edge.
There are some methods to lower the common factor exposures. 1. Limit them by using constraints or penalties in the optimization function. Please refer these as examples, https://www.quantopian.com/lectures/risk-constrained-portfolio-optimization https://www.quantopian.com/posts/introduction-to-the-quantopian-risk-model-in-research http://nbviewer.jupyter.org/github/cvxgrp/cvx_short_course/blob/master/applications/portfolio_optimization.ipynb 2. Hedge - hedging out the factor exposures is also an option, but it is not an easy option for all of the strategies. One has to have a proper hedge vehicle. 3. In the step of designing an trading strategy algo, it may be good to avoid designing a strategy just purely based on these common factors. For example, some strategies are just purely and simply buying small size stocks or buying stocks with strong gains or losses in short term. While lowering the common factor exposures, you will want to observe how your algo’s behaviour changes and be able to explain why the behaviour changes. 3) I see in Grant's demo algo above that he has Specific returns (which you equated with alpha in the original post) almost completely equal to his Total returns, while his Common returns (which you equated to risk) are very low, in fact almost zero and even slightly negative. Presumably this is excellent and would be MUCH better than an algo as per my example above in which Common & Specific returns are almost equal. Why then would Grant conclude "I guess this (i.e. his algo) is a dud?" Like I mentioned, the first goal of the Q risk model and performance attribution is to provide a tool for users to improve the quality and consistency of algo returns. I would not evaluate an algo by this way. Knowing the sources of return and risk is very important. About the Q fund portfolio, we hope to allocate to algos that are not just consistently and highly exposed to the common factors. 4). 
If I have an algo example output showing Cumulative Common Sector returns and I see that a) the largest positive contribution is coming from the Technology sector, b) very little contribution (close to zero) is coming from the Basic Materials sector, and c) the largest negative contribution is coming from Real Estate, and moreover these relative contributions are all very consistent over time, what does this mean? Should I then be trying to:
- Increase the proportion of Tech stocks that my algo is trading, because this is where most returns are coming from? or
- Decrease the proportion of Tech stocks, because they are already over-contributing compared to other sectors?
- Increase the proportion of Basic Materials stocks, because currently they don't contribute enough to returns? or
- Decrease the proportion of Basic Materials stocks, because currently they don't usefully contribute to returns?
- Remove Real Estate stocks, because their contribution to returns is consistently negative?

Not really. All these numbers just try to provide you with insights into your algo. Before running the performance attribution, I would ask myself one question: what should my algo's returns rely on, based on my design? While reading the performance attribution, I would ask:
• Do the exposures match my intuition?
• If not, why?
• Does my algo have unexpectedly high or low exposures in any category?
• Do these unexpected exposures matter?
• Could they ruin my algo in the future?
• What if I lower these unexpected exposures?

5) If I note that the 3 largest contributors to my Cumulative Common Style returns are Short-term reversal, Momentum, and Volatility, all of which are consistently positive, whereas Size & Value are consistently contributing almost nothing, does this mean that I should then:
- Increase my focus on Reversal, Momentum & Volatility, because these are where my positive returns are coming from?
or
- Decrease my focus on Reversal, Momentum & Volatility, because they already contribute disproportionately large amounts?
and then
- Increase my focus on Size & Value, because these are under-represented? or
- Decrease my focus on Size & Value, because they don't add much return anyway?

I do not suggest adjusting positions based on these returns. They are just historical returns.

6) Currently I feel like you have provided a powerful instrument, but I still don't yet know what exactly I should be trying to do with it. If you personally could design your own algos in such a way as to make the various response plots look like anything you wanted, then what would they look like? i.e. in terms of the specific sets of plots available, what are the "ideal" responses that we should be aiming for in our algo designs?

This is a book which helped me a lot in understanding risk management in quant finance. You could read the relevant pages (Chapter 4) from the Google Books version of "Inside the Black Box" by Rishi Narang. https://books.google.com/books?id=aYA0LnecyTgC&pg=PT64&source=gbs_selected_pages&cad=3#v=onepage&q=jogn%20maynard%20keynes&f=false

7) In the context of the risk evaluation tool, how exactly have you mathematically defined "Momentum" and "Short_Term_Reversal"?

We will release it soon to the community. Basically, the momentum factor is defined as "last 11 months' returns, ending 1 month ago", and "Short_Term_Reversal" is defined with the "Relative Strength Index".

@Rene, here is the pair strategy I started to design.
    stocks, wt = symbols('AMZN', 'ADBE', 'SHY'), [.15, -.1, 0.75]

    def initialize(context):
        context.first = True
        schedule_function(trade, date_rules.month_start(), time_rules.market_open(minutes=65))

    def trade(context, data):
        if get_open_orders():
            return
        if context.first:
            for i, stock in enumerate(stocks):
                if data.can_trade(stock):
                    order_target_percent(stock, wt[i])
            context.first = False

    ''' START 06/01/2007 END 12/04/2017 '''

Below is the tear sheet with the risk model. The results of the strategy completely match my understanding of the strategy. Can you point me specifically to where this information helps the algo author improve the quality and consistency of returns? The strategy has held AMZN (217), ADBE (-226), SHY (936) since 2007-06-01. Where did these exposures come from?

Exposures Summary:

    Risk Factor            Average Exposure    Annualized Return    Cumulative Return
    momentum                0.46                -1.45%               -14.23%
    value                  -0.66                -0.08%                -0.83%
    short_term_reversal    -0.13                 0.31%                 3.26%
    volatility              0.07                -0.48%                -4.94%

Hi Rene - Thanks for the feedback. It will be interesting to read the white paper on the Quantopian risk model. Also, it will be interesting to see the details of how you define the short_term_reversal factor (presumably it can be expressed as a Pipeline factor).

Regarding my algo above (5a2524b7ea6d2140f870e698), I'm still perplexed. The top-level stats look good:

    Summary Statistics
    Annualized Specific Return: 6.22%
    Annualized Common Return: 0.59%
    Annualized Total Return: 6.85%
    Specific Sharpe Ratio: 2.49

Yet the guidance from the proposed contest limits suggests I need to keep working on it, to get short_term_reversal down (and possibly volatility). Is it worth the effort, given that the total annualized common return is 10X lower than the annualized specific return? I'm just not following why I would want to spend more time on it, unless you are saying that I'd never get an allocation unless it conforms to the proposed contest limits. Am I done? Or not?
And if I'm not done, should I wait for the patented Q Factor Extractor or fiddle around with it myself? Or maybe you already have a factor extractor, and I just don't know how to use it?

@Vladimir, I am not sure if you noticed the warning message in the performance attribution. The Q risk model does not cover the asset SHY. Your algo only trades 3 assets, and the influence of missing one of the three assets in the analysis could be significant.

    /usr/local/lib/python2.7/dist-packages/pyfolio/perf_attrib.py:172: UserWarning:
    Could not determine risk exposures for some of this algorithm's positions.
    Returns from the missing assets will not be properly accounted for in
    performance attribution. The following assets were missing factor loadings:
    [u'SHY-23911']. Ignoring for exposure calculation and performance attribution.
    Ratio of assets missing: 0.333. Average allocation of missing assets:
    SHY-23911    78732.373459
    dtype: float64.
      warnings.warn(missing_stocks_warning_msg)

@Rene, SHY is very close in returns to a money market fund or cash, so removing it from the algo should not significantly change the results.

    stocks, wt = symbols('AMZN', 'ADBE'), [.15, -.1]

    def initialize(context):
        context.first = True
        schedule_function(trade, date_rules.month_start(), time_rules.market_open(minutes=65))

    def trade(context, data):
        if get_open_orders():
            return
        if context.first:
            for i, stock in enumerate(stocks):
                if data.can_trade(stock):
                    order_target_percent(stock, wt[i])
            context.first = False

    ''' START 06/01/2007 END 12/04/2017 '''

Below is the tear sheet with the risk model. The results of the strategy completely match my understanding of the strategy. Please answer my questions: Can you point me specifically to where this information helps the algo author improve the quality and consistency of returns? The strategy has held AMZN (217), ADBE (-226) since 2007-06-01. Where did these exposures come from?
Exposures Summary:

    Risk Factor            Average Exposure    Annualized Return    Cumulative Return
    momentum                0.22                -0.74%                -7.49%
    value                  -0.35                -0.03%                -0.36%
    short_term_reversal    -0.08                 0.20%                 2.11%
    volatility             -0.00                -0.19%                -2.03%

Hi @Rene, I appreciate your help and my thanks to you for your detailed explanations, but they still largely leave the intent of most of my questions unanswered.

@Delaney, please can you also assist us here. Maybe my questions are just too naive, but I assume the new Q Risk model is actually intended to be USED in a very practical way, rather than being of only academic interest, and so I am trying to understand HOW to use it. (As an analogy, imagine that you know a lot about internal combustion engines but have never actually driven a car before, and want to know what the gauges are telling you regarding your required actions in very practical terms. If the fuel gauge shows "E", this implies that you need to do the following: pull into a gas station and fill up. You don't need more info about internal combustion engines.)

Context: All algos referred to in my examples / questions here are a) equity long-short, and b) beta neutral, dollar neutral, balanced between all sectors, and c) make equity selections long/short based on the following criteria:
- Stock fundamentals in terms of value & quality using Morningstar data, and
- A mixture of both momentum & mean-reversion considerations using individual stock price data covering the range from 1 week to 1 year.

Q1) @Rene, your comment: "It's also dangerous to make up 'just so' stories that in retrospect explain what's going on." Yes, that is a perfectly reasonable comment, but I do not understand its relevance to my question. I was not trying to explain anything, but simply asking whether or not Q's new Risk tool can actually tell us anything meaningful based on the relative size of Common vs Specific returns.
If so, then what exactly (if anything at all) does the relative size of Common vs Specific returns actually imply?

Q2) Thanks for the references to lectures, etc. I have read them, and I will read them again. But, similar to Q1, the Risk tool tells us the relative size of Common vs Specific returns, so what should we actually do with that info? How can we use it? You write: "I think the goal is to let your algo do what it is designed to do." Well OK, but if the algo is NOT performing as well as we want, then how specifically can we actually USE the info from the new Risk tool to help us improve the algo?

Q3) @Grant: As this question relates to your comment about your algo, and no-one else seems to be answering it, please can I direct it specifically to you. Based on Q's Risk model, why did you infer that your example algo "seems to be a dud"?

Q4) OK, thanks, I will come back to this again later.

Q5) You write: "I do not suggest you to adjust positions based on returns. They are just historical returns." Well yes, of course, fully understood, but then what (if any) useful guidance for improving our algos can we get from the Risk tool's analysis of sector returns? Is this output from Q's Risk model just something of academic "historical interest", with little use for helping us improve our algos? If, as I hope, we can actually use it in some way, then HOW should we go about doing that?

Q6) I agree that Rishi Narang's book is a good one. I bought it back when it was first published and have re-read it several times since. But you are still not addressing the question that is specifically about Q's new Risk tool. If this tool is useful, then what sort of "ideal" responses should we be aiming to see in Q's Risk model outputs from good algos?

Q7) OK, thanks. I await the more detailed explanation that you mention will be released.
Hi Tony - Regarding my "dud" comment, I am simply referencing the proposed contest constraints published by Jamie on https://www.quantopian.com/posts/a-new-contest-is-coming-more-winners-and-a-new-scoring-system: Style Exposures -40% to 40% in each style. Presumably, the limits for the new contest also correspond to accept/reject criteria for new allocations (I'm guessing Q may be more relaxed in applying them retroactively to existing allocations).

I am confused, however. If Annualized Specific Return >> Annualized Common Return, the Annualized Total Return is decent, and the Specific Sharpe Ratio is reasonable, then it would seem like I'm done (to pick up your driving metaphor, my GPS is saying "You have arrived at your destination."). Yet, if I have a Style Exposure outside the range -0.4 to 0.4, then I need to fix it, and I'm not done. And if I'm not done, then what should I do?

Taking the example Jamie posted on https://www.quantopian.com/posts/a-new-contest-is-coming-more-winners-and-a-new-scoring-system, we have:

    Summary Statistics
    Annualized Specific Return: 8.25%
    Annualized Common Return: 0.25%
    Annualized Total Return: 8.51%
    Specific Sharpe Ratio: 2.22

I guess it needs to be patched up, too, since volatility is too high? But why bother, since the top-line numbers look really good, with the common return practically in the noise. I'm also wondering why Q published the example in the first place, rather than allocating capital to it (the question was asked, but I'm not sure Jamie ever answered it). Again, totally confusing. From a practical standpoint, it says to me that I could put in a lot of work, get really good Summary Statistics, and still not have any decent shot at an allocation. Which brings me full circle to my basic point of never really knowing whether one is done or not. As you say, it is not an academic exercise.
Either the algo is complete (~50/50 shot at an allocation) and should percolate for at least six months (to mitigate other risks), or it is not, and needs more work.

@Grant, thanks for your prompt response & comments. @Delaney, I think we desperately need a "How to Actually USE the New Q Risk Model Output - Practical User's Guide for Dummies" like me. Preferably with minimum theory (it's covered elsewhere) but with some good "IF Risk Model shows A THEN do B to algo" types of clear practical examples. Compare with a metaphor for newbie pilots: "IF altimeter is spinning rapidly towards zero, THEN pull up on the stick."

@Rene, I was just reading your answers to Tony Morland's 7 straightforward questions. You actually fully answered only Q#7. I wonder why there are no straight answers to the other 6 questions? In your attempt to answer Q#1 there is a very interesting statement: "An algo that made money for unknown reasons is more risky than an algo that lost money for an unknown reason, because most people pay more attention to figuring out why they lost money rather than why they made money." How did you come to it, and how does it help in managing money?

Hi @Delaney, @Rene, @anyone else at Q:

I'm patient & persistent, so I will continue until we achieve clarity & understanding, and I thank you in anticipation, Q staff, for your kind patience with me, just the dumb-ass engineer that I am.

Honestly, at the moment, I feel like Q is making this topic of risk modeling appear more difficult & complicated than it needs to be. I wonder why the obfuscation? I get the impression that Q is endeavoring to present this risk model to the quant world as something very new, learned, special & unique. Hardly. Many quants have risk models, they are available commercially, and as yet I have not been able to discern what is special about Q's model, other than that it is free. That's great, but currently how it works is not very clear.
My own personal concept of a "risk model" is really very simple and easy to understand. Here it is:

There are 2 types of risk: accepted or known risk = "good", and unaccepted = unknown risk = "bad". My terms "good" & "bad" here are just relative, but are easy to understand. "Good" risk is the risk that you know you are taking, and it is simply the price associated with getting the alpha you want, just like a known & accepted cost of doing business. And if you can't achieve enough alpha to compensate for that risk, then just move on and try something else. "Bad" risk, on the other hand, is the risk you didn't even know you had, and/or is not associated with the alpha that you are seeking. A simple example will illustrate.

Let's say we want to generate alpha by buying low-PE stocks. Of course there is risk associated with this, just like any other strategy. That risk comes from the fact that some low-PE stocks are low-PE because they are simply bad. It may also come from the possibility that growth stocks will suddenly become more fashionable and people will start dumping their low-PE stocks. But if the alpha we achieve with low-PE stocks is high enough, then we can accept those risks and, in that sense, they are "good" risks.

But now perhaps we find that a large number of low-PE stocks are tobacco stocks, and moreover that most tobacco stocks have low PE ratios. We didn't plan to invest heavily in tobacco stocks, and it is definitely not part of our algo design. So this is a "bad" risk, partly because we didn't plan it, and partly because we expect that, as people become more health conscious, tobacco stocks will fall even further, making them even lower PE (which was our aim), but NOT more attractive as investments for us.
So we have the known / acceptable / "good" risks associated with low-PE stocks, but also "bad" / unacceptable risks, unplanned and initially unknown to us, associated with the stocks from one particular industry, especially an industry that we do not favor. When we identify the subset of our low-PE stocks that carries this undesirable "bad" risk component (i.e. the tobacco stocks), then we can screen out those particular stocks.

Now the limitation of this concept (my own personal risk model) as explained here is that it is largely descriptive. It forces me to think about things, but it doesn't explicitly measure. That is what I need, and where I hope Q's risk model will help.

When I see from Q's risk model that one of my algos has a completely unplanned and therefore, in my terminology, "bad" risk exposure to the Real Estate sector, then I can choose whether or not to accept that, or else to do something about it, perhaps by modifying my strategy to lighten up on Real Estate.

But at the moment, the problem with Q's risk model (at least for me) is that all I can see are a series of colored lines on display plots. We need to know WHAT these are, and HOW exactly they are defined and calculated. Only then can we figure out how to use them wisely, but Q does not seem to be very forthcoming with that vital info.

For example, is the line marked "Real Estate" (RE) showing me "risk" as defined by some measure of my total $ exposure to RE stocks? Or is it showing me a measure of the maximum drawdown % over the last month associated with the RE stocks in my portfolio? Or is it showing me "risk" as the rolling volatility of RE stocks in general, irrespective of whether or not they are in my portfolio? Or is it... etc.?

So, I’m still hoping for a lot more clarity & transparency about this and other aspects of Q’s risk model in general … please!

@Karl, I asked that question of @Rene in
and, if I recall correctly, she said that the Short Term Reversal was the 14-day RSI.

I'm supposing that this RSI-based risk factor will eventually be published as a Pipeline factor (along with the other risk factors)? Is this a correct assumption?

Hi @Alan, @Karl, @Grant,
If you say that Short Term Reversal Risk in Q's Risk model is the 14-day RSI, then I believe you, but frankly I am both amazed and horrified!
We all know what RSI is, and we all know that the classical RSI indicator used a period of 14 bars, but how can Q possibly equate that to "risk"?

First of all what actually is risk in a trading context?
Surely it relates to the probability of loss? Surely it also relates to the potential size of any such loss and therefore to market exposure? Surely it also relates to whether a position is Long or Short? Therefore surely in a portfolio setting the overall risk is the sum of aggregate risk associated with Long positions and Short positions. And so on. All of these are parts of risk. How could "Risk" meaningfully be expressed by neglecting all these things?

Some examples:
- If the market is trending up and we hold a Long position then surely the risk to that position is relatively small compared to the risk associated with holding a Short position of equal size with the same instrument?
- If we are endeavoring to pick turning points then, if there is no underlying trend, the risk associated with picking market tops & bottoms differs slightly due to the generally different shapes of tops & bottoms in equity markets. More than that, if the market has an underlying trend component as well as a cyclic component, then the risk associated with picking market bottoms is inherently less than that associated with tops when the underlying trend is up.

What might be meaningful would be if Q used some measure of "trendiness" vs "mean-reversion potential" as one part of the input to calculating mean reversion risk. But simply calling an RSI indicator "mean reversion risk" seems absurd. What about all the other very significant aspects of risk that go way beyond what can be inferred from any simple TA indicator, even if a 14-period RSI were a good indicator to use (which I doubt)? How does RSI go into the calculation of "probability of loss"?

Surely there must be a lot more to this than just "Q uses a 14-period RSI". Surely Q would be seen as a joke if this went out to the quant world as Q's idea of a "Risk Model". Is someone pulling our legs here, or what?

I'm really perplexed why Q appears to be cagey about the details of their risk model (e.g. see Thomas W.'s comment "I'm not sure the exact specification is helpful as it allows to work around the specific definition of the factor, without reducing the actual exposure" which really floored me, in the context of Q's open paradigm...later in the thread he appeared to have a change of heart, but I have yet to see the risk model details published).

I gather that Q is using the 14-day RSI factor as a proxy for short_term_reversal exposure. Presumably it captures something that they either already have plenty of from licensed algos, view as entirely undesirable, or can add in without having to pay licensing fees to authors (e.g. by simply using their built-in RSI factor... although as a purely crowd-sourced fund, I'm not sure if Q will add in any algos/factors of their own). The thing is, if the 14-day RSI factor will be used exclusively for short_term_reversal factor attribution going forward, then it should be renamed "14-day RSI style risk" to remove the mystery.

To my point above, if the 14-day RSI is not really a profitable factor long-term anyway, then I'm unclear how it is a risk. What does it look like with Alphalens? Is it a factor that anyone would select in the first place, using the Q framework? Or is it total junk? My basic point is that to do style factor attribution, I'd think that the set of factors would need to be sometimes persistent and reasonably profitable, and not just noise. If the 14-day RSI is just noise, then I'm not sure it makes sense for style factor attribution.

The built-in factor is not configurable, as far as I can tell:

    class RSI(*args, **kwargs)
        Relative Strength Index
        Default Inputs: [USEquityPricing.close]
        Default Window Length: 15

Would the default window length need to be changed from 15 to 14, to be the (presumably) exact factor used to compute short_term_reversal (a.k.a. 14-day_RSI)? If so, does anyone have an example?
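For experimentation outside of Pipeline, the indicator itself is easy to reproduce. One plausible reading (my assumption, not a confirmed spec) is that a window length of 15 is already the 14-day RSI: 15 closing prices yield 14 day-over-day changes. A simple-average (Cutler-style) sketch:

```python
import numpy as np

def rsi(closes, period=14):
    """Simple-average RSI over the trailing `period` price changes.

    Note: a `period`-day RSI consumes `period + 1` closing prices
    (15 closes -> 14 changes), which may be why the built-in factor's
    default window length is 15 rather than 14.
    """
    closes = np.asarray(closes, dtype=float)
    diffs = np.diff(closes[-(period + 1):])   # last `period` daily changes
    ups = diffs[diffs > 0].sum()
    downs = -diffs[diffs < 0].sum()
    if downs == 0:
        return 100.0                          # no down days: maximally overbought
    rs = ups / downs
    return 100.0 - 100.0 / (1.0 + rs)

# 15 made-up closes -> one 14-day RSI reading
prices = [44, 44.3, 44.1, 43.6, 44.3, 44.8, 45.1, 45.4, 45.8,
          46.1, 45.9, 46.3, 46.1, 46.7, 46.5]
print(round(rsi(prices), 1))
```

Classical (Wilder) RSI uses an exponentially smoothed average of gains and losses instead of the simple sums above, so values differ slightly; which variant Q uses is not stated in this thread.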

@ Delaney/Rene/Q team -

I asked what I thought was a very simple, germane question: For the algo posted above (5a2524b7ea6d2140f870e698), am I done? Or do I have more work to do? It is not an academic, hypothetical, ethereal question. I'd like to know, as a working example, would I have a decent shot at an allocation with such an algo as-is? If not, what do I need to do to whip it into shape (or make the changes yourself as an example, and post the new code)? I actually would be glad to work on it (for free, by the way), if I knew what to do. And if you can't do a quick assessment and give me definite feedback, then I'm really confused about how you intend to provide feedback to users (granted, you presumably have 160,000 active users to deal with, so I'm asking for a high-touch approach, but I'd expect if you used it as an example, many others would benefit).

Hi @Karl,
Firstly thanks for the link to the webinar. I will watch/listen with great interest now to what @Delaney has to say.

I completely agree with your perspective that, for us and other algo writers like ourselves who are taking this seriously, Quantopian are indeed our client, just as their investors are their client. I also completely agree that we need to have as clear as possible understanding of exactly what is involved in fulfilling Q's needs and that, in addition to any other implicit design specs, we should take the guidelines of the Quantopian Open Contest and Allocation Criteria as design briefs. As you say, Q has indeed provided some great tools for us to work with, and obviously Q would like us to be very cognizant of risk in its various forms.

Your comment, echoing @Rene, that: "... risk metrics are not to prOscribe even as it prEscribes alerts to inform the designer" is indeed an excellent comment, and clearly indicates Q's good intention to help us by giving us risk-related and other tools to use, while allowing us the freedom to do our own algo design work in whatever way we believe to be best.

Everything about this is absolutely fine by me, and I like it. The part that I am having a problem with however is that:

• It is still not clear exactly how the Q Risk Model actually works, what it calculates, or even if those calculations are appropriate as measures of risk ... ?
• It is not clear what the Q Risk Model is actually measuring or what its output represents, or what the units of "risk" are in the Model's output.... ?
• Without knowing these things, it is not clear how to actually use the risk model in an effective and practical way.

I will refrain from further comment until I have watched the webinar at least once and carefully digested its content in case there is something I am still missing here, but I certainly agree with @Grant's comments about the pressing need for more examples and more open explanations, so that committed algo writers can at least understand the Q Risk Model well enough to either use it effectively or to offer suggestions for its improvement.

The Idea to Algorithm: The Full Workflow Behind Developing a Quantitative Trading Strategy is a high-level view of the Q workflow (basically a fleshing out of A Professional Quant Equity Workflow). I just watched it and it is useful, but high-level (by design).

I have a general sense for the sector "risks" as really being a way of diversification across the broad market, which intuitively makes sense. The style risks are a bit more nebulous, particularly in light of the fact that for them to work correctly in informing the crowd, Q will have to include factors that they already have enough of in their fund. For example, eventually, they will want no more algos based on StockTwits. Per Delaney's description of the risk model, they would then need to add a StockTwits style risk factor. I suppose this is the game plan, that we should eventually see a whole host of style risk factors, that basically reflect the weightings of the Q fund? But then, it would seem unusual to provide this level of detail to the public. But maybe it all would end up in their public fund prospectus in the end? Kinda confusing...

I just watched Delaney's webinar. It does provide an excellent overview of the workflow associated with quant strategy development. Many thanks to you for that @Delaney; a great webinar! Although it was not specifically focused on the risk model side of algo development, Delaney made several points that gave me some additional insights.

1) The first key point for me was Delaney's comment that, upon careful analysis, new algo models are often found to give very similar results to other pre-existing models, either of risk or of alpha, and that, as he said: "anything that is already in a risk model is probably NOT alpha".

Now in that context, I think maybe I am starting to make some sense of Q's use of RSI with default value 14 in relation to mean reversion (MR) risk. RSI-14 is probably one of the most over-used and worn-out indicators employed by people looking for reversals. As Delaney says, in seeking alpha, what we need to do is find something that is new & innovative, not something that is just a re-hash of old over-worked ideas. The implication being that the more we re-work old ideas, the more risk we are implicitly taking and the less benefit we are likely to find. In that sense, the greater the similarity between an algo's results and those from a pure RSI-14 strategy, the lower the potential for reward and the greater the inherent risk. So then, if the statement "Q uses RSI-14 for mean reversion risk" actually means that the Q risk model calculates the correlation between an algo's returns and those from a pure RSI-14 strategy, and then equates some function of that correlation to risk, then I think it does make good sense. However my question then is: is this actually what Q's risk model is doing with regard to MR risk? @Delaney, @Rene, please can you clarify?
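If that reading is right, the check is mechanical: correlate (or regress) the algo's daily returns against the returns of a pure RSI-14 long/short portfolio. The sketch below uses synthetic data purely to show the computation; it is my guess at the mechanics, not Q's confirmed method.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 250  # one year of daily returns

# Stand-in for the daily returns of a pure RSI-14 long/short portfolio.
rsi_factor_returns = rng.normal(0, 0.01, T)
# A hypothetical algo that (unknowingly) loads 0.6 on that factor.
algo_returns = 0.6 * rsi_factor_returns + rng.normal(0, 0.008, T)

# Correlation: how much the algo "looks like" the RSI-14 strategy.
corr = np.corrcoef(algo_returns, rsi_factor_returns)[0, 1]
# Regression beta: the estimated exposure (loading) on the factor.
beta = (np.cov(algo_returns, rsi_factor_returns)[0, 1]
        / np.var(rsi_factor_returns, ddof=1))

print(round(corr, 2), round(beta, 2))  # both should be near 0.6 here
```

A high correlation/beta would then mean much of the algo's return is attributable to the common short-term-reversal effect rather than to specific alpha.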

2) Another nice idea that Delaney mentioned is the concept of a "totally risk aware" algo that could dynamically adjust its risk constraints as it goes, in accordance with current market conditions. I think it is a very neat idea, but before attempting to implement it based on Q's risk model, we still have a long way to go and need a much better understanding of Q's Risk Model itself. So we come back to: ..... Please Q, give us a lot more detail & clarity about the Risk Model!!

@ Tony - I gather that basically they are doing a multivariate regression of a linear model of the risk factors. Something along the lines of:

r = b1*F1 + b2*F2 + b3*F3 + ...+bN*FN + a

The b's and the residual a are found via what effectively amounts to a curve fit of the returns r over the risk factors F. Somewhere it was mentioned that it is done serially, in a loop, but I'm not clear why one would do it that way; normally, multivariate regression is done in one shot. Anyway, presumably this will all become clear in the Q risk model white paper being drafted (and whatever code is subsequently put in the public domain).
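That one-shot fit is easy to demonstrate on synthetic data: generate factor returns, build algo returns with known loadings plus a small specific return, and recover both with ordinary least squares. All names and numbers here are illustrative, not Q's actual factors.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500                                  # trading days
F = rng.normal(0, 0.01, (T, 3))          # daily returns of 3 risk factors
b_true = np.array([0.5, -0.3, 0.8])      # the algo's true factor loadings
alpha_true = 0.0002                      # 2 bps/day of specific return
r = F @ b_true + alpha_true + rng.normal(0, 0.002, T)  # observed algo returns

# One-shot multivariate regression: r = F b + a (intercept column for a)
X = np.column_stack([F, np.ones(T)])
coef, *_ = np.linalg.lstsq(X, r, rcond=None)
b_hat, a_hat = coef[:-1], coef[-1]

common = F @ b_hat                       # common (risk) return stream
specific = r - common                    # specific return stream, incl. alpha

print(b_hat.round(2), round(a_hat, 4))   # recovers roughly [0.5, -0.3, 0.8] and 0.0002
```

The "common returns" in the tear sheet are then just the fitted part `F @ b_hat` (cumulated per factor), and "specific returns" are the residual.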

In the lecture, Delaney mentioned something to the effect that a lot of technical indicators are basically some form of a few underlying effects, including momentum and short_term_reversal, so I guess if those are captured sufficiently, one doesn't need a long list (e.g. every technical indicator known to mankind). I am curious, though, whether short_term_reversal will be defined as the 14-day RSI indefinitely, or whether its definition will be tweaked along the way. It should really be frozen, and other risks added (e.g. short_term_reversal_1, short_term_reversal_2, etc.).

EDIT: Mathematically, it is straightforward to include cross-term interactions of factors in the risk model. I gather that this was not done. Why?

@Grant, as I have not yet watched Delaney's webinar, I can only go by your description above of a multivariate regression of a linear model. Programmatically, this is achieved with the Optimize API, where the objective function is to maximize alpha (the factor or combination of factors that creates excess returns) subject to the 8 constraints they laid out, which are different risk measures designed to mitigate risk through diversification and dispersal. The optimization algo does this serially, in a loop, as it adjusts the weights of your trading universe to achieve the model's optimal alpha.

Personally, my weapon of choice is deep learning, a multi-layered neural network that does something like a nonlinear multivariate regression. Unfortunately, Q doesn't allow AI libraries like Keras, TensorFlow, or Theano, to name a few, that do this type of thing. I guess it is because they would take up a lot of Q's computing resources. They do cater to some basic machine learning algos such as Random Forest and SVR, which I tried, and I get constant computational timeouts because of longer training periods and/or too many factors, which renders them useless.

@ James -

Yeah, I suspect the folks who do this sort of thing for real (for a living) have a similar Pipeline-type factor discovery process and implementation, but the factor combination step has more horsepower. With the risk model, Q is completing the basic workflow laid out by their ex-CIO at https://blog.quantopian.com/a-professional-quant-equity-workflow/ with a simple/naive factor combination step. Makes sense that they wouldn't get caught up in factor-combination fanciness just yet.

I posted an example here:

https://www.quantopian.com/posts/multi-factor-long-short-equity-w-slash-rsi-based-short-term-reversal-risk-factor-nullification

Basically, I just added the RSI factor to my multi-factor algo, and presto!--no more short_term_reversal risk. Try commenting out the RSI factor, and you'll see that it works.

@Karl,
Your statement "... insights into existing regimes ... would transcend the technical indicators ..." is, I believe, essential for taking trading to a much higher level, and the key is that little word "regimes". Personally, I believe that understanding the market regime, and in particular answering the question "What regime is the market in right now -- trending, mean-reverting, or something else?", is THE most important question in trading. If you have that answer, then you know how to trade, and the choice of "indicators" or even "entry signals" becomes of minor importance. If the market is trending, trade in the direction of the trend using any kind of trend-following (TF) technique; if the market is mean-reverting, trade it using a mean-reversion (MR) technique; and if you are not sure whether the market is TF or MR, stand aside. What always amazes me is how little thought and effort most people seem to put into the question "What regime is the market in now?"

@James,
You write: "my weapon of choice is deep learning, a multi-layered neural network". That may indeed be a great choice, and I wonder why you selected it. I did a lot of work with NNs years ago, and the results were not as good as I had hoped, but I know the technology has come a long way since then. However, most times when I see people using any sort of ML for trading, I end up thinking their efforts are misdirected: great tools applied to the wrong questions, such as generating trading signals, and then the results turn out disappointing. I'm not currently using any sort of ML, but I am very interested in knowing what specific questions intelligent traders would seek to answer with the new generation of NNs.

@Tony: I think "regime" is a metaphor for order and system that one can postulate as probable premises :) I actually meant the regime of technical indicators that are used to visualise and describe "risks".

Let's take short-term reversal as a "risk" measured by the RSI indicator. In mitigation, I can think of several measures:

1) Optimize the constituents in the portfolio with respect to each one's individual exposure to the market, i.e. not to curtail alpha generation/aggregation but to seek an optimized outcome in position sizing and execution that is market specific, allowing opt.order_optimal_portfolio() to perform as and when the portfolio interacts with the live market. No loss in alpha; possible casualty from constraints.

ps: Dan Whitnable posted an example of the measure here.

2) Impose a risk exposure mask a priori in Pipeline to filter out defined risks, i.e. a CustomFactor keyed to risk parameters. Possible loss in alpha; no new constraint to optimize.

ps: Grant Kiehne posted an example of the measure here.

3) Modify the process workflow to adapt, or better, to improve performance metrics. No loss in alpha; defrays casualties from constraints; compliant risk metrics.

I am sure our colleagues will have more to add here. Indeed, as Q are working to provide more tools for 1 and 2, I am adapting the code for 3.
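
A minimal pandas sketch of measure 2 above: an a-priori "risk mask" that drops names with extreme RSI readings before any alpha weighting is done. In a real Pipeline this would be a CustomFactor plus a screen; the tickers, RSI values, and the 30/70 band here are all illustrative.

```python
import pandas as pd

# Hypothetical latest RSI readings for a small universe.
rsi = pd.Series({"AAA": 82.0, "BBB": 55.0, "CCC": 14.0, "DDD": 48.0})

# Keep only names whose RSI sits in a "neutral" band, so the resulting
# portfolio carries little short-term-reversal signal to begin with.
mask = rsi.between(30, 70)
universe = rsi.index[mask]

print(list(universe))  # → ['BBB', 'DDD']
```

The trade-off Karl notes is visible even in this toy: the overbought/oversold names (which might carry alpha) are excluded outright, but no new constraint has to be handed to the optimizer.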

@Tony,
I was an early adopter of plain-vanilla neural networks for financial prediction way back in the early 1990s, with limited success. My first portfolio using an NN made a cool 55% in the first year, only to start decaying in the second year, at which point I made an executive decision to stop trading it before all the profits were gone. The culprit, as you've touched upon with Karl, is regime change. And you hit the nail on the head: being able to identify what regime the market is in would be the game changer. My solution at the time was to retrain the NN after confirmation of a regime change. I was not totally satisfied with this stop-and-go approach.

Fast forward to the present: many key developments in the field of AI have come about, together with increased computing power. One of these was Prof. Hinton's discovery that stacking more hidden layers in the NN architecture does a better job of deciphering nonlinear relationships among the input variables, and thus gives better prediction accuracy. This is now termed "deep learning" and is very effective for stationary data like image recognition, computer vision, and similar spatial domains. Non-stationary data such as financial time series remains elusive. However, a new neuron architecture called Long Short-Term Memory (LSTM) looks very promising for time-series analysis and non-stationary data, because it incorporates a kind of memory, making it possible to capture long-term dependencies in the temporal domain.

I attack financial prediction as a spatio-temporal problem, spanning the space and time domains, which is why my only inputs are variations of prices. I also believe the principles of Chaos Theory offer a possible solution. A deep learning architecture with LSTM neurons seems to me like the right tool for my hypothesis on the dynamics of price evolution, and I am hoping it also captures regime changes and adapts accordingly. The same questions still hold: how accurate are your predictions on out-of-sample data, is the model able to adapt to regime changes, and are your results consistent across training, validation, and out-of-sample tests? Hope this helps!

"Impose a risk exposure mask a priori in Pipeline to filter out defined risks, i.e. a CustomFactor keyed to risk parameters. Possible loss in alpha; no new constraint to optimize."

The risk of this approach (I think) is that it is not point-in-time. In the example I posted, if whatever is causing the short_term_reversal exposure goes away, the RSI correction will be correcting nothing and will itself become a risk. One actually has to measure how much RSI exposure exists, point-in-time, and extract it only in proportion to its relative contribution.

In theory, one could do sequential optimizations: the first would clean the combined alpha of risk, and the second would do everything else. The Optimize API would work for this, since one can pull the new weights out of the first pass and feed them back into the Optimize API for ordering.
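
A rough numpy sketch of the first of those two steps, under the assumption (mine, not Q's) that "cleaning" means projecting the combined alpha onto the space orthogonal to the risk-factor loadings. The loadings and alpha vector below are random placeholders; step 2 (ordering) would then consume alpha_clean.

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets = 50
loadings = rng.normal(size=(n_assets, 2))  # hypothetical per-asset loadings on 2 risk factors
alpha = rng.normal(size=n_assets)          # hypothetical combined alpha vector

# Step 1: regress alpha on the loadings and keep only the residual,
# i.e. remove the part of alpha that the risk factors can explain.
b, *_ = np.linalg.lstsq(loadings, alpha, rcond=None)
alpha_clean = alpha - loadings @ b

# The cleaned alpha now has (numerically) zero exposure to each factor.
print(loadings.T @ alpha_clean)  # ≈ [0, 0]

# Step 2 would feed alpha_clean into the ordering optimization.
```

This also illustrates Grant's point-in-time concern: the projection removes exactly as much of each factor as is present in alpha today, no more and no less.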

@Karl, I'm curious what inputs and target variables you use in your DNN for feature engineering. Q's current infrastructure does not allow for any meaningful application of DNNs, as they eat up a lot of memory and computational resources. I previously asked Q, to no avail, to allow offline implementation of DNNs (or other computationally intensive methods) with the ability to upload the results into the Q platform for further processing. The problem, I think, is that you cannot download their data for offline use.

@Karl, very interesting, thanks!

@James, as we share a similar background in NN application and interest, and as you, I, and @Karl obviously share an interest in where modern DNN/ML might usefully go from here, I think this deserves a separate thread. Do I have your permission to copy our last few posts on this topic and kick off a new thread with them?

@Tony, yes by all means. Really needs a new thread, as this might be off topic here. Thanks.

@ James, @Karl, @ Grant,
Splitting off to a new thread on DNN / ML entitled "DNN and beyond", so that the original topic of the Risk Model can continue on here without mixing the two separate issues. Cheers, TonyM.

Hi @Grant, still here to talk with you about Risk. Cheers, TonyM.

I have an algo that uses timing and momentum to allocate among about a dozen ETFs. When I ran it through Q's new risk profiler, it gave me the following message:
6 assets were missing factor loadings, including: IJH-21507, IJR-21508, SCHM-40709, SDY-27806, SH-32268..XIV-40516. Ignoring for exposure calculation and performance attribution. Ratio of assets missing: 0.545. Average allocation of selected missing assets:

IJH-21507 76714.164737
IJR-21508 75168.679782
SCHM-40709 57937.700329
SDY-27806 64434.527755
SH-32268 -22257.148870
XIV-40516 121426.075993
dtype: float64.

What is causing this message?

@ Delaney & Rene -

Would it be possible for me to write my own code to beat down the short_term_reversal risk factor (still using the Optimize API)? For example, reportedly the built-in RSI pipeline factor is used in risk_loading_pipeline, but suppose I wanted to replace it with something else? I'm curious whether there is excessive portfolio churn in managing the short_term_reversal risk via the Optimize API, and whether smoothing the built-in RSI might help (while still allowing the Optimize API to keep short_term_reversal in check). One could use, for example:

from quantopian.pipeline.factors import RSI, SimpleMovingAverage

RSI_SMA = SimpleMovingAverage(inputs=[RSI()], window_length=3)


Generally, what is the time line for publishing:

1. Optimize API code;
2. Risk model code;
3. Risk model white paper (describing the rationale and technical details behind the risk model)?

At this point, it kinda feels like we are dealing with a black box (risk model) within a black box (Optimize API).

Hey Grant,

You can write your own risk factors as Pipeline factors and incorporate them into the optimizer using the FactorExposure constraint. You can find the syntax for passing in your custom risk constraints in the API docs.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.

Hi Maxwell -

Yes, I understand that I could write my own risk factors, and I see how things are done for the beta risk factor and the FactorExposure constraint. What I don't understand is how you compute the output of risk_loading_pipeline, so that I could do my own computation of the short_term_reversal risk factor (and the other factors, for that matter) and swap yours out for mine to see the effect.

I gather that you might be using something along the lines of:

https://www.quantopian.com/lectures/factor-risk-exposure

If so, maybe I could attempt to reverse-engineer what you are doing under the hood to compute the output of risk_loading_pipeline, since for whatever reason y'all are cagey about it (counterproductive, in my opinion... I understand the need not to share information about the hedge fund, but you should be completely open about computational details and code).

I would be curious too to see how we could use the FactorExposure constraint with a user-defined risk factor; that is, I would like to see how to compute the risk factor loadings we need to pass to the FactorExposure constraint. Even if the risk factor loadings are just the linear regression of each asset against the risk factor (which I am not sure they are), I don't see how we can easily compute that in Pipeline.
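
For what it's worth, here is a numpy sketch of that candidate definition of a loading: the OLS slope from regressing each asset's daily returns on the risk factor's daily returns. Whether risk_loading_pipeline actually does this is exactly the open question above; the returns and true loadings here are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
n_days, n_assets = 250, 4
factor_returns = rng.normal(scale=0.01, size=n_days)     # hypothetical factor return series
true_loadings = np.array([1.2, 0.5, -0.4, 0.0])          # loadings we try to recover
asset_returns = (np.outer(factor_returns, true_loadings)
                 + 0.002 * rng.normal(size=(n_days, n_assets)))

# Univariate OLS slope for each asset: cov(asset, factor) / var(factor).
f = factor_returns - factor_returns.mean()
loadings = (asset_returns - asset_returns.mean(axis=0)).T @ f / (f @ f)

print(loadings)  # close to true_loadings
```

Doing this inside Pipeline would mean a CustomFactor with a returns window per asset plus the factor's return series as an input, which is presumably where it gets awkward.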