Alphalens - a new tool for analyzing alpha factors

Alphalens is a Python package for performance analysis of alpha factors which can be used to create cross-sectional equity algos. Alpha factors express a predictive relationship between some given set of information and future returns. By applying this relationship to multiple stocks we can hope to generate an alpha signal and trade off of it. Developing a good alpha signal is challenging; so where can we (Quantopian) make things easier, and where do you (the quant) add the most value? We figured that a common set of tools for analyzing alpha factors would have a big impact.

By analyzing your factors in Research first, you can spend less time writing and running backtests. This allows for faster iteration of ideas and a final algorithm you can be confident in. Building a rigorous workflow with Alphalens will make your strategies more robust and less prone to overfitting - qualities we look for when evaluating algorithms.

We think that workflow looks something like this...

1. Universe Selection: define the universe of tradeable components; the universe should be broad but have some degree of self-similarity to enable extraction of relative value. It should also eliminate hard-to-trade or prohibited instruments.
2. Single Alpha Factor Modeling: define and evaluate individual expressions which rank the cross section of equities in your universe.
3. Alpha Combination: combine many single alphas into a final alpha which has stronger prediction power than the best single alpha. This is often due to the noise in each alpha being canceled out by noise in other alphas, allowing signal to come through.
4. Risk Model: define and calculate the set of risk factors you want to use to constrain your portfolio.
5. Portfolio Construction: implement a process which takes your final combined alpha and your risk model and produces a target portfolio that minimizes risk under your model.
6. Execution: implement a trading process to transition the current portfolio (if any) to the target portfolio.
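As a rough sketch of the combination step (3), one simple scheme z-scores each alpha cross-sectionally and then averages, so that no single alpha's scale dominates. This is a hypothetical helper, not part of Alphalens; the pandas layout of dates × assets is an assumption:

```python
import pandas as pd

def combine_alphas(alpha_frames):
    """Equal-weight combination: z-score each alpha cross-sectionally
    (per date) so scales match, then average across alphas so that
    uncorrelated noise partially cancels."""
    zscored = [
        df.sub(df.mean(axis=1), axis=0).div(df.std(axis=1), axis=0)
        for df in alpha_frames
    ]
    return sum(zscored) / len(zscored)
```

More sophisticated schemes weight alphas by their historical predictive power, but equal weighting is a reasonable baseline.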

Alphalens is only one part of this process. It lives in step #2 - Single Alpha Factor Modeling. The main function of Alphalens is to surface the most relevant statistics and plots about a single alpha factor. This information can tell you if the alpha factor you found is predictive; whether you have found an "edge." These statistics cover:

• Returns Analysis
• Information Coefficient Analysis
• Turnover Analysis
• Sector Analysis

Using Alphalens in Quantopian Research is pretty simple:

1. Define your pipeline alpha factors

from quantopian.pipeline import CustomFactor
from quantopian.pipeline.data.builtin import USEquityPricing

class Momentum(CustomFactor):
    inputs = [USEquityPricing.close]
    window_length = 252

    def compute(self, today, assets, out, close):
        # Close ~20 days before the window's end over close at the window's start
        out[:] = close[-20] / close[0]


2. Run the pipeline

alphas = run_pipeline(alpha_pipe, start_date=start, end_date=end)


3. Get pricing data

assets = alphas.index.levels[1].unique()
pricing = get_pricing(assets, start, end + pd.Timedelta(days=30), fields="open_price")


4. Run the Alphalens factor tear sheet.

# Ingest and format data
factor_data = alphalens.utils.get_clean_factor_and_forward_returns(
    my_factor,
    pricing,
    quantiles=5,
    groupby=ticker_sector,
    groupby_labels=sector_names,
)

# Run analysis
alphalens.tears.create_full_tear_sheet(factor_data)


### Open source

We think Alphalens fits in nicely with the rest of our open source ecosystem. Anyone in the world is now able to explore their idea with Alphalens, backtest their strategy with Zipline, and analyze the results with Pyfolio. We wanted to create a product that was platform agnostic, accessible to everyone, and scalable. This is incredibly important because we want anyone who contributes to Alphalens to have ownership over the project. Over the next few months we expect to uncover bugs and merge contributions. Alphalens is in its infancy and we can't wait to see the kind of changes it will go through.

Enjoy,

James and Andrew

### Resources

Here are some places to check out too:
- Alphalens Docs for an analysis of a professional alpha factor.
- Alphalens Github repo
- Example notebook from the repo to use anywhere

### Get Started

Not sure where to start? The Alphalens tutorial is the best way to try out Alphalens for the first time.

Disclaimer

The material on this website is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory services by Quantopian. In addition, the material offers no opinion with respect to the suitability of any security or specific investment. No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. If you are an individual retirement or other investor, contact your financial advisor or other fiduciary unrelated to Quantopian about whether any given investment idea, strategy, product or service described herein may be appropriate for your circumstances. All investments involve risk, including loss of principal. Quantopian makes no guarantees as to the accuracy or completeness of the views expressed in the website. The views are subject to change, and may have become unreliable for various reasons, including changes in market conditions or economic circumstances.


EDIT: This notebook contains code for an older version of the Alphalens API. Please use the notebook in the original post above.

For a quick walkthrough on how to read the graphs give this notebook a once over.


This sounds very useful indeed. It's a real pain in the butt to backtest over and over to see if a factor has a meaningful return distribution.

wow. Great tool!

Hi James,

Can this be applied to relatively short-term (e.g. 5 day) mean reversion of prices, using minute bars? Or is it restricted to the pipeline daily bar data? I can give it a try using daily data (perhaps by using the full OHLC values), but my algo uses smoothed minutely data, so it would be nice to stay consistent.

Grant

Grant,

So when we were designing the package, we wanted to make it not specific to Quantopian/Zipline, so anyone could feed in any sort of signal. Since we think of a traditional equity alpha factor as something that cross-sectionally ranks a large universe of stocks, you could feed into Alphalens any properly formatted data that does that. With that in mind, we didn't explicitly design Alphalens to be compatible with minute-level cross-sectional rankings, but there is nothing in the code that would prohibit that from working. So if you can format your data in a pandas MultiIndex, similar to what you see in the documentation, you should have some success.
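For example, the (date, asset) MultiIndex shape Alphalens expects can be built like this (toy tickers and values, purely illustrative):

```python
import pandas as pd

# Hypothetical toy factor: one value per (date, asset) pair, which is
# the MultiIndex Series shape Alphalens takes as input.
dates = pd.to_datetime(
    ["2016-06-01", "2016-06-01", "2016-06-02", "2016-06-02"]
)
assets = ["AAPL", "MSFT", "AAPL", "MSFT"]
my_factor = pd.Series(
    [0.8, -0.3, 0.5, 0.1],
    index=pd.MultiIndex.from_arrays([dates, assets], names=["date", "asset"]),
)
```

Any signal you can coerce into this shape, daily or otherwise, can in principle be analyzed.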

This looks great!

For those of us who can learn by example, any possibility someone could whip up an algo that we can clone (while the world moves in appetency, on its metalled ways)?

Steve,

Alphalens isn't designed to be used inside of an algo. It's meant as a tool to help you find an "edge" and then you can trade that edge in a normal algo. Alphalens is just one step in the workflow from idea to algo.

Thanks Christopher -

I'll take a look when I get the chance. The factor would be the ratio of the mean of a stock's price to its price (where "price" is a smoothed value). Basically, the question is, on any timescale, does the mean reversion factor have predictive value. I suppose you are saying the new tool with some cajoling and interpretation could spit out an answer, yes or no. I wouldn't be doing any ranking, just calculating the factor across a universe. But I suppose I could rank if necessary, although the algo I've constructed doesn't rank. Would the algo need to use ranking for this tool to make sense?

Grant

Ranking is just an example. In effect you are assigning some value to a given equity that is the result of your alpha expression/calculation (in your case a ratio) and presumably there is some information embedded in that value (like high ratios are going to have higher future returns). Alphalens will look at these values and help you determine if they are predictive and over what time period.

James, so what you are saying is that a factor, by definition, is some kind of rank, where the higher ranks predict higher future returns than the lower ranks. This lens allows us to see how effective that prediction is, and how it performs in different sectors, or once conventional 3-factor or 5-factor alphas are removed?

In other words, have I discovered something new?

One important point is that each alpha factor can be considered a weakly predictive pricing model. You are not requiring that the model predict future prices (predicting returns is equivalent) with high accuracy, or for any given stock. Instead, you are requiring that, across many stocks, and accepting that the model will be noisy, it can generally sort higher-performing assets toward the top and lower-performing ones toward the bottom. As long as your top quintile is outperforming your bottom quintile, your strategy should be okay (ignoring constraints like slippage and turnover). The trick is then combining multiple weakly predictive alpha factors by averaging. This averages out some of the noise in each and produces rankings that more and more consistently split top and bottom performing assets.
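The top-versus-bottom quintile logic can be sketched with pandas (toy numbers, purely illustrative):

```python
import pandas as pd

# Toy cross-section: factor scores and next-period returns for 10 assets.
factor = pd.Series([0.1, 0.5, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6, 1.0])
fwd_ret = pd.Series([-0.02, 0.00, 0.03, -0.01, 0.02, 0.00, 0.01, 0.00, 0.01, 0.04])

# Bucket assets into quintiles by factor score (5 = highest scores).
quintile = pd.qcut(factor, 5, labels=[1, 2, 3, 4, 5])

mean_by_q = fwd_ret.groupby(quintile).mean()
spread = mean_by_q[5] - mean_by_q[1]  # top-minus-bottom return spread
```

A persistently positive spread across many dates is the hallmark of a usable (if weak) factor.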

You also need to check each new factor you make against existing known factors and factors in your model. This can be done by looking at the correlation between the values or the correlation between the returns produced by the ranked portfolios. If your new factor has signal left over once you've regressed the returns against other known factors (positive alpha), then you say that there's something new your factor brings to the table and add it to your model. You may have to reduce the weights of the factors that are correlated with your new factor to avoid overweighting the components that already exist.
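The regression check can be sketched as follows. The data here is synthetic, with an exaggerated built-in alpha so the toy example is unambiguous; nothing about the magnitudes is realistic:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical daily return streams: a known factor portfolio, and a new
# factor built to overlap with it (beta ~0.5) plus its own edge.
known = rng.normal(0.0, 0.01, n)
new = 0.5 * known + rng.normal(0.002, 0.01, n)

# Regress the new factor's returns on the known factor's returns.
X = np.column_stack([np.ones(n), known])
(alpha, beta), *_ = np.linalg.lstsq(X, new, rcond=None)
# A significantly positive intercept (alpha) is the "signal left over";
# beta measures how much of the new factor the known one already explains.
```

In practice you would regress against all the factors already in your model, and look at the significance of the intercept, not just its sign.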

There are other steps to producing a good algorithm, such as portfolio construction and risk management. We are working on lectures touching on all of this and hope to provide examples as well.


Guess I'll have to give it a go, but I'm curious how it handles identifying the range of factor values that tend to have predictive power. For example, for small deviations from the mean price, one might expect reversion, but if things get kicked way out of whack (e.g. bad news for a company) then reversion would not occur.

That would be a great thing to check in alphalens. Create a mean reversion factor that rather than strictly ranking something higher the more it is under a past price, has some response curve where after a certain point it starts to penalize it. You could then also combine with another factor intended to capture information about the larger movements, among others.

Thanks Delaney -

I hadn't considered the idea of a penalty function/response curve, but it would be easy enough to create one.

Thanks Delaney, most insightful

A few other things that might be useful to add:

• How long does the "edge" last, or the half-life of the "edge" (alpha decay) / is the edge long term, short term, or both
• Evaluate a large number of factors at once and get ranked tear sheets based on a scoring formula
• Use parts in trading (outside pipeline) and in pipeline - if we want ICs or other factors calculated, why reinvent the wheel? Just import the calculation and use it

One suggestion, if possible, would be to generate one or more high-level consolidated figures of merit (the same would apply to the tear sheet). On your example, http://quantopian.github.io/alphalens/, you have 42 tabulated numbers, followed by lots of charts. Can it all be summarized, before diving into the details? It is hard to take in all in with a single glance.

I'm really excited about this tool. It allows you to more clearly separate out and individually analyse the prediction part of your algorithm. If you don't have that, no amount of risk control or good execution will save you. It is also an additional layer of protection against overfitting (fewer free parameters).

@Suminda:

How long does the "edge" last, or the half-life of the "edge" (alpha decay) / is the edge long term, short term, or both

That is what the days keyword argument is for. By default, it shows you the ability to rank over the next day, as well as 5 and 10 days ahead. One would expect that the IC decays the longer into the future you look.
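The decay can be illustrated on synthetic data, where today's factor value weakly drives tomorrow's return (all names and numbers here are hypothetical, not Alphalens internals):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_days, n_assets = 120, 50

# Synthetic data: today's factor value weakly drives tomorrow's return.
factor = rng.normal(size=(n_days, n_assets))
signal = np.vstack([np.zeros((1, n_assets)), 0.004 * factor[:-1]])
returns = signal + rng.normal(0.0, 0.02, size=(n_days, n_assets))
prices = 100.0 * np.exp(np.cumsum(returns, axis=0))

def mean_ic(factor, prices, horizon):
    """Average Spearman rank IC between the factor on day t and the
    return realized from day t to day t + horizon."""
    ics = []
    for t in range(prices.shape[0] - horizon):
        fwd = prices[t + horizon] / prices[t] - 1.0
        ic, _ = stats.spearmanr(factor[t], fwd)
        ics.append(ic)
    return float(np.mean(ics))

# The 1-day IC is strongest; longer horizons dilute the signal with noise.
```

Since the signal only lives in the first day of each forward window, the mean IC shrinks as the horizon grows, which is exactly the alpha-decay pattern the tear sheet surfaces.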

Evaluate a large number of factors at once and get ranked tear sheets based on a scoring formula

That's a good idea, there's certainly work to be done to analyse multiple factors. Note that you can also combine multiple factors into a single one (as Delaney mentioned) and then analyse that mega-alpha with alphalens.

Use parts in trading (outside pipeline) and in pipeline - if we want ICs or other factors calculated, why reinvent the wheel? Just import the calculation and use it

Yes, that would be a great feature.

Grant:
If you just want a single number, look at the Information Coefficient (avg rank correlation coefficient), or the Information Ratio (mean(IC) / std(IC) -> Sharpe Ratio). A rough rule of thumb is that an IR of >= .05 is good.
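Computed from a toy IC series (illustrative numbers only, deliberately strong):

```python
import numpy as np

# Hypothetical daily IC series (e.g. the output of a rank-IC calculation).
ic_series = np.array([0.04, -0.01, 0.06, 0.02, 0.03, -0.02, 0.05, 0.01])

# IR = mean(IC) / std(IC): a Sharpe-like ratio for the signal itself,
# rewarding both strength and consistency of the rank correlation.
information_ratio = ic_series.mean() / ic_series.std(ddof=1)
```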


One more thing, if the context where you would use this tool or the outputs are not clear, it's safe to just wait a bit for the full release. We are currently preparing materials that will make this all clear and put things in perspective. This pre-release is really just to get some initial feedback and find potential bugs.

@James, what is the quintile order that Alphalens expects? I would normally expect quintile 1 to be the best, but your notebook shows quintile 5 performing best for your low vol factor. I also note your forum post defines momentum as past price / latest price, the inverse of convention.

So there are no baked-in assumptions about which quantile the best alpha scores should be in (perhaps smaller values are better, or perhaps bigger values are better). That being said, bigger values end up in the bigger quantiles; that is just the behavior of pandas qcut. I think it makes sense that the larger numbers are in the larger quantile, in the same way that if you are in the 99th percentile you are greater than almost all other values.
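The pandas behavior in question can be seen directly:

```python
import pandas as pd

# qcut puts the largest raw values into the highest quantile.
scores = pd.Series([10, 40, 20, 50, 30], index=list("abcde"))
labels = pd.qcut(scores, 5, labels=False) + 1  # 1 = smallest fifth ... 5 = largest
```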

I'd also take a look at the ranking universes by factors lecture for a more in depth explanation of quantiles and how we think about them in the context of alpha factors.

But surely you do put some meaning into the quintiles, e.g. in the graphs, like Top minus Bottom, and Factor Weighted Long/Short? Also, the IC is negative if you get your quintiles the wrong way round.

They're pretty much all symmetric metrics. In a live algorithm, if you are losing money it's usually because your strategy's model has no predictive power and you are bleeding trading costs, so you can't just invert the sign of the trades. Alpha factors aren't like that: because they are just predictive rankings that ignore trading costs, you can invert a factor, or rank one way versus the other. As long as your IC is consistently above zero or consistently below zero (but not both), that's fine.

Yeah, that is true, but it still provides you information about the factor. It is easy enough to transform your factor so that larger scores are the better scores. For example, if you had a factor where smaller values are better, you could multiply the entire factor by -1 and feed it back into Alphalens. In the end, what we are trying to do is extract meaningful information about the quality of the alpha signal: that it differentiates between winners and losers. The natural extension of that is to think about the two extremes, i.e. the top and bottom quantiles.

Does the tool somehow compensate for gaps in trading? For example, every week, the market is closed for at least two days straight over the weekend (sometimes 3 if there is a long holiday weekend). Does it ignore gaps and treat the data as if it were sampled uniformly in time (i.e. equi-spaced)? Or does it somehow interpolate, so that the data are uniform (i.e. do you simulate trades overnight/over the weekend)? If you interpolate, is it linear interpolation? Or polynomial?

Also, will the docs eventually explain what you are doing mathematically/statistically? What is an information coefficient in this context and why should I care? And why do I keep hearing the term rank? In my field (physical science/engineering), it is unusual to do analyses by ranking. More generally, is there anything "new under the sun" here? Or is it the same workflow and toolbox that has been used by quants forever (no criticism--gotta start somewhere--but it just means that Q quants won't have any fundamental advantage based on the workflow/tools).

Generally, most data on the Quantopian platform is only available for tradeable days. This may change as we implement instruments like futures, which trade nearly around the clock, but as of right now Alphalens uses pricing data pulled from the get_pricing function, which treats non-trading days as if they never happened.

Our lectures will explain why information coefficient is used as the main deciding factor. There is already a lecture on Spearman Rank Correlation, which when applied to current alpha factor values and future returns is referred to as the Information Coefficient.

https://www.quantopian.com/lectures#Spearman-Rank-Correlation

Additionally, the reason ranks are used to determine whether a predictive relationship exists between your current alpha factor values and future returns is the noisiness of financial data. In other disciplines, such as physics, processes tend to be better behaved, more replicable, and lower variance. As such, the statistics taught in those disciplines do not really account for the conditions present in financial data: non-stationarity, strange distributions, fat tails, and a lot of autocorrelation, complexity, and irrationality induced by human behavior. Taking the rank of a series of values, rather than the raw series, lets you avoid a lot of issues that would confuse an ordinary correlation. You are looking at the data through a simpler lens and asking whether a general relationship exists, and this sidesteps issues like outliers. There is more information in the lecture.
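A tiny illustration of why rank correlation is more robust to outliers than ordinary (Pearson) correlation:

```python
import numpy as np
from scipy import stats

# Five well-behaved, nearly monotone points plus one wild outlier
# (e.g. a data error or an extreme one-off event).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 2.3, 2.9, 4.2, 5.1, -100.0])

pearson, _ = stats.pearsonr(x, y)
spearman, _ = stats.spearmanr(x, y)
# Pearson is dragged strongly negative by the single outlier; Spearman
# only pays for one rank inversion and stays positive.
```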

As far as the toolbox goes, in my experience traveling and talking to many folks, I'm pretty confident that alphalens falls in the top 10% of tools available at quant funds. Many don't even have something like alphalens collected in one place, and will have various parts spread out among various teams. I would say that having alphalens as a complete, open-sourced package actually does give us an advantage over most quant firms, many of which underestimate how much being open source can add to the quality of a piece of software.

In addition to Delaney's important points on robustness, it turns out that often enough it's good enough to be able to rank stocks. For example, if you long the top 10% and short the bottom 10% and you got the ordering correct, you will turn a profit irrespective of what the overall market does (e.g. they could all go down, just the top 10% not as much as the bottom 10%). Here, we're more concerned with relative performance, rather than absolute, hence the rank statistics.
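A toy calculation of the point above, assuming a dollar-neutral long-short portfolio (numbers are illustrative only):

```python
# Toy market where every stock falls, but the ranking still pays.
top_decile_return = -0.01     # best-ranked names fall 1%
bottom_decile_return = -0.05  # worst-ranked names fall 5%

# Dollar-neutral long-short: long the top decile, short the bottom decile.
long_short_return = top_decile_return - bottom_decile_return  # +4%
```

The long leg's loss is more than offset by the short leg's gain, so only relative ordering matters.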

I did a quick read through:

https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient

I suppose you are ranking factors and returns, and then applying the Spearman's rank correlation? If I understand correctly, you are measuring the degree of monotonicity in the functional relationship between a given factor (x) and the returns it generates (y) on various timescales?

I'm not sure treating non-trading days (and overnight market closures) as if they never happened is o.k. You have unequally spaced data in time, with relatively large gaps. Your approach probably needs to be justified. Stuff happens when the market is closed, and the longer it is closed, the more stuff can happen. Making a forecast on Friday for a Monday open is different than making a forecast on Monday for a Tuesday open, for example. And there tend to be larger jumps overnight because of corporate announcements, etc. than within the trading day.

It'd be great to see some examples on minute bars (perhaps smoothed and then decimated). How would one specify the time of day at which to compute returns? Say I wanted to trade at 10:30 am every day? Or every Tuesday?

Your understanding of the IC is correct: basically, "how correlated were my predictions with what actually happened?"

So this is actually an area where we spent some time thinking. You'll note that in the examples we feed in open prices instead of close; we did this for a reason: basically the concerns you voiced above. We had to make an assumption about when a prediction is made, and we decided that predictions are made as close as possible to the future behavior they are going to model, and that when made they incorporate all relevant information. So by feeding in the opening price we can get the value closest to when the prediction was made and analyze the alpha decay from there. Alphalens makes no assumptions about the implementation of the alpha factor, so it will look at a 5-day prediction made on Tuesday in the same manner as a 5-day prediction made on Monday. When we were writing Alphalens we were thinking in the daily timeframe, so this is a place we could probably make the package more robust. In addition, I think the days parameter is a misnomer; it should be changed to periods. Here is the issue, #62. You should be able to feed in a signal that is generated only on Mondays (for example) and look at its performance N periods out.

In regards to your last point, I'd say that Alphalens is trying to decouple the signal generation and prediction step from the execution step. If your algorithm's performance is extremely sensitive to time of day and execution, then Alphalens probably isn't the right tool to be using.

Hi James,

I skimmed over some of the code on GitHub, and it appears that you are geared toward daily bars exclusively. Although perhaps an elegant match with pipeline (which only works on daily bars... I never quite understood that decision), my sense is that you are dealing with noisy and limited input data and, as a result, effectively pushing the trading timescale way out. Firstly, my understanding is that your daily OHLC bars are values from single trade events, so there will be a lot more noise than necessary, compared to data that has been smoothed on a timescale relevant to the intended trading frequency. Also, you end up computing statistics on very small sample sizes. It seems you'd want to be sampling in time, with at least 30 samples, if not more, for each lag. For example, if I'm interested in a 1-day lag, then I would sample 30 prices and look at the returns at 30 times the next day, each lagged by 390 minutes. Similarly for 5- and 10-day lags. Alternatively, I could smooth the data and sample once (but then you lose the sampling statistics). Using a single value, such as the opening price, doesn't seem like the right approach (especially when you have the minute data!).

Unless you want the whole workflow geared around low-frequency trading (weekly?/monthly/quarterly), my sense is that you need to have a path to work with the nice minute bar data you support. For pipeline, maybe you could support user-defined input data, e.g. smoothed at the minute-level and then decimated down to daily frequency, to keep compatibility with your basic engine?

Or maybe it all works like charm using daily bars, and single trade event data? I ain't no expert. What do your experts say?

@James
Looks very interesting. It worked with the time periods you have, but when I tried setting this up for a longer time period (starting 2005-01-01) , and execute the run_pipeline() cell:

Note, I also set "universe = AverageDollarVolume(window_length=20) > 10**7", which probably produces a lot more data.


Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 327, in run
result = self._run(*self.args, **self.kwargs)
File "/build/src/qexec_repo/qexec/algo/fundamentals/loaders.py", line 59, in query_for_cols
File "/build/src/qexec_repo/qcommons_repo/qcommons/db.py", line 53, in retryer
return func(*args, **kwargs)
File "/build/src/qexec_repo/qexec/algo/fundamentals/loaders.py", line 114, in query_for_array
ALGO_DATE_COL_NAME,
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 147, in __exit__
self.close()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 826, in close
conn.close()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 882, in close
self._checkin()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 758, in _checkin
self._pool, None, self._echo, fairy=self)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 645, in _finalize_fairy
connection_record.invalidate(e=e)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 542, in invalidate
self.__pool.dispatch.invalidate(self.connection, self, e)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/event/attr.py", line 256, in __call__
fn(*args, **kw)
File "/build/src/qexec_repo/qexec/db.py", line 135, in receive_invalidate
dbapi_connection.get_backend_pid())
<Greenlet at 0x7efdb25a0f50: query_for_cols(([valuation_ratios.fcf_yield::float64], [asset_cla)> failed with InterfaceError


@Matthew: This error can occur when your Pipeline takes too long to run, so it's normal that you would see it if you increase the timespan and universe size. It's possible to get around this timeout by splitting up your Pipeline timespan into shorter spans, using run_pipeline for each shorter span, and concatenating the results.


@Nathan Thanks! If anyone else hits the same thing, this seems to work:

results = [run_pipeline(pipeline, '%s-01-01' % year, '%s-12-31' % year) for year in range(2005,2016)]
results.append(run_pipeline(pipeline, '2016-01-01', '2016-06-30'))
results = pd.concat(results).fillna(value=0.)


Very pythonic!

@Matthew I am not sure if the above will work if you have a window length > 1. You need to carefully work out the overlap; otherwise the 2nd period starts after a gap. If, say, the window length we are looking at is 1 month, every slice will be missing approximately 1 month of results from its start.

Also let's request Q to fix these pipeline timeout issues and general slowness dealing with historic fundamental pipelines.

Hi James -

Any feedback on the trading timescale? It seems that before you get too far into this thing, you'd better understand the timescale the workflow will support. Intuitively, it seems you are undersampling for anything more frequent than trading every week, at an absolute maximum frequency, but monthly is more like it.

Grant, James actually had a great insight in response to your question. Alphalens is completely agnostic to the time-scale of the data you pass in. Thus, if you pass in minutely factors and minutely returns, you can specify days=(1, 5, 10) and it will compute the predictiveness over the next 1, 5 and 10 minutes. That's why days is a bit of a misnomer and should be periods. Of course, pipeline still only works at daily frequency.

@suminda : I'd think run_pipeline should fix that automatically for you (by back-adjusting the start date for the window size), though maybe it doesn't right now? Else you'll have misleading results even with one call.

It turns out that get_pricing is also flaky - I have been getting a lot of "your kernel died" errors.
I tried breaking that up by years too, but it doesn't seem to help. I suspect the dataframe is just too large.
However, that makes the framework difficult to use in practice - studying an alpha factor over a short timeframe doesn't seem great.

I think you misunderstood what I said.

results = [run_pipeline(pipeline, '%s-01-01' % year, '%s-12-31' % year) for year in range(2005,2016)]


If you split like this and use factors with lookback / windows, for the 1st part of the window you do not have data. So the calculation starts when the window is full.

So it should be something like:

results.append(run_pipeline(pipeline, '2010-12-01', '2011-12-31'))
results.append(run_pipeline(pipeline, '2011-12-01', '2012-12-31'))


The actual start date needs to be earlier than the cutoff date (Dec 31) by the number of trading days in the lookback window; otherwise you do not have a full window for the factor calculations.
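The adjustment described above can be sketched generically. Note `BDay` is an assumption used here to approximate trading days; it skips weekends but not exchange holidays, so in practice you would pad a little extra.

```python
import pandas as pd
from pandas.tseries.offsets import BDay

def padded_start(start_date, window_length):
    """Back up the pipeline start by the factor's lookback so the first
    rows of the requested range already have a full window.  BDay skips
    weekends but not market holidays, so pad a little extra in practice."""
    return (pd.Timestamp(start_date) - BDay(window_length)).date()

# A 21-trading-day (~1 month) lookback: the slice must begin roughly a
# month before the first date you actually want results for.
print(padded_start("2011-01-03", 21))
```

Each yearly `run_pipeline` slice would then be requested from `padded_start(slice_start, window_length)` so the factor's window is already full at `slice_start`.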

@suminda, I'm just saying that run_pipeline should use the passed start_date - window_size as its effective start date internally (if it doesn't already) so we don't have to compensate manually for this. That'd be much more intuitive, I think.

Thanks Thomas,

Yes, I realize that there is a github issue to work in periods. I guess I was thinking more in terms of the overall workflow paradigm. The universe selection would seem to be a good fit for pipeline, but then for factor analyses, it seems like one would want some combination of pipeline (which presumably will remain a daily-bar tool) and trade data and factors at the minute level (or maybe minutely smoothed and then decimated). Otherwise, I think a basic Nyquist argument can be applied that you'll be under-sampling for trading at a daily frequency; a rule of thumb is that one would like a minimum of 10 samples per period (the Q data supports 390 per day--actually a bit more, since it is OHLC data). So, I'm wondering if an example could be cooked up that starts with pipeline as a rough filter, then pulls minutely data (e.g. 20 days x 390 minutes per day), and runs the alphalens back to 2002, to see if it works at the minute level? This might give some insight into developing the alphalens tool for application to minute bars.

If that is happening then it would be a bonus for splitting, but having said this, we should be able to use the library without workarounds on normal pipelines. Since pipeline can essentially look across all assets and long histories, with many factors and filters, the system should be capable of handling this, optimising data access and calculation internally.

What's a "tear sheet?"

That "days=(1,5,10)" threw me for a loop, until I sifted through the comments.

Very interesting stuff. Takes time to digest.

Recommend reading Grinold & Kahn, "Active Portfolio Management."

Grant, here is a notebook I just cooked up that calculates a minutely moving-average ratio as the factor. While the nomenclature is off, as well as a few of the plots that do some downsampling, Alphalens is still able to handle a factor calculated at a different frequency without error. I think that speaks more to the workflow on the side of generating and formatting the alpha signal than to the package itself.


@Abraham Kohen

"Tear sheet" is what we term the collection of stats and graphs Alphalens outputs.

James, alphalens is a very good improvement to the already great factor tear sheet. Well done! I'd like to make some suggestions and I would really appreciate your opinion.

• Regarding the 'Factor Weighted Long/Short Portfolio Cumulative Return' and 'Cumulative Return by Quantile' plots: only the 1-day forward returns graph is plotted. I expected to see plots for each day/period passed to create_factor_tear_sheet (the 'days' argument).
I might have a factor that shows alpha only after X or Y days and I might not be interested in 1-day forward returns plots, so I would pass days = (X, Y) to create_factor_tear_sheet and expect to see the plots for X- and Y-day forward returns. At least it would be nice to have this as an option, if not the default behaviour.

• Regarding the 'Top and Bottom Quantile Daily Turnover' plot: it would be great to have an option to plot the turnover for all quantiles (in a separate plot, maybe). More importantly, the plot shows only the daily turnover, but it would be more interesting to plot the turnover after X days, for each X in 'days' (the create_factor_tear_sheet argument). This suggested behaviour is a superset of the current one, because a user can add '1' to the 'days' argument to get the current behaviour.

• Regarding the 'Factor Rank Autocorrelation' plots: as above, the plot shows daily autocorrelation, but it would be more interesting to plot the autocorrelation after X days, for each X in 'days' (the create_factor_tear_sheet argument).

• The choice of which forward days to analyse with create_factor_tear_sheet is somewhat arbitrary, in my opinion. I had the same problem when I used the factor tear sheet (by Andrew Campbell). For this reason I created a function that plots the average cumulative return of each quantile, with standard deviation, over a configurable period of time (very similar to the plot you get when running an event study). That helped me understand the average performance of each quantile and decide which days I should investigate with the tear sheet.
Do you think something like this would be worth merging into the project? (I don't want to waste time creating a pull request for something that is not going to be merged.)

Thank you for this very much needed tool.
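Luca's suggested "turnover after X days" could be sketched roughly like this; a plain-pandas illustration, not Alphalens's actual implementation, and the (date, asset) Series layout is an assumption for the sketch.

```python
import pandas as pd

def quantile_turnover(quantiles, quantile, period=1):
    """Fraction of names in `quantile` on each date that were not in it
    `period` dates earlier.  `quantiles` is a Series indexed by
    (date, asset) -- a hypothetical layout for this sketch."""
    names = quantiles[quantiles == quantile].groupby(level=0).apply(
        lambda s: frozenset(s.index.get_level_values(1)))
    prev = names.shift(period)
    return pd.Series(
        {dt: len(names.loc[dt] - prev.loc[dt]) / len(names.loc[dt])
         for dt in names.index[period:]})

# Toy data: on day 2, half of the top-quantile names are new.
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2016-01-04", "2016-01-05"]), list("ABCD")])
q = pd.Series([2, 2, 1, 1,   # day 1: A, B in quantile 2
               2, 1, 2, 1],  # day 2: A, C in quantile 2
              index=idx)
print(quantile_turnover(q, quantile=2))
```

Passing `period=X` for each X in the tear sheet's 'days' argument would give exactly the superset behaviour suggested above.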

Hi James -

In your minutely bar example above, you use:

simulation_data = get_pricing(universe.columns,
                              start_date="2016-01-04",
                              end_date="2016-03-04",
                              frequency="minute",
                              fields="price")



Does this imply that to run alphalens back to 2002, one would need to pull minute bar data across all securities in the universe for the whole time period? And how does it handle changes in the universe versus time (i.e. drops/adds)? For the algo I've been working on, presently, it re-balances once per week and uses a trailing window of 5x390 minutely closing prices to compute the factor used for long-short forecasting, and I typically run backtests starting in 2002.

Probably, I could create a loop, pull data weekly, smooth, compute the factor at a fixed time each day, and decimate, so that the dataset input to alphalens is manageable.

Sorry, kinda sketchy I know...just starting to sort out how I would use the tool.
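Grant's smooth-and-decimate idea can be sketched with synthetic data; the 30-minute window and the 15:45 sampling time are arbitrary choices for illustration, not anything the platform prescribes.

```python
import numpy as np
import pandas as pd

# Synthetic minutely closes for two trading days (390 minutes each).
rng = np.random.RandomState(0)
minutes = pd.date_range("2016-01-04 09:31", periods=390, freq="min").append(
    pd.date_range("2016-01-05 09:31", periods=390, freq="min"))
px = pd.Series(100 + rng.randn(780).cumsum() * 0.01, index=minutes)

# Smooth with a 30-minute rolling mean, then decimate to one sample per
# day at a fixed time, keeping the series small enough for Alphalens.
smoothed = px.rolling(30).mean()
daily = smoothed.at_time("15:45")
print(daily)
```

A factor built this way is computed from minute bars but lands in Alphalens at a daily (or weekly) frequency, which keeps the input dataframe manageable.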

@Luca

For the sake of the open-sourcey-ness of the project, can you create an issue in the public repo? I just don't want other potential contributors to feel like they need to be Quantopian members to stay in the loop. Also, there are a couple of your ideas above that I'm particularly jazzed about, and I just don't want to clutter the thread.

Yeah Grant, you'll probably have to be a little creative with how you create the signal, as pulling in all of that minute data would be burdensome. I think you have the right idea - just iteratively generate your factor values weekly and then append them to some signal variable. You should have success passing it into Alphalens at a weekly frequency. Also know that you can access each of the Alphalens plots and performance computations individually, so you have much finer control over parameters/resampling and the like. And to answer your question about how it handles a changing universe, the answer is pretty simple: if we can't calculate forward returns for a security we just remove it from that specific sample.

Hi James,

You might consider how to support minute bars natively within alphalens, given that the trading platform is minutely. There is a kind of signal conditioning/smoothing/filtering/decimation problem that I think most users will need to tackle (unless they are happy working with daily OHLCV data). This is the general point I'm trying to make. It is a bit more than simply switching from 'days' to 'periods' in your code. Above, you say "Define your pipeline alpha factors" and I'm suggesting that because pipeline only works on daily bars, users will need tools and guidance on how to work with minutely data over as many market cycles as the data will support (currently back to 2002). This doesn't mean that trading would be every minute, but rather daily to weekly, which I think is consistent with the style of trading you are looking to support for the fund. Am I off-base thinking that one should be working with minute bars for daily/weekly trading?

Yeah, I totally understand what you're saying. In fact, just over the past week, as we've gotten Alphalens in front of our friends and other friendly companies, I've been able to see that flexibility in default frequency and implementation details (like over weekends, or even overnight) is something we need to think about.

And to answer your last question in a way that probably won't be satisfying I'd say use the frequency of data that creates the best model.

James -

It would be interesting if you could share the feedback from others here. What have they said?

Regarding the frequency of data, I'd reiterate that using market data from a limited number of single trades may not be the best approach (in fact, it would only seem to apply to high-frequency trading, which Q does not support). It'd be interesting to hear from others, but if your thinking is that pipeline will ideally natively support daily/weekly trading, without augmenting it with minutely data, I'd do a head-scratch. Do you have access to the Point72 folks? What are they doing for their long-short strategy development? Are they successful with daily bars?

It has actually been incredibly validating. The folks we have talked to so far have developed similar pieces of software in house. Overall they've been fairly impressed with the product we've been able to build in such a short amount of time. Alphalens has most of the features they have in their products (in one case the default days=(1,5,10) were the same!)

A few of the notable features they have in their products are trailing-twelve-month performance; they would also like the ability to down-sample to 1mo, 3mo, 6mo, more flexible grouping, quick similarity comparison with other well-known factors, and dynamic universe selection. Those last two are a little harder to implement, as they require the backing of data, and since we are trying to make Alphalens usable by anyone, it's not likely those will enter the development queue.

James -

Good to hear that y'all aren't off in left field. It is interesting that folks are also doing their own software development in-house. Given the size and maturity of the hedge fund industry (and finance in general), I would have thought that there would be a host of software vendors who provide packages. This is certainly the case in other mature technical disciplines.

What feedback have you gotten on daily versus minutely versus other frequencies? Bars versus tick data? Etc. I'm wondering if my intuition about your apparent focus on daily bars is correct? Or maybe it is consistent with a kind of slow-motion, trading/investing, long-short, portfolio-management type algos versus day/swing trading styles, and you are wanting slo-mo algos?

1. Universe Selection: define the universe of tradeable components; the universe should be broad but have some degree of self similarity to enable extraction of relative value. It should also eliminate hard to trade or prohibited instruments.
2. Single Alpha Factor Modeling: define and evaluate individual expressions which rank the cross section of equities in your universe.
3. Alpha Combination: combine many single alphas into a final alpha which has stronger prediction power than the best single alpha. This is often due to the noise in each alpha being canceled out by noise in other alphas, allowing signal to come through.
4. Risk Model: define and calculate the set of risk factors you want to use to constrain your portfolio.
5. Portfolio Construction: implement a process which takes your final combined alpha and your risk model and produces a target portfolio that minimizes risk under your model.
6. Execution: implement a trading process to transition the current portfolio (if any) to the target portfolio.

It'd be interesting to have a consolidated discussion regarding the various elements. We now have your thread, the one presumably corresponding to #1 (https://www.quantopian.com/posts/the-tradeable500us-is-almost-here), and Jonathan Larkin seems to be chiming in on the topic in his blog post (http://blog.quantopian.com/the-foundation-of-algo-success/). Is there anybody there heading up the whole thing? Is he/she interested in engaging users on the topic?

It would seem that the workflow is missing an idea generation step, before step 1. Before picking a universe and exploring a factor, one needs to know which universe to use, and what factor might work with it and in what fashion. This is perhaps the most important step, and relates directly to your "Strategic Intent" requirement on https://www.quantopian.com/fund.

"Universe Selection: define the universe of tradeable components; the universe should be broad but have some degree of self similarity to enable extraction of relative value. It should also eliminate hard to trade or prohibited instruments."

While it is true that sometimes what looks like positive alpha is a manifestation of "hard to trade" as seen in the bid-ask spread, one should not necessarily exclude hard to trade instruments from the universe, as the positive alpha can often exceed the cost of hitting bids / lifting offers.

For relative value trades, for example pairs trading, the harder-to-trade instrument, all else being equal, would be the "driver" of your trade, while the easier-to-trade instrument would be your "hedger." The hedger is the instrument upon whose price the "driver" is contingent, and it is transacted immediately after the "driver" is executed.

Thanks. --Grant

Hi James -

A question came to me, in the context of the discussion on https://www.quantopian.com/posts/implementing-and-launching-deep-learning-algo . If I'm understanding correctly, the alpha discovery process, with alphalens being the primary tool, is being approached as a traditional, manual hypothesis-testing approach. I get an idea, code it up, run it through alphalens, review the results with my eyeballs, and then decide if it is a keeper or if I should throw it back in the lake. In theory, I could replace myself with a computer that would do a better job, and could look at a gazillion alphas, going back to 2002, mixing and matching all of the Q data. It could churn away, while I do other things. Then, once I have some candidates (say N=30 or so), I could then look at each of them, mainly as a quality check, with your alphalens tool. If they look reasonable, I'd proceed to the next step in the workflow of alpha combination (which, as I understand, would be a step implemented by me, not Quantopian...I can only get paid if I go through the whole process and then wait 6 months for out-of-sample paper trading data).

So my question is, assuming you will be offering some form of automated alpha discovery, should the alphalens tool include the ability to call it as a function, returning some overall figures of merit? I'm just wondering how it would plug into an automated process for finding alphas? Or is the idea that users would rely on other resources, as suggested in the blog post:

Need some ideas? Try a Google search for “equity market anomalies” or, even better, an SSRN search of the same.

It just seems like I'd put in an awful lot of work to find even a single decent alpha, and still need N more to move onto the next step in the workflow (if I only needed one, then there would be no "Alpha Combination" step). And Q has a bunch of relevant datasets that could be mined. Why are you only directing users to a Google search or the open literature (although these approaches could bear fruit, too)? This traditional path would seem to be uphill, if the goal is for each user to find N alphas, with N >>1. Or maybe the idea is for users to have 2 <= N < ~5, with lower Sharpe, and Q would then do the combining of algos, to get to N>>1 and a much higher Sharpe?

Cheers,

Grant

So I'd argue that no single number is sufficient to determine that one alpha factor is better than another; you lose a lot of information by reducing to one (though this could be said about any summary statistic).

I honestly think that having some idea grounded in reason about a market inefficiency or that in some way predicts returns is really wise. Take a look at Pravin's thread. I think that is a great example of someone finding an idea and trying to implement it. By no means am I saying that data-mining all of the data sets we have available is futile, but I think trying to think about how these data sets interact with each other and how that affects future returns is a much more robust way of developing trading algorithms.

combine many single alphas into a final alpha which has stronger prediction power than the best single alpha

To me, many implies N>>1. And presumably, each factor would need to have a significant degree of statistical independence. It sounds daunting. I'm just trying to digest the proposed workflow, and how it might apply to an individual Quantopian user (who might feel overwhelmed and discouraged by the prospect of your N>>1 implication).

If you look at this paper, 101 Formulaic Alphas (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2701346) it would seem to bear a lot of similarity to what you are proposing. The suggestion there is that a relatively large number of factors can be combined, if one can find a set for which each has transient, but significant predictability. Are y'all still interested in https://www.quantopian.com/posts/the-101-alphas-project? If so, how might an individual user derive a large number of alphas, as illustrated in the paper?

Another comment regarding the overall workflow -

Has anyone there at Quantopian written algos that would serve as an existence proof that the workflow, as applied on the Quantopian platform, will work? In other words, do you have an algo or set of algos that serve to show that if the workflow and associated tools are established, it'll produce the desired outcome. It just seems that before putting a lot of work into the whole thing, it would be nice to have some sense if it'll work. In particular, there is some vision for a multi-factor (e.g. N ~ 5 or more) pipeline-based algo that would fit in your Q fund. Does such a thing exist? Could it be shared? Funded?

There is a lot of advice dispensed by Quantopian (no doubt much of it sound), but it would lend a lot of credence to it if you showed that you know how to walk-the-walk with some actual, real-money examples.

The API for the alphalens tool has changed. Or maybe I'm missing something.

import alphalens

alphalens.tears.create_factor_tear_sheet(factor=results['factor'],
                                         prices=pricing,
                                         sectors=results['Sector'],
                                         show_sector_plots=True,
                                         days=(5,10,20),
                                         sector_names=MORNINGSTAR_SECTOR_CODES)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-052b33a3cb86> in <module>()
6                                          show_sector_plots=True,
7                                          days=(5,10,20),
----> 8                                          sector_names=MORNINGSTAR_SECTOR_CODES)

/usr/local/lib/python2.7/dist-packages/alphalens/plotting.pyc in call_w_context(*args, **kwargs)
41                 # sns.set_style("whitegrid")
42                 sns.despine(left=True)
---> 43                 return func(*args, **kwargs)
44         else:
45             return func(*args, **kwargs)

TypeError: create_factor_tear_sheet() got an unexpected keyword argument 'sectors'


Correct, a few days ago we shipped a newer version of Alphalens (it now has community contributions!). The API did change a bit, I've updated the post with a new notebook.

Thank you James et al. for this package. It's an awesome way to test a hypothesis about a possible correlation really quickly.
I took Thomas's notebook and put in one factor of Quantopian's partner data, which I figured should have really good predictive power.
It's the PreCog Top 100 Securities, which forecasts 4-day stock returns and costs $100 a month. So what I expected is that it aligns really well with the 5-day analysis in alphalens. However, it was only okay... Or did I read the results wrong? What do you guys think, is it worth the $100 based on this alphalens analysis?


@Jonny, This is great, thanks for running that. I think that factor actually looks pretty good in terms of the stats (a 5-day annualized information ratio of 4.756 is nothing to sneeze at). Mostly it seems that is carried by the top quintile, where the factor is most predictive. Ideally we'd test this over a longer time period to get more certainty. But yeah, depending on your expectations it's also not mind-blowingly awesome. Whether it's worth $100/month is a more difficult calculation with many variables (like how much capital you can deploy to a strategy).

Keep in mind that a factor's usefulness is not tied merely to its IC performance. For one, you want to trade off of a model that contains many factors. Having multiple independent factors can produce a 'greater than the sum of the parts' alpha that works well over longer timeframes. Combining many weaker factors is a key part of the quant workflow, and often having a very independent factor (not correlated to known factors) is more valuable than having a factor with a super-great IC by itself.

Additionally, sometimes factors can be used in clever ways. If you have a factor that you know many people follow, then you can use that to gain information about when you may want to trade or not trade. For instance, if you have a factor that you know lags the market, you can use that as a sanity check in your other analysis. It might also be a very stationary risk factor you can use for measuring how exposed your alpha model is to some risk source.

Basically, there are more ways to use factors than just their predictive capacity, and the usefulness of a factor is not just its IC.

@Thomas, @Delaney, thanks for your replies! Yes, I was already working on combining several weak factors. In addition to the PreCog Top 100 Securities, I've done some research on social media data (see my reply on this thread).
I put them all together into Thomas's machine learning notebook, together with some other predictors which did well on the AdaBoostClassifier. I changed it to a naive Bayes GaussianNB (what Mikko suggested), but Mikko's result was only improved a tiny little bit (to 50.86% on the test set).

My universe is much smaller, as I only selected training data where I had enough social data, and only assets which were in the PreCog Top 100 Securities.
However, I saw the log-loss was pretty high, which means that I have stronger predictive power in the predicted probabilities.
And in fact, if I select the top and bottom decile, 61% of the predictions are correct.

I feel like this is basically the idea of the Long-Short Equity Strategy; did I understand this correctly? Definitely trying to attend the webinar tomorrow.

So my plan is to put all this together into one factor in Thomas's second ML notebook, and then use the resulting probability as a ranking factor in an algorithm. Is that the intended workflow?


@Thomas, @Delaney,

I've been looking at AlphaLens, and have been trying to figure out what is a "good" factor versus a "meh" factor.
I'm having a hard time figuring out the calculation that gives @Jonny a "nothing to sneeze at" 4.756 for annualized IR (Information Ratio, as opposed to Information Coefficient?).

What is its (annualized IR) definition with respect to the Spearman rank-order correlation coefficient?
Also, what is its range, and what is the good/bad interpretation of values within that range?
Thanks!
alan

@Alan

The IR (Information Ratio) is simply the mean of the Information Coefficient (IC) divided by the standard deviation of the IC, where the IC is defined as the Spearman rank correlation coefficient. In fact, we use the very implementation of Spearman that you linked to.

mean(IC) / std(IC)

In general, the higher the IR the better - it implies that the signal is consistently predictive. A bad IC (and by extension IR) would be close to zero, whereas a good IC is above roughly 0.03 (though this can be lower if it has a large spread). In the end, we are looking for significant non-zero predictability. For an example of a great alpha factor, check out the README on GitHub, or this good short-term alpha from a community member.
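The definition above can be sketched in a few lines; this is a plain scipy/pandas illustration, not Alphalens's exact code, and the annualization convention (multiplying by the square root of the number of periods per year) is a common one but an assumption here.

```python
import pandas as pd
from scipy import stats

def daily_ic(factor, fwd_returns):
    """Spearman rank IC per date; both inputs are dates-by-assets
    DataFrames (a hypothetical layout for this sketch)."""
    return pd.Series(
        {dt: stats.spearmanr(factor.loc[dt], fwd_returns.loc[dt])[0]
         for dt in factor.index})

def information_ratio(ic, periods_per_year=None):
    # IR = mean(IC) / std(IC); optionally annualized by sqrt(periods
    # per year) -- a common convention, assumed here.
    ir = ic.mean() / ic.std()
    if periods_per_year is not None:
        ir *= periods_per_year ** 0.5
    return ir

# Toy check: a factor whose ranking matches forward returns exactly
# gives IC = 1 on every date.
f = pd.DataFrame([[1, 2, 3], [3, 1, 2]], index=[0, 1])
r = pd.DataFrame([[0.1, 0.2, 0.3], [0.3, 0.1, 0.2]], index=[0, 1])
print(daily_ic(f, r))
```

Since the IC is a rank correlation, only the ordering of the factor values matters, which is why monotone transformations of a factor leave its IC unchanged.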

In the end, we are looking for significant non-zero predictability.

It would seem that this calls for a point-in-time hypothesis test, where I want the probability of mistakenly assigning predictability to be low (i.e. assuming non-zero predictability when it is actually zero). When I look at the results published by Jonny Langefeld above on the PreCog Top 100 Securities data, it is not obvious what is going on. Looking at the plots of Forward Return Information Coefficient (IC), how should they be interpreted? They are very noisy. At first-glance, it is not obvious that there is any persistent predictability. The overall mean is non-zero, but that may be due to the IC swinging in and out of predictability. For trading, it would seem that one needs to know, point-in-time that a given factor will be predictive over the upcoming period. A hypothesis test needs to be applied, across all factors, to determine which ones are valid.

Combining many weaker factors is a key part of the quant workflow
Alphalens isn't designed to be used inside of an algo.

If Alphalens is not integrated into an algo, then I don't understand how it will be useful to distinguish point-in-time weak factors from ones that do not pass a test of predictability.

Is the Q team working on a lecture series or tutorial?

Without such background training on these interfaces, I'm afraid we'll just keep clogging the forums with random questions while trying to figure out how to interpret and use these wonderful tools you guys are putting together.

Please put together a detailed video on how to really use these APIs in particular the ML and alphalens examples.

Thanks

Is there a white paper about Alphalens, or could someone explain in a few sentences how this works? How do you determine if a factor is predictive? Doesn't that depend on the model used? Is there a specific model in Alphalens?

Hello Ricardo,

We're currently working on a lecture covering precisely this. We hope to release it in the next few weeks, I'll follow up on this thread when we do.

To give you a quick answer, determining the predictive power of a model can be done largely agnostic of the model itself, and Alphalens does not assume any model when doing so. Of course each model is based on certain economic hypotheses about market behavior, so testing these conditions more specifically can be a good way to further validate your model.


James, Delaney, and co,

Really enjoying using alphalens - thanks for all the hard work on this!
Just a quick question. I'm using alphalens now on some of my own price data (from the JSE in South Africa), and I'm curious to know how alphalens handles stocks that get delisted or disappear from the universe during the span of the test? I'm testing over a 16 year span, so there are quite a few stocks that drop out along the way, and I think this is causing the IC and return graphs to have some strange jumps, bumps and even gaps in them. Is this something that you guys have thought about?

Thanks

@Richard: If a stock gets delisted, its prices will be NaNs and we drop it at that point from the alphalens estimation. If you are using get_pricing(), the prices are also adjusted, so it's probably not splits or mergers either. Can you post your alphalens tear sheet (or send it to help if you don't want it public)?

@Richard: stocks that don't have a factor value or whose factor value is nan are dropped from alphalens estimation too. Alphalens doesn't use a static universe but a dynamic one, where stocks enter/exit dynamically depending on the factor dataframe index.

Thanks Thomas, thanks Luca, that's exactly the info I was looking for - I managed to isolate the problem: the set of prices I was using had zeroes in it where I needed nans. Working fine now, will let you know if any further issues.

How to install this package on an Anaconda Python distribution (3.5)? Running "pip install alphalens" or "conda install alphalens" doesn't work...
Can anyone provide some help here? Thanks!

Mario: Can you try pip install git+https://github.com/quantopian/alphalens?

That worked. Thanks Thomas!

Hi, how can I make sure to avoid lookahead bias? If I feed in both my factor data and open price data starting on Jan 1st, does alphalens know to 'shift' the prices when computing returns?

@Christian

The factor Series index contains a date for each factor value. The prices DataFrame index contains dates too, and it must contain, for each date, the security prices known at the time the factor values are calculated. To make this clear, let's see how Alphalens is used in the example NB. The factor Series is extracted from the pipeline output. The pipeline gives back a value for each security every day, and this value is computed from the information available before market open (even in research, to match backtest behaviour). So, what kind of price should we insert in prices? Alphalens needs the prices known at the time each factor value is calculated. As the values are calculated before market open, we use market open prices to build prices, as those are the prices we could get if we bought those securities that day, after calculating the factor.

Hope this helps, but if you are still confused, have a look at the function that computes the factor forward returns; that should make clear how to avoid lookahead bias.
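Luca's point can be sketched with a toy example (hypothetical open prices, not Alphalens's actual code): pairing the factor row for date t with a return that starts at the open of t, never at a price printed after the factor was computed, is what keeps lookahead bias out.

```python
import pandas as pd

# Hypothetical open prices (dates x assets).  The factor for date t is
# computed before the open of t, so the open of t is the first price
# "known" after the factor -- using the close of t instead would leak
# that day's move into the evaluation.
opens = pd.DataFrame(
    {"AAA": [10.0, 10.5, 11.0, 10.8],
     "BBB": [20.0, 19.5, 19.0, 19.8]},
    index=pd.date_range("2016-01-04", periods=4))

def forward_returns(prices, period):
    # Return from the open of t to the open of t + period: the factor
    # row for t is always paired with a return that starts after it.
    return prices.shift(-period) / prices - 1

fwd1 = forward_returns(opens, 1)
print(fwd1)
```

Feeding close prices into the same function while the factor uses information up to the open would mis-attribute the open-to-close move of day t to the signal.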

Luca, thank you very much, that clarifies it! The key thing I wasn't aware of was that the factor value on a given day essentially corresponds to the value of the factor on the previous day.

As a side note, I guess in practice one would enter at the open and exit at the close, so that could be something to be 'improved' in alphalens (rather than computing returns based on open-to-open).

Christian, if you are investing lots of money your orders might take hours before being filled. So ideally you enter your orders at market open. If you enter the orders near market close then you have to re-enter the orders at market open anyway, to fill the missing positions.
Also, you usually don't close all your positions at market close and enter the new ones at the next day's open, because you might already partially hold the new positions. So when you rebalance your portfolio, you calculate the difference between your current positions and the new ones, then enter orders only for that difference. This saves transaction costs.

Luca, thanks I understand this better now! Basically we rebalance once (at the open) (by diff'ing the old and new portfolios), and so the return we obtain in this case is the open-to-open return. Got it.

Hi, may I suggest adding a summary of the annualised return/volatility/Sharpe of each quantile and/or the top-minus-bottom portfolio and/or the factor-weighted portfolio? I would find it useful, but I'm not sure whether it's outside the scope of Alphalens. Let me know what you think. Thanks.

Hi, nice tool. I found the Alphalens code on GitHub, and there is an examples folder in it with two examples. There are also .jpg pictures showing the results. I'm wondering how to save the results as a .jpg file, especially the DataFrames?

I am taking my first steps with Alphalens and I don't understand why
the min and max of different quantiles overlap.
In the shared notebook's 'Quantiles Statistics' table,
the max of factor_quantile 1 is -0.016364 while the min of factor_quantile 2 is -0.037706.
In my understanding, quantiles partition the axis into ranges that do not overlap.
Also, why does the count vary? It should differ by at most 1.

@Ph Quam, because those statistics are computed over the full time period of the factor values. So it is possible that today's quantile 1 max value is greater than quantile 2's min value on another day, even though it is guaranteed that today's quantile 1 max value is smaller than today's quantile 2 min value.

@Ph Quam, I forgot to answer the count question. The counts are the cumulative counts of the daily amounts, so even a small daily difference can result in a big cumulative one. As a side note, the daily difference can be more than 1 in the current implementation of the quantization function.
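A small self-contained demonstration of the effect (random made-up factor values, not the Alphalens quantization code itself): each day's cross-section is quantized independently, and the min/max statistics are then taken over all days together.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2017-01-02", periods=5, freq="B")

# Quantize the cross-section independently each day, as a daily
# trading algorithm would; then summarize over ALL days.
frames = []
for date in dates:
    values = pd.Series(rng.normal(size=20))
    quantile = pd.qcut(values, 5, labels=False) + 1
    frames.append(pd.DataFrame({"date": date, "factor": values,
                                "quantile": quantile}))
data = pd.concat(frames, ignore_index=True)

# Within any single day the quantile ranges cannot overlap, but the
# min/max computed over the whole period can: today's q1 max may
# exceed another day's q2 min.
summary = data.groupby("quantile")["factor"].agg(["min", "max", "count"])
```

Checking `data` for a single date shows non-overlapping ranges, while `summary` (pooled across dates) can show overlaps exactly as in the 'Quantiles Statistics' table.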

Ok, I got it. Quantiles are calculated grouped within each day (as the trading algo would do), but the displayed minimums span all days.
As for the daily quantile counts, they can differ by more than 1 if there are several equal values at a quantile boundary. And the cumulative counts can differ by up to the number of days even if the daily counts differ by at most one, which is not constrained anyway.
Thank you.

I am looking at tears.create_event_returns_tear_sheet() and plotting.plot_cumulative_returns_by_quantile() and trying to understand what they actually plot.
My intuition is that Alphalens should emulate, faster, what an algo with zero commission and slippage does.
In these two methods, however, I have the feeling that the cumulative calculation they perform is not what an algorithm does.
If I understand correctly, these methods treat each day as the start of a series and sum the returns of all these series.
This is equivalent to a trading algorithm that buys, every day, a stock it considers prospective, ignoring the fact that it already owns the stock from the previous days.
The correct thing for Alphalens to do would be to add a return series to the cumulative plot only when a symbol changes its factor quantile.
That is what an algorithm would do: the first day a stock enters quantile 1 it is bought/sold, and on the following days no changes occur if the quantile is unchanged.
Currently Alphalens adds a return series every day the symbol remains in the same quantile.
Did I interpret the code correctly?

@Ph Quam, your intuition is mostly right. Those plots show what the returns would have been if an algorithm had held a specific quantile every day, giving equal weight to every stock in that quantile. Whether the stocks differ from the previous day doesn't matter, because transaction costs are not considered. The forward returns are re-calculated every day, taking the average of the returns of the stocks in the quantile on that day.
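A toy sketch of that averaging step (the layout mimics an Alphalens-style factor_data table, but the column names, dates, and numbers here are assumptions for illustration):

```python
import pandas as pd

# Minimal stand-in for a factor_data layout: one row per (date, asset)
# with the assigned quantile and the 1-day forward return.
factor_data = pd.DataFrame(
    {"factor_quantile": [1, 2, 1, 2],
     "1D": [0.01, -0.02, 0.03, 0.00]},
    index=pd.MultiIndex.from_tuples(
        [("2017-01-03", "AAA"), ("2017-01-03", "BBB"),
         ("2017-01-04", "AAA"), ("2017-01-04", "BBB")],
        names=["date", "asset"]),
)

# Equal-weight mean forward return of each quantile, recomputed for
# every date: the stocks in a quantile may or may not change from one
# day to the next, transaction costs are ignored.
mean_quant_ret = factor_data.groupby(["date", "factor_quantile"])["1D"].mean()
```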

@Luca,
The problem is in performance.average_cumulative_return_by_quantile(). I do not know if it does the right thing from a statistics point of view, but from a user perspective its result is hard to interpret. What I expect is an event study: analyze the average (over different (start_date, symbol) pairs) response to a single event, 'entering quantile 2'.
Does Alphalens or Quantopian have an event study implementation?


@Ph Quam, looking at your NB I can now understand your doubts. You should report quantile 2 only on the days you expect the event to occur, not on the following days too. Every single day marked as quantile 2 is considered in the statistics, so even the second and third days after the event become part of the average returns. See attached NB.

Regarding event study implementation please see this and this.


The tweak you made in the notebook gets what I expect for quantile 2, but it breaks quantile 1. Q1 is just padding in this sample, but in a real setting we want to evaluate all quantiles fairly.
The quantiles are generated by Alphalens, and it should take care not to buy tickers it already owns. Alphalens should not average results across subranges
of a symbol's streak, only across different symbols with the same (streak_start_date, quantile) for the day.
This sentence in the GitHub sample notebook:
"By looking at the cumulative returns by factor quantile we can get an intuition for which quantiles are contributing the most to the factor and at what time." is not correct for the current implementation. The plot's time axis is a moving target: every day the reference date is incremented. When users look at
the plot they may perceive that a signal should be exploited between days 3 and 10 with respect to day 0, but they probably fail to account for the fact that
day zero is actually every day in the forward period.

@Ph Quam, I believe I understand your point, and it's only a matter of understanding why Alphalens calculates the forward returns the way it does.

Alphalens expects as input a factor that ranks the stocks in a certain way, so that top- and bottom-ranked stocks perform oppositely (positive vs negative returns) in the days (periods) after the ranking is computed. To verify the ranking scheme, Alphalens groups the stocks into quantiles and computes the average of the forward returns of the stocks in the same quantile. Those average quantile forward returns are calculated for every single date the input factor DataFrame has values for (every date is considered a new starting point for the forward returns calculation). This makes sense because we want to test the factor quality every time the factor is computed. All in all, we are looking to answer the question: after the stocks are ranked with our factor, what happens on average to the quantile returns?

Ideally a factor applies its ranking scheme every trading day, but this is not compulsory; e.g. a factor might generate values only on Mondays. In this example the factor DataFrame would only have values for the dates that correspond to Mondays, while the other days would be NaN or not present at all. In that case Alphalens would calculate statistics only for the days where the factor has values.
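A minimal sketch of such a sparse factor (the asset names and values are made up; the point is just the date filtering):

```python
import numpy as np
import pandas as pd

# A factor with one value per (date, asset) over two weeks of
# business days.
dates = pd.date_range("2017-01-02", periods=10, freq="B")
assets = ["AAA", "BBB"]
index = pd.MultiIndex.from_product([dates, assets], names=["date", "asset"])
factor = pd.Series(np.arange(len(index), dtype=float), index=index)

# Keep values only on Mondays: Alphalens would then compute its
# statistics only for those dates.
is_monday = factor.index.get_level_values("date").weekday == 0
monday_factor = factor[is_monday]
```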

The above explanation of how Alphalens works matches the behaviour of an algorithm that, every day before trading starts, does the following:
- runs the factor (e.g. pipeline)
- groups the results by quantiles
- enters positions to match the quantiles it wants to trade
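The steps above can be sketched as a single function for one day's cross-section. This is a hypothetical illustration, not the Alphalens API; the function name, quantile choice, and equal-weight scheme are all assumptions:

```python
import pandas as pd

def rebalance_targets(factor_values, n_quantiles=5):
    """One pass of the hypothetical daily loop: quantize the factor's
    cross-section and build an equal-weight long-short target portfolio
    (long the top quantile, short the bottom quantile)."""
    quantiles = pd.qcut(factor_values, n_quantiles, labels=False) + 1
    longs = factor_values.index[quantiles == n_quantiles]
    shorts = factor_values.index[quantiles == 1]
    weights = pd.Series(0.0, index=factor_values.index)
    weights[longs] = 0.5 / len(longs)     # half the book long, equal weight
    weights[shorts] = -0.5 / len(shorts)  # half the book short, equal weight
    return weights

# Example cross-section for one day (made-up factor values).
today = pd.Series(range(10), index=list("ABCDEFGHIJ"), dtype=float)
weights = rebalance_targets(today)
```

The resulting weights sum to zero (dollar neutral), which connects to the demeaning discussion further down the thread.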

Hopefully this helps.

I updated the NB to show how you can compute q1 and q2 as if they were two events. It works fine, but it's not a good example for Alphalens, as this is not the intended use of the tool.


@Luca,
I think we both agree that some code has to take care not to repeatedly buy the same symbol every day.
Where we disagree is whether this code should live inside the factor or inside Alphalens.
"Alphalens expects to receive in input a factor that ranks the stocks in a certain way." Actually, instead of just ranking, you impose a dual mandate on factors, the second mandate being control of exposure (no repetitive buying).
My understanding is that Alphalens is made to quickly test whether arbitrary data series (Morningstar fundamental ratios, daily temperatures somewhere, milk
prices in Bangladesh) have alpha, and such series can't meet the expectation that they are capable of controlling exposure.
So the whole discussion could be solved by a small factor wrapper that does this outside of Alphalens, and my idea is that this is the sensible default, because
probably few people can visually recognize that symbols are bought every day the factor is in some quantile and not NaN.
To implement exposure control in the factor, the factor needs as a parameter the "periods_after" with which average_cumulative_return_by_quantile() will be
called.

@Ph Quam, each symbol is not bought every day; Alphalens properly models the forward returns. Still, every day is considered when calculating the statistics.

Let's imagine we want to analyze the factor performance at 22 days/periods (this corresponds to an algorithm that rebalances its portfolio monthly, as there are 22 trading days per month on average). Alphalens would calculate the quantile forward returns every 22 days, considering the starting and ending prices of the stocks that were present in the quantile at the beginning of this 22-day period and never changing this stock set. Alphalens builds the cumulative returns time series from those forward returns. However, there is an issue: the performance built this way is very dependent on the particular trading day we start calculating the cumulative returns from. E.g. in a year, a hypothetical algorithm that uses that factor would only trade 12 times, one trade every 22 days. This doesn't give us meaningful statistical results. To overcome the issue, Alphalens builds 22 cumulative return series, where each series starts one day after the previous one. This way all the possible outcomes for a rebalancing period of 22 days are covered, and Alphalens returns an average of those results. This is much more meaningful.
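A rough sketch of the staggered-start idea, shrunk to a 3-day period and random made-up portfolio returns (this is an illustration of the averaging scheme, not the actual Alphalens computation):

```python
import numpy as np
import pandas as pd

period = 3  # rebalance every 3rd day (22 in the monthly example)
dates = pd.date_range("2017-01-02", periods=9, freq="B")
rng = np.random.default_rng(1)

# One staggered cumulative-return curve per starting offset: each one
# represents an algorithm that trades only every `period` days,
# starting on a different day.
curves = []
for offset in range(period):
    rebalance_dates = dates[offset::period]
    # Hypothetical per-rebalance portfolio returns; in Alphalens these
    # would come from the quantile forward returns.
    rets = pd.Series(rng.normal(0.001, 0.01, len(rebalance_dates)),
                     index=rebalance_dates)
    cum = (1 + rets).cumprod()
    curves.append(cum.reindex(dates).ffill())

# Averaging the staggered curves removes the dependence on the
# particular day the hypothetical algorithm started trading.
avg_cum = pd.concat(curves, axis=1).mean(axis=1)
```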

BUT, if you are really trying to evaluate the performance "on specific dates" (e.g. an event study), then just make sure your factor has values only on the particular dates you are interested in evaluating.

Hope this helps, even though I'm starting to believe I'm missing your point ;)

There is an issue with dividends: get_pricing() does not adjust for dividends and splits, while history() adjusts but is not available in Research.
This will make alpha look worse if some symbols pay dividends, and splits can affect performance even more (are they filtered out?).

Hi,

This is a great package.
I am trying to read the code and make sense of it, and I notice that there is a lot of demeaning of the factors and returns.
I think that when you analyze a long-short portfolio you always demean the returns, but I didn't get why.
It would be great if you could explain when and what you want to demean, and why you need to do it.

Thanks.

Kudos to the developers for this great tool.
I just tried the original notebook with some modified dates but that produces some strange error.
The following two lines are in my notebook:

results = run_pipeline(pipe_low_vol, '2017-05-30', '2017-05-30')
pricing = get_pricing(assets, start_date='2017-05-30', end_date='2017-06-30', fields='open_price')

And the error I get is:

/usr/local/lib/python2.7/dist-packages/alphalens/performance.pyc in factor_alpha_beta(factor_data) 209
210 reg_fit = OLS(y, x).fit()
--> 211 alpha, beta = reg_fit.params
212
213 alpha_beta.loc['Ann. alpha', period] = (1 + alpha) ** (252.0/period) - 1

ValueError: need more than 1 value to unpack

Could anyone explain why this is happening please?

Thanks,
Yiran

@Yiran Huang, would you mind posting your NB? That would make it easier to help you and discover what's wrong with the code.

@John Wu, the concept of return demeaning is the whole point of a long-short (dollar-neutral, actually) portfolio. With this kind of portfolio the returns don't depend on the overall market performance (hence the demeaning by the market mean returns) but on the spread between the long leg returns and the short leg returns. After performing the returns demeaning you are actually looking at the performance relative to the market mean returns instead of absolute returns. This makes sense only if you are running a dollar-neutral portfolio.
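A one-date toy example of cross-sectional demeaning (the tickers and numbers are made up):

```python
import pandas as pd

# One day's returns for a four-stock universe (made-up numbers).
returns = pd.Series({"AAA": 0.02, "BBB": 0.01, "CCC": -0.005, "DDD": 0.015})

# Cross-sectional demeaning: subtract the universe mean, so each value
# is the return relative to the market rather than an absolute return.
# A dollar-neutral long-short portfolio earns exactly these relative
# returns, because the common market component cancels between legs.
demeaned = returns - returns.mean()
```

Note that the demeaned returns always sum to zero across the universe, which is why they are only meaningful for a dollar-neutral portfolio.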

@Luca

NB attached, thank you.


@Yiran Huang, the problem is this:

results = run_pipeline(pipe_low_vol, '2017-05-31', '2017-05-31')


If you expand the date range to at least 2 days, it works fine.

@Luca

Thank you again for quickly pointing out the issue. Now the flow works on my own alpha experiments as well.
Could you elaborate a bit why this is causing me problems as the original example NB was running the pipeline for 1 day?

@Yiran Huang, Alphalens is designed to analyse a factor over a period of time, and using it on a 1-day period is a corner case that breaks the code. Also, I'm not sure why you would need to run Alphalens on 1 day only.

The second notebook has an error too; maybe it can also be updated.