The Wisdom of the Crowd: How the Crowd Helps Us Select the Best Stock

If you were to pick just one stock, which one would you pick, and how?

How much would you win, and how risky is that strategy?

# Introduction

I ran the strategy described here from 2005-01-01 to 2019-06-06 and it made 1926% while the SPY benchmark made 206%. The drawdown was -0.61%.

This is a strategy that picks the Amazons, Netflixes, Microsofts and Apples of the day and invests patiently in them until a better opportunity appears.

# Strategy

The strategy is based on picking the stock highest in AverageDollarVolume and sticking with it until another stock kicks it out of the leading spot. Over the long term, such a strategy leads to much higher returns than the baseline SPY index.

While the strategy described is not something one may want to run in practice, it demonstrates an anomaly: why would such a simple winning strategy even exist in the first place?

This post is made with this question in mind and hopes to evoke some useful answers.

I called this the "Wisdom of the Crowd", because ranking the stocks by AverageDollarVolume is like a popularity contest in which the crowd picks a winner, and the strategy just follows the winner. The crowd leaves a trace via high price and high volume for us to follow.

The details are available in the source code of the backtest.
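The selection rule can also be sketched outside Quantopian in a few lines of plain Python. This is only a minimal sketch, not the backtest code: the `history` layout, the helper names, the 3-day window, the tickers and the prices are all made up for illustration.

```python
from statistics import mean

def average_dollar_volume(bars, window):
    """Mean of close * volume over the last `window` daily bars."""
    recent = bars[-window:]
    return mean(close * volume for close, volume in recent)

def pick_leader(history, window):
    """Return the ticker with the highest average dollar volume."""
    return max(history, key=lambda t: average_dollar_volume(history[t], window))

def rebalance(current, history, window):
    """Return (ticker to hold, whether a switch is needed)."""
    leader = pick_leader(history, window)
    return leader, leader != current

# Hypothetical (close, volume) bars, oldest first.
history = {
    "AAA": [(100.0, 1_000_000), (101.0, 1_200_000), (102.0, 900_000)],
    "BBB": [(10.0, 5_000_000), (10.5, 4_000_000), (11.0, 6_000_000)],
    "CCC": [(500.0, 300_000), (505.0, 250_000), (498.0, 400_000)],
}

holding, switched = rebalance("AAA", history, window=3)
print(holding, switched)
```

The position only changes when a different ticker takes the top spot, which is why the strategy trades rarely.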

# Evaluation over multiple time periods

| Period | Algorithm | Benchmark (SPY) | Notes |
|---|---|---|---|
| 01/01/2003 - 01/01/2006 | 69% | 49% | |
| 01/01/2006 - 01/01/2009 | -15% | -24% | |
| 01/01/2009 - 01/01/2012 | 297% | 48% | |
| 01/01/2012 - 01/01/2015 | 96% | 74% | |
| 01/01/2015 - 06/01/2018 | 61% | 38% | |
| 01/01/2018 - 06/01/2019 | 12% | 6% | |
| 06/01/2019 - 01/02/2020 | 49.12% | 25% | Out of sample, after publication |
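To get a rough feel for how these period returns add up, they can be compounded. The sketch below chains the six in-sample rows. Note that the fifth and sixth periods overlap as printed, so the result is only an approximation, and it will not match the headline figure in the introduction, which covers a different date range.

```python
def compound(period_returns_pct):
    """Chain percentage returns: (1 + r1) * (1 + r2) * ... - 1, in percent."""
    growth = 1.0
    for r in period_returns_pct:
        growth *= 1.0 + r / 100.0
    return (growth - 1.0) * 100.0

# In-sample rows copied from the table above (percent).
algo_periods = [69, -15, 297, 96, 61, 12]
spy_periods = [49, -24, 48, 74, 38, 6]

print("Algorithm: {:.0f}%".format(compound(algo_periods)))
print("SPY:       {:.0f}%".format(compound(spy_periods)))
```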

# Caveats

• Holding a single stock is risky.
• The selection criterion is based on AverageDollarVolume only. This may lead to false positives, for example if the volume was associated with a price decline, or with a bubble that forms and quickly bursts.
• This is a long-term strategy suited to a bull market. In a long bear market, holding stocks of course leads to losses, but if the stocks have sound fundamentals, they will rebound more than the index when the bear market is over.
• Tax Efficiency. Selling appreciated stocks leads to paying taxes. The strategy does not consider taxes.

# Discussion

• The distribution of returns among many stocks in the SHORT term (e.g. up to 2-3 years) is close to Gaussian (and is symmetric).
• The distribution of returns among many stocks in the LONG term (e.g. 15 years) is Zipfian (and highly asymmetric). This means that a rich-get-richer phenomenon is in place. In fact, this is confirmed by the distribution of market capitalization.

The strategy hints that perhaps there is a way for a non-professional investor to end up, over the long term, in the high end of this Zipfian distribution.
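A toy simulation gives some intuition for why long horizons look so much more lopsided than short ones. It is only a sketch under made-up assumptions: every stock draws independent Gaussian monthly returns with the same mean and volatility (1% and 8% here, not estimated from any data). Compounding such returns gives a roughly lognormal spread, which is skewed but not strictly Zipfian, so this shows the asymmetry only, not the exact distribution.

```python
import random
from statistics import mean, median

def simulate_growth(n_stocks, n_months, mu=0.01, sigma=0.08, seed=0):
    """Compound Gaussian monthly returns for each stock; return growth multiples."""
    rng = random.Random(seed)
    growth = []
    for _ in range(n_stocks):
        value = 1.0
        for _ in range(n_months):
            value *= 1.0 + rng.gauss(mu, sigma)
        growth.append(value)
    return growth

def mean_to_median(values):
    """Crude asymmetry measure: 1.0 for a symmetric sample, larger when a few winners dominate."""
    return mean(values) / median(values)

short_run = simulate_growth(2000, 24)    # about 2 years
long_run = simulate_growth(2000, 180)    # about 15 years

print("2 years:  mean/median = {:.2f}".format(mean_to_median(short_run)))
print("15 years: mean/median = {:.2f}".format(mean_to_median(long_run)))
```

The mean pulls further away from the median as the horizon grows, meaning a small number of stocks account for a growing share of the total gain.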

Finally, the strategy was inspired by the book "Dual Momentum Investing: An Innovative Strategy for Higher Returns with Lower Risk" by Gary Antonacci.

# Disclaimer

This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by the author or anyone else. Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. This disclaimer was adapted from Quantopian's own disclaimer.

"""
This is a template algorithm on Quantopian for you to adapt and fill in.
"""
import quantopian.algorithm as algo
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import Q500US
from quantopian.pipeline.factors import AverageDollarVolume

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Rebalance every day, 1 hour after market open.
    algo.schedule_function(
        rebalance,
        algo.date_rules.every_day(),
        algo.time_rules.market_open(hours=1),
    )

    # Record tracking variables at the end of each day.
    algo.schedule_function(
        record_vars,
        algo.date_rules.every_day(),
        algo.time_rules.market_close(),
    )

    # Create our dynamic stock selector.
    algo.attach_pipeline(make_pipeline(), 'pipeline')

def make_pipeline():
    q500us = Q500US()
    # The factor definition is missing from the post as shown here; the
    # window length is an assumed placeholder, not the author's value.
    dollar_volume = AverageDollarVolume(window_length=30)

    # Keep only the single highest dollar-volume stock.
    screen = (dollar_volume.top(1) & q500us)

    pipe = Pipeline(
        columns={
            'dollar_volume': dollar_volume,
        },
        screen=screen
    )
    return pipe

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = algo.pipeline_output('pipeline')['dollar_volume']
    # These are the securities that we are interested in trading each day.
    context.security_list = context.output.index

def rebalance(context, data):
    """
    Execute orders according to our schedule_function() timing.
    """
    old_stocks = list(context.portfolio.positions.keys())
    old_stock = None
    if len(old_stocks) > 0:
        old_stock = old_stocks[0]

    new_stocks = list(context.output.index)
    if not new_stocks:
        # The screen can come back empty (e.g. the top stock is not in Q500US).
        return
    new_stock = new_stocks[0]
    # Switch only when a different stock has taken the top spot.
    if old_stock is not None and old_stock != new_stock:
        order_target_percent(old_stock, 0.0)
    if old_stock != new_stock:
        order_target_percent(new_stock, 1.0)

def record_vars(context, data):
    """
    Plot variables at the end of each day.
    """
    record(leverage=context.account.leverage)

def handle_data(context, data):
    """
    Called every minute.
    """
    pass
13 responses

Very interesting, you might be onto something there. Thanks for sharing!

Fascinating! I reran your algo and in 2019 it only suggests buying AMZN, which makes sense (at least to this Amazon stock owner). It might also be interesting to look at the change in average dollar volume over a specific period (say bi-weekly) and see which stocks are being heavily sold versus bought.

The drawdown was -0.61%.

Drawdown was -61%.
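The two figures most likely differ only in units: a drawdown of -0.61 expressed as a fraction is -61% expressed as a percentage. That is an assumption about where the original number came from. Below is a minimal sketch of how maximum drawdown is usually computed, using a made-up equity curve.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a negative fraction (-0.61 means -61%)."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)                  # highest value seen so far
        worst = min(worst, value / peak - 1.0)   # decline from that peak
    return worst

# Hypothetical portfolio values, not taken from the backtest.
curve = [100.0, 150.0, 58.5, 120.0, 200.0]
dd = max_drawdown(curve)
print("{:.2f} as a fraction, {:.0%} as a percentage".format(dd, dd))
```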

AverageDollarVolume is like a popularity contest in which the crowd picks a winner, and the strategy just follows the winner.

I'm not sure if I agree with this point. For every buyer there is a seller. Dollar volume doesn't tell you anything about whether the crowd is voting up or down.
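To illustrate with made-up numbers: a rally and a sell-off can produce exactly the same dollar volume. The bars below are hypothetical (close, volume) pairs, and the helper names are only for this sketch.

```python
def dollar_volume(bars):
    """Total dollars traded: sum of close * volume over the bars."""
    return sum(close * volume for close, volume in bars)

def price_change(bars):
    """Fractional change from the first close to the last close."""
    return bars[-1][0] / bars[0][0] - 1.0

# Same closes and volumes, opposite order.
rally = [(100.0, 1_000_000), (110.0, 1_000_000), (120.0, 1_000_000)]
selloff = [(120.0, 1_000_000), (110.0, 1_000_000), (100.0, 1_000_000)]

for name, bars in (("rally", rally), ("selloff", selloff)):
    print("{}: ${:,.0f} traded, price change {:+.1%}".format(
        name, dollar_volume(bars), price_change(bars)))
```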

The share price of every asset on the market is already set by the best "fair value" guess of the crowds. That's how the supply and demand of the markets work. In other words, "price discovery" is "wisdom of the crowds." What your strategy suggests is that heavy dollar volume is an indication that the market's voting mechanism (price) is guessing too low. In other words, the crowds are not wise.

Biggest caveat for this strategy is that the result reflects very few data points. Holding only one position per day means only ~220 data points per year. This is not very statistically significant.

My point about statistical relevance was perhaps not very clear. It's easier to overfit when applying a rule very narrowly, because with fewer datapoints it's easier to happen upon spurious correlations. (Just as you're more likely to get 5 heads in a row when tossing a coin than 10,000 heads in a row.) We already know AAPL and MSFT have done exceptionally well during the bull market, and we also know that they have been at the top of dollar volume much of the time. But is there any causation that connects the two? Or is it just happenstance?
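A quick simulation makes the same point. It is a sketch with made-up parameters: each "strategy" places `n_bets` independent bets that have zero true edge (Gaussian returns with mean 0 and standard deviation 30%), and we count how often the average bet still looks like a 10% winner purely by luck.

```python
import random

def lucky_fraction(n_bets, threshold=0.10, sigma=0.30, trials=2000, seed=0):
    """Share of zero-edge strategies whose average bet return beats `threshold`."""
    rng = random.Random(seed)
    lucky = 0
    for _ in range(trials):
        avg = sum(rng.gauss(0.0, sigma) for _ in range(n_bets)) / n_bets
        if avg > threshold:
            lucky += 1
    return lucky / trials

for n in (5, 50, 500):
    print("{:>3} bets: {:.1%} of no-skill strategies look like winners".format(
        n, lucky_fraction(n)))
```

With only a handful of effectively independent bets, a sizeable share of no-skill strategies clear the bar; with hundreds, almost none do.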

What happens if you apply the same rule to the top 20 stocks by dollar volume? Does it outperform the remaining 480 of the S&P500? Surely if it's a wisdom of the crowds matter, then the trend should extend at least down into the top 4%, no? However, it does not hold. Alpha is but 0.01, which is within the margin of error, or otherwise explained by the fact that AAPL is carrying the weight for the rest of our sample.

"""
This is a template algorithm on Quantopian for you to adapt and fill in.
"""
import quantopian.algorithm as algo
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import Q500US
from quantopian.pipeline.factors import AverageDollarVolume

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Rebalance every day, 1 hour after market open.
    algo.schedule_function(
        rebalance,
        algo.date_rules.every_day(),
        algo.time_rules.market_open(hours=1),
    )

    # Record tracking variables at the end of each day.
    algo.schedule_function(
        record_vars,
        algo.date_rules.every_day(),
        algo.time_rules.market_close(),
    )

    # Create our dynamic stock selector.
    algo.attach_pipeline(make_pipeline(), 'pipeline')

def make_pipeline():
    q500us = Q500US()
    # The factor definition is missing from the post as shown here; the
    # window length is an assumed placeholder, not the author's value.
    dollar_volume = AverageDollarVolume(window_length=30)

    # Keep the 20 highest dollar-volume stocks.
    screen = (dollar_volume.top(20) & q500us)

    pipe = Pipeline(
        columns={
            'dollar_volume': dollar_volume,
        },
        screen=screen
    )
    return pipe

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = algo.pipeline_output('pipeline')['dollar_volume']
    # These are the securities that we are interested in trading each day.
    context.security_list = context.output.index

def rebalance(context, data):
    # Weight each selected stock by its share of the total dollar volume.
    for s in context.security_list:
        order_target_percent(s, context.output[s] / context.output.sum())

    # Close positions that dropped out of the selection.
    for s in context.portfolio.positions:
        if s not in context.security_list:
            order_target(s, 0)

def record_vars(context, data):
    """
    Plot variables at the end of each day.
    """
    record(leverage=context.account.leverage)

def handle_data(context, data):
    """
    Called every minute.
    """
    pass

For comparison's sake, here's the original strategy but with AAPL as the benchmark. As you see, it does not outperform AAPL.

To better explain what I'm pointing out: you could come up with a strategy where you go long one stock each day -- the rule being that each day you pick a random stock whose ticker starts with AA. You could rationalize it as the beginning of the alphabet having some psychological effect on the markets, and therefore the first stocks in the alphabet will perform better. No doubt the strategy would handily outperform the market in a backtest. But obviously there's no correlation between the tickers' place in the alphabet and returns. AAPL had tremendous success. They had the iPhone. It had nothing to do with their dollar volume or place in the alphabet.

"""
This is a template algorithm on Quantopian for you to adapt and fill in.
"""
import quantopian.algorithm as algo
from quantopian.pipeline import Pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.filters import Q500US
from quantopian.pipeline.factors import AverageDollarVolume

def initialize(context):
    """
    Called once at the start of the algorithm.
    """
    # Rebalance every day, 1 hour after market open.
    algo.schedule_function(
        rebalance,
        algo.date_rules.every_day(),
        algo.time_rules.market_open(hours=1),
    )

    # Record tracking variables at the end of each day.
    algo.schedule_function(
        record_vars,
        algo.date_rules.every_day(),
        algo.time_rules.market_close(),
    )

    # Create our dynamic stock selector.
    algo.attach_pipeline(make_pipeline(), 'pipeline')

    # Compare against AAPL instead of the default SPY benchmark.
    set_benchmark(symbol('AAPL'))

def make_pipeline():
    q500us = Q500US()
    # The factor definition is missing from the post as shown here; the
    # window length is an assumed placeholder, not the author's value.
    dollar_volume = AverageDollarVolume(window_length=30)

    # Keep only the single highest dollar-volume stock.
    screen = (dollar_volume.top(1) & q500us)

    pipe = Pipeline(
        columns={
            'dollar_volume': dollar_volume,
        },
        screen=screen
    )
    return pipe

def before_trading_start(context, data):
    """
    Called every day before market open.
    """
    context.output = algo.pipeline_output('pipeline')['dollar_volume']
    # These are the securities that we are interested in trading each day.
    context.security_list = context.output.index

def rebalance(context, data):
    """
    Execute orders according to our schedule_function() timing.
    """
    old_stocks = list(context.portfolio.positions.keys())
    old_stock = None
    if len(old_stocks) > 0:
        old_stock = old_stocks[0]

    new_stocks = list(context.output.index)
    if not new_stocks:
        # The screen can come back empty (e.g. the top stock is not in Q500US).
        return
    new_stock = new_stocks[0]
    # Switch only when a different stock has taken the top spot.
    if old_stock is not None and old_stock != new_stock:
        order_target_percent(old_stock, 0.0)
    if old_stock != new_stock:
        order_target_percent(new_stock, 1.0)

def record_vars(context, data):
    """
    Plot variables at the end of each day.
    """
    record(leverage=context.account.leverage)

def handle_data(context, data):
    """
    Called every minute.
    """
    pass

@Stefan -- I don't mean to disparage your strategy. It's a good idea and one I've investigated myself. I hope my comments come across as constructive and insightful. That's at least my intention.

@Vladimir -- A strategy that produces a significant return is only useful if it's predictive. A "significant return" that is the result of overfit or spurious correlations is of no use to any investor -- institutional, retail, whatever -- because once you start trading it with real money it will no longer deliver a significant return.

A buy-and-hold AAPL strategy also produced a "significant return." So why muck around with algorithms? I think we both know the answer.

There are no predefined instruments in this strategy, so do not use symbols in your argumentation.

Oh, come on. You don't have to explicitly define instruments to target them. The point is as soon as your strategy selects only one stock at a time, you are essentially overfitting to individual stocks. Over the span of a 14.5 year backtest, this strategy has only held 5 different stocks the whole time. That's one stock on average every three years.

Any strategy that only holds 5 different stocks ever, and one of them is AAPL, is going to look impressive.

In the period from 1999 to 2007 I was trading an end-of-day time-zone arbitrage strategy

Congratulations. However, that has nothing to do with this strategy. This is not an arbitrage strategy. It does not take advantage of some unexploited artifact of market structure. Rather, this strategy has no legitimate rationale. It has simply latched onto a historically profitable spurious correlation. This strategy has a 65% drawdown in the simulation, meaning it would probably experience much worse in real life. Your <5% drawdown won't help at all with this strategy. There's no comparison. Total red herring.

Don't ruin a good algo, "The Wisdom of the Crowd", by sciolism.

It's not a robust strategy, plain and simple. But go ahead and trade it if it passes your sniff test.

It appears the above post has been deleted (this morning), but nonetheless.

Applying the concept of “pressure points” includes more than just adding leverage.

For instance, the average gross leverage used in my charts below was about 1.26 compared to your 1.50, albeit sometimes a little higher and sometimes a little lower, which is why it is termed an average.

However, leverage is something you can control from within the program, or from outside as some just in time portfolio level directive. But those are all choices one can make. After all, we are the ones programming these things.

Our equity curves do exhibit some similarities. And, I would agree, they do look alike.

However, the major difference is one of scale as can be seen in the following charts extracted from the simulation tearsheets.

We are in a game where scale matters and ultimately it is where it will make all the difference.

[Charts from the simulation tearsheets: Cumulative Return, Return Distribution, Portfolio Metrics]

Guy,

Yes, I deleted all my posts in this thread because the topic was diverted by Mr. Hawk toward destroying a good algorithm using the "statistical insignificance" sciolism.
I have always admired your ability to use pressure points to send any algorithm to the sky, not only this time.

Thank you all for your comments. I agree with all the points, including Viridian Hawk's.
Upon further investigation, I found an issue with AverageDollarVolume picking stocks during a sell-off. I published a new strategy here: https://www.quantopian.com/posts/fixing-the-wisdom-of-the-crowds-1. This strategy actually flips and holds multiple stocks at the same time.

Additionally, I updated the strategy with the last 6 months of out-of-sample data. The strategy has performed as before.

I also ran the strategy on mid-cap technology companies and other sectors.
The strategy worked only partially on the oil industry, but not on any other industries.
The reason that the strategy works is the "big get bigger" in technology.

Hi Everyone

Just a silly question here: what is the difference between DollarVolume and Daily Volume?
Is DollarVolume simply Volume * Price for the trading day? If yes, will the script not be biased towards stocks with a high price and big volume? And what would be the implication of ranking by traded volume instead, since that would be the real wisdom of the crowd?
Sincere apologies, I'm a noob here, so these questions might be very redundant to everyone.

DollarVolume is price x volume (where volume is the number of shares traded). This is the total dollars that were traded.
Daily Volume is simply the number of shares traded.
One typically follows the money. If a lot of people are putting their money into a stock, regardless of whether the stock is priced at $1 or $1000, then it means something. Simply using volume doesn't mean a lot. Compare a penny stock XYZ priced at $.02/share vs AMZN priced at $2000/share. Both may have a daily volume of 1 million shares. However, stock XYZ has only $20,000 of interest (maybe just one person?) while AMZN has $2,000,000,000 worth of interest.
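The arithmetic behind that comparison, as a small sketch (the prices and volumes are the illustrative ones from the paragraph above, not market data):

```python
def dollar_volume(price, shares_traded):
    """Dollars traded in a day: price per share times number of shares traded."""
    return price * shares_traded

# (price per share, shares traded that day)
stocks = {
    "XYZ": (0.02, 1_000_000),
    "AMZN": (2000.00, 1_000_000),
}

for ticker, (price, shares) in stocks.items():
    print("{}: {:,} shares, ${:,.0f} traded".format(
        ticker, shares, dollar_volume(price, shares)))
```

Ranking by share volume would put the two stocks level, while ranking by dollar volume separates them by a factor of 100,000.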