Finding the largest stocks at a given time in history?

I am brand-new here. I'm hoping there is a way to figure out what the x largest stocks were at a given time. For example:
What were the 10 largest stocks as of January 1, 2010?
'Largest' can be based on either fundamentals or market value; both are fine.

I hope someone can help me to figure out how to write a function that does the job above.

8 responses

Hello Benjamin -

Very doable. It is built into the Pipeline API, with something called .top(). For example, see:

https://www.quantopian.com/posts/contest-32-entries

I don't have time now to write a simple example, but if you dig around in the help docs/tutorials, you should be able to find some example of how to use the .top() thingy (by the way, there is no .bottom() as far as I know, but one can exclude the top N, giving the remaining bottom).

You may also want to look at the built-in factors MarketCap or AverageDollarVolume, depending on whether you want the largest capitalized companies or the largest average traded dollar volume, respectively. Use them in conjunction with .top or .bottom, as Grant mentioned, to make a filter. (BTW there is a '.bottom' thingy.) See the documentation under built-in factors: https://www.quantopian.com/help#built-in-factors
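
For anyone following along, here is a minimal sketch in plain Python (not the Pipeline API itself) of what a top-N or bottom-N selection by a size measure boils down to. The tickers and market-cap figures are made-up placeholders, and top_n and bottom_n are hypothetical helper names used only for illustration.

```python
import heapq

# Hypothetical snapshot for one date: ticker -> market cap.
# All tickers and values are made-up placeholders, not real data.
snapshot = {
    "AAA": 350.0,
    "BBB": 120.0,
    "CCC": 980.0,
    "DDD": 45.0,
    "EEE": 610.0,
}


def top_n(values, n):
    """Return the n tickers with the largest values, largest first."""
    return heapq.nlargest(n, values, key=values.get)


def bottom_n(values, n):
    """Return the n tickers with the smallest values, smallest first."""
    return heapq.nsmallest(n, values, key=values.get)


largest = top_n(snapshot, 2)
smallest = bottom_n(snapshot, 2)

# Excluding the top N leaves the remaining bottom of the universe.
remaining = set(snapshot) - set(top_n(snapshot, 3))
```

In a Pipeline, the same selection would be done for every day of the backtest on a factor such as MarketCap or AverageDollarVolume; the sketch only shows the selection logic for a single date.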

@ Dan - Thanks for pointing out .bottom(). At one point, we only had .top() but now we have both. Seems we should also have a .middle() huh? : )

@ Benjamin - I'd also recommend, if you are looking to write algos for the Q fund, using the Q1500US base universe. It is "trade-able" and also filters out a lot of junk that is not appropriate for their fund. You might also ask "How about the S&P 500 or other common indices?" They are not available as part of the Q API.

Never said thanks - but hereby thanks a lot Grant and Dan. Really appreciate it! Very good suggestions. I'll look more into the proposed methods.

@Benjamin Biegel

You are very welcome. Good luck to you.

Note there is a percentile_between method too. See the help page.

It can also be combined with 'not' to reverse it. The tilde means not, as in ~b.percentile_between. An example follows.
It keeps only the stocks with strong momentum in one direction or the other, screening out those in the middle between lo and hi, for example 30 and 70.
m is short for mask, and &= adds another condition to the mask. There was also an 'a' factor alongside 'b', and both are added to columns. Putting two statements on one line with a semicolon makes it easy to remove a factor by commenting out a single line (apart from the second comment needed in columns).
Anyway, I figured this might help anyone who wants both lows and highs. I found it efficient for trying things out, and readable once I got used to it.

b = MomentumRatio(window_length=wl, mask=m); m &= (~b.percentile_between(lo, hi))


m &= (~b.percentile_between(lo, hi)) saves having to combine two separate percentile_between calls, like (b.percentile_between(0, 30) | b.percentile_between(70, 100)), and gives a shorter line that is less likely to wrap.
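
To make the logic concrete, here is a rough plain-Python sketch of the idea, not the Pipeline implementation. The scores are made-up placeholders, percentile_rank and percentile_between are hypothetical helpers written for this illustration, and the real Pipeline method may treat boundaries and missing values differently.

```python
# Hypothetical momentum scores: ticker -> score (made-up placeholder values).
scores = {"AAA": 0.9, "BBB": -0.7, "CCC": 0.1, "DDD": 0.4, "EEE": -0.2}


def percentile_rank(values):
    """Map each key to a rank-based percentile from 0 (lowest) to 100 (highest)."""
    ordered = sorted(values, key=values.get)
    last = max(len(ordered) - 1, 1)  # avoid dividing by zero for a single item
    return {key: 100.0 * i / last for i, key in enumerate(ordered)}


def percentile_between(values, lo, hi):
    """Return key -> True where the percentile rank lies in [lo, hi]."""
    ranks = percentile_rank(values)
    return {key: lo <= ranks[key] <= hi for key in values}


lo, hi = 30, 70

# One call plus a 'not': everything outside the middle band.
middle = percentile_between(scores, lo, hi)
extremes = {key: not flag for key, flag in middle.items()}

# The longer form: low tail OR high tail.
low_tail = percentile_between(scores, 0, lo)
high_tail = percentile_between(scores, hi, 100)
two_sided = {key: low_tail[key] or high_tail[key] for key in scores}

# Tickers that survive the screen. The two forms agree here because no
# rank falls exactly on lo or hi; on a boundary they could differ.
kept = sorted(key for key, flag in extremes.items() if flag)
```

The single negated call is shorter, and it also avoids having to decide which side of the boundary each tail includes.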

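@Gary, you could do something similar with: ( b.bottom(n) | b.top(n) ). Note that .top() and .bottom() take a number of assets rather than a percentile, so n here is a count.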