Making Of An Index: Standard Statistics

Feb 20, 2017

Last time we looked at how the Cowles Commission recreated a historical index giving us data from 1871 to 1940.

Previous post: Making Of An Index: The Cowles Commission (medium.com)

I talked a bit about the Standard Statistics Bureau and the weekly price index that they maintained between 1918 and 1956. Let’s go back in time a bit and look again at Standard Statistics. (Somewhere along the way Standard merged with Poor and became “Standard & Poor” and I’ll be a bit haphazard about when I use which phrasing.)

The Weekly Index (1918–1956)

From 1918 to 1956 Standard created a weekly index of approximately 200–400 companies (the numbers varied over time but it was essentially “all the companies”). Once a week, they would gather up all the price data and compute an index.

When we talk about “the historical returns of the S&P 500” this index would appear to be exactly what we mean. The only problem is that most people don’t use this index.

What?!?

The Daily Index (1925–1956)

A few years after creating the weekly index, Standard also created a daily index. They created the index in 1928 but went back and calculated historical daily values from December 1925 onwards. Since gathering the data and calculating an index every day was laborious before computers, the daily index was made up of fewer companies. In fact, it had just 90 companies.

That’s right, between 1925 and 1956 the “S&P 500” was really the “S&P 90”. To make things even more confusing, it was hard-coded to be:

50 industrial companies
20 railroad companies
20 utility companies

In other words, it wasn’t just the 90 largest companies. The goal was to include the major firms in representative sectors of the market. Of course, over time the committee at Standard would change the number of sectors represented (for instance, they added a financial companies sector in 1977) and the number of companies in each sector. But keep in mind that it was determined by a committee — not by the actual market.

The birth of the S&P 500 and the death (and loss) of the weekly index (1957)

In March 1957 Standard made two big changes:

The “daily index” was expanded in scope from 90 companies to 500. This was the birth of the S&P 500 that we know today.
The “weekly index” (which, at that time, had over 500 companies in it) was discontinued and the data essentially thrown away.

What do I mean by thrown away? The Standard & Poor Index Committee decided that the historical record of the S&P 500 would be based on the “daily index” (of 90 stocks) rather than the “weekly index” (of ~500 stocks).

If we want the historical record of the S&P 500 to reflect the entire economy, then it hardly makes sense to use a smaller set of stocks.

Does it make a difference? Yes.

Between December 1940 and December 1956 the “daily index” (of 90 companies) had an average annual return of 9.56% but the “weekly index” (of 500 companies) had an average annual return of 9.08%.
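To get a feel for how much a gap of about half a percentage point matters, here is a rough sketch that compounds both figures over the 16 years from December 1940 to December 1956. It assumes the quoted averages can be treated as compound annual rates, which the figures above don't confirm, and the function name is purely illustrative.

```python
def growth_of_one_dollar(annual_return, years):
    """Value of $1 after compounding at a constant annual rate."""
    return (1.0 + annual_return) ** years

YEARS = 1956 - 1940  # December 1940 to December 1956

# Assumption: the averages quoted above behave like compound annual rates.
daily_index_growth = growth_of_one_dollar(0.0956, YEARS)   # 90-company index
weekly_index_growth = growth_of_one_dollar(0.0908, YEARS)  # ~500-company index

# How far ahead the daily index ends up, relative to the weekly index.
gap = daily_index_growth / weekly_index_growth - 1.0

print(f"daily index:  $1 grows to ${daily_index_growth:.2f}")
print(f"weekly index: $1 grows to ${weekly_index_growth:.2f}")
print(f"daily index ends {gap:.1%} higher")
```

Under that assumption the narrower index ends the period roughly 7% ahead of the broader one, which is a noticeable difference for something that is supposed to be the same historical record.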

Does that mean that the historical record of the S&P 500 is overstating the performance of equities? (Yes.)

No fixed allocation (1988)

Remember how we said that the “daily index” was hard-coded to have a certain number of railroad companies and a certain number of utilities and so on?

That actually kept happening until 1988. Up to that point the committee said that the S&P 500 was:

400 industrials
20 transport companies (since it was more than just railroads)
40 utility companies
40 financial companies

When they got rid of the fixed allocation and let the market decide, the allocations changed to:

388 industrials
16 transport companies
43 utility companies
53 financials
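To put the two lists side by side, here is a small sketch that turns the company counts into shares of the index. Note that these are shares by number of companies, not by market value; the dictionary and function names are made up for the example.

```python
# Number of companies per sector, as listed above.
fixed_allocation = {"industrials": 400, "transport": 20, "utilities": 40, "financials": 40}
market_driven = {"industrials": 388, "transport": 16, "utilities": 43, "financials": 53}

def sector_shares(counts):
    """Each sector's share of the index by number of companies (not by market value)."""
    total = sum(counts.values())
    return {sector: n / total for sector, n in counts.items()}

before = sector_shares(fixed_allocation)
after = sector_shares(market_driven)

# Print the before/after share and the change in company count for each sector.
for sector in fixed_allocation:
    change = market_driven[sector] - fixed_allocation[sector]
    print(f"{sector:12s} {before[sector]:6.1%} -> {after[sector]:6.1%} ({change:+d} companies)")
```

The biggest move is in financials, which gained 13 companies once the fixed allocation was dropped, mostly at the expense of industrials.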

While it probably doesn’t affect the final results substantially, it is still dubious that for most of the S&P 500’s history it used fixed allocations decided by a committee instead of being a truer representation of the economy.

S&P 500 prior to 1926

As we saw above, the Standard & Poor company only calculated index data going back to 1926. But they provide index data before that. Where does that data come from?

They use the results of the Cowles Commission. Unfortunately, they use Cowles’ first edition and not the revised editions that he worked on subsequently.

Ibbotson’s SBBI

So what does all of that mean?

The most common source of returns data is probably Ibbotson’s annual Stocks, Bonds, Bills, and Inflation report (often referred to as Ibbotson’s SBBI) which provides historical information from 1926 to the present. A new version is published every year, with updated information.

Ibbotson’s data is based on the “daily index” from 1925 to 1956, rather than the broader “weekly index”. This is most likely because by 1976, when Ibbotson started publishing his report, the “weekly index” data had been gone for nearly 20 years.

Ibbotson uses month-end prices, avoiding the problem (discussed in my previous post) of averaging data.

Not averaging is good. But not using the broadest selection of stocks possible is less good.

Siegel & Shiller

Robert Shiller and Jeremy Siegel are two of the most famous financial economists. Both have published books that rely on the historical performance of stocks since 1871. What data do they use?

Both of them use the first edition of the Cowles Commission for 1871–1925. Remember that this data used averaged returns and had some errors that were later corrected by Cowles.

From 1926 to 1956 Shiller takes the “S&P 500” historical data (remember, this is the “daily index” based on only 90 companies) and averages the January values. Averaging isn’t great but presumably he was trying to keep consistency with the Cowles methodology.
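To see why averaging isn’t great, here is a minimal simulation sketch. It is not Shiller’s or Cowles’ actual procedure: it generates a synthetic random-walk price series (the 1% daily volatility, 21 trading days per month and fixed seed are all arbitrary assumptions) and compares the volatility of monthly returns computed from month-end prices with the volatility computed from monthly average prices.

```python
import math
import random
import statistics

def simulate_daily_prices(n_days, daily_vol=0.01, seed=42):
    """Simulate a driftless geometric random walk of daily closing prices."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n_days - 1):
        prices.append(prices[-1] * math.exp(rng.gauss(0.0, daily_vol)))
    return prices

def period_returns(levels):
    """Simple returns between consecutive index levels."""
    return [b / a - 1.0 for a, b in zip(levels, levels[1:])]

DAYS_PER_MONTH = 21   # assumed trading days per month
N_MONTHS = 1200       # long synthetic history so the effect is easy to see

prices = simulate_daily_prices(DAYS_PER_MONTH * N_MONTHS)
months = [prices[i:i + DAYS_PER_MONTH] for i in range(0, len(prices), DAYS_PER_MONTH)]

# Two ways to turn daily prices into one index level per month.
month_end_levels = [m[-1] for m in months]
month_avg_levels = [sum(m) / len(m) for m in months]

vol_month_end = statistics.stdev(period_returns(month_end_levels))
vol_month_avg = statistics.stdev(period_returns(month_avg_levels))

print(f"volatility from month-end prices:       {vol_month_end:.4f}")
print(f"volatility from monthly average prices: {vol_month_avg:.4f}")
```

The averaged series comes out less volatile even though both are built from exactly the same daily prices. Annual returns computed from January averages should show the same kind of smoothing, though probably to a smaller degree since only one month in twelve is averaged.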

Siegel, on the other hand, uses the CRSP index from 1926 to the present. What’s the CRSP index?!? We’ve never mentioned that one! Without going into details, it is another index provided by another mob. Siegel isn’t necessarily wrong here (remember that the “S&P 500” itself changes over time) but it does mean that it is harder to compare his results to those of someone who is using the “S&P 500” consistently.

These days, Shiller’s data is what most people use when talking about backtesting the “S&P 500”, in large part because it is freely available online.

To summarise, here are the concerns with the “S&P 500” in common usage:

Returns from 1871–1925 are averaged over the month of January. This will “hide” volatility, making it look like less than it actually was.
Returns from 1871–1925 are based on the first edition of the Cowles Commission, which has errors corrected in later editions.
Returns from 1926–1956 are based on a set of just 90 stocks, rather than a broader representation of the market.
Returns from 1957–1988 are based on fixed allocations within sectors.

Next time, we’ll look at what some industrious researchers have done to try to handle these issues, as well as researchers who have tried to extend that data series back past 1871.