When we left off we had begun to gain some confidence that using CAPE in a PMT calculation was a viable idea. At least, it didn’t seem totally terrible. But all we had really done was look at a handful of scenarios and note that, in a qualitative “looks to me” kind of way, many of them had clear benefits and none of them had any definitive drawbacks. Let’s see if we can get a bit more quantitative this time around.
Previously: Combining CAPE with PMT to tame retirement withdrawals, part 1
The first thing we hoped we would get from using CAPE was smoother income. When CAPE goes high, before a crash, we’d withdraw less. When CAPE goes low, during a crash, we’d withdraw more. “Smooth income” is defined in a straightforward way here: for each retirement cohort, the standard deviation of its (real) annual income.
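To make that definition concrete, here is a minimal sketch of the smoothness measure. The function name and the two income paths are made up for illustration, and the choice of population (rather than sample) standard deviation is an assumption; the comparison between strategies comes out the same either way.

```python
from statistics import pstdev


def income_stdev(real_withdrawals):
    """Standard deviation of one cohort's real annual withdrawals.

    Assumption: population standard deviation. Using the sample version
    (statistics.stdev) rescales every cohort by the same factor.
    """
    return pstdev(real_withdrawals)


# Hypothetical real-dollar income paths with the same average income.
bumpy = [40_000, 60_000, 35_000, 65_000, 50_000]
smooth = [48_000, 52_000, 49_000, 51_000, 50_000]

print(f"bumpy:  {income_stdev(bumpy):,.0f}")
print(f"smooth: {income_stdev(smooth):,.0f}")
```

Both paths average $50,000 a year, so the lower number for the second path reflects only how much the income bounces around.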
This already looks promising and checking the mean & median standard deviation confirms it.
mean stdev of income
cape 8044.279715
median stdev of income
There are only 6 cohorts where CAPE doesn’t have a lower standard deviation of income, versus 102 where it does. The 1899 cohort is the “best” for PMT: it had the best relative performance for standard deviation of income.
By contrast, the “best” for CAPE was 1975.
We run into the usual problem with standard deviation — is upside volatility really all that bad? Is it so terrible that you are withdrawing $120,000 a year instead of $80,000 a year? On the other hand, there is a full decade where CAPE is withdrawing substantially more. So while the standard deviation of income isn’t a slam dunk for CAPE — because of our hesitation about upside volatility being unduly punished — it is a strong piece of evidence in favor of CAPE.
Another desired outcome from using CAPE is to avoid crashes early in retirement. Of course, we can’t actually “avoid” them. What we really mean is something more like, “realise we are (probably) in a bubble, so withdraw less money than otherwise, that way when the crash happens we don’t have to cut back as much”.
I think there are two reasons this is a worthwhile goal. First, I think there’s a fair amount of anchoring early in retirement. The amount you withdraw your first year is going to be a number you remember. Withdrawing less than that, regardless of the reason, is going to feel like a kind of failure. Second, when deciding whether to retire we are, unsurprisingly, focused on the income we’ll be able to withdraw in the first few years and deciding whether that’s sufficient. We’re more likely to have concrete plans (“finally go visit my sister in Ireland”).
To test for this we’ll use the Ulcer Index but apply it to our (real) annual withdrawals in the first five years of retirement. (I did some brief sensitivity testing and also tested it with withdrawals in the first ten years of retirement, and it didn’t substantially change any of the results below.)
(Remember, a lower number is better, much like standard deviation.)
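For anyone who hasn’t met the Ulcer Index before, here is a minimal sketch of it applied to withdrawals instead of prices. It is the root-mean-square of drawdowns from the running peak. Expressing drawdowns as fractions rather than percentages is an assumption about scaling, and the two five-year paths are invented.

```python
from math import sqrt


def ulcer_index(values):
    """Root-mean-square of fractional drawdowns from the running peak.

    Assumption: drawdowns are fractions (0.10 means 10% below the peak).
    The conventional price-based version multiplies them by 100.
    """
    peak = values[0]
    squared_drawdowns = []
    for value in values:
        peak = max(peak, value)
        drawdown = (value - peak) / peak  # zero at a new peak, negative below it
        squared_drawdowns.append(drawdown ** 2)
    return sqrt(sum(squared_drawdowns) / len(squared_drawdowns))


# Hypothetical first five years of real withdrawals.
steady = [40_000, 41_000, 42_000, 43_000, 44_000]
falling = [50_000, 45_000, 40_000, 42_000, 44_000]

print(ulcer_index(steady))
print(ulcer_index(falling))
```

A path that only ever rises scores zero. A path that starts high and then gets cut is penalized both for how deep the cuts go and for how long they last, which is exactly the early-retirement pain we are trying to measure.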
CAPE does worse 28 times (22%) and better 102 times (78%). What’s more, when CAPE does do worse, it does worse by less. When PMT wins, it wins by (on average) 0.022. When CAPE wins, it wins by (on average) 0.376.
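The tally above is just a head-to-head comparison, cohort by cohort. A sketch of that bookkeeping, with a hypothetical function name and made-up scores (not the actual cohort results), might look like this:

```python
def head_to_head(pmt_scores, cape_scores):
    """Count wins and average winning margins when a lower score is better."""
    cape_margins = [p - c for p, c in zip(pmt_scores, cape_scores) if c < p]
    pmt_margins = [c - p for p, c in zip(pmt_scores, cape_scores) if p < c]

    def average(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return {
        "cape_wins": len(cape_margins),
        "pmt_wins": len(pmt_margins),
        "cape_avg_margin": average(cape_margins),
        "pmt_avg_margin": average(pmt_margins),
    }


# Hypothetical Ulcer Index scores for four cohorts; the last one is a tie.
pmt_scores = [0.50, 0.10, 0.40, 0.05]
cape_scores = [0.10, 0.12, 0.20, 0.05]

result = head_to_head(pmt_scores, cape_scores)
print(result)
```

Looking at the average margin alongside the win count matters because a strategy could win often but only by a hair, or lose rarely but badly.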
What’s more, most of CAPE’s underperformances are clustered in the late 1910s and early 1920s. Of the 28 underperformances, the top seven all fall in the years 1918–1925. The worst two were 1921 & 1922.
What went wrong? CAPE was very low in that time period, between 4 and 6. Since we use the inverse of CAPE, that means our underlying PMT calculation is using a rate of roughly 17% to 25% (1/6 to 1/4). Our very first withdrawals were very high. Look at those numbers! Over $100,000 from a $1,000,000 portfolio that is invested 50% equities & 50% bonds. The 1920s were very good (well, until 1929) but not quite good enough to live up to that.
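To see why a low CAPE produces such large first withdrawals, here is a rough sketch of the arithmetic. It assumes a 30-year horizon, end-of-year payments, and a rate applied to the whole portfolio. The real calculation blends in the bond half of the portfolio, which this sketch ignores, so treat the output as an upper bound on the effect rather than a reproduction of the numbers above. The function names are mine.

```python
def cape_rate(cape):
    """Expected real return implied by valuations: the inverse of CAPE."""
    return 1.0 / cape


def pmt(rate, years, balance):
    """Level end-of-year payment that exhausts `balance` over `years` at `rate`."""
    if rate == 0:
        return balance / years
    return balance * rate / (1 - (1 + rate) ** -years)


# Assumption: an all-equity rate and a 30-year horizon, for illustration only.
for cape in (4, 5, 6, 16, 30):
    rate = cape_rate(cape)
    withdrawal = pmt(rate, 30, 1_000_000)
    print(f"CAPE {cape:>2}: rate {rate:6.1%}, first withdrawal ${withdrawal:,.0f}")
```

Because the rate moves as 1/CAPE, the swing between a CAPE of 4 and a CAPE of 6 is far larger than the swing between 16 and 30, which is part of why very low values deserve the extra caution.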
So withdrawals fell from stratospheric heights to merely really good numbers. Still, I think we should take the lesson that extreme numbers (at least, extremely low ones) should make us cautious that CAPE is pushing things too far.
When CAPE excelled on the Ulcer Index
CAPE did best with the years 1917, 1916, 1973, 1915, and 1929. These are all cases where it helped smooth out a large crash in the first 5 years (there was a depression in 1920–1921).
Overall, I think the Ulcer Index results are additional strong evidence in support of using CAPE.
Starting off on the wrong foot
One thing we’re concerned about is that CAPE (or plain PMT) might start out telling us to withdraw “too much”, a number that turns out not to be sustainable. The Ulcer Index calculation is one way to look at that. Another way is to calculate the Certainty Equivalent Withdrawals (CEW) of the entire 30-year span and then calculate the ratio of the first year’s withdrawal to the CEW. In practice it would look like this:
- In 1881, our first year’s withdrawal using PMT is $44,587.
- The CEW we calculate for 1881 using PMT is $48,409.
- The ratio is 44,587/48,409 = 0.921
A number that is greater than 1 indicates we started out “too high” and subsequently spent most of our retirement withdrawing less than we started out doing.
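Here is a rough sketch of that ratio. It assumes CEW is computed with a constant relative risk aversion (CRRA) utility function and a risk-aversion parameter of 4. Both are assumptions for illustration, as are the function names and the three-year paths.

```python
from math import exp, log


def cew(withdrawals, gamma=4.0):
    """Certainty Equivalent Withdrawal under CRRA utility.

    Assumption: gamma=4 is an illustrative risk-aversion level. Higher
    values penalize low-income years more heavily.
    """
    n = len(withdrawals)
    if gamma == 1.0:
        # Log utility: the CEW is the geometric mean.
        return exp(sum(log(w) for w in withdrawals) / n)
    mean_power = sum(w ** (1 - gamma) for w in withdrawals) / n
    return mean_power ** (1 / (1 - gamma))


def first_year_ratio(withdrawals, gamma=4.0):
    """Ratio of the first year's withdrawal to the CEW of the whole path."""
    return withdrawals[0] / cew(withdrawals, gamma)


# Hypothetical paths: one starts too high, the other starts low and rises.
started_high = [60_000, 40_000, 40_000]
started_low = [40_000, 60_000, 60_000]

print(first_year_ratio(started_high))
print(first_year_ratio(started_low))

# The 1881 PMT example from the list above.
print(round(44_587 / 48_409, 3))
```

A flat income stream has a CEW equal to that income, so its ratio is exactly 1. The further the first year sits above what the rest of retirement delivered, the further the ratio climbs above 1.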
When PMT’s ratio is greater than 1, it has an average value of 1.25. When CAPE’s ratio is greater than 1, it has an average value of 1.17. When PMT is wrong it is “more wrong” than CAPE.
1969 was the worst year for PMT by this metric.
And 1915 was the worst year for CAPE by this metric.
1968 was a year when they both did poorly on this metric (though PMT did worse).
This metric is less clear cut in favor of CAPE, though the nod still goes to CAPE. CAPE does appear better but the improvement is slight (1.17 vs. 1.25) and there are some scenarios where CAPE does poorly. Though, again, they tend to be scenarios where the CAPE values were quite low (i.e. that 1918–1925 period again).
On to part 3…
This is already lengthy, so we’ll cut it short here.
We looked at three broader metrics: standard deviation of income, Ulcer Index for income, and ratio of the first year’s income to the Certainty Equivalent Withdrawals for the entire period.
All of them showed evidence in support of using CAPE, though the strength of that evidence varied. We also saw that CAPE didn’t always do better. We saw some cases where it did worse than plain vanilla PMT. In particular, when CAPE values were quite low (especially under 10), it seemed that CAPE often led us astray.
That might seem like an academic concern. The last time CAPE was under 10 was 35 years ago in the 1980s. But it should make us question whether CAPE overcorrects when it is “too far” off normal.
Still, given what we’ve seen so far, I think we’re, say, 80% justified in adopting CAPE in our PMT calculations for two goals:
- If we want to smooth our income during bubbles & crashes, CAPE seems to work pretty well for that.
- If we want to use high CAPE10 numbers as an indicator that withdrawals should be tamped down somewhat, to make future cuts less painful, CAPE seems to work for that too.
I’m still not 100% convinced, so I’ll look at a few more things next time. What about when CAPE is especially low? Or especially high? Has there been any kind of regime shift in its usefulness since the 1980s or 1990s? And, finally, I’ll look deeper into “failures” to become comfortable with the various modes of failure.