In “When Should Retirees Retrench?” Gordon Pye introduces his Retrenchment Rule, a PMT-flavored retirement withdrawal scheme that I briefly wrote about previously. As I’ve written about before [here and here], any PMT implementation needs to make assumptions about the rate used and the timespan to work across.
Pye takes an interesting approach to deciding which rate to use: he uses Monte Carlo analysis to try all the rates and then picks the one that works “best” on average.
(The table of his results is reproduced above, at the start of this article.)
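For readers who haven’t met PMT before, here is a minimal sketch of the calculation these schemes are built on. The function name `pmt_due` and the choice of start-of-year payments are my assumptions for illustration; Pye’s paper may set up the timing differently.

```python
def pmt_due(rate, nper, pv):
    """Level payment, taken at the START of each period, that exactly
    exhausts `pv` over `nper` periods if the balance grows at `rate`.

    Assumption: start-of-period (annuity-due) timing.
    """
    if rate == 0:
        # No growth: just split the balance evenly.
        return pv / nper
    # Ordinary annuity payment, discounted one period for start-of-period timing.
    ordinary = pv * rate / (1 - (1 + rate) ** -nper)
    return ordinary / (1 + rate)


# Example: a 65-year-old planning to age 110 (45 years), per 100 of portfolio.
for discount_rate in (0.02, 0.04, 0.08):
    print(discount_rate, round(pmt_due(discount_rate, 45, 100.0), 2))
```

The two inputs that matter are exactly the ones mentioned above: the rate and the timespan. A higher rate or a shorter timespan both push the payment up.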
What counts as “best” for Pye isn’t well defined in his article. It seems to be based on eyeballing the charts and making a (somewhat subjective) judgment. The 8% discount rate gives the highest withdrawals at ages 100, 95, and 90; it isn’t the highest at ages 65–85, but it is still much higher than the standard “withdraw 4%” advice.
I think — based just on Pye’s table reproduced above — you could also make a strong case for a 10% discount rate. Yes, it drops below 4% withdrawals at ages 95 and 100. But not much below 4%. And based on Blanchett, Bernicke, and others we know that retirees’ expenses go down in retirement.
When you squint and look at it, it seems like Pye is doing something similar to McClung’s HREFF formula — withdrawals below 4% are penalized harshly and withdrawals above 4% see diminishing returns.
Pye’s table raises some questions that I thought would be interesting to explore. All Monte Carlo analyses are sensitive to their inputs.
- Pye uses “the S&P 500 for periods starting in the early 1960s and ending in 2006”. He argues that this is reasonably conservative because the mean and standard deviation from this period are worse than for most other time periods.
- Pye uses 100% equities. “Including some fixed income issues in the portfolio has little effect on the results of the simulation.”
We’re going to experiment with different assumptions and see how the 8% Retrenchment Discount Rate (RDR) holds up.
First, let’s see if we can find a good metric we can use for determining which RDR is “best” from a set of possibilities. I hinted above that McClung’s HREFF might work. Let’s try plugging in Pye’s results and see what HREFF thinks.
(Technically, I’m only using the numerator from HREFF here. But that’s sufficient because the denominator would be the same across all of these. That means this is generating a kind of Certainty Equivalent Withdrawal with a floor.)
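To make the shape of such a metric concrete, here is a rough sketch of a floor-adjusted score. To be clear, this is not McClung’s published HREFF formula and it will not reproduce the numbers in this article. The function name `floor_adjusted_score`, the exponent `gamma`, and the `shortfall_penalty` multiplier are all hypothetical, chosen only to mimic the qualitative behavior described above: withdrawals below the floor are punished heavily and withdrawals above it show diminishing returns.

```python
def floor_adjusted_score(withdrawals, floor, gamma=2.0, shortfall_penalty=4.0):
    """Average per-year score of a withdrawal path relative to a floor.

    Illustrative stand-in only (NOT McClung's HREFF):
      - surplus above the floor is rewarded concavely: surplus ** (1 / gamma)
      - shortfall below the floor is penalized linearly, scaled by
        `shortfall_penalty`
    `withdrawals` and `floor` are in percent of the starting portfolio.
    """
    total = 0.0
    for w in withdrawals:
        excess = w - floor
        if excess >= 0:
            total += excess ** (1.0 / gamma)    # diminishing returns
        else:
            total -= shortfall_penalty * -excess  # harsh penalty
    return total / len(withdrawals)


# A made-up path of withdrawals (percent of starting portfolio).
path = [5.0, 5.0, 4.5, 3.8, 3.6]
for floor in (4.0, 3.5, 3.0):
    print(floor, round(floor_adjusted_score(path, floor), 3))
```

Even with this toy version you can see why the choice of floor matters so much: the same path scores very differently as the floor moves.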
First let’s try HREFF with a floor of 4.0 (HREFF-4).
RDR-4 = -0.016
RDR-6 = 1.882
RDR-8 = 3.355
That picks RDR-8 as the winner, just as Pye did. However, RDR-8 wins by a landslide when we use 4% as the floor; nothing else is even close. Something about that just feels off; it should win but not by that much.
The more we drop the floor, the better alternatives to RDR-8 perform. Once the floor drops below 3.3%, RDR-8 is no longer the top performer.
When we drop down to a floor of 3%, RDR-10 has taken a noticeable lead and RDR-6 is close enough you can begin to make a case for it as well.
RDR-4 = 3.518
RDR-6 = 4.563
RDR-8 = 5.089
I will somewhat arbitrarily choose HREFF with a floor of 3.5% as the metric for the rest of this article. That still leaves RDR-8 as the winner but it isn’t miles ahead of the competition.
(Note: You’ll see that I soon run into problems with my choice of a 3.5% floor for HREFF!)
Most of the data I discuss below can be found in the following Google Sheet. Journal articles have to deal with space constraints…luckily I don’t!
Pye's Retrenchment Discount Rate
Pye Discount Rate, 65, 70, 75, 80, 85, 90, 95, 100
2, 3.3%, 3.2%, 3.1%, 3.1%, 3.1%, 3.0%, 3.0%, 3.0%
3, 4.0%…
Confirming the Results
Before I start playing with alternatives, I need to make sure my simulation doesn’t have any mistakes, so I’ll try to mirror Pye’s assumptions as closely as I can and see what I get.
- Normal distribution of returns
- Mean of 0.07
- Standard deviation of 0.16
- 25,000 iterations
- Every discount rate from 2–12, in increments of 1%
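A simulation along these lines can be sketched as follows. This is a simplified reconstruction, not my actual code and not Pye’s: I’m assuming a planning horizon of age 110, start-of-year withdrawals, real returns, and a rule where each year’s withdrawal is the smaller of last year’s withdrawal and the PMT amount the portfolio can currently support. The names `pmt_due`, `simulate_retrenchment`, and `median_by_age` are made up for this example.

```python
import numpy as np


def pmt_due(rate, nper, pv):
    """Start-of-period level payment that exhausts pv over nper periods."""
    if rate == 0:
        return pv / nper
    return pv * rate / (1 - (1 + rate) ** -nper) / (1 + rate)


def simulate_retrenchment(rdr, mean=0.07, stdev=0.16, iterations=25_000,
                          start_age=65, end_age=110, seed=0):
    """Return an (iterations, years) array of withdrawals, as a percent
    of the starting portfolio, under the assumed retrenchment rule."""
    rng = np.random.default_rng(seed)
    years = end_age - start_age
    # Normally distributed annual returns; clip so a year can't lose more than 100%.
    returns = np.maximum(rng.normal(mean, stdev, size=(iterations, years)), -1.0)

    balance = np.full(iterations, 100.0)
    previous = np.full(iterations, np.inf)  # no cap in the first year
    withdrawals = np.empty((iterations, years))
    for year in range(years):
        affordable = pmt_due(rdr, years - year, balance)
        # Retrench (cut) when the PMT amount falls below last year's withdrawal;
        # never raise the withdrawal above last year's.
        current = np.minimum(previous, affordable)
        withdrawals[:, year] = current
        balance = (balance - current) * (1 + returns[:, year])
        previous = current
    return withdrawals


def median_by_age(withdrawals, ages=(65, 70, 75, 80, 85, 90, 95, 100),
                  start_age=65):
    """Median withdrawal across iterations at each requested age."""
    columns = [age - start_age for age in ages]
    return np.median(withdrawals[:, columns], axis=0)


if __name__ == "__main__":
    for rdr_percent in range(2, 13):
        paths = simulate_retrenchment(rdr_percent / 100)
        print(rdr_percent, np.round(median_by_age(paths), 1))
```

Each row printed corresponds to one discount rate, in the same layout as Pye’s table. Whether Pye reports medians or some other summary is another assumption on my part.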
In my tests, RDR-7 has the highest score, making it the winner. However, I tested in increments of 1%, while Pye only shows the even numbers. RDR-8 comes in second place and is the highest-scoring even RDR.
The exact numbers in every cell aren’t the same as in Pye’s paper. That’s a bit of a concern but, even after a brief email exchange with Pye, I’ve not been able to track down the cause of the difference.
Things are pretty close, though, and the overall result mirrors Pye’s: RDR-8 makes a strong showing. I’m going to consider that “close enough” to proceed.
100% US Equities
Now that I’ve established my simulation is (probably) working correctly, we can start tweaking things.
As mentioned earlier, Pye uses US equities from the early 1960s to 2006 to perform his tests. I don’t think he specifies, but I assume he’s using Ibbotson’s data for this.
Let’s try two different data sources.
- “Optimal Withdrawal Strategy for Retirement Portfolios” (2008), Appendix 1, by Blanchett et al.
- Credit Suisse Global Investment Returns Yearbook 2012 (aka “DMS” after the authors Dimson, Marsh, and Staunton)
RDR-8 is the highest rated discount rate using Blanchett et al. However, RDR-7 and RDR-9 are so close, you could make a case for either one as well. Even RDR-6 isn’t really that far off.
With DMS we run into our first problem: HREFF-3.5 is too high a bar. The DMS data, which covers the US from 1900–2012, has a mean of 0.062 and a standard deviation of 0.204; that’s substantially worse than the 0.07 and 0.16 that Pye used. None of the discount rates consistently generates withdrawals high enough to earn a meaningful score on the HREFF-3.5 metric.
How do we pick a winner?
Let’s re-run the comparison with HREFF-2, lowering the bar slightly. RDR-7 now emerges on top, though RDR-6 and RDR-8 are nearly neck and neck with it.
As we move forward, we’re going to see other simulations where HREFF-3.5 fails to show us differences between the discount rates. When that happens, I’ll show HREFF with a floor of 3.5% and a lower number to help us decide which discount rates perform best.
Pye’s analysis, and ours up to now, assumes the retiree holds 100% equities. That is not a very realistic assumption. In his original paper, Pye noted that including fixed income issues didn’t really change the results, but he didn’t show the data that led him to this conclusion. (That’s totally understandable given the space constraints of a journal article.)
Let’s try it for ourselves. I use the historical mean & standard deviation from Blanchett et al. for a 60/40 portfolio and a 40/60 portfolio.
As soon as we add in bonds, we see the optimal RDR start to slide down when using the HREFF-3.5 metric. This might seem to contradict Pye’s suggestion that bonds have only a minimal effect. When we also look at HREFF-2, however, RDR-8 still performs reasonably well.
However, it still looks clear to me that it is not the top performer once we add in bonds.
The future is dangerous
Many people feel that using historical data for our current environment is dangerously misleading, due to the current combination of high valuations and low yields. Blanchett et al. offer a “conservative” set of inputs: they take the historical results and reduce the mean and increase the standard deviation.
Home country bias
So far we’ve only talked about US returns. But it is a big world out there and, thanks to Dimson, Marsh, and Staunton, we have over 100 years of equity and bond returns for over a dozen countries.
Let’s see how the Retrenchment Rule worked for investors around the world.
In most cases, countries experienced equity returns so poor that I needed to switch from HREFF-3.5 to a HREFF with a lower floor in order to see meaningful differences in the discount rates.
In Japan, for instance, I had to use a floor of 0% because their equity returns have been so terrible. Between the aftermath of World War 2 and the recent decades of post-bubble Japan, the 20th century and early 21st century have not been kind to the Japanese equity investor.
A league table of sorts…
Let’s try to make sense of all the data tables we’ve now generated. As a first cut, let’s take the top three for each data set and award them points. First place gets 3, second place 2, third place 1. Then add up all the points.
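The points scheme is simple enough to sketch in a few lines. The function name `league_table` is made up for this example, and the scores in `example` are placeholder values, not results from any of the data sets in this article.

```python
from collections import Counter


def league_table(scores_by_dataset, points=(3, 2, 1)):
    """Award points to the top finishers in each data set and total them.

    scores_by_dataset maps a data set name to {discount rate label: score},
    where a higher score is better.
    """
    totals = Counter()
    for scores in scores_by_dataset.values():
        # Rank the discount rates from best to worst score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        # zip stops after len(points) entries, so only the top three score.
        for label, awarded in zip(ranked, points):
            totals[label] += awarded
    return totals.most_common()


# Placeholder scores purely for illustration (NOT the article's data).
example = {
    "dataset A": {"RDR-6": 1.0, "RDR-7": 3.0, "RDR-8": 2.0, "RDR-9": 0.5},
    "dataset B": {"RDR-6": 2.0, "RDR-7": 1.5, "RDR-8": 3.0, "RDR-9": 0.1},
}
print(league_table(example))
```

Note that this scheme only looks at rank, not at how far apart the scores are, which is part of why I treat the league table as a rough summary rather than a precise measurement.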
Across all of the data sets we looked at, RDR-7 is consistently the best performer. When it didn’t come in first place, it was usually in second place. The only country where it didn’t finish in the top three was the UK.
Does all of this overturn Pye’s conclusion about using a Retrenchment Discount Rate of 8%? Not exactly. Pye only showed even-numbered discount rates and we’re talking about the difference between 7% and 8%. And Pye’s “success metric” for picking a winner wasn’t HREFF in any form. People can reasonably disagree about whether HREFF is the right thing to be using to sort winners and losers.
This league table isn’t exactly the pinnacle of science; it is more of a fun thing to try to provide some perspective on mountains of data. But based on everything I’ve written above, I think there is a stronger case for a 7% Retrenchment Discount Rate than for any other choice.
Where I do differ from Pye is his statement about using a discount rate of “at least 8%”. Based on my results, it seems clear that using a discount rate above 8% is risky. Both the inclusion of bonds and looking at global returns damp the performance of higher Retrenchment Discount Rates. Even if you’re not a doom & gloom type, few believe the next decade or two of portfolio returns are going to match those of the recent American past.