Trinity Study 2019 Edition

When you come to retire, how much of your nest egg will you spend each year?

4%?

Right, I get it, you’ve done your homework on the ‘Trinity Study’ that originated the ‘4% Rule’.

But have you looked at the Trinity Study from another dimension? What would the answer be in that case?

Join me as I take this cornerstone study for FIRE enthusiasts to another dimension and see how the results change in surprising ways. (oooh! Things are gonna get all actuary-like. You betcha!)

The Trinity Study

There has been so much written on the Trinity Study in the personal finance blogosphere that I’m going to pass on a detailed introduction to this famous piece of work. But very briefly – the researchers looked back at what would have happened to your retirement nest egg under different spending amounts and over different historical periods. The parameters of the study were:

  • Historical data from 1926-2009
  • Portfolios composed of US large cap stocks and long term corporate bonds
  • 30 year retirement duration

This produced 54 separate time periods (most of them overlapping), and for each one the researchers looked at the maximum amount that could be withdrawn per year without exhausting your funds. The punchline of the study was that, with a portfolio composed of 75% stocks and 25% bonds, the maximum withdrawal rate was at least 4% a year in every period. This is where the idea of 4% as a “safe withdrawal rate” comes from.

Said another way: as long as you don’t withdraw more than 4% a year then, according to the Trinity Study, your funds should be safe from running out.
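To make the rule concrete, here is a minimal sketch of the bookkeeping behind a "did the money last?" test. It assumes real (inflation-adjusted) returns, spending fixed as a fraction of the starting balance, and withdrawals taken at the start of each year. The function name `survives` and the flat return figures are made up for illustration and are not taken from the Trinity Study.

```python
def survives(real_returns, withdrawal_rate, start_balance=1.0):
    """Return True if the portfolio funds every annual withdrawal.

    Assumptions (illustrative only): returns are real, spending is a fixed
    fraction of the starting balance, and cash is taken at the start of
    each year.
    """
    balance = start_balance
    spend = withdrawal_rate * start_balance
    for r in real_returns:
        balance -= spend          # take this year's spending money
        if balance < 0:           # ran out before the period ended
            return False
        balance *= 1 + r          # whatever is left earns the year's return
    return True


# Hypothetical flat markets, not historical data.
no_growth = [0.0] * 30
steady_growth = [0.05] * 30

print(survives(no_growth, 0.04))      # 30 withdrawals of 4% exceed the starting pot
print(survives(steady_growth, 0.04))
```

With no growth at all, 4% a year cannot last 30 years, so the rule only works because the portfolio is expected to earn something along the way.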

Obvious Questions

At this point the obvious questions you should ask are:

  • What about testing this further back in time?
  • What happens with different investment strategies?
  • What about different time periods?

Other writers (for example Big ERN) have addressed these issues comprehensively and this is a pretty well trodden path.

And honestly, you really didn’t want me to answer the obvious questions, did you? So we’re going to break through the Matrix and look at the problem in a different way.

Another Dimension

Let’s break some rules.

One of the issues with the Trinity Study is that it is based on very little data and the time periods are overlapping, so we don’t even have 54 independent observations. We therefore need a way to extend the data.

So… I’m going to construct new 30 year retirement periods.

Uh, what?

I’m going to take stock and bond data stretching back to 1802 and randomly take 30 of those returns to make a new retirement period.

Hang on, what?

This new retirement period will be composed of actual historical data but in a random order. And get this bit: I’m going to allow replacement, so that any particular return could be chosen twice or more!

Oh man, this is crazy actuary sh!t.

Stay with it, this is a completely respectable method suggested by one of the most important economists of the modern age.

Paul Samuelson

Paul Samuelson was the first American to win the Nobel prize for economics and is generally reckoned to be the most influential economist of the later 20th century. He proposed a thought experiment:

What if you wrote down the annual historical stock returns from the last 150 years on 150 separate pieces of paper and put them in a hat? Then randomly pull 30 pieces of paper out of the hat to represent a new thirty year historical period, but crucially, put each piece of paper back in the hat after each draw.
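The hat experiment is easy to sketch in a few lines of Python. This is only an illustration of drawing with replacement: the ten numbers in `hat` are made-up placeholders rather than real market returns, and `draw_retirement_period` is a hypothetical helper name.

```python
import random


def draw_retirement_period(hat, years=30, rng=None):
    """Pull `years` slips out of the hat, putting each one back after the draw."""
    rng = rng or random.Random()
    return [rng.choice(hat) for _ in range(years)]


# Placeholder slips of paper: made-up annual returns, NOT historical data.
hat = [0.12, -0.08, 0.21, 0.05, -0.25, 0.17, 0.03, -0.02, 0.30, 0.09]

period = draw_retirement_period(hat, years=30, rng=random.Random(42))
print(len(period))                     # one return per retirement year
print(len(set(period)) < len(period))  # repeats show replacement at work
```

Because each slip goes back in the hat, the same return can turn up several times in one 30 year period, and nothing ties one year's draw to the next.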

At this point you enter a different dimension where an important hypothesis of the market is killed.

Mean-Reversion of Stocks

Most people believe that stocks will mean revert over time. In other words after an extended down market, the market will somehow bounce-back, and over the long term the return on stocks will stay healthy.

There is no doubt that we have a huge amount of compelling historical evidence that stocks exhibit this kind of behavior. In every 20 year period stocks have outperformed bonds, and they always come back after a crash. The resilience of the US equity market is quite extraordinary, and this is one reason that makes lump sum investing preferable to dollar cost averaging.

But think of this: since 1802 there have only been 12 non-overlapping 20 year periods. Moreover, the stock market between 1802 and the 1870s was extremely under-developed and basically consisted of only a few railroad stocks, so consider that there have only been 8 non-overlapping 20 year periods since the 1870s.

That’s really not a lot of evidence for the bounce-backability of equities.

If your surgeon proposed a very delicate life-saving procedure that could result in death, but happily announced that there had been eight previous successful operations, would you go for it? Or would you want to see more successful trials before subjecting yourself to it?

So by creating new 30 year retirement sequences of returns we are creating market scenarios with historically accurate equity volatility and returns, but we are simply discarding the rainbows and unicorns assumption that the equity market always comes good in the end. If at this point you are becoming red-faced with disgust at my cavalier attitude to simulating retirement periods, then perhaps visit the technical notes to get some further background before making your judgement.

Results

So… are you still with me? Are we gonna head into another dimension on this one?

We will look at 30 year periods, since that was the period of the original Trinity Study.

The proprietary AoF model cranked out 1,000 new 30 year simulations and I tested what the safe withdrawal rate would have been in each one. Here are my results on the top, with the actual Trinity Study results below for comparison. I’ve ordered the scenarios from lowest to highest withdrawal rate and shown the number of simulations in each 1% bucket.
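The AoF model itself isn't published here, so the following is only a rough sketch of how an experiment of this shape could be run, not the actual model. It assumes a single blended real return per year (rather than separate 75% stock and 25% bond series), withdrawals at the start of each year, and a `hat` of made-up placeholder returns. `max_withdrawal_rate` is a hypothetical helper that finds the spending rate leaving exactly zero at the end of the period.

```python
import random
import statistics


def max_withdrawal_rate(real_returns):
    """Largest fixed real withdrawal, as a fraction of the starting pot,
    that leaves exactly zero at the end (cash taken at the start of each year)."""
    growth = 1.0   # what $1 invested at retirement has grown to by this year
    cost = 0.0     # money needed at retirement to fund $1 of spending every year
    for r in real_returns:
        cost += 1.0 / growth
        growth *= 1 + r
    return 1.0 / cost


# Placeholder returns: made up for illustration, NOT the 1802 onwards data set.
hat = [0.12, -0.08, 0.21, 0.05, -0.25, 0.17, 0.03, -0.02, 0.30, 0.09]

rng = random.Random(2018)
swrs = []
for _ in range(1000):
    period = [rng.choice(hat) for _ in range(30)]   # draw with replacement
    swrs.append(max_withdrawal_rate(period))

print(f"median {statistics.median(swrs):.1%}, minimum {min(swrs):.1%}")
```

The numbers printed depend entirely on the placeholder returns, so they should not be compared with the results in this post.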

[Chart: My Results from Another Dimension]
[Chart: Original Results from Trinity Study]

What’s the first thing you notice?

The Trinity Study has laughably few results. It’s perhaps amazing that the most important question for wannabe retirees (how much can I spend in retirement?) is being answered by a study with so few data points.

I know that there have been extensions to this study, but the lack of non-overlapping historical periods reduces the effectiveness of the results.

What’s the second thing you notice?

There is a much greater spread of results, from the very high, to the disturbingly low.

In the alternative dimensions there are scenarios with a SWR in excess of 20% and some scenarios with a SWR of less than 2%! In these alternative dimensions stocks can genuinely result in a sustained period of great returns, or a sustained bear market. Without an automatic reversion to the mean the market can stay down for a long time.

Let’s summarize the results so we can compare them.

                        Alternative Dimension (75% stocks)   Trinity Study (75% stocks)
Number of simulations   1,000                                56
Median SWR              7.0%                                 6.3%
Minimum SWR             1.8%                                 4.0%

What’s a Safe Withdrawal Rate in an Alternative Dimension?

Did you see in the table above that over all 1,000 simulations the minimum SWR was a buttock-clenchingly low 1.8%?

It’s tempting to dismiss that as too low and resulting from a crazily harsh sequence of returns that could only happen in another dimension. But let’s look at this single scenario a bit more closely.

In the chart below I have taken the single simulation that generated the SWR of 1.8% and shown the annual equity returns.

[Chart: Stock Returns in Bad Scenario]

You’ll see that there is a brutal 6 year bear market from years 6 to 12, but on the other hand there are plenty of years with returns in excess of 10%. And remember that is 10% real returns – not bad!

But what I think is the real nail in the coffin for this retiree cohort is the big drawdown of over 20% in the first year of retirement. As we know from sequence of returns risk, a big drawdown early in your retirement can be deadly without remedial action.

I’m going to guess that if those first two years had higher returns then the SWR would improve from 1.8% to around 2.5%.
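Sequence of returns risk can be shown with a toy example: take the same set of annual returns, put the crash at the start or at the end, and compare what is left after 30 years of spending. The returns below are invented (one 25% crash plus 29 years at 6% real), the 4% spending rate is just a convenient test value, and `ending_balance` is a hypothetical helper.

```python
def ending_balance(real_returns, withdrawal_rate, start_balance=1.0):
    """Balance left after fixed real withdrawals taken at the start of each year.

    A negative result means the money ran out along the way.
    """
    balance = start_balance
    spend = withdrawal_rate * start_balance
    for r in real_returns:
        balance = (balance - spend) * (1 + r)
    return balance


# Made-up returns: identical years, only the position of the crash differs.
crash_first = [-0.25] + [0.06] * 29
crash_last = [0.06] * 29 + [-0.25]

early = ending_balance(crash_first, 0.04)
late = ending_balance(crash_last, 0.04)
print(f"crash in year 1:  {early:.2f} x starting pot")
print(f"crash in year 30: {late:.2f} x starting pot")
```

Without withdrawals the two orderings would end in exactly the same place. It is the spending during the early slump that makes the crash-first retiree worse off.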

What Else Have We Learned?

I was expecting some pretty bad scenarios to be honest, and correspondingly I expected some pretty high SWR scenarios. That’s just my actuarial spidey-sense tingling.

But I did not expect the SWR values to be so high in general. In the chart I showed earlier you can see a big clump of results in the 5%-8% range. Over half the scenarios produced SWR values in the range 5.5% to 8.8%.

In addition, 950 out of the 1,000 scenarios produced a SWR in excess of 3.8%!

This is saying that if you live in another dimension (and by this point you might think that I do), a 3.8% withdrawal rate is enough to be 95% certain that your retirement funds will last. You might want to be 100% certain on this important issue, but I think most of us believe in some mean reversion of equities and so this 95% level of certainty could be acceptable.
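Reading a "95% certain" rate off a set of simulated results is just a matter of sorting them and skipping the worst 5%. Here is a minimal sketch of that step. The twenty SWR values are made up for illustration (they are not my simulation output), and `rate_met_by` is a hypothetical helper using a simple order-statistic rule, which is only one of several ways to define a percentile.

```python
def rate_met_by(swrs, fail_percent=5):
    """Withdrawal rate that all but the worst `fail_percent`% of scenarios sustained."""
    ordered = sorted(swrs)
    allowed_failures = len(ordered) * fail_percent // 100
    return ordered[allowed_failures]


# Made-up safe withdrawal rates for 20 imaginary scenarios.
swrs = [0.071, 0.052, 0.064, 0.039, 0.088, 0.047, 0.102, 0.058, 0.075, 0.026,
        0.066, 0.081, 0.055, 0.093, 0.061, 0.044, 0.070, 0.120, 0.049, 0.078]

rate = rate_met_by(swrs)
sustained = sum(s >= rate for s in swrs)
print(f"{sustained} of {len(swrs)} scenarios sustained {rate:.1%}")
```

With 1,000 simulations the same rule skips the 50 lowest results and reports the next one up.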

Final Thoughts

What are my final thoughts on this?

I think the 4% rule from the Trinity Study is simply an accident that resulted from analyzing too few scenarios. Had the Trinity Study been done in another dimension the lowest safe withdrawal rate would likely have been lower and this might have changed the course of hundreds of early retirements.

It also seems that the results point the way to higher withdrawal rates being acceptable. Even a withdrawal rate north of 5% would have been successful most of the time in the alternative dimension. When you combine this relatively high withdrawal rate with some additional income in retirement (e.g. from a part-time job) and the ability to tighten your belt on spending if the market crashes, then I would be comfortable with a withdrawal rate in excess of 4%.

What did you think of the Trinity Study in another dimension? Did you think that removing the mean reversion of stock returns provided an interesting insight into this? Does the very low SWR worry you, or were you cheered up by the number of scenarios that resulted in a relatively high SWR? Comment below!

I’ve written some more technical notes on this subject here.

The following is a guest post from the Actuary on Fire.

19 thoughts on “Trinity Study 2019 Edition”

  1. Pingback: The Trinity Study In Another Dimension - actuary on FIRE

  2. Pingback: May 4th, 2018 Features - Rockstar Finance :: Curating the best of money and personal finance

  3. All of the market return scenarios we’ve seen in the past are linked to the larger growth trajectory of the economy over the decades; that’s why historically equities have “recovered” and “reverted to the mean” of overall outperformance. The random walk of past returns still assumes a mean of equities outperformance, and the wider distribution of results can be expected.

    For me, mean reversion is real. But the real question is what is the mean going forward: Equity performance is entirely dependent on real growth (or not) of the larger economy.

    The evaluation of SWR really boils down to this, will the underlying trajectory of the economy going forward for the next 30-50 years match the historical trajectory of the past?
    That’s the real question, I believe.

    1. I think if the underlying trajectory is for real economic growth (or real GDP) then you can expect equities to outperform, but there is no *entitlement* for them to outperform. One of the points I make in the notes is that mean reversion is not a given, but it is a reasonable expectation for a growing economy.
      Thanks for reading!

  4. Nice Monte Carlo analysis, similar to Vanguard’s retirement calculator. The one risk is that stocks do tend to exhibit momentum, even if reversion to the mean is not a sure thing. So some correlation year to year should be considered. But I actually prefer this methodology due to the limited data set of other options. Also it should give a more conservative estimate, as multiple negative years in a row are more likely in a random Monte Carlo. It would seem prudent to err on the side of caution for such an important question.

    1. I’m glad you mentioned momentum, since that seems to contradict mean reversion and has always confused me! Momentum (positive serial correlation) is an observed phenomenon, and mean reversion (negative serial correlation) is also an observed phenomenon. So I don’t know what to make of that contradiction!
      I also like this method, it’s quite clear, so I might do some more analysis with this. I want to see how asset allocation might change in a new dimension. (I suspect not as much as you might think)
      Thanks, as always, FTF for visiting.

  5. Superb article AoF. I was reading on the Harry Browne Permanent Portfolio, which is a low risk low return portfolio of astonishing stability. It is so stable in fact, due to its low risk, that it allows a greater SWR up to and over 5%. I think your original point dominates “Stocks are Risky” and in most modern portfolios therefore risk dominates the portfolio. You wind up paying for a measly little bit of return with a whole lot of risk. In the HBPP, stocks in no way dominate, so their risk in no way dominates but they do make a meaningful contribution to return over time. In the HBPP there are always plenty of other resources beside stocks from which to get your lunch money so you never have to sell low. In fact you are forced to buy low since yearly re-balancing forces that. I’m not hawking the HBPP but I am hawking the idea of truly diversified non-correlated risk adjusted portfolios. VTI is not diversified. VTI is stocks and purely carries the risk of stocks.

    1. You raise a great theme of whether I should sacrifice expected return for lower volatility. Your point being that a portfolio that does not drop so much in a drawdown has less ground to gain, and so with compounding will come out ahead. I’m bought into that… I think. I have certainly seen that work in my corporate life.
      But I need to do some modeling on the SWR aspect of that…

      Hey – thanks for dropping by Gasem!

  6. Interesting post once again AOF. I am paranoid so I have over-saved. I also think it makes more sense to base your draw down on a percentage of portfolio rather than 4% plus inflation.

  7. Sorry (not sorry) to be a downer, but this is poor and misleading work. Time to go back to actuarial school.

    1. Example? You need to provide more data than just saying this is “poor and misleading”. In what ways?

  8. Didn’t the Trinity study consider a SWR successful as long as the portfolio did not reach $0 after 30 years? If so, I’d love to see your simulations using a 60+ year time horizon to better approximate current life expectancies for those pondering early retirement.

    1. Don’t forget that ERN at Early Retirement Now did a load of back-testing with different time periods and different investment strategies, so you should definitely look at that. He also just used the historical periods rather than my crazy method of jumbling up the returns.

      I might extend my work to look at different periods, but I’m just not sure it is worth it. I did this to test whether you get much insight by removing the mean-reversion condition on equities. It actually didn’t make a whole lot of difference!

  9. This is a great addition to all of the Trinity related analysis. Thanks for doing it.

    How many of those sub 4% SWRs are related to an early year market reduction like the one you highlighted? The reason I ask is that a lot of people place great faith in a 2-3 year cash pot to avoid that early sequence of returns risk. If you had a 2 year cash cushion and were able to avoid those first two years in all those scenarios, what does the distribution of subsequent SWRs look like?

  10. Mark, great question; I shall have a look. You’re starting to get into investment strategies to mitigate sequence of returns risk. You’re suggesting something a bit like the Kitces Bond Tent, where you hold more bonds (or cash) in the early years. I want to do a deep dive post on this topic at some point.
    Stay tuned!

  11. Very interesting, raises several thoughts for me:

    1) I’m always curious when going back as far as 1802 or 1870 if the distribution is substantially different for the more recent history. Especially if you’re comparing to the Trinity study that started at 1926, how many observations from 1802 to 1925 fall outside of the range implied by the Trinity study. And do those observations show up in the results that have particularly low SWR? Especially when sampling with replacement, it’d be interesting to see what the difference would be if you’d matched the Trinity study timeline.

    2) I’ve always been curious why people can do a sequential Monte Carlo study as a time series forecast without adding in forecast error. I guess the sampling with replacement gets closer to creating a broader distribution but it still is not explicit about error in forecasting future variance. Someday I’m going to dig back through my Time Series and Stochastic Process books to see if I can figure out why/if this should get left off.

    3) Regardless of your results in the Monte Carlo, you know the support of the distribution starts at -100% though it may have infinitesimal probability. I may be tempted to fit data to a parametric distribution and sample from it to give some probability to market drops that exceed observed history, but I’d be surprised if it moved the needle on a 95th percentile estimate by much. Still may drop the 3.8% result you got a bit lower. I’ve been surprised in the past when adding information about the support of a distribution.

    4) What really annoys me about the sequential Monte Carlo that was formulated in the Trinity study is that it under-samples the most recent years which should in theory contain the most information about future conditions. At least that’s my understanding of it. I don’t see that assumption called out much. I may consider trying a local bootstrap or something similar to maintain some information about sequence of returns and still allow for more samples to flesh out the distribution a bit.

    After saying all that, I think these are interesting because of the technical details. I’m not sure the results will change substantially. I think the 4% rule gives people something to measure their progress against. But it shouldn’t be taken as the final answer. It’s weird to aim at a 95% certainty anyway. As if everyone is going to just shrug and say “I guess we hit the 5% realm” if they run out of money… The Trinity study viewed differently says a 4% withdraw rate can fail. Perhaps the worst thing that came out of that study was the word “safe” in front of withdraw rate.
