Retracing Intermediate Quantities in the Mortality Model of Rohde et al 2004

This note was motivated by, and is part of, my goal to better understand the modeling work of Rohde et al. in their 2004 paper "Modelling the recent common ancestry of all living humans". My initial steps are described in this blog post.

In this post, I am focusing on the mortality model used in Rohde et al.'s work. There were a few more steps than I had realized (a reflection of my own unfamiliarity with actuarial methods), and I wanted to dive in a little bit.

The goal is to verify for myself the following statement from Rohde et al. for their model:

"...the death rate, β, was raised to 12.5 for the purposes of the model. This produces an average life span of 51.8 for those who reach maturity."

Based on the derivation below for the needed quantities, I am currently getting 51.866.

Note: This is a work in progress. I could be off by one in the equations below, depending on whether "death at age s" or "death by age s" is the right interpretation.

Mortality Model in Rohde et al

In Rohde's model, the probability $p(s)$ that an individual (referred to as a "sim" in their paper) dies at age $s$ (in years), conditional on not having died before age $s$, is assumed to follow a "discrete Gompertz-Makeham form":

$p(s) = \alpha + (1-\alpha) \exp\{ (s - maxAge)/\beta \} $

[Figure: "Survival Tree" for the mortality model, showing the probability of dying at each age given that one did not die before age $s$ ($p(s)$ is defined in the text).]

where the parameters and the values used are shown in the following table.

| Parameter | Definition | Value in Rohde et al 2004 |
|---|---|---|
| $\alpha$ | Accident rate | 0.01 |
| $maxAge$ | Maximum lifespan (years) | 100 |
| $\beta$ | Death rate | 12.5 |
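
As a rough sketch (not taken from the paper), the formula for $p(s)$ could be coded in Python as follows. The function name `death_prob` and the constant names are hypothetical, chosen here only for illustration; the parameter values are the ones in the table.

```python
import math

# Parameter values from the table (Rohde et al 2004).
ALPHA = 0.01    # accident rate
MAX_AGE = 100   # maximum lifespan, in years
BETA = 12.5     # "death rate" parameter

def death_prob(s, alpha=ALPHA, beta=BETA, max_age=MAX_AGE):
    """Probability of dying at age s, given no death before age s.

    Hypothetical helper: a direct transcription of the discrete
    Gompertz-Makeham form quoted in the text.
    """
    return alpha + (1.0 - alpha) * math.exp((s - max_age) / beta)

if __name__ == "__main__":
    # Print the conditional death probability at a few ages.
    for age in (0, 16, 50, 100):
        print(age, death_prob(age))
```

Note that the formula gives $p(maxAge) = \alpha + (1-\alpha) = 1$, so under this form no sim survives past $maxAge$.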

While in Rohde's model you can use this formula to determine the probability that a "sim" of a given age dies, you can also calculate other summary statistics without need for simulation.

But you have to be careful.

This is because, when calculating any statistic, you must first specify the age at which you are starting.

The "survival tree" figure above, inspired by that in the Wikipedia page on life expectancy, shows how the mortality probabilities are used sequentially.

If a sim has lived to age $s$, then the expected lifespan given that they have lived to that age $s$ is

$$ \begin{align*} Expected\ Lifespan\ if\ live\ to\ s &= \sum_{k=s}^{MaxAge} k\ *\ prob(die\ at\ age\ k\ given\ that\ survive\ to\ age\ s) \\ &= \sum_{m=0}^{MaxAge-s} (s+m)\ *\ prob(die\ at\ age\ (s+m)\ given\ that\ survive\ to\ age\ s) \\ &= s * \sum_{m=0}^{MaxAge-s} prob(die\ at\ age\ (s+m)\ given\ that\ survive\ to\ age\ s) \\ & \ \ \ \ + \sum_{m=0}^{MaxAge-s} m \ *\ prob(die\ at\ age\ (s+m)\ given\ that\ survive\ to\ age\ s) \\ &= s + \sum_{m=0}^{MaxAge-s} m \ *\ prob(die\ at\ age\ (s+m)\ given\ that\ survive\ to\ age\ s) \end{align*} $$

where the sum that is multiplied by $s$ equals one because the sim must die at one of those ages (the formula gives $p(maxAge) = 1$, so no sim survives beyond $maxAge$).

The (simple) pattern for the needed probabilities can be obtained by referring to the "survival tree" figure. For example,

$$ \begin{align*} prob(die \ at\ age\ s\ given\ that\ did\ not\ die\ before\ age\ s) &= p(s) \\ prob(die \ at\ age\ s+1\ given\ that\ survive\ to\ age\ s) &= (1-p(s)) * p(s+1) \\ prob(die \ at\ age\ s+2\ given\ that\ survive\ to\ age\ s) &= (1-p(s)) * (1-p(s+1)) * p(s+2) \\ prob(die \ at\ age\ s+3\ given\ that\ survive\ to\ age\ s) &= (1-p(s)) * (1-p(s+1)) * (1-p(s+2)) * p(s+3)\\ &\vdots \\ \end{align*} $$

or, more generally, $$prob(die\ at\ age\ s+m\ given\ that\ survive\ to\ age\ s) = p(s+m) * \prod_{n=0}^{m-1} (1-p(s+n)) $$ (with the empty product for $m=0$ taken to be one), which can be used to obtain the simple-to-calculate

$$ \begin{equation} Expected\ Lifespan\ if\ live\ to\ s = s + \sum_{m=0}^{MaxAge-s} m\ *\ p(s+m) * \prod_{n=0}^{m-1} (1-p(s+n)) \end{equation} $$

Since the $m=0$ term is multiplied by zero, the sum could equally well start at $m=1$.
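
The expected-lifespan formula can be sketched in Python along these lines. This is a minimal sketch under the same reading of "die at age $s$" used in the derivation, not the code from the paper; the names `death_prob`, `death_age_distribution` and `expected_lifespan` are hypothetical.

```python
import math

def death_prob(s, alpha=0.01, beta=12.5, max_age=100):
    """p(s): probability of dying at age s, given no death before s."""
    return alpha + (1.0 - alpha) * math.exp((s - max_age) / beta)

def death_age_distribution(s, alpha=0.01, beta=12.5, max_age=100):
    """Return [prob(die at age s+m | survive to s)] for m = 0..max_age-s."""
    probs = []
    survive = 1.0  # running product of (1 - p(s+n)) for n < m
    for m in range(max_age - s + 1):
        p = death_prob(s + m, alpha, beta, max_age)
        probs.append(p * survive)
        survive *= 1.0 - p
    return probs

def expected_lifespan(s, alpha=0.01, beta=12.5, max_age=100):
    """s plus the expected number of further years lived."""
    probs = death_age_distribution(s, alpha, beta, max_age)
    return s + sum(m * q for m, q in enumerate(probs))

if __name__ == "__main__":
    # The conditional death probabilities should sum to one.
    print(sum(death_age_distribution(16)))
    # Expected lifespan for those who reach age 16.
    print(expected_lifespan(16))
```

Keeping a running survival product avoids recomputing the product for every $m$, and printing the sum of the distribution is a quick check that the probabilities add up to one.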

After a bit of tedious thrashing, I think this might be correct. I'm sure this rederivation is trivial for those familiar with actuarial concepts and techniques.

Checking Average Lifespan Reported in Rohde et al

So now I can use the formula for expected lifespan above to check the following statement in Rohde et al for the mortality model used:

"...the death rate, β, was raised to 12.5 for the purposes of the model. This produces an average life span of 51.8 for those who reach maturity."

Plugging the parameter values reported in Rohde et al into the equation for expected lifespan, for those who reach maturity ($s=16$), I get a value of 51.866. Is that the same as 51.8? I don't know whether they truncated instead of rounding, or whether I am wrong. It seems pretty close.
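
To make the truncation-versus-rounding question concrete, here is a small check of what each convention does to my value of 51.866 at one decimal place. It says nothing about which convention the paper actually used.

```python
import math

value = 51.866  # the expected lifespan obtained above for s = 16

rounded = round(value, 1)                # round to one decimal place
truncated = math.floor(value * 10) / 10  # drop everything past one decimal

print(rounded, truncated)  # prints: 51.9 51.8
```

So truncation would reproduce the reported 51.8, while rounding would give 51.9.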

Note that when I run the population model itself, the average lifespan for those reaching maturity in a given simulation is consistently slightly higher than 51.8. This could be due to any number of trivial issues with my implementation, and I hope to look at that later.
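
One way to separate the mortality draw from the rest of a population model is to simulate it on its own. The sketch below is not my population model and not the paper's code: it only walks the survival tree year by year from age 16 and averages the ages at death. The function names, sample size and seed are hypothetical choices for illustration.

```python
import math
import random

def death_prob(s, alpha=0.01, beta=12.5, max_age=100):
    """p(s): probability of dying at age s, given no death before s."""
    return alpha + (1.0 - alpha) * math.exp((s - max_age) / beta)

def sample_death_age(start_age, rng, max_age=100):
    """Walk the survival tree one year at a time until the sim dies."""
    age = start_age
    # Survive the current year with probability 1 - p(age).
    while age < max_age and rng.random() >= death_prob(age, max_age=max_age):
        age += 1
    return age

def mean_simulated_lifespan(start_age=16, n_sims=20_000, seed=0):
    """Average age at death over n_sims independent sims."""
    rng = random.Random(seed)
    total = sum(sample_death_age(start_age, rng) for _ in range(n_sims))
    return total / n_sims

if __name__ == "__main__":
    print(mean_simulated_lifespan())
```

If a standalone check like this lands near the analytic value while the full model does not, the discrepancy would more likely sit in how ages or deaths are counted inside the model than in the mortality formula itself.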

Postscript: An Alternative Form?

I also noticed that the following equation seems to yield the same result for expected lifespan. There must be some simple algebraic collapsing going on... or I am wrong.

$$ \begin{equation} Expected\ Lifespan\ if\ live\ to\ s = s + \sum_{m=0}^{MaxAge-s} \prod_{n=0}^{m} (1-p(s+n)) \end{equation} $$
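
One way to see why the two forms agree, assuming I have the indexing right, is the standard tail-sum identity for expectations. The symbols $q_m$, $S_m$ and $N$ are shorthand introduced only for this sketch: $q_m$ is the probability of dying at age $s+m$ given survival to age $s$, $S_m$ is the probability of not having died before age $s+m$, and $N = MaxAge - s$.

$$ \begin{align*} S_m &= \prod_{n=0}^{m-1} (1-p(s+n)), \qquad S_0 = 1 \\ q_m &= p(s+m) * S_m = S_m - S_{m+1} \\ \sum_{m=0}^{N} m * q_m &= \sum_{m=0}^{N} m * (S_m - S_{m+1}) \\ &= \sum_{m=1}^{N} S_m - N * S_{N+1} \\ &= \sum_{m=0}^{N-1} \prod_{n=0}^{m} (1-p(s+n)) - N * S_{N+1} \end{align*} $$

Because $p(MaxAge) = \alpha + (1-\alpha) = 1$, both $S_{N+1}$ and the extra $m=N$ term $\prod_{n=0}^{N} (1-p(s+n))$ in the alternative form are zero, so the two sums coincide.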
