
Gompertz distribution

The Gompertz distribution is defined equivalently by:

Probability density function

\[ f(t) = k e^{\alpha t} e^{-\frac{k}{\alpha}\left(e^{\alpha t} - 1\right)} \]

Survival function

\[ S(t) = e^{-\frac{k}{\alpha}\left(e^{\alpha t} - 1\right)} \]

Hazard function

\[ h(t) = k e^{\alpha t} \]
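The three definitions can be checked against each other numerically. The following is a minimal Python sketch, not a reference implementation: the function names and the parameter values \(k = 0.01\), \(\alpha = 0.1\) are assumptions chosen for illustration. It computes \(f\) as \(h \cdot S\) and prints \(f/S\) next to \(h\) so the two can be compared.

```python
import math

def gompertz_hazard(t, k, alpha):
    """h(t) = k * exp(alpha * t)."""
    return k * math.exp(alpha * t)

def gompertz_survival(t, k, alpha):
    """S(t) = exp(-(k / alpha) * (exp(alpha * t) - 1))."""
    # expm1(x) computes exp(x) - 1 accurately for small x.
    return math.exp(-(k / alpha) * math.expm1(alpha * t))

def gompertz_pdf(t, k, alpha):
    """f(t) = h(t) * S(t), which is the density given in the text."""
    return gompertz_hazard(t, k, alpha) * gompertz_survival(t, k, alpha)

# Illustrative parameter values (assumed, not taken from the text).
k, alpha = 0.01, 0.1
for t in (0.0, 10.0, 40.0):
    f = gompertz_pdf(t, k, alpha)
    S = gompertz_survival(t, k, alpha)
    h = gompertz_hazard(t, k, alpha)
    print(f"t={t:5.1f}  f={f:.6f}  S={S:.6f}  h={h:.6f}  f/S={f / S:.6f}")
```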

Gompertz distribution. Motivation.

A survival function can be expressed as:

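\[
\begin{aligned}
S(t) &= \exp\left(-\int_{t_0}^{t} h(x)\,dx\right) && \text{(where } h \text{ is the hazard function)} \\
&= \exp\left(-y(t)\right) && \text{(thus defining } y\text{)}
\end{aligned}
\]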

The transformation y is constrained so that all 3 of the following statements hold:

\[ y(t_0) = 0, \qquad y(\infty) = \infty, \qquad y'(t) \ge 0 \]

From here on, we are concerned with the situation where \(t_0 = 0\).

Gompertz (1825) assumed that y took the form:

\[ y(t) = \frac{k}{\alpha}\left(e^{\alpha t} - 1\right). \]

Gompertz describes the motivation in detail, relating it to a geometric progression of deaths within long fixed-length periods. The transformation satisfies the constraints above, on the condition that \(k > 0\). That \(k\) must be positive follows from the hazard function \(h(t) = y'(t) = k e^{\alpha t}\) being non-negative.
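The relation between the hazard function and \(y\) can be sanity-checked by integrating \(h(x) = k e^{\alpha x}\) numerically from 0 to \(t\) and comparing with the closed form \(\frac{k}{\alpha}(e^{\alpha t} - 1)\). This is a rough sketch: the trapezoidal rule, the step count, the function names and the parameter values are all assumptions made for the example.

```python
import math

def hazard(t, k, alpha):
    """Gompertz hazard h(t) = k * exp(alpha * t)."""
    return k * math.exp(alpha * t)

def cumulative_hazard_numeric(t, k, alpha, steps=10_000):
    """Approximate y(t), the integral of h over [0, t], by the trapezoidal rule."""
    if t == 0:
        return 0.0
    dx = t / steps
    total = 0.5 * (hazard(0.0, k, alpha) + hazard(t, k, alpha))
    for i in range(1, steps):
        total += hazard(i * dx, k, alpha)
    return total * dx

def cumulative_hazard_closed(t, k, alpha):
    """Closed form y(t) = (k / alpha) * (exp(alpha * t) - 1)."""
    return (k / alpha) * math.expm1(alpha * t)

# Illustrative parameter values (assumed, not taken from the text).
k, alpha, t = 0.01, 0.1, 30.0
y_num = cumulative_hazard_numeric(t, k, alpha)
y_closed = cumulative_hazard_closed(t, k, alpha)
print(f"numeric y(t) = {y_num:.8f}, closed form y(t) = {y_closed:.8f}")
print(f"S(t) = exp(-y(t)) = {math.exp(-y_closed):.8f}")
```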

Gompertz distribution. Properties.

Below are some properties of the Gompertz distribution.

Mode

Differentiating f(t) and equating to zero, we find the mode:

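\[
\begin{aligned}
0 = f'(t) &= \left(-k e^{\alpha t} + \alpha\right) k e^{-\frac{k}{\alpha}\left(e^{\alpha t} - 1\right) + \alpha t} \\
\implies \alpha &= k e^{\alpha t} \\
\implies t &= \frac{1}{\alpha} \ln\left(\frac{\alpha}{k}\right)
\end{aligned}
\]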

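If \(\alpha < k\), then the mode is at \(t = 0\). If \(\alpha > k\), then the mode is at \(t_m = \frac{1}{\alpha}\ln\left(\frac{\alpha}{k}\right)\). Wikipedia also notes that when the mode is positive, the cumulative distribution evaluated at \(t_m\) is always between 0 and \(1 - e^{-1} \approx 0.6321\):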

\[ 0 < F(t_m) < 1 - e^{-1} \]
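As a quick check of the mode formula and the bound on \(F(t_m)\), here is a small Python sketch. It assumes \(\alpha > 0\) and \(k > 0\); the function names and the parameter values are hypothetical, picked only so that \(\alpha > k\) and the mode is positive.

```python
import math

def gompertz_pdf(t, k, alpha):
    """f(t) = k * exp(alpha * t) * exp(-(k / alpha) * (exp(alpha * t) - 1))."""
    return k * math.exp(alpha * t) * math.exp(-(k / alpha) * math.expm1(alpha * t))

def gompertz_cdf(t, k, alpha):
    """F(t) = 1 - S(t), written with expm1 to limit rounding error."""
    return -math.expm1(-(k / alpha) * math.expm1(alpha * t))

def gompertz_mode(k, alpha):
    """Mode of the density: 0 when alpha <= k, otherwise ln(alpha / k) / alpha."""
    if alpha <= k:
        return 0.0
    return math.log(alpha / k) / alpha

# Illustrative parameter values (assumed); alpha > k gives a positive mode.
k, alpha = 0.01, 0.1
t_m = gompertz_mode(k, alpha)
print(f"mode t_m = {t_m:.4f}")
print(f"f just below, at, just above the mode: "
      f"{gompertz_pdf(t_m - 0.5, k, alpha):.6f}, "
      f"{gompertz_pdf(t_m, k, alpha):.6f}, "
      f"{gompertz_pdf(t_m + 0.5, k, alpha):.6f}")
print(f"F(t_m) = {gompertz_cdf(t_m, k, alpha):.4f}, bound = {1 - math.exp(-1):.4f}")
```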

Translation

If \(X: \Omega \to [0, \infty)\) is Gompertz distributed with parameters \((k, \alpha)\), then, conditional on \(X > t_a\), the forward-shifted variable \(Y = X - t_a\) is Gompertz distributed with parameters \((h(t_a), \alpha)\).

Proof. With a change of variable, \(t' = t - v\), we will show that \(S(t) = S(v) S^*(t')\), where \(S^*\) is the same survival function as \(S\), but with the \(k\) parameter set to \(k^* = h(v) = k e^{\alpha v}\). With this done, it will follow that \(S(t \mid t > v) = S^*(t')\).

Let \(t' = t - v\).

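\[
\begin{aligned}
S(t) &= e^{-\frac{k}{\alpha}\left(e^{\alpha t} - 1\right)} && \text{by definition of } S \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha (v + t')} - 1\right)} && \text{substitute } t \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha v} e^{\alpha t'} - 1\right)} && \text{expand} \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha v} - e^{\alpha v} + e^{\alpha v} e^{\alpha t'} - 1\right)} && (0 + 1 - 1) \text{ trick} \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha v} + e^{\alpha v}\left(e^{\alpha t'} - 1\right) - 1\right)} && \text{group} \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha v} - 1 + e^{\alpha v}\left(e^{\alpha t'} - 1\right)\right)} && \text{rearrange} \\
&= e^{-\frac{k}{\alpha}\left(e^{\alpha v} - 1\right)} \, e^{-\frac{k e^{\alpha v}}{\alpha}\left(e^{\alpha t'} - 1\right)} && \text{rearrange} \\
&= S(v) \, S^*(t') && \text{by definition of } S^*
\end{aligned}
\]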

The consequence of this is that truncating a Gompertz distribution at time \(t_a\) (i.e. setting to zero all the probability mass at times less than \(t_a\) and re-normalizing) and making a variable change such that \(t_0 = t_a\) leaves a Gompertz distribution with parameters \((h(t_a), \alpha)\).
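The translation property can be checked numerically: the conditional survival \(S(v + t') / S(v)\) should match a Gompertz survival function whose \(k\) parameter is replaced by \(h(v) = k e^{\alpha v}\). The sketch below uses assumed function names and illustrative parameter values; it is a spot check, not a proof.

```python
import math

def survival(t, k, alpha):
    """Gompertz survival S(t) = exp(-(k / alpha) * (exp(alpha * t) - 1))."""
    return math.exp(-(k / alpha) * math.expm1(alpha * t))

def conditional_survival(t_prime, v, k, alpha):
    """P(X > v + t' | X > v) for a Gompertz(k, alpha) variable X."""
    return survival(v + t_prime, k, alpha) / survival(v, k, alpha)

def shifted_survival(t_prime, v, k, alpha):
    """Gompertz survival with k replaced by h(v) = k * exp(alpha * v)."""
    k_star = k * math.exp(alpha * v)
    return survival(t_prime, k_star, alpha)

# Illustrative parameter values (assumed, not taken from the text).
k, alpha, v = 0.01, 0.1, 20.0
for t_prime in (0.0, 5.0, 15.0):
    lhs = conditional_survival(t_prime, v, k, alpha)
    rhs = shifted_survival(t_prime, v, k, alpha)
    print(f"t'={t_prime:5.1f}  S(v+t')/S(v)={lhs:.10f}  shifted={rhs:.10f}")
```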

f, F, S and h. Definitions.

Let \(T \subseteq \mathbb{R}\) be the codomain of a random variable. Typically, the codomain is non-negative, \(T = [0, \infty)\). Use \(t \in T\) to denote a generic value in the codomain. Let \(t_0\) denote the infimum of the codomain (typically \(t_0 = 0\)).

Let \(f: T \to [0, \infty)\) be the probability density function of the underlying random variable. We then make a number of definitions:

  • Cumulative distribution function: \(F: T \to [0, 1]\).
    \[ F(t) = \int_{t_0}^{t} f(x)\,dx \]
  • Survival function: \(S: T \to [0, 1]\).
    \[ S(t) = 1 - F(t) \]
  • Hazard function: \(h: T \to [0, \infty)\).
    \[ h(t) = \frac{f(t)}{S(t)} \]

    Also called the intensity function.
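These definitions translate directly into code. The sketch below builds \(F\), \(S\) and \(h\) from an arbitrary density by numerical integration. The helper name `build_F_S_h`, the trapezoidal rule and the step count are assumptions of this example, and the exponential density is used only because its hazard is known to be constant, which makes the output easy to eyeball.

```python
import math

def build_F_S_h(f, t0=0.0, steps=20_000):
    """Return (F, S, h) derived from a density f supported on [t0, infinity)."""

    def F(t):
        # Composite trapezoidal rule for the integral of f over [t0, t].
        if t <= t0:
            return 0.0
        dx = (t - t0) / steps
        total = 0.5 * (f(t0) + f(t))
        for i in range(1, steps):
            total += f(t0 + i * dx)
        return total * dx

    def S(t):
        return 1.0 - F(t)

    def h(t):
        return f(t) / S(t)

    return F, S, h

# Assumed example density: exponential with rate 0.5 (constant hazard 0.5).
rate = 0.5
F, S, h = build_F_S_h(lambda t: rate * math.exp(-rate * t))
for t in (0.0, 1.0, 4.0):
    print(f"t={t:.1f}  F={F(t):.6f}  S={S(t):.6f}  h={h(t):.6f}")
```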

Survival function. Intuition.

The survival function maps a time \(t\) to a probability mass, representing the probability that the event has not yet occurred by time \(t \ge t_0\). This is the most natural representation to work with when answering the question: what is the probability I will live until at least age 46?

The survival function acts as a re-normalizing factor in

\[ f_{t > t_a}(t) = \frac{f(t)}{S(t_a)} \]
that allows \(f\) to be transformed with the information that the event has not occurred by time \(t_a\). While \(f\) is normalized by the total probability mass \(F(\infty) = 1\), \(f_{t > t_a}\) should be normalized by the remaining probability mass \(1 - F(t_a) = S(t_a)\), which will be less than 1.

When \(t = t_a\), \(f_{t > t_a}\) is the hazard function:

\[ f_{t > t_a}(t_a) = h(t_a) \]

and so, the hazard function is the continually re-normalized density function. When worrying about dying, the hazard function \(h(t_b)\) tells us the danger of dying at \(t = t_b\), assuming that one has lived until \(t_b\). If you knew that you would go to war for 4 years once you reach 18, then your hazard function would sharply spike at 18 and then sharply drop again when the war ends or you get discharged. High hazard values denote times at which you should exercise caution.
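The re-normalization can be illustrated with a short Python sketch. It uses a Gompertz density purely as an assumed example (any density would do), with hypothetical function names and parameter values. It checks that the conditional density \(f(t)/S(t_a)\) carries total mass close to 1 on \([t_a, \infty)\) and that its value at \(t = t_a\) matches the hazard.

```python
import math

def survival(t, k, alpha):
    """Gompertz survival function (assumed example distribution)."""
    return math.exp(-(k / alpha) * math.expm1(alpha * t))

def hazard(t, k, alpha):
    return k * math.exp(alpha * t)

def pdf(t, k, alpha):
    return hazard(t, k, alpha) * survival(t, k, alpha)

def conditional_pdf(t, t_a, k, alpha):
    """Density of the event time given that it has not occurred by t_a."""
    if t < t_a:
        return 0.0
    return pdf(t, k, alpha) / survival(t_a, k, alpha)

def integrate(g, a, b, steps=100_000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    dx = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * dx)
    return total * dx

# Illustrative values (assumed); the upper limit t_a + 100 stands in for infinity.
k, alpha, t_a = 0.01, 0.1, 30.0
mass = integrate(lambda t: conditional_pdf(t, t_a, k, alpha), t_a, t_a + 100.0)
print(f"mass of conditional density on [t_a, t_a + 100]: {mass:.6f}")
print(f"conditional density at t_a: {conditional_pdf(t_a, t_a, k, alpha):.6f}")
print(f"hazard at t_a:              {hazard(t_a, k, alpha):.6f}")
```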


Source

Page 513 of Philosophical Transactions for the Year 1825

"Maximum-likelihood Estimation of the Parameters of the Gompertz Survival Function", by Garg et al. (1970)