## Memorylessness

In probability and statistics, memorylessness is a property of certain probability distributions. It usually refers to cases in which the distribution of a "waiting time" until a certain event does not depend on how much time has already elapsed. To model memoryless situations accurately, we must constantly "forget" which state the system is in: the probabilities are not influenced by the history of the process.

Only two kinds of distributions are memoryless: the geometric distributions on the non-negative integers and the exponential distributions on the non-negative real numbers.

In the context of Markov processes, memorylessness refers to the Markov property, an even stronger assumption which implies that the properties of random variables related to the future depend only on relevant information about the current time, not on information from further in the past. The present article describes the use outside the Markov property.

### 1.1. Waiting time examples: with memory

Most phenomena are not memoryless, which means that observers will obtain information about them over time. For example, suppose that X is a random variable, the lifetime of a car engine, expressed in terms of "number of miles driven until the engine breaks down". It is clear, based on our intuition, that an engine which has already been driven for 300,000 miles will have far fewer miles left before it breaks down than would a second, equivalent engine which has only been driven for 1,000 miles. Hence, this random variable would not have the memorylessness property.

### 1.2. Waiting time examples: without memory

In contrast, let us examine a situation which would exhibit memorylessness. Imagine a long hallway, lined on one wall with thousands of safes. Each safe has a dial with 500 positions, and each has been assigned an opening position at random. Imagine that an eccentric person walks down the hallway, stopping once at each safe to make a single random attempt to open it. We might define the random variable X as the lifetime of their search, expressed in terms of "number of attempts the person must make until they successfully open a safe". In this case, the expected number of remaining attempts, E[X], will always be equal to 500, regardless of how many attempts have already been made. Each new attempt has a 1/500 chance of succeeding, so on average the person opens one safe per 500 attempts, but with each new failure they make no "progress" toward ultimately succeeding. Even if the safe-cracker has just failed 499 consecutive times (or 4,999 times), we expect to wait 500 more attempts until we observe the next success. If, instead, this person focused their attempts on a single safe, and "remembered" their previous attempts to open it, they would be guaranteed to open the safe after, at most, 500 attempts.
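The contrast between the two strategies can be checked numerically. The following is a minimal sketch, not a definitive model: the function names and the `horizon` truncation parameter are hypothetical, and the "with memory" case assumes the person tries each untried dial position of a single safe exactly once, with every remaining position equally likely to be the right one.

```python
def expected_remaining_random(p, failures, horizon=20_000):
    """Expected further attempts, given `failures` failed attempts so far,
    when every attempt independently succeeds with probability p.

    Computes E[N - failures | N > failures] by summing the series directly,
    truncated after `horizon` terms (the neglected tail is negligible here).
    """
    q = 1.0 - p
    prob_still_searching = q ** failures  # Pr(N > failures)
    total = 0.0
    for j in range(1, horizon + 1):
        # Pr(N = failures + j): failures + j - 1 misses, then one success.
        total += j * q ** (failures + j - 1) * p
    return total / prob_still_searching


def expected_remaining_systematic(positions, failures):
    """Expected further attempts on ONE safe when failed positions are
    remembered and never retried (assumed model, see lead-in)."""
    remaining = positions - failures
    # The correct position is uniform over the remaining candidates.
    return (remaining + 1) / 2


if __name__ == "__main__":
    for failures in (0, 499, 4999):
        print(failures, expected_remaining_random(1 / 500, failures))
    for failures in (0, 250, 499):
        print(failures, expected_remaining_systematic(500, failures))
```

Under these assumptions the first function returns essentially the same value for every number of past failures, while the second shrinks as failures accumulate, which is exactly the difference between a process without memory and one with memory.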

Real-life examples of memorylessness include the universal law of radioactive decay, which describes the time until a given radioactive particle decays, and the time until the discovery of a new Bitcoin block. An often-used theoretical example of memorylessness in queueing theory is the time a storekeeper must wait before the arrival of the next customer.

## 2. Discrete memorylessness

Suppose X is a discrete random variable whose values lie in the set {0, 1, 2, ...}. The probability distribution of X is memoryless precisely if for any m and n in {0, 1, 2, ...}, we have

$$\Pr(X > m + n \mid X \geq m) = \Pr(X > n).$$

Here, Pr(X > m + n | X ≥ m) denotes the conditional probability that the value of X is greater than m + n given that it is greater than or equal to m.

The only memoryless discrete probability distributions are the geometric distributions, which, on the set {0, 1, 2, ...}, count the number of failures before the first "success" in a sequence of independent, identically distributed Bernoulli trials. In other words, these are the distributions of waiting time in a Bernoulli process.
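The defining identity can be verified exactly for a geometric distribution using rational arithmetic. This sketch assumes X counts the failures before the first success, so that X takes values in {0, 1, 2, ...} and Pr(X > n) = (1 − p)^(n+1); the function names are illustrative only.

```python
from fractions import Fraction


def survival(p, n):
    """Pr(X > n): the first n + 1 trials all fail."""
    return (1 - p) ** (n + 1)


def at_least(p, m):
    """Pr(X >= m): the first m trials all fail."""
    return (1 - p) ** m


def conditional_survival(p, m, n):
    """Pr(X > m + n | X >= m), from the definition of conditional probability."""
    return survival(p, m + n) / at_least(p, m)


# Exact check of Pr(X > m + n | X >= m) == Pr(X > n) on a small grid.
p = Fraction(1, 500)
memoryless_holds = all(
    conditional_survival(p, m, n) == survival(p, n)
    for m in range(20)
    for n in range(20)
)

if __name__ == "__main__":
    print(memoryless_holds)
```

Using `Fraction` rather than floating point means the equality test is exact, so the check is not affected by rounding.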

### 2.1. A frequent misunderstanding

"Memorylessness" of the probability distribution of the number of trials X until the first success means that, for example,

$$\Pr(X > 40 \mid X \geq 30) = \Pr(X > 10).$$

It does not mean that

$$\Pr(X > 40 \mid X \geq 30) = \Pr(X > 40),$$

which would be true only if the events X > 40 and X ≥ 30 were independent, i.e. Pr(X ≥ 30) = 1.
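A short worked example may make the distinction concrete. It assumes, as an illustration, that X counts the failures before the first success in trials with success probability p, and writes q = 1 − p (a symbol introduced here, not in the text above):

```latex
\begin{align*}
\Pr(X > 40 \mid X \geq 30)
  &= \frac{\Pr(X > 40)}{\Pr(X \geq 30)}
   = \frac{q^{41}}{q^{30}}
   = q^{11}
   = \Pr(X > 10), \\
\Pr(X > 40) &= q^{41}.
\end{align*}
```

For any 0 < q < 1 the two values differ, since q^{11} > q^{41}: conditioning on the 30 failures already observed resets the clock rather than leaving the original probability unchanged.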

## 3. Continuous memorylessness

Suppose X is a continuous random variable whose values lie in the non-negative real numbers [0, ∞). The probability distribution of X is memoryless precisely if for any non-negative real numbers t and s, we have

$$\Pr(X > t + s \mid X > t) = \Pr(X > s).$$

This is similar to the discrete version, except that s and t are constrained only to be non-negative real numbers instead of integers. Rather than counting trials until the first "success", for example, we may be marking time until the arrival of the first phone call at a switchboard.
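As a rough numerical illustration, the sketch below compares the conditional survival probability Pr(X > t + s | X > t) with Pr(X > s) for an exponential distribution and, for contrast, for a Weibull distribution with shape 2. The Weibull case, the parameter values and the function names are assumptions chosen for illustration; they are not taken from the text above.

```python
import math


def exponential_survival(t, rate=0.5):
    """Pr(X > t) for an exponential distribution with the given rate."""
    return math.exp(-rate * t)


def weibull_survival(t, shape=2.0, scale=1.0):
    """Pr(X > t) for a Weibull distribution (shape > 1 models wear-out)."""
    return math.exp(-((t / scale) ** shape))


def conditional_survival(survival, t, s):
    """Pr(X > t + s | X > t) = Pr(X > t + s) / Pr(X > t)."""
    return survival(t + s) / survival(t)


if __name__ == "__main__":
    t, s = 3.0, 1.5
    # Exponential: the two numbers agree (up to rounding).
    print(conditional_survival(exponential_survival, t, s), exponential_survival(s))
    # Weibull with shape 2: the conditional probability is smaller.
    print(conditional_survival(weibull_survival, t, s), weibull_survival(s))
```

The Weibull case behaves like the car engine in the earlier example: having already survived to time t makes surviving a further s less likely than it was at the start.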

### 3.1. The memoryless distribution is an exponential distribution

The only memoryless continuous probability distributions are the exponential distributions, so memorylessness completely characterizes the exponential distributions among all continuous ones.

To see this, first define the survival function, S, as

$$S(t) = \Pr(X > t).$$

Note that S(t) is then monotonically decreasing. From the relation

$$\Pr(X > t + s \mid X > t) = \Pr(X > s)$$

and the definition of conditional probability, it follows that

$$\frac{\Pr(X > t + s)}{\Pr(X > t)} = \Pr(X > s).$$

This gives the functional equation which is a result of the memorylessness property:

$$S(t + s) = S(t)\,S(s).$$

From this, we must have for example:

$$S(2) = S(1)^{2}, \qquad S(1) = S(1/2)^{2}, \ \text{i.e.}\ S(1/2) = S(1)^{1/2}.$$

In general:

$$S(a) = S(1)^{a}.$$

The only continuous function that will satisfy this equation for any positive, rational a is:

$$S(a) = S(1)^{a} = e^{\ln(S(1))\,a} = e^{-\lambda a},$$

where λ = −ln(S(1)).

Therefore, since S(a) is a probability, we must have λ > 0, and any memoryless survival function must be an exponential.

Put a different way, S is a monotone decreasing function (meaning that for times x ≤ y, we have S(x) ≥ S(y)).

The functional equation alone will imply that S restricted to rational multiples of any particular number is an exponential function. Combined with the fact that S is monotone, this implies that S over its whole domain is an exponential function.
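The two steps of this argument can be written out as follows. This is a sketch under the assumption that 0 < S(1) < 1; the integers p and q and the rationals r_1 and r_2 are symbols introduced here for illustration.

```latex
% Step 1: rational arguments. For positive integers p and q, applying
% S(t + s) = S(t) S(s) repeatedly gives
\begin{align*}
S\!\left(\tfrac{p}{q}\right)^{q} = S\!\left(q \cdot \tfrac{p}{q}\right) = S(p) = S(1)^{p}
\quad\Longrightarrow\quad
S\!\left(\tfrac{p}{q}\right) = S(1)^{p/q}.
\end{align*}

% Step 2: real arguments. For real a > 0 and rationals r_1 \le a \le r_2,
% monotonicity of S gives
\begin{align*}
S(1)^{r_2} = S(r_2) \;\le\; S(a) \;\le\; S(r_1) = S(1)^{r_1}.
\end{align*}
% Letting r_1 and r_2 tend to a squeezes S(a) to S(1)^{a} = e^{-\lambda a}.
```

Step 1 uses only the functional equation, while Step 2 is where monotonicity is needed to pass from rational to real arguments.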