# Yule–Simon distribution

In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the Yule distribution.

The probability mass function (pmf) of the Yule–Simon($\rho$) distribution is

$$f(k;\rho) = \rho \operatorname{B}(k, \rho + 1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\operatorname{B}$ is the beta function. Equivalently, writing $x^{\underline{n}} = \Gamma(x+1)/\Gamma(x-n+1)$ for the falling factorial, the pmf can be written as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho + 1)}{(k + \rho)^{\underline{\rho + 1}}},$$

where $\Gamma$ is the gamma function. Thus, if $\rho$ is an integer,

$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$
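As a minimal sketch, the pmf can be evaluated through the log-gamma function, which avoids overflow for large $k$. The function names below are illustrative and not taken from any particular library.

```python
import math

def yule_simon_pmf(k, rho):
    """f(k; rho) = rho * B(k, rho + 1), computed via log-gamma for stability."""
    if k < 1 or rho <= 0:
        raise ValueError("requires integer k >= 1 and real rho > 0")
    log_beta = math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1)
    return rho * math.exp(log_beta)

def yule_simon_pmf_integer_rho(k, rho):
    """Closed form rho * rho! * (k-1)! / (k+rho)!, valid only for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)
```

For $\rho = 2$ the closed form reduces to $4 / (k(k+1)(k+2))$, so $f(1;2) = 2/3$ and the probabilities sum to 1.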

The parameter $\rho$ can be estimated using a fixed-point algorithm.

The probability mass function $f$ has the property that, for sufficiently large $k$,

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$
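The quality of this approximation can be checked numerically. The sketch below compares the exact pmf with the power-law expression for an arbitrarily chosen $\rho = 1.5$; the ratio should approach 1 as $k$ grows.

```python
import math

def pmf(k, rho):
    # Exact pmf rho * B(k, rho + 1), via log-gamma.
    return rho * math.exp(math.lgamma(k) + math.lgamma(rho + 1) - math.lgamma(k + rho + 1))

def tail_approx(k, rho):
    # Large-k approximation rho * Gamma(rho + 1) / k^(rho + 1).
    return rho * math.gamma(rho + 1) / k ** (rho + 1)

rho = 1.5  # illustrative value, not from the text
ratios = {k: pmf(k, rho) / tail_approx(k, rho) for k in (10, 100, 1000, 10000)}
for k, r in ratios.items():
    print(k, r)
```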

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.

## 1. Occurrence

The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa. Simon dubbed this process the "Yule process" but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number the urn already contains.
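The urn process can be sketched with a small simulation. The variant below is an assumption made for illustration (a Simon-style scheme, not necessarily Yule's original formulation): each new ball opens a new urn with a hypothetical probability `p_new`, and otherwise joins an existing urn chosen with probability proportional to its current size. Under the usual analysis of this variant, urn sizes are expected to approach a Yule–Simon law with $\rho = 1/(1 - p_{\text{new}})$; that relation is not stated in the text above and is taken here as an assumption.

```python
import random

def simulate_urns(n_balls, p_new, seed=0):
    """Return the list of urn sizes after placing n_balls balls."""
    rng = random.Random(seed)
    owner = [0]   # owner[b] = index of the urn holding ball b; start with one ball in urn 0
    sizes = [1]   # sizes[u] = number of balls currently in urn u
    for _ in range(n_balls - 1):
        if rng.random() < p_new:
            # Open a new urn containing the new ball.
            sizes.append(1)
            owner.append(len(sizes) - 1)
        else:
            # Picking a uniformly random ball selects its urn with
            # probability proportional to the urn's size.
            urn = owner[rng.randrange(len(owner))]
            sizes[urn] += 1
            owner.append(urn)
    return sizes

sizes = simulate_urns(100_000, p_new=0.5, seed=1)
frac_singletons = sum(1 for s in sizes if s == 1) / len(sizes)
print(frac_singletons)  # compare with f(1; rho) = rho / (rho + 1) for rho = 2
```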

The distribution also arises as a compound distribution, in which the parameter of a geometric distribution is treated as a function of a random variable having an exponential distribution. Specifically, assume that $W$ follows an exponential distribution with scale $1/\rho$, or equivalently rate $\rho$:

$$W \sim \operatorname{Exponential}(\rho),$$

with density

$$h(w;\rho) = \rho \exp(-\rho w).$$

Then a Yule–Simon distributed variable $K$ has the following geometric distribution conditional on $W$:

$$K \sim \operatorname{Geometric}(\exp(-W)).$$

The pmf of a geometric distribution is

$$g(k;p) = p(1-p)^{k-1}$$

for $k \in \{1, 2, \dotsc\}$. The Yule–Simon pmf is then the following exponential-geometric compound distribution:

$$f(k;\rho) = \int_0^\infty g(k;\exp(-w))\,h(w;\rho)\,dw.$$
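The compound representation gives a direct way to draw samples: draw $W$ from the exponential distribution, then draw a geometric variable with success probability $\exp(-W)$, as in the integrand above. The sketch below uses only the standard library and inverts the geometric distribution function; the function name is illustrative. It is meant for moderate $\rho$, since for very small $\rho$ the draw of $W$ can be so large that the result overflows, a case this sketch does not handle.

```python
import math
import random

def sample_yule_simon(rho, rng):
    """Draw one Yule-Simon(rho) variate via the exponential-geometric mixture."""
    w = rng.expovariate(rho)   # W ~ Exponential with rate rho
    p = math.exp(-w)           # success probability of the conditional geometric
    if p >= 1.0:
        return 1               # success is certain on the first trial
    u = 1.0 - rng.random()     # uniform on (0, 1], avoids log(0)
    # Inversion: P(K > k) = (1 - p)^k, so K = 1 + floor(log(U) / log(1 - p)).
    return 1 + int(math.log(u) / math.log1p(-p))

rng = random.Random(12345)
draws = [sample_yule_simon(2.0, rng) for _ in range(100_000)]
freq1 = draws.count(1) / len(draws)  # compare with f(1; 2) = 2/3
freq2 = draws.count(2) / len(draws)  # compare with f(2; 2) = 1/6
print(freq1, freq2)
```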

The maximum likelihood estimator for the parameter $\rho$ given the observations $k_1, k_2, k_3, \dots, k_N$ is the solution to the fixed-point equation

$$\rho^{(t+1)} = \frac{N + a - 1}{b + \sum_{i=1}^{N}\sum_{j=1}^{k_i} \frac{1}{\rho^{(t)} + j}},$$

where $b = 0$, $a = 1$ are the rate and shape parameters of the gamma distribution prior on $\rho$.
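A minimal sketch of this iteration follows. The starting value, tolerance, iteration cap, and sample data are assumptions chosen for illustration, not part of the cited algorithm.

```python
def fit_rho_fixed_point(ks, a=1.0, b=0.0, rho0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate rho <- (N + a - 1) / (b + sum_i sum_{j=1..k_i} 1 / (rho + j))."""
    n = len(ks)
    rho = rho0
    for _ in range(max_iter):
        s = sum(1.0 / (rho + j) for k in ks for j in range(1, k + 1))
        rho_next = (n + a - 1.0) / (b + s)
        if abs(rho_next - rho) < tol:
            return rho_next
        rho = rho_next
    raise RuntimeError("fixed-point iteration did not converge")

ks = [1, 1, 1, 2, 1, 3, 1, 1, 2, 5]  # made-up observations
rho_hat = fit_rho_fixed_point(ks)
print(rho_hat)
```

With $a = 1$, $b = 0$ and every observation equal to 1, the update reduces to $\rho^{(t+1)} = \rho^{(t)} + 1$ and never settles, so the sketch raises an error in that case.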

This algorithm was derived by Garcia by directly optimizing the likelihood. Roberts and Roberts generalize the algorithm to Bayesian settings using the compound geometric formulation described above. They also use the expectation-maximisation (EM) framework to show convergence of the fixed-point algorithm, derive the sub-linearity of its convergence rate, and give two alternate derivations of the standard error of the estimator from the fixed-point equation. The variance of the estimator $\hat{\rho}$ is

$$\operatorname{Var}(\hat{\rho}) = \frac{1}{\dfrac{N}{\hat{\rho}^{2}} - \sum_{i=1}^{N}\sum_{j=1}^{k_i} \dfrac{1}{(\hat{\rho} + j)^{2}}}.$$

The standard error is the square root of this estimate divided by $N$.
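The variance expression can be evaluated directly once an estimate is available. In the sketch below, `rho_hat` stands for the parameter estimate appearing in the formula, and the function name is illustrative. The denominator is the negative second derivative of the log-likelihood with respect to the parameter, so it is checked for positivity.

```python
def estimator_variance(rho_hat, ks):
    """1 / (N / rho_hat^2 - sum_i sum_{j=1..k_i} 1 / (rho_hat + j)^2)."""
    n = len(ks)
    curvature = sum(1.0 / (rho_hat + j) ** 2 for k in ks for j in range(1, k + 1))
    information = n / rho_hat ** 2 - curvature
    if information <= 0:
        raise ValueError("non-positive information; rho_hat is not at a maximum")
    return 1.0 / information

# Single observation k = 2 with rho_hat = 1: 1 / (1 - 1/4 - 1/9) = 36/23.
print(estimator_variance(1.0, [2]))
```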

## 2. Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon($\rho, \alpha$) distribution is defined as

$$f(k;\rho,\alpha) = \frac{\rho}{1 - \alpha^{\rho}}\,\operatorname{B}_{1-\alpha}(k, \rho + 1),$$

with $0 \leq \alpha < 1$.
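As a rough numerical sketch, the generalized pmf can be evaluated by approximating the incomplete beta function $\operatorname{B}_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ with composite Simpson quadrature. The quadrature, the number of subintervals, and the function names are assumptions made here to keep the example within the standard library; a library routine for the incomplete beta function would normally be preferable.

```python
def incomplete_beta(x, a, b, n=2000):
    """Composite Simpson approximation of B_x(a, b) for a >= 1, b >= 1 (n even)."""
    h = x / n
    f = lambda t: t ** (a - 1) * (1.0 - t) ** (b - 1)
    total = f(0.0) + f(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def generalized_yule_simon_pmf(k, rho, alpha):
    """f(k; rho, alpha) = rho / (1 - alpha^rho) * B_{1-alpha}(k, rho + 1)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("requires 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1)

# With alpha = 0 the expression reduces to rho * B(k, rho + 1);
# for rho = 2, k = 3 that is 4 / (3 * 4 * 5) = 1/15.
print(generalized_yule_simon_pmf(3, 2.0, 0.0))
```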
