 # ⓘ Hellinger distance. In probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type ..

## ⓘ Hellinger distance

In probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f -divergence. The Hellinger distance is defined in terms of the Hellinger integral, which was introduced by Ernst Hellinger in 1909.

### 1.1. Definition Measure theory

To define the Hellinger distance in terms of measure theory, let P and Q denote two probability measures that are absolutely continuous with respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity

H 2 P, Q = 1 2 ∫ d P d λ − d Q d λ 2 d λ. {\displaystyle H^{2}P,Q={\frac {1}{2}}\displaystyle \int \left{\sqrt {\frac {dP}{d\lambda }}}-{\sqrt {\frac {dQ}{d\lambda }}}\right^{2}d\lambda.}

Here, dP / dλ and dQ / d λ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as

H 2 P, Q = 1 2 ∫ d P − d Q 2. {\displaystyle H^{2}P,Q={\frac {1}{2}}\int \left{\sqrt {dP}}-{\sqrt {dQ}}\right^{2}.}

### 1.2. Definition Probability theory using Lebesgue measure

To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure, so that dP / dλ and dQ / d λ are simply probability density functions. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral

H 2 f, g = 1 2 ∫ f x − g x) 2 d x = 1 − ∫ f x g x d x, {\displaystyle H^{2}f,g={\frac {1}{2}}\int \left{\sqrt {fx}}-{\sqrt {gx}}\right)^{2}\,dx=1-\int {\sqrt {fxgx}}\,dx,}

where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain equals 1.

The Hellinger distance H P, Q satisfies the property derivable from the Cauchy–Schwarz inequality

0 ≤ H P, Q ≤ 1. {\displaystyle 0\leq HP,Q\leq 1.}

### 1.3. Definition Discrete distributions

For two discrete probability distributions P = p 1, …, p k {\displaystyle P=p_{1},\ldots,p_{k}} and Q = q 1, …, q k {\displaystyle Q=q_{1},\ldots,q_{k}}, their Hellinger distance is defined as

H P, Q = 1 2 ∑ i = 1 k p i − q i 2, {\displaystyle HP,Q={\frac {1}{\sqrt {2}}}\;{\sqrt {\sum _{i=1}^{k}{\sqrt {p_{i}}}-{\sqrt {q_{i}}}^{2}}},}

which is directly related to the Euclidean norm of the difference of the square root vectors, i.e.

H P, Q = 1 2 ‖ P − Q ‖ 2. {\displaystyle HP,Q={\frac {1}{\sqrt {2}}}\;{\bigl \|}{\sqrt {P}}-{\sqrt {Q}}{\bigr \|}_{2}.}

Also, 1 − H 2 P, Q = ∑ i = 1 k p i q i. {\displaystyle 1-H^{2}P,Q=\sum _{i=1}^{k}{\sqrt {p_{i}q_{i}}}.}

## 2. Connection with the statistical distance

The Hellinger distance H P, Q {\displaystyle HP,Q} and the total variation distance or statistical distance δ P, Q {\displaystyle \delta P,Q} are related as follows:

H 2 P, Q ≤ δ P, Q ≤ 2 H P, Q. {\displaystyle H^{2}P,Q\leq \delta P,Q\leq {\sqrt {2}}HP,Q\.}

These inequalities follow immediately from the inequalities between the 1-norm and the 2-norm.

## 3. Properties

The Hellinger distance forms a bounded metric on the space of probability distributions over a given probability space.

The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa.

Sometimes the factor 1 / 2 {\displaystyle 1/2} in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two.

The Hellinger distance is related to the Bhattacharyya coefficient B C P, Q {\displaystyle BCP,Q} as it can be defined as

H P, Q = 1 − B C P, Q. {\displaystyle HP,Q={\sqrt {1-BCP,Q}}.}

Hellinger distances are used in the theory of sequential and asymptotic statistics.

The squared Hellinger distance between two normal distributions P ∼ N μ 1, σ 1 2 {\displaystyle \scriptstyle P\,\sim \,{\mathcal {N}}\mu _{1},\sigma _{1}^{2}} and Q ∼ N μ 2, σ 2 {\displaystyle \scriptstyle Q\,\sim \,{\mathcal {N}}\mu _{2},\sigma _{2}^{2}} is:

H 2 P, Q = 1 − 2 σ 1 σ 2 σ 1 2 + σ 2 e − 1 4 μ 1 − μ 2 σ 1 2 + σ 2 2. {\displaystyle H^{2}P,Q=1-{\sqrt {\frac {2\sigma _{1}\sigma _{2}}{\sigma _{1}^{2}+\sigma _{2}^{2}}}}\,e^{-{\frac {1}{4}}{\frac {\mu _{1}-\mu _{2}^{2}}{\sigma _{1}^{2}+\sigma _{2}^{2}}}}.}

The squared Hellinger distance between two multivariate normal distributions P ∼ N μ 1, ∑ 1 {\displaystyle \scriptstyle P\,\sim \,{\mathcal {N}}\mu _{1},\sum _{1}} and Q ∼ N μ 2, ∑ 2 {\displaystyle \scriptstyle Q\,\sim \,{\mathcal {N}}\mu _{2},\sum _{2}} is:

H 2 P, Q = 1 − det ∑ 1 / 4 det ∑ 2 1 / 4 det ∑ 1 + ∑ 2 1 / 2 exp ⁡ { − 1 8 μ 1 − μ 2 T ∑ 1 + ∑ 2 − 1 μ 1 − μ 2 } {\displaystyle H^{2}P,Q=1-{\frac {\det\sum _{1}^{1/4}\det\sum _{2}^{1/4}}{\det \left{\frac {\sum _{1}+\sum _{2}}{2}}\right^{1/2}}}\exp \left\{-{\frac {1}{8}}\mu _{1}-\mu _{2}^{T}\left{\frac {\sum _{1}+\sum _{2}}{2}}\right^{-1}\mu _{1}-\mu _{2}\right\}}

The squared Hellinger distance between two exponential distributions P ∼ E x p α {\displaystyle \scriptstyle P\,\sim \,{\rm

where B {\displaystyle B} is the Beta function.

• measures of distance include the Hellinger distance histogram intersection, Chi - squared statistic, quadratic form distance match distance Kolmogorov Smirnov
• the distance of one probability distribution to the other on a statistical manifold. The divergence is a weaker notion than that of the distance in
• the original Broadway production of Jesus Christ Superstar at the Mark Hellinger Theatre. Jesus Christ Superstar sold in excess of 12 million albums. Future
• Film producer Mark Hellinger bought the rights to the title from Weegee. In 1948, Weegee s aesthetic formed the foundation for Hellinger s film The Naked
• Heavy - tailed distribution Heckman correction Hedonic regression Hellin s law Hellinger distance Helmert Wolf blocking Herdan s law Herfindahl index Heston model Heteroscedasticity
• definition, previously given. Cramer Rao bound Fisher information Hellinger distance Information geometry Amari, Shun - ichi Nagaoka, Horishi 2000 Chentsov s
• example, the Kullback Leibler divergence, Bregman divergence, and the Hellinger distance With indefinitely large samples, limiting results like the central
• Coupling  2: BRG Hellinger distance 2: R Kullback Leibler divergence  2: DCR Levy metric  2: R Total variation Total variation distance in probability
• Strouse. The Broadway production opened on August 21, 1986 at the Mark Hellinger Theatre with little advance sale and to mostly indifferent reviews, and
• at the Mark Hellinger Theatre on December 18 and ran for 329 performances Dear World     Broadway production opened at the Mark Hellinger Theatre on February
• valley of Holzhausen brook Saarbach nearby Heldburg in the Kreck. In Hellinger Tal running the river Helling and flow south of the Heldburger Land at

...
 ...
...