Van Houtum distribution

Probability distribution
[Figure: example probability mass function of a Van Houtum distribution]
Parameters: $p_a, p_b \in [0,1]$ and $a, b \in \mathbb{Z}$ with $a \leq b$
Support: $u \in \{a, a+1, \dots, b-1, b\}$
PMF: $\begin{cases} p_a & \text{if } u = a \\ p_b & \text{if } u = b \\ \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a < u < b \\ 0 & \text{otherwise} \end{cases}$
CDF: $\begin{cases} 0 & \text{if } u < a \\ p_a & \text{if } u = a \\ p_a + \lfloor u-a \rfloor \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a < u < b \\ 1 & \text{if } u \geq b \end{cases}$
Mean: $a p_a + b p_b + (1-p_a-p_b)\dfrac{a+b}{2}$
Mode: N/A
Variance: $a^2 p_a + b^2 p_b + \dfrac{1-p_a-p_b}{b-a-1} \cdot \dfrac{b(2b-1)(b-1) - a(2a+1)(a+1)}{6} - \left( \dfrac{(a+b)(1-p_a-p_b) + 2ap_a + 2bp_b}{2} \right)^2$
Entropy: $-p_a \ln(p_a) - p_b \ln(p_b) - (1-p_a-p_b) \ln\left( \dfrac{1-p_a-p_b}{b-a-1} \right)$
MGF: $e^{ta} p_a + e^{tb} p_b + \dfrac{1-p_a-p_b}{b-a-1} \cdot \dfrac{e^{bt} - e^{(a+1)t}}{e^{t} - 1}$
CF: $e^{ita} p_a + e^{itb} p_b + \dfrac{1-p_a-p_b}{b-a-1} \cdot \dfrac{e^{ibt} - e^{i(a+1)t}}{e^{it} - 1}$

In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after Prof. Geert-Jan van Houtum.[1] It can be characterized by saying that all values in a finite set of possible values are equally probable, except for the smallest and largest elements of the set. Because it is uniform except possibly at its boundaries, the Van Houtum distribution generalizes the discrete uniform distribution and is sometimes also referred to as quasi-uniform.

Often the only available information about a discrete random variable is its first two moments. The Van Houtum distribution can be used to fit a distribution with finite support to these moments.

A simple example of the Van Houtum distribution arises when throwing a loaded die that has been tampered with to land on 6 twice as often as on 1. The sample space is {1, 2, 3, 4, 5, 6}. Each time the die is thrown, the probability of throwing a 2, 3, 4 or 5 is 1/6 each; the probability of a 1 is 1/9 and the probability of a 6 is 2/9. This is a Van Houtum distribution with a = 1, b = 6, p_a = 1/9 and p_b = 2/9.

Probability mass function

A random variable $U$ has a Van Houtum $(a, b, p_a, p_b)$ distribution if its probability mass function is

$$\Pr(U=u) = \begin{cases} p_a & \text{if } u = a \\ p_b & \text{if } u = b \\ \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a < u < b \\ 0 & \text{otherwise} \end{cases}$$
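
The definition can be checked numerically against the loaded-die example from the introduction. The following is a minimal Python sketch, not a reference implementation: the function names `van_houtum_pmf` and `moments` are illustrative choices, and the code assumes b - a >= 2 so that at least one interior point exists. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def van_houtum_pmf(u, a, b, p_a, p_b):
    """Pr(U = u) for a Van Houtum(a, b, p_a, p_b) distribution.

    Assumes b - a >= 2 (hypothetical restriction for this sketch).
    """
    if u == a:
        return p_a
    if u == b:
        return p_b
    if a < u < b:
        # The remaining mass is spread evenly over the b - a - 1 interior points.
        return (1 - p_a - p_b) / (b - a - 1)
    return 0


def moments(a, b, p_a, p_b):
    """Mean and variance, computed by direct summation over the support."""
    support = range(a, b + 1)
    mean = sum(u * van_houtum_pmf(u, a, b, p_a, p_b) for u in support)
    second = sum(u * u * van_houtum_pmf(u, a, b, p_a, p_b) for u in support)
    return mean, second - mean ** 2


# Loaded die: a = 1, b = 6, p_a = 1/9, p_b = 2/9.
die = dict(a=1, b=6, p_a=Fraction(1, 9), p_b=Fraction(2, 9))
probs = [van_houtum_pmf(u, **die) for u in range(1, 7)]
mean, variance = moments(**die)
print(probs, mean, variance)
```

For the die, the interior probabilities come out as 1/6 each, the probabilities sum to 1, and the mean and variance are 34/9 and 230/81.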

Fitting procedure

Suppose a random variable $X$ has mean $\mu$ and squared coefficient of variation $c^2$. Let $U$ be a Van Houtum distributed random variable. Then the first two moments of $U$ match the first two moments of $X$ if $a$, $b$, $p_a$ and $p_b$ are chosen such that:[2]

$$\begin{aligned}
a &= \left\lceil \mu - \frac{1}{2} \left\lceil \sqrt{1 + 12 c^2 \mu^2} \right\rceil \right\rceil \\
b &= \left\lfloor \mu + \frac{1}{2} \left\lceil \sqrt{1 + 12 c^2 \mu^2} \right\rceil \right\rfloor \\
p_b &= \frac{(c^2+1)\mu^2 - A - (a^2 - A)(2\mu - a - b)/(a-b)}{a^2 + b^2 - 2A} \\
p_a &= \frac{2\mu - a - b}{a - b} + p_b \\
\text{where } A &= \frac{2a^2 + a + 2ab - b + 2b^2}{6}.
\end{aligned}$$
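
The fitting formulas translate directly into code. The sketch below is a plain floating-point transcription under stated assumptions: the function name `fit_van_houtum` is hypothetical, no feasibility check is performed, and degenerate inputs that yield a = b (for example an integer mean with zero variance) are not handled and would cause a division by zero.

```python
import math


def fit_van_houtum(mu, c2):
    """Return (a, b, p_a, p_b) matching mean mu and squared coefficient of variation c2.

    Illustrative sketch only: does not verify that p_a and p_b land in [0, 1].
    """
    half_width = 0.5 * math.ceil(math.sqrt(1 + 12 * c2 * mu ** 2))
    a = math.ceil(mu - half_width)
    b = math.floor(mu + half_width)
    # A equals the mean of u^2 over the interior points a+1, ..., b-1.
    A = (2 * a * a + a + 2 * a * b - b + 2 * b * b) / 6
    d = (2 * mu - a - b) / (a - b)  # equals p_a - p_b
    p_b = ((c2 + 1) * mu ** 2 - A - (a * a - A) * d) / (a * a + b * b - 2 * A)
    p_a = d + p_b
    return a, b, p_a, p_b


# Recover the loaded die (mean 34/9, variance 230/81) from its first two moments.
mu = 34 / 9
c2 = (230 / 81) / mu ** 2
a, b, p_a, p_b = fit_van_houtum(mu, c2)
print(a, b, p_a, p_b)
```

Applied to the moments of the loaded die from the introduction, the procedure returns a = 1 and b = 6, with p_a and p_b equal to 1/9 and 2/9 up to floating-point rounding.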

There does not exist a Van Houtum distribution for every combination of $\mu$ and $c^2$. Using the fact that, for any real mean $\mu$, the discrete distribution on the integers that has minimal variance is concentrated on the integers $\lfloor \mu \rfloor$ and $\lceil \mu \rceil$, it is easy to verify that a Van Houtum distribution (or indeed any discrete distribution on the integers) can only be fitted on the first two moments if[3]

$$c^2 \mu^2 \geq (\mu - \lfloor \mu \rfloor)(1 + \lfloor \mu \rfloor - \mu)^2 + (\mu - \lfloor \mu \rfloor)^2 (1 + \lfloor \mu \rfloor - \mu).$$
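
The minimal-variance distribution described above places probability f = μ − ⌊μ⌋ on the upper integer and 1 − f on ⌊μ⌋, so its variance simplifies to f(1 − f). A small Python sketch of the resulting necessary condition follows; the function names are hypothetical, and the check says nothing about whether the fitted p_a and p_b are valid probabilities.

```python
import math


def min_integer_variance(mu):
    """Smallest variance of any integer-valued distribution with mean mu."""
    f = mu - math.floor(mu)  # fractional part of the mean
    return f * (1 - f)


def satisfies_variance_bound(mu, c2):
    """Necessary condition: the target variance c2 * mu^2 must reach the minimum."""
    return c2 * mu ** 2 >= min_integer_variance(mu)


print(satisfies_variance_bound(2.5, 0.01))  # variance 0.0625 is below the bound 0.25
print(satisfies_variance_bound(2.5, 0.05))  # variance 0.3125 is above the bound 0.25
```

For an integer mean the bound is zero, so every non-negative variance passes this check.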

References

  1. ^ A. Saura (2012), Van Houtumin jakauma (in Finnish). BSc Thesis, University of Helsinki, Finland
  2. ^ J.J. Arts (2009), Efficient optimization of the Dual-Index policy using Markov Chain approximations. MSc Thesis, Eindhoven University of Technology, The Netherlands (Appendix B)
  3. ^ I.J.B.F. Adan, M.J.A. van Eenige, and J.A.C. Resing. "Fitting discrete distributions on the first two moments". Probability in the Engineering and Informational Sciences, 9:623–632, 1996.