Frequency Distributions
Stephen Mildenhall
September 1999
1. Backgrounders
1.1 Moment Notation per [JKK] and [PW]
This section contains some basic definitions and notation.
- The $r$th uncorrected moment, moment about zero, raw moment or moment about the origin is
  $$\mu'_r = E(X^r).$$
- The $r$th corrected moment, moment about the mean, or central moment is
  $$\mu_r = E\big((X - E(X))^r\big).$$
- Define $\mu := E(X) = \mu'_1$.
- Variance is the second central moment, $\sigma^2 := \mu_2$.
- CV is $\sqrt{\mu_2}/\mu = \sigma_X/\mu$.
- Index of skewness is $\alpha_3(X) = \sqrt{\beta_1(X)} = \mu_3/\mu_2^{3/2}$.
- Index of kurtosis is $\alpha_4(X) = \beta_2(X) = \mu_4/\mu_2^2$.
- Corrected from uncorrected moments:
  $$\mu_r = E\big((X-E(X))^r\big) = \sum_{j=0}^{r} (-1)^j \binom{r}{j}\, \mu'_{r-j}\,\mu^j.$$
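  In particular:
  $$\mu_2 = \mu'_2 - \mu^2, \qquad \mu_3 = \mu'_3 - 3\mu'_2\mu + 2\mu^3, \qquad \mu_4 = \mu'_4 - 4\mu'_3\mu + 6\mu'_2\mu^2 - 3\mu^4.$$
  Note that you can do these in your head from the binomial coefficients, remembering that the only tricky part is $\mu'_1 = \mu$, which makes the last two terms of the sum combine.
- Raw moments in terms of the central moments:
  $$\mu'_2 = \mu_2 + \mu^2, \qquad \mu'_3 = \mu_3 + 3\mu_2\mu + \mu^3, \qquad \mu'_4 = \mu_4 + 4\mu_3\mu + 6\mu_2\mu^2 + \mu^4.$$
- The $r$th descending factorial moment is
  $$\mu'_{[r]} = E\big(X(X-1)\cdots(X-r+1)\big).$$
  [PW] call these simply factorial moments and denote them $\mu_{(r)}$.
- Factorial moments in terms of uncorrected or raw moments:
  $$\mu'_{[2]} = \mu'_2 - \mu, \qquad \mu'_{[3]} = \mu'_3 - 3\mu'_2 + 2\mu, \qquad \mu'_{[4]} = \mu'_4 - 6\mu'_3 + 11\mu'_2 - 6\mu.$$
- Raw moments in terms of factorial moments:
  $$\mu'_2 = \mu'_{[2]} + \mu, \qquad \mu'_3 = \mu'_{[3]} + 3\mu'_{[2]} + \mu, \qquad \mu'_4 = \mu'_{[4]} + 6\mu'_{[3]} + 7\mu'_{[2]} + \mu.$$
  In general we have
  $$\mu'_r = \sum_{j=1}^{r} S(r,j)\,\mu'_{[j]}$$
  where $S(r,j)$ are the Stirling numbers of the second kind.
- The cumulants, or semi-invariants, are defined as the coefficients of $t^r/r!$ in the Taylor expansion of the logarithm of the MGF (see below):
  $$K_X(t) = \log M_X(t) = \sum_{r \ge 1} \kappa_r t^r/r!.$$
  For independent $X$ and $Y$, $\kappa_r(X+Y) = \kappa_r(X)+\kappa_r(Y)$.
- Cumulants in terms of the central moments:
  $$\kappa_1 = \mu, \qquad \kappa_2 = \mu_2, \qquad \kappa_3 = \mu_3, \qquad \kappa_4 = \mu_4 - 3\mu_2^2.$$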
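The conversion from raw to central moments, $\mu_r = \sum_j (-1)^j \binom{r}{j} \mu'_{r-j}\mu^j$, is easy to check numerically. The following is a minimal Python sketch and is not taken from [JKK] or [PW]; the function names are illustrative, and the Poisson raw moments with $\theta = 2$ are used only as a test case (every Poisson cumulant should come out equal to $\theta$).

```python
from math import comb

def central_from_raw(raw):
    """raw[r] = E(X^r) for r = 0..R, with raw[0] = 1. Returns central moments mu_r."""
    mu = raw[1]
    return [
        sum((-1) ** j * comb(r, j) * raw[r - j] * mu ** j for j in range(r + 1))
        for r in range(len(raw))
    ]

def cumulants_from_central(central, mean):
    """First four cumulants from the mean and the central moments mu_2..mu_4."""
    return [mean, central[2], central[3], central[4] - 3 * central[2] ** 2]

# Illustrative test case: Poisson raw moments for theta = 2.
theta = 2.0
raw = [
    1.0,
    theta,
    theta**2 + theta,
    theta**3 + 3 * theta**2 + theta,
    theta**4 + 6 * theta**3 + 7 * theta**2 + theta,
]
central = central_from_raw(raw)
kappa = cumulants_from_central(central, raw[1])
print(central)  # central moments mu_0..mu_4
print(kappa)    # cumulants kappa_1..kappa_4
```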
Generating Function Notation per [JKK]
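The characteristic function is
$$\varphi(t) = E(e^{itX}).$$
The probability generating function is
$$G(z) = E(z^X) = \sum_{j \ge 0} P_j z^j,$$
where $P_j = \Pr(X = j)$. Thus $\varphi(t) = G(e^{it})$. The moment generating function is $M(t) = G(e^t)$. The cumulant generating function is $K(t) = \ln G(e^t)$.
We have
$$\mu'_r = \left.\frac{d^r G(e^t)}{dt^r}\right|_{t=0}.$$
Also, since the factorial moment generating function is
$$G(1+t) = E\big((1+t)^X\big) = \sum_{r \ge 0} \mu'_{[r]}\, t^r/r!,$$
we have
$$\mu'_{[r]} = \left.\frac{d^r G(1+t)}{dt^r}\right|_{t=0}.$$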
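As a small worked check of the factorial moment relations: for a Poisson distribution $G(1+t) = \exp(\theta t)$, so $\mu'_{[r]} = \theta^r$, and raw moments follow from $\mu'_r = \sum_j S(r,j)\mu'_{[j]}$. The Python sketch below computes both from a truncated probability vector. The helper names and the truncation point (80 terms) are assumptions made for illustration only.

```python
from math import exp

def stirling2(r, j):
    """Stirling number of the second kind S(r, j), by the standard recurrence."""
    if r == j:
        return 1
    if j == 0 or j > r:
        return 0
    return j * stirling2(r - 1, j) + stirling2(r - 1, j - 1)

def factorial_moment(pmf, r):
    """E[X(X-1)...(X-r+1)] for a pmf given as a list indexed by x."""
    total = 0.0
    for x, p in enumerate(pmf):
        term = 1.0
        for i in range(r):
            term *= x - i
        total += term * p
    return total

def raw_moment_from_factorial(pmf, r):
    """Raw moment E[X^r], r >= 1, rebuilt from factorial moments."""
    return sum(stirling2(r, j) * factorial_moment(pmf, j) for j in range(1, r + 1))

# Illustrative Poisson pmf, truncated where the tail is negligible.
theta = 1.5
pmf = [exp(-theta)]
for n in range(1, 80):
    pmf.append(pmf[-1] * theta / n)

print(factorial_moment(pmf, 3), theta**3)
print(raw_moment_from_factorial(pmf, 3), theta**3 + 3 * theta**2 + theta)
```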
Mixtures and Stopped Sum distributions per [JKK]
[JKK] write mixtures as
$$\mathrm{NB} = \mathrm{Poisson}(\Theta) \bigwedge_{\Theta} \mathrm{Gamma}(\alpha,\beta).$$
The PGF of a mixture is the mixture of the PGFs.
Examples
- A Gamma mixture of Poissons is a negative binomial.
- An inverse Gaussian mixture of Poissons is a PIG. The Generalized IG distribution gives Sichel's distribution.
- A Poisson mixture of Poissons is a Neyman Type A distribution. By Gurland it is also a Poisson-stopped sum of Poisson distributions.
- A Beta mixture of NBs gives the Beta-Negative Binomial. The mixture is
  $$\text{Beta-NB} = \mathrm{NB}(k,P) \bigwedge_{p = Q^{-1}} \mathrm{Beta}(\alpha,\beta),$$
  where $Q = 1+P$. Here $p := Q^{-1}$ has a beta distribution with pdf
  $$f(p) = \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}, \qquad 0 < p < 1.$$
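The first example, a Gamma mixture of Poissons giving a negative binomial, can be verified by brute-force integration. The Python sketch below assumes a Gamma mixing density with shape $\alpha$ and scale $\beta$ (the source does not say which convention is meant); under that assumption the mixture is NB with $k = \alpha$ and $P = \beta$. The midpoint rule, its step count and the upper limit are arbitrary illustrative choices.

```python
from math import exp, lgamma, log

def nb_pmf(x, k, P):
    """NB(k, P) probability, computed on the log scale."""
    return exp(
        lgamma(k + x) - lgamma(k) - lgamma(x + 1)
        - k * log(1 + P) + x * (log(P) - log(1 + P))
    )

def mixed_poisson_pmf(x, alpha, beta, upper=60.0, steps=40000):
    """Integrate Poisson(t) pmf against a Gamma(shape=alpha, scale=beta) density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint rule
        log_poisson = -t + x * log(t) - lgamma(x + 1)
        log_gamma = (alpha - 1) * log(t) - t / beta - lgamma(alpha) - alpha * log(beta)
        total += exp(log_poisson + log_gamma)
    return total * h

for x in range(3):
    print(x, mixed_poisson_pmf(x, 2.0, 1.5), nb_pmf(x, 2.0, 1.5))
```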
If the PGF can be written as $G_1(G_2(z))$ then Feller calls the result a ``generalized'' distribution. $F_1$ is the generalized distribution and $F_2$ is the generalizing distribution. They are also called stopped sum distributions. When $F_1$ is Poisson, these are the infinitely divisible distributions on the non-negative integers, by Levy's theorem.
Write the distributions with a $\vee$, so $G_1(G_2(z))$ corresponds to $F_1 \vee F_2$. Note that
$$G_1(G_2(z)) \sim F_1 \vee F_2 \sim \text{Count} \vee \text{Severity}.$$
Say $F_1 \vee F_2$ as ``$F_1$-stopped summed-$F_2$ distribution''. For example
$$\mathrm{NB} = \mathrm{Poisson} \vee \mathrm{Logarithmic}.$$
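The identity NB = Poisson $\vee$ Logarithmic can be checked numerically. With $q = \theta$, $p = 1-\theta$ and Poisson parameter $\lambda = -k\ln(1-\theta)$, the compound PGF $\exp(\lambda(G_2(z)-1))$ with $G_2(z) = \ln(1-\theta z)/\ln(1-\theta)$ equals $(p/(1-qz))^k$. The Python sketch below builds the stopped sum with the standard compound Poisson recursion $g_n = (\lambda/n)\sum_{j=1}^{n} j f_j g_{n-j}$; the function names and parameter values are illustrative assumptions.

```python
from math import exp, lgamma, log

def compound_poisson_pmf(lam, severity, n_max):
    """Poisson-stopped sum probabilities g_0..g_n_max.

    severity[j] = Pr(Y = j); this sketch assumes severity[0] == 0.
    """
    g = [exp(-lam)]
    for n in range(1, n_max + 1):
        top = min(n, len(severity) - 1)
        s = sum(j * severity[j] * g[n - j] for j in range(1, top + 1))
        g.append(lam * s / n)
    return g

def nb_pmf(x, k, p):
    """NB(k, p) probability with q = 1 - p."""
    return exp(lgamma(k + x) - lgamma(k) - lgamma(x + 1) + k * log(p) + x * log(1 - p))

theta, k, n_max = 0.4, 2.5, 30            # illustrative values
alpha = -1.0 / log(1 - theta)
severity = [0.0] + [alpha * theta**j / j for j in range(1, n_max + 1)]
lam = -k * log(1 - theta)
compound = compound_poisson_pmf(lam, severity, n_max)
print(compound[:4])
print([nb_pmf(n, k, 1 - theta) for n in range(4)])
```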
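Theorem. Let distributions $F_1$, $F_2$ have pgf's $G_1(z) = \sum_k p_k z^k$ and $G_2(z)$, where $G_2(z)$ depends on a parameter $\phi$ in such a way that
$$G_2(z \mid k\phi) = [G_2(z \mid \phi)]^k.$$
Then the mixed distribution represented by
$$F_2(K\phi) \bigwedge_{K} F_1$$
has the pgf
$$\sum_k p_k [G_2(z \mid \phi)]^k = G_1(G_2(z \mid \phi)),$$
so
$$F_2(K\phi) \bigwedge_{K} F_1 \sim F_1 \vee F_2(\phi).$$
For example, the Poisson, binomial and negative binomial distributions all have pgf's of the required form:
$$\left(\frac{p}{1-qz}\right)^{k\phi} = \left(\left(\frac{p}{1-qz}\right)^{k}\right)^{\phi}.$$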
2. Poisson Distribution
See [JKK] Chapter 4, especially section 3.
- Parameter: $\theta$
- $\Pr(X = x) = e^{-\theta}\theta^x/x!$, for $x = 0, 1, 2, \dots$.
3. Negative Binomial Distribution
See [JKK] Chapter 5.
- Parameters: $k = r$ and $p$, $q := 1-p$.
- $$\Pr(X = x) = \binom{k+x-1}{k-1} p^k q^x = \frac{\Gamma(k+x)}{\Gamma(k)\,x!}\, p^k q^x$$
[JKK] prefer a parameterization by $P$ and $k$. They write $Q = 1+P$. Then $p = 1/(1+P) = 1/Q$. [PW] use $r = k$ and $\beta = P$. These give the following view.
- Parameters: $k$ and $P$, $Q = 1+P$
- $$\Pr(X = x) = \binom{k+x-1}{k-1} \left(1-\frac{P}{Q}\right)^{k} \left(\frac{P}{Q}\right)^{x}$$
- or
  $$\Pr(X = x) = \binom{k+x-1}{k-1} \left(\frac{1}{1+P}\right)^{k} \left(\frac{P}{1+P}\right)^{x}$$
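A short Python sketch can confirm that the $(k,p)$ and $(k,P)$ forms agree when $p = 1/(1+P)$. The Gamma-function form is used so that non-integer $k$ is allowed. Function names and the parameter values are illustrative assumptions.

```python
from math import exp, lgamma, log

def nb_pmf_kp(x, k, p):
    """NB probability in the (k, p) parameterization, q = 1 - p."""
    q = 1.0 - p
    return exp(lgamma(k + x) - lgamma(k) - lgamma(x + 1) + k * log(p) + x * log(q))

def nb_pmf_kP(x, k, P):
    """NB probability in the (k, P) parameterization, Q = 1 + P."""
    Q = 1.0 + P
    return exp(
        lgamma(k + x) - lgamma(k) - lgamma(x + 1)
        + k * log(1 - P / Q) + x * log(P / Q)
    )

k, P = 2.5, 0.8                  # illustrative values
p = 1.0 / (1.0 + P)
pmf = [nb_pmf_kP(x, k, P) for x in range(600)]
print(nb_pmf_kp(3, k, p), nb_pmf_kP(3, k, P))
print(sum(pmf), sum(x * v for x, v in enumerate(pmf)), k * P)
```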
4. Logarithmic Distribution
See [JKK] Chapter 7. This is a single parameter family supported on the positive integers. The parameter is $\theta$. Let $\alpha = -[\ln(1-\theta)]^{-1}$. This distribution is not easy to deal with.
- Parameter: $0 < \theta < 1$
- $\Pr(X = x) = \alpha\theta^x/x$, for $x = 1, 2, \dots$.
5. Stopped Sum Distributions
- Neyman Type A: Poisson sum of Poissons. Limited since the ratio of skewness to kurtosis falls in a tight range. No closed form expression for the density, but easy to use FFT methods. See also the other Neyman distributions. See [JKK] Chapter 9, Section 6.
- Thomas's Distribution is a Neyman Type A, where the summed distribution is a shifted Poisson, ensuring that each occurrence yields at least one claim. See page 392.
- Polya-Aeppli distribution is a Poisson stopped Shifted Geometric distribution. The Geometric distribution is a NB with $k = 1$, so its variance multiplier equals its mean plus one. Could be useful for clash, but the ``number of claims per occurrence'' distribution is very limited. Again, no closed form for probabilities but easy to estimate using FFT. See page 378.
- Poisson-Pascal distribution, also called the generalized Polya-Aeppli distribution, is a Poisson stopped sum of negative binomial distributions. Can also be regarded as a mixture of negative binomial $(k,P)$'s where $k$ has a Poisson distribution. See page 382.
- The Generalized Poisson-Pascal distribution ([PW] page 259) is a Poisson stopped sum of truncated (at zero) negative binomial distributions. The PGF is obvious.
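To illustrate the remark that stopped sum probabilities are easy to obtain by transform methods, the Python sketch below inverts the Neyman Type A PGF $G(z) = \exp(\lambda(e^{\phi(z-1)}-1))$ on the unit circle. To stay self-contained it uses a direct $O(N^2)$ inverse discrete Fourier transform rather than a true FFT; in practice an FFT routine would replace the inner loop. The symbols $\lambda$ and $\phi$, the grid size and the function names are assumptions for illustration.

```python
import cmath
from math import exp, pi

def pmf_from_pgf(pgf, n_points=256):
    """Recover p_0..p_{N-1} from PGF values at the N-th roots of unity.

    Uses a direct inverse DFT; assumes the mass beyond n_points is negligible.
    """
    values = [pgf(cmath.exp(-2j * pi * k / n_points)) for k in range(n_points)]
    pmf = []
    for n in range(n_points):
        s = sum(values[k] * cmath.exp(2j * pi * k * n / n_points) for k in range(n_points))
        pmf.append((s / n_points).real)
    return pmf

lam, phi = 3.0, 2.0  # illustrative primary and secondary Poisson means

def neyman_a_pgf(z):
    return cmath.exp(lam * (cmath.exp(phi * (z - 1)) - 1))

pmf = pmf_from_pgf(neyman_a_pgf)
mean = sum(n * p for n, p in enumerate(pmf))
print(pmf[0], exp(lam * (exp(-phi) - 1)))  # p_0 two ways
print(mean, lam * phi)                     # mean two ways
```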
Per an interesting table on p 253 of we have the following
formulae for the third moments about the mean.
$$\mu_3 = 3\sigma^2 - 2\mu + \frac{m-2}{m-1}\,\frac{(\sigma^2-\mu)^2}{\mu}$$
$$\mu_3 = 3\sigma^2 - 2\mu + \frac{r+2}{r+1}\,\frac{(\sigma^2-\mu)^2}{\mu}$$
Note that $r > -1$ in the last line gives a great deal of flexibility.
Beta-Negative Binomial, a NB mixed over the variance multiplier distributed as a beta, should have a lot of potential as a distribution. However, the PGF involves ${}_2F_1$, which makes it very hard to deal with.
7. Generalized Poisson-Pascal Distribution, [PW]
The GPP is a Poisson stopped-sum of extended truncated Negative Binomial distributions. It is a three parameter distribution. It has PGF
$$G(z) = \exp\left(\theta\left(\frac{(1+P-Pz)^{-r}-(1+P)^{-r}}{1-(1+P)^{-r}} - 1\right)\right).$$
Note that provided $1-(1+P)^{-r} = 1-p^r > 0$
$$G(z) = \exp\left(\frac{\theta}{1-(1+P)^{-r}}\left((1+P-Pz)^{-r}-1\right)\right)$$
is a valid PGF for a Poisson-Negative Binomial (Poisson-Pascal). The
condition is necessary so that the frequency is non-negative.
Thus in the Poisson-Pascal case the distribution can be
regarded as a Poisson-NB without zero truncation, or a Poisson-ZTNB,
with an adjusted primary Poisson frequency.
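The two ways of writing the GPP PGF are algebraically identical, since $\frac{A-a}{1-a} - 1 = \frac{A-1}{1-a}$ with $A = (1+P-Pz)^{-r}$ and $a = (1+P)^{-r}$. The Python sketch below evaluates both forms at a few points; the function names and parameter values are illustrative assumptions, and $\theta$ denotes the primary Poisson parameter.

```python
from math import exp

def gpp_pgf(z, theta, P, r):
    """Poisson stopped sum of zero-truncated NB: exp(theta * (G_trunc(z) - 1))."""
    trunc = ((1 + P - P * z) ** (-r) - (1 + P) ** (-r)) / (1 - (1 + P) ** (-r))
    return exp(theta * (trunc - 1))

def poisson_pascal_pgf(z, theta, P, r):
    """Same PGF with the adjusted frequency theta / (1 - (1+P)^-r), no truncation."""
    lam = theta / (1 - (1 + P) ** (-r))
    return exp(lam * ((1 + P - P * z) ** (-r) - 1))

theta, P = 1.2, 0.8  # illustrative values
for r in (2.0, -0.5):
    for z in (0.0, 0.3, 1.0):
        print(r, z, gpp_pgf(z, theta, P, r), poisson_pascal_pgf(z, theta, P, r))
```

For $-1 < r < 0$ the adjusted frequency is negative, so the second form is only an algebraic rewriting there and cannot be read as a Poisson-NB.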
Special cases of the GPP include:
- $r = 1$ is a Poisson-Geometric
- $r > 0$ is a Poisson-Pascal, aka Poisson-Negative Binomial
- $-1 < r < 0$ is a Poisson-ETNB, and you need the zero truncation.
- $r = -1/2$ is a Poisson-Inverse Gaussian mixture.
8. PIG and GPIG Distributions
The PIG is a Poisson mixed over an Inverse Gaussian distribution. The PIG is closed under certain convolutions, see [PW]. It has a thicker tail than the Negative Binomial distribution. It is a special case of the generalized Poisson-Pascal distribution with $r = -1/2$.
References for this section are from [PW], Section 7.8.3.
Per page 260, the PIG is a Poisson ETNB.
Per page 261, the Poisson ETNB with $-1 < r < 0$ is a Poisson mixture with a stable distribution (see also Feller p 448, 581).
- PIG Parameters: $\mu$ and $\beta$.
- See below with $\lambda = -1/2$.
The Generalized Poisson inverse Gaussian distribution is also called Sichel's distribution.
- Sichel's Distribution Parameters: $\mu$, $\beta$ and $\lambda$.
- $$\Pr(X = n) = \frac{\mu^n}{n!}\,\frac{K_{\lambda+n}\big(\mu\beta^{-1}\sqrt{1+2\beta}\big)}{K_{\lambda}\big(\mu\beta^{-1}\big)}\,(1+2\beta)^{-(\lambda+n)/2}.$$
The $r$th factorial moment is
$$\mu'_{[r]} = \mu^r\,\frac{K_{\lambda+r}(\mu/\beta)}{K_{\lambda}(\mu/\beta)}.$$
The Bessel function used is the modified Bessel function of the third (second according to some sources!) kind, $K_\lambda(x)$. It is available for integral $\lambda$ built into Excel, in MathFunctions as nrBesselK(n,x) for any real n and $x \in \mathbb{R}$, and also in Matlab as besselk, again for any n and any $x \in \mathbb{C}$.
Matlab mentions that their besselk uses a MEX interface to a Fortran library by D. E. Amos, which is available on the web under www.netlib.com, search for amos.
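For the PIG case $\lambda = -1/2$ no Bessel library is needed, because half-integer orders have closed forms: $K_{-1/2}(x) = K_{1/2}(x) = \sqrt{\pi/(2x)}\,e^{-x}$ and $K_{\nu+1}(x) = K_{\nu-1}(x) + (2\nu/x)K_\nu(x)$. The Python sketch below evaluates $\Pr(X = n) = \frac{\mu^n}{n!}\frac{K_{n-1/2}(\mu\beta^{-1}\sqrt{1+2\beta})}{K_{-1/2}(\mu\beta^{-1})}(1+2\beta)^{-(n-1/2)/2}$ on that basis. The function names and parameter values are illustrative assumptions; for general $\lambda$ a library Bessel routine would be required instead.

```python
from math import exp, factorial, pi, sqrt

def bessel_k_half(n, x):
    """K_{n - 1/2}(x) for integer n >= 0, by closed form plus upward recurrence."""
    k_prev = sqrt(pi / (2 * x)) * exp(-x)  # K_{-1/2}(x)
    if n == 0:
        return k_prev
    k_curr = k_prev                        # K_{1/2}(x) equals K_{-1/2}(x)
    nu = 0.5
    for _ in range(n - 1):
        # K_{nu+1} = K_{nu-1} + (2 nu / x) K_nu
        k_prev, k_curr = k_curr, k_prev + (2 * nu / x) * k_curr
        nu += 1.0
    return k_curr

def pig_pmf(n, mu, beta):
    """PIG probability: the Sichel formula with lambda = -1/2."""
    s = sqrt(1 + 2 * beta)
    a = mu / beta
    ratio = bessel_k_half(n, a * s) / bessel_k_half(0, a)
    return mu**n / factorial(n) * ratio * s ** (-(n - 0.5))

mu, beta = 2.0, 0.5  # illustrative values
pmf = [pig_pmf(n, mu, beta) for n in range(100)]
mean = sum(n * p for n, p in enumerate(pmf))
var = sum(n * n * p for n, p in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)  # compare with 1, mu and mu * (1 + beta)
```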
9. Recursive Classes of Distributions
The $(a,b)$ recursion is
$$p_n = \left(a + \frac{b}{n}\right) p_{n-1}.$$
For $(a,b,0)$ the recursion is valid for $n = 1,2,3,\dots$. For $(a,b,1)$ the recursion is valid for $n = 2,3,4,\dots$.
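A minimal Python sketch of the $(a,b,0)$ recursion $p_n = (a + b/n)p_{n-1}$ follows. The Poisson case uses $a = 0$, $b = \theta$, $p_0 = e^{-\theta}$, and the negative binomial case uses $a = q$, $b = (k-1)q$, $p_0 = p^k$, which reproduces $p_n = p_{n-1}(k+n-1)q/n$. The function name and the parameter values are illustrative assumptions.

```python
from math import exp

def ab0_pmf(a, b, p0, n_max):
    """Probabilities p_0..p_n_max from the (a,b,0) recursion."""
    p = [p0]
    for n in range(1, n_max + 1):
        p.append((a + b / n) * p[-1])
    return p

theta = 2.5
poisson = ab0_pmf(0.0, theta, exp(-theta), 60)

k, prob = 3.0, 0.4
q = 1.0 - prob
nb = ab0_pmf(q, (k - 1) * q, prob**k, 400)

print(sum(poisson), sum(n * v for n, v in enumerate(poisson)))  # about 1 and theta
print(sum(nb), sum(n * v for n, v in enumerate(nb)))            # about 1 and k*q/p
```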
The (a,b) classes fall into two sub-groups.
- $(a,b,0)$ distributions are supported on the non-negative integers. They are specified through $a$, $b$, and $p_0$.
- $(a,b,1)$ distributions are supported on the positive integers. There are two sub-sub-classes. The zero-truncated distributions have zero probability at zero. These include the zero-truncated Poisson, logarithmic and negative binomial distributions. The zero-modified distributions are a weighting of a degenerate distribution at zero with a zero-truncated class.
For the negative binomial, there is slightly more flexibility in the choice of parameters for the truncated distribution, so it is sometimes called the ``extended truncated negative binomial distribution'' (ETNB). Normally we have parameters $r = k$ and $\beta = P = q/p$, with mean $rP$ and variance multiplier $1+P = Q$. In the ETNB, we must still have $\beta > 0$, so the apparent variance multiplier is greater than 1. However, we can have $-1 < r < 0$, which in the untruncated case would translate into a negative mean. Also, if $r < 0$ then the probability of a zero count would be $p^r > 1$ (since $p < 1$ always), which is also impossible. (Recall, $p = 1/(1+P) = 1/v$, where $v$ is the variance multiplier.)
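To see that $-1 < r < 0$ still gives a proper distribution once zero is truncated, the Python sketch below runs the $(a,b,1)$ recursion with $a = P/(1+P)$ and $b = (r-1)P/(1+P)$. The starting value $p_1 = rP(1+P)^{-r-1}/(1-(1+P)^{-r})$ is derived here from the truncated PGF $\big((1+P-Pz)^{-r} - (1+P)^{-r}\big)/\big(1-(1+P)^{-r}\big)$ by differentiating at $z = 0$; it is not quoted from [PW]. Function names and parameter values are illustrative assumptions.

```python
def etnb_pmf(r, P, n_max):
    """ETNB probabilities via the (a,b,1) recursion.

    Assumes P > 0, r > -1 and r != 0. Index 0 holds p_0 = 0 so that list
    positions match the support 1, 2, 3, ...
    """
    a = P / (1 + P)
    b = (r - 1) * P / (1 + P)
    p1 = r * P * (1 + P) ** (-r - 1) / (1 - (1 + P) ** (-r))
    p = [0.0, p1]
    for n in range(2, n_max + 1):
        p.append((a + b / n) * p[-1])
    return p

def etnb_pgf(z, r, P):
    """PGF of the zero-truncated (extended) negative binomial."""
    return ((1 + P - P * z) ** (-r) - (1 + P) ** (-r)) / (1 - (1 + P) ** (-r))

probs = etnb_pmf(-0.5, 1.0, 200)  # r = -1/2, the PIG secondary distribution
print(sum(probs), min(probs[1:]) > 0)
print(sum(v * 0.5**n for n, v in enumerate(probs)), etnb_pgf(0.5, -0.5, 1.0))
```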
See the nice table on p 229 of for a good summary of the
options. See also page 250-251 for a chart showing the relationships
between the various distributions.
Data Tables
Poisson Distribution Key Facts
| Item | Poisson Distribution |
|---|---|
| Mean | $\theta$ |
| Variance | $\theta$ |
| $\theta$ | $m$ |
| n/a | |
| $\mu_3$ | $\theta$ |
| $\mu_4$ | $3\theta^2+\theta$ |
| CV | $1/\sqrt{\theta}$ |
| Skewness | $1/\sqrt{\theta}$ |
| Kurtosis | $3+1/\theta$ |
| PGF $G(z)$ | $\exp(\theta(z-1))$ |
| Characteristic function $\varphi(t)$ | $\exp(\theta(e^{it}-1))$ |
| Recursions | |
| $p_0$ | $\exp(-\theta)$ |
| $p_n$ | $p_{n-1}\theta/n$ |
Negative Binomial (r = k,p) Key Facts
| Item | NB Distribution |
|---|---|
| Mean | $kq/p$ |
| Variance | $kq/p^2$ |
| VM View, $m$ and $v$ | |
| $p$ | $1/v$ |
| $k$ | $m/(v-1)$ |
| Contagion View, $m$ and $c$ | |
| $p$ | $1/(1+cm)$ |
| $k$ | $1/c$ |
| $\mu_3$ | $kq(1+q)/p^3$ |
| $\mu_4$ | $3k^2q^2/p^4 + kq(p^2+6q)/p^4$ |
| CV | $1/\sqrt{kq}$ |
| Skewness | $(1+q)/\sqrt{kq}$ |
| Kurtosis | $3+(p^2+6q)/(kq)$ |
| PGF $G(z)$ | $(p/(1-qz))^k$ |
| Characteristic function $\varphi(t)$ | $(p/(1-qe^{it}))^k$ |
| Recursions | |
| $p_0$ | $p^k$ |
| $p_n$ | $p_{n-1}(k+n-1)q/n$ |
| $p_{n+1}$ | $p_n(k+n)q/(n+1)$ |
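The two reparameterizations in the table can be sketched in Python as follows: given a target mean $m$ and either a variance multiplier $v$ (variance $= mv$) or a contagion $c$ (variance $= m(1+cm)$), recover $(k, p)$ and confirm the implied mean and variance. Function names and the numerical inputs are illustrative assumptions.

```python
def nb_from_mean_vm(m, v):
    """(k, p) from mean m and variance multiplier v = variance / mean, v > 1."""
    return m / (v - 1), 1 / v

def nb_from_mean_contagion(m, c):
    """(k, p) from mean m and contagion c > 0, where variance = m * (1 + c * m)."""
    return 1 / c, 1 / (1 + c * m)

def nb_mean_variance(k, p):
    """Mean kq/p and variance kq/p^2 of NB(k, p)."""
    q = 1 - p
    return k * q / p, k * q / p**2

k1, p1 = nb_from_mean_vm(10.0, 3.0)
k2, p2 = nb_from_mean_contagion(10.0, 0.2)
print(k1, p1, nb_mean_variance(k1, p1))
print(k2, p2, nb_mean_variance(k2, p2))
```

With $m = 10$, $v = 3$ and $c = 0.2$ both views describe the same distribution, since $v = 1 + cm$.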
Negative Binomial (k,P) Key Facts
| Item | NB Distribution |
|---|---|
| Mean | $kP$ |
| Variance | $kP(1+P)$ |
| VM View, $m$ and $v$ | |
| $P$ | $v-1$ |
| $k$ | $m/(v-1)$ |
| Contagion View, $m$ and $c$ | |
| $P$ | $cm$ |
| $k$ | $1/c$ |
| $\mu_3$ | $kP(1+P)(1+2P)$ |
| $\mu_4$ | $3k^2P^2(1+P)^2+kP(1+P)(1+6P+6P^2)$ |
| CV | $((1+P)/(kP))^{1/2}$ |
| Skewness | $(1+2P)/\{kP(1+P)\}^{1/2}$ |
| Kurtosis | $3+(1+6P+6P^2)/(kP(1+P))$ |
| PGF $G(z)$ | $(1+P-Pz)^{-k}$ |
| Characteristic function $\varphi(t)$ | |
| Recursion | |
| $p_0$ | $Q^{-k}$ |
| $p_{n+1}$ | $\dfrac{k+n}{n+1}\,\dfrac{P}{1+P}\,p_n$ |
PIG Distribution Key Facts
| Item | PIG Distribution |
|---|---|
| Mean | $\mu$ |
| Variance | $\mu(\beta+1)$ |
| VM View, $m$ and $v$ | |
| $\mu$ | $m$ |
| $\beta$ | $v-1$ |
| Contagion View, $m$ and $c$ | |
| $\mu$ | $m$ |
| $\beta$ | $cm$ |
| $\mu_3$ | |
| $\mu_4$ | |
| CV | |
| Skewness | |
| Kurtosis | |
| PGF $G(z)$ | $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta(1-z)}-1\right)\right)$ |
| Characteristic function $\varphi(t)$ | |
| Recursion | |
| $p_0$ | $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta}-1\right)\right)$ |
| $p_1$ | $\mu p_0/\sqrt{1+2\beta}$ |
| $p_n$ | $\dfrac{\beta}{1+2\beta}\left(2-\dfrac{3}{n}\right)p_{n-1} + \dfrac{\mu^2}{1+2\beta}\,\dfrac{1}{n(n-1)}\,p_{n-2}$ |
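The PIG recursion in the table is straightforward to run. The Python sketch below starts from $p_0 = G(0)$ and $p_1 = G'(0) = \mu p_0/\sqrt{1+2\beta}$, both read off the PGF, applies the two-term recursion for $n \ge 2$, and then checks the total probability, mean and variance against $1$, $\mu$ and $\mu(1+\beta)$. The function name, parameter values and truncation point are illustrative assumptions.

```python
from math import exp, sqrt

def pig_pmf_recursive(mu, beta, n_max):
    """PIG probabilities p_0..p_n_max by the two-term recursion (n_max >= 1)."""
    s2 = 1 + 2 * beta
    p = [exp(-(mu / beta) * (sqrt(s2) - 1))]
    p.append(mu * p[0] / sqrt(s2))
    for n in range(2, n_max + 1):
        p.append(
            beta / s2 * (2 - 3 / n) * p[n - 1]
            + mu**2 / s2 / (n * (n - 1)) * p[n - 2]
        )
    return p

mu, beta = 1.0, 1.0  # illustrative values
pmf = pig_pmf_recursive(mu, beta, 200)
mean = sum(n * v for n, v in enumerate(pmf))
var = sum(n * n * v for n, v in enumerate(pmf)) - mean**2
print(sum(pmf), mean, var)
```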
[JKK] Johnson, N. L., Kotz, S. and Kemp, A. W., Univariate Discrete Distributions, 2nd ed., John Wiley and Sons, 1992.
File translated from TEX by TTH, version 2.34.
On 11 Sep 1999, 17:28.