54, Part 1, pp. 127–142
A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution
Galit Shmueli,
University of Maryland, College Park, USA
Thomas P. Minka and Joseph B. Kadane,
Carnegie Mellon University, Pittsburgh, USA
Sharad Borle
Rice University, Houston, USA
and Peter Boatwright
Carnegie Mellon University, Pittsburgh, USA
[Received June 2003. Revised December 2003]
Summary. A useful discrete distribution (the Conway–Maxwell–Poisson distribution) is revived and its statistical and probabilistic properties are introduced and explored. This distribution is a two-parameter extension of the Poisson distribution that generalizes some well-known discrete distributions (Poisson, Bernoulli and geometric). It also leads to the generalization of distributions derived from these discrete distributions (i.e. the binomial and negative binomial distributions).
We describe three methods for estimating the parameters of the Conway–Maxwell–Poisson distribution. The first is a fast, simple weighted least squares method, which leads to estimates that are sufficiently accurate for practical purposes. The second method, using maximum likelihood, can be used to refine the initial estimates; it requires iterations and is more computationally intensive. The third estimation method is Bayesian: using the conjugate prior, the posterior density of the parameters of the Conway–Maxwell–Poisson distribution is easily computed. The Conway–Maxwell–Poisson distribution is flexible and can account for the overdispersion or underdispersion that is commonly encountered in count data. We also explore two sets of real-world data, demonstrating the flexibility and elegance of the Conway–Maxwell–Poisson distribution in fitting count data which do not seem to follow the Poisson distribution.
Keywords: Conjugate family; Conway–Maxwell–Poisson distribution; Estimation; Exponential family; Overdispersion; Underdispersion
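The special cases and the dispersion behaviour named in the Summary can be checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes the usual form of the Conway–Maxwell–Poisson probability function, P(X = x) proportional to λ^x / (x!)^ν, which the Summary itself does not state. The function names `com_poisson_probs` and `dispersion_index` are hypothetical, and truncating the normalizing sum at 200 terms is an assumption that suits the moderate parameter values used here.

```python
import math


def com_poisson_probs(lam, nu, terms=200):
    """Return P(X = 0), ..., P(X = terms - 1) for an assumed COM-Poisson form.

    Assumes P(X = x) is proportional to lam**x / (x!)**nu and approximates
    the infinite normalizing sum by its first `terms` terms. When nu == 0
    the sum converges only for lam < 1.
    """
    # Log weights avoid overflow in x! for large x.
    log_w = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    # Log-sum-exp for a numerically stable normalizing constant.
    m = max(log_w)
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w))
    return [math.exp(v - log_z) for v in log_w]


def dispersion_index(lam, nu, terms=200):
    """Variance-to-mean ratio computed from the truncated distribution."""
    probs = com_poisson_probs(lam, nu, terms)
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return var / mean


if __name__ == "__main__":
    # nu = 1 should reproduce the Poisson probabilities.
    print(com_poisson_probs(3.0, 1.0)[2], math.exp(-3.0) * 3.0 ** 2 / 2)
    # nu = 0 with lam < 1 should give geometric probabilities (1 - lam) * lam**x.
    print(com_poisson_probs(0.5, 0.0)[3], 0.5 * 0.5 ** 3)
    # A large nu should approach a Bernoulli with success probability lam / (1 + lam).
    print(com_poisson_probs(2.0, 50.0)[1], 2.0 / 3.0)
    # Index above 1 means overdispersion, below 1 means underdispersion.
    print(dispersion_index(3.0, 0.5), dispersion_index(3.0, 2.0))
```

Under these assumptions, ν = 1 gives an index of 1 (Poisson), ν < 1 gives an index above 1 (overdispersion) and ν > 1 gives an index below 1 (underdispersion). For large λ combined with small ν, more terms would be needed before the truncated sum is trustworthy.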