POISSON-WEIGHTED EXPONENTIAL UNIVARIATE VERSION AND REGRESSION MODEL WITH APPLICATIONS

This study introduces a new two-parameter mixed Poisson distribution, namely Poisson-Weighted Exponential (P-WE), which is obtained by mixing the Poisson distribution with a new class of weighted exponential distribution. The new P-WE distribution provides a more flexible alternative than the Poisson distribution for modelling overdispersed count data. Estimation procedures for the P-WE distribution via the method of moments and maximum likelihood are provided. This study also introduces the P-WE regression model, which can be fitted to overdispersed count data with covariates. The P-WE distribution and the P-WE regression model are fitted to two sets of count data.


INTRODUCTION
Mixing is one of the important approaches for obtaining new distributions for count data in statistics and probability studies. In particular, mixed Poisson and mixed negative binomial distributions provide a more flexible alternative than the Poisson distribution for modelling overdispersed count data. Examples of mixed Poisson and mixed negative binomial distributions are the negative binomial, which is a mixture of Poisson and gamma (Klugman et al., 2008; Lawless, 1997), negative binomial-Pareto (Klugman et al., 2008; Meng et al., 1999), Poisson-inverse Gaussian (Klugman et al., 2008; Tremblay, 1992), Poisson-Lindley (Sankaran, 1970; Ghitany et al., 2008), negative binomial-inverse Gaussian (Gomez-Deniz et al., 2008), negative binomial-beta (Wang, 2011) and negative binomial-Lindley (Hossein and Ismail, 2010; Lord and Geedipally, 2011). Several applications of mixed Poisson distributions for fitting real data are discussed in Karlis and Xekalaki (2005).
This study introduces a new two-parameter mixed Poisson distribution, namely Poisson-Weighted Exponential (P-WE), which can be considered as an alternative for modelling overdispersed count data. The contents of this study are as follows. Section 2 studies the basic properties of the new P-WE distribution. Section 3 illustrates the estimation of parameters via the method of moments and maximum likelihood procedures. Section 4 introduces the new P-WE regression model, which is applicable to overdispersed count data with covariates. Applications of the P-WE distribution and the P-WE regression model to two sets of count data are provided in section 5. Finally, several conclusions are presented in section 6.

JMSS
The Weighted Exponential (WE) distribution has p.d.f.:
$$f(x;\alpha,\beta) = \frac{\alpha+1}{\alpha}\,\beta e^{-\beta x}\left(1-e^{-\alpha\beta x}\right), \quad x>0,$$
where α > 0 is the shape parameter and β > 0 is the scale parameter. The new WE distribution was proposed by Gupta and Kundu (2009), who introduced a shape parameter to the exponential distribution, which belongs to the non-symmetric distributions. Their construction follows an idea suggested by Azzalini (1985), who introduced a shape parameter for several symmetric distributions. In several cases, the new WE distribution provides a better fit than the Weibull, gamma or generalized exponential distributions. A WE random variable can also be represented as the sum of two independent exponential random variables.
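Assume that the conditional random variable X|λ is Poisson distributed with p.m.f.:
$$P(X=x\mid\lambda) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x=0,1,2,\ldots,\ \lambda>0,$$
and the random variable Λ is distributed as WE with p.d.f.:
$$f(\lambda) = \frac{\alpha+1}{\alpha}\,\beta e^{-\beta\lambda}\left(1-e^{-\alpha\beta\lambda}\right), \quad \lambda>0,\ \alpha>0,\ \beta>0.$$
The p.m.f. of P-WE distribution is Equation (1):
$$P(X=x) = \frac{\alpha+1}{\alpha}\,\beta\left[\frac{1}{(1+\beta)^{x+1}} - \frac{1}{(1+\beta+\alpha\beta)^{x+1}}\right], \quad x=0,1,2,\ldots \tag{1}$$
The m.g.f. of P-WE distribution is Equation (2):
$$M_X(t) = \frac{\beta}{1+\beta-e^{t}}\cdot\frac{\alpha\beta+\beta}{1+\alpha\beta+\beta-e^{t}}, \quad e^{t}<1+\beta, \tag{2}$$
which is the m.g.f. of the sum of two independent geometric random variables with success probabilities $\frac{\beta}{1+\beta}$ and $\frac{\alpha\beta+\beta}{1+\alpha\beta+\beta}$. Therefore, the P-WE distribution can be represented as X = Y + Z, where $Y\sim\mathrm{Ge}\!\left(\frac{\beta}{1+\beta}\right)$ and $Z\sim\mathrm{Ge}\!\left(\frac{\alpha\beta+\beta}{1+\alpha\beta+\beta}\right)$ are independent.
Since the P-WE distribution is a convolution of two geometric distributions, it is a special case of a general family of distributions examined by Kemp (1979) involving convolutions of binomial and pseudo-binomial variables. Several models from this family, such as the Non-central Negative Binomial (NNB), Generalized Non-central Negative Binomial (GNNB) and Binomial-Binomial (BB) distributions, are discussed in more detail in Kemp (1979) and Ong (1995).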
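The convolution representation X = Y + Z can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the closed-form p.m.f. is the one implied by mixing a Poisson with the WE density of Gupta and Kundu (2009), and the geometric variables are assumed to count failures before the first success (support 0, 1, 2, ...).

```python
import math


def pwe_pmf(x, alpha, beta):
    """Closed-form P-WE p.m.f. (assumed form, alpha > 0, beta > 0)."""
    const = (alpha + 1.0) / alpha * beta
    return const * ((1.0 + beta) ** -(x + 1)
                    - (1.0 + beta + alpha * beta) ** -(x + 1))


def geometric_pmf(k, p):
    """Geometric p.m.f. on k = 0, 1, 2, ... with success probability p."""
    return p * (1.0 - p) ** k


def pwe_pmf_by_convolution(x, alpha, beta):
    """P(Y + Z = x) for independent geometric Y and Z."""
    p1 = beta / (1.0 + beta)
    theta = alpha * beta + beta
    p2 = theta / (1.0 + theta)
    return sum(geometric_pmf(k, p1) * geometric_pmf(x - k, p2)
               for k in range(x + 1))


if __name__ == "__main__":
    # Arbitrary illustrative parameter values, not estimates from the paper.
    for x in range(5):
        print(x, pwe_pmf(x, 1.5, 2.0), pwe_pmf_by_convolution(x, 1.5, 2.0))
```

The two columns printed by the script should agree to floating-point accuracy, which is what the m.g.f. factorization predicts.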
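The mean and variance are most easily obtained from M_X(t). They are, respectively, Equations (3) and (4):
$$E(X) = \mu = \frac{\alpha+2}{\beta(\alpha+1)} \tag{3}$$
and:
$$\mathrm{Var}(X) = \sigma^{2} = \frac{\alpha+2}{\beta(\alpha+1)} + \frac{\alpha^{2}+2\alpha+2}{\beta^{2}(\alpha+1)^{2}} \tag{4}$$
The skewness is given by $\gamma = \mu_{3}/\sigma^{3}$, where the third central moment is
$$\mu_{3} = \frac{(1+\beta)(2+\beta)}{\beta^{3}} + \frac{(1+\beta+\alpha\beta)(2+\beta+\alpha\beta)}{\beta^{3}(1+\alpha)^{3}}.$$
Figures 1 and 2 show the coefficient of variation and the skewness of the P-WE distribution as surfaces z = g(x, y), with x = α and y = β. Figures 3-6 show several examples of the p.m.f. of the P-WE distribution, indicating that the distribution can be considered as an alternative for overdispersed count data.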
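The overdispersion implied by these moments can be made explicit with a short derivation. The sketch below assumes the representation X = Y + Z with independent geometric variables on {0, 1, 2, ...}; the symbols θ and a are introduced here only for brevity.

```latex
\begin{align*}
E(X) &= \frac{1}{\beta} + \frac{1}{\theta}
      = \frac{\alpha+2}{\beta(\alpha+1)} = \mu,
      \qquad \theta = \beta(1+\alpha), \\
\mathrm{Var}(X) &= \left(\frac{1}{\beta} + \frac{1}{\beta^{2}}\right)
      + \left(\frac{1}{\theta} + \frac{1}{\theta^{2}}\right)
      = \mu + \frac{\alpha^{2}+2\alpha+2}{\beta^{2}(\alpha+1)^{2}}
      = \mu + a\mu^{2}, \\
a &= \frac{\alpha^{2}+2\alpha+2}{(\alpha+2)^{2}},
      \qquad \lim_{\alpha\to 0} a = \tfrac{1}{2},
      \qquad \lim_{\alpha\to\infty} a = 1.
\end{align*}
```

Because a is increasing in α, it lies strictly between 0.5 and 1, so the variance always exceeds the mean.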

Parameter Estimation: Method of Moment Estimators
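Assume that $x_1, x_2, \ldots, x_n$ are sample data of size n distributed as P-WE with p.m.f. (1), and let $m_1$ and $m_2$ denote the sample mean and sample variance. Equating $m_1$ and $m_2$ to the mean (3) and variance (4) gives closed-form MM estimators of α and β (Equations 5 and 6), which exist as long as $m_2 = m_1 + a m_1^{2}$, where 0.5 < a < 1.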
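One way to arrive at closed-form MM estimators is sketched below. It assumes the mean $(\alpha+2)/(\beta(\alpha+1))$ and the variance relation $\sigma^{2} = \mu + a\mu^{2}$ with $a = (\alpha^{2}+2\alpha+2)/(\alpha+2)^{2}$, and writes $m_1$ and $m_2$ for the sample mean and sample variance. The resulting expressions may be arranged differently from the paper's Equations (5) and (6).

```latex
\begin{align*}
a &= \frac{m_2 - m_1}{m_1^{2}}, \\
a(\alpha+2)^{2} = \alpha^{2}+2\alpha+2
  \;&\Longrightarrow\;
  (1-a)\alpha^{2} - (4a-2)\alpha - (4a-2) = 0, \\
\hat{\alpha} &= \frac{(2a-1) + \sqrt{2a-1}}{1-a},
  \qquad
  \hat{\beta} = \frac{\hat{\alpha}+2}{m_1(\hat{\alpha}+1)}.
\end{align*}
```

The quadratic has a positive real root only when 0.5 < a < 1, which is the range stated for the MM estimators.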

Maximum Likelihood Estimators
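The log-likelihood of P-WE distribution is:
$$\ell(\alpha,\beta) = n\log\frac{\alpha+1}{\alpha} + n\log\beta + \sum_{i=1}^{n}\log D_i, \qquad D_i = (1+\beta)^{-(x_i+1)} - (1+\beta+\alpha\beta)^{-(x_i+1)}.$$
By taking the partial derivatives with respect to α and β and equating them to zero, we obtain:
$$\frac{\partial\ell}{\partial\alpha} = -\frac{n}{\alpha(\alpha+1)} + \sum_{i=1}^{n}\frac{(x_i+1)\,\beta\,(1+\beta+\alpha\beta)^{-(x_i+2)}}{D_i} = 0,$$
$$\frac{\partial\ell}{\partial\beta} = \frac{n}{\beta} - \sum_{i=1}^{n}\frac{(x_i+1)\left[(1+\beta)^{-(x_i+2)} - (1+\alpha)(1+\beta+\alpha\beta)^{-(x_i+2)}\right]}{D_i} = 0.$$
The Maximum Likelihood (ML) estimators can be obtained by solving these equations numerically, using the MM estimators as initial values.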
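A minimal, dependency-free sketch of the numerical step is given below. It maximizes the log-likelihood directly rather than solving the score equations, and it uses a simple compass (pattern) search on (log α, log β) as a stand-in for whatever optimizer the authors used; that choice, the function names and the bound of 20 on the log-parameters are all assumptions made for illustration. The data are passed as a dictionary mapping each count value to its observed frequency.

```python
import math


def pwe_log_pmf(x, alpha, beta):
    """log P(X = x) for the P-WE distribution, evaluated stably."""
    a = -(x + 1) * math.log1p(beta)                  # log (1+beta)^-(x+1)
    b = -(x + 1) * math.log1p(beta + alpha * beta)   # log (1+beta+alpha*beta)^-(x+1)
    # log(exp(a) - exp(b)) with a > b
    return (math.log((alpha + 1.0) / alpha) + math.log(beta)
            + a + math.log1p(-math.exp(b - a)))


def pwe_loglik(counts, alpha, beta):
    """Log-likelihood for grouped data: counts[x] = frequency of value x."""
    return sum(f * pwe_log_pmf(x, alpha, beta) for x, f in counts.items())


def fit_pwe_ml(counts, alpha0, beta0, step=0.5, tol=1e-8, max_iter=10000):
    """Compass search on (log alpha, log beta), started at (alpha0, beta0)."""
    u, v = math.log(alpha0), math.log(beta0)
    best = pwe_loglik(counts, alpha0, beta0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for du, dv in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            nu_, nv_ = u + du, v + dv
            if abs(nu_) > 20.0 or abs(nv_) > 20.0:
                continue  # keep the search inside a safe numerical box
            cand = pwe_loglik(counts, math.exp(nu_), math.exp(nv_))
            if cand > best:
                u, v, best = nu_, nv_, cand
                improved = True
                break
        if not improved:
            step /= 2.0  # no better neighbour: refine the mesh
    return math.exp(u), math.exp(v), best
```

Following the text, the MM estimates are the natural choice for `alpha0` and `beta0`; any general-purpose optimizer could replace the compass search.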

P-WE Regression Model
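The new P-WE regression model can be derived using a different parameterization of the P-WE distribution.
Let $\alpha = \nu - 1$ and $\beta = \frac{1+\nu}{\nu\mu_i}$. The p.m.f. of P-WE distribution can be reparameterized as Equation (7):
$$P(Y_i = y_i) = \frac{1+\nu}{(\nu-1)\mu_i}\left[\left(1+\frac{1+\nu}{\nu\mu_i}\right)^{-(y_i+1)} - \left(1+\frac{1+\nu}{\mu_i}\right)^{-(y_i+1)}\right], \quad y_i = 0,1,2,\ldots, \tag{7}$$
with mean $E(Y_i) = \mu_i$ and variance $\mathrm{Var}(Y_i) = \mu_i + \frac{1+\nu^{2}}{(1+\nu)^{2}}\mu_i^{2}$, where ν > 1 and $\mu_i > 0$. The covariates can be incorporated via a log link function, $\mu_i = e_i\exp(\mathbf{x}_i^{T}\boldsymbol{\beta})$, where $e_i$ denotes the exposure, $\mathbf{x}_i$ the vector of covariates and $\boldsymbol{\beta}$ the vector of regression parameters. Hence, the log likelihood of P-WE regression model is:
$$\ell(\boldsymbol{\beta},\nu) = \sum_{i=1}^{n}\left\{\log(1+\nu) - \log(\nu-1) - \log\mu_i + \log\left[\left(1+\frac{1+\nu}{\nu\mu_i}\right)^{-(y_i+1)} - \left(1+\frac{1+\nu}{\mu_i}\right)^{-(y_i+1)}\right]\right\}.$$
The ML estimators of $\boldsymbol{\beta}$ and ν can be obtained by maximizing the log likelihood.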
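The regression log likelihood can be written down directly from the mean parameterization. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the p.m.f. is obtained by substituting α = ν - 1 and β = (1 + ν)/(νμ) into the P-WE p.m.f., and the log link μ_i = e_i exp(x_i'β) is applied with plain Python lists rather than any particular numerical library.

```python
import math


def pwe_reg_log_pmf(y, mu, nu):
    """log P(Y = y) with mean mu > 0 and dispersion parameter nu > 1."""
    r1 = (1.0 + nu) / (nu * mu)   # plays the role of beta
    r2 = (1.0 + nu) / mu          # plays the role of beta * (1 + alpha)
    a = -(y + 1) * math.log1p(r1)
    b = -(y + 1) * math.log1p(r2)
    return (math.log(1.0 + nu) - math.log(nu - 1.0) - math.log(mu)
            + a + math.log1p(-math.exp(b - a)))


def pwe_reg_loglik(coefs, nu, X, y, exposure):
    """Log likelihood with log link: mu_i = e_i * exp(x_i . coefs)."""
    total = 0.0
    for x_i, y_i, e_i in zip(X, y, exposure):
        linear = sum(c * v for c, v in zip(coefs, x_i))
        total += pwe_reg_log_pmf(y_i, e_i * math.exp(linear), nu)
    return total
```

Passing the negative of `pwe_reg_loglik` to a numerical minimizer, with ν constrained to be greater than 1, would give the ML estimates described in the text.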
The new P-WE regression model can be compared with other regression models for count data with covariates, such as Poisson, Negative Binomial (NB) and Generalized Poisson (GP). In the actuarial literature as well as in insurance practice, the Poisson regression model has been widely used for modelling claim count data. For example, Nasr-Esfahani et al. (1990) and Renshaw et al. (1994) each fitted a Poisson regression model to a different set of U.K. motor claim count data.
For handling overdispersion, several regression models such as NB and GP have been suggested. Several parameterizations have been proposed for NB regression models, and the two well-known models, referred to as NB-1 and NB-2 in Greene (2008), have been developed and applied (Cameron and Trivedi, 1986; Lawless, 1987; Ismail and Jemain, 2007; Zulkifli et al., 2013). Several parameterizations have also been proposed for GP regression models, and the two well-known models, referred to as GP-1 and GP-2 in Yang et al. (2009), have been developed and applied (Consul, 1989; Ismail and Jemain, 2007; Ismail and Zamani, 2013).

Example 1
A set of insurance count data from Belgium for the year 1993 (Denuit, 1997) is considered for fitting the Poisson, NB and P-WE distributions, using both ML and MM estimation procedures. It should be noted that the MM estimators of the P-WE distribution can be calculated using the closed formulas (5 and 6) as long as $m_2 = m_1 + a m_1^{2}$, where 0.5 < a < 1. Based on the sample data, $m_1$ = 0.1057 and $m_2$ = 0.1149, so that a = 0.8211. The chi-square and log likelihood are considered as comparison criteria. Table 1 provides the observed values, fitted values, estimated parameters, chi-square and log likelihood.
The results show that PWE-MLE provides the largest log likelihood and the smallest chi-square. Even though the NB distribution is a strong competitor, the P-WE distribution performs better, because the log likelihood of PWE-MLE is larger than that of NB-MLE and the chi-square of PWE-MM is smaller than that of NB-MM.
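As a worked illustration of the MM step, the sketch below plugs the reported moments into closed forms obtained by solving the mean and variance equations for α and β. The function name is hypothetical, $m_2$ is assumed to be the sample variance, and the expressions may be arranged differently from the paper's Equations (5) and (6). With the rounded moments quoted above, a comes out at roughly 0.82; a small difference from the reported 0.8211 is to be expected because the inputs are rounded to four decimals.

```python
import math


def pwe_mm(m1, m2):
    """Method-of-moments estimates (a, alpha, beta) from sample mean m1 and variance m2."""
    a = (m2 - m1) / m1 ** 2
    if not 0.5 < a < 1.0:
        raise ValueError("MM estimators require 0.5 < a < 1, got a = %.4f" % a)
    # Positive root of (1 - a) alpha^2 - (4a - 2) alpha - (4a - 2) = 0
    alpha = ((2.0 * a - 1.0) + math.sqrt(2.0 * a - 1.0)) / (1.0 - a)
    # Match the mean: m1 = (alpha + 2) / (beta * (alpha + 1))
    beta = (alpha + 2.0) / (m1 * (alpha + 1.0))
    return a, alpha, beta


# Moments reported for the Belgian data in Example 1 (rounded values).
a, alpha_hat, beta_hat = pwe_mm(0.1057, 0.1149)
print(a, alpha_hat, beta_hat)
```

Substituting the estimates back into the mean and variance formulas reproduces the two input moments, which is a quick check that the closed forms are consistent.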

Example 2
The US National Medical Expenditure Survey 1987/88 (NMES) data from Deb and Trivedi (1997) are considered. The NMES data were used to model the demand for medical care, captured by the number of physician office visits and the number of hospital outpatient visits. For illustration purposes, we use only the first 2000 observations for fitting the regression models. Our response variable is the number of physician office visits (OFP) and the covariates are the number of hospital stays (HOSP), self-perceived health status (POORHLTH and EXCLHLTH), number of chronic conditions (NUMCHRON), gender (MALE), number of years of education (SCHOOL) and private insurance indicator (PRIVINS). Table 2 shows the mean and standard deviation of the selected variables, whereas Table 3 shows the parameter estimates, standard errors and t-ratios for the fitted models.
The results show that the regression parameters for all models have similar estimates. As expected, the NB-2, GP-2 and P-WE regression models provide similar inferences for the regression parameters, with absolute t-ratios smaller than those of the Poisson regression model. A comparison of the standard errors of the regression parameters indicates that the standard errors of the NB-2 and P-WE models are equal to or smaller than those of the GP-2 model, with the exception of the regression variable EXCLHLTH. Based on the log likelihood, AIC and BIC, the P-WE regression model is the best model for fitting the US NMES count data.

Table 3. Poisson, NB-2, GP-2 and P-WE regression models (example 2)


CONCLUSION
This study has introduced a new two-parameter mixed Poisson distribution, namely Poisson-Weighted Exponential (P-WE), which is obtained by mixing the Poisson distribution with a new class of weighted exponential distribution. The P-WE distribution is suitable for overdispersed count data with variance $\sigma^{2} = \mu + a\mu^{2}$, 0.5 < a < 1. Besides the univariate version, the regression model of the P-WE distribution with mean $E(Y_i) = \mu_i$ and variance $\mathrm{Var}(Y_i) = \mu_i + \frac{1+\nu^{2}}{(1+\nu)^{2}}\mu_i^{2}$, ν > 1, $\mu_i$ > 0, has been derived.
For numerical illustration, the P-WE distribution was fitted to a set of insurance count data using MM and ML estimation procedures, and the results were compared with the MM and ML fits of the Poisson and NB distributions. Among the fitted models, the P-WE MLE provides the largest log likelihood and the smallest chi-square. Considering the straightforward manner of obtaining the MM estimators using closed formulas, the P-WE distribution can be considered as an alternative for fitting overdispersed count data.
The P-WE regression model was fitted to the US NMES data. The regression model was compared with the Poisson, NB-2 and GP-2 regression models and, based on the log likelihood, AIC and BIC, the P-WE regression model is the best model for fitting these data. Therefore, the P-WE regression model can also be considered as an alternative for fitting overdispersed count data with covariates.