Estimating the Parameters of the Negative-Lindley Distribution using Broyden-Fletcher-Goldfarb-Shanno

Problem statement: Maximum Likelihood Estimation (MLE) is regarded as the most efficient statistical approach for estimating parameters in a cross-sectional model. MLE often gives rise to a system of non-linear equations that must be solved iteratively, typically with the Newton-Raphson technique. However, for models with more than one unknown parameter, such as the Negative-Lindley distribution, it becomes difficult to apply the Newton-Raphson approach to estimate the parameters jointly, because the second derivatives of the log-likelihood that make up the Hessian matrix are complicated. Approach: In this study, we propose an alternative iterative algorithm based on the Broyden-Fletcher-Goldfarb-Shanno (BFGS) approach, which does not require the computation of second derivatives. Results: To assess the performance of BFGS, we generate samples of over-dispersed counts with various dispersion parameters and estimate the mean and dispersion parameters. Conclusion: BFGS estimates the parameters of the Negative-Lindley model efficiently.


INTRODUCTION
Traditionally, the Poisson and Negative Binomial (NB) models are regarded as the most suitable models for count data. In recent years, some new discrete distributions have been introduced, such as the Com-Poisson model (Shmueli et al., 2005; Khan and Khan, 2010) and the Negative-Lindley (NL) distribution (Zamani and Ismail, 2010). In this study, we focus on the NL distribution. This is a two-parameter model developed as a mixture of the negative binomial distribution and the Lindley distribution. Zamani and Ismail (2010) applied it to two samples of insurance data and compared the fits with those of the Poisson and negative binomial models. Their results show that NL fits slightly better than NB and is more efficient than Poisson, since the counts are over-dispersed. To estimate the two parameters, the authors formulated separate maximum likelihood estimating equations and found the estimates iteratively using the Newton-Raphson technique. We note that their approach does not construct the joint Hessian matrix. In fact, the Hessian of the likelihood function is quite difficult to compute, as the second derivatives are difficult to express and may lead to numerical instability.
In this study, we propose an alternative iterative algorithm based on the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) approach (Yuan, 1991). This algorithm does not require the computation of the second derivatives of the log-likelihood and can yield equally reliable estimates.
The outline of the study is as follows. We review the Negative-Lindley distribution and the maximum likelihood approach as demonstrated by Zamani and Ismail (2010). We next introduce the BFGS method of estimation. We then perform a simulation study in which we generate samples of over-dispersed counts and estimate the parameters using the BFGS algorithm. Finally, we present the conclusions and recommendations.
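
MATERIALS AND METHODS
Zamani and Ismail (2010) showed that the marginal distribution of the Negative-Lindley distribution is expressed as:

$$P(X = x) = \frac{\theta^{2}}{\theta+1}\binom{r+x-1}{x}\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\frac{\theta+r+j+1}{(\theta+r+j)^{2}}, \qquad x = 0, 1, 2, \ldots \tag{1}$$

where $r > 0$ and $\theta > 0$.
To estimate the parameters $(\theta, r)$, they used the maximum likelihood approach. The log-likelihood function yields:

$$\ell = \log L(r,\theta) = n\log\frac{\theta^{2}}{\theta+1} + \sum_{x=0}^{k} n_x \log\binom{r+x-1}{x} + \sum_{x=0}^{k} n_x \log\left[\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\frac{\theta+r+j+1}{(\theta+r+j)^{2}}\right] \tag{2}$$

where $n_x$ is the value at the $x$th index, that is, the observed frequency of the count $x$, and $n = \sum_{x=0}^{k} n_x$. The partial derivatives are:

$$\frac{\partial \ell}{\partial \theta} = \frac{n(\theta+2)}{\theta(\theta+1)} - \sum_{x=0}^{k} n_x\,\frac{\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\dfrac{\theta+r+j+2}{(\theta+r+j)^{3}}}{\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\dfrac{\theta+r+j+1}{(\theta+r+j)^{2}}} = 0 \tag{3}$$

$$\frac{\partial \ell}{\partial r} = \sum_{x=0}^{k} n_x \sum_{i=0}^{x-1}\frac{1}{r+i} - \sum_{x=0}^{k} n_x\,\frac{\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\dfrac{\theta+r+j+2}{(\theta+r+j)^{3}}}{\sum_{j=0}^{x}\binom{x}{j}(-1)^{j}\,\dfrac{\theta+r+j+1}{(\theta+r+j)^{2}}} = 0 \tag{4}$$

As can be noted, the first derivatives of the log-likelihood function are quite complicated. Thus, the construction of the Hessian matrix becomes difficult. To eliminate this problem, Zamani and Ismail (2010) formulated separate estimating equations to estimate $\theta$ and $r$. Following Klugman et al. (2008) (Eq. 5), Eq. 3 can be rewritten as Eq. 6. The estimate of $\theta$ is obtained by solving Eq. 6 using the quadratic formula. In the same way, Eq. 4 can be written as Eq. 7. Then $r$ is obtained iteratively by using the Newton-Raphson technique (Eq. 8), where $H(r_k)$ is the estimate of the Hessian matrix at the $k$th iterated value.
The algorithm works as follows. For an initial value of $r$, we calculate the estimate of $\theta$ using Eq. 6. Using this value of $\theta$, we solve Eq. 8 iteratively until convergence. Having obtained an update of $r$, we substitute it into Eq. 6 to update $\theta$, then solve Eq. 8 iteratively again. This cycle continues until both estimates converge.
The Newton-Raphson method described above has one major drawback: it does not provide an estimate of the joint Hessian matrix of the parameters $\theta$ and $r$. As a result, the variances of the estimates of $\theta$ and $r$ may be over- or underestimated. We note that Eq. 8 yields only an estimate of the Hessian matrix for the parameter $r$.
To overcome this problem, we propose an iterative algorithm based on the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) approach, given by steps (9)-(13). Initially, we set the parameters to zero and the Hessian matrix at the $(k-1)$th iteration to be the identity matrix. Having obtained the first set of estimates, we follow steps (9)-(13). The standard errors of $(\theta, r)$ can be obtained from the square roots of the diagonal entries of $H_k^{-1}$.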
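The joint estimation described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' MATLAB program: it assumes the alternating-sum form of the NL probability function with mixing term (θ+r+j+1)/(θ+r+j)², it relies on `scipy.optimize.minimize` with `method="BFGS"` and a finite-difference gradient in place of steps (9)-(13), and it optimises log-parameters so that r and θ stay positive (starting the log-parameters at zero therefore means r = θ = 1). The frequency table and the names `nl_pmf` and `neg_loglik` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import comb, gammaln


def nl_pmf(x, r, theta):
    """Negative-Lindley probability at one non-negative integer x (assumed alternating-sum form)."""
    j = np.arange(x + 1)
    # Lindley mixing term for each j in the binomial expansion
    g = (theta + r + j + 1.0) / (theta + r + j) ** 2
    alt_sum = np.sum(comb(x, j) * (-1.0) ** j * g)
    # log C(r + x - 1, x) through log-gamma, so r need not be an integer
    log_binom = gammaln(r + x) - gammaln(r) - gammaln(x + 1)
    return theta**2 / (theta + 1.0) * np.exp(log_binom) * alt_sum


def neg_loglik(log_params, values, freqs):
    """Negative log-likelihood for grouped counts; parameters are on the log scale."""
    r, theta = np.exp(log_params)
    probs = np.array([nl_pmf(x, r, theta) for x in values])
    probs = np.maximum(probs, 1e-300)  # guard: cancellation can push tiny values to <= 0
    return -np.sum(freqs * np.log(probs))


# Hypothetical frequency table: freqs[x] observations had count x
values = np.arange(6)
freqs = np.array([620, 240, 90, 32, 12, 6])

start = np.zeros(2)  # log r = log theta = 0, i.e. r = theta = 1
res = minimize(neg_loglik, start, args=(values, freqs), method="BFGS")

r_hat, theta_hat = np.exp(res.x)
# Delta method: se(exp(u)) is roughly exp(u) * se(u)
se_r, se_theta = np.exp(res.x) * np.sqrt(np.diag(res.hess_inv))
print(f"r = {r_hat:.3f} (se {se_r:.3f}), theta = {theta_hat:.3f} (se {se_theta:.3f})")
print("converged:", res.success)
```

The standard errors printed here come from the BFGS approximation to the inverse Hessian on the log scale, so they should be read as rough values; a finite-difference Hessian at the optimum would be a natural cross-check.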

RESULTS
We perform a simulation study to estimate (θ, r) using the BFGS approach. Initially, we generate a set of over-dispersed counts using the nbinom function in R 1.6.1 with various parameters. Then, we apply Eq. 10 to calculate the estimates of (θ, r). The programs were implemented in MATLAB. The results of the study are shown in Table 1.
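The data-generation step can be illustrated as follows. This is only a sketch: the study used R for this step, whereas the example below uses NumPy's negative binomial generator. The mean/dispersion parameterisation (variance = mean + dispersion × mean²), the function name `simulate_overdispersed` and the parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_overdispersed(n, mean, dispersion, rng):
    """Draw n negative binomial counts with variance mean + dispersion * mean**2."""
    size = 1.0 / dispersion      # negative binomial 'size' (shape) parameter
    p = size / (size + mean)     # success probability that gives the requested mean
    return rng.negative_binomial(size, p, size=n)


rng = np.random.default_rng(2024)
counts = simulate_overdispersed(5000, mean=2.0, dispersion=0.5, rng=rng)

# Over-dispersion shows up as a sample variance larger than the sample mean
print("mean:", counts.mean(), "variance:", counts.var())
```

A larger `dispersion` value widens the gap between the variance and the mean, which is how samples with various dispersion parameters can be produced.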

DISCUSSION
The simulation study was run for sample sizes ranging from 10 to 10,000. Each simulation was run 5,000 times and the estimates obtained were averaged. To estimate the parameters, we start with very small values of (θ, r). We note that as the sample size increases, the standard errors decrease significantly, which is in line with the consistency property of the estimators. In practice, we observed very few non-convergent simulations, even for large sample sizes. In comparison with the Newton-Raphson-based MLE approach, we note that for sample sizes ranging from 10 to 50 there were around 2455 non-convergent simulations, while BFGS yielded approximately 600. As the sample size increases, the number of non-convergent simulations decreases for both methods, but BFGS still yields fewer. The computational time is also encouraging and indicates that BFGS is quite fast.
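The replication scheme described above (repeat the fit over many samples, average the estimates, count the non-convergent runs) can be sketched as below. It is heavily scaled down (5 replicates and two sample sizes instead of 5,000 replicates up to n = 10,000) and rests on several assumptions: the NL log-likelihood is written in its alternating-sum form, SciPy's `method="BFGS"` with a finite-difference gradient stands in for the authors' MATLAB implementation, "non-convergent" is taken to mean that the optimiser did not report success, and the names `nl_neg_loglik` and `run_study` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import comb, gammaln


def nl_neg_loglik(log_params, values, freqs):
    """Negative NL log-likelihood for grouped counts; parameters are on the log scale."""
    r, theta = np.exp(log_params)
    total = 0.0
    for x, n_x in zip(values, freqs):
        j = np.arange(x + 1)
        g = (theta + r + j + 1.0) / (theta + r + j) ** 2
        # Guard against cancellation pushing the alternating sum to <= 0
        alt_sum = max(np.sum(comb(x, j) * (-1.0) ** j * g), 1e-300)
        log_binom = gammaln(r + x) - gammaln(r) - gammaln(x + 1)
        total += n_x * (2 * np.log(theta) - np.log(theta + 1.0) + log_binom + np.log(alt_sum))
    return -total


def run_study(sample_sizes, reps, mean, dispersion, seed=0):
    """Fit the NL model to repeated negative binomial samples and summarise per sample size."""
    rng = np.random.default_rng(seed)
    size = 1.0 / dispersion
    p = size / (size + mean)
    summary = {}
    for n in sample_sizes:
        estimates, failures = [], 0
        for _ in range(reps):
            counts = rng.negative_binomial(size, p, size=n)
            values, freqs = np.unique(counts, return_counts=True)
            with np.errstate(all="ignore"):
                res = minimize(nl_neg_loglik, np.zeros(2), args=(values, freqs), method="BFGS")
            if res.success and np.all(np.isfinite(res.x)):
                estimates.append(np.exp(res.x))  # back to (r, theta)
            else:
                failures += 1
        mean_est = np.mean(estimates, axis=0) if estimates else np.full(2, np.nan)
        summary[n] = {"mean_estimate": mean_est, "non_convergent": failures}
    return summary


summary = run_study(sample_sizes=[50, 200], reps=5, mean=2.0, dispersion=0.5)
for n, row in summary.items():
    print(n, row["mean_estimate"], row["non_convergent"])
```

Only converged fits enter the average here; whether the original study averaged over all runs or only the converged ones is not stated, so this is a design choice of the sketch.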

CONCLUSION
Based on the simulation results, we conclude that BFGS is a suitable technique for estimating the parameters of the Negative-Lindley model. It yields reliable estimates and provides consistent results. Moreover, this method of estimation provides an estimate of the standard errors of both parameters. We therefore recommend the BFGS algorithm as a more convenient approach than the traditional MLE technique for estimating the parameters of the Negative-Lindley model.