Does Over or Under Dispersion in Inverse Binomial Data Suggest Anything? A Case in Point is the Waiting Time for Both Heart-Lung Transplants

A model is an abstraction of reality. Selecting the usual inverse binomial as the underlying model for the number of patients waiting in months for heart and lung transplants is questionable, because the data do not exhibit the required balance between the dispersion and its functional equivalent in terms of the mean; instead, they show over- or under-dispersion. This over/under-dispersion has made it a challenge to find an appropriate underlying model for the data. This article offers a new model, named the Imbalanced Inverse Binomial Model (IIBM), to resolve the methodological breakdown. A statistical methodology based on the IIBM is devised to analyze the collected data and is illustrated with real-life data on the number of patients waiting in months for heart and lung transplants together. The results of the illustration suggest that the new approach is powerful and brings out information that would otherwise have been missed. Specifically, the odds of receiving the organs are higher under the estimated imbalance in the data than under an ideal zero imbalance in all the states except Alabama. The odds are also consistently higher under the estimated imbalance than under an ideal zero imbalance across all the age groups waiting in months. Further research is needed to identify and explain the factors that might have caused the imbalance between the observed dispersion in the data and its functional equivalent under the inverse binomial model. The contents of this article remain the foundation on which that future research will be built.


INTRODUCTION
Various factors trigger Cardiovascular Disease (CVD), which damages the kidneys, brain, lungs and heart. An ounce of prevention is worth a ton of treatment. Knowing the causal factors of CVD helps to reduce the risk of contracting it (Marieb and Hoehn, 2010). These factors include high blood pressure, high serum cholesterol levels, excessive alcohol consumption, sugar level in food, family history (genetics), obesity, lack of physical activity, psychosocial factors, diabetes mellitus, air pollution and smoking, among others. According to a survey in 2012, about 20.55% of men and 15.9% of women smoke. Based on measured serum cotinine levels above 0.05 ng mL⁻¹, about 40.1% of non-smokers had been exposed to second-hand smoke. Consequently, the risk of getting CVD has become a serious issue in the United States of America (USA), as in other nations. About 31.9 million people above the age of 20 years have heart and lung ailments. In 2010 alone, 1 in 9 death certificates (279 098 deaths) in the USA mentioned heart failure. Every 40 sec, someone in the USA experiences a stroke and every 4 min, someone dies. See Arcasoy and Kotloff (1999); Budiani-Saberi and Delmonico (2008) and Finn (2000) for details of issues with respect to heart and lung transplants. A recent article by Shanmugam (2013a) discusses the

AJBS
issues and their resolutions in finding matching kidney and liver organs for transplant. These numbers convey the gravity of heart- and lung-related illness, which is therefore the theme of this article.
Dr. Norman Shumway is widely regarded as the father of heart transplantation, although the world's first adult human heart transplant was performed by a South African doctor, Christiaan N. Barnard, on 3 December 1967, with the assistance of his brother, Marius Barnard, and a team of thirty people. The operation lasted nine hours. Heart transplantation is not a cure for heart disease, but rather a life-saving treatment intended to improve the quality of the remaining life of the recipients. Worldwide, about 3,500 heart transplants are performed annually, the vast majority of them in the United States (2,000-2,300 annually). Sayeed (2009) warns about the ethical issues in transplants. See Morris (2004); Reitz et al. (1982); Schlich (2010); Trzepacz and DiMartini (2000) and WHO (2008) for further details on heart and lung transplants.
Due to birth defects, pulmonary hypertension, emphysema, bronchiectasis or cystic fibrosis, many CVD patients require both a heart and a lung transplant. Most of them are above 55 years old. About 115,152 people are waiting for a suitable organ in the USA. The wait times and success rates differ significantly across organs. The combined heart-lung transplant is a well-established procedure: in 1981, the first successful heart-lung transplant was performed by Dr. Bruce Reitz of Stanford University on a woman with idiopathic pulmonary hypertension. Because of the shortage of suitable donors, however, the combined transplant is rare; only about a hundred such transplants are performed per year in the USA. The waiting time to find both a heart and lungs is longer than for a single organ. The donor organs have to be healthy, of the right size to oxygenate the patient adequately, and matched to the patient's blood type. Until 2005, the Organ Procurement and Transplantation Network (OPTN) allocated organs to recipients on a first-come, first-served basis. Since then, allocation has been based on the lung allocation score, an improved system that accounts for various measures of the recipient's health and need rather than how long the recipient has been waiting. The length of the waiting time matters when multiple patients with the same lung allocation score are waiting. Most candidates for heart-lung transplants have life-threatening damage to both their heart and lungs. In the USA, most prospective candidates have between twelve and twenty-four months to live. At any one time, about 250 people are registered for heart-lung transplantation with the United Network for Organ Sharing (UNOS) in the USA, of whom around forty will die before a suitable donor is found. Once suitable donor organs are found, the surgeon makes an incision starting above and finishing below the sternum, cutting all the way to the bone. In 2004, only 39 heart-lung transplants were performed in the entire USA and only 75 worldwide.
For comparison, in the same year there were 2,016 heart and 1,173 lung transplants. The aim of this article is to analyze the uncertainty pattern of the waiting time (in months) of patients for both organs in a random sample of twelve states in the year 2008: Alabama (AL), California (CA), Florida (FL), Kentucky (KY), Maryland (MD), Minnesota (MN), Missouri (MO), Ohio (OH), Pennsylvania (PA), Texas (TX), Utah (UT) and Washington (WA). Next, we examine the waiting time data in Table 1 and identify an appropriate underlying model for the data.

Why Is a New Model Necessary?
A model is an abstraction of reality. Recently, Shanmugam (2013b) demonstrated the importance of having an appropriate model to capture the fear among women in several nations of reporting incidents of rape. Another article by Shanmugam (2013c) pointed out that the exponential model had to be tweaked to address the chance of a longer survival time if a cancerous kidney is removed.
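Let the Random Variable (RV) Y = 0, 1, 2, 3, … be the number of months a patient waits for suitable lung and heart organs from a donor. Suppose that the probability of finding both organs in a month is $r/(r+\mu)$, where $r \ge 1$ and $\mu > 0$ denote the number of organs in need and the unknown average number of months the recipients wait. In our discussion, r = 2 because both a lung and a heart are needed. Trivially, the probability of not finding both organs in a month is $\mu/(r+\mu)$. The inverse binomial distribution (IBD) for Y is then Equation (1):
$$\Pr(Y = y \mid r, \mu) = \binom{r+y-1}{y}\left(\frac{r}{r+\mu}\right)^{r}\left(\frac{\mu}{r+\mu}\right)^{y}, \quad y = 0, 1, 2, \ldots \tag{1}$$
The model (1) has mean $\mu = r/\mathrm{Odds}$, where $\mathrm{Odds} = r/\mu$ denotes the odds of finding the organs, and dispersion Equation (2):
$$\upsilon = \mu\left(1 + \frac{\mu}{r}\right) \tag{2}$$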
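The balance between mean and dispersion that the inverse binomial imposes can be checked numerically. The following is a minimal sketch, assuming the standard mean parameterization of the inverse (negative) binomial in which the monthly chance of finding the organs is r/(r + µ); the function names and the value µ = 6 months are hypothetical and chosen only for illustration.

```python
from math import comb


def inverse_binomial_pmf(y: int, r: int, mu: float) -> float:
    """P(Y = y) under the assumed mean parameterization of the inverse binomial."""
    p = r / (r + mu)  # assumed monthly chance of finding the organs
    return comb(r + y - 1, y) * p**r * (1 - p) ** y


def numeric_moments(r: int, mu: float, y_max: int = 500):
    """Total probability, mean and dispersion summed over y = 0..y_max."""
    probs = [inverse_binomial_pmf(y, r, mu) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * q for y, q in enumerate(probs))
    dispersion = sum((y - mean) ** 2 * q for y, q in enumerate(probs))
    return total, mean, dispersion


# r = 2 (heart and lung); mu = 6 months is a made-up value, not an estimate.
total, mean, dispersion = numeric_moments(r=2, mu=6.0)
print(round(total, 6), round(mean, 6), round(dispersion, 6))
```

With these inputs the summed probability should be close to 1, the mean close to 6 and the dispersion close to 6(1 + 6/2) = 24, which is the functional equivalent of the dispersion in terms of the mean under zero imbalance.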

The inverse binomial is employed in many other application areas. For example, it has been applied to estimate the benefits of breast feeding, and the impact of over-, equi- and under-dispersion has been investigated in insurance data. Shanmugam (2011) utilized the over- or under-dispersion to create an index that assesses how much the Poissonness has been diluted in the collected data.
Notice that the mean increases with the number r of needed organs and changes with the odds of finding them. Furthermore, when the dispersion $\upsilon$ is more than its functional equivalent $\mu(1 + \mu/r)$, there is over-dispersion, indicating an imbalance in the model requirement.
Likewise, when $\upsilon < \mu(1 + \mu/r)$, there is under-dispersion, indicating the opposite imbalance. Hence, we define the imbalance parameter $\phi$ in Equation (3), which is normed to fall in the closed interval [0,1]. Blending (3) into the model (1) yields the model (4), which is new to the literature and hence is named here the imbalanced inverse binomial distribution (IIBD). Notice that when there is a balance (that is, $\phi = 0$), the IIBD (4) reduces to the IBD (1) as a special case. The mean and dispersion of the IIBD (4) are given in Equations (5) and (6) respectively. Notice that when there is balance (that is, $\phi \to 0$), the mean (5) and dispersion (6) reduce to the mean $\mu$ and dispersion $\mu(1 + \mu/r)$ of the usual inverse binomial distribution (1).
With the coordinates $z = \mu_\phi$, the mean on the y-axis (that is, $y = \mu$) and the imbalance on the x-axis (that is, $x = \phi$), the mean (5) and dispersion (6) are displayed as 3-dimensional graphs in Fig. 1 and 2 respectively. Under zero imbalance, both the mean and dispersion are planes. Under a non-zero imbalance, however, the mean is a bent plate (Fig. 1) and the dispersion is a convexly bent plane (Fig. 2). The impact of imbalance on the mean and dispersion is clear.
The IIBD (4) suggests that under an imbalance $\phi \neq 0$ between the dispersion and its functional equivalent in terms of the mean (that is, with over- or under-dispersion), the odds of receiving the organs become those in Equation (7). When there is a balance (that is, $\phi = 0$), the odds (7) of the IIBD (4) reduce to the odds $\mathrm{Odds}_{\phi=0} = r/\mu$ of the IBD (1) as a special case. In other words, the odds $\mathrm{Odds}_\phi$ under an imbalance are related (Fig. 1) to the odds $\mathrm{Odds}_{\phi=0}$ under balance through Equation (8). With the coordinates $z = \mathrm{Odds}_\phi$, $y = \mu$ and $x = \phi$, the odds (8) are displayed in a 3-dimensional graph in Fig. 3. Under zero imbalance, the odds form a smooth surface. Under a non-zero imbalance, the odds are volatile, influenced by both the mean $\mu$ and the level of imbalance $\phi$.
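As a rough illustration of what over- and under-dispersion mean in practice, the sketch below compares a sample dispersion with the value the inverse binomial would imply for the same mean. It assumes the usual inverse binomial dispersion function, mean × (1 + mean/r); the helper names and both data sets are hypothetical and are not taken from Table 1. It does not compute the imbalance parameter φ itself.

```python
from statistics import mean, variance


def balanced_dispersion(y_bar: float, r: int = 2) -> float:
    """Dispersion implied by the inverse binomial for a given mean (assumed form)."""
    return y_bar * (1 + y_bar / r)


def dispersion_status(waits, r: int = 2, tol: float = 1e-9) -> str:
    """Label a sample as over-, under- or equi-dispersed relative to the model."""
    gap = variance(waits) - balanced_dispersion(mean(waits), r)
    if gap > tol:
        return "over-dispersed"
    if gap < -tol:
        return "under-dispersed"
    return "equi-dispersed"


# Hypothetical waiting times in months, invented for this example.
spread_out = [0, 1, 1, 2, 3, 5, 8, 12]
clustered = [3, 4, 4, 5, 4, 4]
print(dispersion_status(spread_out), dispersion_status(clustered))
```

Both samples have a mean of 4 months, so with r = 2 the model-implied dispersion is 12 in each case; the first sample exceeds it and the second falls well short of it.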

Is the Imbalance Significant?
Practitioners might want to judge whether the collected data $y_1, y_2, \ldots, y_n$ of size $n \ge 2$ exhibit a significant imbalance measure to warrant the application of the IIBD (4) instead of the usual IBD (1). Let $\bar{y}$ and $s_y^2$ be the data mean and dispersion respectively. The imbalance is estimated by $\hat{\phi}_{mle}$ in Equation (9). Is the estimated imbalance measure $\hat{\phi}_{mle}$ in the data significant? This amounts to finding the probability value (that is, p-value) for the null hypothesis $H_0: \phi = 0$ to be true according to the collected data. For this purpose, we resort to Neyman's C(α) technique, a powerful methodology that Neyman outlined based on the regression concept. See Shanmugam (1992) for step-by-step details about deriving Neyman's C(α) test statistic. In our situation, the test statistic turns out to be
$$T = \frac{r\, s_y^2}{\bar{y}\,(r + \bar{y})}$$
Imposing the formulas for the moments of T, including the approximation for $\operatorname{Var}(T \mid r, \mu, \upsilon, \phi)$, the standardized statistic Z follows the standard normal distribution. It means that the p-value for rejecting the null hypothesis $H_0: \phi = 0$ in favor of the research hypothesis $H_1: \phi \neq 0$ is given by Equation (13). When the null hypothesis is rejected at level α and the true value of the imbalance measure is known to be $\phi_1$, the probability of accepting the true value $\phi_1$ is called the statistical power and is given by Equation (14).
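The test statistic can be computed directly from a sample. The sketch below evaluates T as the ratio of r times the sample dispersion to ȳ(r + ȳ), so that a value near 1 corresponds to balance. The two-sided normal tail probability is an assumption made here to match a two-sided research hypothesis; it is not necessarily the exact form of Equation (13). Because the standardization of T is not derived here, the standardized value Z is taken as an input. The function names and the data are hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, variance


def c_alpha_statistic(waits, r: int = 2) -> float:
    """T = r * s^2 / (y_bar * (r + y_bar)); close to 1 when the data are balanced."""
    y_bar = mean(waits)
    if y_bar <= 0:
        raise ValueError("the sample mean must be positive")
    return r * variance(waits) / (y_bar * (r + y_bar))


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z (assumed two-sided form)."""
    return erfc(abs(z) / sqrt(2))


# Hypothetical waiting times in months, invented for this example.
waits = [0, 1, 1, 2, 3, 5, 8, 12]
T = c_alpha_statistic(waits, r=2)
print(round(T, 4), round(two_sided_p_value(1.96), 4))
```

Keeping the p-value helper separate from the statistic makes it easy to swap in a different standardization once the variance of T has been worked out for the data at hand.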

Illustration
We now illustrate the above methodology using the data in Table 1 on the number, Y, of patients waiting in months for a heart-lung transplant in a random sample of twelve states in the USA. The sample mean $\bar{y}$ and dispersion $s_y^2$ are calculated for each state and for each time period in months: less than one, 1-3, 3-6, 6-12, 12-24, 24-36, 36-60 and more than 60. The underlying model for Y is inverse binomial with r = 2 (the number of organs) when there is zero imbalance between the data dispersion and its functional equivalent in terms of the mean.
To check whether there is zero imbalance, we first compute and display (Table 1 and 2) the imbalance measure $\hat{\phi}_{mle}$ using (9). Their significance levels (that is, p-values) are calculated using (13). The smaller the p-value, the more significant the estimated imbalance measure. In this sense, the estimated imbalances for those who waited less than one month and for those who waited more than 60 months are significant. Next, let us examine the power of the methodology in the event that the true value of the imbalance parameter is $H_1: \phi = \phi_1 = 0.5$. The power is the probability of accepting the true $H_1: \phi = \phi_1 = 0.5$ according to the collected data. The values of the power are calculated using (14) and are displayed in Table 1.

CONCLUSION
This article has contributed a methodology that extends the usual inverse binomial distribution to account for the over- or under-dispersion present in the data. In some cases, the over- or under-dispersion is significant enough to tilt the odds, mean and dispersion.
What factors are causing the over- or under-dispersion? To assess this, data on more covariates are needed. A regression-type methodology also has to be devised. The contents of this article remain the foundation for the future development of such a regression methodology, which is likely to benefit health and medical professionals.