What Else do Epileptic Data Reveal

Problem statement: Aggregating and analyzing the data of all patients with statistical methodologies, as is often done in a macro sense, would not be useful when the physician's professional interest was only to provide the best medical care to the patient. For this purpose, the individual data of the patient involved should be analyzed and modeled in a micro sense, so that the physician could notice whether the treatment was helping that particular patient, as demonstrated in this article. Understandably, a medical treatment would work in some patients but not in all patients. The physician would be helped more by knowing whether the treatment worked in a given patient. Otherwise, the physician might switch to another treatment for the patient. No appropriate methodology existed in the literature to perform such a profile analysis. Hence, this article introduced a new statistical methodology and demonstrated it using epileptic data. Approach: A probabilistic approach was necessary, as the number of epilepsy seizures in a patient involved a degree of uncertainty. In some patients, the chance of a large number of seizures might be higher, depending on his/her proneness. The proneness would be a latent and non-measurable factor and hence, it could be captured only as a parameter. The traditional Poisson distribution was not suitable, as it assumed that patients were homogeneous with respect to proneness. The probability model should match the reality. A generalized Poisson model with an additional parameter to describe the individual patient's proneness was necessary, as the article demonstrated. The author had introduced such a model and investigated several of its statistical properties in an earlier article. A new methodology based on that probability model was devised in this article for assessing the efficacy of a treatment for a chosen patient in an epilepsy study.
Results: Physicians pondered whether the epilepsy seizure incidence data supported their hunch that their treatment was successful for a patient. This kind of case-by-case profiling was necessary to exercise the option of switching to another treatment for the patient. Aggregated analysis of the medical data of all patients did not help in making a decision for a particular patient. The results of this article demonstrated how the new methodology worked on epilepsy data to confirm when the treatment was successful. Patients, nurses and physicians were eager to develop an early warning system indicating how successful the treatment was in a patient. Such an early warning system was feasible, after finding the probability pattern of the data, because of the new methodology in this article. The discussions in this article could be emulated in other medical data analyses to address a patient's profile. Conclusions/Recommendations: As demonstrated with an example using epilepsy data, other medical data could be fit, analyzed and interpreted using the incidence rate restricted Poisson model. Not only the incidence rate but also the restriction level on the incidence rate due to the treatment could be estimated and tested. The proximity of the patients could then be identified using indices based on mapping the principal components of their data, as demonstrated in the article.


INTRODUCTION
The frontiers of medical discovery are expanding remarkably in this 21st century with inter-disciplinary cooperative research efforts. To advance medical discoveries, researchers are in great need of powerful and appropriate statistical methodologies to extract and interpret pertinent medical information. Applied statisticians are constantly inventing new methodologies to meet the needs. Yet, data like the seizure incidences remain under-utilized. Finding an appropriate underlying probability model for the data pattern has to be innovative and tailored to the needs of medical researchers as demonstrated in this article.
To be specific, consider the epilepsy seizure incidence data in Table 1 and 2 (see Lu and Wang, 2003, for clinical details). The data were collected from fifty-nine patients who repeatedly experienced epileptic seizures. Twenty-seven of them were in a control group and received a "placebo" drug. The remaining thirty-two patients received the drug progabide. Each patient's age and number of seizures prior to the beginning of the treatment period were noted. The numbers of seizures in each of the four treatment years were recorded. The first task is to frame a modeling strategy that extracts and best utilizes the information in the data to address the patient's profile and the treatment effectiveness, as demonstrated in this article.
First, let me start with the medical background. What is a seizure? A seizure is a transient symptom of irregular neuron activity. Seizures are not confined to humans; animals exhibit such episodes too. Recurring seizures are recognized as epilepsy in the medical discipline. Is epilepsy curable? Is a particular treatment effective? Do the patient's age and the frequency and severity of the seizures have a significant influence on its cure? The medical community is split on this issue. Some physicians believe that the epileptic seizure incidence can be significantly reduced by a successful treatment. Neurologists are actively tracing the root causes of epileptic seizures. In curing epilepsy, does age make any difference? About 30-50% of the patients above 80 years of age seem to experience a second seizure. What else do the chosen epilepsy incidence data reveal? This tutorial article explores the data to answer this and other pertinent questions.
Seizures impair body movements, conscious awareness and cognitive behaviors. A loss of memory occurs after every episode. Some patients report dizziness, lightheadedness and a tight chest prior to the episode. Studies show that some seizures go unnoticed, as they occur even during sleep. For recent accounts of medical advancements to cure epilepsy, see Fisher et al. (2005); Berg (2008); Shukla et al. (2004) and Binjadhnan and Ahmad (2010).
Epileptic patients and the physicians who treat them are eager to develop an early warning system. Is it feasible? It all depends on completely and correctly capturing the information in the patient's data. Such capturing requires the best possible underlying probability pattern, and finding it is often a challenge. The challenge is intense because of hidden restrictions on the seizure incidence rate due to the treatment effect.
Other patients in both groups exhibit under dispersion, with some exceptions. The placebo patients with ID # 13 and ID # 14 have an over dispersion effect in year 3 and in year 4. Therefore, the commonly used Poisson probability model is clearly inappropriate for the data in Table 1 and 2. However, the Incidence Rate Restricted Poisson (IRRP) model would be suitable even for the exceptional cases, because the usual Poisson model is a particular case of the IRRP model. What is the IRRP model? Shanmugam (1991) introduced the IRRP model to understand traffic accident patterns. This model is probably not familiar to all medical researchers. No other article or book exists in the literature for medical researchers to learn to interpret data patterns. To fulfill this apparent need, this tutorial article with discussions is worthwhile and hence, has been prepared. The discussions in this article can be emulated in other medical data analyses to address a patient's profile. Patients, nurses and physicians are often eager to develop an early warning system. Is it feasible? The answer is affirmative if the data pattern is correctly identified. An appropriate underlying model for the collected data is an unavoidable necessity. Could it be the IRRP model?
Could the prior number of seizures before beginning the treatment and the patient's age be valuable predictors in an early warning system to project the future number of seizures? The ages of the patients (except the progabide patient with ID # 9) range from 18 to 43. Such a regression could address whether the epilepsy illness has progressively worsened or been cured. The parameters of the IRRP regression are the seizure incidence rate and its restriction level. This article demonstrates how to test the significance of the estimated restriction level using a property of the noncentral chi-squared probability model, and how to test the significance of the estimated seizure incidence rate using the normal probability model. The receiver operating characteristic curve of the cumulative distribution function of the seizure incidence rate in terms of the cumulative distribution function of the restriction level reveals the dynamics of the medical treatment, as shown in Fig. 3 through Fig. 10. In the end, a principal component analysis is performed using the estimated incidence rate and its restriction level for all four years in both groups. The principal component results are displayed in Fig. 11 and 12 and interpreted subsequently.
Incidence rate restricted Poisson model: Let Y be the number of seizures experienced by a patient. This number could be any one of the observable collection of possibilities {0, 1, 2, 3, …}. The random variable Y is of Poisson type because of its rarity. The seizure incidence rate, λ, is understandably restricted due to the non-measurable treatment effect and the patient's biologic and neurologic defects, among others. The directly measurable factors in the epilepsy data are the patient's age and the prior number of seizures before the beginning of the treatment, but not the treatment effect. The collective impact of all non-measured factors on the seizure incidence rate is portrayed here as the restriction parameter β. A negative value of β is indicative of under dispersion (that is, the variance is smaller than the mean) and a positive value of β is indicative of over dispersion (that is, the variance is larger than the mean).
An infinite value of β is indicative of equal dispersion (that is, the variance is equal to the mean).
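The three dispersion scenarios can be checked directly from a patient's counts by comparing the sample variance with the sample mean. The following Python sketch is only an illustration: the function name, the tolerance and the use of the n − 1 sample variance are assumptions made here, and the counts in the usage lines are made-up numbers, not entries of Table 1 or 2.

```python
import math
from statistics import mean, variance


def dispersion_type(counts, rel_tol=1e-9):
    """Classify seizure counts as 'under', 'equal' or 'over' dispersed.

    Assumption: the sample variance uses the n - 1 denominator.
    """
    ybar = mean(counts)
    s2 = variance(counts)
    if math.isclose(s2, ybar, rel_tol=rel_tol):
        return "equal"   # variance equals mean (beta infinite)
    if s2 < ybar:
        return "under"   # variance below mean (beta negative)
    return "over"        # variance above mean (beta positive)


# Hypothetical counts, for illustration only.
print(dispersion_type([4, 5, 6]))  # variance 1 < mean 5
print(dispersion_type([1, 3]))     # variance 2 == mean 2
print(dispersion_type([1, 5]))     # variance 8 > mean 3
```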
In a scenario of equal dispersion, the IRRP model in (1) reduces to the usual Poisson model. In this scenario, the incidence rate is unrestricted, which amounts to no medicine/treatment effect. In all other scenarios, with a finite level of restriction on the seizure incidence rate, the Incidence Rate Restricted Poisson (IRRP) model of Shanmugam (1991), with the probability mass function in (1), would capture the pattern. It is appropriate for the nonnegative integer random variable Y, where the incidence parameter λ is restricted by an unknown restriction parameter β and y = 0, 1, 2, ….
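To make the role of the two parameters concrete, here is a hedged Python sketch of a probability mass function of the generalized Poisson type. The functional form P(Y = y) = (1 + y/β)^(y−1) (λ e^(−λ/β))^y / (y! e^λ) is an assumption adopted for illustration; it is not quoted from the text above and should be checked against Shanmugam (1991). The function name is hypothetical, and the sketch only evaluates positive or infinite β; negative β would need extra care with the support, which is not addressed here.

```python
import math


def irrp_pmf(y, lam, beta=math.inf):
    """Assumed IRRP-type pmf (illustrative form, not taken from the source):
    (1 + y/beta)**(y - 1) * (lam * exp(-lam/beta))**y / (y! * exp(lam)).
    Requires lam > 0 and beta > lam (or beta infinite).
    """
    if y < 0:
        return 0.0
    if math.isinf(beta):
        # Unrestricted case: the usual Poisson pmf.
        return math.exp(-lam) * lam**y / math.factorial(y)
    log_p = ((y - 1) * math.log1p(y / beta)
             + y * (math.log(lam) - lam / beta)
             - math.lgamma(y + 1)
             - lam)
    return math.exp(log_p)


# Illustrative check with made-up parameter values.
lam, beta = 2.0, 10.0
probs = [irrp_pmf(y, lam, beta) for y in range(200)]
total = sum(probs)
mean_y = sum(y * p for y, p in enumerate(probs))
print(round(total, 6), round(mean_y, 6))
```

Under this assumed form, a very large β gives probabilities that are numerically close to those of the usual Poisson model, in line with the equal dispersion scenario.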
The estimates of the seizure incidence rate and its restriction parameter in the IRRP model (1) are given in (2) and (3), respectively, where ȳ and s² denote the data mean and variance.
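As an illustration of how such estimates could be computed from ȳ and s², the Python sketch below uses moment matching. It assumes that the model mean is λ/(1 − λ/β) and the model variance is λ/(1 − λ/β)³, which are the moments of a generalized Poisson form; both the assumption and the resulting expressions λ̂ = ȳ√(ȳ/s²) and β̂ = λ̂/(1 − √(ȳ/s²)) are introduced here and are not quoted from (2) and (3). They do agree with the stated behavior: β̂ is negative under under dispersion, positive under over dispersion and infinite when s² = ȳ. The function name and the counts are hypothetical.

```python
import math
from statistics import mean, variance


def irrp_moment_estimates(counts):
    """Moment-type estimates (lam_hat, beta_hat) under the assumed moments
    mean = lam / (1 - lam/beta) and variance = lam / (1 - lam/beta)**3.
    Uses the n - 1 sample variance (an assumption).
    """
    ybar = mean(counts)
    s2 = variance(counts)
    if ybar <= 0 or s2 <= 0:
        raise ValueError("need a positive mean and a positive variance")
    ratio = math.sqrt(ybar / s2)          # equals 1 - lam/beta
    lam_hat = ybar * ratio
    if math.isclose(ratio, 1.0):
        return lam_hat, math.inf          # equal dispersion
    return lam_hat, lam_hat / (1.0 - ratio)


# Made-up counts, for illustration only.
print(irrp_moment_estimates([4, 5, 6]))   # under dispersion: beta_hat < 0
print(irrp_moment_estimates([1, 5]))      # over dispersion: beta_hat > 0
print(irrp_moment_estimates([1, 3]))      # equal dispersion: beta_hat = inf
```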
In the scenario of equal dispersion, recall that s² = ȳ and consequently β = ∞ according to (3). The placebo patients with ID #3 and ID #13 exhibit over dispersion, while the placebo patient with ID #18 exhibits equal dispersion in year 1 (Fig. 3). All other placebo patients exhibit under dispersion in year 2 (Fig. 4). The placebo patients with ID #13 and ID #14 exhibit over dispersion, while none exhibits equal dispersion in year 3. All others exhibit under dispersion (Fig. 5). The placebo patients with ID #13 and ID #14 exhibit over dispersion, while the patient with ID #18 exhibits equal dispersion in year 4 (Fig. 7).
A unique property of the usual Poisson model (that is, β = ∞) is the equality of mean and variance. The usual Poisson model is inappropriate when this property is absent in the data. In that case, the seizure incidence rate is evidently restricted. See Shanmugam (1991) for the full inferential properties of the IRRP model. The results needed for the discussion are quoted below. The probability of rejecting the true null hypothesis H0: β = ∞ (meaning that the seizure incidence rate is unrestricted or, equivalently, the treatment is not effective) in favor of the false alternative H1: β < ∞ (meaning that the seizure incidence is restricted or, equivalently, the treatment is effective) is given in Shanmugam (1991) in terms of the upper tail area under the standard normal curve for a given significance level 0 < α < 1. An unrestricted seizure incidence rate is synonymous with an ineffective medicine/treatment.
The power, that is, the probability of rejecting the false null hypothesis H0: β = ∞ in favor of the true alternative H1: β = β1 < ∞, is expressed in terms of the normal cumulative distribution function (cdf).
Likewise, the true null hypothesis H0: λ = λ0 about the seizure incidence rate is rejected in favor of the false alternative hypothesis H1 according to a rule given in Shanmugam (1991). The power, that is, the probability of rejecting the false H0: λ = λ0 in favor of the true H1: λ = λ1, is expressed with its own degrees of freedom and non-centrality parameter. The formulas in (2) through (8) are demonstrated in the section that follows with the data in Table 1 and 2.
A demonstration of epileptic data analysis with IRRP model: The first task is to utilize the seizure data in Table 1 and 2 to estimate the IRRP model parameters. The incidence pattern for each patient should be captured for each year. To capture this pattern for year 1, the mean ȳ and dispersion s² of the patient's seizure incidences up to year 1 are computed from the number of seizures before beginning the treatment and the number of seizures in year 1. Substituting them in (2) and (3), the seizure incidence rate and restriction level for year 1 are estimated. With the inclusion of the observed seizure incidence in year 2, the mean ȳ and dispersion s² of the patient's seizure incidences up to year 2 are updated and substituted again in (2) and (3) to estimate the model parameters for year 2. This process of calculation and estimation is continued for all four years. These estimates are displayed in Table 4 for the Placebo group (excluding the two patients who exhibited equal dispersion) and in Table 5 for the Treatment group. For testing the hypothesis H0: β = ∞ (meaning that the seizure incidence is unrestricted) against the alternative H1: β < ∞ (meaning that the seizure incidence is restricted), the test results are displayed in Table 6 for the placebo group and in Table 7 for the progabide group. The significant ones are displayed in boldface.
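The year-by-year updating described above can be sketched in Python as follows. The estimator inside is a moment-type stand-in that assumes a model mean of λ/(1 − λ/β) and a model variance of λ/(1 − λ/β)³; it is an assumption used for illustration and is not quoted from (2) and (3). The function names are hypothetical and the counts are made-up numbers, not entries of Table 1 or 2.

```python
import math
from statistics import mean, variance


def moment_estimates(counts):
    """Assumed moment-type estimates (lam_hat, beta_hat); see lead-in."""
    ybar, s2 = mean(counts), variance(counts)  # n - 1 variance (assumption)
    if ybar <= 0 or s2 <= 0:
        raise ValueError("need a positive mean and a positive variance")
    ratio = math.sqrt(ybar / s2)
    lam_hat = ybar * ratio
    beta_hat = math.inf if math.isclose(ratio, 1.0) else lam_hat / (1.0 - ratio)
    return lam_hat, beta_hat


def yearly_estimates(baseline, yearly_counts):
    """Re-estimate the parameters each year from all counts seen so far:
    the pre-treatment count plus the counts up to that year."""
    history = [baseline]
    results = []
    for year, count in enumerate(yearly_counts, start=1):
        history.append(count)
        lam_hat, beta_hat = moment_estimates(history)
        results.append((year, lam_hat, beta_hat))
    return results


# Hypothetical patient: 6 seizures before treatment, then 5, 4, 5, 6.
for year, lam_hat, beta_hat in yearly_estimates(6, [5, 4, 5, 6]):
    print(f"year {year}: lambda_hat={lam_hat:.3f}, beta_hat={beta_hat:.3f}")
```

For this made-up patient the variance stays below the mean in every year, so each estimated restriction level is negative, which is the under dispersion scenario.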
For example, the hypothesis H0: β = ∞ is not rejected for placebo patient 1 in year 1 but is rejected in years 2, 3 and 4. Another example is progabide patient 4, for whom the hypothesis H0: β = ∞ is not rejected in year 2 but is rejected in years 1, 3 and 4. The boldface entries in Table 6 and Table 7 indicate the scenarios in which the hypothesis H0: β = ∞ is rejected.
There appear to be relationships among the prior number of seizures, the age and the number of seizures in year 1 of the patients, as exhibited in Fig. 9 for placebo patients and in Fig. 10 for progabide patients. The prior number of seizures is lower at older ages in both groups.
What relationships exist among the estimates of the incidence rate and the restriction level? Fig. 11 through Fig. 14 reveal the pattern among placebo patients over the four years. Similar patterns among progabide patients over the four years are exhibited in Fig. 15 through Fig. 18. In year 1, the restriction level is stable irrespective of the seizure rate in the Placebo group (Fig. 11). In years 2, 3 and 4, the restriction level increases along with an increasing seizure rate due to effective treatment (Fig. 12 through 14). In the progabide group, in years 1, 2, 3 and 4, the restriction level increases along with an increasing seizure rate due to effective treatment (Fig. 15 through 18). There are some anomalies in both groups, as evidenced in Fig. 11 through Fig. 18. In medical studies like this, the ideal incidence rate to attain is λ0 = 0. Is it attained among the epilepsy patients in our data? Could the null hypothesis H0: λ = λ0 = 0 about the ideal incidence rate be rejected at significance level α = 0.05 according to the collected data?
The bold-faced values in Table 8 and in Table 9 are indicative of rejecting H0: λ = λ0 = 0 for placebo patients and for progabide patients, respectively. For example, the null hypothesis H0: λ = λ0 = 0 is not rejected for the placebo patient with ID # 14 in years 1, 2, 3 and 4, and for the progabide patient with ID # 3 in years 1, 2 and 3 only. The null hypothesis H0: λ = λ0 = 0 is rejected for the progabide patient with ID # 3 in year 4. These power values are displayed in Table 8 for placebo patients and in Table 9 for progabide patients.
The receiver operating characteristic curve of the cumulative distribution function of the seizure incidence rate in terms of the cumulative distribution function of the restriction level is shown in Fig. 19 through Fig. 22 for placebo patients and in Fig. 23 through Fig. 26 for progabide patients. The pattern disappears after year 1, which is indicative of effective treatment. The powers for the seizure incidence rate and its restriction level due to treatment effect are displayed for all years in Fig. 19 through Fig. 22 for the placebo group and in Fig. 23 through Fig. 26 for the progabide group of epilepsy patients.
Notice that the power for the incidence rate is stable in year 1 for the placebo group but not for the progabide group. The power varies considerably across all ages in both groups in all years. Likewise, the power for the restriction level varies considerably in all years for both groups of epilepsy patients. This phenomenon is just the tip of the "iceberg" in a medical sense: there must have been some unique personal metabolic characteristics among the patients. A scrutiny of the patients' personal characteristics is necessary to detect the full details. Because such information about the patients in these two groups is lacking, this line of research is not pursued in this article. The next natural statistical analysis to perform with the data involves principal components. The estimated seizure incidences and the restriction levels in year 1 through year 4 are considered for the principal components analysis. Fig. 27 for Placebo patients and Fig. 28 for Progabide patients portray the results for the first three principal components. There is no other pattern among the estimates of the incidence rates and their restriction levels to comment on.
To check whether a pattern exists, a principal component analysis was performed with the estimates of the seizure rate and restriction level for placebo and progabide patients. The first two principal components explained 77% of the total variation in the placebo group and 90% of the total variation in the progabide group. In the Placebo group, the first principal component picked up the restriction level in year 3 and the seizure incidence rate in years 1-4 as significant factors. In the Progabide group, the first principal component picked up the restriction level in years 1-4 and the seizure incidence rate in years 1-4 as significant factors.
In the Placebo group, the second principal component picked up the restriction level in years 1 and 2 as significant factors. In the Progabide group, the second principal component picked up only the age as a significant factor. Using the factor loadings of the two principal components, two indices are computed for each patient. The two indices are used to graphically classify the proximity of patients in each group. Figure 29 shows the Placebo patients' proximity and Fig. 30-32 show the Progabide patients' proximity.
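The computation of two indices per patient from the loadings of the first two principal components can be sketched as below. This is a generic principal component analysis on standardized columns using NumPy's singular value decomposition; whether the analysis in this article standardized the variables is not stated, so that step is an assumption. The function name is hypothetical and the input matrix is synthetic random data standing in for the per-patient estimates, not the estimates of Table 4 or 5.

```python
import numpy as np


def principal_component_scores(X, n_components=2):
    """Return (scores, loadings, explained_ratio) for the leading components.

    Rows of X are patients; columns are variables (for example, yearly
    incidence rate and restriction level estimates). Columns are
    standardized first (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)        # share of total variation
    loadings = Vt[:n_components].T         # one column per component
    scores = Z @ loadings                  # one index per component, per patient
    return scores, loadings, explained[:n_components]


# Synthetic placeholder data: 12 hypothetical patients, 8 variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 8))
scores, loadings, explained = principal_component_scores(X_demo)
print(scores.shape, loadings.shape, explained)
```

Plotting the two columns of `scores` against each other gives a map in which nearby points correspond to patients with similar profiles, which is the sense of proximity used here.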

CONCLUSION
As in this example, other medical data can be fit, analyzed and interpreted using the IRRP model. Not only the incidence rate but also the restriction level on the incidence rate due to treatment can be estimated and tested. The first two principal components can be computed using factor loadings. The proximity of the