Improvement of Bayesian Credible Interval for a Small Binomial Proportion Using Logit Transformation

Email: t-ogura@clin.medic.mie-u.ac.jp

Abstract: A novel credible interval for the binomial proportion is proposed by improving the Highest Posterior Density (HPD) interval using the logit transformation. It is constructed in two steps: first, the HPD interval for the logit transformation of the binomial proportion is derived; then the corresponding credible interval for the binomial proportion is calculated by applying the inverse logit transformation to that interval. The proposed credible interval has two characteristics: (i) the lower limit is above 0% when zero events are obtained and (ii) the error probability is not large for any population binomial proportion. Characteristic (i) corresponds to the claim of the Rule of Three, which means that even if zero events are obtained from n trials, the events might occur three times in another n trials. Characteristic (ii) is important for medical research because the error probabilities of all groups do not increase even if some groups have a high population binomial proportion. The proposed credible interval is compared with existing confidence and credible intervals. Numerical and practical examples are used to confirm the potential usefulness of the proposed credible interval.


Introduction
Since the confidence or credible interval of the binomial proportion p is routinely used in medical research, further detailed studies under various practical settings are important. We are interested in the case in which the binomial proportion is expected to be small. As a result, the number of events x may be 0 in n trials. In this case, by the Rule of Three (Hanley and Lippman-Hand, 1983), the events might occur three times in another n trials. Even when zero events are obtained, the point estimate of p is not taken to be 0%. This is very useful in medical research (Jovanovic and Levy, 1997). The upper limits of confidence and credible intervals specialized for zero events have been studied by various researchers and include the credible interval using non-informative priors (Tuyl et al., 2008) and the credible interval using informative priors (Winkler et al., 2002). The lower limit of the confidence or credible interval for zero events is 0%, which means that it allows zero risk, but this is not suitable for medical research (Liu et al., 2015). Note that the plug-in predictor f(y | p) degenerates at 0 when p = 0. It is our understanding that the event is predicted to occur with a low binomial proportion, even when zero events are obtained.
The lower limits of most existing confidence intervals for zero events are 0% (Newcombe, 2012), as with the Clopper-Pearson confidence interval (Clopper and Pearson, 1934). The Clopper-Pearson confidence interval is often used to estimate the interval of the binomial proportion. It preserves the significance level regardless of the population binomial proportion. However, it is also known to be needlessly wide (Thulin, 2014a). Edwards et al. (1963) proposed a credible interval that is not too wide. The error probability of the existing credible interval was determined to be α on average. The credible interval is not uniquely defined and two methods are well established (Gelman et al., 2014). One is the Highest Posterior Density (HPD) interval, which chooses the shortest possible interval enclosing 100(1-α)% of the probability density function. The other is the equal-tailed credible interval, for which the posterior probabilities below the lower limit and above the upper limit are both α/2. The beta density Be(a, b) with positive parameters a and b is used as the prior density. Since the hyperparameter is primarily set to a ≤ 1, the credible interval in this paper is discussed under this condition. The lower limit of the HPD interval is 0% for zero events because the posterior density is monotone decreasing from 0% (Bernardo, 2005). On the other hand, the lower limit of the equal-tailed credible interval is above 0% for zero events. The error or coverage probability is used as an evaluation criterion for the confidence or credible interval (Schilling and Doi, 2014; Jin et al., 2017). The error probability is 1 minus the coverage probability. Numerical examples show that the error probability of both credible intervals might be locally high when the population binomial proportion is high.
We propose a novel credible interval based on the HPD interval to resolve these problems. Under the logit transformation of the binomial proportion, the posterior distribution changes from a monotone decreasing distribution to a unimodal distribution. We calculate the HPD interval for the logit transformation of the binomial proportion. Applying the inverse logit transformation to this interval gives the novel credible interval for the binomial proportion. The lower limit of the proposed credible interval is above 0% even when zero events are obtained.
In Section 2, we introduce the interval estimations of the binomial proportion. We propose a novel credible interval using the logit transformation in Subsection 2.1. The Clopper-Pearson confidence interval is expressed in a Bayesian framework in Subsection 2.2. Section 3 compares the error probabilities of several confidence and credible intervals. In Section 4, we verify the appropriateness of the proposed credible interval using two practical examples. Conclusions are presented in Section 5.

Interval Estimation of the Binomial Proportion
We introduce two existing credible intervals (the HPD and the equal-tailed credible intervals) of the binomial proportion and propose improving the HPD interval using the logit transformation. The Clopper-Pearson confidence interval is explained as a credible interval.

Credible Interval
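The conditional probability function for x, given p, is:

$$f(x \mid p) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, \ldots, n, \tag{1}$$

where n is the sample size. The beta distribution is often used as the prior distribution. Let the prior distribution for p be the beta distribution Be(a, b) with positive parameters a and b, and let its probability density be be(p; a, b). The posterior distribution is proportional to the product of the prior distribution and the likelihood:

$$\pi(p \mid x) = \mathrm{be}(p;\, x+a,\, n-x+b) = \frac{p^{x+a-1}(1-p)^{n-x+b-1}}{B(x+a,\, n-x+b)}, \tag{2}$$

where B(·,·) denotes the beta function. The limits (p_l, p_u) of the 100(1-α)% equal-tailed credible interval (Gelman et al., 2014) satisfy:

$$\int_{0}^{p_l} \pi(p \mid x)\, dp = \frac{\alpha}{2}, \tag{3}$$

$$\int_{p_u}^{1} \pi(p \mid x)\, dp = \frac{\alpha}{2}. \tag{4}$$

The 100(1-α)% two-sided credible interval of the HPD interval (Box and Tiao, 1965) is defined as follows:

$$\int_{\{p:\ \pi(p \mid x) \ge d\}} \pi(p \mid x)\, dp = 1-\alpha, \tag{5}$$

where d is chosen as the minimum value satisfying (5).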
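As a small worked example of the two definitions, consider the special case x = 0 under the uniform prior Be(1, 1). The posterior is then Be(1, n + 1), whose distribution function 1 - (1 - p)^(n+1) can be inverted by hand, so no numerical beta routine is needed. The sketch below is illustrative only; the function name is an assumption and the closed form applies to this special case alone.

```python
import math


def zero_event_intervals_uniform_prior(n, alpha=0.05):
    """Closed-form 100(1-alpha)% intervals for x = 0 under the uniform prior.

    Assumes the posterior Be(1, n + 1), with CDF F(p) = 1 - (1 - p)**(n + 1).
    Returns (equal_tailed, existing_hpd), each as a (lower, upper) tuple.
    """
    k = n + 1

    def quantile(q):
        # Inverse of F: solve 1 - (1 - p)**k = q for p.
        return 1.0 - (1.0 - q) ** (1.0 / k)

    # Equal-tailed: posterior mass alpha/2 below the lower limit and
    # alpha/2 above the upper limit.
    equal_tailed = (quantile(alpha / 2.0), quantile(1.0 - alpha / 2.0))

    # HPD: the density decreases monotonically from p = 0, so the shortest
    # interval starts at 0 and ends at the (1 - alpha) quantile.
    existing_hpd = (0.0, quantile(1.0 - alpha))
    return equal_tailed, existing_hpd


equal_tailed, existing_hpd = zero_event_intervals_uniform_prior(n=8)
print("equal-tailed:", equal_tailed)
print("existing HPD:", existing_hpd)
```

For other priors or x > 0 the quantiles have no such closed form and a numerical inverse of the incomplete beta function would be needed.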
When the posterior distribution is unimodal, the two intersections of π(p|x) and d give the HPD interval (p_l*, p_u*). When x = 0, π(p|x) and d intersect at only one point. The HPD interval then coincides with the 100(1-α)% upper credible interval (0, p_u*). If the event is predicted to occur with a small binomial proportion, a lower limit of 0 is not appropriate. We improve the HPD interval so that the lower limit is above 0% for zero events. The logit transformation of p is θ = log{p/(1-p)}. Then, the posterior density for θ is expressed as:

$$\pi(\theta \mid x) = \frac{1}{B(x+a,\, n-x+b)} \cdot \frac{e^{(x+a)\theta}}{(1+e^{\theta})^{n+a+b}}. \tag{6}$$

This posterior density is unimodal even when x = 0. It is defined on the range from -∞ to ∞. The 100(1-α)% two-sided credible interval of θ is expressed as:

$$\int_{\{\theta:\ \pi(\theta \mid x) \ge d'\}} \pi(\theta \mid x)\, d\theta = 1-\alpha, \tag{7}$$

where d' is chosen as the minimum value satisfying (7).
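The unimodality of the posterior density of θ can be checked with a short derivation. The steps below are a sketch that follows from the change of variables p = e^θ/(1+e^θ), for which dp/dθ = p(1-p); the symbol \hat{\theta} for the mode is introduced here only for illustration.

```latex
\begin{align*}
\log \pi(\theta \mid x)
  &= (x+a)\,\theta - (n+a+b)\,\log\!\left(1+e^{\theta}\right) + \text{const}, \\
\frac{d}{d\theta}\log \pi(\theta \mid x)
  &= (x+a) - (n+a+b)\,\frac{e^{\theta}}{1+e^{\theta}}, \\
\frac{d^{2}}{d\theta^{2}}\log \pi(\theta \mid x)
  &= -(n+a+b)\,\frac{e^{\theta}}{\left(1+e^{\theta}\right)^{2}} < 0 .
\end{align*}
% The log density is strictly concave, so there is a single mode at
\[
\hat{\theta} = \log\frac{x+a}{n-x+b},
\qquad\text{that is,}\qquad
\hat{p} = \frac{x+a}{n+a+b} > 0 .
\]
```

Because a > 0, the mode on the p scale is strictly positive even for x = 0, which is why the HPD region on the θ scale has a finite lower end and maps back to a lower limit above 0%.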
The interval estimation of θ is written as (θ_l, θ_u). The interval estimation of p is obtained by the inverse logit transformation:

$$(p_l,\, p_u) = \left( \frac{e^{\theta_l}}{1+e^{\theta_l}},\ \frac{e^{\theta_u}}{1+e^{\theta_u}} \right). \tag{8}$$

To distinguish the two HPD intervals in this paper, the interval estimation using the logit transformation is called the proposed HPD interval and the interval estimation of the existing method is called the existing HPD interval.
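The two-step construction can be sketched numerically. The code below is not the authors' Mathematica implementation; it is a minimal grid-based approximation in which the function names, the grid bounds and the grid size are all assumptions made for illustration. It evaluates the posterior density of θ on a grid, keeps the highest-density grid points until they hold 100(1-α)% of the mass, and maps the end points back with the inverse logit.

```python
import math


def log_posterior_theta(theta, x, n, a, b):
    """Unnormalised log posterior density of theta = log{p/(1-p)}."""
    # log(1 + e^theta), written so that large |theta| does not overflow.
    softplus = max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))
    return (x + a) * theta - (n + a + b) * softplus


def inverse_logit(theta):
    return 1.0 / (1.0 + math.exp(-theta))


def proposed_hpd(x, n, a, b, alpha=0.05, bound=40.0, points=40001):
    """Approximate HPD interval on the logit scale, mapped back to p.

    `bound` and `points` are illustrative tuning values: the grid covers
    theta in [-bound, bound], outside of which the mass is negligible for
    the priors considered here.
    """
    step = 2.0 * bound / (points - 1)
    thetas = [-bound + i * step for i in range(points)]
    log_dens = [log_posterior_theta(t, x, n, a, b) for t in thetas]
    peak = max(log_dens)
    dens = [math.exp(v - peak) for v in log_dens]  # rescaled for stability
    target = (1.0 - alpha) * sum(dens)

    # Add grid points from highest to lowest density until the target mass
    # is reached; for a unimodal density the kept points are contiguous.
    order = sorted(range(points), key=dens.__getitem__, reverse=True)
    mass, i_min, i_max = 0.0, points, -1
    for i in order:
        mass += dens[i]
        i_min, i_max = min(i_min, i), max(i_max, i)
        if mass >= target:
            break
    return inverse_logit(thetas[i_min]), inverse_logit(thetas[i_max])


# Zero events in n = 8 trials under Jeffreys' prior Be(0.5, 0.5).
lower, upper = proposed_hpd(x=0, n=8, a=0.5, b=0.5)
print(lower, upper)
```

The accuracy is limited by the grid spacing; a root-finding approach on the two end points would be more precise but is longer to write.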
Two familiar choices of (a, b) are (0.5, 0.5) and (1, 1), which are referred to as Jeffreys' and the uniform prior densities, respectively (Bolstad, 2016). The choice of the prior density π(p) is important and a familiar non-informative prior density is usually desired. We also include a case of a < b, namely (a, b) = (0.5, 1.5), which is called the reverse J-shaped prior density. When there is a belief that p is likely to be small, as in various medical applications, this type of prior density may be suitable. Thus, it is worthwhile to include the reverse J-shaped prior density in this study.

Clopper-Pearson Confidence Interval
The 100(1-α)% two-sided Clopper-Pearson confidence interval (p_l, p_u) is obtained by setting both tail probabilities to α/2. Given x, the lower limit p_l and the upper limit p_u satisfy:

$$\sum_{i=x}^{n} \binom{n}{i} p_l^{\,i} (1-p_l)^{n-i} = \frac{\alpha}{2}, \tag{9}$$

$$\sum_{i=0}^{x} \binom{n}{i} p_u^{\,i} (1-p_u)^{n-i} = \frac{\alpha}{2}. \tag{10}$$

For ease of comparison, we use the well-known relation between the cumulative binomial probability and the incomplete beta function (Thulin, 2014b):

$$\sum_{i=x}^{n} \binom{n}{i} p^{i} (1-p)^{n-i} = \int_{0}^{p} \mathrm{be}(t;\, x,\, n-x+1)\, dt. \tag{11}$$

Equations (9) and (10) are rewritten as:

$$\int_{0}^{p_l} \mathrm{be}(p;\, x,\, n-x+1)\, dp = \frac{\alpha}{2}, \tag{12}$$

$$\int_{p_u}^{1} \mathrm{be}(p;\, x+1,\, n-x)\, dp = \frac{\alpha}{2}. \tag{13}$$

Note that (12) and (13) are obtained by substituting (a, b) = (0, 1) into (3) and (a, b) = (1, 0) into (4), respectively. These facts indicate that different prior densities are used at the lower and upper limits. In addition, these two prior densities are improper, extremal and unrealistic. Recall that the distributions Be(1, 0) and Be(0, 1) are regarded as the limiting distributions of Be(1-c, c) and Be(c, 1-c) as c → 0. It is our understanding that the distributions Be(1-c, c) and Be(c, 1-c) for a very small c > 0 are extremal and unrealistic for practical applications.
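The relation between the binomial upper tail and the incomplete beta function can be checked numerically. The sketch below compares the binomial sum with a midpoint-rule integral of the Be(x, n-x+1) density; the function names and the number of integration steps are assumptions, and the simple quadrature is only meant for x ≥ 1, where the integrand is bounded.

```python
import math


def binomial_upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(x, n + 1))


def beta_cdf_midpoint(p, a, b, steps=20000):
    """Midpoint-rule integral of the Be(a, b) density over (0, p).

    Assumes a >= 1 and b >= 1 so that the density is bounded, and p < 1.
    """
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = p / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(log_norm
                          + (a - 1.0) * math.log(t)
                          + (b - 1.0) * math.log(1.0 - t))
    return total * h


n, x, p = 8, 3, 0.3
tail = binomial_upper_tail(x, n, p)
beta_integral = beta_cdf_midpoint(p, a=x, b=n - x + 1)
print(tail, beta_integral)  # the two values should agree closely
```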

Performance of the Proposed HPD Interval
The error probability of the confidence or credible interval is defined as follows:

$$e(p) = 1 - \sum_{i=0}^{n} \binom{n}{i} p^{i} (1-p)^{n-i}\, I\{p_l(i) \le p \le p_u(i)\},$$

where p_l(i) and p_u(i) are the lower and upper limits given i, respectively, and I{·} is the indicator function. The error probability is often used as an evaluation criterion for confidence or credible intervals. Copas (1992) advocated level preservation and criticized the potentially large error probability of the credible interval proposed by Brenner and Quan (1990). He presented the error probability of the 99% equal-tailed credible interval under the uniform prior density for the sample size n = 8. Schilling and Doi (2014) compared the coverage probabilities of several confidence intervals to search for an optimal confidence interval in the cases of n = 8, 12, 20, 50. The software used was Mathematica version 11.0 (Wolfram Research, Inc., 2017) for interval estimation and R version 3.3.1 (R Core Team, 2017) for graphs.

We compare the proposed HPD interval with the Clopper-Pearson confidence interval and two credible intervals: the existing HPD and the equal-tailed credible intervals. Figure 1 shows the error probabilities of the 95% Clopper-Pearson confidence interval and the three credible intervals (the proposed HPD, the existing HPD and the equal-tailed credible intervals) under the three prior densities (Jeffreys', the uniform and the reverse J-shaped prior densities). Since these error probabilities, except for the case of the reverse J-shaped prior density, are symmetric about p = 0.5 as functions of p, we focus our attention on the range 0 ≤ p ≤ 0.5.

Tuyl et al. (2008) compared credible intervals under several prior densities in the case of n = 8 and x = 0. In Table 1, we add the proposed HPD interval and the case of x = 1 for the sample size n = 8. The error probability of the Clopper-Pearson confidence interval is much smaller than 5%. In fact, the maximum error probability is 0.0309 at p = 0.4734. Depending on the research purpose, it may be reasonable to evaluate an interval by its mean error probability rather than its maximum error probability. Comparing the existing HPD interval with the Clopper-Pearson confidence interval, both lower limits are 0% in the case of x = 0. When x = 1, the two lower limits are close. The existing HPD interval narrows the extremely wide Clopper-Pearson confidence interval by having a smaller upper limit. Since the upper limit is small, the error probability is often over 0.1 in the range p > 0.2. The lower limit of the equal-tailed credible interval is larger, and its upper limit is smaller, than those of the Clopper-Pearson confidence interval. Thus, the interval is narrowed from both the upper and lower sides.
Although the error probabilities of the equal-tailed credible interval under Jeffreys' and the reverse J-shaped prior densities are occasionally over 0.1 in the range p > 0.2, they are smaller than those of the existing HPD interval. However, the error probability of the equal-tailed credible interval is large near p = 0. Comparing the proposed HPD interval with the Clopper-Pearson confidence interval, the two upper limits are close. Therefore, the error probabilities of the proposed HPD interval are less than 0.1 in the range p > 0.2. However, the lower limit of the proposed HPD interval is larger than that of the Clopper-Pearson confidence interval. Although its error probability is occasionally over 0.1 near p = 0, this is not a problem if p > 0 is expected. This characterizes the upper and lower limits of the three credible intervals. For the sample size n = 20, the error probabilities of the 95% Clopper-Pearson confidence interval and the three credible intervals under the three prior densities are presented in Fig. 2. The error probabilities of the existing HPD and equal-tailed credible intervals are over 0.1 at about p = 0.1. In contrast, the error probabilities of the proposed HPD interval are less than 0.1, except near p = 0. Even if the sample size increases, the error probabilities of the existing HPD and equal-tailed credible intervals are occasionally large. On the other hand, the error probability of the proposed HPD interval stays within a narrow range, except near p = 0. From the figures of the error probability for the sample sizes n = 8, 20, the proposed HPD interval under Jeffreys' prior density is the most stable near α among the Clopper-Pearson confidence interval and the three credible intervals under the three prior densities. As far as our experience goes, the results exhibit approximately the same characteristics regardless of the sample size.
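The error probability used throughout this section can be computed directly from its definition. The following sketch evaluates it for the 95% Clopper-Pearson interval with n = 8, obtaining the limits by bisection on the binomial tail sums. All function names are illustrative assumptions, and the grid over p is a coarse stand-in for the continuous curves in the figures.

```python
import math


def binom_pmf(i, n, p):
    return math.comb(n, i) * p**i * (1.0 - p) ** (n - i)


def bisect_increasing(f, target, iters=100):
    """Solve f(p) = target on [0, 1] for a function f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x, n, alpha=0.05):
    # Lower limit: P(X >= x | p_l) = alpha/2, set to 0 when x = 0.
    lower = 0.0 if x == 0 else bisect_increasing(
        lambda p: sum(binom_pmf(i, n, p) for i in range(x, n + 1)),
        alpha / 2.0)
    # Upper limit: P(X <= x | p_u) = alpha/2, equivalently
    # P(X >= x + 1 | p_u) = 1 - alpha/2; set to 1 when x = n.
    upper = 1.0 if x == n else bisect_increasing(
        lambda p: sum(binom_pmf(i, n, p) for i in range(x + 1, n + 1)),
        1.0 - alpha / 2.0)
    return lower, upper


def error_probability(p, limits):
    """Probability that the interval for the observed count misses p.

    `limits[i]` is the (lower, upper) pair reported when i events occur.
    """
    n = len(limits) - 1
    return sum(binom_pmf(i, n, p)
               for i, (lo, hi) in enumerate(limits)
               if not lo <= p <= hi)


n = 8
cp_limits = [clopper_pearson(i, n) for i in range(n + 1)]
grid = [k / 1000.0 for k in range(1001)]
worst = max(error_probability(p, cp_limits) for p in grid)
print(worst, error_probability(0.4734, cp_limits))
```

Passing a different list of limits, for example those of a credible interval, to `error_probability` gives the corresponding curve without changing the rest of the code.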

Applications
Two practical datasets are analyzed to illustrate the potential usefulness of the proposed HPD interval.

Genome Sequencing Data
We begin with the data for rare variant discovery by deep whole-genome sequencing of 1070 Japanese individuals (Nagasaki et al., 2015). The data give the False Discovery Rate (FDR) for the high-confidence single-nucleotide variants (SNVs), deletions (validated deletions of 30 bases or fewer) and insertions (validated insertions of 30 bases or fewer) groups, together with the 95% confidence interval of the FDR, which are summarized in Table 2. No explanation of the method used to calculate the confidence interval was given. We add the 95% Clopper-Pearson exact confidence interval and the three credible intervals (the proposed HPD, the existing HPD and the equal-tailed credible intervals) under the three prior densities (Jeffreys', the uniform and the reverse J-shaped prior densities). A comparison of Nagasaki's confidence interval with the three credible intervals suggests that they calculated the existing HPD interval for the SNVs and deletion groups (x = 0) and the equal-tailed credible interval for the insertion group (x > 0).
It appears that the method for calculating the confidence intervals was not decided in advance, in which case the interval most convenient for the researchers may have been chosen from among several candidates. The results of a statistical analysis and their interpretation should be reported from a fair and scientific point of view. More seriously, their lower limit is 0 when x = 0. This result is discouraging since a small but nonzero proportion of errors is strongly anticipated in this type of genome sequencing study.

HIV Antibodies Data
Turnbull et al. (1992) tested saliva samples collected from 402 subjects for the presence of HIV antibodies. Positive results for HIV antibodies were obtained in 19 samples and negative results in 366 samples; the remaining 17 samples were too small to test. The 95% Wald confidence intervals for the proportion of HIV antibody positive samples were calculated for four groups consisting of injectors, noninjectors, homosexual/bisexual men and others. Newcombe (2012) pointed out some drawbacks of the Wald method. In particular, when the number of positives x was small, the lower limit of the Wald confidence interval was negative. Both the lower and upper limits of the Wald confidence interval were 0% when x = 0. These drawbacks are summarized as follows: "Failure to find any positives in a small series does not imply that there never would be any". Newcombe (2012) examined the addition of the 95% Clopper-Pearson confidence interval. Furthermore, we add the three 95% credible intervals (the proposed HPD, the existing HPD and the equal-tailed credible intervals) under the three prior densities (Jeffreys', the uniform and the reverse J-shaped prior densities) in Table 3.
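The two drawbacks of the Wald method noted above are easy to reproduce. The sketch below uses the standard Wald formula, the point estimate plus or minus z times its estimated standard error; the function name and the group size n = 20 are hypothetical and are not taken from the data of Turnbull et al. (1992).

```python
import math


def wald_interval(x, n, z=1.959964):
    """95% Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical group of n = 20 subjects (illustrative size only).
print(wald_interval(0, 20))  # x = 0: both limits collapse to 0
print(wald_interval(1, 20))  # small x: the lower limit falls below 0
```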
If it is considered appropriate that the lower limit is above 0%, the proposed HPD and the equal-tailed credible intervals are the optimal methods. When credible intervals are calculated for several groups, the same calculation method is used for all of them. Although the population binomial proportion usually differs between groups, it is desirable that an appropriate credible interval is obtained for any population binomial proportion. Because the point estimate in the injectors group is 10.1%, the error probability might be large if the existing HPD or the equal-tailed credible interval is used. For the proposed HPD interval, the error probability is stable, with small fluctuations, for any population proportion except near p = 0. Since this example is discussed under p > 0, there is no concern that the error probability of the proposed HPD interval will become too large.

Table 2: Total rare variant discoveries by deep whole-genome sequencing (Nagasaki et al., 2015). The interval estimations are calculated by five methods: Nagasaki's confidence, Clopper-Pearson exact confidence and three credible intervals. When the estimated value is exactly 0, it is expressed as 0.0
Columns: FDR (%); Nagasaki 95% limit (%); Clopper-Pearson 95% limit (%); Prior; Proposed HPD 95% limit (%); Existing HPD 95% limit (%); Equal-tailed 95% limit (%)
SNVs (n = 174, x = 0): 0.0 0.0

Conclusion
We discussed the confidence or credible interval when the binomial proportion is expected to be small, especially when zero events are obtained. By the Rule of Three, the point estimate of p was considered to be above 0% for zero events. Furthermore, the lower limit was considered to be above 0% in medical research. Two calculation methods for the credible interval are well known: the existing HPD and the equal-tailed credible intervals. Although the existing HPD interval was narrower than the equal-tailed credible interval, its lower limit was 0% for zero events. On the other hand, the lower limit of the equal-tailed credible interval was above 0% for zero events. The error probabilities of the existing HPD and the equal-tailed credible intervals might become large when the population binomial proportion exceeds a certain value. To eliminate these problems, we proposed a novel credible interval by improving the HPD interval using the logit transformation. The lower limit of the proposed HPD interval was above 0% for zero events. In the numerical examples, the error probability of the proposed HPD interval was closer to α than those of the existing HPD and the equal-tailed credible intervals, except near p = 0. We demonstrated that the proposed HPD interval worked even if zero events were obtained or the population binomial proportion was high. In medical research, the statistical analysis plan must be fixed before the start of clinical trials. Since the proposed HPD interval performs satisfactorily for any outcome, the researcher can fix the statistical analysis plan using the proposed HPD interval even if a small binomial proportion is expected.

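Fig. 1: The error probability of the 95% Clopper-Pearson confidence interval and three credible intervals under the three prior densities. The dotted line connects discontinuity points. The sample size is 8

Fig. 2: The error probability of the 95% Clopper-Pearson confidence interval and three credible intervals under the three prior densities. The sample size is 20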

Table 1: Comparison of the Clopper-Pearson confidence interval and three credible intervals under the three prior densities. The sample size is 8 and the number of events is 0 or 1. When the estimated value is exactly 0, it is expressed as 0.0

Table 3: Prevalence of HIV antibodies in selected groups of ex-prisoners (Turnbull et al., 1992). The interval estimations are calculated by five methods: the Wald method, Clopper-Pearson confidence and three credible intervals. When the estimated value is exactly 0%, it is expressed as 0.0%