Histogram Matching Schemes for Image Thresholding

Abstract: This paper proposes several novel schemes for image thresholding. The idea is simply to compare the histogram of the original image with that of the thresholded image. Element-by-element comparison (the sum of absolute differences between the two histograms) is found to perform better than comparison of a single feature (area or size). The optimum threshold is the one producing the best match. The cumulative histogram is introduced as a generalization of the area under the curve and is found to perform better. In addition, a new performance measure is suggested, based on the percentage of correct assignments in both foreground and background. Comparative results with Otsu's method show the effectiveness of the proposed schemes.


Introduction
Image thresholding is vital in many applications and is one of the effective methods for image segmentation. Various schemes have been proposed in the literature; a good review can be found in Sezgin and Sankur (2004).
The histogram plays a crucial role in many of these schemes. In general, the histogram is used as an approximation to the probability density function (Otsu, 1979; Kapur et al., 1985). In these cases and their extensions, the threshold is selected as the solution to an optimization problem whose objective function depends on features extracted from the histogram.
Because a histogram does not carry spatial information (two different images can have the same histogram), higher dimensional histograms have been proposed, as in Zhang and Hu (2008), Abutaleb (1989) and Zheng et al. (2017).
The aforementioned schemes can be generalized to multi-level thresholding, as in Liu and Yu (2009). However, the computational price is too high. In addition, with many thresholds, the ensemble size for each region is reduced. This often results in inferior quality, since statistics (or the probability distribution, i.e., the histogram) rely heavily on a large ensemble size.
There are many measures, as described in Sezgin and Sankur (2004), to evaluate the performance of a thresholding scheme. However, depending on the application, a subjective decision may be preferred.
This research proposes a few formulations that exploit simple features derived from the 1D histogram or the cumulative histogram. The features investigated are simply the area under the (cumulative) histogram and the sum of the absolute differences between the histograms of the original and thresholded images. The cumulative histogram is of special interest because it is a monotonic function. The selected threshold is the one whose thresholded image has a (cumulative) histogram that matches, in some sense, that of the original image.
In addition, a simple performance measure is suggested in this work to reduce the bias towards large background (foreground).

Preliminaries
The following symbols are adopted in all subsequent sections:
n>T = Size of gray levels above the threshold.
m>T = Average of gray levels above the threshold.
V>T = Variance of gray levels above the threshold.
The corresponding quantities for gray levels below the threshold are written n<T, m<T and V<T.
Without loss of generality, the gray levels are normalized to the interval [0, 1]. Also, the histogram is normalized such that its sum is equal to one.
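The normalization described above can be sketched in a few lines. This is a minimal illustration, assuming integer gray levels stored in a flat list; the function names are hypothetical and not part of the original work.

```python
from itertools import accumulate

def normalized_histogram(pixels, levels=256):
    """Histogram of integer gray levels 0..levels-1, scaled so it sums to 1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    total = len(pixels)
    return [c / total for c in counts]

def cumulative_histogram(hist):
    """Running sum of the normalized histogram (monotonic, ends at 1)."""
    return list(accumulate(hist))

def gray_axis(levels=256):
    """Gray levels mapped to the interval [0, 1]."""
    return [g / (levels - 1) for g in range(levels)]

# A tiny 4-level "image", flattened to a list of gray-level indices.
pixels = [0, 0, 1, 3]
hist = normalized_histogram(pixels, levels=4)
cum = cumulative_histogram(hist)
```

Working with a normalized histogram means the class sizes are fractions of the image, so the cumulative histogram always ends at 1.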
For the purpose of comparing the performance of thresholding schemes, many evaluation criteria have been suggested (Sezgin and Sankur, 2004). A common criterion is the Misclassification Error (ME), given by:
$$\mathrm{ME} = 1 - \frac{|BG \cap BT| + |FG \cap FT|}{|BG| + |FG|} \qquad (1)$$
where FG is the foreground of the ground truth, BG is the background of the ground truth, FT is the foreground of the thresholded image, BT is the background of the thresholded image and | | is the cardinality of the set. The ME measure given above is simply the mean of the absolute difference between the thresholded image and the ground truth, provided that both images are binary.
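As a small illustration of the ME measure, the sketch below computes it in two ways for binary images: as the mean absolute difference, and by counting correctly assigned foreground and background pixels. It assumes both images are flattened lists of 0 (background) and 1 (foreground); the function names are hypothetical.

```python
def misclassification_error(truth, result):
    """ME as the mean absolute difference of two binary images."""
    if len(truth) != len(result):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(t - r) for t, r in zip(truth, result)) / len(truth)

def misclassification_error_sets(truth, result):
    """ME as one minus the fraction of pixels assigned to the correct class."""
    correct_fg = sum(1 for t, r in zip(truth, result) if t == 1 and r == 1)
    correct_bg = sum(1 for t, r in zip(truth, result) if t == 0 and r == 0)
    return 1.0 - (correct_fg + correct_bg) / len(truth)

truth = [1, 1, 0, 0, 0, 0, 0, 1]
result = [1, 0, 0, 0, 1, 0, 0, 1]
me = misclassification_error(truth, result)
me_sets = misclassification_error_sets(truth, result)
```

Both forms agree for binary inputs, because every mismatched pixel contributes exactly one to the absolute difference.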
The Yule coefficient has been suggested by Sneath and Sokal (1973) to avoid the bias against small foregrounds, given by:
Many algorithms, however, tend to be biased toward one side. Unfortunately, Equation (2) then tends to produce higher values for the performance. In addition, the measure can be negative for a highly misaligned foreground and/or background. As a remedy, the Dual Similarity Measure (DSM) is proposed as a modification of the Yule coefficient:
Obviously, a value of 0 indicates the best match and a value of 1 indicates a completely misaligned foreground or background. In addition, algorithms having similar losses for foreground and background are preferred over those having good foreground detection with bad background detection, or vice versa.

Algorithms Based on the Histogram
Intuitively, the best thresholded image would be the one having the highest similarity (in some sense) with the original image. In this work, similarity is measured through a simple comparison (absolute difference) between a feature of the original image histogram and the corresponding feature of the thresholded image histogram.
All of the proposed schemes implement exhaustive search to find the optimum threshold.
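The exhaustive search can be sketched generically. This is an illustrative skeleton only: it assumes the scheme-specific objective is supplied as a function of the candidate threshold index and is to be minimized, and it resolves ties in favor of the lowest threshold. The name `exhaustive_threshold` is hypothetical.

```python
def exhaustive_threshold(objective, levels=256):
    """Evaluate objective(t) for every candidate threshold index and
    return the (index, value) pair with the smallest objective value."""
    best_t, best_val = None, float("inf")
    for t in range(levels - 1):       # the last level cannot split the image
        val = objective(t)
        if val < best_val:            # strict '<' keeps the lowest index on ties
            best_t, best_val = t, val
    return best_t, best_val

# Toy objective with a single minimum at t = 5.
t_opt, value = exhaustive_threshold(lambda t: (t - 5) ** 2, levels=10)
```

With 256 gray levels the search costs at most 255 objective evaluations, which is why the cost of a scheme is dominated by the cost of its objective function.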

Linearized Histogram Area LHA
The first scheme compares the areas under the histogram curves. The area under the original histogram can be approximated using the trapezoidal rule as: The area under the thresholded histogram can be represented by that of two triangles (assuming a value of 0 for the histogram at gray levels 0, T and 1), resulting in: The optimum threshold is given by:
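The trapezoidal approximation of the area under the original histogram can be sketched as follows. This assumes equally spaced gray levels on [0, 1]; only the trapezoidal step is shown, and the function name is hypothetical.

```python
def trapezoid_area(hist):
    """Trapezoidal-rule area under a histogram sampled at equally
    spaced gray levels on the interval [0, 1]."""
    levels = len(hist)
    dx = 1.0 / (levels - 1)
    # Interior samples count fully, the two end samples count half.
    return dx * (sum(hist) - 0.5 * (hist[0] + hist[-1]))

area_flat = trapezoid_area([0.25] * 5)            # constant height 0.25
area_bump = trapezoid_area([0.0, 0.5, 0.5, 0.0])  # zero at both ends
```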

Linearized Histogram Difference LHD
The piece-wise linear approximation to the histogram of the thresholded image in the previous section can be normalized to have a sum of 1. The result is then matched with the original histogram in a manner similar to the L-norm formulation. The optimum threshold can then be found as: A value of 2 was chosen for the exponent. However, the scheme has a tendency to perform better with values greater than 2. Other functions can be explored as well, not necessarily of power or polynomial type.

Area under 2 Gaussians A2G
This scheme is similar to LHA above, except that two Gaussians are used instead of two triangles. The thresholded histogram is now given by: The values of a<T and a>T are the solutions to the linear system resulting from fitting hT to hO. The optimum threshold is then given by:
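One way to obtain the two amplitudes is an ordinary least-squares fit, which leads to a 2x2 linear system. The sketch below is an assumption-laden illustration, not the original formulation: it assumes unnormalized Gaussians exp(-(g - m)^2 / (2V)) built from the class means and variances, and it assumes that "fitting" means least squares. All names are hypothetical.

```python
import math

def gaussian(x, mean, var):
    """Unnormalized Gaussian bump (assumed form)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var))

def fit_two_gaussian_amplitudes(hist, m_lo, v_lo, m_hi, v_hi):
    """Least-squares amplitudes (a_lo, a_hi) such that
    a_lo * G_lo + a_hi * G_hi approximates hist."""
    levels = len(hist)
    gray = [g / (levels - 1) for g in range(levels)]
    g1 = [gaussian(x, m_lo, v_lo) for x in gray]
    g2 = [gaussian(x, m_hi, v_hi) for x in gray]
    # Normal equations of the 2-parameter least-squares problem.
    a11 = sum(u * u for u in g1)
    a12 = sum(u * v for u, v in zip(g1, g2))
    a22 = sum(v * v for v in g2)
    b1 = sum(u * h for u, h in zip(g1, hist))
    b2 = sum(v * h for v, h in zip(g2, hist))
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("the two Gaussians are not linearly independent")
    # Cramer's rule for the 2x2 system.
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det
```

Solving the system in closed form keeps the cost per candidate threshold low, which matters because the fit is repeated for every threshold in the exhaustive search.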

Difference between 2 Gaussians D2G
This scheme is similar to A2G above; however, hT should now be normalized to have a sum of 1. The optimum threshold is formulated as: The extensions suggested for the LHD scheme above are of interest in this scheme as well.

Truncated 2 Gaussians T2G
Following the same procedure as in A2G and D2G above, we formulate the histogram as a sum of two truncated Gaussians, one belonging to the foreground and one to the background. Hence, we have: In a manner similar to scheme D2G above, hT should now be normalized to have a sum of 1 in scheme DTG.

Algorithms Based on Cumulative Histogram
The encouraging results of the previous section motivate the author to investigate the cumulative histogram, as it can be considered a generalization of the area under the histogram. In addition, the cumulative histogram has better grounds for comparing the gray levels that are unavailable in the thresholded image. In other words, the histogram of the thresholded image has only two nonzero values, while the cumulative histogram has nonzero values for all gray levels greater than or equal to m<T.
The histogram of the thresholded image has only two nonzero entries: n<T at m<T and n>T at m>T. Therefore, the resultant cumulative histogram is a two-step function. HT is zero until m<T, then n<T until m>T, and 1 after that. In other words:
Exhaustive searches were used in the following schemes, in a fashion similar to that of the previous section.
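The two-step cumulative histogram described above can be sketched as follows. This is a minimal illustration assuming a normalized histogram over equally spaced gray levels on [0, 1], with levels 0..t forming the lower class and the rest the upper class (the convention at the threshold itself is an assumption). Function names are hypothetical.

```python
def class_stats(hist, t):
    """Sizes and mean gray levels of the two classes split at index t.
    Assumption: levels 0..t are 'below', levels t+1.. are 'above'."""
    levels = len(hist)
    gray = [g / (levels - 1) for g in range(levels)]
    n_lo = sum(hist[: t + 1])
    n_hi = sum(hist[t + 1 :])
    # Fall back to the interval ends when a class is empty.
    m_lo = (sum(g * h for g, h in zip(gray[: t + 1], hist[: t + 1])) / n_lo
            if n_lo > 0 else 0.0)
    m_hi = (sum(g * h for g, h in zip(gray[t + 1 :], hist[t + 1 :])) / n_hi
            if n_hi > 0 else 1.0)
    return n_lo, m_lo, n_hi, m_hi

def two_step_cumulative(hist, t):
    """Cumulative histogram of the thresholded image:
    0 below m_lo, n_lo from m_lo up to (not including) m_hi, then 1."""
    levels = len(hist)
    n_lo, m_lo, n_hi, m_hi = class_stats(hist, t)
    steps = []
    for g in range(levels):
        x = g / (levels - 1)
        if x < m_lo:
            steps.append(0.0)
        elif x < m_hi:
            steps.append(n_lo)
        else:
            steps.append(1.0)
    return steps

H_T = two_step_cumulative([0.25, 0.25, 0.0, 0.5], t=1)
```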

Cumulative Histogram Size CHS
This scheme is simply the difference between the total sums of the cumulative histograms. Hence, the threshold is given by: A similar outcome can be obtained by comparing the areas (trapezoidal approximations) under the cumulative histograms of the original and the thresholded images.

Cumulative Histogram Power CHP
The objective function here is based on a power comparison, given by:

Cumulative Histogram Difference CHD
In this scheme, the optimum threshold is obtained when the resultant image has a matching cumulative histogram to that of the original in the following sense: The value of 0.1 for the exponent was chosen through experimentation. Other functions of the absolute difference can be used. This can open the path to a family of algorithms.
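The cumulative histogram schemes can be sketched end to end. The code below is an illustration under stated assumptions, not the author's implementation: the CHS objective is taken as the absolute difference between the total sums of the two cumulative histograms, and the CHD objective is assumed to be the sum over gray levels of the absolute difference raised to the exponent (0.1 by default). Splits that leave one class empty are skipped, and all names are hypothetical.

```python
from itertools import accumulate

def two_step(hist, t):
    """Two-step cumulative histogram of the image thresholded at index t,
    or None when one of the classes is empty."""
    levels = len(hist)
    gray = [g / (levels - 1) for g in range(levels)]
    n_lo = sum(hist[: t + 1])
    n_hi = sum(hist[t + 1 :])
    if n_lo == 0 or n_hi == 0:
        return None
    m_lo = sum(g * h for g, h in zip(gray, hist[: t + 1])) / n_lo
    m_hi = sum(g * h for g, h in zip(gray[t + 1 :], hist[t + 1 :])) / n_hi
    return [0.0 if x < m_lo else (n_lo if x < m_hi else 1.0) for x in gray]

def chs_objective(cum_orig, cum_thr):
    """CHS (assumed form): difference between the total sums."""
    return abs(sum(cum_orig) - sum(cum_thr))

def chd_objective(cum_orig, cum_thr, exponent=0.1):
    """CHD (assumed form): sum of |difference| ** exponent."""
    return sum(abs(a - b) ** exponent for a, b in zip(cum_orig, cum_thr))

def best_threshold(hist, objective):
    """Exhaustive search for the threshold index minimizing the objective."""
    cum = list(accumulate(hist))
    best_t, best_val = None, float("inf")
    for t in range(len(hist) - 1):
        step = two_step(hist, t)
        if step is None:
            continue
        val = objective(cum, step)
        if val < best_val:
            best_t, best_val = t, val
    return best_t

hist = [0.0, 0.5, 0.0, 0.0, 0.5, 0.0]   # an image with only two gray levels
t_chd = best_threshold(hist, chd_objective)
t_chs = best_threshold(hist, chs_objective)
```

Passing the objective as a function makes it easy to try other functions of the absolute difference, for example `lambda a, b: chd_objective(a, b, exponent=0.5)`.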

Experimental Results
The proposed schemes, based on Equations (4)-(17), are compared with Otsu (1979) thresholding, chosen for its popularity. The images in Fig. 1 were used. Table 1 lists the ME values, see Equation (1), for the proposed schemes and for Otsu (1979) using the images in Fig. 1.
Results are encouraging, as seen from Tables 1 and 2. However, the area (or size) schemes are inferior in performance to the absolute difference schemes. The power scheme is somewhere in between. This clearly highlights the superiority of an element-by-element match over a single feature match. Generalizing this statement to other features in the literature requires extensive testing. Nevertheless, it seems attainable given the strong results in Tables 1 and 2.
With reference to the above two tables, the proposed DSM has a wider range of values than ME. In essence, ME is related to the percentage of error over the whole image, while DSM is related to the worse of the percentage errors obtained from the foreground and the background.

Fig. 1: Test images used and their ground truth
In terms of computation (hence fast implementation), more consideration should be given to LHA followed by CHP, CHD and LHD. Figure 2 shows the thresholded images using the LHD, D2G, D2G, CHP, CHD and Otsu schemes. Clearly, the results of these schemes are subjectively comparable to the ground truth in Fig. 1.
Interestingly, schemes dependent on the histogram have a tendency to use larger exponents, giving more influence to outliers. In contrast, smaller exponents work better with the cumulative histogram schemes. The reason is simply that the cumulative operation smooths the outliers.
The scheme CHD has remarkable performance and hence more images were compared in Fig. 3. Unfortunately, no ground truth is available. In all these cases, the performance is superior to that of Otsu. However, more test data is needed to draw a stronger conclusion.

Conclusion and Future Work
New algorithms for image thresholding have been proposed in this work using simple (cumulative) histogram comparison.
Results are promising; however, more test images are needed to explore the limits of the proposed schemes. The domain of application for each of the proposed schemes, as well as their extension to higher dimensional histograms, are currently under investigation.
The results clearly indicate the superiority of schemes based on absolute difference over those dependent on a single value (or objective function).
The proposed schemes have many local minima (the LHA scheme is an exception) with objective function values comparable to that of the global minimum. This is also observed (in some cases) for other schemes, including Otsu. However, the number of minima is far smaller in the Otsu case. More investigation is needed to decide whether these minima are due to histogram noise or can be used to refine the threshold. Possible approaches include a weighted average, iterative thresholding or multi-level thresholding.
The remarkable performance of schemes D2G, DTG, CHP and CHD, see Equations (10), (13), (16) and (17), should encourage further investigation to find the optimum exponent for these schemes. Other non-linear functions may also be worth investigating.
As clearly demonstrated in Fig. 3, CHD performance is outstanding. Referring to Equation (14), more thorough investigations are needed to compare the CHD scheme with a piece-wise linear approximation or even a higher order one. This can also include exponential functions.

Ethics
This article is original and contains unpublished material. The corresponding author confirms that all of the other authors have read and approved the manuscript and that no ethical issues are involved.