Image Thresholding Using the Complement Feature

Email: salahameer@alumni.uwaterloo.ca

Abstract: A new feature (the complement feature) is proposed in an Eigen formulation for performing global image thresholding. The goal is to find an intensity (gray-level) value below which pixels belong to the background and above which they belong to the foreground (object). Each pixel in the image is represented by a 2D unit vector whose x-component is the intensity of the pixel, normalized to [0,1] or [-1,1], while the y-component is its complement (e.g., under the Euclidean L2 norm). The correlation matrix can then be constructed to find the cross-correlation, Eigen vectors (axes of inertia) and Eigen values (description of respective sizes). Several implementations for each of these three categories are proposed to perform image thresholding. Interestingly, some of the proposed implementations do not require an exhaustive search, and a direct solution can be obtained. The results are promising on a wide range of images, as demonstrated by comparison with the well-known Otsu method.


Introduction
Image thresholding, as a binary segmentation, plays a vital role in almost all image processing and computer vision tasks. The ultimate goal is to delineate the image in such a way as to obtain useful descriptions of the object(s) comprising the scene. To achieve this goal, many algorithms have been (and are still being) developed, see Goh et al. (2018). Details regarding the categorization of these algorithms and the feature spaces used can be found in traditional survey papers such as Sezgin and Sankur (2004). In fact, the field is so diverse that there are survey papers on single subcategories, e.g., (Oliva et al., 2019; Lucchese and Mitra, 2001; Dey et al., 2010; Ilea and Whelan, 2011; Peng et al., 2013; Unnikrishnan and Hebert, 2005).
One of the active areas in image segmentation is graph cuts and its variants, e.g., Chandel and Bhatnagar (2019). Shi and Malik (2000) proposed a normalized cuts scheme where a correlation matrix is constructed between all the elements in the image. Image segmentation is then performed by thresholding the Eigen vector with the second lowest Eigen value. The results are remarkable; however, the computational cost is high.
In fact, Eigen structures (mostly Eigen vectors) have been found useful in many areas, such as color representation (Ohta et al., 1980), site monitoring (Sarkar and Boyer, 1998), image registration (Huizinga et al., 2016) and eigenfaces (Turk and Pentland, 1991), to name a few.
In this study, the image thresholding problem is addressed. A simple but effective Eigen structure is proposed as a solution. To the best of the author's knowledge, incorporating Eigen value decomposition (linear algebra) in image thresholding has not been proposed previously. Although some similarity exists with graph-cuts schemes, as will be shown in the next section, the simplicity of the matrix (2×2) is a major distinguishing factor. A more important aspect is the introduction of the complement feature, which has not been proposed previously, as concluded from the author's exhaustive literature search.

Method
Without loss of generality, the original image is normalized to the interval [0,1] and concatenated to produce a column vector of size N×1, where N is the number of pixels in the image. This work investigates the use of one feature per pixel: the intensity $g_i$. The description of each pixel is then extended to form a 2D unit vector having the intensity value as the x-component. In other words, each pixel is now represented by the vector:

$$G_i = \left[\; g_i \quad \sqrt{1 - g_i^2} \;\right] \qquad (1)$$

G is now an N×2 array representing the whole image. As is obvious from Equation (1), the complement value is appended as a second dimension/feature. Let us explore the use of this formulation in the normalized cuts scheme proposed by Shi and Malik (2000), resulting in an N×N matrix D given by:

$$D = G\,G^{T} \qquad (2)$$

The Eigen vector of (normalized) D with the second smallest Eigen value is the closest to the segmented (thresholded) image. It should be emphasized, however, that the author is not claiming the superiority of the proposed formulation. The goal is simply to establish a link with the normalized cuts scheme.
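The construction of the feature array can be illustrated with a minimal sketch. It assumes min-max normalization to [0,1] and the L2 complement $\sqrt{1 - g_i^2}$; the function name `complement_features` is hypothetical and is used here only for illustration.

```python
import numpy as np

def complement_features(image):
    """Return an (N, 2) array whose rows are [g_i, sqrt(1 - g_i**2)]."""
    g = np.asarray(image, dtype=np.float64).ravel()
    span = g.max() - g.min()
    # Assumption: min-max normalization to [0, 1]; a constant image maps to zeros.
    g = (g - g.min()) / span if span > 0 else np.zeros_like(g)
    # L2 complement; the clip guards against tiny negative values from rounding.
    comp = np.sqrt(np.clip(1.0 - g**2, 0.0, None))
    return np.column_stack((g, comp))

image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
G = complement_features(image)
print(G.shape)                     # (4, 2)
print(np.linalg.norm(G, axis=1))   # every row has unit length up to rounding
```

Other normalizations (for example, dividing 8-bit values by 255) would serve equally well; the only property relied upon is that every row of the array has unit length.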
Essentially, matrix D has the same Eigen values as those of a much smaller matrix, plus some zeroes to compensate for the size difference between the two matrices (Horn and Johnson, 2012, p. 65). Fortunately, the smaller matrix is the auto correlation matrix ($A_G$) of size 2×2 given by:

$$A_G = G^{T} G \qquad (3)$$

Another important association is with image registration, where the axes of the destination object are aligned with those of the source object. Interestingly, the axes are the Eigen vectors of the auto correlation matrix of the data set.
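The relation between the Eigen values of the two matrices can be checked numerically. The sketch below assumes that D is the N×N product of the feature array with its transpose and that the auto correlation matrix is the 2×2 product taken in the opposite order; the random intensities are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.random(50)                                # 50 illustrative intensities in [0, 1)
G = np.column_stack((g, np.sqrt(1.0 - g**2)))     # N x 2 array of unit vectors

D = G @ G.T       # N x N matrix (assumed form)
A_G = G.T @ G     # 2 x 2 auto correlation matrix (assumed form, unscaled)

eig_D = np.linalg.eigvalsh(D)     # Eigen values in ascending order
eig_A = np.linalg.eigvalsh(A_G)

# The two largest Eigen values of D match those of A_G; the rest vanish.
print(np.allclose(eig_D[-2:], eig_A))
print(np.allclose(eig_D[:-2], 0.0, atol=1e-9))
# Because every row of G has unit length, the Eigen values of A_G sum to N.
print(eig_A.sum())
```

In practice only the 2×2 matrix needs to be decomposed, which is what makes the formulation inexpensive compared with working on the N×N matrix.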
Unfortunately, 1D data will have a scalar as its auto correlation. Equation (3) is in fact a good remedy for 1D data to have 2 axes (of inertia). In addition, the Eigen values can be seen as a representation of the extent (strength) in each direction.
Equation (3) can be implemented at a lower computational cost through the histogram h, which is normalized to have a sum of 1. Let us explore the benefits of Equation (3) by solving the Eigen formula $A_G V = \lambda V$. The Eigen vectors of $A_G$ represent the axes of inertia for the data set, while the Eigen values are the respective strengths. The vector Vmax (the one corresponding to the maximum Eigen value λmax) points toward the direction of maximum inertia. Since the y-component is not an independent component, the x-component represents a point of concentration of the original data. The Eigen vector Vmin (the one corresponding to the minimum Eigen value λmin), on the other hand, is normal to Vmax and hence is not guaranteed to have its x-component within the original data range.
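A possible histogram-based implementation is sketched below. It assumes an 8-bit image whose gray levels are scaled by 1/255 and the L2 complement; the function name `autocorr_from_histogram` is hypothetical. Under these assumptions the histogram form equals the pixel-wise product scaled by 1/N, which only rescales the Eigen values and leaves the Eigen vectors unchanged.

```python
import numpy as np

def autocorr_from_histogram(image_u8):
    """2 x 2 auto correlation matrix of the complement features, via the histogram."""
    pixels = np.asarray(image_u8, dtype=np.uint8).ravel()
    h = np.bincount(pixels, minlength=256).astype(np.float64)
    h /= h.sum()                      # normalized histogram, sums to 1
    g = np.arange(256) / 255.0        # assumption: 8-bit levels scaled by 1/255
    c = np.sqrt(1.0 - g**2)           # L2 complement of each gray level
    a_xy = np.sum(h * g * c)          # cross-correlation term
    return np.array([[np.sum(h * g**2), a_xy],
                     [a_xy, np.sum(h * c**2)]])

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
A_hist = autocorr_from_histogram(image)

# Direct pixel-wise computation for comparison (scaled by 1/N).
gi = image.ravel() / 255.0
G = np.column_stack((gi, np.sqrt(1.0 - gi**2)))
A_direct = G.T @ G / G.shape[0]

print(np.allclose(A_hist, A_direct))
print(np.trace(A_hist))   # close to 1, because every feature vector has unit length
```

The histogram version sums over 256 gray levels instead of N pixels, so its cost after the histogram is built does not depend on the image size.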
The hypothesis adopted in this study can be stated in an abstract form as: The largest Eigen value λmax (thereby Vmax) should be associated with the major process in the data, while the smallest Eigen value λmin (thereby Vmin) should be associated with the minor process. The major process can be the foreground or the background depending on the image content.
This motivates the use of the x-component of Vmax as a threshold. The Eigen values of $A_G$ (or their ratio) can also be used. This ratio can be considered an approximation of the proportion of minorities to majorities under the hypothesis stated above. Minorities can relate to noise or to small objects relative to the dominant (major) object in the image.
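The direct (search-free) use of Vmax can be sketched as follows. The example assumes an 8-bit image scaled by 1/255, the L2 complement, and an auto correlation matrix scaled by 1/N so that the two Eigen values sum to 1; the function name `vmax_threshold` and the synthetic two-level image are hypothetical.

```python
import numpy as np

def vmax_threshold(image_u8):
    """Return the x-component of Vmax (a threshold in [0, 1]) and the Eigen values."""
    g = np.asarray(image_u8, dtype=np.float64).ravel() / 255.0
    G = np.column_stack((g, np.sqrt(1.0 - g**2)))
    A = G.T @ G / g.size                    # 2 x 2 auto correlation, scaled by 1/N
    eigvals, eigvecs = np.linalg.eigh(A)    # Eigen values in ascending order
    v_max = eigvecs[:, -1]                  # Eigen vector of the largest Eigen value
    if v_max.sum() < 0:                     # the sign of an Eigen vector is arbitrary
        v_max = -v_max
    return v_max[0], eigvals

# Synthetic image: 70 dark pixels (level 51) and 30 bright pixels (level 204).
image = np.array([51] * 70 + [204] * 30, dtype=np.uint8).reshape(10, 10)
t, eigvals = vmax_threshold(image)
mask = image / 255.0 > t

print(t)                          # lies between the two gray levels 0.2 and 0.8
print(eigvals[0] / eigvals[1])    # ratio of the smallest to the largest Eigen value
print(mask.sum())                 # 30
```

Whether the pixels above the threshold are the foreground or the background depends on the image content, as noted in the hypothesis.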
An exhaustive search can be performed to find the optimum threshold based on similarity between the original image auto correlation matrix and that of the thresholded image. Comparison can then be implemented using Equation (4b), Eigen values and/or Eigen vectors as will be shown in the following sub-sections.
The component added to obtain a unit vector, see Equation (1), can be generalized to any fuzzy complement. Several forms, Equation (6), were tried, where n is a free parameter and k>3. Perfect thresholding was obtained with certain forms; however, the outcome is highly image dependent.
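To illustrate what a fuzzy complement looks like in code, the sketch below implements three standard classes from the fuzzy-set literature (standard, Yager and Sugeno). These are given as assumptions for illustration only and are not claimed to be the forms of Equation (6); the parameter name n is borrowed from the text, while `lam` is hypothetical.

```python
import numpy as np

def standard_complement(g):
    """Standard fuzzy complement: 1 - g."""
    return 1.0 - g

def yager_complement(g, n):
    """Yager class: (1 - g**n)**(1/n), with n > 0."""
    return (1.0 - g**n) ** (1.0 / n)

def sugeno_complement(g, lam):
    """Sugeno class: (1 - g) / (1 + lam * g), with lam > -1."""
    return (1.0 - g) / (1.0 + lam * g)

g = np.linspace(0.0, 1.0, 11)

# With n = 2 the Yager complement reduces to the L2 complement of a unit vector.
print(np.allclose(yager_complement(g, 2), np.sqrt(1.0 - g**2)))

# For other choices the pair [g, complement] is generally not a unit vector.
norms = np.hypot(g, yager_complement(g, 3))
print(norms.max())
```

All three functions map 0 to 1 and 1 to 0 and are involutive (applying them twice returns the input), which are the usual requirements for a fuzzy complement.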
The results are evaluated using the dual similarity measure (DSM) recently proposed by Ameer (2019), given by Equation (7), where B stands for background, F stands for foreground, subscript G denotes the ground truth image and subscript T denotes the thresholded image.

Cross Correlation
An exhaustive search can be performed using Equation (4b). Due to the ambiguity of whether the minority is on the dark side or on the bright side, Equation (9) should be used in an iterative split-and-merge paradigm. An exhaustive search can also be performed to find the optimum threshold based on the similarity between the auto correlation matrix of the original image and that of the thresholded image. In the form of comparison adopted, subscript T denotes the thresholded image, subscript O denotes the original image and i (from 1 to 2) is the index of the Eigen value used.
Various schemes can be realized using Eigen values. Some suggestions are:

Scheme TrimPos: Successive merging from the two data ends. The portion given by Equation (9) is applied at each iteration using the histogram, i.e., the components at each end are merged so that the end component has a size equal to that given by Equation (9).

Figure 1 shows the images used for testing together with their ground truth and the results of the popular Otsu method. The resultant thresholded images of the proposed schemes (given in the previous section) are shown in Fig. 2 (cross-correlation), Fig. 3 (Eigen values) and Fig. 4 (Eigen vectors) using Equation (6a) with n = 2. Table 1 lists the DSM, Equation (7), for the results (only the ones producing a binary image) of Fig. 2-4.
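For reference, the Otsu baseline used in the comparison can be sketched as below. This is a generic histogram-based version of the well-known method (maximizing the between-class variance), written here under the assumption of an 8-bit image; it is not taken from the paper, and the function name `otsu_threshold` is hypothetical.

```python
import numpy as np

def otsu_threshold(image_u8):
    """Return the gray level t in 0..255 that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    pixels = np.asarray(image_u8, dtype=np.uint8).ravel()
    p = np.bincount(pixels, minlength=256).astype(np.float64)
    p /= p.sum()                          # normalized histogram
    levels = np.arange(256)
    omega = np.cumsum(p)                  # probability of the lower class for each t
    mu = np.cumsum(p * levels)            # cumulative first moment for each t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 1e-12                 # skip thresholds that leave one class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    return int(np.argmax(sigma_b))

# Synthetic image: 70 dark pixels (level 51) and 30 bright pixels (level 204).
image = np.array([51] * 70 + [204] * 30, dtype=np.uint8).reshape(10, 10)
t = otsu_threshold(image)
mask = image > t
print(t)
print(mask.sum())   # 30
```

Unlike the direct Eigen vector schemes, this baseline evaluates its objective at every candidate gray level, i.e., it is an exhaustive search over the histogram.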

Experimental Results
A simple comparison of Fig. 2-4 clearly indicates the potential of the proposed schemes. A similar conclusion can also be inferred from Table 1. More investigation is needed to find the best form of complement for each image or domain of images.

It can be seen from Fig. 3 that schemes ValPos and ValNeg produce binary images. However, for some images, using local minima can produce multi-level thresholds or a range of thresholds. On the other hand, TrimPos and TrimNeg produced a multi-level image after no more merging was possible, i.e., the black and white regions are certain but the regions in between are not. Hence, a further decision is needed to obtain a binary image. The number of iterations for TrimNeg was lower than that for TrimPos, an observation worth exploring in future work.

Figure 4 reveals that scheme PosVec produces better results than VecPos and VecNeg. Interestingly, scheme VecDual produced a multi-level output: black, white and gray. The gray corresponds to ambiguous areas, in a similar fashion to TrimPos and TrimNeg (see the previous paragraph).

Table 2 lists the value of n corresponding to some variants of Equation (6) that will create the best output (lowest DSM) using schemes PosVec, NegVec, ValPos and ValNeg as a subset. The values are rough indicators, however. For some images, changing n within some range only slightly changes the value of DSM. In some other cases, there is more than one such range. For some images, some forms of Equation (6) do not produce good results for any n.

Conclusion and Future Work
Novel schemes are suggested in this study to perform image thresholding using one feature, intensity. Three descriptors from the auto correlation matrix (using the complement feature) are proposed, namely: cross-correlation, Eigen values and Eigen vectors. The algorithms are fully automatic and require no parameter adjustment of any sort. Interestingly, some schemes do not require an exhaustive search, see schemes PosVec and NegVec and the results from Equation (8). In addition, scheme NegVec can be extended to tri-level thresholding by using the x-component of both Eigen vectors.
Work is currently in progress to extend the algorithm to segment colored images and investigate the modification needed to perform segmentation on an arbitrary feature space.
Combining the information from the Eigen vector(s) and value(s) can improve the performance, as can be inferred from Fig. 4 and Table 1, where scheme VecDual can give a tri-state solution.
More work is needed to find the best formula from Equation (6), and possibly other fuzzy variants, to obtain better results given the domain of the images used. Elaborate testing is also needed to compare the performance of normalizing the image to [-1,1] versus [0,1].
One of the extensions to the proposed schemes is to threshold at all the minima (instead of the global minimum) of the objective function of any proposed scheme. This extension was observed to result in multi-level thresholding. An aggregated scheme of some form may also be helpful. However, performance evaluation is deferred to future work on segmentation.