Efficient Technique to Detect the Region of Interests in Mammogram Images (2008)

Problem Statement: Breast cancer is the most common cancer among women and the second leading cause of cancer deaths in women after lung cancer. An efficient technique for the early detection of microcalcification regions in mammogram images is therefore needed. Approach: The method proposed in this paper aims to enhance the performance of Computer Aided Diagnosis (CAD). This automatic method detects the region of interest in a mammogram image accurately and efficiently using a modified standard deviation technique. The proposed method is divided into three steps: (a) reducing the mammogram image size, (b) segmenting the breast region, and (c) detecting the region of interest. Results: Applying the technique to 386 mammogram images from the MIAS and USF databases showed that the method is highly sensitive in detecting microcalcifications in mammogram images, with a true-positive detection rate of 98.9%. Conclusions: The proposed technique showed a major improvement in the detection of microcalcifications and mass regions.


INTRODUCTION
Breast cancer is the second leading cause of cancer deaths in women today after lung cancer and is the most common cancer among women, excluding nonmelanoma skin cancers [23]. The early detection of breast cancer is a key factor in saving lives, as the effectiveness of some treatment methods increases with early detection [7,14]. Early detection can thus reduce mortality by 20-30% [5]. The automation of medical image analysis (i.e., mammography) has been of interest to researchers because it can increase the possibility of early diagnosis.
Screening mammography is currently the best tool available for the early detection of breast cancer [3]. Although its sensitivity is relatively high compared with that of other breast imaging modalities, its false-negative rate is still as high as 15-30% [4]. Double reading has been shown to improve sensitivity [5], but it is not cost-effective in a clinical setting. In order to improve the accuracy and sensitivity of interpretation, a variety of Computer-Aided Diagnosis (CAD) systems have been proposed to provide a second opinion for the specialist in detecting the suspicious regions in mammogram images.
Approaches in detecting the suspicious regions in mammogram images: Many authors have implemented algorithms to detect the suspicious regions in mammogram images using a variety of approaches. Some of these approaches are summarized in the following.
Using the wavelet: Paul S. et al. [1] proposed a method based on a hierarchical pyramid neural network for detecting suspicious locations in mammogram images. This method can successfully detect microcalcifications and masses in mammogram images, and can also reduce the false positive predictions in the detected mammogram images by 50%. Gurcan [7], in turn, developed a computer program to reduce the false positives among suspicious areas in mammogram images. The method is based on a Convolutional Neural Network (CNN). The potential microcalcification locations were first determined with global and local thresholds. An algorithm was then used to reduce the false positive (FP) percentage through a first rule-based classification that uses size, contrast and Signal to Noise Ratio (SNR) information. A trained CNN classifier was then used to recognize the abnormal patterns. Wavelet methods are also used to identify microcalcification clusters in mammogram images. Lemaur [4] proposed a new technique using wavelets and their Sobolev regularity index. The microcalcification detection is improved when using the modified Matzinger polynomials wavelet instead of the traditional wavelet. Songyang and Guan [8] proposed an algorithm to detect microcalcifications in mammogram images based on wavelet features and neural networks. They used the fourth level of the Daubechies orthogonal wavelet family. The features obtained from this level, the median contrast and the normalized gray level, are used as input to train a feed-forward NN classifier. Wavelet features are considered in training the NN in order to trace the likelihood map, which shows the likelihood of each pixel being a microcalcification. The algorithm achieved a mean true-positive detection rate of 94% at the cost of one false positive per image, and a mean true-positive detection rate of 90% at the cost of 0.5 false positives per image. Rafayah [10] proposed a computer-aided diagnosis algorithm based on wavelet analysis and a fuzzy-neural approach for detecting microcalcifications in mammogram images. In the feature extraction stage, coefficient vectors are extracted from the wavelet decomposition of the image, and the horizontal, diagonal and vertical coefficients are determined. Normalization of the coefficients, energy computation and feature reduction are then carried out. Two classifiers were generated: a global one, which processes the image using a neuro-fuzzy classifier, and a local one, which processes a cropped ROI.

Mathematical and Intelligent methods:
In addition, fuzzy logic and scale space approaches were used by Cheng [5] to detect microcalcifications in mammogram images. The first stage in Cheng's method was fuzzy image enhancement. The microcalcification clusters in the enhanced image were then detected using a Laplacian of Gaussian (LoG) filter. The approach was reported to be efficient and effective in locating microcalcifications in mammogram images. Netsch and Peitgen [11] proposed an approach for the automatic detection of microcalcifications that uses multi-scale analysis based on the Laplacian-of-Gaussian filter and a mathematical model that describes microcalcifications as bright spots of different sizes. Since microcalcifications are difficult to detect because of the variability in their shape, supervised learning using the Support Vector Machine (SVM) was proposed by Issam [3]. Bocchi [2] investigated Fractal Brownian Motion (FBM) theory for modeling the background of mammogram images, in order to detect microcalcifications by enhancing them against the modeled background and detecting the Region Of Interest (ROI) in the image. The advantage of using FBM in the spatial domain, where it is a stationary Gaussian random field, is that the background model can be estimated and the microcalcifications can then be enhanced against the background using a matched filter. This results in an image with positive peaks, from which the microcalcification clusters are detected based on the size and contrast of the spots.
Mammogram topology: Sheshardi and Kandaswamy [6] produced a new computer-aided diagnosis approach that can detect microcalcifications in the MIAS database. In this algorithm, the ROIs were cropped manually to a 256×256 cluster image that contains both the microcalcifications and the normal ROIs. They investigated the use of two digital filters together with a filter response energy measure as a texture feature extractor. The algorithm applies the digital filters according to the microcalcification criteria for each image or for each image resource. The true-positive rate in their case was high because the suspicious regions were cropped manually by the specialist. In a CAD system, however, manually cropped regions are usually not considered. Brijesh [9] proposed a novel algorithm for detecting microcalcifications in digital mammogram images. After experimenting with a number of features, the average histogram, average grey level, number of pixels, average boundary, difference, contrast, energy, entropy, STD and skew were found to be the most significant. The (min-max) and average of each feature are used as the initial values of the weights linking the input nodes and the hidden nodes. The algorithm detected microcalcifications with a success rate of 81.8%.
Statistical and shape measurements: Spiesberger [12] described a computer-aided mammographic algorithm. Brightness, compactness, and statistical measures were applied in a decision tree to characterize the candidates. The cross-correlation coefficient is then used to measure the presence of microcalcifications: if the cross-correlation coefficient is larger than a threshold, microcalcifications are declared. Davies and Dance [13] and Davies et al. [14,15] used a local thresholding technique to segment clustered microcalcifications. The local threshold is selected at the valley of the local histogram. If the local histogram is unimodal, the sub-image is interpolated from its neighboring sub-images. The segmented objects are analyzed using size, shape, and gradient measures to extract clusters of microcalcifications. Shen et al. [16] discussed different shape factors, including compactness, moments, and Fourier descriptors, in calcification analysis. These quantitative measures represent the roughness of the shape, which is used to classify calcifications. Fam et al. [17] proposed a method for the detection of fine-clustered calcifications. If the pixel intensity falls in a specific range, the region growing algorithm is applied and the intensity gradient is computed to test whether the candidate pixel satisfies the mean and variance criteria. The computation time of the algorithm is high. Mascio et al. [18] introduced a method for microcalcification segmentation in high-resolution digital mammograms. The enhanced image is obtained by the round high-emphasis technique, a high-pass filter that preserves rounded edges, and texture gist, which is the average of morphological opening and closing subtracted from the original image. A threshold technique is then applied to segment the microcalcifications. This method is limited to detecting round microcalcifications only.
The proposed computer aided diagnoses system: In a CAD system, the region of interest should be identified automatically and accurately in order to achieve a high true-positive percentage. Therefore, the mammogram image is processed in three stages in order to highlight the region of interest in that image. The flow chart in Fig. 1 shows these stages, which are presented in the following.
Image reduction stage: Most mammogram images are large, high-resolution images that require specialized computing facilities to enable efficient processing. To facilitate the transmission of these images over computer networks, image compression techniques are usually applied. In this paper, we present a size reduction algorithm that can be implemented on most mammogram images as a pre-processing step to reduce their size without affecting their quality [20].
The image reduction technique comprises three main stages. First, the image is shrunk; then the conversion process is carried out; finally, the mammogram image is scaled down using the bicubic interpolation technique.

Image shrinking:
The image shrinking algorithm is used to eliminate the unused grey levels in the original 16-bit image. The shrinking process is as follows:
• Find the histogram for the entire digital mammogram.
• Eliminate the unused grey levels by replacing them with the adjacent used grey level. The resulting histogram has a limited number of grey levels, with no gaps among them.
• Generate the output image based on the new histogram.
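The shrinking step can be illustrated with a short Python sketch. It takes one possible reading of the description, namely that every used grey level is mapped to its rank among the used levels so that the new histogram has no gaps. The function name `shrink_grey_levels` and the list-of-rows image representation are assumptions made for illustration and are not taken from the authors' C++ implementation.

```python
def shrink_grey_levels(image):
    """Remap the used grey levels of `image` to consecutive integers.

    `image` is a list of rows of non-negative integers (e.g. 16-bit values).
    Returns the remapped image and the number of grey levels actually used.
    """
    # Histogram step: collect every grey level that occurs at least once.
    used_levels = sorted({pixel for row in image for pixel in row})
    # Assumed remapping: each used level is replaced by its rank,
    # which closes every gap left by the unused levels.
    rank = {level: index for index, level in enumerate(used_levels)}
    shrunk = [[rank[pixel] for pixel in row] for row in image]
    return shrunk, len(used_levels)


image = [[0, 4000, 4000], [65000, 12, 0]]
shrunk, n_levels = shrink_grey_levels(image)
print(shrunk)    # [[0, 2, 2], [3, 1, 0]]
print(n_levels)  # 4
```

The ordering of grey levels is preserved, so brighter pixels remain brighter after the remapping.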
This algorithm was implemented in C++, and the practical implementation is shown in Fig. 2. The first step in the algorithm is to find the number of grey levels used in the 16-bit image, which is usually less than 65536. Thus, the maximum level is determined for each image. Then, the maximum level is rescaled to lie in the range of 250-255 grey levels. The real challenge, however, is to find the coefficient (the divider) that enables this. The divider is determined based on the characteristics of the input image.
This stage was applied to 64 USF mammograms, and in all cases the histogram of the resulting 8-bit image was very similar to the histogram of the 16-bit input image, as shown in Fig. 3. The final results are shown in Fig. 4. It is clear from Fig. 4 that the output result is approximately similar to the original one.
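As a rough illustration of the conversion step, the following Python sketch rescales an image so that its maximum grey level lands at the top of the 8-bit range. The text does not state how the divider is chosen, so the rule used here (maximum level divided by 255) is a hypothetical choice, and `convert_to_8bit` is an illustrative name.

```python
def convert_to_8bit(image, target_max=255):
    """Rescale an integer image so that its maximum grey level becomes target_max.

    Returns the rescaled image and the divider that was applied.
    """
    max_level = max(pixel for row in image for pixel in row)
    if max_level == 0:
        # A completely black image needs no rescaling.
        return [row[:] for row in image], 1.0
    # Hypothetical choice of divider: the ratio that sends max_level to target_max.
    divider = max_level / target_max
    # Integer arithmetic avoids floating-point rounding at the top of the range.
    converted = [[pixel * target_max // max_level for pixel in row] for row in image]
    return converted, divider


converted, divider = convert_to_8bit([[0, 800, 4080]])
print(converted, divider)  # [[0, 50, 255]] 16.0
```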
Image scaling: The capability to digitally interpolate mammogram images to different sizes while maintaining their features and quality is important for many applications. Mammogram images are high-resolution images because they contain small features of interest that may be of significant importance for radiologists. In mammogram images there are no abrupt changes between neighboring pixels. Therefore, bi-cubic interpolation, which generates a representative pixel from 16 neighboring pixels, is well suited to scaling down mammogram images. Bi-cubic interpolation is a sophisticated technique that produces smoother edges than bilinear interpolation [19]. In addition, it combines relatively good effectiveness with reduced complexity and maintains good quality for scaled images [22]. Further information on image scaling using bi-cubic interpolation can be found in [21]. Mammogram images need to be scaled down to enable better transfer and processing, and the bi-cubic interpolation technique is used to reduce the size of the mammogram efficiently without affecting its quality or regions of interest. A microcalcification cluster is defined as at least 3 microcalcifications within a 1 cm² region of the mammogram [6]. Therefore, the scaling ratio should keep the microcalcification cluster clear and easily detectable by radiologists. In most mammogram cases, the smallest microcalcification cluster area has about 37 pixels in high-resolution images. Therefore, the maximum down-scaling ratio was set to 50% of the image height and 50% of the image width. This ratio ensures that the microcalcification clusters can still be detected by radiologists.
After the conversion technique was carried out, the scaling procedure was applied to the whole mammogram database using the bi-cubic interpolation technique. As a result, an image with a size of 15,338,672 bytes becomes 1,925,120 bytes, so the reduction ratio is about 87%, as shown in Fig. 5.
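The reported reduction ratio can be checked with a few lines of Python. The sketch also compares it with the size fraction one would expect from halving both dimensions and converting 16-bit pixels to 8-bit pixels. That comparison is an inference from the stages described above and not a figure stated by the authors, and both function names are illustrative.

```python
def reduction_ratio(original_bytes, reduced_bytes):
    """Fraction of the original file size that was removed."""
    return 1 - reduced_bytes / original_bytes


def expected_size_fraction(height_scale, width_scale, bits_in, bits_out):
    """Remaining size fraction after scaling both axes and changing bit depth."""
    return height_scale * width_scale * bits_out / bits_in


# File sizes quoted in the text.
print(f"{reduction_ratio(15_338_672, 1_925_120):.1%}")  # 87.4%
# Assumed pipeline: 50% height, 50% width, 16-bit to 8-bit.
print(expected_size_fraction(0.5, 0.5, 16, 8))           # 0.125
```

Under these assumptions the expected remaining size is 12.5% of the original, that is, a reduction of 87.5%, which is close to the reported figure of about 87%.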
The mammogram images are then ready to be segmented and processed in order to find the regions of interest in each image.

Mammogram image segmentation:
In normal and dense breasts, the extraction process is quite simple, since the intensity of the breast region is high and the corresponding histogram has two main distinct peaks. In the fatty breast, however, the background is part of the breast region. Therefore, a two-stage technique is developed below to separate the background from the breast region [23].
• In the first stage, the average intensity of the image is computed. For an image I with dimensions m×n and intensity level L, the average intensity of the image can be calculated using Eq. 1:

M = \frac{1}{m \, n} \sum_{x=1}^{m} \sum_{y=1}^{n} I(x, y)    (1)

Where: M = Average image intensity. m = Image width. n = Image height.
The main purpose of this method is to return the average value.In all mammogram images, the average point is located inside the breast region because of its high intensity.
• The next step of this algorithm is to find the threshold value that can extract the breast region accurately.The threshold value will be less than the average point and is calculated using Eq. 2. This value is found empirically and provides accurate results for the 386 mammograms.
The last stage in this technique is to apply the morphological dilation process to smooth the spiky edges that are included in the segmented regions and to restore the parts of the skin that were eliminated in the previous step, as shown in Fig. 6. The morphological operator uses a structuring element. For our work, the structuring element is a small mask that is overlapped with the input image to generate a smoothed pattern. Our structuring element is a circular mask of size 15×15. Since the breast has a curved shape, the circular mask is the most suitable structuring element for this task. In addition, the size was set to 15×15 based on the mammogram image resolution.
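The segmentation stages described above (average intensity, thresholding, dilation with a circular mask) can be sketched in plain Python as follows. The empirical threshold formula is not reproduced in the text, so the parameter `factor`, which takes the threshold as a fraction of the average intensity, is a hypothetical stand-in. All function names are illustrative and do not come from the authors' implementation.

```python
def mean_intensity(image):
    """Average of all pixel intensities in the image."""
    total = sum(sum(row) for row in image)
    return total / (len(image) * len(image[0]))


def disk(size):
    """Offsets (dy, dx) of a circular structuring element inside a size x size mask."""
    r = size // 2
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= r * r]


def dilate(mask, offsets):
    """Binary dilation: every set pixel stamps the structuring element onto the output."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in offsets:
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        out[yy][xx] = 1
    return out


def segment_breast(image, factor=0.5, mask_size=15):
    """Threshold below the average intensity, then dilate with a circular mask."""
    # `factor` is a hypothetical placeholder for the empirical threshold rule.
    threshold = factor * mean_intensity(image)
    mask = [[1 if pixel > threshold else 0 for pixel in row] for row in image]
    return dilate(mask, disk(mask_size))


# Tiny example: one bright pixel, dilated with a 3x3 circular mask.
tiny = [[0] * 5 for _ in range(5)]
tiny[2][2] = 100
for row in segment_breast(tiny, factor=0.5, mask_size=3):
    print(row)
```

Because the circular mask is symmetric, stamping it onto every foreground pixel gives the same result as the usual neighbourhood-maximum definition of dilation.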
After selecting the 386 mammogram images from the two databases (MIAS and USF), we applied our technique. The results can be summarized as follows:
• For normal breasts, applying the morphological operator improves the performance of the segmentation technique. The extracted region contains only the breast region along with the breast skin. In addition, the extracted breast boundary is smooth, as shown in Fig. 7. The breast is extracted and stored in a dynamic array that represents the best shape.
• For dense breasts, the technique successfully extracts the breast region. The extracted region contains the desired region of interest, as shown in Fig. 8.
• For fatty breasts, the extraction process is a real challenge but is carried out successfully in almost all cases. Fig. 9 clearly shows the smoothness of the extracted region.

Region of interest detection:
The last stage in the preprocessing is the detection of the Region Of Interest (ROI) in digitized mammogram images. Before detecting the possible microcalcification locations, artifacts inside the breast area need to be eliminated from the segmented mammogram image. We developed a technique that identifies these artifacts, which result from different noise sources such as the digital machine, the scanner, and the scanning process. The technique is based on the fact that microcalcification areas in mammogram images are hazy regions, that is, the pixels within these regions have intensities that range from 70 to 240 grey levels. The authors regard this range as characteristic of hazy areas after reviewing 386 mammogram images from different sources. In accordance with these observations, two thresholds are set to eliminate artifacts within the breast region. The upper threshold is set to 240 to eliminate the shiny artifact regions. The lower threshold is calculated from the average and the standard deviation of all the non-zero-valued pixels. This lower threshold not only helps to eliminate the background regions and low-intensity artifacts, but also eliminates the low-level boundary regions of the breast, which reduces the size of the region of interest. Fig. 10 shows the lower and upper boundaries, while Fig. 11 shows the eliminated regions and the resulting region of interest after the removal of the breast artifact regions.
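A minimal Python sketch of the two-threshold artifact removal is given below. The text says only that the lower threshold is computed from the mean and the standard deviation of the non-zero pixels. The combination used here (mean minus `k` standard deviations) and the parameter `k` are assumptions for illustration, as are the function names.

```python
import math


def hazy_region_thresholds(image, upper=240, k=1.0):
    """Return (lower, upper) thresholds computed from the non-zero pixels."""
    nonzero = [pixel for row in image for pixel in row if pixel > 0]
    if not nonzero:
        raise ValueError("image contains no non-zero pixels")
    mean = sum(nonzero) / len(nonzero)
    std = math.sqrt(sum((p - mean) ** 2 for p in nonzero) / len(nonzero))
    # Assumed combination of mean and standard deviation; not given in the text.
    lower = mean - k * std
    return lower, upper


def remove_artifacts(image, lower, upper):
    """Zero every pixel that falls outside the [lower, upper] range."""
    return [[pixel if lower <= pixel <= upper else 0 for pixel in row]
            for row in image]


image = [[0, 100, 200], [250, 150, 0]]
lower, upper = hazy_region_thresholds(image)
print(remove_artifacts(image, lower, upper))  # [[0, 0, 200], [0, 150, 0]]
```

In this example the pixel at 250 is removed by the fixed upper threshold of 240, and the pixel at 100 falls below the computed lower threshold.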
The algorithm used to detect the ROIs is inspired by the distribution and characteristics of a mountain area. It is well known that the contour or mesh distribution of a mountain is similar to the Gaussian distribution. This means that the peak at the mountain center is the highest point in altitude among all the surrounding points. Applying this simple natural fact to mammogram images, the detection of microcalcifications becomes equivalent to the detection of mountain peaks: a microcalcification is a peak when compared with its neighboring pixels. This is clearly seen for a microcalcification in a cluster, where the mesh grid of the image has a high intensity compared with the surrounding pixels, as shown in Fig. 12. Therefore, the main purpose of this algorithm is to search for peaks in the mammogram image. For this purpose, a small mask (i.e., 9×9 mask size) is used to scan the segmented mammogram image, and peak detection is performed for each move of this mask. The algorithm can be summarized as follows:
• A mask of size 9 by 9 is chosen to detect ROIs. The reason behind the selection of this size comes from the USF and MIAS database resolution. The USF and MIAS databases are scanned at resolutions of 45 µm × 45 µm and 50 µm × 50 µm respectively, in order to capture microcalcifications, whose dimension is less than 1 mm. Therefore, the number of pixels at these resolutions is 22×22 pixels/mm in the USF database and 20×20 pixels/mm in the MIAS database. In our case, this resolution is decreased in the scaling process, where the image is scaled down to half of its height and half of its width. The new pixel size is therefore approximately 100 µm × 100 µm, which means that the number of pixels is 10×10 pixels/mm. Therefore, a mask of size 9×9 is suitable for detecting the microcalcification points. This mask size was tested on 382 mammogram cases from the two database resources and was efficient in detecting the microcalcification points in the mammogram images. Thus a 9×9 cluster, whose odd size gives it a single central pixel, is chosen to extract microcalcification regions for both databases.
• For each step movement of the mask, the central point is chosen to be the reference pixel. Starting from this central point, eight connected branches are defined as the diagonals of this cluster, as shown in the example in Fig. 13.
• For each branch, or the equivalent diagonal, the modified standard deviation is calculated according to Eq. 3:

STD_i = \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( X_j - Center \right)^2}, \quad i = 1, 2, \dots, 8    (3)

Where: STD_i = The standard deviation for each branch (i). Center = The cluster center (200 in this example). X_j = The pixel intensity in the specified branch. N = Number of pixels in the specified branch.
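The branch computation can be sketched in Python as follows. The sketch assumes that each of the eight branches runs outward from the center of the mask along one of the eight compass directions, that the center pixel itself is excluded from the branch, and that the squared deviations are averaged over the N branch pixels. These details are assumptions, and the names `DIRECTIONS`, `branch_pixels`, `modified_std` and `branch_mstds` are illustrative.

```python
import math

# Eight directions as (row step, column step), clockwise starting from "up".
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def branch_pixels(window, direction):
    """Pixels from the center outward along one direction, center excluded."""
    c = len(window) // 2
    dy, dx = direction
    return [window[c + step * dy][c + step * dx] for step in range(1, c + 1)]


def modified_std(values, center):
    """Standard deviation taken around the center value instead of the mean."""
    return math.sqrt(sum((x - center) ** 2 for x in values) / len(values))


def branch_mstds(window):
    """Modified standard deviation of each of the eight branches of a square window."""
    c = len(window) // 2
    center = window[c][c]
    return [modified_std(branch_pixels(window, d), center) for d in DIRECTIONS]


# 9x9 window: flat background of 190 with a center value of 200.
window = [[190] * 9 for _ in range(9)]
window[4][4] = 200
print(branch_mstds(window))  # eight values of 10.0
```

Measuring the deviation from the center value rather than from the branch mean makes the statistic large when the center stands out from its surroundings, which matches the peak analogy in the text.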

RESULTS
In order to detect a potential microcalcification, a counter is set to zero. The counter is incremented by one whenever the STD of a branch is greater than a specific experimental value. Applying the above method to 386 mammogram cases, we found that a value of 8 is suitable for our experimental technique. If the counter value is greater than 4 (half of the number of branches), the cluster center is considered a peak. This rule is adopted because a microcalcification could extend beyond the mask region, as shown in Fig. 14. Under this condition, it is impossible to find more than four adjacent branches that satisfy the modified standard deviation (MSTD) criterion within the mask region.
As a result, the suspicious regions in the mammogram image are all included, especially the microcalcification regions. Figure 15 shows the results of applying this technique to different types of breast (normal, dense and fatty) collected from different database resources. However, the number of false positives (the blood vessels) in this method is high and needs to be reduced.
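The counting rule can be written as a small Python function. It assumes that the eight branch values have already been computed for the current mask position. The function name `is_peak` and its parameter names are illustrative, while the default values 8 and 4 are the ones quoted in the text.

```python
def is_peak(branch_stds, std_threshold=8, min_branches=4):
    """Decide whether the mask center is a peak from its branch STD values."""
    counter = 0
    for std in branch_stds:
        # Count the branches whose modified STD exceeds the experimental value.
        if std > std_threshold:
            counter += 1
    # The center is a peak only if more than half of the eight branches qualify.
    return counter > min_branches


print(is_peak([10, 9.5, 12, 3, 2, 15, 8.1, 1]))  # True  (5 branches above 8)
print(is_peak([10, 9.5, 12, 3, 2, 15, 8.0, 1]))  # False (4 branches above 8)
```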

DISCUSSION
The algorithm was first evaluated against the proven cases, and the results were good. A subjective evaluation was then carried out using a small survey conducted at the radiology department of the King Hussein Cancer Center (KHCC) in Jordan. In this study, 32 cases with microcalcification from the USF database and 25 cases with microcalcification from the MIAS database were considered. The radiologists were asked to evaluate both the original images and the processed images. The survey was designed to measure the degree of satisfaction that each radiologist had with the processed images when compared to the originals. Four specialists were involved in evaluating the cases.
Two images per case were presented to each radiologist: the first is the original and the second is the processed mammogram image. The radiologists were asked to compare the original and the processed images. The comparison was based on their ability to highlight the microcalcification in both the original and the processed image. If the microcalcification was picked up and highlighted by the specialist, the case is counted as a True Positive (TP), whereas a microcalcification that was missed is counted as a False Negative (FN). Table 1 shows the percentage of TP and FN cases detected by the specialists analyzing the modified mammogram images.
As shown in the previous table, the percentage of True-Positive (TP) cases was high (~98.9%). Therefore, the algorithm can successfully detect and highlight the microcalcifications accurately. In very few cases (<1.2%), the algorithm missed the microcalcification boundary, so these cases are considered False Negatives (FN). The false-negative percentage was very small compared with the true-positive percentage. So, this algorithm can reasonably detect the microcalcification in mammogram images. Besides, this algorithm is efficient for high-resolution mammogram images since it is fast and simple.
For example, processing an image of 2486×914 pixels takes 6.8 seconds on a 1.8 GHz processor with 512 MB of RAM.

Fig. 3: The 8-bit modified histogram. (a) The original histogram of the mammogram image. (b) The 8-bit modified histogram of the mammogram image.

Fig. 6: The thresholding and the morphological processes. (a) Original mammogram image. (b) Original mammogram image. (c) Segmentation using average intensity with dilation process.