A Novel SIFT-SVM Approach for Prostate Cancer Detection

Abstract: Prostate cancer is a major concern in the male population, as it is said to affect 1 in every 7 men during their lifetime. The number of registered cases and the mortality rate are increasing every year. Because Magnetic Resonance Imaging (MRI) images are high-resolution and multi-dimensional, proper diagnostic systems and tools are required to interpret them. In this study, a multiclass Support Vector Machine (SVM) classifier, a well-known machine learning technique, is used to classify prostate images into three categories: normal, benign and malign. The study also uses the Scale Invariant Feature Transform (SIFT) feature extraction method, which is well known for its invariance to rotation. A SIFT-SVM approach is introduced for the first time in prostate cancer detection. The performance of the system is measured in terms of sensitivity, specificity and accuracy. Our approach achieved an accuracy of about 99.95% when 40% of the data was used for training.


Introduction
Prostate cancer is the most commonly found form of cancer in the male population and is a major cause of mortality in developed countries (Niaf et al., 2014). It originates in the prostate gland, a walnut-sized exocrine gland of the male reproductive system, and gradually spreads to neighboring parts of the body. According to one study, an average of about 1.1 million cases of prostate cancer are registered yearly; about 1,276,106 new cases and 358,989 deaths were registered worldwide in 2018 alone, and the numbers are increasing steadily every year (PCF, 2020). Usually a Digital Rectal Examination (DRE) followed by a Prostate Specific Antigen (PSA) test is performed to check for any abnormality in the prostate (Janney et al., 2017). An elevated PSA level indicates a potential risk of cancer, but it can also result from other conditions (PCF, 2020):
i. Prostatitis: A condition in which the prostate gland becomes inflamed due to a bacterial infection. It is not a serious condition and can be easily treated with antibiotics.
ii. Benign Prostatic Hyperplasia (BPH): An enlargement of the prostate gland, a common condition that may occur with advancing age. It can be corrected with medication or minor surgery.
iii. Prostate Carcinoma: A condition caused by the abnormal and uncontrollable growth of malignant cells in the prostate gland. It is a serious condition that requires immediate medical attention, as it may otherwise prove fatal. It is usually treated with surgery, chemotherapy or hormone therapy.
Therefore, once the test shows a spike in the PSA level, a series of biopsies is usually performed to check for the presence of malignant cells in the prostate.
The Gleason grading system helps in evaluating the prognosis of men with prostate cancer using samples from a prostate biopsy. In combination with other criteria, this method of prostate cancer staging predicts the prognosis and helps guide the treatment. Figure 1 shows the histology grading of the Gleason score: the lowest grade is assigned to normal prostate tissue, and the higher the score, the more aggressive the stage the cancer has reached (CCA, 2000).
With the introduction of Computer Aided Diagnosis (CAD) systems, painful biopsies can be avoided to a large extent and the location of the tumor cells can be identified precisely (Reda et al., 2016). An MRI scan is usually preferred over a CT scan because MRI images provide more anatomical and functional detail of the prostate, whereas a CT scan suffers from poor soft-tissue resolution and is usually performed when the cancer has metastasized (i.e., spread beyond the prostate gland) (PCF, 2020). MRI images have therefore been used in this study. An MRI scan provides a cross-sectional view of the gland, like a CT scan, but can also provide views from multiple angles. The images obtained by an MRI scan give a clear picture of the prostate. Compared with Trans Rectal Ultrasound (TRUS) images, MRI images are more sensitive in diagnosing the disease.

TRUS is still the most customary modality used for prostate imaging because it is less expensive and facilitates real-time ultrasound-guided biopsies (Llobet et al., 2007). However, TRUS does not clearly differentiate between the various regions of the prostate and suffers from low sensitivity and specificity, so many painful biopsies are needed to obtain reliable results.
In this study, the Scale Invariant Feature Transform (SIFT) technique is used to extract features from the MRI images. Key points are first drawn from reference images stored in a database; the same key points are later compared with those of new images to find similar features, which form the candidate matching features. These features are then used to assign the MR image to one of three categories (normal, benign and malign) by a multi-class Support Vector Machine (SVM) classifier. It was observed that passing the SIFT features through the multi-class SVM classifier improved performance.
Chan et al. (2003) were the first to work on a multi-parametric CAD system for diagnosing prostate cancer. Their method used line-scan diffusion, T2 and T2W images in conjunction with an SVM classifier to classify predefined areas of the prostate peripheral zone for prostate cancer. Le et al. (2017) presented a computerized approach based on multimodal Convolutional Neural Networks (CNNs) for two prostate carcinoma diagnosis tasks: distinguishing cancerous from noncancerous tissue, and distinguishing Clinically Significant (CS) from indolent prostate cancer. These results were further combined with an SVM classifier and some handcrafted features. Giannini et al. (2015) proposed a fully automatic CAD system designed as a two-stage process. Initially, a probability map of malignant tumors is generated for all vertices inside the prostate. A candidate segmentation phase is then carried out to indicate suspicious regions, thus estimating both the sensitivity and the number of False-Positive (FP) regions. Here the SVM used a radial basis kernel for classifying the images. Yang et al. (2017) presented an automated prostate cancer detection system capable of simultaneously identifying the existence of cancer in an image and locating the abnormalities, based on deep Convolutional Neural Network (CNN) features fed to a single-stage SVM classifier.
Artan et al. (2010) proposed the use of cost-sensitive Support Vector Machines (SVMs) for automatic localization of prostate cancer. Their experimental results showed that multispectral MRI improved the accuracy of locating the cancer compared to standard MR images, and that the cost-sensitive SVM gave a significant boost in performance over traditional SVM classifiers. Niaf et al. (2012) proposed a CAD system to estimate the presence of prostate cancer in the peripheral zone using several types of multi-parametric MRI images, comprising T2-Weighted (T2W), Diffusion-Weighted (DW) and Dynamic Contrast-Enhanced (DCE) MRI images. They used features from the grey-level images and compared the results of four classifiers: nonlinear Support Vector Machine (SVM), k-nearest neighbors, Naive Bayes and linear discriminant analysis. Litjens et al. (2011) proposed a multi-stage CAD scheme to reduce perception and oversight errors in screening for prostate cancer using MRI. A 2-stage classification method was incorporated to estimate the probability of abnormalities in the prostate regions, with the SVM classifiers trained on voxel features.

Related Works
Nasser and Dogru (2017) proposed using the Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) algorithms to improve signature recognition. A vector quantization technique was applied to Bag-Of-Words (BOW) features, which represent the important points of every training image within a histogram of unified dimension. For this study they used a multiclass SVM classifier. Li and Wang (2018) used the SIFT algorithm to extract key points for content-based image classification. The K-means clustering algorithm was used to cluster the features, and a Bag Of Words (BOW) representation of each image was then constructed. An SVM classifier was used to classify the images. Bagla and Bhushan (2016) proposed a novel face recognition method using a hybrid SIFT-SVM approach. Results were categorized into child, adult and old age. The recognition rate was computed using the False Acceptance Rate (FAR) and False Rejection Rate (FRR). The method performed well under a variety of conditions such as different poses, lighting conditions and facial expressions. Hussain et al. (2018) employed various machine learning techniques, such as SVM kernels (Radial Basis Function (RBF) and Gaussian) and decision trees, for the detection of prostate cancer. They also employed feature extraction strategies based on texture, structural, SIFT and Elliptic Fourier Descriptor (EFD) features. Finally, the performance was assessed on single features as well as on combinations of these features using the above classification techniques.

Dataset
For this study, the dataset was collected from a publicly available database provided by Harvard University (National Center for Image Guided Therapy, a research center for biomedical technology, Radiology Department), situated at Brigham and Women's Hospital, Harvard Medical School, and funded by the National Institutes of Health. The database consists of MRI images that can be used for research purposes and is accessible at http://prostatemrimagedatabase.com/index.html. It contains images from different series, together with the medical exam descriptions of various patients, and includes both prostate and brachytherapy images. We considered 150 images, for which features were extracted using the SIFT feature extraction method and which were classified as normal, benign or malign prostate images using a multiclass SVM approach.
Lowe (2004) pioneered the use of SIFT as an image descriptor for matching and recognizing images. SIFT is an algorithmic approach for detecting and identifying local patterns in digital images. It locates key points and provides them with descriptive data that can be used for object recognition. SIFT works by extracting key points from the image and comparing them with those of reference images stored in a database (Wang et al., 2013). Each feature in the new image is compared with the features of the stored images to find matching features, for which the Euclidean distance between the feature vectors is calculated. Once the candidate features are extracted, they are filtered further based on location, scale and orientation. The SIFT algorithm provides a set of image features that are robust to noise, scaling, occlusion, illumination changes, rotation and blurring.

Scale Invariant Feature Transform (SIFT)
The basic idea of how SIFT assigns a rotation-invariant orientation to each key point and extracts the descriptor is illustrated in Fig. 2. Each key point location is assigned one or more orientations based on the local gradient direction of the image, after which a descriptor for the local image region is computed until a highly distinctive local image region is obtained. The working of the original SIFT algorithm is shown in Fig. 3 (Mizuno et al., 2011). An 8-bit grayscale image is taken as input and a scale space is built from it using Gaussian filtering; key point extraction is then performed using the Difference of Gaussian (DoG) (PCF, 2020). As a third step, an orientation is assigned to every extracted key point, which makes the features invariant to image rotation. Finally, the key point descriptors are calculated. This series of steps is performed iteratively until a highly distinctive local image region is obtained. A similar approach has been followed in our work.

Scale-Space Extrema Detection
This step identifies locations and scales that can be repeatably identified under different views of the same image. This is achieved using a Gaussian function.
The scale space is defined by the function:
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} e^{-(x^{2} + y^{2})/(2\sigma^{2})}$$
where I(x, y) is the input image, G(x, y, σ) is the variable-scale Gaussian and * denotes convolution. The difference of Gaussian function is then calculated as the difference of two Gaussian-smoothed images whose scales are separated by a factor k, and is given by the equation:
$$D(x, y, \sigma) = \left(G(x, y, k\sigma) - G(x, y, \sigma)\right) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$
To identify the local maxima and minima of D(x, y, σ), every point is compared to its eight neighbors at the same scale and to its nine neighbors in each of the scales above and below. If the value is the lowest or the highest among all these points, the point is said to be an extremum.
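The neighbor comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this study: it assumes the DoG layers have already been computed and are stored as a list of 2D lists, and the function name `is_extremum` and the toy data are hypothetical.

```python
def is_extremum(dog, s, y, x):
    """Return True if dog[s][y][x] is strictly larger or strictly smaller
    than all 26 neighbours (8 at its own scale, 9 at each adjacent scale)."""
    centre = dog[s][y][x]
    neighbours = []
    for ds in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if ds == 0 and dy == 0 and dx == 0:
                    continue  # skip the candidate point itself
                neighbours.append(dog[s + ds][y + dy][x + dx])
    return centre > max(neighbours) or centre < min(neighbours)


# Toy stack of three 3x3 DoG layers (hypothetical values):
# only the centre of the middle layer stands out.
flat = [[0.0] * 3 for _ in range(3)]
peak = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
dog_stack = [flat, peak, flat]
print(is_extremum(dog_stack, 1, 1, 1))  # True
```

The caller is expected to pass only interior points, so that every neighbor index exists in the stack.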

Keypoint Extraction
In this step, the algorithm removes further points from the key point list by eliminating low-contrast features and features that are poorly localized along an edge. This is achieved by using the Laplacian function. If the value obtained is less than the threshold value, that feature is excluded.
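As an illustration of this filtering step, the sketch below follows the two tests described by Lowe (2004): a contrast threshold on the DoG value and a ratio test on the 2 × 2 Hessian of D to reject edge responses. The default values (0.03 for the contrast threshold, assuming pixel values scaled to [0, 1], and 10 for the edge ratio) are those suggested by Lowe and are assumptions here, since the thresholds used in this study are not stated; the function name `keep_keypoint` is hypothetical.

```python
def keep_keypoint(d_value, dxx, dyy, dxy, contrast_thr=0.03, edge_ratio=10.0):
    """Decide whether a candidate key point survives filtering.

    d_value       : DoG value at the candidate location
    dxx, dyy, dxy : second derivatives of D at that location (2x2 Hessian)
    """
    # Low-contrast rejection: weak DoG responses are sensitive to noise.
    if abs(d_value) < contrast_thr:
        return False
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    # A non-positive determinant means the curvatures have different signs.
    if det <= 0:
        return False
    # Edge rejection: a large ratio of principal curvatures indicates an edge.
    return trace * trace / det < (edge_ratio + 1.0) ** 2 / edge_ratio
```

For example, `keep_keypoint(0.1, 1.0, 1.0, 0.0)` keeps a well-localized blob-like point, whereas `keep_keypoint(0.1, 20.0, 1.0, 0.0)` rejects a point whose curvature is strong in only one direction.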

Orientation Assignment
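A histogram of gradient orientations θ(x, y) is formed for each key point, weighted by the gradient magnitudes m(x, y) of the neighboring points:
$$m(x, y) = \sqrt{\left(L(x+1, y) - L(x-1, y)\right)^{2} + \left(L(x, y+1) - L(x, y-1)\right)^{2}}$$
$$\theta(x, y) = \tan^{-1}\left(\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}\right)$$
where L is the Gaussian smoothed image.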
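A minimal sketch of the orientation step is given below, using central pixel differences on the smoothed image L. It is a simplification under stated assumptions: the 36-bin histogram follows Lowe (2004), the Gaussian weighting of the samples is omitted, and the names `gradient` and `orientation_histogram` as well as the `radius` parameter are hypothetical.

```python
import math


def gradient(L, x, y):
    """Gradient magnitude and orientation (radians) at pixel (x, y),
    computed from central differences on the smoothed image L[y][x]."""
    dx = L[y][x + 1] - L[y][x - 1]
    dy = L[y + 1][x] - L[y - 1][x]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def orientation_histogram(L, cx, cy, radius=1, bins=36):
    """Magnitude-weighted histogram of orientations around (cx, cy)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            m, theta = gradient(L, x, y)
            deg = math.degrees(theta) % 360.0
            hist[int(deg // (360.0 / bins)) % bins] += m
    return hist


# Toy 5x5 image whose intensity increases from left to right.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
hist = orientation_histogram(ramp, 2, 2)
dominant_bin = hist.index(max(hist))  # bin 0, i.e. orientations near 0 degrees
```

The dominant bin of the histogram gives the orientation assigned to the key point; in the toy ramp all gradients point along the x axis, so all of the weight falls into the first bin.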

Keypoint Descriptor
The local gradient information employed in the previous step is used to create the keypoint descriptors. A keypoint descriptor normally uses a set of 16 histograms arranged in a 4 × 4 grid, each with 8 orientation bins (Fig. 2), which gives a 128-element feature vector.
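To make the layout of the descriptor concrete, the following sketch builds the 4 × 4 × 8 = 128 values from a 16 × 16 patch of gradient magnitudes and orientations. It is a simplified illustration only: the Gaussian weighting, the interpolation between bins and the rotation of the patch to the key point orientation used in the full algorithm are omitted, and the function name `build_descriptor` is hypothetical.

```python
import math


def build_descriptor(mags, thetas):
    """Build a 128-element descriptor from a 16x16 patch.

    mags, thetas: 16x16 lists of gradient magnitudes and orientations
    (radians), assumed to be precomputed around the key point.
    """
    desc = [0.0] * 128
    for y in range(16):
        for x in range(16):
            cell = (y // 4) * 4 + (x // 4)        # one of the 16 grid cells
            deg = math.degrees(thetas[y][x]) % 360.0
            b = int(deg // 45.0) % 8              # one of the 8 orientation bins
            desc[cell * 8 + b] += mags[y][x]
    # Normalise to unit length to reduce the effect of illumination changes.
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


# Toy patch (hypothetical): unit magnitudes, all gradients at 0 radians.
patch_mags = [[1.0] * 16 for _ in range(16)]
patch_thetas = [[0.0] * 16 for _ in range(16)]
desc = build_descriptor(patch_mags, patch_thetas)
```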

Multiclass Support Vector Machines
Support Vector Machines (SVMs) are by nature two-class classifiers that require the entire data to be labelled. Real-life problems, however, often involve multiple classes. Multiclass SVMs handle such situations by forming multiple two-class classifiers based on the feature vectors derived from the input features and the class of the data. SVM is a supervised machine learning algorithm that finds application in fields such as text categorization, image classification, handwriting recognition, semantic parsing and pattern recognition. It is based on the concept of decision planes that define decision boundaries. First, the training data is mapped onto a high-dimensional space, in which a hyperplane separates one class of objects from the other (Burges, 1998). The SVM is trained iteratively in order to minimize the error function. Figures 4 and 5 show graphical representations of the two-class and multiclass SVM classifiers (Wu et al., 2018).
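The idea of building a multiclass decision from several two-class classifiers can be sketched as follows. The sketch assumes a one-versus-all scheme with linear decision functions; the weights and biases are hypothetical values chosen for illustration and are not trained on the data of this study.

```python
def decision(w, b, x):
    """Signed score of a linear two-class classifier: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_all(models, x):
    """models maps each class label to the (w, b) of its 'class vs. rest'
    classifier; the label with the highest score wins."""
    return max(models, key=lambda label: decision(*models[label], x))


# Hypothetical, untrained parameters for the three categories.
models = {
    "normal": ([1.0, 0.0], 0.0),
    "benign": ([0.0, 1.0], 0.0),
    "malign": ([-1.0, -1.0], 0.0),
}
print(predict_one_vs_all(models, [2.0, 1.0]))  # normal
```

Each two-class classifier only answers whether a sample belongs to its own class or to the rest, and the final label is taken from the classifier that is most confident.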

Proposed System for Prostate Cancer Detection
In this study, a new approach for prostate cancer detection is proposed. It is a hybrid of two existing, well-known algorithms, the Scale Invariant Feature Transform (SIFT) and the Support Vector Machine (SVM), with the aim of achieving high performance. To classify a test image, the proposed method follows three main steps. In the first step, a certain number of training samples are selected based on the Euclidean distance between the training samples and the test image. In the second step, the number of matched SIFT feature pairs is evaluated for each selected sample, and the sample with the highest number of matched pairs is retained. Finally, the similarity index between the training and test samples is evaluated, and the test image is classified according to the matching category.
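The matching steps can be illustrated with the following sketch, which counts matched descriptor pairs by Euclidean distance and assigns the label of the training sample with the most matches. It is a simplified illustration under assumptions, not the implementation of the proposed system: the distance threshold `max_dist`, the function names and the two-dimensional toy descriptors are hypothetical, and each training sample is assumed to have at least one descriptor.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def count_matches(test_desc, train_desc, max_dist=0.5):
    """Number of test descriptors whose nearest training descriptor
    lies within max_dist."""
    matches = 0
    for d in test_desc:
        nearest = min(euclidean(d, t) for t in train_desc)
        if nearest <= max_dist:
            matches += 1
    return matches


def classify_by_matches(test_desc, training_set, max_dist=0.5):
    """training_set is a list of (label, descriptors) pairs; the label of
    the sample with the highest number of matched pairs is returned."""
    best_label, _ = max(
        training_set,
        key=lambda item: count_matches(test_desc, item[1], max_dist),
    )
    return best_label


# Toy 2-D descriptors (hypothetical) standing in for 128-D SIFT vectors.
training_set = [
    ("normal", [[0.0, 0.0], [1.0, 1.0]]),
    ("benign", [[5.0, 5.0], [6.0, 6.0]]),
]
test_desc = [[0.1, 0.0], [1.0, 1.2]]
print(classify_by_matches(test_desc, training_set))  # normal
```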

Experimental Results
Classification was performed using the multiclass SVM classifier on features extracted with the Scale Invariant Feature Transform (SIFT) method. As shown in Table 1, the system was evaluated in terms of sensitivity, specificity and accuracy. Sensitivity is the True Positive (TP) rate, that is, the capability of the system to correctly identify those with the disorder, whereas specificity is the True Negative (TN) rate, that is, the capability of the test to correctly identify those without the disorder. The equations for calculating sensitivity, specificity and accuracy are given as:
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively. The performance was checked at various ratios of training and test data. The maximum performance was achieved when 40% of the data was used for training. The highest accuracy achieved through this approach is 99.9451%. Figure 6 shows the performance obtained by the SIFT-SVM approach at different training percentages. Figures 7-9 present the results of the proposed SIFT-SVM approach, which classifies the prostate images into the Normal, Benign and Malign categories, respectively.
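For reference, the three measures can be computed from the confusion counts as in the sketch below, using the usual definitions in terms of true/false positives and negatives. The counts in the example are hypothetical and are not results from this study.

```python
def evaluate(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = evaluate(tp=45, tn=50, fp=0, fn=5)
print(sens, spec, acc)  # 0.9 1.0 0.95
```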
The LOAD DATA option seen in the result window of Fig. 7 is used to load the data from the database. The system is initially trained to differentiate between the normal, benign and malign categories using a machine learning approach. Based on the knowledge gained during training, the system then categorizes new images automatically. When the LOAD DATA option is clicked, the features of the image are loaded along with the image itself. The features are extracted using a function called get feature (), in which all the images are first resized to ensure a uniform image size. Once the images are resized, only the scale invariant features are extracted by performing the Gaussian operations. Next, the CLASSIFY option in the result window classifies the selected image into one of the above-mentioned categories using the multiclass SVM classifier. The multiclass SVM classifier models a given training dataset with a corresponding group vector and classifies the given test data according to one-versus-all relations. Finally, the performance is calculated for the chosen training percentage.
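Varying the training percentage amounts to splitting the image set before training, as in the sketch below. It is an assumption that the split is random; the function name `split_dataset` and the fixed seed are hypothetical, and integer indices stand in for the 150 images.

```python
import random


def split_dataset(samples, train_fraction, seed=0):
    """Shuffle the samples reproducibly and split them into
    (training, test) lists according to train_fraction."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train_fraction)
    return items[:n_train], items[n_train:]


# 150 image indices split at the 40:60 ratio.
train_ids, test_ids = split_dataset(range(150), 0.4)
print(len(train_ids), len(test_ids))  # 60 90
```

Repeating the split with different values of `train_fraction` gives the series of training percentages against which performance can be reported.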

Conclusion
In this study, a 2-stage model has been proposed for the detection of prostate cancer. The features were extracted using the SIFT feature extraction strategy, which yields feature points that are invariant to affine transformations. A trained multiclass SVM classifier was then used to classify the input images into 3 categories: normal, benign and malign. The performance of this hybrid SIFT-SVM approach was measured in terms of accuracy, sensitivity and specificity, and the results were checked at varying ratios of training and test data. The highest performance was obtained at a 40:60 ratio of training to test data, with an accuracy of 99.9451%, a sensitivity of 99.9390% and a specificity of 100%.
This multiclass SIFT-SVM approach was used for the first time in prostate cancer detection and has shown an appreciable improvement in performance compared to existing works in this area. The hybrid model can be extended further by combining multiple statistical features, such as mean, variance, entropy and energy, with the SVM classifier.