Classification of Normal and Abnormal Lung CT-scan Images Using Cellular Learning Automata

This paper proposes a medical pattern recognition system based on Cellular Automata (CA). A CA, or cellular machine, is a dynamic mathematical model that consists of many similar, simple units governed by simple local rules. Each cell acts as a simple computing automaton, so complex computations can be implemented through uncomplicated methods. However, a CA needs rules that are determined for each specific use, which makes the model suitable only for certain systems. To overcome this problem, a method is needed through which favorable rules are extracted. The Cellular Learning Automata (CLA) model is obtained by extending the CA with a Learning Automaton (LA) appended to each cell. Many applications of CA are known today, especially in the field of pattern recognition. In this study, we therefore use the CLA to design an automatic system to diagnose images that contain cancerous tissue. After the required pre-processing is applied to lung Computed Tomography (CT) images, the images are classified by the CLA model and the proposed method is evaluated in terms of sensitivity, specificity and accuracy. The proposed system promises a flexible, low-complexity model. The method was tested on 22 CT scan slices from a real-world dataset and yielded satisfactory results: The model achieved an accuracy of 95.4% with a low error rate (0.09).


Introduction
Pattern recognition is the field of study that identifies an object in a signal or image, based on features obtained from a set of related data. In the pattern recognition approach, the information extracted from each image is compared against an explicit structural model of the desired object and recognition is achieved from the analysis of the images. The image or video is searched to find a model based on the extracted information. Recently, pattern recognition has become a topic of interest in the study of medical diagnosis and the technique has contributed significantly to the advancement of this field. Hence, several medical pattern recognition systems have been suggested so far.
The operation of pattern recognition is similar to classification and its final aim is to extract the patterns and classify them so that optimum results can be obtained. A pattern recognition system is built from basic steps including preprocessing, feature extraction and selection, as well as classifier design and optimization, as illustrated in Fig. 1. The pattern recognition (PR) system needs an efficient, precise and highly reliable classifier. Therefore, wisely selected methods can increase the detection and classification ability of the system. Assigning each instance to a class is called classification, which is a core problem of the PR approach (Jamshidnezhad and Nordin, 2012). The pattern recognition system gives a class label as an output; for example, "1" and "0" are the class labels in a quality control test of a product (Liu et al., 2006). An image includes quantitative spectral information, which is used by the image classifier to discriminate the images from each other. The interpretation of visual images is based on size, shape, pattern, texture, shadow, tone and association (Kahaki et al., 2017). Classification involves the steps illustrated in Fig. 2.
In general, digital image classification techniques can be divided into two main categories: Supervised and unsupervised classification. To perform image classification, the images need to be analyzed and suitable decision rules should be extracted. In the supervised classification technique, the classes are determined and representative samples of the pixels are identified by the computer. When little information is available about certain objects in the images, unsupervised classification techniques are used instead (Parashar and Harish Kundra, 2014). Most classifiers are supervised and need to be trained with labeled data. Hence, to design a supervised classifier, it is imperative to prepare the training data, which requires several steps: Enhancing the images, segmenting the region of interest, selecting the target part according to the extracted features, preparing it for training and, finally, training the classifier (Ahmed and Nordin, 2011).

Background and Related Works
Pattern Recognition (PR) is the use of computer algorithms to discover the regularities in a dataset. The regularities are then used to perform further actions, such as classifying the data into separate categories (Bishop, 2006). Basically, in the PR approach, a sample should be recognized according to measured criteria. Several PR techniques are widely used and each method has its relative benefits in various operational situations, e.g., Neural Networks (Raj et al., 2016), PCA (Segreto et al., 2014) and Fuzzy sets (Mitra and Pal, 2005).
To carry out PR in the medical field, special medical knowledge and experience are required. The type of required information is often not fixed, and some of it consists of relatively low-level features such as texture, shape and other pixel-based statistics that are extracted from the images and used for recognition (Homma, 2009). PR based on medical images is generally conducted by extracting the diseased tissue from the subject and then analyzing it. The step that follows is the detection of rules that are used to classify the images into the correct class, which is then used for diagnosis. Automatic systems that are designed for medical diagnosis through the integration of computers and medical science are called Computer Aided Diagnosis (CAD) systems.
Cellular Automata (CA), or the cellular machine, is a dynamic mathematical model whose concepts were initially introduced by von Neumann in 1966 (Wongthanavasu and Ponkaew, 2013). Cellular automata were also studied by Ulam, Conway (who introduced the Game of Life) and Wolfram (who studied and presented the Wolfram rules) (Sarkar, 2000; Shiraishi et al., 2011). The main purpose of this model is to design a self-reproductive system which is computationally complete (Rennard, 2000). The model consists of a lattice of cells where each cell has a set of states and works based on local rules. The CA model works in discrete time and the cell states are updated according to the local rules. At time t+1, the state of a cell is determined by the previous state of the cell and of its neighbors at time t (Abu Dalhoum and Al-Dhamari, 2010). Three types of cellular automata have been introduced so far: One-, two- and three-dimensional cellular automata (Hadavi et al., 2014; Liu et al., 2006). For this study, we used the two-dimensional (2D) CA. The 2D-CA model can be represented by a 4-tuple (A, S, N, F), where A is a set of cells, S is the set of possible states for the cells, N represents the neighborhood and F is a function that computes the new state of a cell (Fathy Navid and Bagheri Aghababa, 2013).
Although each cell of the CA model works using simple rules, the lattice of cells can perform complex computations through local interaction, which is one of the advantages of CA. A drawback is that the CA model needs a certain form of rules and these rules should be carefully determined (Rosin, 2006). As a solution to this issue, an approach which is able to automatically generate the rules is required. Making the CA intelligent by adding a learning capability is one such method. In CA, the state of a cell is defined by a rule that depends on the neighborhood of the cell.
The scope of activity of the cells is defined by the neighborhood. Therefore, the type and radius of the neighborhood can affect the simulation results. Figure 3 illustrates some of the most common neighborhoods for 2D-CA (Janssen, 2010).
To calculate the new state of a cell, it is necessary to know how each cell's neighbors can be referenced. For one-dimensional CA, this is simple, as the cell indexed by (i) has two neighbors, which are (i-1) and (i+1). In the 2D-CA, the cells do not have a single index; they are indexed by (x, y), according to the rows and columns (Shiffman, 2012). The neighbors of a cell (x, y) are shown in Fig. 4.
As previously mentioned, the next state (t+1) depends on the state (t) of the cell and its neighbors. For the Moore neighborhood, the update is given by Equation (2), in which the logical Exclusive-OR operation is represented by ⊕.
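The neighbor indexing and the XOR-based update described above can be sketched as follows. This is a minimal illustration and not the authors' rule: it assumes binary states, a Moore neighborhood of radius 1, a null boundary (cells outside the lattice count as 0) and a next state equal to the XOR of the cell with all of its neighbors. The function names `moore_neighbors` and `xor_step` are hypothetical.

```python
def moore_neighbors(x, y, rows, cols):
    """Return the in-bounds Moore (8-connected) neighbors of cell (x, y)."""
    neighbors = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows and 0 <= ny < cols:
                neighbors.append((nx, ny))
    return neighbors


def xor_step(grid):
    """One synchronous update: state(t+1) = XOR of the cell and its neighbors at t.

    Assumed rule for illustration only; out-of-lattice cells contribute 0.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            state = grid[x][y]
            for nx, ny in moore_neighbors(x, y, rows, cols):
                state ^= grid[nx][ny]
            new_grid[x][y] = state
    return new_grid
```

All new states are computed from the old lattice before any cell is overwritten, which mirrors the discrete-time, synchronous behavior of the CA model.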
The operation of Cellular Automata (CA) on medical images was surveyed by Wongthanavasu (2011). CA techniques appear as a natural tool for image processing. The CA model has several advantages: It has a local nature and its implementation is parallel and straightforward. The author reported that CA has produced favorable results in medical image processing. In that study, several uniform CA algorithms were applied to various mammography images.
The use of a group of classifiers can increase the classification accuracy in comparison with using only a single classifier. Usually, such groups are constructed using various methods in which specific aspects must be assigned by the user. To overcome this problem, a group of classifiers constructed on the basis of CA was proposed using a self-organizing system (Kokol et al., 2004). One purpose of their study was to create a combination of several classifiers in a group represented by the CA. They used the self-organizing ability of CA to combine the classifiers in an ensemble approach, with various classifiers assigned to the cells. The simple Classifier Cellular Automata (CCA) was compared with various decision tree approaches (greedy, genetic and boosting) and the results indicated that the CCA gives the best results on medical data in comparison with the other classifiers. Povalej et al. (2005) also used CA to solve a classification problem in another study by combining various classifiers. In this system, the successful information of the classifiers is stored in the corresponding cells. The energy of each cell is determined by the transaction rules. Based on these transaction rules, neighboring cells interact to carry out the learning process.
The CA model is a simple distributed system that takes advantage of a parallel architecture, where each cell is considered a simple computing machine. Abu Dalhoum and Al-Dhamari (2010) proposed a CA model for use in the neuroimaging field, particularly to classify functional Magnetic Resonance Imaging (fMRI) of the brain. In their study, two classifiers, the Support Vector Machine (SVM) and CA, were applied to the data and the results were compared. The results demonstrated that the CA classifier obtained better accuracy and sensitivity than the SVM classifier.
By using the classification approach that is based on the relation between attributes, some suitable rules are extracted to classify the data. As an important point in machine learning, it is worthy to note that the process of learning leads to the improvement of the model that enables it to learn the training problem efficiently and provide a solution.
Therefore, a new classification method based on cellular learning automata as a data classifier, involving three steps, was presented (Esmaeilpour et al., 2012). The first step uses a complete graph from which the patterns are extracted. In the next step of the CLA model, a strong pattern (the most repeated pattern) is selected. In the final step, the rate of compliance is calculated according to the value of the reward and penalty to test the model. After the suggested method was applied to some of the UCI Machine Learning datasets, it was compared with existing methods such as Naïve Bayes (NB), Decision tree (C4.5) and Multilayer Perceptron (MLP) in terms of training time and classification accuracy. In terms of training time, the CLA ranked second. The programming language and implementation techniques strongly affect the overall training time, while accuracy is determined by the suggested algorithm. The evaluation results showed that the suggested model performed favorably in both accuracy and training time. It is also simple to implement and has low computational complexity.
A computer-aided diagnosis system was proposed by Chen et al. (2008) to distinguish between cancerous cells and normal cells based on cellular automata and learning methods. In their study, the differences between cancerous and normal cells were detected by using pattern recognition techniques and cellular automata with evolutionary learning. The main aim of that study was to build a system that is able to detect features independently. The study conducted a computer analysis of microscopic images of cells and consisted of three steps. The first step was the recognition of the cell borders along with the segmentation. The next step involved extracting the features. The last step was the classification. For detection of the cell boundaries, image noise was first removed with Photo Impact (an image editing tool). After the pre-processing step, the images were saved as bitmaps and then converted to the ASCII format. According to the obtained patterns of cancerous cells, some Feature Detectors (FDs) were created. The generated FDs were compared with the obtained cell patterns: If the pattern of a cell matched any FD of the system, the cell was labeled as a cancerous cell; if no match was found, it was labeled as a normal cell. An FD is considered favorable when it matches at least one cancerous pattern and does not match any normal cell. This allows the CA model to improve the performance of lesser-performing FDs. The results indicated that the suggested system had a high ability to distinguish between normal and cancerous cells.
The standard CLA model can accept only the reinforcement signal as input and is not able to receive another input such as the state of the environment. To address this drawback, Ahangaran et al. (2017) proposed a new model of CLA in which an extra input is fed to each cell in addition to the reinforcement signal. This extra input is the information from the environment. It can improve the computational power and flexibility of the model. They evaluated the proposed model on classification, clustering and image segmentation. Their model provided acceptable results on various data. In the classification task, the CLA model ranked second after SVM, with an average accuracy of 84%.

Materials and Methods
Automatic lung cancer detection is treated as a pattern recognition problem in which the final step of the data processing is data classification. The data are lung CT scan images of humans. As mentioned earlier, a PR system involves three basic steps: Preprocessing, feature selection and extraction, and classifier design. Since the current study investigates a PR problem, the structure of the proposed system is based on three main parts: Pre-processing; feature extraction and cancerous nodule detection; and training and recognition. The overview of the proposed model is illustrated in Fig. 5.

Data
This study was carried out on images from the Parsian medical imaging center lung CT database. Two sets of lung CT scan images were needed: One was used as the training set and the other for validation. The database contained 73 CT images in the .dcm format, of which 70% (51 images) were used for training and 30% (22 images) were used for validation.

Pre-Processing
The pre-processing step produces a compact representation of the pattern. In the current study, the pre-processing step was divided into two parts: Image enhancement and image segmentation. Most of the images include undesirable details such as blurring and noise, which should be removed, so image enhancement methods were applied to the images to enhance the important features. Compared with other enhancement methods, the Gabor filter had a good effect on the images. The Gabor filter is a band-pass filter with both orientation-selective and frequency-selective properties and has optimal joint resolution in both the frequency and spatial domains (Wang et al., 2005). Hence, the Gabor filter was appropriate for removing noise and enhancing the medical images. A Gabor function is defined by the orientation θ of the filter (θ ∈ [0, π]), the variances σx and σy of the Gaussian envelope along the x and y axes, respectively, and the radian frequency of the sinusoid (Wang and Sun, 2010).
In the next part of the pre-processing step, segmentation was performed to focus on the Region of Interest (ROI). The main goal of segmentation is to change the representation of the image into one that is easier to analyze (Sharma and Jindal, 2011). Thresholding is one of the best-known segmentation methods; it is quick, easy and has low memory requirements. As its name implies, this method works based on one or several assigned thresholds.
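For reference, a commonly used even-symmetric form of the Gabor function is given below. It is a sketch consistent with the quantities named above, not necessarily the exact expression used in this study; the symbols ω (radian frequency of the sinusoid), g (Gaussian envelope) and the rotated coordinates x_θ, y_θ are assumptions introduced here.

```latex
h(x, y; \theta, \omega) = g(x_\theta, y_\theta)\,\cos(\omega\, x_\theta),
\qquad
g(x, y) = \exp\!\left\{ -\frac{1}{2}\left[ \frac{x^{2}}{\sigma_x^{2}} + \frac{y^{2}}{\sigma_y^{2}} \right] \right\},
\qquad
\begin{aligned}
x_\theta &= x\cos\theta + y\sin\theta,\\
y_\theta &= -x\sin\theta + y\cos\theta.
\end{aligned}
```

In this form, θ rotates the coordinate frame so that the sinusoid oscillates along the chosen orientation, while σx and σy control how quickly the response decays away from the filter center.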
Another notable method is region growing. The region growing method is conceptually simple, several criteria can be selected at the same time, the border obtained by this method is very accurate and thin, and it is able to distinguish regions with similar attributes (Kamdi and Krishna, 2012; Saliba and Dipanda, 2013). The region growing segmentation approach is based on a threshold: It starts from a chosen seed point and checks all the neighboring pixels for similarity with the seed point; the region keeps growing as long as the similarity of the neighbors is sufficiently high. If the difference of a newly discovered region is more than the threshold (maximum intensity distance = 0.92), the process is stopped. The basic formulation of this method is:
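The seed-based growth described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the formulation used in the study: similarity is measured as the absolute intensity difference to the seed pixel, neighbors are 4-connected and the image is a plain list of rows. The function name `region_grow` is hypothetical.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a region from `seed`, adding 4-connected pixels whose intensity
    differs from the seed intensity by at most `threshold` (assumed criterion)."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < rows and 0 <= ny < cols):
                continue  # outside the image
            if (nx, ny) in region:
                continue  # already part of the region
            if abs(image[nx][ny] - seed_value) <= threshold:
                region.add((nx, ny))
                queue.append((nx, ny))
    return region
```

Growth stops on its own once no remaining neighbor satisfies the similarity criterion, which corresponds to the stopping condition on the intensity distance.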

Features Extraction
Many possible features have been investigated and used as characteristics. Pattern recognition features fall into two major categories: Structural and statistical features. Several appearance-based statistical features have been suggested by researchers, but they require prior modeling assumptions to overcome inaccuracy. This is the main disadvantage of statistical pattern recognition approaches (Gaikwad et al., 2010). Structural pattern recognition is more flexible than statistical pattern recognition (Spillmann et al., 2006). Structural pattern recognition has been used successfully in various applications such as digit recognition (Tuba et al., 2016), shape classification (Chen et al., 2009; Ramesh et al., 2015) and bioinformatics (Marchiori, 2013). Most feature extraction methods are supervised; this means that they need prior knowledge about the pattern and predefined training samples. Feature selection is the process of selecting the best subset of the input space in a way that yields optimum results (Kahaki et al., 2016). After the optimum subset is selected, a classifier should be designed.
Lung cancer diagnosis is one of the many applications of the PR approach. As previously mentioned, PR works as a classification process. The goal of PR applications is to extract patterns optimally and to separate the classes. For this purpose, accurate and principal features need to be extracted. There are several methods for the feature extraction and classification stage that help researchers achieve better results. In this phase, a frequency domain filter is applied to the images based on texture features. The image consists of three main layers (lower or background, upper or surface, and middle or content). Using a high-pass filter, a low-pass filter and the concurrent application of both, and subtracting the resulting images from the original images, a band-pass filter is designed specifically for the detection of a cancer nodule. By applying this band-pass filter in the frequency domain, the cancer nodule is detected and then imported into a cellular lattice as a pattern, instead of some statistical features (size, contrast). The ideal band-pass filter is:

$$H(u,v) = \begin{cases} 1, & D_L \le D(u,v) \le D_H \\ 0, & \text{otherwise} \end{cases}$$

where D(u,v) represents the distance between the frequency center and the point (u,v), and D_L and D_H are the low and high frequency bounds, respectively (D_L = 60, D_H = 120).
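An ideal band-pass filter of this kind can be built as a binary mask over a centered frequency spectrum. The sketch below assumes that the zero frequency sits at index (rows // 2, cols // 2), as it would after shifting a discrete Fourier transform, and uses the bounds stated in the text (60 and 120) as defaults. The function name `ideal_bandpass_mask` is hypothetical.

```python
import math


def ideal_bandpass_mask(rows, cols, d_low=60.0, d_high=120.0):
    """Return a rows x cols mask that is 1 where d_low <= D(u, v) <= d_high.

    D(u, v) is the Euclidean distance from (u, v) to the assumed frequency
    center (rows // 2, cols // 2).
    """
    center_u, center_v = rows // 2, cols // 2
    mask = []
    for u in range(rows):
        row = []
        for v in range(cols):
            distance = math.hypot(u - center_u, v - center_v)
            row.append(1 if d_low <= distance <= d_high else 0)
        mask.append(row)
    return mask


# Mask matching the 512 x 512 images used in the study.
mask = ideal_bandpass_mask(512, 512)
```

Multiplying the centered spectrum of an image element-wise by this mask and transforming back keeps only the ring of frequencies between the two bounds.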

Training and Diagnosis
After the images passed through all of the phases above, the cancer nodules were identified. As stated earlier, we intended to use the images obtained from the prior phases as patterns for training the classifier. Therefore, to generate a pattern of cancer nodules, the obtained images were converted into a binary format. To train the CA, we used the CA model improved by learning automata, known as Cellular Learning Automata (CLA). To solve the training problem of the CA model, a new approach was proposed which involves allocating a learning automaton to each cell. This approach turned the CA into a powerful and effective model for many decentralized problems (Esmaeilpour et al., 2012). Adjusting the state transition probabilities of the CA is the basic concept behind the cellular learning automata model. The value of cell si at time t is denoted by v t (si). N is the cell neighborhood set; R describes the supervisor's rules, which control the assignment of rewards and penalties to the cells. Cellular learning is denoted by L, and the model needs a function, denoted by C, for dividing the reward among the cells of the lattice. With si denoting the action selected in step k, the action probability vector of each cell is updated after the reward or penalty has been assigned, where b ∈ (0, 1) and p(k) is the probability of the cell's action.
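The probability update of a learning automaton can be illustrated with the standard linear reward-penalty scheme. This is an assumption made for illustration; the text does not confirm which update scheme the study uses. Here `a` is an assumed reward parameter, `b` plays the role of the penalty parameter b ∈ (0, 1), and the function name `update_probabilities` is hypothetical.

```python
def update_probabilities(p, chosen, rewarded, a=0.1, b=0.1):
    """Linear reward-penalty update of an action probability vector.

    p        : list of action probabilities that sums to 1
    chosen   : index of the action selected in the current step
    rewarded : True if the environment rewarded the action, False if it penalized it
    """
    r = len(p)
    updated = []
    for j, pj in enumerate(p):
        if rewarded:
            # Reward: move probability mass toward the chosen action.
            updated.append(pj + a * (1.0 - pj) if j == chosen else (1.0 - a) * pj)
        else:
            # Penalty: move probability mass away from the chosen action.
            if j == chosen:
                updated.append((1.0 - b) * pj)
            else:
                updated.append(b / (r - 1) + (1.0 - b) * pj)
    return updated
```

Both branches keep the vector normalized, so repeated updates always leave a valid probability distribution over the actions.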
In Fig. 6, the pseudo code for CLA training algorithm is illustrated. The Cellular Automata model is selected for the classification stage in this study. By utilizing the training set, some patterns have been identified and the related rules have been generated. Therefore, according to the generated rules, the pixels of the image are compared in terms of compliance with a pattern. Hence, if the discovered region matches the rules, the system diagnoses the image as an image of lungs with cancer. Otherwise, the image is classified in the non-cancer lungs class.
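The reward and penalty decision of the training algorithm can be sketched as follows. The interpretation is an assumption: the "neighbor rate" of a cell is taken here to be the fraction of foreground (1) cells inside a square neighborhood of a given radius in the binary image, and a cell is rewarded when that fraction exceeds an assigned rate. The names `neighbor_rate` and `reinforcement_signal` are hypothetical.

```python
def neighbor_rate(grid, x, y, radius):
    """Fraction of in-bounds neighbors of (x, y) within `radius` that equal 1.

    Assumed definition for illustration; the cell itself is excluded.
    """
    rows, cols = len(grid), len(grid[0])
    total = 0
    ones = 0
    for nx in range(max(0, x - radius), min(rows, x + radius + 1)):
        for ny in range(max(0, y - radius), min(cols, y + radius + 1)):
            if (nx, ny) != (x, y):
                total += 1
                ones += grid[nx][ny]
    return ones / total if total else 0.0


def reinforcement_signal(grid, x, y, radius, assigned_rate):
    """Return 'reward' if the neighbor rate exceeds the assigned rate, else 'penalty'."""
    if neighbor_rate(grid, x, y, radius) > assigned_rate:
        return "reward"
    return "penalty"
```

In a full training loop, the signal returned for each cell would drive the update of that cell's action probabilities, and patterns that keep being rewarded would be stored as rules.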

Research Methodology
To evaluate the proposed approach, we used a lung CT scan image dataset obtained from the Parsian medical imaging center. A set of 22 images was used for testing the model, consisting of 11 lung cancer images and 11 non-cancer images. The initial images, obtained by a digital CT scanner, were saved in the Digital Imaging and Communications in Medicine (DICOM) format with the .dcm file extension. Because DICOM images include many details, they take up considerable space; therefore, for this investigation, the DICOM images were converted to the JPEG format. Each image has a size of approximately 513 KB, with dimensions of 512×512 pixels. Figure 7 illustrates all the steps of the suggested method.
According to this figure, the image was first enhanced using the Gabor filter. In the next step, we applied the threshold-based region growing algorithm to segment the lung area from the bones. Then the image was passed through a band-pass filter. To obtain the difference, the image obtained from the band-pass filter was subtracted from the enhanced image. In the next stage, the image was converted to a binary image. The binary image was fed to the CLA model as input to locate the effective rules that describe cancerous tissue. Using these rules, which were stored in the database, the test images were classified as either cancerous or non-cancerous.
Fig. 6: CLA training algorithm (pseudocode). In the surviving fragment, a pattern is rewarded if its neighbor rate exceeds the assigned rate and penalized otherwise; the found pattern is then converted to a rule and the action probability set (P) is updated.
As previously mentioned, various paradigms have been proposed and used for medical image processing. The Convolutional Neural Network (CNN) is one of the well-known models and has been employed for many tasks. Anthimopoulos et al. (2016) designed a CNN-based system to classify lung images. This model provides promising results on 120 CT scan images, but it is a time-consuming procedure and the training process becomes slower because of the large number of parameters. This paradigm also needs a large number of training images. The Support Vector Machine (SVM) is another popular method, used by Kaucha et al. (2017) to classify lung CT scan images. The model is designed based on a combination of image processing and data mining. The experimental results of that study show that the system is able to classify the images with 95.16% accuracy. This model was applied and evaluated on the LIDC dataset, which contains 1018 cases. However, the authors stated that this model demanded more data and required considerable computation.

Results and Discussion
The performance of the system is measured by sensitivity, specificity and accuracy:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{11}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{12}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$

In Equations (11), (12) and (13), TP is the number of True Positives, TN the number of True Negatives, FP the number of False Positives and FN the number of False Negatives. The achieved values are shown in Table 1 and the evaluation is performed to illustrate the performance of the suggested system.
According to Table 1, the suggested system correctly diagnosed 10 of the 11 cancer images and all 11 non-cancer images, so the sensitivity of the system is 90.9%. The specificity is 100%, since no non-cancer image was misclassified; the single misdiagnosis of the proposed algorithm is one cancer image classified as non-cancer. The accuracy of the system is 95.4%, which indicates a suitable level of reliability.
The receiver operating characteristic (ROC) curve is a graphical plot that is widely used to compare diagnostic tests. The diagnostic performance of a test, that is, the capability of a diagnosis system to distinguish diseased cases from normal cases, is evaluated by means of the ROC curve. Each point on the ROC curve represents the relation between sensitivity (TP/(TP+FN)) and specificity (TN/(TN+FP)) at a particular decision threshold. Figure 8 shows the sensitivity of the system based on the ROC curve, where N is the radius of the neighborhood. The greater the area under the curve, the higher the performance of the system.
From Fig. 8, it is clearly observed that, with a low neighborhood radius (N), the proposed algorithm fails to achieve favorable results. When the neighborhood radius reaches a range between 58 and 60 cells, the algorithm obtains suitable results. Beyond this range, an increasing neighborhood radius only increases the computation cost, whereas the results do not improve. A neighborhood radius of approximately 60 cells gives the best result, which corresponds to a sensitivity value of 0.909. If the radius exceeds this value, the sensitivity of the system does not increase any further. Hence, choosing a favorable neighborhood radius is one of the more critical factors affecting the performance of the diagnosis system. As shown in Fig. 9, with each training epoch, the classification capability of the system increases and the probability of error is reduced.
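The reported figures can be checked from the counts stated above. The counts used here (10 true positives, 1 false negative, 11 true negatives, 0 false positives) follow from the 11 cancer and 11 non-cancer test images and the diagnoses described in the text; the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Counts derived from the text: 10 of 11 cancer images and 11 of 11
# non-cancer images were diagnosed correctly.
sensitivity, specificity, accuracy = diagnostic_metrics(tp=10, tn=11, fp=0, fn=1)
print(f"{sensitivity:.3f} {specificity:.3f} {accuracy:.3f}")  # prints "0.909 1.000 0.955"
```

The accuracy of 21/22 is approximately 0.9545, which the text reports as 95.4%.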

Recommendation
The results show that the error rate decreased over the training time, which suggests that, given appropriate and sufficient training data, efficient rules can be found and the system becomes more robust. At a radius of 60 neighbor cells, the system provides its best results in terms of accuracy. Hence, choosing an appropriate neighborhood radius is one of the more important factors affecting the performance of the diagnosis system. Because it uses a low-complexity method, the resulting system does not need complex computing operations or prohibitively expensive hardware.

Limitation
In this study, the proposed model was applied to lung CT images with a resolution of 512×512 pixels. This size can vary for images captured by different devices and can affect the results. In this procedure, we used only gray-level images. A neighborhood radius of approximately 60 cells provides the best result in this work, but it can be different for other tasks.

Conclusion
From the recent image processing techniques that have been used for processing lung CT scan images, we selected a few methods with better performance to apply to the target database. In previously suggested CAD systems, various classifiers were used, such as neural networks, genetic algorithms (Daliri, 2012) and decision trees (Tartar et al., 2013), which were applied to different databases using diverse image pre-processing techniques in each of the enhancement, segmentation and feature extraction steps. The number of images in each database and the percentage of images selected for training and testing also differ. Since each of these conditions can affect the results, a comparison of the systems under different conditions is not entirely fair. Regarding the achieved accuracy, the proposed system falls in a relatively good category in terms of performance.
In this study, a medical pattern recognition system was designed based on a proposed algorithm using cellular learning automata. The suggested model works based on the communication between its components. Through these relations, the rules required for classification are extracted. With the generated rules, the classifier is able to classify images into the correct classes. As previously mentioned, the CLA model communicates with the environment. Therefore, the selected and performed actions receive a response from the environment. Through this procedure, the CLA is able to learn from an unknown environment and then make better decisions in subsequent steps. The result of the system evaluation indicates that the proposed model is able to obtain a favorable accuracy value. The implemented system is simple, its complexity and computation costs are low and, if training is done properly, the model is able to decrease the error rate, which improves the system's reliability.

Future Works
In this study, we aimed to investigate some image processing techniques and used pattern recognition methods to design an assistant system to distinguish normal from abnormal lung CT scan images. With regard to the goal of this paper, which is to create a roadmap for the study of CAD systems, their demands were considered. Previous works were investigated to find a possible solution. As a result, some techniques for solving the problems were suggested and evaluated on the dataset and the results and findings were discussed. As future work, the system accuracy could be improved by applying a detection technique based on a combination of the appearance and statistical features of the images. Apart from that, only gray-level images were used in the experiment to achieve the recognition results, so additional experiments involving color images would be worthwhile to investigate.
Figure: False negative rate versus number of training images.