Swarm Negative Selection Algorithm for Electroencephalogram Signals Classification

Problem statement: Diagnosing epilepsy from EEG signals by a human scorer is a very time-consuming and costly task, considering the large number of epileptic patients admitted to hospitals and the large amount of data that needs to be scored. Therefore, there is a strong need to automate this process. Such automated systems must rely on robust and effective algorithms for detection and prediction. Approach: The proposed system for detecting epileptic seizures in EEG signals is based on the Discrete Wavelet Transform (DWT) and the Swarm Negative Selection (SNS) algorithm. DWT was used to analyze EEG signals at different frequency bands, and statistics over the set of wavelet coefficients were calculated to form the feature vector for the SNS classifier. The SNS classification model uses negative selection and Particle Swarm Optimization (PSO) algorithms to form a set of memory Artificial Lymphocytes (ALCs) that can distinguish between normal and epileptic EEG patterns. Adapted negative selection is employed to create a set of self-tolerant ALCs, whereas PSO is used to evolve these ALCs away from self patterns towards non-self space and to maintain diversity and generality among the ALCs. Results: The experimental results showed that the proposed method performs very well in classifying EEG signals. A comparison with many previous studies showed that the presented algorithm yields results that match or exceed those reported by earlier methods. Conclusion: The technique was shown to be robust and effective in detecting and localizing epileptic seizures in EEG recordings. Hence, the proposed system can help clinicians make faster and more accurate diagnostic decisions.


INTRODUCTION
Brain activity can be measured in a variety of ways, such as the Magnetoencephalogram (MEG) and optical imaging. However, the most popular one is the Electroencephalogram (EEG) [12]. The EEG has therefore long been an important clinical tool for diagnosing, monitoring and managing neurological disorders, especially those related to epilepsy. EEG signals contain a wide range of frequency components. However, the range of clinical and physiological interest is between 0.5 and 30 Hz. This range is divided approximately into a number of frequency bands as follows: δ (0.5-4 Hz), θ (4-8 Hz), α (8-13 Hz), β (13-30 Hz) [13]. Since there is no definite criterion established by the experts, visual analysis of EEG signals in the time domain may be insufficient [1]. In addition, EEG monitoring systems generate large amounts of data, which makes complete visual analysis impractical in routine use.
In the case of epilepsy, when diagnosed properly, about 75% of cases can be effectively treated with current therapies: medication or surgical treatment. Unfortunately, in the case of surgical treatment, the patients undergo long presurgical evaluations. Large volumes of multi-channel EEG recordings are acquired during this period to decide which areas of the brain are to be removed. The visual scoring of these EEG records by a human scorer is clearly a very time-consuming and costly task [14]. The other 25% of individuals with epilepsy have seizures that are uncontrollable. The most promising therapies for medically resistant epilepsy are implantable devices that deliver local therapy, such as direct electrical stimulation or chemical infusions, to affected regions of the brain [15]. For effective performance, these treatments will rely on robust algorithms for seizure detection and prediction.
Hence, automated systems to recognize EEG changes have been under study for several years. Most of these detection systems use approaches from the area of Artificial Intelligence (AI). Such systems basically use either of two different input representations: The raw EEG signal or extracted EEG features. In the former case, the raw EEG signal is presented to the classifier after proper scaling and windowing. In the latter case, extracted features such as Wavelet Transform (WT) coefficients are presented to a classification model for training and testing purposes.
A wide range of AI techniques [1-11] have been proposed in the literature to solve the problem of seizure detection in EEG signals. Alkan et al. [1] used EEG power spectra extracted by Multiple Signal Classification (MUSIC), Autoregressive (AR) and periodogram methods as inputs to Logistic Regression (LR) and Back Propagation Neural Network (BPNN) classifiers. Their experiments showed that the BPNN was more accurate than LR. Acir and Guzelis [2] introduced an epileptic seizure detection method based on the Support Vector Machine (SVM). The raw data were fed to the SVM after being filtered with an AR-based modified nonlinear digital filter. In an extension of this work by Acir [3], two discrete perceptrons were used to filter the data for a modified Radial Basis Function Network (RBFN) classifier. Subasi [4] decomposed EEG signals using the WT into frequency sub-bands, which were then used as input to a feedforward error backpropagation ANN (FEBANN) and a Dynamic Wavelet Neural Network (DWN). The experiments showed that the DWN was more accurate than the FEBANN. In [5], Subasi showed that a Dynamic Fuzzy Neural Network (DFNN) classifier performed better than a neural network model. The Mixture of Experts (ME) neural network was implemented by Subasi [6] for classification of EEG signals using features extracted by the WT. A hybrid system with two stages, feature extraction using the Fast Fourier Transform (FFT) and decision making using a decision tree, was developed by Polat and Guzelis [7]. Guler and Ubeyli [8] also applied a two-stage system for classification of EEG signals: Feature extraction using the WT and signal classification based on an adaptive neuro-fuzzy inference system (ANFIS) model. Guler et al. [9] evaluated the diagnostic accuracy of Recurrent Neural Networks (RNNs) on EEG signals using Lyapunov exponents as features.
The Lyapunov exponents were computed using a technique related to Jacobi-based algorithms. Ubeyli [10] used eigenvector methods for feature extraction and a multiclass SVM for the classification decision. However, the algorithms of the Artificial Immune System (AIS) have not been widely explored in the field of EEG-based diagnosis, and only very few studies in the literature have applied AIS to epileptic seizure detection. Polat and Guzelis [11] used the Artificial Immune Recognition System (AIRS) with fuzzy resource allocation for EEG classification in a hybrid system with three stages: Feature extraction using the Welch (FFT) method, dimensionality reduction using PCA and EEG classification using AIRS. This study therefore introduces an artificial immune system approach for epileptic seizure detection based on the negative selection algorithm (NSA) and Particle Swarm Optimization (PSO), named the Swarm Negative Selection (SNS) algorithm.

MATERIALS AND METHODS
In this study, the epileptic seizure detection in EEG signals was performed in two stages: Feature extraction using the discrete wavelet transform and classification using the swarm negative selection algorithm.
EEG data: Our study used the publicly available dataset described in Andrzejak et al. [16]. In this dataset, all EEG signals were recorded with the same 128-channel amplifier system, using an average common reference. The data were digitized at 173.61 samples per second using 12 bit resolution. Band-pass filter settings were 0.53-40 Hz (12 dB/oct). The complete dataset consists of five sets (denoted A-E), each containing 100 single-channel EEG segments of 23.6 sec duration. These segments were selected and cut out from continuous multi-channel EEG recordings after visual inspection for artifacts, e.g., due to muscle activity or eye movements. Sets A and B were taken from surface EEG recordings that were carried out on five healthy volunteers in an awake state with eyes open and closed respectively, using a standardized electrode placement scheme. Sets C, D and E originated from an EEG archive of presurgical diagnosis. EEGs from five patients were selected, all of whom had achieved complete seizure control after resection of one of the hippocampal formations, which was therefore correctly diagnosed to be the epileptogenic zone. Segments in set D were recorded from within the epileptogenic zone and those in set C from the hippocampal formation of the opposite hemisphere of the brain. While sets C and D contained only activity measured during seizure-free intervals, set E contained only seizure activity. Fig. 1 shows typical EEG segments, one from each category. In this study, two sets of the complete dataset (A and E) were used.

Artificial Immune Systems (AIS):
In the 1990s, AIS emerged as a new computational research field inspired by the simulated biological behavior of the Natural Immune System (NIS). The NIS is a very complex biological network with rapid and effective mechanisms for defending the body against a specific foreign or pathogenic material called an antigen [17].
During these reactions, the adaptive immune system memorizes the characteristics of the encountered antigen by producing plasma or memory cells. The obtained memory promotes a rapid response of the adaptive immune system to future exposure to the same antigen [18]. In order to respond only to antigens, the immune system distinguishes between what is normal (self) and what is foreign (non-self or antigen) in the body. The NIS is made up of lymphocytes, which are white blood cells that circulate throughout the body and are mainly of two types, namely B-cells and T-cells. These cells play the main role in the process of recognizing and destroying antigens [19].

Fig. 1: Samples of five different sets of EEG data
Both T-cells and B-cells are created in the bone marrow and have receptor molecules on their surfaces (the B-cell receptor molecule is also called an antibody). The way B-cells and T-cells identify a specific antigen is called a key and keyhole relationship, as shown in Fig. 2 [17]. In this case, the antigen and the receptor molecule have complementary shapes, so they can bind together with a certain binding strength, measured as affinity. After an antibody's paratope binds to an antigen's epitope, an antigen-antibody complex is formed, which results in de-activation of the antigen. The B-cell is already mature after creation in the bone marrow, whereas the T-cell first becomes mature in the thymus. A T-cell becomes mature if and only if it does not have receptors that bind with molecules that represent self cells. Consequently, it is very important that the T-cell can differentiate between self and non-self cells [20].
AIS as defined by de Castro and Timmis [21] are: "Adaptive systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to problem solving". AIS are one of many types of algorithms inspired by biological systems, alongside neural networks, evolutionary algorithms and swarm intelligence. There are many different types of algorithms within AIS, and research to date has focused primarily on the theories of immune networks, clonal selection and negative selection. These theories have been abstracted into various algorithms and applied to a wide variety of application areas such as anomaly detection, pattern recognition, learning and robotics [22].
The negative selection algorithm, introduced in 1994 by Forrest et al. [23], is inspired by the mature T-cells of the natural immune system, which are self-tolerant; that is, mature T-cells have the ability to distinguish between self cells and foreign/non-self cells. This technique is used to train a set of Artificial Lymphocytes (ALCs) on a set of self patterns to be self-tolerant, and these ALCs are then applied as detectors to classify new data as self or non-self [21].

Fig. 2: Antibody-antigen complex
In negative selection, a generated ALC is added to the self-tolerant set of ALCs only if the calculated affinity between the ALC and every self pattern is lower than the affinity threshold. The algorithm is summarized in Algorithm 1.

Algorithm 1: Negative selection algorithm:
Create an empty set of self-tolerant ALCs as C;
Determine the training set of self patterns as S;
Repeat
    Randomly generate an ALC, x_i;
    Calculate the affinity between x_i and each pattern in S;
    If the calculated affinity with at least one pattern in S is higher than the affinity threshold, then reject x_i; otherwise add x_i to set C;
Until the size of C equals the predefined number;
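The steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the study: ALCs and self patterns are taken to be real-valued vectors in the unit hypercube, affinity is measured by Euclidean distance (so a high affinity corresponds to a small distance), and the names `negative_selection`, `threshold` and `max_tries` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_selection(self_patterns, n_detectors, threshold, dim, rng,
                       max_tries=100_000):
    """Return n_detectors random ALCs that match no self pattern.

    Assumption: an ALC matches a self pattern when their Euclidean
    distance is below `threshold` (small distance = high affinity).
    """
    detectors = []
    tries = 0
    while len(detectors) < n_detectors:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not generate enough self-tolerant ALCs")
        # Randomly generate a candidate ALC in [0, 1)^dim.
        candidate = [rng.random() for _ in range(dim)]
        # Keep the candidate only if it matches no pattern in the self set.
        if all(euclidean(candidate, s) >= threshold for s in self_patterns):
            detectors.append(candidate)
    return detectors


rng = random.Random(0)
self_set = [[0.5, 0.5], [0.55, 0.45]]
alcs = negative_selection(self_set, n_detectors=10, threshold=0.2, dim=2, rng=rng)
```

The `max_tries` guard is an addition for safety: if the self patterns cover most of the space, random generation may never produce enough self-tolerant ALCs.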

Particle Swarm Optimization (PSO):
The PSO algorithm was originally designed by Kennedy and Eberhart [25] in 1995; the idea was inspired by the social behavior of flocking organisms. The algorithm belongs to the broad class of stochastic optimization algorithms that may be used to find optimal (or near optimal) solutions to numerical and qualitative problems. PSO uses a population (swarm) of individuals (particles) to probe promising regions of the search space. Each particle moves in the search space with a velocity that is dynamically adjusted according to its own flying experience and its companions' flying experience, and it retains in memory the best position it has ever encountered. The best position ever encountered by all particles of the swarm is also communicated to all particles. Depending on the topology, in the local variant, each particle can be assigned to a neighborhood consisting of a predefined number of particles [26].
The popular form of the PSO algorithm is defined as:

$$v_{id} = w\,v_{id} + c_1 r_1 (p_{id} - x_{id}) + c_2 r_2 (p_{gd} - x_{id}) \qquad (1)$$

$$x_{id} = x_{id} + v_{id} \qquad (2)$$

Where:
$v_{id}$ = The velocity of particle i along dimension d
$x_{id}$ = The position of particle i in dimension d
$c_1$ = A weight applied to the cognitive learning portion
$c_2$ = A similar weight applied to the influence of the social learning portion
$r_1$ and $r_2$ = Separately generated random numbers in the range of zero and one
$p_{id}$ = The previous best location of particle i, also known as pbest
$p_{gd}$ = The best location found by the entire population, also known as gbest
$w$ = The inertia weight

Velocity values must be within a range defined by two parameters, $-v_{max}$ and $v_{max}$. PSO with an inertia weight in the range (0.9, 1.2) has, on average, better performance [27]. To get a better balance between global exploration and local exploitation, researchers recommended decreasing w linearly over time from a maximal value $w_{max}$ to a minimal value $w_{min}$ [27,28]:

$$w = w_{max} - \frac{w_{max} - w_{min}}{t_{max}}\,t \qquad (3)$$

where $t_{max}$ is the maximum number of iterations allowed and t is the current iteration number.
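To make the velocity and position updates and the linearly decreasing inertia weight concrete, the following is a minimal global-best PSO sketch that minimizes a simple test function. It is not the configuration used in the study: the parameter values (`c1 = c2 = 2.0`, `w_max = 0.9`, `w_min = 0.4`, `v_max = 1.0`), the search bounds and the function names are illustrative assumptions.

```python
import random


def inertia(t, t_max, w_max=0.9, w_min=0.4):
    """Inertia weight decreasing linearly from w_max (t = 0) to w_min (t = t_max)."""
    return w_max - (w_max - w_min) * t / t_max


def pso_minimize(f, dim, n_particles=20, t_max=100, c1=2.0, c2=2.0,
                 v_max=1.0, bounds=(-5.0, 5.0), seed=0):
    """Global-best PSO; returns (gbest position, gbest value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(t_max):
        w = inertia(t, t_max)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive + social terms.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Clamp the velocity to [-v_max, v_max].
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Position update.
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


best, best_val = pso_minimize(sphere, dim=2)
```

In SNS the quantity being optimized is an ALC fitness to be maximized rather than a test function to be minimized; the update rules are the same and only the comparison direction changes.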

Discrete wavelet transform-feature extraction:
The Discrete Wavelet Transform (DWT) has been particularly successful in the area of epileptic seizure detection due to its capability to capture transient features and localize them accurately in both the time and frequency domains [5]. DWT analyzes the signal s(n) at different frequency bands by decomposing the signal into approximation and detail information using two sets of functions called scaling functions and wavelet functions, which are associated with low-pass g(n) and high-pass h(n) filters, respectively. Fig. 3 shows the decomposition process of DWT. When the WT is used to analyze signals, two important aspects should be considered. The first is the number of decomposition levels, which is selected based on the dominant frequency components of the signal. According to Subasi [4], the levels are selected such that those parts of the signal that correlate well with the frequencies required for the signal classification are retained in the wavelet coefficients. Therefore, in the present study, we chose a level 4 wavelet decomposition. Thus the EEG signals used in this research were decomposed into the details D1-D4 and one final approximation, A4. Table 1 shows the ranges of the various frequency bands of our EEG data. The second aspect is the type of wavelet. According to Guler and Ubeyli [8], the smoothing feature of the Daubechies wavelet of order 2 (db2) made it more suitable for detecting changes in EEG signals. Hence in our research, we used db2 to compute the wavelet coefficients of the EEG signals. The computed discrete wavelet coefficients provide a compact representation that shows the energy distribution of the signal in time and frequency. In order to further decrease the dimensionality of the extracted feature vector, statistics over the set of the wavelet coefficients are used [8].
The following statistical features were used to represent the time-frequency distribution of the EEG signals:
• Maximum of the wavelet coefficients in each sub-band
• Minimum of the wavelet coefficients in each sub-band
• Mean of the wavelet coefficients in each sub-band
• Standard deviation of the wavelet coefficients in each sub-band
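As an illustration of this feature extraction step, the sketch below implements a four-level db2 decomposition in pure Python and computes the four statistics for each of D1-D4 and A4. Several choices are assumptions made only for the example and are not stated in the text: periodic extension at the signal boundaries (wavelet libraries often default to a different extension mode, which changes the coefficient counts), the population standard deviation, and the names `dwt_step` and `wavelet_features`.

```python
import math
import statistics

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# db2 scaling (low-pass) filter; the wavelet (high-pass) filter is its
# quadrature mirror.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_step(signal):
    """One db2 analysis step with periodic extension: (approximation, detail)."""
    n = len(signal)
    approx, detail = [], []
    for i in range(0, n, 2):  # filter, then keep every second output
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(c * s for c, s in zip(LOW, window)))
        detail.append(sum(c * s for c, s in zip(HIGH, window)))
    return approx, detail


def wavelet_features(signal, levels=4):
    """[max, min, mean, std] for D1..D4 and A4: 20 values when levels = 4."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        bands.append(detail)       # D1, D2, D3, D4
    bands.append(approx)           # A4
    features = []
    for band in bands:
        features += [max(band), min(band),
                     statistics.fmean(band), statistics.pstdev(band)]
    return features


# A synthetic 256-sample window (5 Hz sine sampled at 173.61 Hz) stands in
# for a real EEG window.
window = [math.sin(2 * math.pi * 5 * i / 173.61) for i in range(256)]
features = wavelet_features(window)
```

With a 256-sample window the sub-bands contain 128, 64, 32, 16 and 16 coefficients, and the five sub-bands times four statistics give the 20-dimensional feature vector.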

Swarm Negative Selection (SNS) algorithm-EEG classification:
The SNS algorithm is a hybrid classification model based on PSO and negative selection algorithms. It has been introduced in this study to classify EEG signals for diagnostic purposes. The SNS algorithm uses adapted negative selection to train a set of ALCs on a set of normal EEG patterns (self) to be self-tolerant, i.e., to not match any self pattern. PSO is then used to evolve the ALCs away from self patterns towards non-self space and to maintain diversity and generality among the ALCs.
In SNS, all patterns are represented in space as real-valued vectors and Euclidean distance is used as the affinity measure. The Affinity Distance Threshold (ADT) of an ALC is used to determine a match with a non-self pattern. The main goal of the SNS algorithm is to evolve ALCs to detect non-self patterns that have not been presented during training. However, not all ALCs will detect non-self patterns. Therefore, each ALC that does not detect any non-self pattern is replaced by a new one. The steps of the SNS algorithm are summarized in Algorithm 2.
Negative selection trains an ALC to not match any self pattern in the training set; in doing so, it determines the best ADT for the ALC. In the adapted versions of the negative selection algorithm, an ALC is trained to have a maximum ADT that does not overlap with the self patterns. To guarantee a maximum ADT with no overlap with self, the ADT of the ALC is set to the distance to the closest self pattern. A pattern is then classified by an ALC as non-self if the Euclidean distance between them is less than the ADT [20,24].
PSO is used in the SNS algorithm to evolve a set of ALCs into memory ALCs. These memory ALCs are then used to distinguish between self and non-self patterns. Initially the set of memory ALCs is empty. The purpose of each PSO run is to evolve one optimal ALC to be added to the set of memory ALCs. The evolved ALC is added to the memory set only if it detects non-self patterns that have not yet been detected by the existing ALCs in the memory set.
The main objective of the PSO is to maximize the ADT of the evolved ALC. In addition, the PSO needs to evolve an ALC that minimizes the average overlap with the existing ALCs in the set of memory ALCs. Maximizing the distance between the new ALC and the set of memory ALCs guarantees that the evolved ALC has the lowest average overlap with the existing set of ALCs and forces greater coverage of non-self space. Therefore, to evaluate the quality of an ALC, the fitness of each particle is calculated based on the negative selection method using the following fitness function:

        ii. Find the global best solution gbest
        iii. Update each particle using Eq. 1 and 2
    c. Consider gbest as a candidate memory ALC, c
    d. Classify non-self patterns using c
    e. If c detected new patterns, then add c to the set M
3. Until maximum number of iterations is reached or non-self is covered
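The ADT rule in this paragraph can be illustrated with a short sketch. It assumes, as stated in the text, real-valued vectors and Euclidean distance; the function names `train_adt` and `is_non_self` and the toy 2-D data are hypothetical and serve only as an example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train_adt(alc, self_patterns):
    """Largest ADT with no self overlap: distance to the closest self pattern."""
    return min(euclidean(alc, s) for s in self_patterns)


def is_non_self(pattern, detectors):
    """detectors is a list of (alc, adt) pairs.

    A pattern is non-self if it lies strictly inside the ADT radius of
    at least one ALC.
    """
    return any(euclidean(pattern, alc) < adt for alc, adt in detectors)


self_patterns = [[0.0, 0.0], [1.0, 0.0]]
alc = [4.0, 4.0]
adt = train_adt(alc, self_patterns)  # closest self pattern is (1, 0)
detectors = [(alc, adt)]
```

Because the comparison is strict, the closest self pattern lies exactly on the boundary of the ALC's region and is not matched, so the trained ALC stays self-tolerant with respect to the training set.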

RESULTS AND DISCUSSION
It is common practice in machine learning and data mining to perform k-fold cross-validation to assess the performance of a classification algorithm. K-fold cross-validation is used by researchers to reduce the bias associated with random sampling of the training data when evaluating the behavior of an algorithm. In k-fold cross-validation, the data are partitioned into k subsets of approximately equal size. Training and testing of the algorithm are performed k times. Each time, one of the k subsets is used as the test set and the other k-1 subsets are put together to form the training set. Thus, k different test results exist for the algorithm, and these k results are used to estimate performance measures for the classification system.
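The partitioning described above can be sketched as follows. This is a generic illustration rather than the exact procedure used in the study; the function name `k_fold_indices` and the fixed random seed are assumptions, and the sample count of 3200 is used only because it matches the size of the EEG feature set.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k folds of near-equal size."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


splits = list(k_fold_indices(3200, 5))
```

With k = 5, each iteration trains on 80% of the data and tests on the remaining 20%, and every sample appears in exactly one test set.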
The common performance measures used in medical diagnosis tasks are accuracy, sensitivity and specificity. Accuracy measures the ability of the classifier to produce a correct diagnosis. Sensitivity measures the ability of the model to identify occurrences of the target class correctly. Specificity measures the ability of the algorithm to correctly reject patterns that do not belong to the target class. The classification accuracies for the datasets are calculated as in Eq. 6:

$$\text{accuracy}(Z) = \frac{\sum_{i=1}^{|Z|} \text{assess}(z_i)}{|Z|}, \quad z_i \in Z \qquad (6)$$

Where:

$$\text{assess}(z) = \begin{cases} 1, & \text{if } \text{classify}(z) = z.c \\ 0, & \text{otherwise} \end{cases}$$

Where:
z = A pattern in the testing set Z to be classified
z.c = The class of pattern z
and classify(z) returns the classification of z by the classification algorithm.

For the analysis of sensitivity and specificity, the following equations can be used:

$$\text{sensitivity} = \frac{TP}{TP + FN}$$

$$\text{specificity} = \frac{TN}{TN + FP}$$

where TP, TN, FP and FN denote true positives, true negatives, false positives and false negatives, respectively.
The SNS algorithm was evaluated on EEG data in order to investigate its performance in detecting epileptic seizures. Data sets A and E were selected to represent the normal and epileptic classes, respectively. The one hundred EEG segments of 4096 data points in each class were divided into windows of 256 samples, giving 16 windows per segment. Hence, the EEG dataset was formed by 3200 feature vectors (2 classes × 100 segments × 16 windows). For each vector, the DWT coefficients of the four-level decomposition (D1-D4 and A4) were computed. The four statistical features calculated over each of the five sets of wavelet coefficients reduced the dimensionality of each feature vector to 20 values.
On the complete EEG dataset, the SNS algorithm was trained and tested with 40-60% (random selection), 60-40% (random selection) and 80-20% (5-fold cross-validation) training-test partitions. The class distribution of the data points in the training and testing datasets is summarized in Table 2. In the experiments conducted in this study, EEG signals with normal activity and with epileptic seizures were classified by the swarm negative selection algorithm. All the obtained results are displayed in Table 3 for the 40-60, 60-40 and 80-20 training-test partitions. As seen in Table 3, the obtained test classification accuracies were 99.15, 99.47 and 99.22%, respectively.
As mentioned above, this study is based on a two-stage methodology: feature extraction and EEG classification. In the literature, many methods have been evaluated using the same methodology and EEG dataset. Table 4 shows a comparison between the results reported by those methods and the results of the proposed algorithm. As these results show, the proposed method yields results comparable to those of the SVM model [10] (99.30%), and its classification accuracy of 99.28% is higher than that of the other methods.

Table 4: Comparison of the proposed method with earlier methods on the same EEG dataset
Authors                 Method                     Accuracy (%)
Guler et al. [9]        Lyapunov exponents-RNN     96.79
Guler and Ubeyli [8]    Wavelet-ANFIS              98.68
Subasi [6]              Wavelet-MLPNN              93.20
Subasi [6]              Wavelet-ME Network         94.50
Ubeyli [10]             Eigenvector-SVM            99.30
This study              Wavelet-SNS                99.28

Thus, the experimental results showed that the proposed automated detection system based on the discrete wavelet transform and the swarm negative selection algorithm performs very well in diagnosing epileptic seizures in EEG signals.
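The three measures, and the dataset arithmetic above, can be checked with a short sketch. The labels below are made-up toy data rather than results from the study, the epileptic class is assumed to be the positive class (label 1), and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) with `positive` as the target class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity from true and predicted labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy labels: 4 epileptic (1) and 6 normal (0) patterns.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)

# Dataset arithmetic from the text.
windows_per_segment = 4096 // 256           # 16 windows per segment
n_vectors = 2 * 100 * windows_per_segment   # 2 classes x 100 segments
n_features = 5 * 4                          # 5 sub-bands x 4 statistics
```

In the toy example there are 3 true positives, 1 false negative, 5 true negatives and 1 false positive, so accuracy is 0.8, sensitivity is 0.75 and specificity is 5/6.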

CONCLUSION
In this study, an automated diagnosis system was introduced for epileptic seizure detection in EEG signals. In the proposed system, the diagnosis process is performed in two stages: Feature extraction using the discrete wavelet transform and decision making using the swarm negative selection algorithm (a hybrid method). The SNS algorithm uses the features produced by DWT to form a set of ALCs (detectors) that can distinguish between normal and epileptic EEG signals.
The experiments conducted on the EEG dataset showed that the SNS algorithm performs very well in detecting epileptic seizures. Its results match or exceed those reported by many previous studies. We believe that the proposed system can be an efficient tool to assist experts by facilitating the analysis of a patient's information and reducing the time and effort required to make accurate decisions about their patients.