Improving the Attack Detection Rate in Network Intrusion Detection using Adaboost Algorithm

Problem statement: Nowadays, the Internet plays an important role in communication between people. To ensure secure communication between two parties, a security system is needed that detects attacks effectively. Network intrusion detection works alongside other security systems to protect computer networks. Approach: In this article, an Adaboost algorithm for network intrusion detection with a single weak classifier is proposed. Bayes Net, Naive Bayes and Decision Tree classifiers are used as weak classifiers. A benchmark data set is used in the experiments to demonstrate that the boosting algorithm can greatly improve the classification accuracy of weak classification algorithms. Results: Our approach achieves a higher detection rate with low false alarm rates and is scalable to large data sets, resulting in an effective intrusion detection system. Conclusion: The Naive Bayes and Decision Tree classifiers perform comparatively better as weak classifiers with Adaboost and should be considered when building an IDS.


INTRODUCTION
Protecting the computer network through intrusion detection has become an important task for the network administrator and is one of the emerging research areas in network security. The main focus of network intrusion detection techniques is to capture packets, inspect their header fields and data portion, and separate attack packets from normal packets. There are basically two types of intrusion detection systems: misuse based detection and anomaly based detection. An anomaly based detection system first learns normal user activities and then raises an alert for any user behavior that deviates from the learned activities (Barbara and Jajodia, 2002). The main strength of anomaly based detection is its capability to detect novel attacks that differ from previously seen attacks. Its main drawback is that it can erroneously classify normal user behavior as an attack, which results in a higher false positive rate. Misuse based detection uses known patterns of attacks to detect intrusions that match the same patterns (Freund and Schapire, 1997). Misuse based detection has a higher attack detection rate than anomaly based detection, but it fails to detect novel attacks.
Related work: A Bayesian classification approach for intrusion detection has been proposed. It consists of monitoring user activities inside the network and using a Bayesian classification procedure together with an unsupervised machine learning algorithm to evaluate the deviation between the present behavior and the already learned behavior. The reported results showed an increase in the attack detection rate. Zainal et al. (2009) demonstrated an ensemble of different learning algorithms in which proper weights are assigned to the individual classifiers used in the classification model. They also observed an improvement in network attack detection and a considerable drop in false alarms.
Recently, many researchers have constructed hybrid Intrusion Detection Systems (IDS) that integrate different machine learning methodologies to deal with the challenges faced by intrusion detection. Horng et al. (2011) developed a hybrid intelligent IDS by integrating hierarchical clustering and Support Vector Machines (SVM). Xiang et al. (2008) designed an IDS by integrating supervised tree classifiers and an unsupervised Bayesian clustering method to detect network intrusions. Zhang and Zulkernine (2006) designed a novel unsupervised anomaly network IDS based on the outlier detection technique of the random forests approach. This approach reduced the time complexity and memory cost to a large extent. The framework built by Sarasamma et al. (2005) is based on a hierarchical method that improves the attack detection rate and reduces the computational cost. Giacinto et al. (2003) approached the intrusion detection problem from a different angle: their anomaly IDS was modularized by protocol and service, which improved the detection results. Gudahe et al. (2010) demonstrated a new ensemble boosted decision tree for intrusion detection. Liu et al. (2010) constructed a classifier using a decision tree as its base learner; its ability to detect attacks was better than that of SOM algorithms. Hu et al. (2008) proposed an Adaboost based algorithm for network intrusion detection that used a decision stump as its base learner. They constructed decision rules for both categorical and continuous features and handled overfitting efficiently. The key difference between our proposed work and that of Hu et al. (2008) is that they used a decision stump as the weak learner, while we use Bayes Net (BN), Naive Bayes (NB) and Decision Tree (DT) as weak learners. In addition, Hu et al. (2008) considered all attacks as a single category, while our system groups the attacks by their characteristics into four categories: DoS, Probe, R2L and U2R.

Dataset analysis:
Under the sponsorship of the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), the MIT Lincoln Laboratory established a network, captured packets of different attack types and distributed the resulting data sets for the evaluation of research on computer network intrusion detection systems. The KDDCup99 data set is a subset of the DARPA benchmark data set.
The KDDCup99 training data set is about four gigabytes of compressed binary TCP dump data from seven weeks of network traffic, processed into about five million connection records of about 100 bytes each (KDDCup99, 1999; Tavallaee et al., 2009). The two weeks of test data contain about two million connection records. Each KDDCup'99 training connection record contains 41 features and is labeled as either normal or an attack, with exactly one specific attack type. Tables 1 and 2 show the number of samples for each attack category in the training and testing data sets respectively.
The rest of the study is organized as follows. We briefly present an overview of the Adaboost algorithm, Bayesian classifiers and decision tree algorithms. We then describe our proposed work, followed by the experimental analysis. Finally, we conclude the study with suggestions for future work.

Overview of algorithms:
Adaboost algorithm: AdaBoost is an ensemble based machine learning algorithm that can be combined with many other classification algorithms in order to improve their classification and attack detection performance. It calls a base learner in a loop for a specified number of iterations. In each iteration, a distribution of weights $D_t$ is calculated and updated; it indicates the importance of each example in the data set for the classification. In each iteration of the loop, the weights of the incorrectly classified samples are increased so that the next classifier concentrates more on the samples that were classified incorrectly (Zan et al., 2007; Sabhnani and Serpen, 2003). The pseudo code of the Adaboost algorithm is given in Fig. 1.
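The reweighting loop described above can be illustrated with a minimal sketch. It assumes the standard two-class AdaBoost formulation with labels +1 and -1 and a one-feature decision stump as the base learner; the function names and the toy data are hypothetical, and the sketch is not the pseudo code of Fig. 1 or the multi-class setting used in the experiments.

```python
import math

def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            # The stump predicts `polarity` when x >= threshold, else -polarity.
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def stump_predict(x, threshold, polarity):
    return polarity if x >= threshold else -polarity

def adaboost(xs, ys, rounds):
    n = len(xs)
    weights = [1.0 / n] * n              # D_1: uniform distribution
    ensemble = []                        # (alpha, threshold, polarity) per round
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote of this weak classifier
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples (y * h(x) = -1) gain weight, the rest lose it.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]   # renormalise to obtain D_{t+1}
    return ensemble

def predict(ensemble, x):
    """Weighted vote of all weak classifiers."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the two classes.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy set a single stump misclassifies three of the ten samples, while the weighted vote of three stumps classifies all ten correctly, which is the effect the boosting process relies on.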

Bayesian classifiers:
Bayesian classification is one of the techniques used in data mining for the classification of samples. Given the probability distribution of the samples in a data set, a Bayes classifier can achieve the optimal classification accuracy. Bayes' rule is used here to find the posterior probability from the prior probability and the likelihood, because the latter two are generally easier to calculate from the specified probability model.
Let $X$ be a sample of a network connection consisting of $n$ features and let $C_i$ represent a class to be predicted (Khor et al., 2010b). The classification of an observed network connection is decided by finding $P(C_i \mid X)$: the posterior probability of a class equals its likelihood $P(X \mid C_i)$ times its prior probability $P(C_i)$, normalized by dividing by $P(X)$, as in (6):
$$P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)} \quad (6)$$
Consider a Naive Bayesian classification method with $n$ nodes, $X_1$ to $X_n$. The features and the class are represented by nodes labeled $X_1, \ldots, X_n$ and $C$ respectively. Naïve Bayes classification assumes that the features are conditionally independent of each other given the class. Since $P(X)$ is constant for all classes, only $P(X \mid C_i)\, P(C_i)$ needs to be maximized, and under the independence assumption the likelihood factorizes as in (7), where $x_k$ is the observed value of feature $X_k$ (Khor et al., 2010a):
$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) \quad (7)$$
The Naïve Bayes classifier is widely accepted because of its competitive performance in many research domains, such as medicine and business, and because of its computational simplicity, which saves considerable computational cost (Khor et al., 2010b; Han et al., 2005; Friedman et al., 1997; Gupta et al., 2010; Kayacik et al., 2003).
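As a rough illustration of the Naïve Bayes decision rule, the sketch below picks the class that maximizes the prior times the product of per-feature likelihoods, computed in log space. It assumes categorical features only and uses Laplace (add-one) smoothing, which is an assumption of this sketch and is not stated in the text; the toy records and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(records, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for record, label in zip(records, labels):
        for k, value in enumerate(record):
            feature_counts[(label, k)][value] += 1
            values[k].add(value)
    return class_counts, feature_counts, values

def classify(model, record):
    """Return the class maximising log P(C_i) + sum_k log P(x_k | C_i)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)            # log prior
        for k, value in enumerate(record):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[(label, k)][value] + 1
            denominator = count + len(values[k])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy records: (protocol_type, service)
records = [("tcp", "http"), ("tcp", "http"), ("udp", "dns"),
           ("icmp", "ecr_i"), ("icmp", "ecr_i"), ("tcp", "private")]
labels = ["normal", "normal", "normal", "dos", "dos", "dos"]
model = fit_naive_bayes(records, labels)
result = classify(model, ("icmp", "ecr_i"))
```

The denominator $P(X)$ is never computed, since it is the same for every class and does not change which class attains the maximum.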
A Bayes Net employs a graphical model to describe the relationship of features. The structure of the graphical model and also a Conditional Probability Table (CPT) of a BayesNet classifier could be built based on a training set.
The graphical model states a factorization of the joint probability distribution, where the value of a node is conditioned on its parent nodes, as given in (8):
$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\big(X_i \mid \mathrm{Parents}(X_i)\big) \quad (8)$$
A Bayes Net can also be built manually by integrating the knowledge of a domain expert. The building process is iterative and involves model verification and model revision (Khor et al., 2010b).
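The factorization of the joint distribution over parent-conditioned nodes can be made concrete with a small sketch. The three boolean nodes, the network structure and every probability in the CPTs below are hypothetical values chosen for illustration; they are not taken from the experiments.

```python
from itertools import product

# Hypothetical structure: each node maps to the tuple of its parents.
parents = {
    "attack": (),
    "syn_flood": ("attack",),
    "high_count": ("attack", "syn_flood"),
}

# Hypothetical CPTs: P(node = True | parent values), keyed by parent values.
cpt = {
    "attack": {(): 0.2},
    "syn_flood": {(True,): 0.6, (False,): 0.05},
    "high_count": {(True, True): 0.9, (True, False): 0.4,
                   (False, True): 0.5, (False, False): 0.1},
}

def joint(assignment):
    """Joint probability as the product of P(node | parents) over all nodes."""
    probability = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        p_true = cpt[node][parent_values]
        probability *= p_true if assignment[node] else 1.0 - p_true
    return probability

# The joint probabilities of all assignments should sum to one.
names = list(parents)
total = sum(joint(dict(zip(names, combo)))
            for combo in product([True, False], repeat=len(names)))
```

For the assignment in which all three nodes are true, the product is 0.2 × 0.6 × 0.9, one factor per node.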

Decision tree construction:
The decision tree is a frequently used machine learning technique for constructing classification systems. In decision tree construction, each internal node represents a test on a feature and each branch denotes an outcome of the test. The leaf nodes of the tree indicate classes or class distributions (Xiang et al., 2008). The pseudo code for decision tree construction is given in Fig. 2.

Proposed work:
As per the requirements of a network intrusion detection system, our proposed system consists of four components built around the Adaboost algorithm, as shown in Fig. 3: feature extraction, instance labeling, design of weak classifiers and building of the strong classifier.

Process 1-Feature extraction:
For each network connection in the data set, the following three key groups of features for detecting intrusions are extracted.

Basic features:
This group summarizes all the features that can be extracted from a TCP/IP connection. Some of the basic features in the KDDCup99 data sets are protocol_type, service, src_bytes and dst_bytes.

Content features:
These features are purely based on the contents in the data portion of the data packet.

Fig. 2: Decision tree construction
Traffic features: This group comprises features that are computed with respect to a two-second time window and it is divided into two groups: same host features and same service features. The same host features inspect only the connections in the past 2 sec that have the same destination host as the current connection. The same service features inspect only the connections in the past 2 sec that have the same service as the current connection. Some of the traffic features are count, serror_rate, rerror_rate and srv_serror_rate.
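As a simple sketch of how such window-based features could be derived, the function below counts, for each connection, the connections seen within the preceding two seconds that share its destination host or its service. The input layout `(timestamp, dst_host, service)`, the function name and the choice to include the current connection in its own counts are assumptions of this sketch, not details taken from the KDDCup99 preprocessing.

```python
def traffic_counts(connections, window=2.0):
    """Compute (same-host count, same-service count) for each connection.

    `connections` is a list of (timestamp, dst_host, service) tuples sorted
    by timestamp. Each count covers the connections up to and including the
    current one whose timestamp lies within `window` seconds of it.
    """
    result = []
    for i, (t, host, service) in enumerate(connections):
        count = 0       # same destination host within the window
        srv_count = 0   # same service within the window
        for t_prev, host_prev, service_prev in connections[: i + 1]:
            if t - t_prev <= window:
                count += int(host_prev == host)
                srv_count += int(service_prev == service)
        result.append((count, srv_count))
    return result

# Hypothetical connection log
log = [
    (0.0, "10.0.0.5", "http"),
    (0.5, "10.0.0.5", "http"),
    (1.0, "10.0.0.9", "http"),
    (3.1, "10.0.0.5", "ftp"),
]
features = traffic_counts(log)
```

The nested loop is quadratic in the number of connections; for a data set of several million records a sliding window over the sorted log would be the more practical choice.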
Process 2-instance labeling: After extracting the KDDCup'99 features from each record, the instances are labeled according to the characteristics of the traffic as Normal, DoS, Probe, R2L or U2R.
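The labeling step can be sketched as a lookup from the specific attack name of a record to one of the four categories. The mapping below is the grouping commonly used for the attack types in the KDDCup'99 training data and is given here as an assumption; the table and function names are hypothetical, and attack types that appear only in the test data are not covered.

```python
# Commonly used grouping of KDDCup'99 training attack names (an assumption
# of this sketch; the exact list used in the experiments is not given).
ATTACK_CATEGORY = {
    "back": "DoS", "land": "DoS", "neptune": "DoS",
    "pod": "DoS", "smurf": "DoS", "teardrop": "DoS",
    "ipsweep": "Probe", "nmap": "Probe",
    "portsweep": "Probe", "satan": "Probe",
    "ftp_write": "R2L", "guess_passwd": "R2L", "imap": "R2L",
    "multihop": "R2L", "phf": "R2L", "spy": "R2L",
    "warezclient": "R2L", "warezmaster": "R2L",
    "buffer_overflow": "U2R", "loadmodule": "U2R",
    "perl": "U2R", "rootkit": "U2R",
}

def label_connection(raw_label):
    """Map a raw record label such as 'smurf.' to its traffic category."""
    name = raw_label.strip().rstrip(".")  # raw labels may carry a trailing dot
    if name == "normal":
        return "Normal"
    return ATTACK_CATEGORY[name]          # raises KeyError for unknown attacks

categories = [label_connection(label) for label in ("normal.", "smurf.", "nmap.")]
```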

Process 3-selection of weak classifiers:
The weak classifiers selected for our proposed system are Naïve Bayes, Bayes Net and Decision Tree. We use these weak classifiers together with the boosting algorithm to improve the classification accuracy.

Fig. 3: Framework of our Intrusion detection model
Process 4-building of strong classifier: A strong classifier is constructed by combining a weak classifier with the boosting algorithm. The strong classifier yields a higher attack detection rate than a single weak classifier. The pseudo code of our proposed IDS is shown in Fig. 4.

Experimental analysis:
The main focus of our work was to improve the network attack detection rate and to reduce the false alarm rate to a minimum level. The experiment was conducted using the Bayes Net, Naïve Bayes and Decision Tree weak classifiers. Weka 3.6, a Java based open source data mining package that comprises a group of machine learning algorithms for the classification of samples, was chosen to implement our algorithm.

RESULTS AND DISCUSSION
In machine learning and data mining algorithms, many different measures are used to evaluate the classification models (Tan et al., 2006).

True Positive (TP):
The situation in which a signature fires correctly: an attack is detected and an alarm is generated.

Attack Detection Rate (ADR):
It is the ratio of the total number of attack connections detected by our proposed model to the total number of attacks present in the data set.

Attack Detection Rate (ADR) Eq. 15:
$$\mathrm{ADR} = \frac{\text{Total detected attacks}}{\text{Total attacks}} \times 100 \quad (15)$$

False Alarm Rate (FAR):
It is the ratio of the total number of misclassified instances to the total number of normal connections present in the data set.

False Alarm Rate Eq. 16:
$$\mathrm{FAR} = \frac{\text{Total misclassified instances}}{\text{Total normal instances}} \times 100 \quad (16)$$

Comparison of performance of weak classifiers:
Detection rate comparison: The detection rates (15) of the various attack categories obtained with the three weak classifiers in the boosting process are shown in Table 3. It can be noticed that the detection rate of DoS attacks increases to 97.3% and the detection rate of Probe attacks increases to 91.4% when the Decision Tree weak classifier is combined with Adaboost. It can also be seen that the Naive Bayes weak classifier with Adaboost gives the better detection rate for the U2R and R2L attack categories.
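The two measures can be computed from true and predicted labels as in the following sketch. It assumes that an attack counts as detected when it is assigned to its own category, and that the misclassified instances in the false alarm rate are normal connections reported as any attack; both readings, as well as the function names and toy labels, are assumptions of this sketch.

```python
def detection_rate(true_labels, predicted_labels, category):
    """Percentage of connections of `category` that were predicted as such."""
    pairs = list(zip(true_labels, predicted_labels))
    total = sum(1 for t, _ in pairs if t == category)
    detected = sum(1 for t, p in pairs if t == category and p == category)
    return 100.0 * detected / total

def false_alarm_rate(true_labels, predicted_labels, normal="Normal"):
    """Percentage of normal connections that were reported as an attack."""
    pairs = list(zip(true_labels, predicted_labels))
    total_normal = sum(1 for t, _ in pairs if t == normal)
    false_alarms = sum(1 for t, p in pairs if t == normal and p != normal)
    return 100.0 * false_alarms / total_normal

# Hypothetical labels for ten connections
truth = ["Normal"] * 4 + ["DoS"] * 4 + ["Probe"] * 2
predicted = ["Normal", "Normal", "Normal", "DoS",
             "DoS", "DoS", "DoS", "Normal",
             "Probe", "DoS"]
dos_rate = detection_rate(truth, predicted, "DoS")
far = false_alarm_rate(truth, predicted)
```

Both functions divide by the number of instances of the relevant class, so they raise `ZeroDivisionError` when a category is absent from the evaluated data.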
False Alarm rate comparison: The false alarm rate (16) of the Naïve Bayes weak classifier with Adaboost decreases to 2.61%, whereas it increases when Decision Tree is used as the weak classifier with the Adaboost algorithm, as shown in Fig. 5.
The training time and response time of the classifiers are shown in Fig. 6 and 7 respectively. The Naive Bayes and Decision Tree algorithms took more time than the Bayes Net algorithm. A decrease in training time and response time is observed when Naïve Bayes and Decision Tree are used as weak classifiers with the Adaboost algorithm.
(Xuren et al., 2006) Based on the attack detection rates and false alarm rates, the weak classifiers with Adaboost seem to have comparable performance. Decision Tree gave a high detection rate with low computational time for the DoS and Probe attack categories, and Naïve Bayes with Adaboost gave a better detection rate for the R2L and U2R attack categories than the other weak classifier, Bayes Net.

Comparisons of detection rate with different algorithms:
The network attack detection rate and false alarm rate of our work are compared with existing work tested on the benchmark KDDCup'99 data set, as shown in Table 4. The performances were comparable, but the Naïve Bayes classifier with Adaboost and the Decision Tree classifier with Adaboost performed well. Since the Naïve Bayes and Decision Tree classifiers perform reasonably better as weak classifiers with Adaboost, they should be considered when building an intrusion detection system.
From Fig. 8, we observe that Adaboost with Naïve Bayes and Adaboost with Decision Tree perform considerably better than the earlier reported results, including the winner of the KDD'99 cup and the Multi-classifier method. Adaboost with Decision Tree has a very high attack detection rate of 97.3 percent for DoS and 91.4 percent for Probe, and Adaboost with Naïve Bayes reaches a detection rate of 19.5 percent for R2L and 51.2 percent for U2R.

CONCLUSION
Conclusion and future work: In this work we have combined the Adaboost algorithm with various weak classifiers. Bayes Net, Naive Bayes and Decision Tree are used as weak classifiers with the Adaboost algorithm to improve the classification accuracy. We have concentrated on two problems, the attack detection rate and the false alarm rate, in order to build a robust and extensible intrusion detection system. It is important for an efficient intrusion detection system to have a very low false alarm rate. The experimental results illustrate that Naïve Bayes with Adaboost and Decision Tree with Adaboost have a very low false alarm rate together with a higher attack detection rate. We have focused mainly on obtaining better classification, although the time and computational complexities are theoretically higher. In practice, these costs are offset by the processing speed of the computing device.
Areas for future research include considering other classifiers to look for opportunities to improve the classification accuracy, and combining two weak classifiers linearly with the Adaboost algorithm. The Adaboost algorithm can be further improved in order to detect attacks more effectively.