Intrusion Preventing System using Intrusion Detection System Decision Tree Data Mining

Problem statement: Distinguishing intrusive from normal activity in network traffic is difficult and time-consuming. An analyst must review large and wide-ranging volumes of data to find the sequence of an intrusion on a network connection. A method is therefore needed that detects network intrusion and reflects current network traffic. Approach: This study proposes a novel method for finding intrusion characteristics for an Intrusion Detection System (IDS) using decision tree machine learning, a data mining technique. Rules are generated by classification with the ID3 decision tree algorithm. Results: The resulting rules identify intrusion characteristics and are then implemented in the firewall policy rules as prevention. Conclusion: The combination of an IDS and a firewall is called an Intrusion Prevention System (IPS); besides detecting an intrusion, it can also act on it by denying the intrusive traffic.


INTRODUCTION
With global Internet connectivity, network security has gained significant attention in the research and industrial communities. Because of the increasing threat of network attacks, firewalls have become important elements of security policy in general [7]. A firewall can allow or deny network packets, but it cannot detect an intrusion or attack; intrusion detection is therefore needed, and its findings can then be implemented in the firewall, which acts as the access control system, for prevention. Intrusion detection is also considered a complementary solution to firewall technology, because it recognizes attacks against the network that the firewall misses [10]. Firewall and IDS are long-established terms in the field of IT security. A firewall is good at protecting a system and network and can minimize the risk of attack on the network. An IDS can detect the existence of an intrusion or attack. The combined capability of an IDS and a firewall is called an IPS: a tool that detects an intrusion and then denies it at the firewall for prevention.
For each type of network traffic, there are one or more rules. Every network packet that arrives at the firewall must be checked against the defined rules until a matching rule is found [1,10]. The packet is then allowed or denied access to the network, depending on the action specified in the matching rule. Each rule identifies a specific type of network traffic [4,12]. Characteristics that reflect current network traffic can be observed from network traffic logs [4], in the way a human recognizes patterns [9].
This study focuses on preventing intrusion attempts by finding intrusion characteristics in network traffic, as an IDS does, and then implementing them in firewall policy rules as prevention. Rules describing intrusion characteristics are found using decision tree machine learning, a data mining technique. The rules are generated by classification with the ID3 decision tree algorithm. The approach is efficient and yields optimized filtering rules for the firewall.
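The first-match behavior described above can be sketched as follows. This is a minimal illustration, not the firewall used in the study: the `Rule` and `Packet` types, the wildcard convention (`None` matches anything), the default action and the addresses (taken from the documentation ranges) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

ANY = None  # assumed wildcard: a field set to None matches every value


@dataclass(frozen=True)
class Rule:
    action: str                      # "ACCEPT" or "DROP"
    src_ip: Optional[str] = ANY
    dst_ip: Optional[str] = ANY
    src_port: Optional[int] = ANY
    dst_port: Optional[int] = ANY
    protocol: Optional[str] = ANY


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol")


def matches(rule, packet):
    """A rule matches when every non-wildcard field equals the packet's field."""
    return all(
        getattr(rule, f) is ANY or getattr(rule, f) == getattr(packet, f)
        for f in FIELDS
    )


def filter_packet(rules, packet, default="ACCEPT"):
    """Check the packet against the ordered rules; the first match decides."""
    for rule in rules:
        if matches(rule, packet):
            return rule.action
    return default  # hypothetical default policy when no rule matches


# Hypothetical policy and packet
rules = [
    Rule("DROP", src_ip="203.0.113.7", dst_port=22, protocol="tcp"),
    Rule("ACCEPT", dst_port=80, protocol="tcp"),
]
pkt = Packet("203.0.113.7", "192.0.2.10", 40000, 22, "tcp")
print(filter_packet(rules, pkt))
```

Because the rules are scanned in order, the cost of filtering a packet grows with the number of rules that precede its match, which is why a compact rule set matters.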
Theoretical background: Intrusion Detection System (IDS): Intrusion detection can be performed manually or automatically [8]. Manual intrusion detection might take place by examining log files or other evidence for signs of intrusions, including network traffic. A system that performs automated intrusion detection is called an Intrusion Detection System (IDS). An IDS plays a vital role in ensuring the security of modern computer installations. Such systems are needed in order to detect hostile activity and to respond appropriately. As networks continue to expand and become more exposed to a diversity of sources, both hostile and benign, an IDS needs to be able to deal with a large and ever-increasing flow of alerts and events. Therefore, automatic procedures for detecting and responding to intrusion are becoming increasingly essential [5].
Firewall rules: A firewall security policy is a list of ordered filtering rules that define the actions performed on packets that satisfy specific conditions. Before packet filtering rules are developed, it must be decided how many restrictions will be applied, because each additional restriction increases the search time and space requirements of the packet filtering process [1] and consequently degrades performance progressively [11]. This is because every network packet entering or leaving the network is checked against the rules in turn until a matching rule is found in the firewall [12]. Firewall rules can restrict connections based on parameters such as source IP, destination IP, source port, destination port and protocol [8,10].
An example of firewall rules is shown in the accompanying figure.
Log files can provide a useful profile of activity. From a security standpoint, it is crucial to be able to distinguish normal activity from the activity of someone attacking a server or network [3].
Log files are useful for three reasons [11]:
• Log files help with troubleshooting system problems and understanding what is happening on the system
• Logs serve as an early warning for both system and security events
• Logs can be indispensable in reconstructing events, whether for determining that an intrusion has occurred and performing the follow-up forensic investigation or just for profiling normal activity
Some examples from log files are shown in Fig. 2.
Decision tree of data mining: A decision tree is a classification technique in data mining for learning patterns from data and using these patterns for classification. Decision trees are structures used to classify data with common attributes, as shown in Fig. 4. Each decision tree represents a rule, which categorizes data according to these attributes [4,6].
Each internal (nonleaf) node denotes a test on an attribute, each branch represents an outcome of the test and each leaf (terminal) node holds a class label. The topmost node in a tree is the root node [2].
A decision tree classifier is one of the most widely used supervised learning methods for data exploration. It is easy to interpret and can be re-represented as if-then-else rules. This classifier works well on noisy data. A decision tree aids data exploration in the following ways:
• It reduces a volume of data by transforming it into a more compact form that preserves the essential characteristics and provides an accurate summary
• It discovers whether the data contains well-separated classes of objects, such that the classes can be interpreted meaningfully in the context of a substantive theory
• It maps the data into a tree whose paths connect the leaves to the root. This may be used to predict the outcome for a new data item or query [6]
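How such a tree predicts the class of a new record can be sketched as below. The nested-dictionary encoding, the example tree, the attribute names and the choice to treat unseen attribute values as normal traffic are assumptions made for illustration; they are not taken from the figures of this study.

```python
# Hypothetical tree: an internal node is {attribute: {value: subtree}},
# a leaf is the class label "Yes" (intrusion) or "No" (normal).
tree = {
    "src_ip": {
        "203.0.113.5": {"dst_port": {22: "Yes", 80: "No"}},
        "198.51.100.9": "No",
        "203.0.113.77": "Yes",
    }
}


def classify(tree, record, default="No"):
    """Follow one path from the root to a leaf and return the leaf's label."""
    node = tree
    while isinstance(node, dict):
        attribute, children = next(iter(node.items()))
        value = record[attribute]
        if value not in children:
            return default  # assumption: unseen values count as normal traffic
        node = children[value]
    return node


new_record = {"src_ip": "203.0.113.5", "dst_port": 22}
print(classify(tree, new_record))
```

Each step of the loop is one attribute test, so the walk is equivalent to evaluating a chain of nested if-then-else statements.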

MATERIALS AND METHODS
This research uses a decision tree, a data mining and machine learning technique, to find intrusion characteristics for intrusion detection. The ID3 algorithm is used to construct the decision tree. Network traffic logs serve as the training data; they describe human behavior in network traffic as intrusive or normal activities. Training the decision tree yields rules of intrusion characteristics, which are then implemented in the firewall rules as prevention.
Whether a network traffic log records intrusive or normal activity can be determined in two ways:
• Manually observing network traffic activity in log files. Examples of logging software are syslog, syslog-ng, tcpdump and others. Intrusion patterns can be seen plainly in the logs: for example, repeated failed login or password attempts, port scans, excessive pings and repeated delivery of excessive packets
• Using software that functions as a Network Intrusion Detection System (NIDS) and can distinguish intrusive from normal activities, for example Snort
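The manual patterns listed above can be approximated with simple counting, as in the following sketch. It assumes the log lines have already been parsed into dictionaries; the field names (`src_ip`, `dst_port`, `login_failed`) and both thresholds are hypothetical and would need tuning against real logs.

```python
from collections import defaultdict


def flag_sources(events, max_failed_logins=3, max_distinct_ports=10):
    """Return {source IP: [reasons]} for sources that exceed a threshold."""
    failed = defaultdict(int)   # failed login attempts per source
    ports = defaultdict(set)    # distinct destination ports per source
    for event in events:
        ports[event["src_ip"]].add(event["dst_port"])
        if event.get("login_failed"):
            failed[event["src_ip"]] += 1

    flagged = {}
    for src, seen_ports in ports.items():
        reasons = []
        if failed[src] > max_failed_logins:
            reasons.append("repeated failed logins")
        if len(seen_ports) > max_distinct_ports:
            reasons.append("possible port scan")
        if reasons:
            flagged[src] = reasons
    return flagged


# Hypothetical parsed log events
events = (
    [{"src_ip": "203.0.113.5", "dst_port": p} for p in range(1, 21)]
    + [{"src_ip": "203.0.113.77", "dst_port": 22, "login_failed": True}] * 5
    + [{"src_ip": "198.51.100.9", "dst_port": 80}]
)
print(flag_sources(events))
```

Records labeled this way (or by an NIDS) supply the 'Yes'/'No' class needed for the training data.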

RESULTS
Log files of intrusive and normal activities are collected and reduced to five parameters, used as attributes, and each record is assigned to the class 'Yes' or 'No' (intrusive or not) to form the training data for the decision tree. The parameters are source IP address, destination IP address, source port, destination port and protocol, as shown in Table 1.
Applying Decision Tree to Find Intrusion Characteristic: Suppose we train a decision tree using the examples in Table 1. In Table 2, the attribute with the highest Gain is Source IP. Note that protocol is ignored in the calculation because the protocol attribute has only one value, TCP; every path from the root to a leaf node is nevertheless assumed to include the TCP protocol. Meanwhile, the Source Port attribute has a large number of distinct values; such an attribute is called a super attribute.
Third, GainRatio can be used for attribute selection between Source IP and Source Port. Source IP has the highest gain ratio and is therefore used as the decision node, as shown in Fig. 7.
This process continues until all data are classified perfectly or the attributes run out. The complete tree is shown in Fig. 8.
The decision tree can be simplified by pruning: all connections not classified as intrusions are assumed to be normal, as shown in Fig. 9.
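For reference, the selection measures used in this step are conventionally defined as follows. This is the standard formulation from the decision tree literature, given here as a sketch; the tables of this study are assumed, not confirmed, to follow it. Here $S$ is the training set, $p_c$ the proportion of records in class $c$, $A$ an attribute and $S_v$ the subset of $S$ in which $A$ takes the value $v$.

```latex
\begin{align}
\mathrm{Entropy}(S) &= -\sum_{c \in \{\mathrm{Yes},\,\mathrm{No}\}} p_c \log_2 p_c \\
\mathrm{Gain}(S, A) &= \mathrm{Entropy}(S)
  - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \\
\mathrm{SplitInfo}(S, A) &= -\sum_{v \in \mathrm{Values}(A)}
  \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|} \\
\mathrm{GainRatio}(S, A) &= \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInfo}(S, A)}
\end{align}
```

An attribute with many distinct values splits $S$ into many small, nearly pure subsets, which inflates Gain; dividing by SplitInfo penalizes exactly such splits.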
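The comparison between Gain and GainRatio can be reproduced on a small example. The sketch below uses seven hypothetical training records, not the contents of Table 1, and the standard definitions of entropy, gain and split information; the protocol column is omitted because it is TCP in every record.

```python
from collections import Counter
from math import log2

# Hypothetical training records (NOT the study's Table 1)
COLUMNS = ("src_ip", "dst_ip", "src_port", "dst_port", "intrusion")
ROWS = [dict(zip(COLUMNS, r)) for r in [
    ("203.0.113.5",  "192.0.2.10", 40001, 22, "Yes"),
    ("203.0.113.5",  "192.0.2.10", 40002, 22, "Yes"),
    ("203.0.113.5",  "192.0.2.10", 40003, 80, "No"),
    ("198.51.100.9", "192.0.2.10", 40004, 22, "No"),
    ("198.51.100.9", "192.0.2.20", 40005, 80, "No"),
    ("203.0.113.77", "192.0.2.20", 40006, 80, "Yes"),
    ("203.0.113.77", "192.0.2.10", 40007, 22, "Yes"),
]]


def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain(rows, attr, target="intrusion"):
    """Reduction in class entropy obtained by splitting rows on attr."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def split_info(rows, attr):
    """Entropy of the attribute's own value distribution."""
    return entropy([row[attr] for row in rows])


def gain_ratio(rows, attr, target="intrusion"):
    info = split_info(rows, attr)
    return gain(rows, attr, target) / info if info > 0 else 0.0


for attr in ("src_ip", "dst_ip", "src_port", "dst_port"):
    print(f"{attr:9s} gain={gain(ROWS, attr):.3f} "
          f"gain_ratio={gain_ratio(ROWS, attr):.3f}")
```

On this toy data the many-valued source port attains the largest raw gain, while the source IP attains the largest gain ratio, which illustrates why the ratio is preferred when a super attribute is present.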
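The recursive construction can be sketched as follows. This is an ID3-style procedure that selects attributes by gain ratio, as one reading of the steps above; the training records are hypothetical (not Table 1), the nested-dictionary tree encoding is an assumption, and a majority vote is used as an assumed fallback when the attributes run out.

```python
from collections import Counter
from math import log2

# Hypothetical training records (NOT the study's Table 1); protocol is TCP throughout
COLUMNS = ("src_ip", "dst_ip", "src_port", "dst_port", "intrusion")
ROWS = [dict(zip(COLUMNS, r)) for r in [
    ("203.0.113.5",  "192.0.2.10", 40001, 22, "Yes"),
    ("203.0.113.5",  "192.0.2.10", 40002, 22, "Yes"),
    ("203.0.113.5",  "192.0.2.10", 40003, 80, "No"),
    ("198.51.100.9", "192.0.2.10", 40004, 22, "No"),
    ("198.51.100.9", "192.0.2.20", 40005, 80, "No"),
    ("203.0.113.77", "192.0.2.20", 40006, 80, "Yes"),
    ("203.0.113.77", "192.0.2.10", 40007, 22, "Yes"),
]]


def entropy(values):
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attr, target):
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    info = entropy([row[attr] for row in rows])
    return gain / info if info > 0 else 0.0


def build_tree(rows, attrs, target="intrusion"):
    """Return a leaf label, or {attribute: {value: subtree}} for a decision node."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:          # all records classified perfectly
        return labels[0]
    if not attrs:                      # attributes ran out: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    remaining = [a for a in attrs if a != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        children[value] = build_tree(subset, remaining, target)
    return {best: children}


tree = build_tree(ROWS, ["src_ip", "dst_ip", "src_port", "dst_port"])
print(tree)
```

On this toy data the root tests the source IP, and only the branch with mixed labels is split further, on the destination port.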
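One way to read this pruning step is to drop every branch that ends in a normal ('No') leaf and keep only the paths that lead to an intrusion. The sketch below follows that reading; the tree and its nested-dictionary encoding are hypothetical.

```python
# Hypothetical tree: {attribute: {value: subtree}}, leaves are "Yes" or "No"
tree = {
    "src_ip": {
        "203.0.113.5": {"dst_port": {22: "Yes", 80: "No"}},
        "198.51.100.9": "No",
        "203.0.113.77": "Yes",
    }
}


def prune(node, keep="Yes"):
    """Remove branches that do not lead to a `keep` leaf; None means nothing is left."""
    if not isinstance(node, dict):
        return node if node == keep else None
    attribute, children = next(iter(node.items()))
    kept = {}
    for value, subtree in children.items():
        pruned = prune(subtree, keep)
        if pruned is not None:
            kept[value] = pruned
    return {attribute: kept} if kept else None


print(prune(tree))
```

Any record that falls off the pruned tree is then treated as normal by default.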

Rule extraction and characteristic of intrusion:
The knowledge represented in decision trees can be extracted and represented in the form of IF-THEN rules. One rule can be created for each path from the root to a leaf node. Each attribute-value pair along a given path forms a conjunction in the rule antecedent ("IF" part). The leaf node holds the class prediction, forming the rule consequent ("THEN" part). The IF-THEN rules may be easier for humans to understand [2,6]; they are shown in Fig. 10.
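The path-to-rule conversion can be sketched as below. The tree, its nested-dictionary encoding and the attribute names are hypothetical; only paths that end in an intrusion leaf are turned into rules, since those are the ones of interest for prevention.

```python
# Hypothetical tree: {attribute: {value: subtree}}, leaves are "Yes" or "No"
tree = {
    "src_ip": {
        "203.0.113.5": {"dst_port": {22: "Yes", 80: "No"}},
        "198.51.100.9": "No",
        "203.0.113.77": "Yes",
    }
}


def extract_rules(node, label="Yes", path=()):
    """Return one {attribute: value} antecedent per root-to-leaf path ending in `label`."""
    if not isinstance(node, dict):
        return [dict(path)] if node == label else []
    attribute, children = next(iter(node.items()))
    rules = []
    for value, subtree in children.items():
        rules.extend(extract_rules(subtree, label, path + ((attribute, value),)))
    return rules


def format_rule(conditions, label="Yes"):
    """Render the conjunction of attribute-value pairs as an IF-THEN rule."""
    antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions.items())
    return f"IF {antecedent} THEN intrusion = {label}"


rules = extract_rules(tree)
for rule in rules:
    print(format_rule(rule))
```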

Implementation to firewall rules:
The example rules extracted from the decision tree, shown in Fig. 10, represent characteristics of intrusion; their implementation as firewall rules is shown in Fig. 11. Note that every rule includes the TCP protocol. These firewall policy rules represent preventive action: every network packet that matches the criteria of one of these rules is dropped (DROP).
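Translating extracted rules into firewall rules can be sketched as follows. The sketch assumes an iptables-style firewall and the INPUT chain, neither of which is confirmed by the text; the rule dictionaries and attribute names are hypothetical. It only builds the command strings and does not execute them.

```python
# Mapping from (hypothetical) attribute names to iptables match options
FLAGS = {
    "src_ip": "-s",
    "dst_ip": "-d",
    "src_port": "--sport",
    "dst_port": "--dport",
}


def to_iptables(rule, chain="INPUT", protocol="tcp"):
    """Build an iptables command that drops packets matching the rule's conditions."""
    parts = ["iptables", "-A", chain, "-p", protocol]  # every rule carries the TCP protocol
    for attribute in ("src_ip", "dst_ip", "src_port", "dst_port"):
        if attribute in rule:
            parts += [FLAGS[attribute], str(rule[attribute])]
    parts += ["-j", "DROP"]
    return " ".join(parts)


# Hypothetical intrusion rules, e.g. as extracted from a decision tree
intrusion_rules = [
    {"src_ip": "203.0.113.5", "dst_port": 22},
    {"src_ip": "203.0.113.77"},
]
for rule in intrusion_rules:
    print(to_iptables(rule))
```

The protocol option is placed before the port options because port matching in iptables depends on the protocol having been specified.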

CONCLUSION
Network traffic logs describe patterns of behavior in network traffic associated with intrusive or normal activity. The decision tree technique is well suited to finding intrusion characteristics in network traffic logs for an IDS, and these characteristics are then implemented in the firewall as prevention. This combination is called an IPS. In addition, the technique yields efficient and optimized firewall rules, for example by avoiding redundancy.