Application of Adaptive Neuro-Fuzzy Inference System for Information Security

Problem statement: Computer networks are expanding at a very fast rate and the number of network users is increasing day by day. For networks to be fully utilized, they need to be secured against many threats, including malware, which is harmful software capable of damaging data and systems. Fuzzy rule-based classification systems have been an active research area in recent years because of their classification capability. Approach: This study presents a neural fuzzy classifier based on the Adaptive Neuro-Fuzzy Inference System (ANFIS) for malware detection. First, the malware exe files were analyzed and the most important API calls were selected and used as training and testing datasets. Using the training dataset, the ANFIS classifier learned how to detect the malware in the test dataset. Results and Conclusion: The performance of the neuro-fuzzy classifier was evaluated based on training performance and classification accuracy. The results show that the proposed neuro-fuzzy classifier can detect malware exe files effectively.


INTRODUCTION
Computer networks are expanding at a very fast rate and the number of network users is increasing day by day. For networks to be fully utilized, they need to be secured against many threats, including malware, which is harmful software capable of damaging data and systems. The detection of malware and intruders has therefore become an important part of any modern network for guaranteeing the security of information systems (Kim et al., 2011; Zhou et al., 2010; Beg et al., 2010; Altaher et al., 2011).
Technical reports from detection vendors increasingly warn about new malware and report a growing number of infected computer systems. McAfee released its Threat Report for the Fourth Quarter of 2011, which indicated that the amount of malware continued to increase. According to the Q4 2011 report, McAfee Labs detected approximately 9,300 new malicious sites every day, up from 6,500 per day in Q3, and counted more than 700,000 active malicious URLs in its database (McAfee Labs, 2011).
Neural networks and neuro-fuzzy techniques have been used effectively in various fields of science, e.g., detection systems, classification, prediction, intelligent systems and decision making.
Application of adaptive neuro-fuzzy inference system for identification of malware portable executable files:
This study presents a neural fuzzy classifier based on the Adaptive Neuro-Fuzzy Inference System (Jang, 1993) for malware detection. First, the malware exe files were analyzed and the most important API calls were selected and used as training and testing datasets. Using the training dataset, the ANFIS classifier learned how to detect the malware in the test dataset.

Malware exe file analysis and feature extraction:
The malware exe files were analyzed to extract Application Programming Interface (API) calls as features for differentiating between normal files and malware files. The API features were then ranked to determine the most effective ones, i.e., those that best reflect the behavior of the malware files. We used the Information Gain Ratio (IGR) method (Mori, 2002), which works by extracting similarities between sets of files and then gives the highest weight to the most effective features based on the class (malware or normal) to which the files belong, as expressed in Eq. 1 and 2.
In Eq. 1, gain_r(X, C) represents the gain ratio of the frequency of feature X in class C. In Eq. 2, the terms denote the frequency of feature X in class C, the i-th sub-class of C (C_i) and the number of features in C_i (|C_i|), respectively. All the features selected for use by the classifier were ranked using the information gain ratio method. The higher the information gain of a feature, the more helpful it is in differentiating between malware and normal files.
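As a rough illustration of the ranking step, the sketch below computes a gain ratio for binary API-call features and sorts the features by it. It assumes the textbook (C4.5-style) definition, gain ratio = information gain / split information, which may differ in detail from the variant given by Mori (2002). The function names, the API names used as features and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Textbook gain ratio of one discrete feature with respect to the labels."""
    n = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Information gain = class entropy minus entropy remaining after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information = entropy of the feature's own value distribution.
    split_info = entropy(feature)
    return gain / split_info if split_info > 0 else 0.0


def rank_features(features, labels):
    """Return feature names sorted from highest to lowest gain ratio."""
    return sorted(features,
                  key=lambda name: gain_ratio(features[name], labels),
                  reverse=True)


# Hypothetical toy data: 1 = the file imports the API call, 0 = it does not.
labels = ["malware", "malware", "normal", "normal"]
features = {
    "WriteProcessMemory": [1, 1, 0, 0],  # separates the two classes perfectly
    "CreateFileA": [1, 0, 1, 0],         # carries no class information
}
ranking = rank_features(features, labels)
```

In this toy case the first feature has a gain ratio of 1 and the second a gain ratio of 0, so the first is ranked highest. A constant feature has zero split information and is assigned a gain ratio of 0 to avoid division by zero.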
Adaptive neuro-fuzzy inference system:
Fuzzy inference system: Fuzzy Inference Systems (FIS) are efficient techniques for studying the behavior of nonlinear systems using fuzzy logic rules. ANFIS is a neuro-fuzzy system that combines the learning techniques of neural networks with the efficiency of fuzzy inference systems (Esposito et al., 2000). ANFIS uses a hybrid learning algorithm to identify the parameters of Sugeno-type fuzzy inference systems. It combines the least-squares method with the backpropagation gradient descent method to train the FIS membership function parameters to emulate a given training dataset. ANFIS can also be called with optional parameters to validate the model.
ANFIS architecture: The ANFIS structure is similar to a neural network structure and is based on the Takagi-Sugeno model, as illustrated in Fig. 1.
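The least-squares half of the hybrid learning algorithm can be sketched as follows. The idea is that once the membership function parameters are held fixed, the output of a first-order Sugeno system is linear in the consequent parameters (p_i, q_i, r_i), so these can be estimated by ordinary least squares. This is a minimal sketch under stated assumptions, not the MATLAB implementation used in this study: it takes the normalized rule firing strengths as given inputs, solves the normal equations with a small Gaussian elimination routine, and all function names are hypothetical.

```python
def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [list(row) + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def design_row(x, y, w_bar):
    """One row of the linear system: [w1*x, w1*y, w1, w2*x, w2*y, w2, ...]."""
    row = []
    for wb in w_bar:
        row += [wb * x, wb * y, wb]
    return row


def fit_consequents(samples, w_bars, targets):
    """Least-squares estimate of all (p_i, q_i, r_i) via the normal equations."""
    rows = [design_row(x, y, wb) for (x, y), wb in zip(samples, w_bars)]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, atb)


def predict(params, x, y, w_bar):
    """Model output: sum over rules of w_bar_i * (p_i*x + q_i*y + r_i)."""
    return sum(a * b for a, b in zip(design_row(x, y, w_bar), params))
```

Here `samples` is a list of (x, y) input pairs, `w_bars` holds the normalized firing strengths of the rules for each sample, and `targets` holds the desired outputs. In the full hybrid scheme this step alternates with a gradient descent update of the membership function parameters, which is not shown. The normal equations are used only to keep the sketch dependency-free; a QR- or SVD-based solver is numerically preferable.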
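According to the Sugeno fuzzy model, the rule set is as follows:
• If $x$ is $A_1$ and $y$ is $B_1$ then $f_1 = p_1 x + q_1 y + r_1$
• If $x$ is $A_2$ and $y$ is $B_2$ then $f_2 = p_2 x + q_2 y + r_2$
Layer 1: Layer 1 is the input and fuzzification layer. Every node $i$ in this layer is an adaptive node with a node function (Eq. 3 and 4):
$$O_{1,i} = \mu_{A_i}(x), \quad \text{for } i = 1, 2 \tag{3}$$
$$O_{1,i} = \mu_{B_{i-2}}(y), \quad \text{for } i = 3, 4 \tag{4}$$
Layer 2: Layer 2 is the rule layer. Each node in this layer computes the impact (firing strength) of each rule through multiplication (Eq. 5):
$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad \text{for } i = 1, 2 \tag{5}$$
Layer 3: Layer 3 is the normalization layer. Each neuron in this layer computes the normalized effect of a given rule (Eq. 6):
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad \text{for } i = 1, 2 \tag{6}$$
Layer 4: Parameters in this layer are considered as consequent parameters (Eq. 7):
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i (p_i x + q_i y + r_i), \quad \text{for } i = 1, 2 \tag{7}$$
Layer 5: This layer calculates the sum of all incoming signals (Eq. 8):
$$O_5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{8}$$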
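The five layers described above can be sketched as a single forward pass for the two-rule, two-input case. The membership function shape is not stated in the text, so Gaussian membership functions are an assumption here, and the function and parameter names are hypothetical.

```python
import math


def gauss_mf(value, center, sigma):
    """Gaussian membership function (an assumed shape, not taken from the paper)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """Forward pass of a two-rule first-order Sugeno ANFIS.

    premise_a, premise_b: [(center, sigma), (center, sigma)] for A1, A2 and B1, B2.
    consequents: [(p1, q1, r1), (p2, q2, r2)].
    Returns the network output and the normalized firing strengths.
    """
    # Layer 1: fuzzification of the two inputs.
    mu_a = [gauss_mf(x, c, s) for c, s in premise_a]
    mu_b = [gauss_mf(y, c, s) for c, s in premise_b]
    # Layer 2: firing strength of each rule (product of memberships).
    w = [mu_a[i] * mu_b[i] for i in range(2)]
    # Layer 3: normalization of the firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequent of each rule.
    f = [p * x + q * y + r for p, q, r in consequents]
    layer4 = [w_bar[i] * f[i] for i in range(2)]
    # Layer 5: overall output as the sum of incoming signals.
    return sum(layer4), w_bar


# Hypothetical parameters: both rules fire equally at (0.5, 0.5).
output, strengths = anfis_forward(
    0.5, 0.5,
    premise_a=[(0.0, 1.0), (1.0, 1.0)],
    premise_b=[(0.0, 1.0), (1.0, 1.0)],
    consequents=[(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
)
```

With these symmetric parameters the two rules receive equal normalized weight, so the output is the average of the two rule consequents. In a malware classifier the inputs would be the selected API features rather than the generic x and y used here.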

Experiment and results:
We used MATLAB version 7.10 to implement the adaptive neuro-fuzzy inference system. In our test, we used a dataset consisting of 288 normal executable files and 416 malware executable files. The dataset was downloaded from Nexginrc (2010) and divided into two datasets, training and testing. The ANFIS classifier was trained using the training dataset. Figure 2 shows that the training error and testing error decayed as the number of epochs increased. Based on the results in Fig. 2, the Adaptive Neuro-Fuzzy Inference System (ANFIS) proved capable of detecting the malware executable files, with the training and testing errors decreasing while the level of accuracy increased. The structure of ANFIS is to some extent similar to that of a neural network: it maps inputs to outputs through input and output membership functions, and the associated parameters can be used to interpret the input/output map. The parameters associated with the input membership functions change through the learning process. Figure 3 shows the training membership functions generated by the developed ANFIS classifier.
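The evaluation procedure described above (a training/testing split, an error measure tracked over epochs and a classification accuracy) can be sketched as follows. The split ratio, the random seed, the 0.5 decision threshold, the label encoding (1 = malware, 0 = normal) and the use of root-mean-square error are all assumptions for illustration; the text does not specify them.

```python
import math
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle and split items into training and testing parts (assumed ratio)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(outputs, targets):
    """Root-mean-square error between model outputs and target labels."""
    total = sum((o - t) ** 2 for o, t in zip(outputs, targets))
    return math.sqrt(total / len(targets))


def classify(outputs, threshold=0.5):
    """Map continuous outputs to labels: 1 = malware, 0 = normal (assumed)."""
    return [1 if o >= threshold else 0 for o in outputs]


def accuracy(predicted, targets):
    """Fraction of samples whose predicted label matches the target label."""
    return sum(p == t for p, t in zip(predicted, targets)) / len(targets)


# 288 normal and 416 malware files, as in the dataset described above.
dataset = [("normal", 0)] * 288 + [("malware", 1)] * 416
train_set, test_set = split_dataset(dataset)
```

Computing `rmse` on both parts after each training epoch gives curves of the kind reported in Fig. 2, and `accuracy(classify(outputs), targets)` on the testing part gives the classification accuracy.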

CONCLUSION
The objective of this study was to develop an ANFIS classifier for malware exe file identification. It was observed that the ANFIS classifier learned how to detect the malware in the test dataset. The performance of the neuro-fuzzy classifier was evaluated based on training performance and classification accuracy. The results show that the proposed neuro-fuzzy classifier can detect malware exe files effectively.