Predicting Packet Transmission Data over IP Networks Using Adaptive Neuro-Fuzzy Inference Systems

Problem statement: Statistical modeling for predicting network traffic has become a major tool in networking and is of significant interest in many domains: adaptive applications, congestion and admission control, wireless networks, network management and network anomalies. To understand the properties of IP-network traffic and system conditions, several researchers have reported analyses based on measured network traffic data. The goal of the present contribution was to complement this previous research by predicting network traffic data. Approach: The Adaptive Neuro-Fuzzy Inference System (ANFIS) is realized by an appropriate combination of fuzzy systems and neural networks. Its applications have increased in recent years and span several disciplines, with high reported accuracy. For this reason, we used measured input and output data of packet transmission over Internet Protocol (IP) networks as the input and output of ANFIS to develop a predictive model. Results: ANFIS was compared with an existing model based on a Volterra system with Laguerre functions. The results demonstrate that the sequences of generated values have the same statistical characteristics as the observed ones. Furthermore, the relative error of the ANFIS model was lower than that obtained with the Volterra system model. Conclusion: The developed model fits the real data well and can be used for prediction with high accuracy.


INTRODUCTION
Several works in the literature have focused on developing dynamic controllers for computer data networks [1][2][3], which exhibit dynamic characteristics in the time domain. Indeed, with the growing size and diversity of computer networks, network traffic, including real-time applications, has increased and shows much more complex behavior. To understand the properties of IP-network traffic and system conditions, many analyses based on measured network traffic data have been reported in the literature [4,5]. Statistical modeling of internet traffic, optimization of response time and guaranteeing the reliability of the network are therefore very important issues for improving network management and design [6]. Furthermore, internet traffic is increasingly dominated by new real-time multimedia applications such as audio-video on demand, IP telephony and multimedia teleconferencing, all of which prefer some guarantee of an acceptable Quality of Service (QoS) in the transmission and delivery of data [7]. These applications have diverse traffic characteristics, each with its own requirements and techniques.
Statistical modeling for predicting network traffic has become a major tool in networking and is of significant interest in many domains: adaptive applications, congestion and admission control, wireless networks, network management and network anomalies [8][9][10][11]. An accurate traffic prediction model should be able to capture the prominent traffic characteristics, e.g., short- and long-range dependence, self-similarity at large time scales and multifractality at small time scales [12].
Many models are used in the literature for analyzing network traffic time series, for instance Auto Regressive (AR), Moving Average (MA), Auto Regressive Moving Average (ARMA), Fractional Auto Regressive Integrated Moving Average (FARIMA) [13][14][15] and Auto Regressive Integrated Moving Average (ARIMA) with Generalized Auto Regressive Conditional Heteroscedasticity (ARIMA/GARCH) [12]. Volterra models [5] and Higher Order Cumulant (HOC) techniques [16] are also used, respectively, for analyzing IP networks and for modeling time series of internet traffic. Other authors have used artificial intelligence techniques such as fuzzy logic, which offer many advantages for solving identification and prediction problems. Indeed, the theory of fuzzy logic provides a mathematical framework to capture the uncertainties associated with human cognitive processes, such as thinking and reasoning [17,18]. Furthermore, applications of this theory have increased in recent years and are multidisciplinary in nature, including automatic control, signal processing and time series prediction.
The fuzzy logic technique is also important for function approximation and for modeling static and dynamic systems. As a result of this research, a fuzzy system can now, in addition to taking linguistic information (linguistic rules) from human experts, adapt itself using numerical data (input/output pairs) to achieve good performance. The fuzzy modeling of dynamic systems has been addressed, as well as methods for constructing fuzzy models from knowledge and data [19]. Furthermore, the fuzzy inference system can be mapped onto a neural network-like architecture [20].
Every intelligent technique has particular computational properties, for example the capacity to learn or the ability to explain its decisions. These properties determine which technique is appropriate for a given problem. Neural networks are good at identifying models but poor at explaining how they reach their decisions. Fuzzy systems, which can reason with imprecise information, are good at explaining their decisions but cannot automatically acquire the rules they use to make them. These limitations have oriented research toward neuro-fuzzy systems, in which the two techniques are combined to overcome them. Neuro-fuzzy systems apply various learning techniques developed for fuzzy inference systems and neural networks.
In this study, our contribution is to complement the cited literature on internet traffic prediction using the adaptive neuro-fuzzy inference system (ANFIS). We used packet transmission data over IP networks as the input and output of ANFIS, which in turn was used to obtain an adequate model. To evaluate this model, we compare it with a Volterra system model with Laguerre functions using the relative error criterion.

MATERIALS AND METHODS
One of the most common neuro-fuzzy network structures is the adaptive neuro-fuzzy inference system [21][22][23]. ANFIS is realized by an appropriate combination of fuzzy systems and neural networks. This hybrid combination makes it possible to use both the linguistic and the numeric power of intelligent systems. In fuzzy systems theory, different fuzzification and defuzzification mechanisms with different rule-based structures can yield various solutions to a given task. ANFIS is a technique for automatically tuning first-order Sugeno-type inference systems based on training data [18,21]. It can be applied in different domains such as nonlinear function modeling [24,21], time series prediction (solar radiation, internet traffic, wind speed) [25,21], parameter identification for control systems [5] and fuzzy controller design [17,26,27].
Without loss of generality, we assume that the fuzzy inference system under consideration has two inputs x and y and one output f. For a first-order Sugeno fuzzy model [28,29], a rule base with two fuzzy if-then rules can be expressed as:
• Rule 1: If x is $A_1$ and y is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$
• Rule 2: If x is $A_2$ and y is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$
Figure 1 shows the reasoning mechanism for this Sugeno model [22,23], where x and y are the inputs, f is the output, $A_1$, $A_2$, $B_1$, $B_2$ are the input membership functions, $w_1$, $w_2$ are the rule firing strengths and $\{p_i, q_i, r_i\}$ is the parameter set.
In this structure, the antecedent of each rule contains fuzzy sets (as membership functions) and the consequent is a first-order polynomial (a crisp function) [23]. We used the Gaussian membership function with the product inference rule at the fuzzification level. The fuzzifier outputs the firing strength of each rule. The vector of firing strengths is normalized and the resulting vector is defuzzified using the first-order Sugeno model. The corresponding equivalent ANFIS architecture is shown in Fig. 2 [21]. Nodes of the same layer have the same function, as described below. We note that $O_{l,i}$ represents the output of the i-th node in layer l.

Layer 1:
Each node i in this layer is an adaptive node with a node output defined by:
$$O_{1,i} = \mu_{A_i}(x) \quad \text{for } i = 1, 2$$
or
$$O_{1,i} = \mu_{B_{i-2}}(y) \quad \text{for } i = 3, 4$$
where x and y are the inputs to the node and $A_i$ and $B_{i-2}$ are the fuzzy sets associated with this node. In other words, the outputs of this layer are the membership values of the premise part. In this study we used Gaussian membership functions described by:
$$\mu_{A_{ij}}(x_j) = \exp\left[-\frac{1}{2}\left(\frac{x_j - c_{ij}}{\sigma_{ij}}\right)^2\right]$$
where $c_{ij}$ and $\sigma_{ij}$ characterize the center and width of the i-th rule's j-th membership function respectively. Parameters in this layer are referred to as premise parameters.
Layer 2: Each node in this layer is a fixed node labeled Π, which multiplies the incoming signals and outputs the product. For example:
$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2$$
Every node in this layer calculates the firing strength of a rule.

Layer 3:
Each node in this layer is a fixed node labeled N. The i-th node calculates the ratio of the i-th rule's firing strength to the sum of all rules' firing strengths:
$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$$
For convenience, the outputs of this layer will be called normalized firing strengths.

Layer 4:
Every node i in this layer is an adaptive node with a node function:
$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i)$$
where $\bar{w}_i$ is the output of layer 3. Parameters in this layer will be referred to as consequent parameters.

Layer 5:
The single node in this layer is a fixed node labeled ∑, which computes the overall output as the summation of all incoming signals:
$$O_{5,1} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$
Using this method, a fuzzy inference system is designed based on the system specifications. This initial model is converted into a neuro-fuzzy network and then trained with experimental data measured on the real system. Therefore, modeling and prediction using ANFIS start by obtaining a data set (input-output data points), which is divided into training and validation data sets. Figure 3 shows the ANFIS training and modeling processes [20].
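The five layers described above can be sketched as a forward pass in Python. This is a minimal illustration and not the implementation used in this study: the function names, the parameter values and the exact Gaussian form (with a factor 1/2 in the exponent) are assumptions. The last function illustrates one commonly used way to tune the consequent parameters: for fixed premise parameters the overall output is linear in $\{p_i, q_i, r_i\}$, so they can be estimated by ordinary least squares. The training algorithm actually used here is not detailed in the text, so this step is also an assumption.

```python
import numpy as np

def gaussian_mf(v, center, width):
    """Gaussian membership value (assumed form, factor 1/2 in the exponent)."""
    return np.exp(-0.5 * ((v - center) / width) ** 2)

def normalized_firing_strengths(x, y, premise):
    """Layers 1 to 3 for the two-input, two-rule model."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    # Layer 1: membership values of x in A1, A2 and of y in B1, B2.
    mu_a = np.stack([gaussian_mf(x, *premise["A1"]), gaussian_mf(x, *premise["A2"])])
    mu_b = np.stack([gaussian_mf(y, *premise["B1"]), gaussian_mf(y, *premise["B2"])])
    # Layer 2: firing strength of each rule (product inference).
    w = mu_a * mu_b
    # Layer 3: normalized firing strengths (they sum to 1 for each sample).
    return x, y, w / w.sum(axis=0)

def anfis_forward(x, y, premise, consequent):
    """Overall output; consequent has shape (2, 3) with rows [p_i, q_i, r_i]."""
    x, y, w_bar = normalized_firing_strengths(x, y, premise)
    p, q, r = (np.asarray(consequent, dtype=float)[:, k, None] for k in range(3))
    # Layer 4: normalized firing strength times the first-order rule output.
    rule_outputs = w_bar * (p * x + q * y + r)
    # Layer 5: sum of all incoming signals.
    return rule_outputs.sum(axis=0)

def fit_consequents(x, y, target, premise):
    """Least-squares estimate of the consequent parameters (premise fixed)."""
    x, y, w_bar = normalized_firing_strengths(x, y, premise)
    columns = []
    for i in range(2):
        columns += [w_bar[i] * x, w_bar[i] * y, w_bar[i]]
    design = np.column_stack(columns)
    theta, *_ = np.linalg.lstsq(design, np.asarray(target, dtype=float), rcond=None)
    return theta.reshape(2, 3)

# Hypothetical (center, width) pairs and consequent parameters, for illustration only.
premise = {"A1": (0.0, 1.0), "A2": (2.0, 1.0), "B1": (0.0, 1.0), "B2": (2.0, 1.0)}
consequent = np.array([[1.0, 0.5, 0.0], [-0.5, 1.0, 2.0]])
print(anfis_forward(1.0, 1.0, premise, consequent))
```

At the point x = y = 1 both rules fire equally in this example, so the output is the average of the two rule outputs $f_1$ and $f_2$. In a complete training loop the premise parameters would also be adapted, which this sketch leaves out.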
The adaptive neuro-fuzzy inference system is applied to internet traffic data. The internet is a global collection of autonomously administered packet-switched networks that use the TCP/IP protocols to form a single virtual network allowing every part to communicate with any other. Over the past few years the internet has grown into a vast collection of information and knowledge. The number of computers with registered IP addresses has increased considerably in recent years.
There are four types of data traffic expected to be transported over cellular systems: background, interactive, streaming and conversational [30,31]. The system condition of video streams over an IP network depends on several factors, including packet transmission delay and video transmission rate, so it is naturally categorized as a "best effort service" in the internet [5]. Given the complexity of networks and the diversity of the problems involved in network management, high-throughput networks call for a solution that is independent of manufacturer-specific systems and networks and of the specific needs of users [10].
We employ the adaptive neuro-fuzzy inference system technique as a way of analyzing the nonlinear time series of packet transmission data over IP networks. Figure 4 shows the concept of the analysis of responses in IP-based packet transmission (when video packets from a video server are transmitted to a client via an IP network) [5]. The parameters T1 and T2 represent, respectively, the time interval between departures of consecutive packets from the video server and the total transmission time of each packet between the video server and the terminating side. The considered data A and B (Fig. 5) were measured with 100 and 8 Mbps final-leg access lines respectively [5]. A and B contain eight data sets consisting of 3000 packets measured from 20:00 at 30 min intervals on April 30, 2003. Figure 6 shows the real packet transmission data for A and B. We use the Adaptive Neuro-Fuzzy Inference System (ANFIS) to predict the real data of packet transmission over IP networks [5]. Figure 7 shows the initial and final membership functions of packet transmission over IP networks.
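As an illustration of the two quantities T1 and T2 defined above, the following Python sketch derives them from per-packet timestamps. The function name, the argument names and the assumption that departure and arrival times are available in seconds for every packet are hypothetical; the measured data used in this study were provided directly as T1 and T2 series.

```python
import numpy as np

def packet_intervals(departure_times, arrival_times):
    """Return (T1, T2) from per-packet departure and arrival timestamps.

    T1[k] is the interval between the departures of packets k and k+1.
    T2[k] is the transmission time of packet k from server to client.
    """
    departure = np.asarray(departure_times, dtype=float)
    arrival = np.asarray(arrival_times, dtype=float)
    if departure.shape != arrival.shape:
        raise ValueError("one departure and one arrival time per packet are required")
    t1 = np.diff(departure)      # n - 1 values for n packets
    t2 = arrival - departure     # n values, one per packet
    return t1, t2

# Three hypothetical packets (times in seconds).
t1, t2 = packet_intervals([0.00, 0.01, 0.03], [0.05, 0.07, 0.08])
```

Since T1 has one value fewer than T2, the two series have to be aligned before being used as an input-output pair; how this alignment was done for the measured data is not specified here.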

RESULTS AND DISCUSSION
Prediction is important for generating data when real measurements are not available. To assess the predictive capacity of the identified ANFIS model, we use the packet transmission data over IP networks for measurement A. These data are divided into two parts: training and checking data.
To evaluate the performance of the predictive model, we used the Root Mean Square Error (RMSE) and the Variance Accounted For (VAF) criteria [32,33]:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{m=1}^{N}\left(y_m - \hat{y}_m\right)^2}$$
$$\mathrm{VAF} = \left[1 - \frac{\mathrm{var}(y - \hat{y})}{\mathrm{var}(y)}\right] \times 100\%$$
where $y_m$ and $\hat{y}_m$ represent the real and estimated data respectively and N represents the sample size. Table 1 shows the RMSE and VAF values obtained for measurement data A and B. This result demonstrates that the ANFIS model produced more precise results for measurement A than for B. We therefore analyze measurement A in more detail.
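The two criteria reported in Table 1 can be computed as in the following Python sketch. The function names and the sample values are illustrative, and the VAF expression is the commonly used definition, 100·(1 − var(y − ŷ)/var(y)), which is assumed to match the criterion used here.

```python
import numpy as np

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((measured - predicted) ** 2)))

def vaf(measured, predicted):
    """Variance accounted for, in percent (100 means a perfect fit)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * (1.0 - np.var(measured - predicted) / np.var(measured)))

# Hypothetical values, for illustration only.
y_measured = [1.0, 2.0, 3.0, 4.0]
y_predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(y_measured, y_predicted), vaf(y_measured, y_predicted))
```

A lower RMSE and a VAF closer to 100% both indicate a better fit. The two criteria are complementary: RMSE is expressed in the units of the data, whereas VAF is a relative measure.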
For this reason, Fig. 8 plots the evolution of the measured and predicted values of packet transmission data. We observe almost complete agreement between measured and predicted data. Fig. 9a shows the cumulative distributions of the measured and predicted data. Figure 9 clearly demonstrates the similarity between measured and predicted values. The identified ANFIS model can therefore be used for predicting packet transmission data over IP networks. A further test assesses the ability of the model to generate data having the same frequency distribution as the real measurements (Fig. 9b). The frequency distribution comparison confirms the accuracy of the proposed model. Furthermore, the scatter diagram (Fig. 10) compares measured and predicted data using the ANFIS model, which constitutes another means of testing the performance of the model. In Fig. 10, we observe a strong concentration of the points $(y_m, \hat{y}_m)$ around the bisector. This distribution of the values around the bisector supports the validity of the model developed using the ANFIS technique.
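The cumulative distribution comparison described above is visual. As a hedged sketch, the empirical cumulative distributions of the measured and predicted series can also be computed and compared numerically in Python. The function names are hypothetical, and the maximum gap between the two curves is a summary added here for illustration; it is not a criterion reported in this study.

```python
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    # Count of sample values <= each grid point, divided by the sample size.
    return np.searchsorted(ordered, grid, side="right") / ordered.size

def max_cdf_gap(measured, predicted):
    """Largest absolute difference between the two empirical CDFs."""
    grid = np.sort(np.concatenate([measured, predicted]))
    return float(np.max(np.abs(ecdf(measured, grid) - ecdf(predicted, grid))))
```

A gap close to 0 means that the two cumulative distributions nearly coincide, while a gap of 1 means that the two samples do not overlap at all.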
Furthermore, we use the relative error criterion to compare the performance of the model developed with the adaptive neuro-fuzzy inference system (ANFIS) with that of the model based on the Volterra system with Laguerre functions developed by Masugi and Takuma [5].
The results are shown in Table 2. The ANFIS model produced more accurate results than the Volterra system model. The ANFIS model can therefore be used to analyze and predict packet transmission data over IP networks.

CONCLUSION
In this study, we used the adaptive neuro-fuzzy inference system technique to develop a model for predicting packet transmission data over IP networks from a video server, using real data observed in a test-bed connected to the internet. We used two types of data: the time intervals between consecutive packets from a video server at the originating side and the transmission time of packets between the originating and terminating sides, as the input and output of the ANFIS model respectively. The values predicted by the model are very similar to the real values. The performance test provides satisfactory results and shows good precision in terms of the variance accounted for and the root mean square error criteria. Furthermore, a comparison was performed between the ANFIS model and the Volterra system model. The results demonstrate the advantage of the ANFIS model over the Volterra model. The ANFIS model can therefore be used as an alternative method to generate network traffic data.

ACKNOWLEDGEMENT
We would like to thank M. Masugi and T. Takuma of the NTT Energy and Environment System Laboratories, Tokyo, Japan for giving us the packet transmission data in IP Networks.