Development of Software Reliability Growth Models for Industrial Applications Using Fuzzy Logic

Problem statement: Software Reliability Growth Models (SRGM) play a major role in monitoring progress and in accurately predicting the number of faults in software during both development and testing. They also help define the release date of a software product, allocate resources and estimate the cost of software maintenance, which in turn helps achieve the required reliability level of a software product. Approach: We investigated the use of fuzzy logic in building SRGM to estimate the expected software faults during the testing process. Results: The proposed fuzzy model consists of a collection of linear sub-models, based on the Takagi-Sugeno technique and combined using fuzzy membership functions, that represent the expected software faults as a function of historical measured faults. A data set provided by John Musa of Bell Telephone Laboratories (i.e., real time control, military and operating system applications) was used to show the potential of fuzzy logic in solving the software reliability modeling problem. Conclusion: The developed models provided high performance modeling capabilities.


INTRODUCTION
The use of Machine Learning (ML) and Soft Computing techniques, such as Genetic Algorithms (GAs), Genetic Programming (GP), Evolutionary Strategies (ESs), Artificial Neural Networks (ANN), Fuzzy Logic (FL) and Particle Swarm Optimization (PSO), to solve software engineering problems has expanded in recent years. Sheta (2007) explored the use of the PSO algorithm to estimate SRGM parameters. The proposed method showed significant advantages in handling a variety of modeling problems such as the Exponential Model (EXPM), Power Model (POWM) and Delayed S-Shaped Model (DSSM). Parameter estimation of the hyper-geometric distribution software reliability growth model by genetic algorithms was presented by Minohara and Tohma (1995). Predicting accumulated faults during the software testing process using parametric and non-parametric models was explored in many articles (Aljahdali et al., 2001; Sheta, 2006). Zeng and Rine (2004) provided a strategic solution for estimating software defect fix effort using a self-organizing neural network. GP was successfully used to find a model that fits the given data points without making any assumptions about the model structure (Afzal and Torkar, 2008; Paramasivam, 2009). GP was found to be a powerful technique for developing software reliability growth models.
Fuzzy logic has been successfully used to solve a variety of problems in system identification, signal processing and control (White and Sofge; Wang and Mendel, 1993; Brown and Harris, 1994). Fuzzy modeling has been regarded as one of the key problems in fuzzy systems research (Wang, 1992; Dubois and Prade, 1992). In past years, research on the development of fuzzy systems from both theoretical and application-oriented perspectives was presented by Yager and Filev (1994), Lotfi (2002) and Lotfi and Garibaldi (2004).
A fuzzy model structure can be represented by a set of fuzzy If-Then rules (Kosko, 1998). A fuzzy rule has two parts: the antecedent and the consequent. The antecedent variables reflect information about the process operating conditions. The consequent of the rule is usually a linear regression model which is valid around the given operating condition (Babuska, 1996; Babuska et al., 1996; Huang et al., 1999; Sheta et al., 2009; Abdelrahman et al., 2009).

MATERIALS AND METHODS
In this study, we explore the use of fuzzy logic to predict faults during the software testing process using historical software fault data. The proposed fuzzy model structure is presented in section III. Detailed information about the data set and the experiments is provided in sections IV and VI.
Software reliability growth models: In the past three decades, hundreds of models were introduced to estimate the reliability of software systems (Musa, 1975; Xie, 2002). Building growth models, which help estimate the reliability of a software system before its release to the market, has been the subject of much research (Lyu, 1996; Musa, 2004). Critical applications such as weapon systems and NASA space shuttle applications were explored (Carnes, 1997; Keller and Schneidewind, 1997; 1992).
Faults may surface in software that has already been released to the market. This is a challenge for software companies, as it may affect their reputation and finances. Software reliability growth models have been widely used to help compute the number of faults that still reside in the software (Teng and Pham, 2002). Thus, it is important to specify the effort required to fix faults, the time required before the software can be released and the cost of repair. Software reliability growth models use experimental data collected during system testing to predict the number of defects remaining in the software.
Software reliability models can be classified into two types according to the source of the prediction: (1) models based on design parameters, called "defect density" models, and (2) models based on test data, called "software reliability growth" models. Well-known SRGM include the Logarithmic, Exponential, Power, S-Shaped and Inverse Polynomial models (Farr, 1996; Jones et al., 2001). These are typical analytical models that describe the fault process as a function of execution time (or calendar time) and a set of unknown parameters. The model parameters are normally estimated using least-squares estimation or maximum likelihood techniques (Lyu, 1996).
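To make the idea of estimating the unknown parameters of an analytical SRGM concrete, the following minimal sketch fits a power model by least squares. It assumes the commonly used form mu(t) = a * t**b, which the text does not spell out, and it fits the parameters on log-transformed data so that a closed-form solution exists. The function names and the data are illustrative only and are not taken from the Musa data set.

```python
import math

def fit_power_model(times, cumulative_faults):
    """Fit mu(t) = a * t**b by ordinary least squares on log-transformed data.

    Taking logs gives ln(mu) = ln(a) + b * ln(t), a straight line in ln(t).
    All times and fault counts must be strictly positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in cumulative_faults]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the fitted line
    a = math.exp(y_mean - b * x_mean)  # intercept, mapped back from log space
    return a, b

def power_model(t, a, b):
    """Expected cumulative number of faults at time t under the assumed form."""
    return a * t ** b

# Synthetic data generated from a = 5, b = 0.5 (made up for illustration).
times = [1.0, 4.0, 9.0, 16.0, 25.0]
faults = [5.0 * math.sqrt(t) for t in times]
a_hat, b_hat = fit_power_model(times, faults)
```

Note that this sketch minimizes the squared error in log space, which is not identical to least squares on the original fault counts; it is shown only because it needs no iterative optimizer.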

RESULTS AND DISCUSSION
Proposed fuzzy model structure: Our objective is to approximate the dynamics of the fault measurements during the testing process. Instead of representing them with a single nonlinear model, we use a set of local linear models. Each local model should be able to represent the relationship between the historical faults y(k−1), y(k−2), y(k−3), y(k−4) and the current fault y(k) in a certain range of operating conditions. Such a model structure can be represented by means of fuzzy If-Then rules. The proposed model equation is given as follows:
y(k) = FM(y(k−1), y(k−2), y(k−3), y(k−4))    (1)
Using membership functions and the antecedent of the rule, we can define the fuzzy region in the product space. The antecedent variables describe the current condition of the process. The consequent of the rule is typically a local linear regression model which relates y(k) to y(k−1), y(k−2), y(k−3), y(k−4).
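As an illustration of how a set of If-Then rules with linear consequents can produce a single prediction, the sketch below evaluates a Takagi-Sugeno style model as a membership-weighted average of local linear models. It rests on simplifying assumptions that are not from the source: Gaussian membership functions, an antecedent defined on y(k−1) only, and hypothetical rule parameters. The names `gaussian`, `ts_predict` and `example_rules` are introduced here for illustration.

```python
import math

def gaussian(x, center, width):
    """Gaussian membership degree of x (an assumed membership shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

def ts_predict(lags, rules):
    """Predict y(k) from lags = [y(k-1), y(k-2), y(k-3), y(k-4)].

    Each rule is a dict with:
      'center', 'width' : membership function on y(k-1) (assumed antecedent)
      'coeffs'          : four coefficients of the local linear model
      'bias'            : constant term of the local linear model
    The output is the membership-weighted average of the local model outputs.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rule in rules:
        degree = gaussian(lags[0], rule["center"], rule["width"])
        local = sum(c * y for c, y in zip(rule["coeffs"], lags)) + rule["bias"]
        weighted_sum += degree * local
        weight_total += degree
    return weighted_sum / weight_total

# Three hypothetical rules (values made up, not the rules of Tables 1-3).
example_rules = [
    {"center": 10.0, "width": 20.0, "coeffs": [1.2, -0.1, 0.0, -0.1], "bias": 0.5},
    {"center": 50.0, "width": 20.0, "coeffs": [1.0, 0.1, -0.1, 0.0], "bias": 1.0},
    {"center": 90.0, "width": 20.0, "coeffs": [0.9, 0.1, 0.0, 0.0], "bias": 2.0},
]
prediction = ts_predict([20.0, 18.0, 15.0, 11.0], example_rules)
```

Because the weights are normalized, the prediction always lies between the smallest and largest local model outputs, and a rule whose region is far from the current operating point contributes almost nothing.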
A rule-based fuzzy model requires the identification of the following: (1) the antecedent, (2) the consequent structure, (3) the type of the membership functions for different operating regions and (4) the consequent parameters. The developed fuzzy models were implemented based on the Takagi-Sugeno technique (Babuska, 1997; Babuska et al., 1996). The proposed technique does not require any a priori knowledge about the operating regimes. If a sufficient number of measurements reflecting the operating ranges of interest is collected, the developed fuzzy model will be an efficient one.
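One common way to estimate the consequent parameters of a single rule is weighted least squares, with each sample weighted by its membership degree in that rule. The sketch below shows this for the simplest possible case, a local model y = a*x + b with one regressor, so that the solution is closed form. This is an assumption made for brevity: the models in this study use four lagged inputs, and the source does not state which estimation variant the toolbox applied. The function name and data are hypothetical.

```python
def weighted_line_fit(xs, ys, weights):
    """Minimize sum_i w_i * (y_i - a*x_i - b)**2 and return (a, b).

    The weights play the role of membership degrees: samples that belong
    strongly to the rule's region dominate the fit of its local model.
    """
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

# Made-up data with two regimes: y = 2x for small x, y = 0.5x + 7.5 for large x.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 10.5, 11.0, 11.5]

# Crisp memberships are used here only to make the result easy to check.
low_a, low_b = weighted_line_fit(xs, ys, [1, 1, 1, 0, 0, 0])
high_a, high_b = weighted_line_fit(xs, ys, [0, 0, 0, 1, 1, 1])
```

With graded rather than crisp memberships, the same function yields local models that blend smoothly between neighbouring regions.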
The software reliability data: John Musa of Bell Telephone Laboratories compiled a software reliability database (Musa). His objective was to collect fault interval data to assist software managers in monitoring test status and predicting schedules, and to assist software researchers in validating software reliability models. These models are applied in the discipline of software reliability engineering. The data set consists of software fault data on 16 projects. Careful controls were employed during data collection to ensure that the data would be of high quality. The data was collected throughout the mid 1970s. It represents projects from a variety of applications including real time command and control, word processing, commercial and military applications.
In our case, we used data from three different projects: Real Time Control, Military and Operating System. A MATLAB toolbox for modeling of fuzzy systems (Babuska, 1998) was used to obtain the following results. The routines of the toolbox include the Gustafson-Kessel (GK) clustering algorithm, whose implementation is given by Gustafson and Kessel (1979).
where y and ŷ are the actual measured faults and the faults estimated by the fuzzy model, respectively.
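The performance criterion that y and ŷ enter is not reproduced in this text. Purely as an illustration, and without claiming these are the measures used in the study, the sketch below computes two criteria that are often applied to compare measured and estimated values: the root mean square error (RMSE) and the variance-accounted-for (VAF). The function names are introduced here for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between measured and estimated values."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def vaf(actual, predicted):
    """Variance-accounted-for in percent: (1 - var(y - y_hat) / var(y)) * 100."""
    residuals = [y - p for y, p in zip(actual, predicted)]
    return (1.0 - variance(residuals) / variance(actual)) * 100.0

# Made-up values for illustration only.
measured = [3.0, 5.0, 8.0, 12.0]
estimated = [4.0, 6.0, 9.0, 13.0]  # constant offset of +1
error = rmse(measured, estimated)
explained = vaf(measured, estimated)
```

The two measures answer different questions: in this example the constant offset gives a nonzero RMSE, while VAF still reports a perfect score because it only looks at the variance of the residuals.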

CONCLUSION
We ran the Fuzzy Model Identification Toolbox (Babuska, 1998) with three membership functions. The data set was split into two parts: (1) 70% of the collected data for training and (2) 30% for testing (i.e., validation). The sets of rules which describe the three software projects (i.e., Real Time Control, Military and Operating System) are presented in Tables 1-3.
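The data preparation described above can be sketched as follows: build input/target pairs from the four lagged fault values of Eq. 1 and then split them 70/30. The sketch assumes a chronological split (earliest samples for training), which the text does not state explicitly, and it uses a made-up series rather than the Musa data. The function names are hypothetical.

```python
def make_lagged_dataset(series, n_lags=4):
    """Turn a fault series into (inputs, targets) pairs.

    Each input row is [y(k-1), y(k-2), ..., y(k-n_lags)] and the matching
    target is y(k).
    """
    inputs, targets = [], []
    for k in range(n_lags, len(series)):
        inputs.append([series[k - j] for j in range(1, n_lags + 1)])
        targets.append(series[k])
    return inputs, targets

def chronological_split(inputs, targets, train_fraction=0.7):
    """Use the first train_fraction of the samples for training, the rest for testing."""
    n_train = round(len(targets) * train_fraction)
    train = (inputs[:n_train], targets[:n_train])
    test = (inputs[n_train:], targets[n_train:])
    return train, test

# Made-up cumulative fault counts (24 observations) for illustration only.
series = list(range(1, 25))
X, y = make_lagged_dataset(series)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Splitting in time order keeps the test samples strictly after the training samples, which mirrors how a reliability model would be used to predict faults that have not yet been observed.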
In Fig. 1, we show the membership functions for the real time control application. We used three clusters to build the fuzzy model. Figure 2 shows the actual and predicted faults over the training and testing data for the real time control application. The fuzzy membership functions for the military and operating system applications are shown in Figs. 3 and 5, respectively. In Figs. 4 and 6, we show the actual and predicted faults over the training and testing data for the military and operating system applications.