Intelligence System for Software Maintenance Severity Prediction

The software industry has been experiencing a software crisis: a difficulty in delivering software within budget, on time, and of good quality. This may happen because of the number of defects present in the different modules of a project, which may require maintenance. It is therefore necessary to predict the maintenance urgency of a particular module in the software. In this paper, we have applied different predictor models to five public-domain NASA defect datasets coded in the C, C++, Java and Perl programming languages. Twenty-one software metrics of the different datasets and the Java classes of thirty-five algorithms belonging to the different learner categories of the WEKA project have been evaluated for the prediction of maintenance severity. The results of ten-fold cross-validation are recorded in terms of Accuracy, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) for the different project datasets. The results show that models based on Logistic Model Trees (LMT) and Complement Naïve Bayes (CNB) provide relatively better prediction consistency than the other models and hence can be used for the maintenance severity prediction of software. The developed system can also be used to analyze and evaluate the influence of different factors on the maintenance severity of different software project modules.


INTRODUCTION
Software maintenance is defined as the process of modifying existing operational software after delivery to the customer to correct faults, to improve performance, and/or to adapt the product to a changed environment. Maintenance is inevitable for almost any kind of product. Most products need maintenance because of the wear and tear caused by use. Software products do not wear out in this way, but they need maintenance to correct errors, enhance features, port to new platforms, and so on. Maintenance requests [1] can be of the corrective, perfective, adaptive, user support and preventive types. The software maintenance life cycle (SMLC) concept recognizes four stages [2,3] in the life of an application software system: introduction, growth, maturation, and decline.
The software industry has been experiencing a software crisis: a difficulty in delivering software within budget, on time, and of good quality. At the same time, the industry has experienced a dramatic increase in the share of software life cycle costs spent on maintenance. Pigoski [4] reports that the percentage of the industry's expenditures used for maintenance purposes was 40 percent in the early 1970s, 55 percent in the early 1980s, 75 percent in the late 1980s, and 90 percent in the early 1990s. Given this dominance, the study of software maintenance is increasingly important. It has also been noted [5] that over 50% of programmer effort is dedicated to maintenance. According to Mall [12], the ratio of the development effort of a typical software product to its maintenance effort is roughly 40:60. Given this high cost, some organizations are beginning to look at their maintenance processes as areas for competitive advantage.
With real-time systems becoming more complex and unpredictable, partly due to increasingly sophisticated requirements, traditional software development techniques might face difficulties in satisfying these requirements. Future real-time software systems may need to adapt themselves dynamically based on run-time, mission-specific requirements and operating conditions. This involves dynamic code synthesis that generates modules to provide the functionality required to perform the desired operations in real time. This in turn requires a real-time assessment technique that classifies these dynamically generated systems as faulty or maintenance free [6].
A variety of software maintenance prediction techniques have been proposed, but none has proven to be consistently accurate. These techniques include statistical methods, machine learning methods, parametric models and mixed algorithms. Therefore, there is a need to find the best prediction technique for a given maintenance prediction (MP) dataset to calculate the maintenance severity. In this paper we propose a prediction model for quantifying the impact of defects on the overall environment by predicting maintenance severity.
The basic hypothesis of software quality prediction is that a module currently under development has defects if a module with similar product or process metrics in an earlier project (or release) developed in the same environment had defects [7]. Therefore, the information available early within the current project or from a previous project can be used in making predictions. This methodology is very useful for large-scale projects or projects with multiple releases.
Maintenance managers can apply existing techniques that have traditionally been used for other types of applications. One system is not enough for prediction purposes. An empirical study detailing software maintenance for web-based Java applications can be performed to aid in understanding and predicting the software maintenance category and effort [8].
With the advent of Total Quality Management, organizations are using metrics to improve quality and productivity [9]. Software maintenance organizations are no exception. In 1987, the U.S. Navy established a centralized Software Support Activity (SSA) to provide software maintenance for cryptologic systems. At that time two systems were supported, and a software maintenance metrics program was established to support the goals of the SSA.
A visual approach [10] can be used to uncover the relationship between evolving software and the way it is affected by software bugs. By visually putting the two aspects close to each other, we can characterize the evolution of software artifacts.
Software maintenance is central to the mission of many organizations. Thus, it is natural for managers to characterize and measure those aspects of products and processes that seem to affect the cost, schedule, quality, and functionality of a software maintenance delivery [13]. The importance of software maintenance in today's software industry cannot be overestimated.
Statistical, machine learning, and mixed techniques are widely used in the literature to predict software defects. Khoshgoftaar [14] used zero-inflated Poisson regression to predict the fault-proneness of software systems with a large number of zero response variables. He showed that zero-inflated Poisson regression is better than Poisson regression for software quality modeling. Munson and Khoshgoftaar [15,16] also investigated the application of multivariate analysis to regression and showed that reducing the number of "independent" factors (attribute set) does not significantly affect the Accuracy of software quality prediction.
Menzies, Ammar, Nikora, and Stefano [17] compared decision trees, naïve Bayes, and 1-rule classifier on the NASA software defect data. A clear trend was not observed and different predictors scored better on different data sets. However, their proposed ROCKY classifier outscored all the above predictor models. Emam, Benlarbi, Goel, and Rai [18] compared different case-based reasoning classifiers and concluded that there is no added advantage in varying the combination of parameters (including varying nearest neighbor and using different weight functions) of the classifier to make the prediction Accuracy better.
Bayesian Belief Networks (also known as Belief Networks, Causal Probabilistic Networks, Causal Nets, Graphical Probability Networks, Probabilistic Cause-Effect Models, and Probabilistic Influence Diagrams) [19] have attracted much recent attention as a possible solution to the problems of decision support under uncertainty. Although the underlying theory (Bayesian probability) has been around for a long time, building and executing realistic models has only become possible because of recent algorithms and the software tools that implement them. Clearly, defects are not directly caused by program complexity alone. In reality, the propensity to introduce defects will be influenced by many factors unrelated to code or design complexity.
Many modeling techniques have been developed and applied for software quality prediction. These include logistic regression, discriminant analysis [20,21], the discriminative power techniques, Optimized Set Reduction, artificial neural networks [22,23], fuzzy classification, Bayesian Belief Networks (Fenton & Neil, 1999) and, more recently, Dempster-Shafer Belief Networks. For all these software quality models, there is a tradeoff between the defect detection rate and the overall prediction Accuracy. Software quality may also be analyzed with limited fault-proneness data [24].

METHODOLOGY
The following steps are proposed for the prediction of maintenance severity:
1. Deciding the relevant attributes of software maintenance prediction and choosing the metric corresponding to each selected attribute that could contribute towards the prediction of maintenance urgency/severity.
2. Collecting sampled relevant MP data, and analyzing and refining the metrics data for the different projects.
3. Evaluating different prediction techniques and selecting the best technique based on Accuracy percentage, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE).
4. Developing an intelligence system using the best technique as evaluated in the previous step.
5. Testing the developed system.
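The evaluation criteria named in the steps above can be sketched as follows. This is a minimal illustration and not the system used in the paper: it assumes that severity is treated as a numeric label from 1 to 5, the function names `evaluate` and `select_best` and the two model names are hypothetical, and the sample labels are invented. A toolkit such as WEKA may compute MAE and RMSE differently for nominal classes (for example, from class probability estimates), so the numbers are illustrative only.

```python
import math


def evaluate(actual, predicted):
    """Return (accuracy_percent, mae, rmse) for paired severity labels."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    mae = sum(abs(a - p) for a, p in pairs) / n
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in pairs) / n)
    return 100.0 * correct / n, mae, rmse


def select_best(results):
    """Pick the technique with the highest accuracy; break ties by lower MAE, then RMSE."""
    return min(results, key=lambda name: (-results[name][0],
                                          results[name][1],
                                          results[name][2]))


# Hypothetical severity labels (1 = most severe, 5 = least severe).
actual = [1, 2, 3, 3, 4, 5]
predictions = {
    "model_a": [1, 2, 3, 3, 4, 4],  # assumed output of one technique
    "model_b": [1, 3, 3, 2, 4, 5],  # assumed output of another technique
}

results = {name: evaluate(actual, pred) for name, pred in predictions.items()}
best = select_best(results)
for name, (acc, mae, rmse) in results.items():
    print(f"{name}: accuracy={acc:.1f}% MAE={mae:.3f} RMSE={rmse:.3f}")
print("best:", best)
```

Ranking by accuracy first and using the error measures only as tie-breakers is one possible design choice; a weighted combination of the three criteria would be equally defensible.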
The real-time defect data sets used in this paper have been accessed from NASA's MDP (Metrics Data Program) data repository. The KC1 data is obtained from a science data processing project coded in C++, containing 2107 modules, of which 293 have defects. The JM1 data is obtained from a predictive ground system project written in C, containing 10878 modules, of which 2102 have defects. The PC4 data is collected from a software system coded in C, containing 370 modules, of which 178 have defects. The KC3 data is collected from a software system coded in Java, containing 458 modules, of which 29 have defects. The KC4 data is collected from a software system coded in Perl, containing 125 modules, of which 60 have defects, as shown in Table 1. These data sets vary in the percentage of defective modules; in absolute terms, the KC3 dataset contains the fewest defective modules and the JM1 dataset the most.
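The defect proportions implied by the module counts above can be computed with a short script. This is only an arithmetic sketch: the counts are copied from the text, while the dictionary layout and the function name `defect_percentage` are assumptions made for illustration.

```python
# (total modules, defective modules) as reported in the text above.
datasets = {
    "KC1": (2107, 293),
    "JM1": (10878, 2102),
    "PC4": (370, 178),
    "KC3": (458, 29),
    "KC4": (125, 60),
}


def defect_percentage(total, defective):
    """Percentage of modules in a dataset that have defects."""
    return 100.0 * defective / total


percentages = {name: defect_percentage(total, defective)
               for name, (total, defective) in datasets.items()}

# List the datasets from the lowest to the highest defect percentage.
for name, pct in sorted(percentages.items(), key=lambda item: item[1]):
    print(f"{name}: {pct:.1f}% of modules have defects")
```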
Table 2 shows the different types of predictor software metrics (independent variables) used in our analysis. These complexity and size metrics include well-known metrics such as the Halstead, McCabe, line count, operator/operand count, and branch count metrics. Halstead metrics are sensitive to program size and help in calculating the programming effort in months. The different Halstead metrics include length, volume, difficulty, intelligent count, effort, error, and error estimate. McCabe metrics measure code (control flow) complexity and help in identifying vulnerable code. The different McCabe metrics include cyclomatic complexity, essential complexity, design complexity and lines of code. The target metric (dependent variable) is "Severity".
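To make the metric families above concrete, the sketch below computes several Halstead measures and McCabe's cyclomatic complexity from basic counts. It uses the commonly cited textbook definitions, which are an assumption here: the paper does not state its formulas, and the values in the NASA datasets may have been derived with slightly different conventions. The function names and the input counts are hypothetical.

```python
import math


def halstead(n1, n2, N1, N2):
    """Halstead measures from operator/operand counts.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total number of operators and total number of operands
    """
    if n1 <= 0 or n2 <= 0 or N2 <= 0:
        raise ValueError("n1, n2 and N2 must be positive")
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2.0) * (N2 / n2)
    effort = difficulty * volume
    return {
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "content": volume / difficulty,     # often called intelligent content
        "effort": effort,
        "error_estimate": volume / 3000.0,  # one common approximation
    }


def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's v(G) = E - N + 2P for a control-flow graph."""
    return edges - nodes + 2 * components


# Hypothetical counts for a small module.
metrics = halstead(n1=4, n2=4, N1=8, N2=8)
vg = cyclomatic_complexity(edges=9, nodes=8)
print(metrics)
print("cyclomatic complexity:", vg)
```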

RESULTS AND DISCUSSIONS
The Severity value quantifies the impact of the defect on the overall environment, with 1 being the most severe and 5 the least severe. For example, severity 1 may imply that the defect caused a loss of functionality without a workaround, whereas severity 5 may mean that the impact is superficial and did not cause any disruption to the system. Table 3 shows the number of modules with defect associations in the different projects having maintenance severity values of 1, 2, 3, 4 and 5 respectively.
We have used MATLAB 7.2 and the Java classes of the WEKA project [11] to conduct these experiments. Thirty-five algorithms belonging to the six learner categories of the WEKA project have been evaluated on five projects for the prediction of maintenance severity. The ten-fold cross-validation results are recorded in terms of Accuracy, MAE and RMSE for the different projects, as specified earlier. Table 4, Table 5 and Table 6 are derived from Table 7 and Table 8 by selecting the best algorithm from each category based on Accuracy, MAE and RMSE. The detailed tables covering all the prediction algorithms on the five different projects are shown in the Appendix section of the paper.
The prediction model can therefore be used to automate the calculation of the maintenance severity of defective modules. We can also prioritize which module should be maintained first based on the predicted maintenance severity value, and this will reduce the amount of effort required to maintain that particular module. Hence, the productivity and ease of use of the software will be increased. In future, the developed system can also be used to analyze and evaluate the influence of different factors on the maintenance severity of different software project modules.
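The prioritization idea above can be illustrated with a small sketch that orders modules by predicted severity, most urgent first. The module names and severity values are invented for illustration, and the function name `prioritize` is hypothetical; ties are broken alphabetically only to keep the ordering deterministic.

```python
def prioritize(predicted_severity):
    """Return module names ordered most urgent first.

    Severity 1 is the most severe and 5 the least severe, so an
    ascending sort on the severity value puts urgent modules first.
    """
    return sorted(predicted_severity,
                  key=lambda module: (predicted_severity[module], module))


# Hypothetical predicted severities for five modules.
predicted = {"parser": 3, "scheduler": 1, "logger": 5, "network": 1, "ui": 4}
queue = prioritize(predicted)
for rank, module in enumerate(queue, start=1):
    print(rank, module, "severity", predicted[module])
```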

APPENDIX
The appendix has two tables: Table 7 and Table 8. Table 7 shows the results of the prediction algorithms on the KC1, JM1 and KC3 project datasets. Table 8