On the Predictability of Risk Box Approach by Genetic Programming Method for Bankruptcy Prediction

Problem statement: Theory-based data representation is an important tool for model selection and interpretation in bankruptcy analysis, since numerical representations are much less transparent. This study addresses several methodological problems concerning financial ratios, such as non-proportionality, non-asymmetricity and non-scalicity, and presents a complementary technique for the empirical analysis of financial ratios and bankruptcy risk. Approach: This study presents a new geometric technique for the empirical analysis of bankruptcy risk using financial ratios. Within this framework, we propose a new ratio representation named the Risk Box measure (RB). We demonstrate the application of this geometric approach to variable representation, data visualization and financial ratios at different stages of corporate bankruptcy prediction models based on balance sheet ratios. These stages are the selection of variables (predictors), the accuracy of each estimation model and the representation of each model for transformed and common ratios. Results: We provide evidence of the extent to which changes in the values of this index are associated with changes in each axis value and how this may alter the economic interpretation of changes in the patterns and direction of risk components. The results of Genetic Programming (GP) models were compared as different classification models, and the classifiers performed better with the modified ratios. Conclusion/Recommendations: This study proposes a new dimension to risk measurement and data representation with the advent of the Share Risk (SR) method. The genetic programming method is substantially superior to traditional methods such as MDA or the logistic method. We strongly suggest the use of the SR methodology for ratio analysis, as it provides a conceptual and complementary methodological solution to many problems associated with the use of ratios.
In turn, GP provides heuristic nonlinear regression as a forecasting tool for studies associated with financial data. Genetic programming, as one of the modern classification methods, performs better with the modified ratios. Our new method could serve as a general methodological guideline for financial data analysis.


INTRODUCTION
In classical prediction models, a convenient representation of ratios is a closed-form graphical presentation of the data. In contrast, achieving better accuracy often relies on visualization of the predictors. It is at this stage that selecting a proper graphical presentation scheme becomes essential for a correctly scaled visualization. Since a numerical presentation of ratios cannot adequately represent the characteristics of companies, other ways of displaying them must be found. Graphical tools offer this possibility.
This study presents a complementary perspective on the study of ratios and bankruptcy. One possible explanation for this effect that is consistent with the "efficient market hypothesis" is that the ratio is a proxy for risk. In banking, ratios are also taken to be a proxy for the charter value of banks [17]. Applications of statistical techniques to corporate bankruptcy started in the 1960s with the development of computers. The first technique introduced was Discriminant Analysis (DA) for univariate and multivariate models [1]. Altman [1] then used Multiple Discriminant Analysis (MDA) and applied it to the prediction of business failure. Altman [2] examined railroad bankruptcy propensity, Deakin [9] replicated the study and Edmister [10] tested the usefulness of financial ratios for predicting small business failure. Altman, Margaine, Schlosser and Vernimmen [3] developed a model to determine the creditworthiness of commercial loan applicants in the cotton and wool textile sector in France. Altman, Haldeman and Narayanan [4] developed their classical Z model and named it Zeta Analysis. After DA and MDA, the logit and probit models were introduced by Martin [19] and Ohlson [23]. Nowadays these models are widely used in practice. The solution in the traditional framework is a linear function separating successful and failing companies. A company score is computed as a value of that function [6].
Norton and Smith [21] compared the prediction of bankruptcy using ratios computed from General Price Level (GPL) financial statements with prediction using ratios computed from traditional historical cost financial statements, and Taffler [28] used linear discriminant analysis with financial ratios to predict bankruptcies in the UK.
Moreover, recursive partitioning, also known as Classification and Regression Trees (CART), performs classification by dividing the data space. Genetic Programming (GP), in turn, is a population of linear classifiers (genes) that are connected with one another in a pre-specified way. The outputs of some of the genes are inputs for others. The performance of GP depends greatly on its structure, which must be adapted to the problem being solved. However, as there is no widely accepted economic theory, every study has based its model specification on an empirical framework. This results in different accounting ratios being used in different models. Generally, these multivariate models follow a procedure in which an equal number of bankrupt and non-bankrupt firms are chosen randomly with respect to company size or industry, or in which large and small samples are used to avoid the matching procedure.
Problem statement: According to the literature, the predictors used in various studies generally exhibit non-normal distributions and high standard errors [8,20,24]. Some researchers corrected for univariate non-normality and tried to approximate univariate normality by transforming the variables prior to estimating their models [29,5]. Deakin [9] used a log transformation; square root and log-normal transformations of financial ratios were then used by Ooghe [13] and Gu [14]. Other researchers approximated univariate normality by 'trimming' or 'outlier deletion', which involves segregating outliers by reference to the normal distribution [12]. Furthermore, rank transformation has been used by Perry et al. [25] and Kenjegalieva et al. [15]. Recently, Bahiraie et al. [7] used a geometric transformation of ratios and discussed the details of the transformation, which may become a general guideline.
Objectives: Our objective in this study is to discuss a new geometric approach to ratios, which involves data transformation, and to illustrate the use of this methodology for bankruptcy prediction. To illustrate the methodology, book and market ratio values (X, Y) are used as the numerator and denominator of common ratio values and are represented as Cartesian coordinates in our constructed modification box, in which we derive the isoclines of the associated components of bankruptcy risk. This study is regarded as one of the classic studies in this field. We show that Genetic Programming (GP), as one of the modern classification methods, performs better with the risk box method than with common ratios.

MATERIALS AND METHODS
Genetic Programming (GP): Genetic Programming (GP) is a search methodology belonging to the family of Evolutionary Computation (EC). GP can be considered an extension of Genetic Algorithms (GA) [16]. GAs are stochastic search techniques, based on ideas from natural genetics and evolutionary principles, that can search large and complicated spaces. They have been demonstrated to be effective and robust in searching very large spaces in a wide range of applications. GP is basically a GA applied to a population of Computer Programs (CP). While a GA usually operates on strings of numbers, a GP operates on CPs. Compared with GA, GP allows the optimization of much more complicated structures and can therefore be applied to a greater diversity of problems [22]. Since bankruptcy prediction can be considered a classification problem, we describe GP with emphasis on its application to classification [16].
Genetic programming models were inspired by the Darwinian theory of evolution. In the most common implementations, a population of candidate solutions is maintained and, after each generation, the population is better fitted to the given problem. Genetic programming uses tree-like individuals that can represent mathematical expressions. Such a GP individual is shown in Fig. 1. Three genetic operators are mostly used in these algorithms: reproduction, crossover and mutation. First, the reproduction operator simply chooses an individual in the current population and copies it without changes into the new population. Second, for crossover, two parent individuals are selected and a sub-tree is picked on each one. Crossover then swaps the nodes and their relative sub-trees from one parent to the other. If a size condition is violated, the too-large offspring is simply replaced by one of the parents. Other parameters specify the frequency with which internal or external points are selected as crossover points.
Figures 2 and 3 show an example of the crossover operator.
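As a minimal sketch of the sub-tree crossover described above, the following Python fragment represents an individual as a nested list and swaps randomly chosen sub-trees between two parents. The tree encoding, the helper names and the `max_depth` limit are assumptions made for illustration; they are not taken from the software used in this study.

```python
import copy
import random

# Assumed encoding: a tree is a terminal (a string) or a list
# [function_name, child, child, ...] with at least one child.

def node_paths(tree, prefix=()):
    """Index path of every node in the tree, root first."""
    paths = [prefix]
    if isinstance(tree, list):
        for i in range(1, len(tree)):
            paths.extend(node_paths(tree[i], prefix + (i,)))
    return paths

def get_node(tree, path):
    """Follow an index path down to a node."""
    for i in path:
        tree = tree[i]
    return tree

def set_node(tree, path, new):
    """Copy of tree with the node at path replaced by a copy of new."""
    if not path:
        return copy.deepcopy(new)
    result = copy.deepcopy(tree)
    parent = get_node(result, path[:-1])
    parent[path[-1]] = copy.deepcopy(new)
    return result

def depth(tree):
    """Number of levels in the tree; a lone terminal has depth 1."""
    if not isinstance(tree, list):
        return 1
    return 1 + max(depth(child) for child in tree[1:])

def crossover(parent_a, parent_b, rng, max_depth=6):
    """Swap one random sub-tree between the parents.

    An offspring deeper than max_depth (a hypothetical size condition)
    is replaced by a copy of its own parent.
    """
    path_a = rng.choice(node_paths(parent_a))
    path_b = rng.choice(node_paths(parent_b))
    child_a = set_node(parent_a, path_a, get_node(parent_b, path_b))
    child_b = set_node(parent_b, path_b, get_node(parent_a, path_a))
    if depth(child_a) > max_depth:
        child_a = copy.deepcopy(parent_a)
    if depth(child_b) > max_depth:
        child_b = copy.deepcopy(parent_b)
    return child_a, child_b
```

Calling `crossover(a, b, random.Random(0))` on two nested-list trees returns two new trees and leaves the parents unchanged. Crossover points are drawn uniformly here; the separate frequencies for internal and external points mentioned in the text are not modeled.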
The mutation operator can be applied to either a function node or a terminal node, which is randomly selected in the tree. If the chosen node is a terminal node, it is simply replaced by another terminal; if it is a function and point mutation is to be performed, it is replaced by a new function with the same arity [18]. When tree mutation is to be carried out, a new function node is chosen and the original node, together with its relative sub-tree, is substituted by a new randomly generated sub-tree. A depth ramp is used to set bounds on size when generating the replacement sub-tree. Naturally, it must be checked that this replacement does not violate the depth limit. If it does, mutation simply reproduces the original tree in the new generation. Further parameters specify the probability with which internal or external points are selected as mutation points. An example of the mutation operator is shown in Fig. 4.
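The two mutation variants described above can be sketched in Python as follows. The function set, terminal set, nested-list encoding, depth limit and ramp value are all hypothetical choices for illustration, not the settings of the study.

```python
import copy
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2, "exp": 1}  # hypothetical: name -> arity
TERMINALS = ["x1", "x2", "x3"]                  # hypothetical terminal set

def node_paths(tree, prefix=()):
    """Index path of every node in a nested-list tree, root first."""
    paths = [prefix]
    if isinstance(tree, list):
        for i in range(1, len(tree)):
            paths.extend(node_paths(tree[i], prefix + (i,)))
    return paths

def get_node(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_node(tree, path, new):
    """Copy of tree with the node at path replaced by a copy of new."""
    if not path:
        return copy.deepcopy(new)
    result = copy.deepcopy(tree)
    get_node(result, path[:-1])[path[-1]] = copy.deepcopy(new)
    return result

def depth(tree):
    if not isinstance(tree, list):
        return 1
    return 1 + max(depth(child) for child in tree[1:])

def random_tree(rng, max_depth):
    """Random sub-tree whose depth never exceeds max_depth (the ramp)."""
    if max_depth <= 1 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(list(FUNCTIONS))
    return [name] + [random_tree(rng, max_depth - 1)
                     for _ in range(FUNCTIONS[name])]

def point_mutation(tree, rng):
    """Replace one node by another of the same kind and arity."""
    path = rng.choice(node_paths(tree))
    node = get_node(tree, path)
    if isinstance(node, list):
        arity = FUNCTIONS[node[0]]
        options = [f for f, a in FUNCTIONS.items()
                   if a == arity and f != node[0]]
        if not options:            # no alternative with the same arity
            return copy.deepcopy(tree)
        return set_node(tree, path, [rng.choice(options)] + node[1:])
    return set_node(tree, path,
                    rng.choice([t for t in TERMINALS if t != node]))

def subtree_mutation(tree, rng, max_depth=5, ramp=3):
    """Replace a random node and its sub-tree by a fresh random sub-tree.

    If the result violates the depth limit, the original tree is kept.
    """
    path = rng.choice(node_paths(tree))
    mutated = set_node(tree, path, random_tree(rng, ramp))
    return mutated if depth(mutated) <= max_depth else copy.deepcopy(tree)
```

Point mutation preserves the shape of the tree, while sub-tree mutation can change it; both return a new tree and leave the input untouched.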
As the last step, for all classification problems, in order to apply a particular fitness function, the learning algorithm must convert the value returned by the evolved model into "1" or "0" using the 0/1 Rounding Threshold. If the value returned by the evolved model is equal to or greater than the rounding threshold, the record is classified as "1", and as "0" otherwise. Many fitness functions, such as number of hits, sensitivity/specificity, Relative Squared Error (RSE) and Mean Squared Error (MSE), can be applied to evaluate the performance of the generated classification rules. We used "number of hits", which is based on the number of samples correctly classified, as the fitness function because of its simplicity and efficiency. More formally, the fitness f_i of an individual program corresponds to the number of hits and is evaluated by f_i = h, where h is the number of fitness cases correctly evaluated (the number of hits). For this fitness function, the maximum fitness f_max is given by f_max = n, where n is the number of fitness cases.
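The "number of hits" fitness can be illustrated with a short Python sketch. The evolved model and the four fitness cases below are invented for the example; only the thresholding rule and the definition f_i = h follow the text.

```python
def classify(value, rounding_threshold=0.5):
    """Convert a model output into class 1 or 0 with the rounding threshold."""
    return 1 if value >= rounding_threshold else 0

def number_of_hits(model, fitness_cases, rounding_threshold=0.5):
    """Fitness f_i = h: the number of fitness cases classified correctly."""
    return sum(
        1
        for inputs, target in fitness_cases
        if classify(model(inputs), rounding_threshold) == target
    )

def toy_model(record):
    """Hypothetical evolved expression, not a model from the study."""
    return 0.8 * record["x1"] - 0.3 * record["x2"]

# Hypothetical fitness cases: (inputs, target class).
cases = [
    ({"x1": 1.0, "x2": 0.5}, 1),  # output 0.65 -> class 1, hit
    ({"x1": 0.2, "x2": 0.1}, 0),  # output 0.13 -> class 0, hit
    ({"x1": 0.9, "x2": 1.0}, 1),  # output 0.42 -> class 0, miss
    ({"x1": 0.1, "x2": 0.9}, 0),  # output -0.19 -> class 0, hit
]

hits = number_of_hits(toy_model, cases)  # 3 hits
f_max = len(cases)                       # maximum fitness equals n = 4
```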
Its counterpart with "parsimony pressure" uses this fitness measure f_i as the "raw fitness" rf_i and complements it with a parsimony term. Parsimony pressure puts a little pressure on the size of the evolving solutions, allowing the discovery of more compact models. In this case, the raw maximum fitness is rf_max = n, and the overall fitness fpp_i (that is, fitness with parsimony pressure) is evaluated from rf_i and the program size S_i, where S_max and S_min are the maximum and minimum program sizes, respectively. Thus, when rf_i = rf_max and S_i = S_min, the fitness reaches its maximum, fpp_max = 1.0002×rf_max. The described procedure is depicted in the flowchart of [27].
Once the fitness function is defined, the bankruptcy prediction problem becomes a search for the best solution in the space of all possible solutions, that is, an optimization of the fitness function, for which optimization techniques can be used. The genetic model is implemented to automatically extract an intelligible classification rule for predicting the classes of bankrupt and non-bankrupt firms in a sample from the given values of some financial ratios, called predicting variables. Each rule is constituted by a logical combination of these ratios. The combination determines a class description, which is used to construct the classification rule. Given the number of variables describing each firm and their related domains, it is easy to see that the number of possible solutions to the bankruptcy prediction problem is enormous.
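The expression for fpp_i is not shown in the text above. The Python sketch below assumes one form that is consistent with the stated maximum fpp_max = 1.0002×rf_max: the raw fitness is scaled by a bonus of at most 1/5000 that grows linearly as the program size falls from S_max to S_min. Both the functional form and the weight are assumptions, not a quotation of the formula used in the study.

```python
def fitness_with_parsimony(raw_fitness, size, size_min, size_max,
                           weight=1 / 5000):
    """Assumed form: fpp = rf * (1 + weight * (S_max - S) / (S_max - S_min)).

    weight = 1/5000 is chosen only because it reproduces the stated
    maximum of 1.0002 * rf_max at S = S_min.
    """
    if size_max == size_min:          # all programs have the same size
        return float(raw_fitness)
    saving = (size_max - size) / (size_max - size_min)
    return raw_fitness * (1.0 + weight * saving)

# Hypothetical example: 200 fitness cases, program sizes between 10 and 50.
best = fitness_with_parsimony(200, size=10, size_min=10, size_max=50)
largest = fitness_with_parsimony(200, size=50, size_min=10, size_max=50)
```

Under this assumed form, the bonus is too small to let a compact program overtake one with more hits, so it only ranks programs with equal raw fitness by size.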
The share risk box methodology: The framework is a two-dimensional box in which the pair of values (X_i, Y_i) of each risk ratio is represented as Cartesian coordinates. For expositional purposes, suppose our chosen proxy for risk uses X_i as the numerator and Y_i as the denominator of the ratio X_i/Y_i. For any number of firms, ∀i = 1,2,3,…,n, the proposed Share Risk (SR_i) is defined as a function of X_i and Y_i. Consider a square two-dimensional space that captures all changes in the numerator X_i and denominator Y_i for any firm i and any period t, where X and Y can be positive, negative or zero (it is applicable to any level of aggregation, such as cross-country studies, cross-sector studies and ratios). Assume a hypothetical study of risk covering n years for sector j. For ∀t = 1,2,3,…,n, we have X_t, Y_t > 0. All risk component indices, namely Total Risk TR = X+Y, Net Risk NR = |X-Y|, Overlapping Risk OR = (X+Y)-|X-Y| and, lastly, the proposed Share Measure of Risk (SR), are linear functions of X and Y, with X+Y = TR = NR+OR. Following Bahiraie et al. [7], we can construct a two-dimensional box that encapsulates all of these variables for n years. The dimensions of the risk box are generated by the maximum value of either X_i or Y_i during the period of study: each risk box has sides equal to max(X_i) if max(X_i) > max(Y_i) over the period, or max(Y_i) otherwise. Our exposition of the dimensions of the box, which confirms the elasticity and unit-free nature of the SR measure, is as follows.

Locus of EQUI TR:
A 45° line from the origin bisects the box into two equal triangles. This positively sloped diagonal is the locus of balanced risk, where X = Y, TR equals OR, SR equals unity and NR equals zero. This is the risk components' axis of symmetry [26].
The two triangular planes in the box consist of an upper triangle containing the coordinate points (X_i, Y_i) where X_i > Y_i and a lower triangle containing the points where Y_i > X_i. A fixed value TR = TR* implies X = TR*-Y. Comparing with y = mx + c, the gradient m equals minus unity. Hence, the locus of EQUI TR is perpendicular to the axis of symmetry.

Locus of EQUI NR:
Recall that Net Risk NR = |X-Y|. Below the 45° line through the origin, a fixed value NR* gives the line Y-X = NR*, so X = Y-NR*, which also slopes upward at 45°, meeting the (horizontal) Y axis at NR*. Above the 45° line through the origin we have another segment of the same contour, namely the line X-Y = NR*, or X = Y+NR*. These two 45° lines form the contour corresponding to NR*. Increasing the value of the constant NR* moves both segments higher up their respective axes, away from the central 45° line. Comparing with y = mx + c, for the segment with a net book value (X > Y) we have m = 1 with a vertical intercept c = NR*. Since the central balanced line is the axis of symmetry for NR, the other segment likewise has m = 1 and meets its axis at NR* (Fig. 5).
Locus of EQUI OR:
Since OR = (X+Y)-|X-Y| = 2 min(X, Y), the EQUI corresponding to constant overlapping risk OR* is L-shaped (Fig. 6), the kink occurring along the central 45° line. As OR* increases, the kink moves up the line, away from the origin.

Locus of EQUI SR:
Given a constant value SR*, we obtain X = γ^{-1} Y, whose slope γ^{-1} satisfies 1 ≤ γ^{-1} < ∞. Thus the EQUI corresponding to a particular value SR* consists of two rays in the positive quadrant meeting at the origin, with slopes γ and γ^{-1}. In Fig. 8 these rays are shown as OC and OB. Note that the symmetry of the diagram about the central 45° line implies that the angles θ_1 and θ_2 are equal.
In Fig. 7, which relates the four risk measures to the slopes γ and γ^{-1}, consider the rays OB and OC subtending the angles θ_1 and θ_2 measured from the symmetry axis. These confirm that SR values are constant along any ray from the origin. The two extreme cases are (i) θ_1 = θ_2 = 45°, in which case SR = 0 and either the Y value or the X value is zero, and (ii) θ_1 = θ_2 = 0, in which case SR = 1 and X = Y.
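The risk components can be computed directly from a pair (X, Y), as in the Python sketch below. TR, NR and OR follow the definitions given in the text. The explicit formula for SR is not shown in the text, so the sketch assumes SR = min(X, Y)/max(X, Y), which is one form consistent with the stated properties (SR = 1 when X = Y, SR = 0 when either value is zero, and constant SR along any ray from the origin). The function names and the sample values are hypothetical.

```python
def risk_components(x, y):
    """Return TR, NR, OR and SR for non-negative x and y (not both zero)."""
    if x < 0 or y < 0 or max(x, y) == 0:
        raise ValueError("expects non-negative values, not both zero")
    total = x + y                    # TR = X + Y
    net = abs(x - y)                 # NR = |X - Y|
    overlap = total - net            # OR = (X + Y) - |X - Y|
    share = min(x, y) / max(x, y)    # SR: assumed form, see lead-in
    return {"TR": total, "NR": net, "OR": overlap, "SR": share}

def risk_box_side(pairs):
    """Side of the box: the largest X or Y value observed in the period."""
    return max(max(x for x, _ in pairs), max(y for _, y in pairs))

# Hypothetical yearly (X, Y) observations for one sector.
observations = [(3.0, 1.0), (2.0, 2.0), (1.0, 4.0)]
side = risk_box_side(observations)
components = [risk_components(x, y) for x, y in observations]
```

Because the assumed SR depends only on the ratio of the smaller to the larger value, rescaling both X and Y by the same factor leaves it unchanged, which matches the unit-free property claimed for the measure.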

Data collection:
The database used in our illustrative empirical study consists of 200 Malaysian companies from the Kuala Lumpur Stock Exchange (KLSE), of which 60 went bankrupt and 140 are non-bankrupt companies from the same listing period.

Variables:
In this study, 40 indices were built from balance-sheet data on the basis of the financial ratios successfully identified by past studies and data availability.
Significance mean test: The ratios and the significance of the mean differences between the groups were tested and are presented in Table 1. These indices reflect different aspects of firm structure and performance and were calculated as ratios one year prior to bankruptcy.
Genetic programming: Following recent research by Etemadi et al. [11], we tested the selected variables with Genetic Programming (GP) to obtain the fitness function tree and to illustrate that the new transformation predicts more accurately in statistical terms and can be used as an alternative to common ratios, even with GP. In the final regressions, fewer variables were significant in the different classification trees, as expected, and we observed that different variables from the selected list were identified as significant indicators for each procedure. GeneXproTools software version 4.1 was used to implement the GP process and develop the bankruptcy model. The crossover and mutation rates were set to 0.44 and 0.05 respectively. Figures 9 and 10 show the best GP model obtained for each approach. Each model is divided into three sub-trees, each representing a gene, meaning that the model is a chromosome consisting of three genes. The sum of the returns of the sub-trees for a firm is compared with the "Rounding Threshold" to determine the class of the firm. From the classification sub-trees depicted in Fig. 9, the decision trees for the SR approach obtained a 95% accuracy rate.
Fig. 9: The best GP model obtained for SR method
From the classification sub-trees in Fig. 10, the decision trees for the common ratios approach obtained an 89% accuracy level. The variables found significant in each sub-tree are presented in Table 2.
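To make the three-gene structure concrete, the Python sketch below sums the outputs of three sub-trees and compares the sum with the rounding threshold. The three gene expressions, the variable names and the sample firms are invented for illustration; the actual sub-trees of Fig. 9 and 10 are not reproduced here.

```python
import math

# Hypothetical genes; each maps a firm's ratios to a number.
def gene_1(firm):
    return firm["x1"] * firm["x2"]

def gene_2(firm):
    return firm["x3"] - firm["x1"]

def gene_3(firm):
    return math.exp(-firm["x2"])

GENES = (gene_1, gene_2, gene_3)

def chromosome_output(firm, genes=GENES):
    """Sum of the returns of the sub-trees (genes linked by addition)."""
    return sum(gene(firm) for gene in genes)

def predict(firm, genes=GENES, rounding_threshold=0.5):
    """Class 1 if the summed output reaches the threshold, else class 0."""
    return 1 if chromosome_output(firm, genes) >= rounding_threshold else 0

def accuracy(firms, labels, genes=GENES, rounding_threshold=0.5):
    """Share of firms whose predicted class matches the label."""
    correct = sum(
        1 for firm, label in zip(firms, labels)
        if predict(firm, genes, rounding_threshold) == label
    )
    return correct / len(firms)

# Hypothetical firms: summed outputs are about 0.568 and 0.335.
firms = [
    {"x1": 0.5, "x2": 1.0, "x3": 0.2},
    {"x1": 0.1, "x2": 2.0, "x3": 0.1},
]
```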

The solution provided by the GP algorithm is represented in the form of decision sub-trees. Each function node of a tree takes one of the values from the set +, -, *, ^, EXP, etc. Some of the operators used in our study are shown in Table 3. To decide whether a firm is bankrupt or non-bankrupt through the genetic programming decision tree, a benchmark value of 0.5 is used. If the value of the GP model for a specific training or test firm is greater than or equal to 0.5, the firm is marked as a "bankrupt firm". If the value is less than 0.5, the firm is classified as a "non-bankrupt firm".

Misclassification cost: An alternative to the error rate is the misclassification cost, which is simply a number assigned as a penalty for making a particular type of mistake. An average cost of misclassification can be obtained by weighting each of the costs by the respective error rate. Computationally, this means that errors are converted into costs by multiplying each error by its misclassification cost. Table 4 shows the possible classifications and misclassifications, and Table 5 compares the accuracy of each classification model under the different data representations.

Table 5 summarizes the accuracy levels of the GP procedures; the results clearly improve under the data transformation procedure. To observe the performance of the new transformation, the data set was not collected from a particular industry type or firm size, and no outlier deletion was applied. Thus, our process is free of potential explanatory effect errors that may be caused by the distribution of the independent variables (Deakin [9]).
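The average misclassification cost described above can be sketched in Python as follows. The two cost values and the sample labels are hypothetical, and each error rate is taken here as the share of all classified firms, which is one possible reading of "respective error rate".

```python
def average_misclassification_cost(actual, predicted,
                                   cost_missed_bankrupt,
                                   cost_false_alarm):
    """Weight each misclassification cost by its error rate.

    actual, predicted: sequences of 1 (bankrupt) and 0 (non-bankrupt).
    Error rates are computed over all firms (an assumption).
    """
    n = len(actual)
    missed = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    false_alarm = sum(1 for a, p in zip(actual, predicted)
                      if a == 0 and p == 1)
    return ((missed / n) * cost_missed_bankrupt
            + (false_alarm / n) * cost_false_alarm)

# Hypothetical example: one missed bankruptcy and one false alarm
# among five firms, with a missed bankruptcy costed ten times higher.
actual = [1, 1, 0, 0, 0]
predicted = [1, 0, 0, 1, 0]
cost = average_misclassification_cost(actual, predicted,
                                      cost_missed_bankrupt=10.0,
                                      cost_false_alarm=1.0)
```

With unequal costs, two classifiers with the same error rate can have different average costs, which is why the cost is offered as an alternative to the raw error rate.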

K-fold cross validation:
In order to lessen the effects of bias, we conducted the k-fold cross validation procedure. The data are divided into k subsets; each subset in turn serves as the testing set, while all the other subsets combined form the training set on which a tree is built. This procedure allows mean error rates to be calculated, which gives useful insight into the classifier's decisions. Leave-one-out cross validation is simply k-fold cross validation in which k is the number of data instances. This has the advantage of allowing the largest amount of training data to be used in each run and also means that the testing procedure is deterministic. With large data sets, however, it is computationally infeasible, and in certain situations the deterministic nature of the testing results in strange errors. Further, k-fold cross validation is a primary method for estimating tuning parameters: the data are divided into K equal parts and, for each k = 1, 2, …, K, the model is fitted to the other K-1 parts with the kth part held out as the testing sample. In our experiment we used 5-fold cross validation. Table 6 presents the comparison of the 5-fold accuracy results.
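A minimal Python sketch of the k-fold procedure is given below. The fold assignment, the `fit_midpoint` trainer (a stand-in for building a GP tree) and the toy data are assumptions for illustration only.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(fit, cases, k=5, seed=0):
    """Return the mean error rate and the error rate of each fold.

    fit: function taking training cases and returning a classifier.
    cases: list of (input, label) pairs.
    """
    fold_errors = []
    for test_idx in k_fold_indices(len(cases), k, seed):
        held_out = set(test_idx)
        train = [c for i, c in enumerate(cases) if i not in held_out]
        test = [cases[i] for i in test_idx]
        model = fit(train)                      # train on the other k-1 folds
        wrong = sum(1 for x, y in test if model(x) != y)
        fold_errors.append(wrong / len(test))   # error on the held-out fold
    return sum(fold_errors) / k, fold_errors

def fit_midpoint(train):
    """Hypothetical trainer: cut halfway between the two class means."""
    ones = [x for x, y in train if y == 1]
    zeros = [x for x, y in train if y == 0]
    cut = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda x: 1 if x >= cut else 0

# Toy, well-separated data: 10 cases per class (not data from the study).
toy_cases = ([(i / 10, 0) for i in range(10)]
             + [(5 + i / 10, 1) for i in range(10)])
mean_error, per_fold = cross_validate(fit_midpoint, toy_cases, k=5)
```

Setting `k` equal to the number of cases turns the same function into leave-one-out cross validation, provided the trainer can cope with the resulting training sets.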
The descriptive results highlight the following evidence: better classification accuracy is achieved under the transformation process. Not only is the pattern of liquidity variation more favorable to active companies, but turnover indices are also higher for active firms. The assets to operating income ratio is higher for failed firms because of their reduced capital resources. Earning indices display greater solvency for active firms, even though debts have increased for those firms relative to the bankrupt ones. Operating structure ratios show a lower incidence of interest charges on sales and value added for active companies, and higher depreciation charges over gross fixed assets for failed ones. Capitalization ratios clearly reflect the superior growth of active versus failed firms. The results suggest that some indicators traditionally considered in empirical analysis, such as earnings to total debt, are not significant in each of the three considered models. Profitability ratios emphasize the overall higher profitability of active enterprises. Finally, additional indices such as market shareholders' dividend, sales, return and operating assets are significantly higher for healthy companies.

DISCUSSION
In this study we demonstrated the application of a new graphical geometric approach to variable representation and data visualization. We believe that graphical analysis will gain importance as it becomes more and more popular. Moreover, graphical ratio representation can facilitate the acceptance of prediction models in various areas, e.g., finance, medicine, sound and image processing. This will contribute to the development of those areas, since such models better represent reality and provide higher forecasting accuracy. Within our new transformation methodology, each company is described by a set of variables X_i, namely transformed financial ratios instead of the original ratios. Financial ratios such as the debt ratio (leverage) or interest coverage (earnings before interest and taxes) characterize different sides of company operation. They are constructed on the basis of balance sheets and income statements. We used 40 ratios (predictors) computed from the company statements in the corporate bankruptcy database. The predictors and basic statistics are given in Table 1. Initially, an unknown classifier function f: x→y is estimated on a training set of companies (x_i, y_i), i = 1, 2,…, n. The classifier estimated on the training sample then provides predictions for the companies in the testing sample, for which it is unknown whether they survived or went bankrupt.

CONCLUSION
This study presented a complementary perspective on the study of risk and bankruptcy using financial ratios. A new dimension to risk measurement, bankruptcy and ratio transformation was proposed with the advent of the share risk. We briefly derived the properties of the components of the new risk approach, which overcome the limitations of using common ratios. Our simple methodology, called the Risk Box index, provides a geometric illustration of the proposed risk measure and its transformation behavior. Our study employed 60 distressed companies with a matched sample of another 140 non-failed companies listed on the Kuala Lumpur Stock Exchange (KLSE). We found a rise in classification accuracy when this new transformation of the independent variables was applied with Genetic Programming (GP). The Share Risk model (Risk Box) can be employed as a tool of analysis, providing a crucial first stage for studies of changes in risk patterns, in particular those assumed to be linked with potential bankruptcies. The adaptability of the proposed methodology is emphasised by its applicability to any number of years in sectoral or cross-country studies of risk and bankruptcy.