Robust Logistic Regression to Static Geometric Representation of Ratios

Problem statement: Several methodological problems concerning financial ratios, such as non-proportionality, non-asymmetricity and non-salacity, were addressed in this study, and we presented a complementary technique for the empirical analysis of financial ratios and bankruptcy risk. This new method would be a general methodological guideline for work with financial data and bankruptcy risk. Approach: We proposed a new measure of risk, the Share Risk (SR) measure. We provided evidence of the extent to which changes in the values of this index are associated with changes in each axis value and how this may alter our economic interpretation of changes in the patterns and directions. Our simple methodology provided a geometric illustration of the new proposed risk measure and its transformation behavior. This study also employed the robust logit method, which extends the logit model by considering outliers. Results: The new SR method obtained better numerical results than the common ratios approach. With respect to accuracy, Logistic and Robust Logistic Regression Analysis illustrated that the new transformation (SR) produced statistically more accurate predictions and can be used as an alternative to common ratios. Additionally, the robust logit model outperformed the logit model in both approaches and was substantially superior in predictions when assessing sample forecast performance and regressions. Conclusion/Recommendations: This study presented a new perspective on the study of firm financial statements and bankruptcy. A new dimension to risk measurement and data representation was proposed with the advent of the Share Risk (SR) method. With respect to forecast results, the robust logit method was substantially superior to the logit method. The use of the SR methodology for ratio analysis is strongly suggested, as it provides a conceptual and complementary methodological solution to many problems associated with the use of ratios. Likewise, robust logit regression can be employed as a regression tool for studies associated with financial data.


INTRODUCTION
In recent decades, business failure prediction has been one of the major research domains in finance for evaluating the financial health of companies [14]. Bankruptcy involves large costs, and corporate failure prediction has been stimulated by both private and government sectors all over the world [9]. Moreover, company failure may inflict negative shocks on each of the shareholders, so the total cost of failure is large in both economic and social terms [25]. Bankruptcy prediction models have also proven necessary for obtaining a more accurate statement of a firm's financial situation [18].
Beaver [7] first showed that corporate failure could be reliably predicted through the combined use of sophisticated quantitative methods and selected financial ratios. Altman [1] then extended this narrow interpretation by investigating a set of financial ratios as well as economic ratios as possible determinants of corporate failure using multiple discriminant analysis (MDA), in particular the Z-score model. Since Altman [1], the literature on predicting bankruptcy has witnessed numerous extensions and modifications. Previous researchers all emphasized that financial ratios have a significant effect on bankruptcy risk, return, credit risk, commercial risk, and market and economic conditions [27]. While attempts have been made to solve the problems of using accounting-based financial ratios, none has been entirely successful in developing quantitative and objective systems for bankruptcy prediction [2]. Some attempts included trimming the sample ratios, eliminating negative observations and using various transformations such as logarithms and square roots to achieve more normal distributions [8]. However, most of these attempts have relied on common ratios, whose convenience may exceed the cost of errors in the analysis and the problem of mis-specification [4,6].
Some researchers corrected for univariate non-normality and tried to approximate univariate normality by transforming the variables prior to estimating their models. Deakin [10] used a logarithmic transformation to address the lack of normality in the distributions, and another study used square root and lognormal transformations of financial ratios [13]. However, logarithmic and square root transformations may also be arbitrary [26]. Kane et al. [17] used the rank transformation and reported improved fit and less biased results from linear models with the transformed data set. Logarithmic, rank and square root transformations are even more difficult to interpret because they can alter the natural monotonic relationships among the data [8,21]. There are many methods for estimating the probability of bankruptcy, but none of them has taken outliers into account when the dependent variable is discrete. Outliers, which can seriously distort the estimated results, have been well documented in regression models [11]. Although methods and applications that take outliers into account are well known when the dependent variable is continuous [22,24], few empirical studies have been conducted when the dependent variable is binary. Atkinson and Riani [3] and Flores and Garrido [12] developed the theoretical foundations, as well as the algorithm, for obtaining a consistent estimator in the logit model with outliers, but they do not provide applied studies.
There is no general guideline concerning an appropriate data representation that is able to solve the difficulties of ratios. Likewise, there is a need to apply a regression method that considers outliers. Furthermore, none of the previous attempts achieved perfect prediction in the functional form. All of these procedures rely on common ratios without specifically considering the numerator and denominator of each ratio, which are the most essential factors determining each ratio's value.
Our first objective in this study is to propose a new approach to data representation and to illustrate the use of this methodology for measuring financial risk in ratio analysis and predicting bankruptcies. The second aim is to predict bankruptcy probability while taking outliers into account. We developed the method used by Atkinson and Riani [3]. According to the literature, the present study is the first to use the Robust logit model for financial data and bankruptcy prediction.

Review of statistical methods of prediction:
The methods of Rousseeuw [22,23], such as Least Median of Squares (LMS) and Least Trimmed Squares (LTS), are now standard options in many econometric software packages. The literature, however, was slow to consider outliers in the logit model until 1990. Furthermore, all developments concern the theoretical treatment of outliers in the logit method, and applications in finance are lacking.
Since Altman [1], MDA has been a prevalent technique in bankruptcy prediction in terms of classification or prediction ability among traditional models [5]. Some studies have found the logit model superior to MDA [15]. However, the research by Aziz and Dar [5] has shown that the two models are equally efficient. Robust statistics provides an alternative approach to classical statistical methods. Robust methods provide automatic ways of detecting, down-weighting (or removing) and flagging outliers, largely removing the need for manual screening. A robust statistic is resistant to errors in the results produced by deviations from assumptions. The median is a robust measure of central tendency, while the mean is not; for instance, the median has a breakdown point of 50%, while the mean has a breakdown point of 0% [20]. The median absolute deviation and the interquartile range are robust measures of statistical dispersion, while the standard deviation and the range are not [16].
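The contrast between robust and non-robust summary statistics can be illustrated with a small sketch. The data values and the helper name `mad` are hypothetical and chosen only for illustration; they are not taken from the study's sample.

```python
import statistics

def mad(values):
    """Median absolute deviation: the median of |x - median(x)|."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])

# Hypothetical ratio values; in the second list one value is a gross outlier.
clean = [0.8, 0.9, 1.0, 1.1, 1.2]
contaminated = [0.8, 0.9, 1.0, 1.1, 120.0]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name,
          "mean=%.2f" % statistics.mean(data),
          "median=%.2f" % statistics.median(data),
          "stdev=%.2f" % statistics.stdev(data),
          "MAD=%.2f" % mad(data))
```

In this toy sample a single gross outlier moves the mean and the standard deviation a long way, while the median and the median absolute deviation stay where they were.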

Robust regression:
The Robust Library in S-Plus software enables us to robustly fit Generalized Linear Models (GLIMs) for response observations $y_i$, $i = 1, 2, \ldots, n$, that may follow either the Poisson or the Binomial distribution.
For the Binomial distribution, the linear predictor is $\eta_i = x_i^T \beta$. The linear predictor $\eta_i$ and the expected value $\mu_i$ are related through the link function $g$, which maps $\mu_i$ to $\eta_i = g(\mu_i)$. The inverse link transformation $g^{-1}$ maps $\eta_i$ to $\mu_i = g^{-1}(\eta_i)$.
For the binomial model with the canonical link (the logit link), we have
$$g(\mu_i) = \log\left(\frac{\mu_i}{1 - \mu_i}\right)$$
and the conditional expectation is
$$E\left(\frac{y_i}{n_i} \,\middle|\, x_i\right) = \mu_i = \frac{\exp(x_i^T \beta)}{1 + \exp(x_i^T \beta)}.$$
In the Bernoulli case, the response $y_i$ is either 0 or 1 and so cannot be an outlier. In the general Binomial model, when $n_i$ is large, the $y_i$ can also be outliers in cases where the expected values of $y_i / n_i$ are small. Thus, in the general Binomial case, influential $y_i$ outliers create the need for a robust alternative to the MLE.
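The logit link and its inverse can be written out as a minimal sketch. The function names and the example coefficient values are illustrative assumptions, not part of the S-Plus Robust Library.

```python
import math

def linear_predictor(x, beta):
    """eta = x' beta for one observation (x includes the intercept term)."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def logit(mu):
    """Logit link g: maps an expected value mu in (0, 1) to eta."""
    return math.log(mu / (1.0 - mu))

def inv_logit(eta):
    """Inverse link g^{-1}: maps eta back to mu, written to avoid overflow."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)

# Hypothetical observation and coefficients, for illustration only.
x = [1.0, 2.0]          # intercept term and one covariate
beta = [0.5, -0.25]
eta = linear_predictor(x, beta)
mu = inv_logit(eta)
print(eta, mu)
```

Applying `logit` to the result of `inv_logit` recovers the linear predictor up to rounding, which is what it means for the two functions to be inverse link transformations.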
Because misclassification results are important in our research, we used the misclassification-model approach, rather than the Cubif or Mallows approaches, to estimate $\beta$ as a solution of the estimating equation [equation missing in the source]. The misclassification model gives $F$ [equation missing in the source]. This estimator, introduced by Rousseeuw [24], has properties similar to those of the Mallows-type unbiased bounded-influence estimates.

The proposed Share Risk ($SR_i$) is defined as a function of $X_i$ and $Y_i$. Consider a square two-dimensional space that captures all changes in the numerator $X_i$ and the denominator $Y_i$ for any firm $i$ and any period $t$, where $X$ and $Y$ can be positive, negative or zero (it is applicable to any level of aggregation, such as cross-country studies, cross-sector studies and ratios). Assume a hypothetical study of risk covering $n$ years for sector $j$. For all $t = 1, 2, 3, \ldots, n$, we have $X_t, Y_t > 0$. All risk component indices, namely Total Risk $TR = X + Y$, Net Risk $NR = |X - Y|$, Overlapping Risk $OR = (X + Y) - |X - Y|$ and the proposed Share Measure of Risk (SR), are linear functions of $X$ and $Y$, with $X + Y = TR = NR + OR$ [definition of SR missing in the source].

Following Bahiraie et al. [6], we can construct a two-dimensional box that encapsulates all of these variables for $n$ years. The dimensions of the risk box are generated by the maximum value of either $X_i$ or $Y_i$ during the period of study. From the definitions of TR, NR, OR and SR, we obtain [equation missing in the source]. Each respective risk box will have sides equal to $\max(X_i)$ if $\max(X_i) > \max(Y_i)$ over the period, or $\max(Y_i)$ otherwise. Our exposition of the dimensions of the box, which confirms the elasticity and unit-free nature of the SR measure, is as follows.

Locus of equi TR: A 45° line from the origin bisects the box into two equal triangles (Fig. 1). This positive-slope diagonal is the locus of balanced risk where $X = Y$, TR equals OR, SR equals unity and NR equals zero. This is the risk components' axis of symmetry. The two triangular planes in the box consist of an upper triangle containing the coordinate points $(X_i, Y_i)$ where $X_i > Y_i$ and a lower triangle containing the points where $Y_i > X_i$. A fixed value $TR = TR^*$ implies $X = TR^* - Y$. Comparing with $y = mx + c$, the gradient $m$ equals minus unity. Hence, the locus of equi TR is perpendicular to the axis of symmetry.

Locus of equi SR: Given a constant value $SR^*$, we obtain $X = \gamma^{-1} Y$, where $\gamma^{-1}$ satisfies $1 \le \gamma^{-1} < \infty$. Thus the equi-SR locus corresponding to a particular value $SR^*$ consists of two rays in the positive quadrant meeting at the origin, with slopes $\gamma$ and $\gamma^{-1}$. In Fig. 1 these rays are shown as OC and OB. Note that the symmetry of the diagram about the central 45° line implies that the angles $\theta_1$ and $\theta_2$ are equal.

Geometry of SR and risk box:
In Fig. 1, to see the relationships between the four risk measures and the slopes $\gamma$ and $\gamma^{-1}$, consider the rays OB and OC subtending the angles $\theta_1$ and $\theta_2$ measured from the symmetry axis. These confirm that SR values are constant along any ray from the origin. The two extreme cases are (i) $\theta_1 = \theta_2 = 45°$, in which case $SR = 0$ and either the $Y$ value or the $X$ value is zero, and (ii) $\theta_1 = \theta_2 = 0°$, in which case $SR = 1$ and $X = Y$.
The natural distribution of the SR transformation ensures that data are not skewed and should be more robust to the assumptions of Gaussian statistical methods. The SR method can be applied equally to a variety of distributional forms, making the technique particularly useful in ratio analysis, where a diverse set of distributional functions has been identified.
Negative values are transformed to a specific variation, which removes the need to delete negative data as was done in previous studies.
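The risk components defined for the risk box (TR, NR and OR) and the side length of the box can be computed directly from a numerator and denominator pair. The sketch below covers only those three components and the box side; the SR formula itself is not reproduced here. The function names and sample values are hypothetical.

```python
def risk_components(x, y):
    """Return (TR, NR, OR) for numerator x and denominator y."""
    tr = x + y            # Total Risk
    nr = abs(x - y)       # Net Risk
    ovr = tr - nr         # Overlapping Risk, so that TR = NR + OR
    return tr, nr, ovr

def box_side(xs, ys):
    """Side of the risk box: the larger of max(X) and max(Y) over the period."""
    return max(max(xs), max(ys))

# Hypothetical yearly numerator and denominator values for one firm.
xs = [1.0, 5.0, 3.0]
ys = [2.0, 4.0, 6.0]

for x, y in zip(xs, ys):
    tr, nr, ovr = risk_components(x, y)
    print(x, y, "TR=%.1f NR=%.1f OR=%.1f" % (tr, nr, ovr))

print("box side:", box_side(xs, ys))
```

On the 45° diagonal, where X equals Y, the function returns NR equal to zero and OR equal to TR, which matches the description of the axis of symmetry.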

Data collection:
The database used in our illustrative empirical study consists of 200 Iranian companies from the Tehran Stock Exchange (TSE). Fifty companies went bankrupt under bankruptcy rule number 167 of the Iranian companies' law act 1965, under which a firm is bankrupt when the total value of its retained earnings is equal to or greater than 50% of its listed capital. The remaining 150 are "matched" companies from the same listing period, 1998-2005.

Indicator variables:
Based on the financial ratios successfully identified by previous studies and on data availability, 40 indices were built using balance-sheet data. The ratios and the significance of the mean differences between the groups were tested and are presented in Table 1. These indices reflect different aspects of firm structure and performance: liquidity, turnover, operating structure and efficiency, capitalization and profitability. Bankrupt companies are coded as 1 and non-failed companies as 0. Thus, a firm has a higher failure probability and is classified into the failing group if its score is higher than the cut-off point in each approach.

Stepwise method:
For primary variable selection and for testing each variable's effect on discriminating power, CartProEx V.6.0 software with the Mahalanobis $D^2$ measure was used. Table 2 reports the selected variables that were most effective in separating the groups, giving a more stable and well-balanced model.

Regression analysis:
We tested the selected variables using Logistic and Robust Logistic Regression Analysis to illustrate that the new transformation produces statistically more accurate predictions and can be used as an alternative to common ratios. The results show that the Robust logit model outperforms the logit model in both data sets. Table 3 reports the estimates from the logit and Robust logit models, respectively. When the logit model is used, fewer coefficients are significant than with the Robust logit model. In addition, the pseudo-R2 is higher for the Robust logit models in both approaches, suggesting that the in-sample fit is much better in the Robust logit model than in the logit model.
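For readers unfamiliar with the in-sample fit measure, the following is a minimal sketch of an ordinary (non-robust) logit fit with one covariate, followed by a pseudo-R2. It rests on two assumptions: the fit uses plain gradient ascent on the log-likelihood rather than the robust estimator used in the study, and the pseudo-R2 is McFadden's version, since the text does not state which variant was reported. The data are hypothetical.

```python
import math

def sigmoid(z):
    """Inverse logit, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the logit model P(y=1) = sigmoid(b0 + b1*x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, lr=0.1, steps=5000):
    """Maximum likelihood by gradient ascent (ordinary logit, not robust)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(b0 + b1 * x)
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def mcfadden_r2(b0, b1, xs, ys):
    """McFadden pseudo-R2: 1 - LL(model) / LL(intercept-only model)."""
    p_bar = sum(ys) / len(ys)
    ll_null = sum(y * math.log(p_bar) + (1 - y) * math.log(1.0 - p_bar) for y in ys)
    return 1.0 - log_likelihood(b0, b1, xs, ys) / ll_null

# Hypothetical scores (x) and failure indicators (1 = bankrupt, 0 = non-failed).
xs = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
ys = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logit(xs, ys)
r2 = mcfadden_r2(b0, b1, xs, ys)
print(b0, b1, r2)
```

A higher value of this measure means the fitted model improves more on the intercept-only model, which is the sense in which a higher pseudo-R2 indicates better in-sample fit.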

K-fold cross validation test:
To observe the effects of bias, we conducted the K-fold cross validation procedure. The data are divided into k subsets; each subset is used in turn as the testing set, while all other subsets combined form the training set on which the model is built. This cross validation procedure allows mean error rates to be calculated, which gives useful insight into the classifier's decisions. When k equals the number of data instances, the technique reduces to leave-one-out cross validation, which has the advantage of allowing the largest amount of training data to be used in each run and also means that the testing procedure is deterministic. In our experiment, we used 5-fold cross validation. Table 4 shows the comparison of 5-fold accuracy results for the original ratios and the transformed (DRS) approach. The descriptive results show that better classification accuracy is achieved under the transformation process and that the Robust logit model outperforms the logit model.
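The cross validation procedure can be sketched as follows. To keep the example short, the classifier is a hypothetical stand-in that places a cut-off halfway between the two group means; it is not the logit or Robust logit model used in the study, and the scores are invented for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_cutoff(scores, labels):
    """Stand-in classifier: cut-off halfway between the two group means."""
    failed = [s for s, l in zip(scores, labels) if l == 1]
    healthy = [s for s, l in zip(scores, labels) if l == 0]
    return (sum(failed) / len(failed) + sum(healthy) / len(healthy)) / 2.0

def predict_cutoff(cutoff, score):
    """Classify as failing (1) when the score exceeds the cut-off."""
    return 1 if score > cutoff else 0

def cross_validated_error(scores, labels, k=5, seed=0):
    """Mean misclassification rate over the k held-out folds."""
    fold_errors = []
    for test_idx in k_fold_indices(len(scores), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(scores)) if i not in held_out]
        cutoff = fit_cutoff([scores[i] for i in train_idx],
                            [labels[i] for i in train_idx])
        wrong = sum(predict_cutoff(cutoff, scores[i]) != labels[i]
                    for i in test_idx)
        fold_errors.append(wrong / len(test_idx))
    return sum(fold_errors) / len(fold_errors)

# Hypothetical, well separated scores: 10 non-failed firms and 10 failed firms.
scores = [0.1 * i for i in range(10)] + [2.0 + 0.1 * i for i in range(10)]
labels = [0] * 10 + [1] * 10

print("5-fold error:", cross_validated_error(scores, labels, k=5))
print("leave-one-out error:", cross_validated_error(scores, labels, k=len(scores)))
```

Setting `k` to the number of observations gives the leave-one-out case, in which every fold holds exactly one observation.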

DISCUSSION
In this study, a new dimension to risk measurement and data representation was proposed with the advent of the Share Risk (SR) method. We briefly derived the properties of the components of the new risk approach, which can overcome the limitations of using common ratios. Our simple methodology provided a geometric illustration of the new proposed risk measure and its transformation behavior. The SR method can be applied equally to a variety of distributional forms, making the technique particularly useful in ratio analysis, where a diverse set of distributional functions has been identified. Because the SR approach is naturally bounded and unaffected by the distance between observations, any outlier effect will be reduced. Similarly, for data containing white noise, the sensitivity and power of statistical tests are improved. Negative values are transformed to a specific variation, which removes the need to delete negative data as was done in previous studies. In addition, proportionality is a theoretical assumption that may not in fact hold, and the degree of departure from it varies across industries and size classes. We also compared the forecasting ability of the logit and Robust logit methods, where the latter considers possible outliers. With respect to forecasts, the Robust logit method is substantially superior to the logit method.

CONCLUSION
One of the best-known anomalies among risk factors is the effect of some ratios on bankruptcy risk and firm returns. In banking, ratios are taken as a proxy for the charter value of banks [19]. The convenience of using financial ratios may exceed the cost of errors in analysis caused by ratio-related model mis-specification and, in general, no equally convenient or superior alternative to ratios has been developed and applied to financial ratio analysis.
This research was motivated by the need to develop an alternative to ratio-based methodology for financial studies. The properties derived in our methodology may serve as general guidelines for ratio analysis, in which there is no arbitrary conditioning because the number of transformations equals the number of observations. Given the proven properties of the new SR method discussed in the methodology and the better numerical results obtained, we strongly suggest the use of this new methodology for ratio analysis, as it provides a conceptual and complementary methodological solution to many problems associated with the use of ratios. Likewise, Robust logit regression can be employed as a regression tool for studies associated with financial data.
Since previous studies used data from one and two years prior to bankruptcy, testing the generalizability of the model with an extension to an additional year is recommended for further studies. Furthermore, as reported by the IMF, research is needed to understand capital structures together with other financial indicators, such as macro- and microeconomic variables, that might affect firms' performance and eventually improve prediction; testing the above model with respect to this issue will therefore be important to continue.