Quantifying Business Process Optimization using Regression

Corresponding Author: Solly Matshonisa Seeletse, Department of Statistics and Operations Research, Sefako Makgatho Health Sciences University, PO Box 107, MEDUNSA, 0204, Gauteng Province, South Africa
Email: solly.seeletse@smu.ac.za

Abstract: The paper applies regression methods to model Business Process Optimisation (BPO) in order to derive a measure of the extent of BPO achievement once optimisation efforts have started. Such a measure helps to identify the components of a business that still need improvement when full optimisation has not yet been achieved. Regression methods were used to explain the tentative relationship of BPO with the variables identified as its components. Two models were developed, one with dummy coefficients and another with probabilistic coefficients. The first was found to be unsuitable and lacked resources for further development; the second was satisfactory. The data used in the experiments were obtained from a private bank in South Africa. The regression model was designed, fitted, statistically tested and found to be acceptable, and an estimate of the level of BPO attainment was developed. The study achieved its main goal, but further experiments with several larger data sets are needed.


Introduction
Modern businesses have started to incorporate Scientific and Engineering (S&E) concepts in their operations to position themselves better in their markets. In attempting this, some business cases cannot be translated exactly into S&E models (Amaral et al., 1997). Hence, scientifically sound approximate models for such cases are often acceptable, especially where statistical tests can be conducted. When S&E concepts are applied to business, they can help to improve business efficiency, which in turn leads to increases in revenue and profits (Burlton, 2001). In essence, business organisations customise S&E benchmarks in an attempt to maximise business benefits while minimising the detriments. Maximising benefits while minimising detriments within the applicable context is optimisation.
Companies compete against rivals for a higher market share (Armstrong and Greene, 2007; Farris et al., 2010). Their strategies include retaining existing clients and competing for new ones, including winning clients away from competitors. The emergence of new companies has intensified competition (Cranston, 2011). Business Process Optimisation (BPO) is one business approach to ensure that a company remains focused and competitive. BPO is an important business concept, but it largely lacks scientifically investigated models that would enable efficient approaches to it. This study contributes by incorporating S&E in BPO, defining a statistical approach that models BPO using regression.

Regression Methods
Regression analysis is a statistical method for analysing data on two or more variables, at least one of which depends on the others (Sen and Srivastava, 2013). Let $Y$ be the dependent variable of interest. A regression model relates the dependent variable to a vector of independent variables, $X = [X_1\ X_2\ \ldots\ X_k]^T$, through the unknown parameter vector $\beta = [\beta_1\ \beta_2\ \ldots\ \beta_k]^T$ by a mathematical equation of the form (Freedman, 2005):
$$Y = f(X, \beta) \tag{1}$$
This function explains how $X$ affects the variable of interest $Y$. Carrying out regression analysis requires specifying the function $f$.

Regression Conditions
Regression assumptions specify the conditions under which multiple regression can be performed properly, ideally yielding unbiased and efficient estimates (Kutner et al., 2004; Wichura, 2006). In calculating a regression equation, the independent variables (the X's) are used to predict the dependent variable (Y). The calculation assumes that the data satisfy certain assumptions. When these assumptions are met, unbiased and efficient estimates are likely to be achieved (Walter and Prozanto, 1997). An unbiased estimate has no systematic tendency to overestimate or underestimate the true value. Efficiency concerns how much the estimate varies around the true value: efficient estimates have small standard errors. Efficiency can be assessed using measures of precision.
The regression approach to modelling is moderately robust because it typically provides reasonably unbiased and efficient estimates even when some assumptions are not fully satisfied (Breiman, 2001). However, large violations of the assumptions result in poor estimates and, consequently, wrong conclusions.

General Linear Model
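Let $x_{ij}$ be the $i$th observation on the $j$th independent variable. Madsen and Thyregod (2010) describe the general multiple regression model with $p$ independent variables as:
$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i \tag{2}$$
Residuals can be written as:
$$\hat{\varepsilon}_i = y_i - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_p x_{ip} \tag{3}$$
The normal equations are:
$$\sum_{i=1}^{n} \sum_{k=1}^{p} x_{ij} x_{ik} \hat{\beta}_k = \sum_{i=1}^{n} x_{ij} y_i, \quad j = 1, \ldots, p \tag{4}$$
In matrix notation, the normal equations are written as:
$$(X^T X)\hat{\beta} = X^T Y \tag{5}$$
where the $ij$th element of $X$ is $x_{ij}$, the $i$th element of the column vector $Y$ is $y_i$ and the $j$th element of $\hat{\beta}$ is $\hat{\beta}_j$. Thus $X$ is $n \times p$, $Y$ is $n \times 1$ and $\hat{\beta}$ is $p \times 1$. The solution of Equation 5 is:
$$\hat{\beta} = (X^T X)^{-1} X^T Y \tag{6}$$
The least squares parameter estimates are obtained from the $p$ normal equations.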
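The matrix form of the normal equations can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `ols_normal_equations` and the small data set are hypothetical and are not taken from the study.

```python
import numpy as np


def ols_normal_equations(X, y):
    """Least squares estimate obtained by solving (X^T X) beta = X^T y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ y)


# Hypothetical data: n = 5 observations, p = 2 columns.
# The first column is all ones, so beta_hat[0] acts as an intercept.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])  # generated exactly as y = 1 + 2x

beta_hat = ols_normal_equations(X, y)
residuals = y - X @ beta_hat

print(beta_hat)   # close to [1, 2]
print(residuals)  # close to zero, since the data lie on a line
```

Solving the system directly is used here instead of inverting $X^T X$ because it is numerically more stable, although both correspond to the same closed-form solution.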

Regression Diagnostics
Once a regression model has been constructed, Dobson and Barnett (2008) counsel that it may be important to confirm the model's goodness-of-fit and the statistical significance of the estimated parameters. Commonly used checks of goodness-of-fit include the R-squared, analyses of the pattern of residuals (using measures of bias and precision) and hypothesis testing. Statistical significance can be checked using an F-test of the overall fit, followed by t-tests of individual parameters. According to Christensen (2002), interpretations of these diagnostic tests rest heavily on the model assumptions. Although examination of the residuals can be used to validate a model, the results of statistical tests become difficult to interpret if the assumptions are violated, for example if the error term does not have a normal distribution. In that case, in small samples the estimated parameters will not follow normal distributions, which complicates inference. With relatively large samples, however, a central limit theorem can be invoked so that hypothesis testing proceeds using asymptotic approximations.
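Two of the checks named above, the R-squared and the overall F statistic, can be computed directly from the residuals. The sketch below is illustrative only: it assumes NumPy, a design matrix whose first column is an intercept, and a hypothetical four-point data set; the function name `goodness_of_fit` is not from the source.

```python
import numpy as np


def goodness_of_fit(X, y, beta_hat):
    """Return (R-squared, overall F statistic) for a fitted linear model.

    Assumes X is n x p and that its first column is all ones (an intercept),
    so the overall F-test has p - 1 and n - p degrees of freedom.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    fitted = X @ beta_hat
    ss_res = np.sum((y - fitted) ** 2)    # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)  # total variation about the mean
    r_squared = 1.0 - ss_res / ss_tot
    f_stat = ((ss_tot - ss_res) / (p - 1)) / (ss_res / (n - p))
    return r_squared, f_stat


# Hypothetical data, not taken from the study.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 2.0, 4.0])

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
r_squared, f_stat = goodness_of_fit(X, y, beta_hat)
print(round(r_squared, 2), round(f_stat, 2))
```

The F statistic would then be compared with a tabulated critical value for the stated degrees of freedom; that lookup is left out of the sketch.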

Application of Regression
The goal of regression analysis is to determine the values of the parameters in Equation (1) that best fit the observed data (Freeman, 1947; Yates et al., 2008). In other words, the goal is to create a mathematical model in which the predicted and observed values of the dependent variable are close. By creating the "best fit" line for all the data points in a two-variable system, values of Y can be predicted from known values of X. Linear regression is used in business to predict events, manage product quality and analyse a variety of data types for decision making (Tishler and Lipovetsky, 2000).

Variables Defining BPO
Business process effectiveness, risk management, and success factors and change management are the three component variables of BPO (Apostolou et al., 2010; Babulall, 2011; Gong and Janssen, 2012). Each of these factors is a random variable because it attains various levels according to the random occurrences controlling its conditions. To formalise this assertion, define these random variables using the following notation:

$X_1$ = Business process effectiveness
$X_2$ = Risk management
$X_3$ = Success factors and change management

Babulall (2011) showed that $X_1$ has 12 attributes, $X_2$ has four and $X_3$ has five, given by the probabilistic attributes below.

$X_1$ = Business Process Effectiveness
$X_{11}$ = Time saving
$X_{12}$ = Follow up with resources from other divisions
$X_{13}$ = Work on many systems to complete tasks
$X_{14}$ = Work involves technological processes
$X_{15}$ = Allows for the best customer service delivery
$X_{16}$ = Cost effective processes
$X_{17}$ = Competitiveness in the organisation
$X_{18}$ = Ability of organisation to attract new clients
$X_{19}$ = Increase in profits
$X_{1,10}$ = Ability to identify new opportunities
$X_{1,11}$ = Launch of new innovative products
$X_{1,12}$ = Serve as a platform for new system selection

$X_2$ = Risk Management
$X_{21}$ = Business processes mapped in a suitable business framework
$X_{22}$ = Access to these mapped processes
$X_{23}$ = Processes allow easy identification of risks
$X_{24}$ = Risks mitigated through processes updating

$X_3$ = Success Factors and Change Management
$X_{31}$ = Process change initiatives align with the organisation's strategy
$X_{32}$ = Organisation has effective mechanisms for managing process change
$X_{33}$ = Business processes continuously reviewed
$X_{34}$ = Process training provided for effecting process change initiative
$X_{35}$ = Staff involved in the process change from start to finish

None of the attributes has common features with the others. This is an indication that the random variables $X_1$, $X_2$ and $X_3$ are mutually exclusive. This property was tested and confirmed by Babulall (2011), who also found that these attributes are the only ones explaining the main variables and therefore concluded that the attributes are exhaustive.

Method to Quantify the Variables
The three descriptor variables of BPO are exhaustive. They can be measured separately since each of them is a full business feature. Hence, by counting the attributes of the three variables, BPO has a total of 21 units (the sum of 12 + 4 + 5 individual, mutually exclusive attributes). The variables contribute unequally to the measurement of BPO. Each attribute of these three variables may be absent from a business (coded 0) or present (coded 1).

Business Process Effectiveness
The Business Process Effectiveness (BPE) variable can range from zero (0), if all 12 of its attributes are recorded as absent (0) in a business process, up to 12, if all of them are recorded as present (1). In BPO, therefore, BPE can contribute from 0 to 12 units. Since BPE can be measured as an independent variable, the extent of each attribute can be assessed.

Risk Management
By the same approach, Risk Management (RM) has four attributes. Hence, the units it can contribute range from 0 up to 4 units.

Success Factors and Change Management
Lastly, Success Factors and Change Management (SFCM) has five attributes. Thus, the units that SFCM can contribute range from 0 up to 5 units.

Relative Importance to BPO
Discussion of BPO attainment generally assumes that business processes have been fully optimised through change (Hammon, 2007), which corresponds to the value 21. This in turn implies that each of BPE, RM and SFCM has been achieved in full. However, because the variables contribute unequal numbers of attributes, they have different levels of importance, or worth, in their contribution to BPO. Of the 21 units of BPO, BPE accounts for 12, RM for 4 and SFCM for 5. Thus, in a complete BPO environment, BPE has a relative worth of about 0.57 (12/21), RM about 0.19 (4/21) and SFCM about 0.24 (5/21).
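The relative worths quoted above follow from dividing each attribute count by the total of 21. A short sketch of that arithmetic is given below; the dictionary and function names are illustrative choices, and only the counts 12, 4 and 5 come from the text.

```python
from fractions import Fraction

# Attribute counts per component, as stated in the text.
ATTRIBUTE_COUNTS = {"BPE": 12, "RM": 4, "SFCM": 5}


def relative_worth(counts):
    """Share of the total attribute count held by each component."""
    total = sum(counts.values())
    return {name: Fraction(k, total) for name, k in counts.items()}


weights = relative_worth(ATTRIBUTE_COUNTS)
for name, w in weights.items():
    print(f"{name}: {ATTRIBUTE_COUNTS[name]}/21 = {float(w):.2f}")
```

Exact fractions are used so that the three shares can be checked to sum to exactly 1 before they are rounded for reporting.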

Description of BPO
In simpler terms, BPO is an effort to make firms more process centric (Vernon, 2004). It entails reviewing processes that are mapped in a suitable business framework in order to obtain the best outcome from what is available. In most instances, the elements playing a role in a business cannot all be maximised simultaneously to maximise worth. Usually, while some elements are maximised, others are lowered only to the point where they do not reduce value, in a trade-off (or catch-22) situation. This intent can be achieved through the application of Business Process Reengineering (BPR) approaches and other suitable business methodologies. BPR is a concept that entails analysing and designing workflows and processes within an organisation (Davenport, 1993).

Component Variables of BPO
BPO ($Y$) components, which are also its advantages, include business effectiveness ($X_1$), risk identification and mitigation ($X_2$) and profit maximisation ($X_3$).

Study Design
The study is a combination of qualitative and quantitative methods. The BPO concept is mainly qualitative while regression modelling is quantitative. The study aims to integrate BPO and regression into a useful quantitative model.

Data Collection
Secondary data were used. A study was undertaken by Babulall (2011) to investigate how business processes in a South African private bank could be optimised. The condition under which data were released for the study was that the name of the bank would not be revealed.

Data Management and Analysis
SAS, STATA and SPSS were the statistical packages used in the statistical analyses and data evaluation in this study. The previous section connected BPO with regression by describing BPO variables that should be used in the model. Initial efforts to measure BPO have also been covered.

BPO Regression Model Construction
This section models BPO using a linear regression model. It then provides an estimate of the extent of BPO attainment. Define: where $X_{ij} = 0$ for all $i = 2$, $j \geq 5$ and for all $i = 3$, $j \geq 6$. Also, let: The nature of the other alpha values depends on the approach outlined by the analyst in measuring BPO. Then define the parameters: The BPO ($Y$) in this instance is proposed to be a regression function (NB: the absence of an error term is deliberate): The regression model using probability coefficients on $k$ variables leads to the form:
$$Y = p_1 X_1 + p_2 X_2 + \cdots + p_k X_k \tag{12}$$
where $0 \leq p_i \leq 1$ for $i = 1, 2, \ldots, k$ and $\sum_{i=1}^{k} p_i = 1$. If dummy variables $\delta$ are used, the values 0 and 1 indicate the absence and presence, respectively, of an attribute in each component. Model (11) will then take the form:

Quantifying BPO
The variables of BPO were identified, and the relative worth of each variable in evaluating BPO was clarified. Ideally, full achievement of BPO means that all the attributes of the three BPO components are included in a business process. Thus, as there are 21 attributes, a respondent who believes that all of them are included in the systems is convinced of optimality. The completely optimal system would have a BPO measure of 1 (= 21/21). If none of the attributes is present, the BPO measure is 0 (= 0/21). Other cases lead to values between 0 and 1. Values near 1 indicate high optimisation and values near 0 indicate low optimisation. Because trade-offs are accepted, BPO measures close to 1 are still acceptable for considering a system optimal.

Explaining the Quantification of BPO
The values for evaluating BPO are fractions or proportions between 0 and 1. They can also be expressed as percentages. Since the origin of the variables showed randomness, the variables can also be considered to be some probability measures if necessary.

Probability Coefficients
Writing the regression model (12) to reflect that $X_1$ has 12 attributes, $X_2$ has four and $X_3$ has five, by weighting according to these numbers out of the total of 21 attributes (where the component variables are column vectors), it becomes:
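As a worked reading of this weighting, and assuming that the coefficients are simply the attribute shares 12/21, 4/21 and 5/21 given under Relative Importance to BPO (this specific form is an assumption made for illustration), the weighted model would read:

```latex
Y = \frac{12}{21} X_1 + \frac{4}{21} X_2 + \frac{5}{21} X_3,
\qquad
p_1 + p_2 + p_3 = \frac{12 + 4 + 5}{21} = 1 .
```

Under this reading the coefficients are approximately 0.57, 0.19 and 0.24, which matches the relative worths quoted earlier.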

Dummy Coefficients
This approach uses dummy variables, coded 0 and 1 to indicate the absence and presence, respectively, of an attribute in each component. Model (11) will then take the form: Here, $\delta_i = (\delta_{i1}, \delta_{i2}, \ldots, \delta_{i,12})$, $i = 1, 2, 3$, are the coefficients indicating absence (= 0) or presence (= 1) of the attributes corresponding to the components of BPO, with $\delta_{ij} = 0$ assigned for all $i = 2$, $j \geq 5$ and for all $i = 3$, $j \geq 6$.
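The zero-padding convention for the dummy coefficients can be made concrete with a small data structure. The sketch below is an illustration only: the function name `pad_indicators` and the example flags are hypothetical, and it shows how the indicators might be stored, not the form of the dummy-coefficient model itself.

```python
# Attribute counts for X_1, X_2 and X_3, as stated in the text.
ATTRIBUTE_COUNTS = (12, 4, 5)
WIDTH = max(ATTRIBUTE_COUNTS)  # every row is padded to 12 entries


def pad_indicators(present):
    """Build three rows of 0/1 indicators, zero-padded to a common length.

    `present` holds one list of 0/1 flags per component, with 12, 4 and 5
    entries. Entries beyond a component's own attributes are fixed at 0,
    mirroring delta_ij = 0 for i = 2, j >= 5 and for i = 3, j >= 6.
    """
    rows = []
    for flags, count in zip(present, ATTRIBUTE_COUNTS):
        if len(flags) != count:
            raise ValueError(f"expected {count} flags, got {len(flags)}")
        if any(f not in (0, 1) for f in flags):
            raise ValueError("flags must be 0 (absent) or 1 (present)")
        rows.append(list(flags) + [0] * (WIDTH - count))
    return rows


# Hypothetical example: 5, 2 and 2 attributes recorded as present.
delta = pad_indicators([
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0, 0],
])
present_total = sum(sum(row) for row in delta)
print(present_total)  # number of attributes flagged as present
```

Padding to a common length lets the three components be handled as rows of one 3 x 12 array, at the cost of carrying entries that are zero by construction.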

Model Building
The model developed appears initially in a tentative form and is later improved. The improvement can be made by adapting some elements of the tentative model, removing others, or adding elements as the tests conducted may indicate. Generally, there are four fundamental steps in model building: developing a tentative model, fitting the model, testing the model and then adapting the model as needed (Henderson and Quant, 1958; Hogg and Craig, 1995). Knowledge of mathematics, specifically in the building of mathematical models, is necessary. Statistical methods are then needed to estimate and test the model before presenting it for use in prediction. Various measures of quality can also be used to qualify the model further. Such measures evaluate bias and precision based on residual analysis.

Data for Building the BPO Model
For mathematical and statistical models, the model has to be fitted by estimating the unknown parameters. The estimated coefficients should also be tested. Reliable and valid data are therefore paramount, and their absence can be a serious setback for model development. Further, only suitable data should be used. Currently, the data available are inadequate for reliable estimation and testing. As a compromise, this study used data collected from a bank in South Africa, where a questionnaire was used to determine the number of people who experienced or judged the various items of the bank service. Thus frequency data were used, and the applicable relative frequencies were used to estimate probability values.

BPO Measurement and Model Fitting
The illustration in this section fits the regression model discussed earlier. The dependent variable of interest, BPO, is regressed on the component variables BPE, RM and SFCM. A measurement of the extent of BPO is given in this section. Then a regression model is introduced and tested.

Estimating Extent of BPO Achievement
The ideal state is to have full BPO attainment. However, employees are sometimes blamed unreasonably for not having achieved full BPO, without receiving credit for how far they have progressed towards it. That is, even when they are accused of failing to reach the ideal BPO state, there may have been good progress towards it. This section aims to give an accurate account of the progress made towards BPO. It also aims to identify all the elements that have been achieved and what is still outstanding on the way to the ideal state of BPO.
The study involves a total of 21 attributes. Based on the simple experiment made up to this stage, the level at which BPO has been achieved can be estimated. The data showed that five attributes of $X_1$ were reliably 'achieved', together with two of $X_2$ and two of $X_3$, giving nine in total. The estimated BPO achievement in this case was therefore 9/21, or 42.9%. This can be interpreted to mean that minimising detriments and maximising benefits occurred in about 43% of the attributes, with compromises being made on the remaining 57%.
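The attainment estimate above can be reproduced, together with the attributes still outstanding in each component. This is a minimal sketch: the function name `bpo_attainment` and the dictionary layout are illustrative, while the counts (5, 2 and 2 achieved out of 12, 4 and 5) are those reported in the text.

```python
# Attribute counts per component, as stated in the text.
ATTRIBUTE_COUNTS = {"BPE": 12, "RM": 4, "SFCM": 5}


def bpo_attainment(achieved):
    """Return (attainment proportion, outstanding attributes per component)."""
    total = sum(ATTRIBUTE_COUNTS.values())
    for name, k in achieved.items():
        if not 0 <= k <= ATTRIBUTE_COUNTS[name]:
            raise ValueError(f"{name}: {k} is outside 0..{ATTRIBUTE_COUNTS[name]}")
    score = sum(achieved.values()) / total
    outstanding = {
        name: ATTRIBUTE_COUNTS[name] - achieved[name] for name in ATTRIBUTE_COUNTS
    }
    return score, outstanding


# Achieved counts reported in the text: five of X_1, two of X_2, two of X_3.
score, outstanding = bpo_attainment({"BPE": 5, "RM": 2, "SFCM": 2})
print(f"{score:.1%}")  # 42.9%
print(outstanding)     # attributes still to be achieved in each component
```

Reporting the outstanding counts alongside the proportion shows where the remaining effort lies, here mostly in business process effectiveness.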

Tentative Model
The model we are testing in this presentation is Equation 13, given as:
$$Y = p_1 X_1 + p_2 X_2 + p_3 X_3$$

Estimation of Parameters
Based on the study propositions, the coefficient values can be given as the levels of importance or weights of the various component variables in defining the model, which is:

Model Testing
Model testing refers to testing for goodness-of-fit of the model using a chi-square test. The chi-square statistic is a squared distance measure that compares the expected data with the actual data. It is based on the hypothesis that the form given is the correct one. If a model seems incorrect, some transformation methods may be needed. The (null) hypothesis being tested is given by:  The alternative hypothesis obviously suggests that at least one of the coefficients is not what the null hypothesis states.

An Issue with the Data
The data came from a survey in a bank, collected under the guidance of a professional statistician and based on a random sample of 93 respondents. (The fine details of the exercise were not supplied because of their sensitivity, according to the bank's IT personnel who supplied the data.) According to these staff, 49 responses were allocated to $X_1$, 21 to $X_2$ and 23 to $X_3$ as contributions towards BPO. These serve as the observed data. The expected frequencies for the chi-square test were obtained by multiplying the $H_0$ proportions by 93. Despite having no hand in collecting the data, the study assumed that what the bank official supplied was truthful.
In order to test this hypothesis, the expected values under the null hypothesis are needed. This requires the observed total of 93 to be distributed according to the proportions of this hypothesis. In Table 1, the observed values $o_i$ are given in the first row and the derived expected values $e_i$ in the second row.
From statistical tables in Bless and Kathuria (1993), the critical chi-square ($\chi^2$) value with two degrees of freedom at the 5% level of significance is $\chi^2_{2;0.05} = 5.991$. This value is the benchmark against which the test statistic is compared. If the test statistic exceeds the critical value, the null hypothesis is rejected. The test statistic is calculated and found to be:
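The goodness-of-fit calculation can be sketched as follows. The observed counts (49, 21, 23) and the critical value 5.991 are taken from the text; the null proportions 12/21, 4/21 and 5/21 are an assumption based on the relative worths given earlier, so the figure reported by the study may differ slightly if rounded proportions were used. The function and variable names are illustrative.

```python
# Observed responses per component, as reported in the text.
observed = {"X1": 49, "X2": 21, "X3": 23}

# ASSUMPTION: null proportions equal the attribute shares 12/21, 4/21, 5/21.
null_proportions = {"X1": 12 / 21, "X2": 4 / 21, "X3": 5 / 21}

# Critical value for 2 degrees of freedom at the 5% level, from the text.
CRITICAL_VALUE = 5.991


def chi_square_gof(observed, proportions):
    """Pearson chi-square statistic and the expected counts under H0."""
    n = sum(observed.values())
    expected = {k: n * proportions[k] for k in observed}
    statistic = sum(
        (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
    )
    return statistic, expected


statistic, expected = chi_square_gof(observed, null_proportions)
reject_h0 = statistic > CRITICAL_VALUE

print({k: round(v, 2) for k, v in expected.items()})
print(round(statistic, 3), reject_h0)
```

Under this assumption the statistic comes to roughly 0.97, well below 5.991, which is consistent with the model not being rejected.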

Discussion
A procedure to measure the extent to which BPO has been achieved in business optimisation has been developed. This method identifies the elements already achieved.
It therefore makes it easier to judge how much effort to invest in the business in order to close the remaining gaps towards optimisation. The regression model that was developed and tested was not rejected by the goodness-of-fit test.

Conclusion
The procedure to measure BPO is straightforward and should be retained, especially since it shows how much has been achieved and how much is still lacking. The regression model developed for BPO appears, at this stage, to be acceptable.

Study Limitation
The model based on dummy coefficients was not developed far enough. It was limited by not allowing interim measures of BPO. There may be cases where interim measures can be applied, which could enable scenarios for data collection. In this study the model was only introduced and not pursued further. For the regression model based on probability coefficients, the problem was that only one data set was used.
Companies are reluctant to give out information about their business affairs. The data set was also not large enough to estimate the true values of the model coefficients reliably.

Recommendations
Three recommendations are made. The study recommends:
• Collection of large, relevant data sets under conditions of anonymity
• Use of several large data sets from different business organisations to fit and test the model with probability coefficients
• Exploration of the model with dummy variables on its merits, rather than setting it aside because of the limitation that prevented its further development

Author's Contributions
Gezani Richman Miyambu: Solved exercises based on the model idea, developed the model, tested it and presented results. Paper development was a joint effort.
Solly Matshonisa Seeletse: Supervised the first author, gave the exercises to GRM to solve and selected this journal after joint effort of the paper development.

Ethics
The development of this paper was based on the ethical clearance by the bank and the Department of Statistics and Operations Research of the university in undertaking this study. No ethical issues are anticipated as a result.