Using Regression Analysis to Predict the Demand Function of Electricity: A Case Study

Corresponding Author: Mohammad Najjartabar Bisheh, Department of Industrial and Manufacturing Systems Engineering, Kansas State University, Manhattan, Kansas, USA. Email: Mnajjartabar@ksu.edu

Abstract: Due to the growing electricity consumption in Iran, investigating changes in electricity demand is one of the fundamental challenges facing many professionals and planners. Planners continually invest effort in addressing this issue by accurately predicting electricity demand over the years and increasing capacity accordingly. One of the main tools for predicting electricity demand is a regression model. Generally, in the literature, electricity prices and GDP per capita have been considered as the independent variables for estimating annual electricity demand. In this study, we used data on electricity prices, GDP per capita and investment per capita from 1974 to 2007 to estimate annual electricity demand. In our estimated model, the price elasticity, income elasticity and investment elasticity were 0.187, -0.566 and 1.207 respectively. The annual demand for electricity for the years 2008 and 2009 was then predicted. The low error rate between the actual values and the predicted values shows that the model is acceptable.


Introduction
Electricity, as a form of energy, is a backbone of any country's economy. Energy itself is not what consumers demand; what it is used for (i.e., heating, cooling, cooking, lighting and motive power) creates the demand. In Iran, after oil and gas, electricity has been used extensively for various purposes. Its advantages in the energy cycle (e.g., cleanness), its simple conversion to other forms of energy and its use in electrical appliances are the factors that increase the tendency toward electricity consumption (Hosseinalizadeh et al., 2016; Zahraee et al., 2020). Technical issues such as mitigation of the power (Hamidi and Kiany, 2019), transformers (Khan et al., 2018) and overestimation or underestimation of power (Karimi and Natarajan, 2020) impose huge costs on the energy network. However, concerning such a tendency, some questions have been overlooked and need more investigation: (a) What is the expected demand for electricity? (b) What factors influence electricity consumption, and to what extent? (c) What policy needs to be employed to decrease the consumption of electricity?
Answering these questions requires understanding the mechanism of electricity consumption and electricity demand in the country. A country's economy consists of several sectors that interact with and influence each other. Generally, in microeconomic theory, electricity demand is influenced by two main factors, price and income. Besides these factors, some external factors can affect these sectors. One of the most important sectors is the energy economy.
The energy economy accounts for a significant percentage of the global economy. It also influences the industry, transport and household sectors and can contribute to a country's economic success or failure. In Iran, the energy sector is an important factor in economic growth, and the country's biggest source of revenue is oil exports.
One of the most common discussions about the future of energy focuses on the contribution of the income elasticity of energy demand to future economic growth. The results of the studies conducted in this area are used for developing the infrastructure of energy planning and energy policy making. These studies also highlight how to meet future energy needs. Compared to other forms of energy, electrical energy requires more attention in planning and policy because it is one of the cleanest and easiest forms of energy to use.
Proper and efficient planning requires identifying the problem and finding possible solutions. Regression-based models have shown promising performance in finding possible solutions for this kind of problem (Fanoodi et al., 2019). Different types of regression analysis have been used successfully to predict demand in different contexts, such as software development (Urbaczek et al., 2020), supply chain management (Najjartabar-Bisheh et al., 2018), financial management, productivity analysis (Esmaeili et al., 2019), healthcare systems (Jahantigh et al., 2017) and many other areas. In this study, the relationship between electricity demand and the other variables is described in the following sections, and the electricity demand is calculated in relation to the relevant parameters. In our preliminary research, using the data from 1974 to 2007, we developed a model to estimate the electricity demand per capita in terms of price, GDP and investment. Then, accuracy tests were carried out and their results were analyzed. This paper is structured as follows. In the second section, we review the relevant literature in both the Iranian and international contexts. In section three, the mathematical and economic models of electricity demand are presented. The statistical model is described in section four, the model datasets are identified in section five and the initial model is estimated in section six. In section seven, the estimated model is tested and the errors of the model are corrected. In section eight, the model is tested using the data of 2008 and 2009, and the results are discussed in section nine.

Literature Review
Scholars have made considerable efforts to understand Iran's energy plan (Esmaeili et al., 2015; Montazeri and Najjartabar-Bisheh, 2017). For example, in 1977, a research institute affiliated with Stanford University attempted to investigate and document the "long-term energy plan" in Iran. After the revolution and the war, in the nineties, some studies were carried out by the Planning and Budget Organization (PBO). The pertinent studies in the Iranian and international contexts are reviewed in this section. Lam (1998) studied time series data from 1971 to 1993 in Hong Kong and identified the effect of different factors (i.e., price of electricity, household income, household size and number of hot days) on electricity consumption. The study hypothesized that the number of cold days and wet days could predict residential electricity consumption; however, it did not confirm the effect of these parameters (Lam, 1998). Hondroyiannis (2004) investigated the stability of electricity demand and of the price and income elasticities in both the long term and the short term using a vector error correction model. The results showed that income, price and weather conditions can influence residential electricity demand. The weighted average temperature of the different months of the year was used to represent weather conditions. Kamerschen and Porter (2004) examined residential, industrial and total electricity demand. For this purpose, they used a partial adjustment method and a simultaneous equation method. They found that climate can predict some changes in electricity consumption and that cold weather predicts demand better than warm weather. Atakhanova and Howie (2007) inspected the electricity market in Kazakhstan based on the elasticity of demand for electricity in the country and offered some suggestions for improving conditions.
In their paper, they calculated residential, industrial and total electricity consumption in Kazakhstan. In the proposed model for the residential sector, household expenditure, economic restructuring, cost and electricity consumption in the previous period were important variables. The study showed that the price elasticity of demand was very low (Atakhanova and Howie, 2007). Narayan et al. (2007) reviewed the legislative trend in the G7 countries in the field of electricity and suggested an equation for estimating per capita electricity consumption in these countries. The variables in this equation were the real price, real income and the price of gas. Amarawickrama and Hunt (2008) examined various demand elasticities in Sri Lanka and predicted the demand for 2025. They reported that electricity and gas prices and family income would be the most important variables influencing consumption (Amarawickrama and Hunt, 2008). Bianco et al. (2009) estimated the demand function in Italy. According to the study, per capita income, the price of electricity and its three lags, and electricity consumption with three lags were significant in the equation. Both the short-term and long-term price elasticities of consumption were calculated, and the income elasticity was larger than the price elasticity.

Review of Literature in Iranian Context
Fatholahzadeh examined electricity demand in the residential sector as a function of households' real expenditure per capita, the real price of electricity, the real price of alternative energy carriers (i.e., gas, oil) and per capita consumption with a lag period. The study found that electricity demand in the household sector was inelastic to changes in both price and income and that the income elasticity was always larger than the price elasticity (Aghdam, 1993). In 1999, using the error correction technique of Engle and Granger, Emami Meybodi investigated per capita electricity demand as a function of per capita income and average household electricity prices in the domestic sector. In his research, the income elasticity was smaller than the price elasticity and both were less than unity (Emami, 1999). Attar (2000) considered energy consumption in the residential sector as a function of energy prices and disposable income. The results indicated that energy demand was inelastic with respect to changes in the explanatory variables, but in terms of absolute value the price elasticity was less than the income elasticity (Attar, 2000). Pazhoyan and Teimouri (2000) examined the country's electricity demand as a function of the real price of electricity, alternative energy prices (i.e., weighted average prices of kerosene, diesel, fuel oil, LPG, etc.) and GDP. The price elasticity was less than unity and the income elasticity was greater than unity. Mirza Mohammadi and Karimi (2010) considered the domestic electricity demand function as a function of the household tariff price, per capita income, the number of warm days in a year and per capita consumption with a lag period. They concluded that both the income and price elasticities were very low.
In the present study, we investigated the total electricity demand in Iran and in all the sectors. To this purpose, we estimated demand per capita as a function of electricity price, GDP and investment per capita.

Economic and Mathematical Models
Before developing our models, we reviewed not only the important standards, efficacy and effectiveness of the economy sectors but also their limitations. One of the most important tools for economic analysis is supply and demand functions. Supply functions can help understand the behavior of producers and predict long-term economic impact. Moreover, supply functions, along with other economic parameters can be used to have a better understanding of the industry situation.
The demand function and the supply function can describe the market conditions, price, growth and the future of an industry. They can also help predict consumers' utility function. Therefore, both the identification and careful analysis of the supply and demand functions for each economic sector need to be prioritized. Furthermore, this approach also supports the analysis of other sectors.
Studies on the energy sector in Iran are quite limited because the supply is monopolized by the government. From an economic point of view, this limitation seems reasonable, because in a monopoly market we can treat the production level and market prices as exogenous variables, that is, variables that influence the model from outside and are not influenced by the model parameters. The supply of energy by governmental organizations can therefore influence the model from outside. Consequently, in studying the energy sector, the main emphasis should be on the demand function.
The demand function has several uses. For example, demand can be studied in relation to changes in income and the consumption of a product. It is expected that as a consumer's income increases, consumption also increases. The demand function can also help determine whether the commodity under analysis is a normal good or an inferior good. It is expected that as the price decreases, the quantity demanded will increase. Goods that are consumed more as their prices increase are called Giffen goods.
There are various forms of the demand function. The most important of them are linear, linear-logarithmic and translog. The linear-logarithmic function, or Cobb-Douglas function, has been used extensively for its advantages, and for that reason we used it for our data analysis. Assuming electricity consumption per capita is a function of the electricity price, GDP per capita and investment per capita, we have:

$$D = \alpha_0 P^{\alpha_1} Y^{\alpha_2} I^{\alpha_3}$$

where $D$ is electricity consumption per capita, $P$ is the electricity price, $Y$ is GDP per capita and $I$ is investment per capita. In the above equation, $\alpha_1$ is the price elasticity and $\alpha_2$ is the income elasticity of demand. Therefore, the consumer demand function can be used to estimate elasticity values. The elasticity, or sensitivity, of demand to price shows how demand responds to price changes. Defining it by the derivative alone is not suitable, because the value of the derivative depends on the units in which price and quantity are measured. Instead, it is defined as the relative change in quantity divided by the relative change in price, which can be calculated by using the following formula:

$$E_p = \frac{dq/q}{dp/p} = \frac{dq}{dp} \cdot \frac{p}{q}$$

where $q$ is the quantity demanded and $p$ is the price. If p = 0, then the elasticity is 0, and if q = 0, then the elasticity is equal to negative infinity.

It should be noted that the price elasticity of demand for normal goods is negative. The more sensitive a commodity or market is to changes in the price, the larger the absolute value of the elasticity. If the absolute value of the elasticity is less than 1, demand is inelastic (insensitive), and if it is greater than 1, demand is elastic (sensitive). The elasticity, or sensitivity, of demand to income shows how demand shifts in relation to changes in income. Therefore, income elasticity can be defined as follows:

$$E_Y = \frac{dq/q}{dY/Y} = \frac{dq}{dY} \cdot \frac{Y}{q}$$

where $q$ is the quantity demanded and $Y$ is income. Since energy consumption is functionally related to economic activity, we can expect energy demand to be positively correlated with GDP.
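As a brief illustration of the elasticity definition in this section, the following Python sketch approximates the point elasticity (dq/dp)(p/q) with a central finite difference. The demand curve `example_demand` and its parameters are hypothetical and are not taken from the paper; they are chosen only to show that a constant-elasticity (Cobb-Douglas type) curve returns its exponent as the elasticity at every price.

```python
def point_elasticity(demand, p, h=1e-6):
    """Approximate (dq/dp) * (p/q) with a central finite difference."""
    q = demand(p)
    dq_dp = (demand(p + h) - demand(p - h)) / (2 * h)
    return dq_dp * p / q


# Hypothetical constant-elasticity demand curve: q = 50 * p**(-0.4)
def example_demand(p):
    return 50.0 * p ** (-0.4)


elasticity_at_2 = point_elasticity(example_demand, 2.0)
elasticity_at_5 = point_elasticity(example_demand, 5.0)
print(round(elasticity_at_2, 4), round(elasticity_at_5, 4))
```

Both values come out close to the exponent of the assumed curve, and their absolute value is below 1, which corresponds to inelastic demand in the terminology above.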

The Statistical Model
In this study, we used the Cobb-Douglas model, which was proposed by the economist Paul Douglas and the mathematician Charles Cobb. The model can be used in two forms, exponential and logarithmic. The exponential form is:

$$D = \alpha_0 X_1^{\alpha_1} X_2^{\alpha_2} \cdots X_n^{\alpha_n}$$

where D is the dependent variable and X1 to Xn are independent variables. Using a one-to-one logarithmic transformation (for positive values), the mathematical function of this model becomes:

$$\ln D = \ln \alpha_0 + \alpha_1 \ln X_1 + \alpha_2 \ln X_2 + \cdots + \alpha_n \ln X_n$$

The price elasticity α1 and the income elasticity α2 were estimated for the analysis.

Identifying the Model Datasets
The changes in demand in relation to each independent variable are shown in Fig. 1.
As can be seen in Fig. 1, electricity consumption increases along with the following parameters: population, the amount of investment, GDP and electricity price. The diagram shows that the electricity price rises together with electricity consumption. Since electricity is an essential commodity and not a Giffen good, consumption rising with price seems unreasonable. However, this observation can be interpreted as follows: the growth of the electricity price over the years has not kept pace with inflation. This means that the real price of electricity in the household consumer price basket has decreased. In reality, the electricity price is not the actual price: it has been subsidized by the government and, at the same time, its real price has decreased over time.
As the demand function in this study estimates per capita electricity consumption in terms of price, investment per capita and GDP per capita, the values of our new variables (per capita consumption, per capita investment and GDP per capita) were obtained by dividing the original values by the population. As our model is based on the natural logarithm, we take the natural logarithm of these variables. Figure 2 shows the behavior of the natural logarithm of the per capita variables and Figure 3 shows the distribution of the error terms.
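The data preparation described above (divide by population, then take the natural logarithm) can be sketched in a few lines of Python. The function name and the figures below are hypothetical placeholders, not values from the paper's dataset.

```python
import math


def to_log_per_capita(total, population):
    """Return ln(total / population); both inputs must be positive."""
    if total <= 0 or population <= 0:
        raise ValueError("logarithm requires positive values")
    return math.log(total / population)


# Hypothetical figures for a single year (not taken from the paper's data)
consumption_kwh = 1.5e11
population = 7.0e7
ln_demand_per_capita = to_log_per_capita(consumption_kwh, population)
print(ln_demand_per_capita)
```

The positivity check reflects the condition under which the logarithmic transformation is one-to-one.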

Estimation of the Model
As mentioned before, the coefficients of the price and GDP per capita variables in this model serve as the price and income elasticities of electricity consumption. Therefore, to calculate the price and income elasticities, the model was first estimated by a regression analysis. The preliminary results are shown in Table 1.
As shown in the table, all the coefficients except the constant of the model are statistically significant. This means that the constant of the model can be treated as zero. The mathematical interpretation is that the model passes through the origin of the coordinates. The economic interpretation is that someone with no income and no capital has no electricity consumption. We therefore re-estimate the model without a constant.
As can be seen in Table 2, the values of $R^2$ and adjusted $R^2$ ($\bar{R}^2$) obtained from the regression analysis increase to 0.9999 and all the coefficients are statistically significant.
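For readers who want to see what estimating a log-linear model without a constant involves, here is a minimal pure-Python sketch that solves the normal equations for a regression through the origin. It is an illustration under stated assumptions, not the authors' procedure: the helper names, the coefficients in `true_coeffs` and the observations in `raw` are all made up, and the data are noise-free so that the fit recovers the assumed coefficients.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_no_intercept(rows, y):
    """Least squares for y = X b with no constant, via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free data built from assumed elasticities (not the paper's data)
true_coeffs = [0.2, -0.5, 1.2]
raw = [(1.0, 2.0, 3.0), (1.5, 2.5, 2.0), (2.0, 4.0, 5.0), (3.0, 3.5, 4.5), (4.0, 6.0, 7.0)]
log_rows = [[math.log(v) for v in obs] for obs in raw]
log_demand = [sum(b * x for b, x in zip(true_coeffs, row)) for row in log_rows]
estimated = ols_no_intercept(log_rows, log_demand)
print([round(v, 6) for v in estimated])
```

Because the regressors enter in logarithms, each estimated coefficient can be read directly as an elasticity.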

Confirmatory Tests
In econometric methodology, five common assumptions must hold for the model to be acceptable: non-randomness of each independent variable, normality of the error terms, absence of autocorrelation in the disturbance terms, homogeneity of variance of the error terms (homoscedasticity) and absence of collinearity between the independent variables.

Non-Randomness of Each Independent Variable
The data for the independent variables (price, GDP per capita and investment per capita) are consecutive annual observations extracted from a real database.

Normality of Error Terms
As mentioned above, one of the basic assumptions in the econometric model is the normality of the error terms. Different tests can be performed to examine whether this assumption has been met. Normality of the error terms can be checked by plotting the distribution of the residuals and comparing it with the density function of the normal distribution. The following chart depicts the distribution of the error terms. As this distribution is very close to the normal distribution, the assumption of normality of the error terms has been met.
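The check described above is graphical. As a complementary numerical sketch, the Jarque-Bera statistic compares the skewness and kurtosis of the residuals with those of a normal distribution. This is a different technique from the visual comparison used in the paper, it is only asymptotically valid (so it is rough for a short annual series), and the residual series below are made up for illustration.

```python
import math


def jarque_bera(residuals):
    """Return (JB statistic, p-value) from sample skewness and kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square survival function, 2 d.o.f.
    return jb, p_value


# Made-up residual series (not the paper's residuals)
symmetric = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
skewed = [0.0] * 9 + [10.0]
jb_symmetric, p_symmetric = jarque_bera(symmetric)
jb_skewed, p_skewed = jarque_bera(skewed)
print(p_symmetric, p_skewed)
```

A large p-value means normality is not rejected; a very small one, as for the skewed series, signals a departure from normality.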

Testing for the Presence or Absence of the Autocorrelation
Because the data of the model are time series, another concern is whether the error terms of the model are correlated over time. To diagnose this issue, the model error terms need to be tested. An autocorrelation test can be used for this purpose. In these tests, the null hypothesis states that no autocorrelation exists, while the alternative hypothesis states that autocorrelation exists. In other words:

$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$

where $\rho$ is the first-order autocorrelation coefficient of the error terms. One of the tests used for checking the absence or presence of autocorrelation is the Durbin-Watson test. The Durbin-Watson statistic, which is based on the estimated residuals $e_t$ of the model, is calculated as follows:

$$DW = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$

The Durbin-Watson statistic falls between zero and four. The obtained values are interpreted as follows: when the value is close to zero, the autocorrelation is positive; when it is close to 4, the autocorrelation is negative; when it is close to 2, there is no autocorrelation. We obtained 0.419 after performing this test. Because this value is close to zero, the autocorrelation is positive. To solve this problem, the lagged residuals were added to the model as a variable, because the trend in the residuals is then controlled by the lagged variable. As a result, the model parameters (the income, price and investment elasticities) can be estimated more accurately. The lagged residuals were generated, the previous regression was re-run with the lagged residuals included, and the Durbin-Watson test was conducted again. The results are as follows.
As we can see in Table 4, adding the lagged residual variable brought $R^2$ and adjusted $R^2$ ($\bar{R}^2$) to 0.999. In addition, the Durbin-Watson statistic rose to 1.439. This shows that the autocorrelation has largely disappeared.
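The Durbin-Watson statistic discussed in this section is the sum of squared successive differences of the residuals divided by the sum of squared residuals. A minimal Python sketch follows; the two residual series are invented to show the two extremes and are not the paper's residuals.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Illustrative residual series (made up)
trending = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]        # slow drift: positive autocorrelation
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # sign flips: negative autocorrelation
dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
print(dw_trending, dw_alternating)
```

The drifting series gives a value near 0 and the alternating series a value well above 2, matching the interpretation of the statistic given in the text.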

Testing for the Presence or Absence of Multi-Collinearity
To test whether there is multicollinearity between the independent variables of the model, several criteria can be considered. One of these indicators is the Variance Inflation Factor (VIF). If the VIF is greater than 10, the multicollinearity between the independent variables of the model is problematic. As Table 4 suggests, the VIF values do not indicate any multicollinearity between the variables.
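The VIF of a regressor is 1 / (1 - R²), where R² comes from regressing that regressor on the other regressors. The sketch below covers only the two-regressor special case, where that R² equals the squared Pearson correlation; this simplification is an assumption made to keep the example short, and the series are made up.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors R^2 is the squared correlation."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Made-up series for illustration
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_unrelated = [2.0, 1.0, 3.0, 1.0, 2.0]   # uncorrelated with x1
x2_near_copy = [1.1, 1.9, 3.2, 3.9, 5.1]   # almost a copy of x1
vif_low = vif_two_regressors(x1, x2_unrelated)
vif_high = vif_two_regressors(x1, x2_near_copy)
print(vif_low, vif_high)
```

The uncorrelated pair gives a VIF of about 1, while the near-duplicate pair exceeds the threshold of 10 mentioned above.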

Testing for the Presence or Absence of Non-Homogeneity Variance
One of the assumptions made in estimating the proposed model is that all of the error terms have the same variance. Under this assumption, we have $e_t \sim N(0, \sigma^2)$. Non-homogeneity of variance (heteroscedasticity) makes the OLS method inefficient, and the OLS estimator will no longer be the minimum variance estimator. Common tests to determine the presence or absence of non-homogeneity of variance are: (1) Breusch-Pagan, (2) White.
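The null hypothesis of these tests is the absence of non-homogeneity of variance and the alternative hypothesis is its presence. In other words:

$$H_0: \sigma_t^2 = \sigma^2 \ \text{for all } t \qquad H_1: \sigma_t^2 \neq \sigma^2 \ \text{for some } t$$

where $\sigma_t^2$ is the variance of the error term at time $t$.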

In these tests, if the probability (p-value) is very small, we should reject the null hypothesis and accept the alternative hypothesis, which states that the variances are not homogeneous. Although the statistics for both tests can be calculated in detail, here the two tests are applied directly. For the dependent variable Ln(Demand_per_capita), the Breusch-Pagan test was run and the result is as follows.
As can be seen in Table 5, the hypothesis of the absence of non-homogeneity of variance is rejected. This means that non-homogeneity of variance is present. We ran the test for each variable separately and the following results were obtained.
As can be seen in Table 6, the hypothesis of the absence of non-homogeneity of variance is rejected. The White test was also used. To perform this test, the squared residuals were regressed on a constant, the explanatory variables, the squared explanatory variables and the pairwise products of the explanatory variables. In the White test, the number of observations is multiplied by the $R^2$ of this auxiliary regression. Under the H0 hypothesis, this statistic asymptotically follows a chi-square distribution with degrees of freedom equal to the number of explanatory variables in the auxiliary regression. However, the low number of observations in the model can influence the accuracy of the test. Running this test gives a probability of 0.237, which does not reject the hypothesis of the absence of non-homogeneity of variance. This result is probably an error of the test caused by the low number of observations, since in the previous test this hypothesis was rejected and the presence of non-homogeneity of variance was confirmed. To solve the non-homogeneity of variance problem, GLS estimation should be used, because in this type of estimation a weight is assigned to each observation, which helps to eliminate the non-homogeneity of variance. The results of this estimation are presented in the following Table 3. As can be seen in Table 7, all the coefficients of these variables are statistically significant. After running the above tests, the final estimates of the model parameters are the elasticities reported in the abstract: 0.187 for price, -0.566 for GDP per capita and 1.207 for investment per capita. The estimation seems sound, as the potential problems identified by the tests have been addressed.
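The "number of observations times R²" statistic used by the tests in this section can be sketched for the simplest case of one explanatory variable. The code below implements the studentized (Koenker) form of the Breusch-Pagan test, in which the squared residuals are regressed on the regressor itself; this choice, the function names and both data series are assumptions made for illustration and are not the paper's procedure or data.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x; return (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    return a, b, (sxy * sxy) / (sxx * syy)


def lm_heteroscedasticity(x, y):
    """Return (LM, p-value), LM = n * R^2 of squared residuals regressed on x."""
    a, b, _ = simple_ols(x, y)
    squared_resid = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    _, _, r2_aux = simple_ols(x, squared_resid)
    lm = len(x) * r2_aux
    p_value = math.erfc(math.sqrt(lm / 2.0))  # chi-square survival, 1 d.o.f.
    return lm, p_value


# Made-up data: the error spread grows with x in y_hetero, stays constant in y_homo
x = [float(i) for i in range(1, 11)]
signs = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
y_hetero = [2.0 * xi + s * 0.1 * xi for xi, s in zip(x, signs)]
y_homo = [2.0 * xi + s * 0.5 for xi, s in zip(x, signs)]
hetero_lm, hetero_p = lm_heteroscedasticity(x, y_hetero)
homo_lm, homo_p = lm_heteroscedasticity(x, y_homo)
print(hetero_p, homo_p)
```

A small p-value, as for the first series, rejects equal variances; with only a handful of observations the asymptotic chi-square approximation is rough, which is the caveat raised in the text about the White test.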

Computational Results
As mentioned before, the model was developed based on the data from 1974 to 2007. In this section, we use real data from 2008 and 2009 to test the model. As is clear from Table 8, the maximum error rate for the two models is 1.55%, which shows the efficacy of our model. It should be noted that the data on capital stock and GDP presented in the above table are survey data.

Conclusion
Our model shows that the price elasticity of electricity demand is positive, the income elasticity of demand is negative and the cross elasticity of investment is positive. The price and income elasticities are smaller than one in absolute value and the capital elasticity is greater than one: a 1% increase in investment per capita raises electricity consumption by about 1.207%. This observation seems odd, because the price elasticity of electricity is usually negative and the income elasticity is usually positive. That is, if prices rise, consumption is expected to decrease, and if income increases, consumption is expected to rise. However, our model suggests the opposite.
This observation can be interpreted as follows. Electricity prices over the past years have not been real prices, because electricity has been subsidized. As a result, electricity demand has a low price elasticity. Moreover, although the price of electricity has increased in recent years, the increase has not kept pace with inflation, so in reality the price of electricity relative to other commodities has decreased. Therefore, the estimated elasticity is positive: to the extent that the real and relative price of electricity has decreased, consumption has increased.
The negative income elasticity of electricity can be explained in two steps. First, as the incomes of the population increase, the use of more expensive but low-energy electrical appliances rises and, consequently, electricity consumption drops. Second, as the revenue of the whole economy grows, investment in capital goods for production increases; the capital goods of companies improve and the use of low-energy equipment rises. This is related to a concept in energy economics called energy intensity, which describes how much energy is consumed in a country's economy to produce a certain amount of goods and services. Our results show that over the past years less energy has been consumed for a given amount of goods and services, which is why electricity consumption has decreased with the growth of GDP per capita.
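To make the reported elasticities concrete, the short Python sketch below converts them into percentage changes in demand per capita under the log-linear (Cobb-Douglas) form, holding the other variables fixed. The three elasticity values are those reported in the paper; the function and dictionary names are illustrative assumptions.

```python
# Elasticities reported in the paper (price, GDP per capita, investment per capita)
ELASTICITIES = {
    "price": 0.187,
    "gdp_per_capita": -0.566,
    "investment_per_capita": 1.207,
}


def demand_change_percent(variable, percent_change):
    """Percent change in demand per capita when one regressor changes alone."""
    ratio = (1.0 + percent_change / 100.0) ** ELASTICITIES[variable]
    return (ratio - 1.0) * 100.0


investment_effect = demand_change_percent("investment_per_capita", 1.0)
gdp_effect = demand_change_percent("gdp_per_capita", 1.0)
price_effect = demand_change_percent("price", 1.0)
print(investment_effect, gdp_effect, price_effect)
```

For small changes the result is approximately the elasticity itself, so a 1% rise in investment per capita corresponds to roughly a 1.2% rise in demand, and a 1% rise in GDP per capita to roughly a 0.57% fall.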