Analyzing the Electricity Consumption Using Experimental Design Technique

Experimental design is a powerful tool for situations involving uncertainty in which a large amount of data is available. In this paper, the effects of different factors on electricity consumption are analyzed using the experimental design technique. The implementation of the proposed technique is demonstrated on electricity consumption data from Iran.


INTRODUCTION
Due to the uncertainty of electricity consumption and the great amount of data available on electric power databases, experimental design is a powerful tool for analyzing the electricity consumption data.
Experimental design is a set of experiments performed on a process or system in which the inputs are deliberately changed so that the resulting changes in the output responses, and the relations between inputs and outputs, can be observed. The main goal of experimental design is to determine the variables that have the greatest effect on the responses. Experimental design was first applied in agronomy and chemistry [1][2][3]. The electronics industry has used this method to develop processes and products [4]. Experimental design is thus a main tool for the statistical analysis of data in several areas. Its application to the analysis of the variables that affect the spot price is illustrated in [5].
The aim of this paper is to analyze the effect of different factors on electricity consumption using the experimental design technique. First the mathematical model is presented, then the application of the method to electricity consumption is shown, and finally a case study is presented.
Factorial design: Many experiments involve the study of the effects of two or more factors. For this type of experiment, factorial designs are generally the most efficient. In a factorial design, all possible combinations of the levels of the factors are investigated in each complete trial or replication of the experiment. The effect of a factor is defined as the change in response produced by a change in the level of the factor. This is frequently called a main effect because it refers to the primary factors of interest in the experiment [6].
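As a small illustration of these definitions, the following sketch enumerates a full factorial design with two two-level factors and computes their main effects. The factor names and the response values are made up for illustration and are not taken from the case study.

```python
from itertools import product
from statistics import fmean

# Hypothetical two-level factors; names and responses are illustrative only.
levels = {"A": ("low", "high"), "B": ("low", "high")}

# A full factorial design runs every combination of factor levels.
runs = list(product(levels["A"], levels["B"]))

# Made-up responses, one per run, in the same order as `runs`.
response = dict(zip(runs, (20.0, 30.0, 40.0, 52.0)))

def main_effect(factor_index):
    """Mean response at the high level minus mean response at the low level."""
    high = fmean(y for run, y in response.items() if run[factor_index] == "high")
    low = fmean(y for run, y in response.items() if run[factor_index] == "low")
    return high - low

effect_a = main_effect(0)  # change in response when A goes from low to high
effect_b = main_effect(1)  # change in response when B goes from low to high
```

With more than two levels per factor, as in the case study, the effect of a factor is described by one effect per level rather than by a single difference.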
The three-factor analysis of variance model is as follows. There are a levels of factor A, b levels of factor B, c levels of factor C and n replicates per cell:

$$y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \varepsilon_{ijkl} \qquad (1)$$

with $i = 1, \dots, a$, $j = 1, \dots, b$, $k = 1, \dots, c$ and $l = 1, \dots, n$, where $\mu$ is the overall mean, $\tau_i$ is the effect of the ith level of factor A, $\beta_j$ is the effect of the jth level of factor B, $\gamma_k$ is the effect of the kth level of factor C, $(\tau\beta)_{ij}$, $(\tau\gamma)_{ik}$, $(\beta\gamma)_{jk}$ and $(\tau\beta\gamma)_{ijk}$ are the interaction effects, and $\varepsilon_{ijkl}$ is the random error.

The total sum of squares $SS_T$ is decomposed as follows:

$$SS_T = SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC} + SS_E$$

where $SS_A$, $SS_B$ and $SS_C$ are the sums of squares for the main effects A, B and C, $SS_{AB}$, $SS_{AC}$ and $SS_{BC}$ are the sums of squares of the two-factor interactions, $SS_{ABC}$ is the sum of squares of the interaction between factors A, B and C, and $SS_E$ is the sum of squares of the errors.

The tests of hypotheses are based on a comparison between independent estimates of $\sigma^2$ obtained by dividing each term of $SS_T$ by its degrees of freedom. These quotients are known as mean squares:

$$MS_A = \frac{SS_A}{a-1}, \quad MS_B = \frac{SS_B}{b-1}, \quad MS_C = \frac{SS_C}{c-1},$$

$$MS_{AB} = \frac{SS_{AB}}{(a-1)(b-1)}, \quad MS_{AC} = \frac{SS_{AC}}{(a-1)(c-1)}, \quad MS_{BC} = \frac{SS_{BC}}{(b-1)(c-1)},$$

$$MS_{ABC} = \frac{SS_{ABC}}{(a-1)(b-1)(c-1)}, \quad MS_E = \frac{SS_E}{abc(n-1)}$$

Assuming fixed factors A, B and C, the expected mean square of each effect equals $\sigma^2$ plus a non-negative term that vanishes when the effect is absent, for example:

$$E(MS_A) = \sigma^2 + \frac{bcn \sum_{i=1}^{a} \tau_i^2}{a-1}, \qquad E(MS_E) = \sigma^2$$

If the null hypothesis of no first-factor effect is true, $MS_A$ and $MS_E$ both estimate $\sigma^2$. However, if there is a difference between the first-factor effects, then $MS_A$ will be larger than $MS_E$. The same is true for $MS_B$, $MS_C$, $MS_{AB}$, $MS_{AC}$, $MS_{BC}$ and $MS_{ABC}$. So, to test the significance of the three main effects and their interactions, the corresponding mean square is divided by the error mean square. Large values of this ratio imply that the data do not support the null hypothesis.
If the error terms $\varepsilon_{ijkl}$ are assumed to be normally and independently distributed with constant variance $\sigma^2$, then each of the ratios of mean squares

$$F_0 = \frac{MS_A}{MS_E}, \ \frac{MS_B}{MS_E}, \ \frac{MS_C}{MS_E}, \ \frac{MS_{AB}}{MS_E}, \ \frac{MS_{AC}}{MS_E}, \ \frac{MS_{BC}}{MS_E}, \ \frac{MS_{ABC}}{MS_E}$$

follows, under the corresponding null hypothesis, an F distribution with two degrees-of-freedom parameters, one associated with the numerator and the other with the denominator. The critical region is the upper tail of the F distribution, as shown in Fig. 1. Rejecting the null hypothesis implies that there are differences between the means, although the exact nature of the differences is not specified. In this case, multiple comparisons between levels are useful. The next part illustrates the application of these methods to determine the most influential factors in electricity consumption.
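The sum-of-squares decomposition and the F ratios described above can be sketched for a balanced design as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function name `anova3`, the dictionary layout of `y` and the example data are assumptions. Comparing each ratio with the critical value of the F distribution requires tables or a statistics library and is not shown.

```python
from itertools import combinations, product
from math import prod
from statistics import fmean

def anova3(y, levels, n):
    """Balanced three-factor fixed-effects ANOVA (requires n >= 2).

    y[(i, j, k)] holds the n replicates of cell (i, j, k); levels = (a, b, c).
    Returns {effect: (SS, df, MS, F0)}; the error entry "E" has F0 = None.
    """
    cells = list(product(*(range(m) for m in levels)))
    values = [v for cell in cells for v in y[cell]]
    grand = fmean(values)
    ss_total = sum((v - grand) ** 2 for v in values)

    def between(subset):
        # Sum of squares between the groups formed by the factors in `subset`.
        groups = {}
        for cell in cells:
            groups.setdefault(tuple(cell[f] for f in subset), []).extend(y[cell])
        return sum(len(g) * (fmean(g) - grand) ** 2 for g in groups.values())

    ss = {}
    for order in (1, 2, 3):
        for subset in combinations(range(3), order):
            # Subtract the lower-order effects contained in this grouping.
            lower = sum(v for s, v in ss.items() if set(s) < set(subset))
            ss[subset] = between(subset) - lower

    df_e = prod(levels) * (n - 1)
    ss_e = ss_total - between((0, 1, 2))
    ms_e = ss_e / df_e
    table = {"E": (ss_e, df_e, ms_e, None)}
    for subset, value in ss.items():
        df = prod(levels[f] - 1 for f in subset)
        name = "".join("ABC"[f] for f in subset)
        table[name] = (value, df, value / df, (value / df) / ms_e)
    return table

# Made-up balanced data in which only factor A shifts the response.
y = {(i, j, k): [10.0 * i + 1.0, 10.0 * i - 1.0]
     for i, j, k in product(range(2), repeat=3)}
table = anova3(y, (2, 2, 2), n=2)
```

In this made-up example the ratio for factor A is large while the ratios for all other effects are zero, which is the pattern the F test is designed to detect.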

Electricity consumption determination:
The electricity consumption data are available from January 1994 and are recorded every hour of the day, 7 days a week and 12 months a year. This information can be used to build a response surface that identifies the value of consumption in different months of the year, different days of the week or different hours of the day.
Suppose that the electricity consumption is determined every hour of the day, 7 days a week and 12 months a year. These data can be considered as the result of a three-factor factorial experiment. The first factor is the month of the year, analyzed at 12 levels, the second factor is the day of the week, analyzed at 7 levels, and the third factor is the hour of the day, analyzed at 24 levels. Four observations per cell are selected. The levels of each factor are determined in advance, so this is a fixed-effects model. To illustrate the proposed method, the hourly electricity consumption data of Iran from March 2003 to February 2004 are taken as an example. (The Iranian year begins on 21 March.)
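For this design, the degrees of freedom of each effect follow directly from the numbers of levels and replicates stated above. The short sketch below works out the arithmetic; the dictionary keys are illustrative labels, not names used in the paper.

```python
from itertools import combinations
from math import prod

# Factor levels and replicates as stated in the text.
levels = {"month": 12, "day": 7, "hour": 24}
n = 4  # observations per cell

names = list(levels)
df = {}
for order in range(1, len(names) + 1):
    for effect in combinations(names, order):
        # An effect has the product of (levels - 1) over its factors.
        df[" x ".join(effect)] = prod(levels[f] - 1 for f in effect)

cells = prod(levels.values())   # 12 * 7 * 24 = 2016 cells
df["error"] = cells * (n - 1)   # 2016 * 3 = 6048
df["total"] = cells * n - 1     # 8064 observations - 1 = 8063
```

The degrees of freedom of the seven effects and the error add up to the total, which is a quick consistency check on an ANOVA table.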
The first step is to determine whether the data satisfy the hypotheses of the statistical linear model, Eq. (1). The error, referred to as the residual, is the difference between the observed value $y_{ijk}$ and the estimated value $\hat{y}_{ijk}$:

$$\varepsilon_{ijk} = y_{ijk} - \hat{y}_{ijk}$$

Figure 2 shows the cumulative distribution of the residuals from March 2003 to February 2004, and the corresponding histogram is presented in Fig. 3. Figures 2 and 3 show that the residuals are approximately normally distributed with zero mean, so on a normal probability plot their cumulative distribution should be close to a straight line. Next, the ANOVA table is computed from the data (Table 2). The probabilities in this table quantify the significance level of each decision. The probability that the first, second, third, fourth, fifth and sixth conclusions are wrong is near zero, whereas the probability that the seventh conclusion is correct is almost one.
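The residual check can be sketched as follows. The sketch assumes a model that includes all interactions, in which case the estimated value of an observation is the mean of its cell. The function names and the example data are illustrative assumptions.

```python
from statistics import NormalDist, fmean

def residuals(y):
    """Residuals of the full factorial model, whose fitted value is the cell mean."""
    out = []
    for replicates in y.values():
        cell_mean = fmean(replicates)
        out.extend(v - cell_mean for v in replicates)
    return out

def normal_plot_points(res):
    """Pairs (theoretical normal quantile, ordered residual) for a probability plot."""
    count = len(res)
    std_normal = NormalDist()
    quantiles = [std_normal.inv_cdf((rank - 0.5) / count)
                 for rank in range(1, count + 1)]
    return list(zip(quantiles, sorted(res)))

# Made-up cells with two replicates each.
y = {(0, 0, 0): [1.0, 3.0], (0, 0, 1): [2.0, 6.0]}
res = residuals(y)
points = normal_plot_points(res)
```

If the residuals are normally distributed, the points returned by `normal_plot_points` lie close to a straight line through the origin.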
The next step is to determine whether there are months of the year, days of the week or hours of the day for which the consumption is considerably different. This is done using multiple comparisons based on the test proposed by Tukey and Kramer, with a confidence level of 95%. The following figures show the results. Figure 4 shows the mean consumption calculated for each month of the year (Appendix A, Table A.1). From Fig. 4 it can be observed that the mean electricity consumption in the first month is lower than the mean consumption in the other months. Figure 5 shows the result for the different days of the week (Appendix A, Table A.2). It can be observed that the mean consumption on the 7th day is considerably lower than on the other days. In Iran this day of the week (Friday) is a holiday. Figure 6 shows the result for the different hours of the day (Appendix A, Table A.3). The mean consumption is significantly greater from 7 to 11 p.m., when the residential load increases.
Now it is possible to group months, days and hours. Months 3, 4, 5, 6 and 7 have the highest consumption. Months 2, 8, 9, 10, 11 and 12 have similar consumption, so they are considered normal months. The first month has the lowest consumption and is an atypical month. Days 1, 2, 3, 4, 5 and 6 are normal days and are grouped as working days, and the 7th day is considered a non-working day. Hours are classified as: peak from 6 p.m. to midnight, valley from 1 to 10 a.m. and rest from 11 a.m. to 5 p.m.
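The pairwise comparison behind these groupings can be sketched with the Tukey-Kramer criterion: two level means are declared different when their absolute difference exceeds the critical studentized range value times the square root of (MS_E / 2)(1/n_i + 1/n_j). In the sketch below the critical value `q_crit` is passed in as a parameter, to be taken from tables or a statistics library, and all numbers in the example are made up.

```python
from itertools import combinations
from math import sqrt

def tukey_kramer(means, sizes, ms_error, q_crit):
    """Return the index pairs of levels whose means differ significantly.

    means[i], sizes[i]: mean and number of observations of level i.
    ms_error: error mean square from the ANOVA table.
    q_crit: critical value of the studentized range statistic for the chosen
            confidence level, number of levels and error degrees of freedom
            (supplied by the caller; not computed here).
    """
    different = []
    for i, j in combinations(range(len(means)), 2):
        threshold = q_crit * sqrt(ms_error / 2 * (1 / sizes[i] + 1 / sizes[j]))
        if abs(means[i] - means[j]) > threshold:
            different.append((i, j))
    return different

# Made-up example: the third level stands out; q_crit is a placeholder value.
pairs = tukey_kramer(means=[10.0, 10.5, 14.0], sizes=[8, 8, 8],
                     ms_error=2.0, q_crit=3.5)
```

Levels that never appear together in the returned list are statistically indistinguishable, which is one way to arrive at groupings of months, days or hours.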

Development:
As an extension, the season of the year is added to the model as a fourth factor. Suppose that the electricity consumption is determined every hour of the day, 7 days a week, for the 3 months of each season and the 4 seasons of the year. These consumptions can be considered as the result of a four-factor factorial experiment in which the first factor is the season of the year, analyzed at 4 levels, the second factor is the month of the season, analyzed at 3 levels, the third factor is the day of the week, analyzed at 7 levels, and the fourth factor is the hour of the day, analyzed at 24 levels. As in the previous experiment, four observations per cell are selected. The model of Eq. (1) is extended accordingly, and Table 3 shows the results. From Table 3 it can be seen that the season of the year influences the electricity consumption. It should now be determined whether there are seasons of the year for which the electricity consumption is considerably different. Using the test of Tukey and Kramer with a confidence level of 95%, Fig. 7 is obtained. In this figure it can be observed that the mean electricity consumption for the second season (summer) is remarkably higher than the mean electricity consumption of the other seasons, whereas the consumption in the other seasons is nearly equal. So the seasons can be grouped as follows: summer is the peak season, and the others, namely winter, spring and autumn, are normal seasons.
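Re-indexing the twelve months as season and month of the season can be sketched as below. The sketch assumes that months are numbered from the start of the Iranian year, so that months 1 to 3 form the first season (spring) and months 4 to 6 the second (summer); the function name is illustrative.

```python
SEASONS = ("spring", "summer", "autumn", "winter")

def season_and_month(month):
    """Map month 1..12 of the year to (season 1..4, month within season 1..3)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    season, within = divmod(month - 1, 3)
    return season + 1, within + 1

# Month 5 of the year is the second month of the second season.
season, within = season_and_month(5)
season_name = SEASONS[season - 1]
```

With this mapping the number of cells, 4 x 3 x 7 x 24 = 2016, is the same as in the three-factor design.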

Statistical analysis:
As shown in the model, electricity consumption is considered a function of three factors: (1) the month of the year, (2) the day of the week and (3) the hour of the day. According to Table 2, all three factors affect the electricity consumption. Multiple comparisons make it possible to determine which levels of each factor are considerably different. The statistical analysis below addresses the following question: if the levels that are considerably different are omitted, does the effect of the factor remain or not?
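The idea of omitting levels and re-testing can be sketched with a simplified one-way analysis: compute the F ratio for a factor, drop the levels that stand out, and compute the ratio again. This is only an illustration of the procedure with made-up data; the function names are assumptions, and the paper's own tables come from the multi-factor analysis.

```python
from statistics import fmean

def one_way_f(groups):
    """F ratio of a one-way ANOVA; groups maps level -> list of observations."""
    values = [v for g in groups.values() for v in g]
    grand = fmean(values)
    k, total = len(groups), len(values)
    ss_between = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - fmean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (total - k))

def without_levels(groups, omitted):
    """Copy of `groups` with the listed levels removed."""
    return {level: g for level, g in groups.items() if level not in omitted}

# Made-up data: level 7 is clearly lower, the other levels share one mean.
days = {d: [99.0, 101.0] for d in range(1, 7)}
days[7] = [79.0, 81.0]

f_all = one_way_f(days)                           # large: level 7 stands out
f_reduced = one_way_f(without_levels(days, {7}))  # zero: no effect remains
```

If the ratio stays large after the distinctive levels are removed, the effect of the factor is not confined to those levels.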
Analysis of the consumption versus days of the week: Figure 5 shows the result for the different days of the week. It can be concluded that the mean consumption on the 6th and 7th days is lower than on the other days.
Hence, the 6th and 7th days are omitted from the levels of this factor, which reduces the days to 5 levels (from the 1st to the 5th). Pairwise comparison of the remaining levels gives Table 4. Table 4 shows that, when the 6th and 7th days of the week are omitted, the day of the week has no effect on the consumption. In other words, the effect of the day of the week is due to the 6th and 7th days, on which the consumption is remarkably lower than on the other days. As a result, by omitting the 6th and 7th days, the effect of the factor "day" can be removed from the model.

Analysis of the consumption versus months of the year:
Figure 4 shows the consumption of each month of the year. The analysis shows that the effect of the month of the year disappears only when the first eight months of the year are omitted. Table 5 shows the pairwise comparison between consumption and the month of the year when the first eight months are omitted and only the last four months of the year are considered.
As shown in Table 5, by eliminating the first eight months of the year and using only the last four months, the effect of the month of the year on consumption is eliminated. However, because more than 66 percent (8/12) of the levels of the factor must be eliminated, it follows that the effect of the month of the year on consumption is not due to particular levels of the factor; in this sense the effect is fully random.
A similar statistical analysis is applied to the hours of the day. No hour of the day could be found whose omission eliminates the effect of this factor on consumption. So the effect of the hour of the day is random, and it is not caused by particular levels of this factor.
Analysis of the consumption versus seasons of the year: Figure 7 shows the consumption of each season of the year. In this figure it can be observed that the consumption in summer is significantly higher than in the other seasons. For the statistical analysis, summer is therefore eliminated from the levels of the season, so the season of the year is analyzed at three levels: spring, autumn and winter. Pairwise comparison between consumption and the seasons of the year gives Table 6.
The seasonal effect on consumption is evidently due to the high consumption in summer.
The statistical analysis shows that two factors, the hour of the day and the month of the year, have a random effect on consumption, and particular levels play no special role in the effect of these factors. In contrast, by eliminating summer from the levels of the season and the 6th and 7th days from the levels of the day, the effect of these two factors on consumption disappears. This shows that the effect of the season and of the day of the week is due to those levels at which the consumption is remarkably different from the other levels. So particular levels of these two factors determine their effect on consumption.

CONCLUSION
The application of experimental design to the analysis of the effect of different factors on electricity consumption is illustrated in this paper. As an example, the method is applied to Iranian electricity consumption. Based on the results, the levels of the factors can be grouped for economic purposes. Months 3, 4, 5, 6 and 7 have the highest consumption. The other months have lower consumption and are considered normal months. Days 1, 2, 3, 4, 5 and 6 are working days and the 7th day of the week is considered a non-working day. Hours are classified as: peak from 6 p.m. to midnight, valley from 1 to 10 a.m. and rest from 11 a.m. to 5 p.m.
The statistical analysis shows that two factors, the hour of the day and the month of the year, have a random effect on consumption. For the other two factors, the day of the week and the season of the year, the effect on consumption is due to particular levels at which the consumption is remarkably different from the other levels.