Trade-GDP Nexus in Iran: An Application of the Autoregressive Distributed Lag (ARDL) Model

This study employed annual time series data (1960-2003) and unit root tests with multiple breaks to determine the most likely times of structural breaks in the major factors affecting the trade-GDP nexus in Iran. We found, inter alia, that the endogenously determined structural breaks coincided with important events in the Iranian economy, including the 1979 Islamic revolution and the outbreak of the Iraq-Iran war in 1980. By applying the Lumsdaine and Papell (1997) approach, the stationarity of the variables under investigation was examined and, in the presence of structural breaks, we found that the null hypothesis of a unit root could be rejected for all of the variables under analysis except one. Under such circumstances, applying the ARDL procedure was the best way of determining long-run relationships. For this reason, the error correction version of the autoregressive distributed lag (ARDL) procedure was then employed to specify the short- and long-term determinants of economic growth in the presence of structural breaks. The results showed that while the effects of gross capital formation and oil exports were important for the expansion of Iranian GDP over the sample period, non-oil exports and human capital were generally less pivotal. It was also found that the speed of adjustment in the estimated models was relatively high and had the expected significant and negative sign.
JEL classification numbers: C12, C22, C52.


INTRODUCTION
The Iranian macroeconomy has been subject to numerous and ongoing shocks and regime shifts in recent decades, including the 1974/75 OPEC oil crisis, social and political upheaval associated with the 1979 Islamic Revolution, a destructive eight-year (1980-1988) war with Iraq, the freezing of the country's foreign assets, a volatile international oil market, economic sanctions and international economic isolation. Determining the correct timing of these structural breaks is clearly of paramount importance in any macroeconomic time-series analysis. Leybourne and Newbold [1], for example, argue that if structural breaks are not dealt with appropriately, empirical results obtained from the use of, say, cointegration techniques could be spurious and misleading. At the same time, conventional techniques allow the incorporation of only a single structural break in a time series. Accordingly, this study employs Lumsdaine and Papell's [2] procedure (hereafter LP) to examine the unit root hypothesis with two structural breaks, without imposing predetermined dates for the breaks. Once the timing of the major structural breaks has been determined endogenously, the breaks are included in the autoregressive distributed lag (ARDL) procedure as impulse and/or shift dummy variables.
The remainder of this study is structured as follows. Section II explains and applies the LP unit root procedure, in which break dates are determined by a recursive, rolling or sequential approach. Section III discusses the ARDL approach and its error correction version, followed by the empirical findings in Section IV. Finally, Section V presents some concluding remarks and policy implications.
Unit root test with structural breaks: It goes without saying that structural change is of considerable importance in the analysis of macroeconomic time series. Structural change occurs in many time series for any number of reasons, including economic crises, changes in institutional arrangements, policy changes, regime shifts and war. An associated problem is the testing of the null hypothesis of structural stability against the alternative of a one-time structural break. If such structural changes are present in the data generating process, but not allowed for in the specification of an econometric model, results may be biased towards the erroneous non-rejection of the nonstationarity hypothesis [1,3,4] .
Conventionally, dating of the potential break is assumed to be known a priori in accordance with the underlying asymptotic distribution theory. Test statistics are then constructed by adding dummy variables representing different intercepts and slopes, thereby extending the standard Dickey-Fuller procedure [3] . However, this standard approach has been criticized, most notably by Christiano [5] , who argued that data-dependent procedures are typically used to determine the most likely location of a break: evidence of an endogeneity or sample selection problem. This invalidates the distribution theory underlying conventional testing.
In response, a number of studies have developed different methodologies for endogenising break dates, including Zivot and Andrews [6], Perron and Vogelsang [7], Perron [4], Lumsdaine and Papell [2] and Bai and Perron [8]. These studies have shown that by endogenously determining the time of structural breaks, bias in the usual unit root tests can be reduced. Perron and Vogelsang [7] and Perron [4] have proposed a class of test statistics which allows for two different forms of structural break: the Additive Outlier (AO) model, which is more relevant for series exhibiting a sudden change in the mean (the crash model), and the Innovational Outlier (IO) model, which captures changes that occur more gradually over time.
With this in mind, LP [2] introduced a novel procedure to capture two structural breaks in a series. They found that unit root tests accounting for two structural breaks are more powerful than those that allow for a single break. In support, Ben-David et al. [9] argued that "… just as failure to allow one break can cause non-rejection of the unit root null by the Augmented Dickey-Fuller test, failure to allow for two breaks, if they exist, can cause non-rejection of the unit root null by the tests which only incorporate one break" (p. 304). LP use a modified version of the ADF test, which specifies two endogenous breaks as follows:

$$\Delta y_t = \mu + \beta t + \theta\, DU1_t + \gamma\, DT1_t + \omega\, DU2_t + \psi\, DT2_t + \alpha\, y_{t-1} + \sum_{i=1}^{k} c_i\, \Delta y_{t-i} + \varepsilon_t \qquad (1)$$

where $DU1_t = 1$ if $t > TB1$ and zero otherwise; $DU2_t = 1$ if $t > TB2$ and zero otherwise; $DT1_t = t - TB1$ if $t > TB1$ and zero otherwise; and $DT2_t = t - TB2$ if $t > TB2$ and zero otherwise. Two structural breaks, occurring at $TB1$ and $TB2$, are allowed for in both the time trend and the intercept. The breaks in the intercept are captured in equation (1) by $DU1_t$ and $DU2_t$, whereas the slope changes (shifts in the trend) are represented by $DT1_t$ and $DT2_t$. The optimal lag length ($k$) is chosen by the general-to-specific approach suggested by Ng and Perron [10].
Table 1 presents the two most important structural breaks affecting the variables under investigation in this study, obtained using the procedure proposed by LP [2]. The data are expressed in constant 1997 prices and were collected from the Central Bank of Iran [11] and the International Financial Statistics (IFS [12]). Here y denotes real GDP, k is gross capital formation, x is total real exports, m is total real imports and hc is human capital (represented in this research by the number of employed persons with tertiary education). Finally, oil and non-oil exports are denoted by xo and xno, respectively.
Notes to Table 1: (1) * and ** indicate that the corresponding null is rejected at the 1% and 5% levels, respectively. (2) Kmax = 8; the prefix "L" denotes that a variable is in log form.
As is clear from the empirical results in Table 1, the timing of the structural breaks for the majority of the variables under investigation coincides with the oil boom in 1975, the Islamic revolution in 1979 or the Iran-Iraq war in the 1980s. These unit root results are consistent with LP [2] and Ben-David et al. [9], as most variables classified as I(1) by the ADF test now appear stationary. The results of unit root tests with two structural breaks in both the intercept and the slope of the trend function show strong evidence against the unit root hypothesis for all of the variables under investigation except Lm. Under these circumstances, and especially when the results are mixed, the ARDL model is an efficient way of determining the long-run relationship among the variables under investigation. This methodology is explained and applied in the following section.
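The break-date search behind equation (1) can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not the code used in this study: the function names, the 10% trimming fraction, the fixed lag length and the minimum gap between breaks are all choices made here for the example. The sketch builds the four break dummies exactly as defined above, estimates the regression by OLS for every admissible pair of break dates and keeps the pair that minimises the t-statistic on the lagged level. The resulting statistic has a nonstandard distribution, so it would have to be compared with the critical values tabulated by LP [2], which are not reproduced here.

```python
import numpy as np

def lp_design(y, tb1, tb2, k):
    """Regressors and target for the two-break ADF regression.

    tb1 < tb2 are 1-based break positions; k is the number of lagged differences.
    Column order: const, trend, DU1, DT1, DU2, DT2, y_{t-1}, then k lagged differences.
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    t = np.arange(1, T + 1)                      # 1-based time index
    du1 = (t > tb1).astype(float)                # intercept shift after TB1
    du2 = (t > tb2).astype(float)                # intercept shift after TB2
    dt1 = np.where(t > tb1, t - tb1, 0.0)        # trend shift after TB1
    dt2 = np.where(t > tb2, t - tb2, 0.0)        # trend shift after TB2
    dy = np.diff(y)                              # dy[s-1] = y[s] - y[s-1]
    rows = np.arange(k + 1, T)                   # 0-based dates with all lags available
    cols = [np.ones(len(rows)), t[rows], du1[rows], dt1[rows],
            du2[rows], dt2[rows], y[rows - 1]]
    for i in range(1, k + 1):
        cols.append(dy[rows - 1 - i])            # lagged difference of order i
    return np.column_stack(cols), dy[rows - 1]

def alpha_tstat(X, target):
    """OLS t-statistic on the coefficient of y_{t-1} (column 6)."""
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[6] / np.sqrt(cov[6, 6])

def lp_two_break_search(y, k=1, trim=0.1, min_gap=2):
    """Grid search over break pairs; returns (min t-stat, TB1, TB2)."""
    T = len(y)
    lo = max(int(np.floor(trim * T)), k + 2)     # keep pre-break observations in sample
    hi = int(np.ceil((1 - trim) * T))
    best = (np.inf, None, None)
    for tb1 in range(lo, hi - min_gap):
        for tb2 in range(tb1 + min_gap, hi):     # a gap of 2 avoids collinear dummies
            X, target = lp_design(y, tb1, tb2, k)
            stat = alpha_tstat(X, target)
            if stat < best[0]:
                best = (stat, tb1, tb2)
    return best

# Illustrative use on simulated data (not the Iranian series).
rng = np.random.default_rng(0)
T = 44
shocks = rng.normal(size=T)
series = np.zeros(T)
for s in range(1, T):
    series[s] = 0.3 * series[s - 1] + shocks[s]
series[15:] += 3.0                               # first level shift
series[25:] -= 2.0                               # second level shift
stat, tb1, tb2 = lp_two_break_search(series, k=1)
```

In practice the lag length would be selected for each candidate pair of break dates, as in the general-to-specific procedure mentioned above; it is held fixed here only to keep the example short.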
The ARDL cointegration approach: Recently, an emerging body of work led by Pesaran and Shin [13], Pesaran and Pesaran [14] and Pesaran et al. [15] has introduced an alternative cointegration technique known as the 'Autoregressive Distributed Lag' or ARDL bounds test. It is argued that the ARDL approach has a number of advantages over conventional Johansen cointegration techniques.
To start with, the ARDL approach performs better statistically when determining cointegrating relationships in small samples [17], whereas the Johansen cointegration techniques require large data samples to be valid. A further advantage of the ARDL approach is that, while other cointegration techniques require all of the regressors to be integrated of the same order, it can be applied whether the regressors are I(1) and/or I(0), i.e. whether the series all contain a unit root, are all stationary or are a mixture of the two. This means that it avoids the pre-testing problems associated with standard cointegration analysis, which requires that the variables have already been classified as I(1) or I(0) [15]. In this research, having first applied the Perron [4] Innovational and Additive Outlier models, we observed that in the presence of one structural break the null hypothesis of a unit root could not be rejected in any case, but when two structural breaks were considered the reverse was found, as the majority of the variables under investigation became stationary. In fact, the Lumsdaine and Papell [2] approach was deemed to be more relevant for oil-exporting countries, particularly Iran, which has been subject to numerous structural changes and regime shifts. This approach enabled us to examine the stationarity of the variables under investigation in the presence of multiple structural breaks. The empirical results indicated that the null hypothesis of a unit root could be rejected for all of the variables under analysis except one. Given these mixed results, we applied the ARDL procedure in this research.
Bahmani-Oskooee and Nasir [18], for example, argue that the first step in any cointegration technique "is to determine the degree of integration of each variable in the model", but this can depend on the specific unit root test used: different tests could lead to contradictory results.
For example, when applying conventional unit root tests such as the Augmented Dickey-Fuller and Phillips-Perron tests, one may incorrectly conclude that a unit root is present in a series that is actually stationary around a one-time structural break [3,4]. The ARDL approach is useful because it avoids this problem.
Yet another difficulty of the Johansen cointegration technique which the ARDL approach avoids concerns the large number of choices that must be made. These include decisions regarding the number of endogenous and exogenous variables (if any) to be included, the treatment of deterministic elements, the order of the VAR and the optimal number of lags to be specified. The empirical results are generally very sensitive to the method and to the various alternative choices available in the estimation procedure [16]. Finally, with the ARDL approach different variables can have different optimal numbers of lags, which is not possible in Johansen-type models.
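According to Pesaran and Pesaran [14], the ARDL procedure is represented by the following equation:

$$\phi(L,p)\, y_t = \sum_{i=1}^{k} \beta_i(L,q_i)\, x_{it} + \delta' w_t + u_t \qquad (2)$$

where

$$\phi(L,p) = 1 - \phi_1 L - \phi_2 L^2 - \dots - \phi_p L^p, \qquad \beta_i(L,q_i) = \beta_{i0} + \beta_{i1} L + \dots + \beta_{iq_i} L^{q_i}, \quad i = 1, \dots, k,$$

$y_t$ denotes the dependent variable, $x_{it}$ is the $i$-th independent variable, $L$ is the lag operator and $w_t$ is the $S \times 1$ vector of deterministic variables employed, including intercept terms, dummy variables, time trends and other exogenous variables. The optimum lag length is generally determined by minimizing either the Akaike Information Criterion (AIC) or the Schwarz Bayesian Criterion (SBC). From the selected ARDL model, the long-run coefficients and their asymptotic standard errors are then obtained. The long-run elasticities can be estimated as follows:

$$\hat{\theta}_i = \frac{\hat{\beta}_i(1,\hat{q}_i)}{\hat{\phi}(1,\hat{p})} = \frac{\hat{\beta}_{i0} + \hat{\beta}_{i1} + \dots + \hat{\beta}_{i\hat{q}_i}}{1 - \hat{\phi}_1 - \hat{\phi}_2 - \dots - \hat{\phi}_{\hat{p}}}, \quad i = 1, \dots, k \qquad (3)$$

The long-run cointegrating vector is given by:

$$y_t - \hat{\theta}_0 - \hat{\theta}_1 x_{1t} - \hat{\theta}_2 x_{2t} - \dots - \hat{\theta}_k x_{kt} = \varepsilon_t \qquad (4)$$

In this equation, the constant term is equal to:

$$\hat{\theta}_0 = \frac{\hat{\delta}_0}{1 - \hat{\phi}_1 - \hat{\phi}_2 - \dots - \hat{\phi}_{\hat{p}}} \qquad (5)$$

where $\hat{\delta}_0$ is the element of $\hat{\delta}$ attached to the intercept in $w_t$. We can now rearrange equation (2) in terms of the lagged levels and the first differences of $y_t, x_{1t}, x_{2t}, \dots, x_{kt}$ and $w_t$ to obtain the short-term dynamics of the ARDL as follows:

$$\Delta y_t = -\phi(1,\hat{p})\, EC_{t-1} + \sum_{i=1}^{k} \beta_{i0}\, \Delta x_{it} + \delta' \Delta w_t - \sum_{j=1}^{\hat{p}-1} \phi_j^{*}\, \Delta y_{t-j} - \sum_{i=1}^{k} \sum_{j=1}^{\hat{q}_i-1} \beta_{ij}^{*}\, \Delta x_{i,t-j} + u_t \qquad (6)$$

and finally, one can define the error correction term in the following manner:

$$EC_t = y_t - \sum_{i=1}^{k} \hat{\theta}_i x_{it} - \hat{\psi}' w_t \qquad (7)$$

where $\hat{\psi} = \hat{\delta} / (1 - \hat{\phi}_1 - \dots - \hat{\phi}_{\hat{p}})$ collects the long-run coefficients on the deterministic variables, so that its intercept element equals $\hat{\theta}_0$. In equation (6), $\phi_j^{*}$ and $\beta_{ij}^{*}$ are the coefficients governing the short-run dynamics, while $\phi(1,\hat{p})$, the coefficient on the lagged error correction term, measures the speed of adjustment.
Empirical results based on the ARDL approach: Since this study aims to detect the short-run as well as the long-run relationships between exports, economic growth and other variables, we use the ARDL cointegration technique described above. Drawing upon the literature on the trade-growth nexus and following Feder [19], Salehi-Esfahani [20] and Van den Berg [21], we consider the following extended Feder-type models in order to identify the relationship between trade and economic growth in an oil-based economy. As in the Feder-type model, output in each economic sector is produced by the labor and capital allocated to that sector. In addition, and following Salehi-Esfahani, we include total imports as a further factor in the following equations, although imports have been neglected in most studies of the relationship between exports and economic growth.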
These models are a kind of production function, augmented by the addition of the trade factors, exports (X) and imports (M). It should be noted that in Feder-type models GDP is considered to be simply a function of ordinary labor force growth together with the other relevant factors. In the Iranian economy, however, because of the low productivity of the labor force and its surplus in the economy, we follow endogenous growth theory and use human capital (the number of employed persons with a university degree) rather than the total labor force in our empirical models. We therefore use the following two modified Feder-Salehi models in logarithmic form to examine the trade-growth nexus:

$$Ly_t = a_0 + a_1 Lk_t + a_2 Lhc_t + a_3 Lx_t + a_4 Lm_t + \varepsilon_t \qquad (8)$$

$$Ly_t = b_0 + b_1 Lk_t + b_2 Lhc_t + b_3 Lxo_t + b_4 Lxno_t + b_5 Lm_t + \nu_t \qquad (9)$$

In equation (9), the possible effects of exports on economic growth have been disaggregated into oil (xo) and non-oil (xno) exports. As discussed earlier, the inclusion of exports in the model captures the positive externality effects of exports on economic growth. The externality effects of total exports, including the introduction of improved technology, the training of productive labor and the development of more efficient management, were first introduced by Feder [19]. Moreover, according to Salehi-Esfahani [20], capital and intermediate imports can positively affect productivity by helping to prevent shortages of intermediate inputs and by providing better-quality inputs. In this research, following endogenous growth theory, economic growth is determined by endogenous growth factors: physical capital (R&D effects), human capital (representing knowledge spillover effects), export expansion (proxying positive externality effects) and capital and intermediate inputs (capturing learning-by-doing effects).
Following Pesaran et al. [15] and Bahmani-Oskooee and Kara [22], the error correction representation of the ARDL model is:

$$\Delta Ly_t = \eta_0 + \sum_{j=1}^{n} \eta_{1j}\, \Delta Ly_{t-j} + \sum_{j=0}^{n} \eta_{2j}\, \Delta Lk_{t-j} + \sum_{j=0}^{n} \eta_{3j}\, \Delta Lhc_{t-j} + \sum_{j=0}^{n} \eta_{4j}\, \Delta Lx_{t-j} + \sum_{j=0}^{n} \eta_{5j}\, \Delta Lm_{t-j} + \lambda_1 Ly_{t-1} + \lambda_2 Lk_{t-1} + \lambda_3 Lhc_{t-1} + \lambda_4 Lx_{t-1} + \lambda_5 Lm_{t-1} + \varepsilon_t \qquad (10)$$

The null hypothesis ($H_0: \lambda_1 = \lambda_2 = \dots = \lambda_5 = 0$, implying no cointegration) is tested in the first step by computing a general F-statistic using the variables in levels. Equation (4) is first estimated excluding the ECM term, and this term is then incorporated in the ARDL model. At this stage, the calculated F-statistic is compared with the critical values tabulated by Pesaran et al. [15] or Pesaran and Pesaran [14]. These critical values are calculated for different numbers of regressors and according to whether the model contains an intercept and/or a trend. According to Bahmani-Oskooee and Nasir [18], these "critical values include an upper and a lower band covering all possible classifications of the variable into I (1) and I (0) or even fractionally integrated". The null hypothesis of no cointegration is rejected if the calculated F-statistic falls above the upper bound. If the computed F-statistic falls below the lower bound, the null hypothesis of no cointegration cannot be rejected. Finally, the result is inconclusive if the statistic falls between the lower and the upper bounds. In such an inconclusive case, an efficient way of establishing cointegration is to apply the ECM version of the ARDL model [18].
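The three-way decision rule of the bounds test can be written down in a few lines. The sketch below is only an illustration: the function name is hypothetical, and the lower and upper bounds passed to it are placeholders chosen for the example, not the critical values tabulated by Pesaran et al. [15] or Pesaran and Pesaran [14], which depend on the number of regressors and on the deterministic terms included.

```python
def bounds_test_decision(f_stat, lower, upper):
    """Classify a bounds-test F-statistic against a lower/upper critical band."""
    if lower >= upper:
        raise ValueError("lower bound must be below upper bound")
    if f_stat > upper:
        return "reject no-cointegration null"
    if f_stat < lower:
        return "cannot reject no-cointegration null"
    return "inconclusive"

# PLACEHOLDER band for illustration only; replace with the tabulated
# critical values for the relevant number of regressors and significance level.
PLACEHOLDER_LOWER = 2.5
PLACEHOLDER_UPPER = 4.0

decision = bounds_test_decision(3.1, PLACEHOLDER_LOWER, PLACEHOLDER_UPPER)
```

A statistic inside the band is the case in which the classification of the regressors as I(0) or I(1) would matter, which is why the text falls back on the significance of the error correction term in that situation.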
Since all observations are annual and the number of observations is limited, we choose 2 as the maximum lag length in the ARDL model. The value of the F-statistic for model 1 is 2.88. We now disaggregate exports in equation (10) to specify model 2; that is, total exports are divided into oil exports and non-oil exports, which appear as two separate variables in equation (10). The calculated F-statistic for model 2 is 2.96. Since both of the calculated F-statistics fall between the lower bound and the upper bound at the 5 percent level, the results are inconclusive. As mentioned above, in this circumstance the ECM version of the ARDL model is an efficient way of determining the long-run relationship among the variables of interest. We have also calculated the F-statistic when each of x, m and k appears separately as the dependent variable in the testing procedure. The results are as follows: F(Lx | Ly, Lm, Lhc, Lk) = 2.24, F(Lm | Ly, Lx, Lk, Lhc) = 1.8216 and F(Lk | Ly, Lm, Lhc, Lx) = 2.2481. These F-statistics are all below the corresponding critical values tabulated in Pesaran et al. [15], so the null hypothesis of no cointegration cannot be rejected in these cases. Therefore, a long-term relationship is possible only if Ly appears as the dependent variable, followed by its 'forcing variables' (i.e. Lx, Lm, Lk and Lhc).
With this in mind, the long-run coefficients of models (1) and (2) are estimated in the second step. As discussed, one of the more important issues in applying the ARDL approach is the choice of the order of the distributed lag function. Pesaran and Smith [16] argue that the SBC should be used in preference to other model specification criteria because it tends to select more parsimonious specifications; the small data sample in the current study reinforces this preference. The SBC lag specifications for models (1) and (2) are shown in the appendix. For these two models, the optimal numbers of lags for each of the variables are ARDL(1,0,0,2,1) and ARDL(1,2,0,2,1,1), respectively. The long-run coefficients of the variables under investigation are reported in Table 2.
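The second-step computation, in which each long-run coefficient is the sum of a regressor's distributed-lag coefficients divided by one minus the sum of the autoregressive coefficients, can be sketched as follows. The function name and the numerical coefficients in the example are hypothetical and are used only to show the arithmetic; they are not the estimates reported in Table 2.

```python
def long_run_coefficients(phi, betas, intercept=0.0):
    """Long-run coefficients implied by an estimated ARDL model.

    phi       : autoregressive coefficients on lags of the dependent variable
    betas     : dict mapping regressor name -> coefficients on lags 0..q_i
    intercept : estimated short-run intercept
    """
    denom = 1.0 - sum(phi)
    if abs(denom) < 1e-12:
        raise ValueError("sum of autoregressive coefficients is one; no long-run solution")
    theta = {name: sum(coefs) / denom for name, coefs in betas.items()}
    theta0 = intercept / denom
    return theta0, theta

# Hypothetical ARDL(1, 0, 2) coefficients, for illustration only.
theta0, theta = long_run_coefficients(
    phi=[0.5],
    betas={"Lk": [0.2], "Lx": [0.10, 0.05, 0.05]},
    intercept=1.0,
)
```

With these illustrative inputs the denominator is 0.5, so the long-run coefficients are twice the summed short-run coefficients: 0.4 for Lk, 0.4 for Lx and 2.0 for the constant.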
As presented, the long-term coefficients for models (1) and (2) follow a similar pattern. The results show that in the long run physical capital has a highly significant effect on GDP: a one percent increase in this variable leads to a 0.48% and a 0.55% increase in GDP in models (1) and (2), respectively. By contrast, a one percent increase in human capital leads to GDP increases of only 0.018% and 0.02% in models (1) and (2), respectively. This indicates that human capital in Iran does not have an important effect on GDP. In addition, the coefficients of Lhc in both models are not statistically significant. Turning to the effect of total exports on GDP, a one percent increase in total exports leads to a 0.39% increase in GDP in model (1). This means that total exports have a highly significant and sizable effect on GDP.
The results for model (2), where total exports are disaggregated into oil and non-oil exports, show that a one percent increase in oil and non-oil exports leads to 0.37% and 0.036% increases in GDP, respectively. Thus, while non-oil exports do not have very important effects on the Iranian economy, crude oil remains a major export and the oil sector acts as the leading sector of the economy. The results also show that a one percent increase in total imports leads to a 0.08% decrease in GDP in model (1) and a 0.13% decrease in model (2). The coefficient of Lm is significant at the 5% level and its sign conforms to a priori expectations. After estimating the long-term coefficients, we obtain the error correction representation of equation (10) for both the aggregate and the disaggregated exports cases in models (1) and (2). Table 3 reports the short-run coefficient estimates obtained from the ECM version of the ARDL model.
As discussed, the error correction term indicates the speed of the adjustment which restores equilibrium in the dynamic model. The ECM coefficient shows how quickly variables return to equilibrium, and it should be statistically significant with a negative sign. Bannerjee et al. [23] hold that a highly significant error correction term is further proof of the existence of a stable long-term relationship. Table 3 shows that the ECM coefficient has the expected negative sign and is highly significant in both models. This confirms once again the existence of a cointegrating relationship among the variables of these two models. The coefficients of ECM(-1) are equal to -0.46 and -0.60 for models (1) and (2), respectively, and imply that deviations from the long-term growth rate in GDP are corrected by 46 percent in model (1) and 60 percent in model (2) over the following year. This means that the adjustment takes place relatively quickly, i.e. the speed of adjustment is relatively high, especially in model (2). Figures 1 and 2 present the forecasting errors and the plots of the actual and forecast values for models (1) and (2).
Diagnostic and stability tests: Diagnostic tests for serial correlation, functional form, normality, heteroscedasticity and structural stability of the models are considered in this study. As shown in the appendix, both models (1) and (2) generally pass all diagnostic tests in the first stage. These tests show that there is no evidence of autocorrelation and that the models pass the normality tests, indicating that the errors are normally distributed. The adjusted R-squared shows that around 99% of the variation in GDP is explained by the regressors in both models. Finally, to analyze the stability of the long-run coefficients together with the short-run dynamics, the cumulative sum (CUSUM) and the cumulative sum of squares (CUSUMQ) tests are applied.
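The reported error correction coefficients can be translated into an adjustment path with a small calculation. The sketch below assumes a simple first-order partial adjustment, in which a fraction equal to the absolute value of the ECM coefficient is closed each year and the remaining short-run terms are ignored; this simplification, the function name and the half-life summary are assumptions made for illustration rather than part of the estimated models.

```python
import math

def adjustment_profile(ecm_coef, years=5):
    """Share of an initial disequilibrium remaining after each year, plus the half-life.

    Assumes pure first-order adjustment: gap_h = (1 + ecm_coef) ** h.
    """
    if not -1.0 < ecm_coef < 0.0:
        raise ValueError("expected an ECM coefficient strictly between -1 and 0")
    remaining = [(1.0 + ecm_coef) ** h for h in range(1, years + 1)]
    half_life = math.log(0.5) / math.log(1.0 + ecm_coef)
    return remaining, half_life

# ECM(-1) coefficients reported in the text for models (1) and (2).
path_1, half_life_1 = adjustment_profile(-0.46)
path_2, half_life_2 = adjustment_profile(-0.60)
```

Under this simplification, 54 percent of a deviation remains after one year in model (1) and 40 percent in model (2), which corresponds to half-lives of roughly 1.1 and 0.76 years, respectively.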
According to Pesaran and Pesaran [14], the stability of the estimated coefficients of the error correction model should also be empirically investigated. Graphical representations of the CUSUM and CUSUMQ statistics are shown in Fig. 3 and 4. Following Bahmani-Oskooee [24], the null hypothesis (i.e. that the regression equation is correctly specified) cannot be rejected if the plots of these statistics remain within the critical bounds at the 5% significance level. As is clear from Fig. 3 and 4, the plots of both the CUSUM and the CUSUMQ statistics lie within the boundaries, and hence these statistics confirm the stability of the long-run coefficients of the GDP function in models 1 and 2.
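For readers unfamiliar with these stability plots, the following sketch shows one common way of computing them from recursive residuals. It is an illustration under stated assumptions, not the routine used to produce Fig. 3 and 4: the function names and the simulated data are hypothetical, the CUSUM is scaled by the full-sample residual standard deviation, and the 5% boundary uses the conventional constant 0.948. Critical lines for the CUSUMQ statistic come from separate tables and are not computed here.

```python
import numpy as np

def recursive_residuals(X, y):
    """Standardised one-step-ahead prediction errors from expanding-window OLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    T, k = X.shape
    w = np.empty(T - k)
    for t in range(k, T):
        Xp, yp = X[:t], y[:t]                    # observations before date t
        xtx_inv = np.linalg.inv(Xp.T @ Xp)       # requires the first k rows to be independent
        b = xtx_inv @ Xp.T @ yp
        x_t = X[t]
        w[t - k] = (y[t] - x_t @ b) / np.sqrt(1.0 + x_t @ xtx_inv @ x_t)
    return w

def cusum_statistics(X, y, a=0.948):
    """CUSUM path, its 5% boundary (symmetric, +/-) and the CUSUM-of-squares path."""
    T, k = np.asarray(X).shape
    w = recursive_residuals(X, y)
    sigma = np.sqrt(np.sum(w ** 2) / (T - k))    # full-sample residual std. deviation
    cusum = np.cumsum(w) / sigma
    r = np.arange(1, T - k + 1)
    bound = a * np.sqrt(T - k) + 2.0 * a * r / np.sqrt(T - k)
    cusumsq = np.cumsum(w ** 2) / np.sum(w ** 2)
    return cusum, bound, cusumsq

# Simulated stable regression, for illustration only.
rng = np.random.default_rng(1)
n_obs = 44
X_sim = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])
y_sim = X_sim @ np.array([1.0, 0.5]) + rng.normal(scale=0.2, size=n_obs)
cusum, bound, cusumsq = cusum_statistics(X_sim, y_sim)
stable = bool(np.all(np.abs(cusum) <= bound))    # True when the path stays inside the band
```

Models that include shift or impulse dummies need extra care, because the expanding-window regressions cannot be estimated until each dummy has taken both of its values.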

CONCLUSION
The objective of this study was to determine the major drivers of GDP growth in Iran. We first used all available annual time series data (1960-2003) to endogenously determine the two most significant structural breaks in the six variables (expressed in constant 1997 prices or actual numbers) employed in this empirical analysis. The empirical results based on the Lumsdaine and Papell [2] approach provided strong evidence against the null hypothesis of a unit root in the majority of the series under investigation. We found that the most significant structural breaks detected during the sample period correspond to the regime change associated with the 1979 Islamic revolution and the Iran-Iraq war beginning in 1980. This provides complementary evidence to models employing exogenously imposed structural breaks in the Iranian macroeconomy.
After determining the two structural breaks, and given the mixed results about the stationarity of the data, we applied the ARDL cointegration technique to the data, incorporating these breaks into the model. The error correction version of the ARDL approach was used to specify and estimate two models. Model 1 included aggregate real exports as well as human capital, physical capital and real imports as major determinants of GDP. Model 2 was similar to Model 1 with a single difference: total exports were disaggregated into oil exports and non-oil exports. Applying the ECM version of the ARDL models showed that the error correction coefficients, which determine the speed of adjustment, had the expected and highly significant negative sign. The results indicated that deviations from the long-term growth rate in GDP were corrected by approximately 46 percent over the following year in Model 1 and by 60 percent in Model 2. The diagnostic tests indicated that both models passed all the tests, that there was no evidence of autocorrelation and that the error terms were normally distributed. The CUSUM and CUSUMQ stability tests showed that the estimated coefficients of the error correction models were stable. Finally, the estimated long-term coefficients showed that while the effects of gross capital formation and oil exports on GDP were highly significant, those of non-oil exports and human capital were less influential.