Application of the Soil and Water Assessment Tool (SWAT) Model for Runoff Estimation

Problem statement: Most distributed or physically based hydrologic and water quality models from developed countries are not directly applicable in developing countries because of both a lack of data and different climatic conditions. Hence, there is a need for a study conducted in a catchment of a developing country. Approach: After a review of various models for estimating runoff, the semi-distributed Soil and Water Assessment Tool (SWAT) was selected. Sequential Uncertainty Fitting (SUFI-2), a program linked to SWAT within the Calibration Uncertainty Program known as SWAT-CUP, was used for calibration and validation analysis. Two stream gauges with adequate data for calibration and validation are located in the Taleghan basin, which has an area of 800 km² and lies northwest of Tehran, Iran. The Joestan gauging station is located upstream and measures runoff from an area of 413 km², whereas the Galinak station is located at the outlet of the 800 km² Taleghan catchment. Results: Surface runoff was 21% of precipitation for the upper part of the catchment and 33% at the outlet. Groundwater and lateral flows took place mostly in the mountainous upper part of the catchment, with contributions of 23% and 17%, respectively. Evapotranspiration losses at Joestan and Galinak stations were around 38% and 49% of precipitation, respectively. Conclusion/Recommendations: This research has successfully developed a SWAT model customized with the SUFI-2 program, to be used by water engineers and managers in planning future land and water developments in the Taleghan catchment. The database system created for the study area from dispersed datasets in a GIS environment could be used not only for modeling purposes but also for decision making. High surface runoff and low interflow at Galinak station, and the inverse at Joestan station, showed that the area downstream of Joestan station is in need of greater soil conservation measures. The main reason is the snowpack in winter and good rangeland in the other seasons. The study has produced a technique with reliable capability and high accuracy for the annual and monthly water balance components of the Taleghan catchment.


INTRODUCTION
In recent decades, hydrological models have been applied more broadly by hydrologists and water resource managers as tools to analyze water resource management systems. Hydrological models usually involve a large number of parameters that account for surface and subsurface runoff, groundwater, deep percolation, evapotranspiration, soil properties, land use, precipitation (Sorooshian and Gupta, 1995) and water quality components (Yu and Salvador, 2005). The development of these kinds of models requires adequate observed time series data and field experience, which are often unavailable in developing countries (Ndomba et al., 2005). The lack of information on water resources is especially important in qualitative studies (Yisa and Jimoh, 2010).
Numerous parameters are needed for comprehensive simulation by complex hydrological models (Eckhardt and Arnold, 2001), and the interaction of these parameters requires expert attention. Abbaspour et al. (2007) state that two very different parameter sets can produce similar signals in the observed values during calibration. A comprehensive, complex hydrologic model is also characterized by a multitude of parameters (Eckhardt and Arnold, 2001). The real magnitude of many parameters is not exactly known because of spatial variability, inaccurate measurements and so on. Therefore, model calibration is used to estimate the value of each parameter as accurately as possible. Godio (2009) focused on two snowpack parameters, density and thickness, and compared the calibrated data with results from direct measurements of snowpack density and thickness.
The main restricting factor in model performance is the lack of strategies that explicitly account for model error during calibration (Yapo et al., 1996). Users' experience in modeling and in recognizing parameters are the two most important skills for successful manual calibration of models (Eckhardt and Arnold, 2001). Many hydraulic and hydrologic modeling studies have been performed around the world (Neitsch et al., 2005; Civita et al., 2009). The SWAT model was developed by the United States Department of Agriculture-Agricultural Research Service (USDA-ARS) to predict the impact of land management practices on water, sediment and agricultural chemical yields in large ungauged basins (Arnold et al., 1995). Sequential Uncertainty Fitting (SUFI-2) is a program that is linked with ArcSWAT and was used for calibration and validation analysis by Abbaspour et al. (2007). SUFI-2 is one of five programs (SUFI-2, ParaSol, GLUE, MCMC and PSO) that are linked with SWAT in the package called SWAT Calibration Uncertainty Programs (SWAT-CUP). Its main function is to calibrate SWAT and to perform validation, sensitivity and uncertainty analysis for a watershed model created with SWAT. In addition, the SWAT model is able to estimate pollutant losses; for example, it was used to identify critical source areas of phosphorus and sediment in the Wister Lake basin in southeastern Oklahoma, USA (Busteed et al., 2009). The model is compatible with GIS and remote sensing (RS) in natural resources projects (Eyad et al., 2008). Therefore, the main objective of this study is to validate the applicability of SUFI-2 in the Taleghan River Basin, northwest of Tehran, with particular interest in setting up the runoff component of the SWAT model to improve hydrologic modeling in the basin.

MATERIALS AND METHODS
In this research, the major input data, including a Digital Elevation Model (DEM), land use map, soil map, climatologic data and stream gauge data, were collected and used as given below:
• Radar Digital Elevation Model with 85 meter resolution from the National Geographic Center of Iran
• Land use map prepared from IRS images for July 2007
• Classified soil map and field work at 1:50,000 scale obtained from the Faculty of Natural Resources of Tehran University
• Climatologic data from seven rainfall stations and five temperature stations located inside and around the basin, and data from two stream gauges, from the Water Resource Company for 1992 to 2004

Study area:
The study area is the upper part of the Taleghan dam watershed, located northwest of Tehran, the capital of Iran. It has an approximate area of 800.5 km² and lies within 50°38'-51°12' E longitude and 36°04'-36°21' N latitude. Figure 1 shows the location of the study area, referred to as the Taleghan watershed. The hydro-morphological characteristics are summarized in Table 1. The outlet stream gauge is located at Galinak, where the contributing area is 800.5 km² with 28 sub-basins (Fig. 2). Elevation in the study catchment varies between 1775 and 4362 m amsl, with a weighted average elevation of 2753 m amsl. The hypsometric information shows that the 2500-3000 m elevation class covers the largest share of the catchment (35.48%), while the 4000-4500 m class covers the smallest (0.06% of the total area). The frequency distribution of the slope classes shows that more than 52% of the area falls in the slope class >40%.

Description of SWAT:
The Soil and Water Assessment Tool (SWAT) is a semi-distributed conceptual model that operates continuously on a daily time step (Arnold et al., 1998). It is a comprehensive tool that enables the impact of land management practices on water, sediment and agricultural chemical yields to be predicted over long periods of time for large complex watersheds that have varying soils, land use and management practices (Neitsch et al., 2005). SWAT was developed to simulate the major processes of the hydrologic cycle and their interactions as simply and realistically as possible, and to use input data that are readily available for large-scale catchments, so that it can be used in routine planning and decision making (Ogden et al., 2001). One of the main advantages of SWAT is that it is computationally efficient even for the largest catchments, which makes it of practical use to land and water resources managers. The model was designed for the prediction of long-term yields rather than single flood events (Arnold et al., 1998).

Description of SUFI-2:
The SWAT parameters used for estimating discharge were estimated with the SUFI-2 program (Abbaspour et al., 2007). In SUFI-2, uncertainty is defined as the discrepancy between observed and simulated variables and is measured by the variation between them. SUFI-2 combines calibration and uncertainty analysis to find parameter uncertainties while calculating the smallest possible prediction uncertainty band. Hence, the parameter uncertainties reflect all sources of uncertainty, i.e., the conceptual model, forcing inputs (e.g., temperature) and the parameters themselves. In SUFI-2, the uncertainty of input parameters is depicted as a uniform distribution, while model output uncertainty is quantified by the 95% prediction uncertainty (95PPU). The cumulative distribution of an output variable is obtained through Latin hypercube sampling.
SUFI-2 starts by assuming a large parameter uncertainty within a physically meaningful range, so that the measured data initially fall within the 95PPU, and then narrows this uncertainty in steps while monitoring the P_factor and R_factor. The P_factor is the percentage of data bracketed by the 95% prediction uncertainty (95PPU), and the R_factor is the ratio of the average thickness of the 95PPU band to the standard deviation of the corresponding measured variable. A P_factor of 1 and an R_factor of zero describe a simulation that corresponds exactly to the measured data. In each iteration, the previous parameter ranges are updated by calculating the sensitivity matrix and the equivalent of a Hessian matrix (Magnus and Neudecker, 1988), followed by the calculation of the covariance matrix. Parameters are then updated in such a way that the new ranges are always smaller than the previous ranges and are centered on the best simulation (Abbaspour et al., 2007). These two factors can be used for statistical analysis instead of the usual measures, such as the coefficient of determination (R2) and the Nash-Sutcliffe coefficient (Nash and Sutcliffe, 1970), which only compare two signals. The other statistical measures used in this study are the coefficient of determination R2 multiplied by the coefficient of the regression line (BR2) and the Mean Square Error (MSE). In this study, all six of these variables were examined to test the calibration and validation of the simulated runoff in the Taleghan basin. Abbaspour et al. (2007) designed SUFI-2 as an optimization algorithm for sensitivity analysis, calibration, uncertainty analysis and validation. Figure 3 shows a schematic of the linkage between SWAT and SUFI-2 (Abbaspour et al., 2007). In Fig. 3, Par_inf is parameter information, par denotes parameters, LH is Latin hypercube sampling, rch is reach, fn is function and Val is value. To apply SUFI-2, the following steps are required:
Step 1: Define an objective function from the six different types in SUFI-2.
Step 2: Define minimum and maximum ranges for parameters. Due to the lack of information, it is assumed that all parameters are uniformly distributed within the basin.
Step 3: Sensitivity analysis is carried out by keeping all parameters constant at realistic values, while varying each parameter within the range assigned in Step 2.
Step 4: Initial uncertainty ranges for the parameters are selected for the first hypercube sampling. These ranges are smaller than the absolute ranges and they are subjective and depend upon experience.
Step 5: Carry out Latin hypercube sampling, which produces n parameter combinations, where n is the number of desired simulations. This number should be relatively large, approximately 500-1000.
Step 6: Calculate the objective function.
Step 7: Evaluate each sampling record with a series of measures.
Step 8: Calculate measures for assessing the uncertainties. As SUFI-2 is a stochastic procedure, statistics such as percent error, R2 and Nash-Sutcliffe, which compare two signals, are not applicable.
Step 9: Because parameter uncertainties are initially large, the value of d (the average distance between the upper and lower limits of the 95PPU band) tends to be quite large during the first round of sampling. Hence, further rounds of sampling are required with updated parameter ranges.
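The sampling and uncertainty measures in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the SWAT-CUP implementation: the function names, the toy model standing in for a SWAT run, and the parameter names and ranges are all hypothetical. It assumes the 95PPU band is bounded by the 2.5% and 97.5% percentiles of the simulated values at each time step, and that the R_factor uses the sample standard deviation of the observations.

```python
import random
import statistics


def latin_hypercube(ranges, n, seed=0):
    """Draw n parameter sets. Each range is split into n equal strata and
    every stratum is sampled exactly once per parameter (Step 5)."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in ranges.items():
        # One uniform draw inside each stratum, then shuffle the strata so
        # that parameters are paired at random.
        points = [lo + (hi - lo) * (i + rng.random()) / n for i in range(n)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges} for i in range(n)]


def percentile(values, q):
    """Percentile by linear interpolation, with q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def ppu95(simulations):
    """Lower and upper 95PPU limits per time step across all runs
    (assumed here to be the 2.5% and 97.5% percentiles)."""
    per_step = list(zip(*simulations))
    lower = [percentile(v, 2.5) for v in per_step]
    upper = [percentile(v, 97.5) for v in per_step]
    return lower, upper


def p_and_r_factor(observed, lower, upper):
    """P_factor: fraction of observations inside the band.
    R_factor: mean band width divided by the std. dev. of observations."""
    inside = sum(1 for o, l, u in zip(observed, lower, upper) if l <= o <= u)
    p_factor = inside / len(observed)
    mean_width = statistics.fmean(u - l for l, u in zip(lower, upper))
    r_factor = mean_width / statistics.stdev(observed)
    return p_factor, r_factor


# Hypothetical stand-in for a SWAT run: scales a fixed seasonal shape.
SEASON = [0.4, 0.5, 0.9, 1.8, 2.0, 1.2, 0.7, 0.5, 0.4, 0.4, 0.4, 0.4]


def toy_monthly_flow(params):
    return [params["scale"] * s + params["baseflow"] for s in SEASON]


ranges = {"scale": (5.0, 15.0), "baseflow": (0.0, 4.0)}  # illustrative only
samples = latin_hypercube(ranges, n=500)
runs = [toy_monthly_flow(p) for p in samples]
lower, upper = ppu95(runs)
observed = toy_monthly_flow({"scale": 10.0, "baseflow": 2.0})
print(p_and_r_factor(observed, lower, upper))
```

Narrowing the entries of `ranges` and repeating the run mimics one further SUFI-2 iteration: the band width, and with it the R_factor, shrinks, at the risk of bracketing fewer observations.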
Watershed delineation by the SWAT model divided the catchment at the Galinak and Joestan gauging stations into 28 and 25 hydrological sub-basins, with 185 and 95 Hydrologic Response Units (HRUs), respectively. The soil map consists of 11 soil types, with the attributes depth, electrical conductivity, texture, available water content, saturated hydraulic conductivity and carbon content for the different layers. The land use map included nine classes, which were recoded into SWAT generic land uses. The final land use classes were agriculture (AGRL), rangeland (RNGE), orchard (ORCD), urban (URBN), water (WATR) and river bed (NCRP). The simulation period covered 13 years, from 1992 to 2004. The first three years were used for warm-up (model setup), the following six years for calibration and the remaining four years for validation. Manual calibration for mean annual runoff was the initial step taken to gain a general view of the effective parameters in SWAT.
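The split of the 13-year record described above can be written out explicitly. The helper below is a hypothetical sketch (its name and signature are not part of SWAT or SWAT-CUP); it only slices the list of simulation years by the stated period lengths.

```python
def split_periods(first_year, last_year, warmup, calibration):
    """Split an inclusive range of years into warm-up, calibration and
    validation periods; validation takes whatever years remain."""
    years = list(range(first_year, last_year + 1))
    return {
        "warm_up": years[:warmup],
        "calibration": years[warmup:warmup + calibration],
        "validation": years[warmup + calibration:],
    }


periods = split_periods(1992, 2004, warmup=3, calibration=6)
for name, years in periods.items():
    print(name, years[0], "-", years[-1], f"({len(years)} years)")
```

With three warm-up years and six calibration years, the remaining validation period contains the last four years of the record.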
In this study there were over sixty parameters in the ArcSWAT modeling system. After a comprehensive review of the literature on hydrological models, 12 flow parameters were identified as important and were ranked by their sensitivity. The t-Stat and p-Value are the two measures used to evaluate sensitivity in SWAT-CUP. The t-Stat provides a measure of sensitivity, with larger absolute values indicating greater sensitivity, while the p-Value determines the significance of the sensitivity, with values close to zero being more significant. Calibration (1995-2000) and validation (2001-2004) were performed at the Galinak and Joestan stream gauge stations (Fig. 4-7). The Muskingum routing method was selected to route water through the channel network. Six types of objective function were tested with SUFI-2 to select the best one at Galinak station (the outlet): the multiplicative form of the square error (mult), the summation form of the square error (sum), the coefficient of determination (R2), Chi-squared (χ2), Nash-Sutcliffe (NS) and the coefficient of determination R2 multiplied by the coefficient of the regression line (BR2).
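Several of the measures named above can be computed directly from paired observed and simulated series. The sketch below uses hypothetical function names and makes two assumptions that are not stated in the text: the regression line is fitted with simulated values as the dependent variable, and BR2 is weighted as |b|·R2 when |b| ≤ 1 and R2/|b| otherwise, which is the commonly used form of this criterion. Constant series (zero variance) are not handled.

```python
def mean(values):
    return sum(values) / len(values)


def nash_sutcliffe(obs, sim):
    """NS = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    m = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den


def slope_and_r2(obs, sim):
    """Slope b of the regression of sim on obs (assumed orientation)
    and the coefficient of determination R2."""
    mo, ms = mean(obs), mean(sim)
    sxy = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((s - ms) ** 2 for s in sim)
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def br2(obs, sim):
    """R2 weighted by the regression slope (assumed piecewise form)."""
    b, r2 = slope_and_r2(obs, sim)
    return abs(b) * r2 if abs(b) <= 1.0 else r2 / abs(b)


def mse(obs, sim):
    """Mean square error between observed and simulated values."""
    return mean([(o - s) ** 2 for o, s in zip(obs, sim)])


# Illustrative numbers only, not data from the study.
observed = [4.0, 6.0, 12.0, 25.0, 30.0, 14.0]
simulated = [5.0, 7.0, 10.0, 22.0, 33.0, 12.0]
print("NS  =", round(nash_sutcliffe(observed, simulated), 3))
print("R2  =", round(slope_and_r2(observed, simulated)[1], 3))
print("BR2 =", round(br2(observed, simulated), 3))
print("MSE =", round(mse(observed, simulated), 3))
```

The weighting by the slope penalizes simulations that follow the observed pattern (high R2) but systematically over- or under-predict its magnitude, which R2 alone cannot detect.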
A P_factor of 1 and an R_factor of zero describe a simulation that corresponds exactly to the measured data. How close P_factor and R_factor come to these values can be used to judge the strength of the calibration. A large P_factor can be achieved at the expense of a large R_factor (Abbaspour et al., 2007). Hence, a balance must often be reached between the two. When acceptable values of R_factor and P_factor are reached, the parameter uncertainty gives the desired parameter ranges. Goodness of fit can be further quantified by the R2 and/or Nash-Sutcliffe (NS) coefficient between the observations and the final "best" simulation. It should be noted that, being a stochastic procedure, SUFI-2 does not seek a single best simulation; the best solution it looks for is the final set of parameter ranges. Table 2 shows the results of the sensitivity analysis for the Taleghan basin. It shows that the base flow alpha factor (ALPHA_BF), snowfall temperature (SFTMP) and groundwater delay time (GW_DELAY) are the more sensitive parameters.

Table 3: Statistical analysis to select the best objective function at Galinak station for the calibration period

RESULTS
Table 4: Statistical analysis to select the objective function at Galinak station for the validation period

Six variables (P_factor, R_factor, R2, NS, BR2 and MSE) were computed at Galinak station for all types of objective function. They indicate that Nash-Sutcliffe gives the best fit; for monthly comparisons, these variables were 0.89, 1.35, 0.89, 0.89, 0.8708 and 18.59 for the calibration period and 0.71, 1.31, 0.8, 0.79, 0.6798 and 25.76 for the validation period, respectively. Finally, the Nash-Sutcliffe objective function was applied at Joestan station, where the variables were 0.82, 1.35, 0.75, 0.67, 0.7134 and 20.25 for the calibration period and 0.79, 1.13, 0.77, 0.63, 0.6886 and 24.79 for the validation period.
The results for monthly discharge at Galinak station are shown for the calibration and validation periods in Tables 3 and 4. Table 3 shows that the P_factor is 0.92 for the Nash-Sutcliffe function. Although this value is greater than those of the other objective functions, the differences are small and all values are close to 1. The R_factor of 1.01 for Nash-Sutcliffe is the smallest among the objective functions, although it is still some distance from zero. Therefore, this objective function shows a good fit. Other statistical variables were used to support this judgment. The coefficient of determination (R2), Nash-Sutcliffe (NS) and BR2 showed higher values, closer to 1, and the lowest mean square error (MSE), 18.59, was obtained with this objective function. Therefore, Nash-Sutcliffe was chosen as the best objective function at Galinak station in this study. The statistical analysis shows similar results at the Joestan stream gauge (Table 4). The same objective function was applied at Joestan station, and the statistical analysis showed a reliable coefficient of determination and coefficient of efficiency.
However, the statistical results for Galinak station are much better than those for Joestan station (Table 5). This table indicates that the mean absolute relative error at Galinak station is also lower than that at Joestan station.

DISCUSSION
These statistical analyses indicate a fair model calibration and validation for discharge using the ArcSWAT and SUFI-2 interface in the Taleghan basin. The results show reliable values for the flow calibration and validation periods at both Galinak and Joestan stations. The model fits relatively well in the lower part of the basin, at Galinak station. The upper part of the basin shows relatively lower accuracy than the lower part because of greater snowmelt and more complex water balance components.
The results of the statistical evaluation of model performance for monthly discharge in the calibration and validation periods at the Joestan and Galinak stream gauge stations are summarized in Table 5. The values of the mean absolute relative error (MARE) at the two stations are generally low and close to zero. The complexity of the snowmelt process leads to a slightly higher percentage of error at Joestan station. The R2 and NS coefficients are two important statistics for evaluating the results. According to Norusis (1999), when R2 equals 1 the regression model is considered a perfect fit, whereas if R2 is lower than 0.5 (approaching zero) the model is considered unsuitable. For Joestan station, the R2 values for the relationship between observed and predicted average monthly discharge were 0.76 and 0.83 during the calibration and validation periods, respectively. The corresponding values for Galinak station were 0.84 and 0.90. The optimal value of the coefficient of determination is 1. This statistic measures the goodness of fit between simulations and observations. Therefore, the results at both stations and in both periods (calibration and validation) show a good fit for mean monthly flow in the study area. Motovilov et al. (1999) stated that, according to common practice, the simulation of a model is considered good for coefficient of efficiency values greater than 0.75 and acceptable for values between 0.36 and 0.75. These ranges were adopted in this study to classify model performance. Thus, the results derived from both stream gauges can be used for this study. For the last statistical criterion, the coefficient of efficiency, Joestan station lies around the lower limit of the good range (0.75), while Galinak station lies in the middle of the good range (0.75-1). Therefore, in general, the ArcSWAT model is capable of reproducing mean monthly discharge in the Taleghan area.
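The performance classes adopted above, together with the mean absolute relative error, can be expressed as a short sketch. The function names are hypothetical; the label used for efficiency values below 0.36 and the treatment of the boundary values 0.36 and 0.75 are assumptions, since the text does not specify them. MARE is taken here in its usual form, the mean of |observed - simulated| / |observed|, which requires non-zero observations.

```python
def classify_efficiency(ns):
    """Classify a coefficient of efficiency using the ranges adopted in
    the text: good above 0.75, acceptable from 0.36 to 0.75."""
    if ns > 0.75:
        return "good"
    if ns >= 0.36:  # boundary handling is an assumption
        return "acceptable"
    return "unsatisfactory"  # label assumed; not named in the text


def mare(obs, sim):
    """Mean absolute relative error; observations must be non-zero."""
    return sum(abs(o - s) / abs(o) for o, s in zip(obs, sim)) / len(obs)


# Illustrative values only, not results from the study.
for ns in (0.82, 0.60, 0.20):
    print(ns, "->", classify_efficiency(ns))
print("MARE =", mare([10.0, 20.0, 40.0], [11.0, 18.0, 42.0]))
```

Because MARE divides by each observation, months with very low flow can dominate the score, which is worth keeping in mind when comparing a snow-dominated station with an outlet station.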
Consequently, we conclude that (i) the models are good, in that they fit the study area and the type of data fed into them, and (ii) the predictions of both are generalizable.

CONCLUSION
In this study, SUFI-2 was used for model calibration and validation. With SUFI-2, we could perform uncertainty analysis and calibrate the model for a larger number of parameters.
The monthly proportions of the different water pathways contributing to river flow are shown in Fig. 8 for Joestan station and in Fig. 9 for Galinak station. From April to the end of May, most of the river flow originates from surface runoff because of the intense storms and snowmelt occurring during that period. Most of the surface runoff in June depends on snowmelt at high elevations. A comparison of the mean monthly surface components between April and May shows a large difference at Joestan station (100%), while the same comparison at Galinak station shows little variation between the two months (5%). This indicates a delay due to snowmelt at Joestan station, which is located at high elevation and is characterized by low temperatures. There is a long dry season that extends from July to the end of February of the following year.
These statistical analyses indicate a fair model calibration and validation for discharge using the SWAT and SUFI-2 interface in the Taleghan basin. The results show reliable values for the flow calibration and validation periods at both Galinak and Joestan stations. In other words, a database system for investigating changes in the water balance across different land uses within the Taleghan watershed was successfully developed.

ACKNOWLEDGEMENT
First of all, praise be to the Merciful Allah, who has enabled me to complete this study in sound health. I wish to express my deep appreciation and gratitude to Professor Dr. Mohd Amin Mohd Soom for his valuable guidance and to Associate Professor Dr. Ghafouri for his friendly guidance and supervision of this thesis.