Modeling of Rainfall Characteristics for Monitoring of the Extreme Rainfall Event in Makassar City

Corresponding Author: Wahidah Sanusi, Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Makassar, 90224, Parangtambung Makassar, Sulawesi Selatan, Indonesia
Email: wahidah.sanusi@unm.ac.id

Abstract: Flooding is a common problem in several regions of Indonesia, including Makassar city. Flood-control planning requires rainfall variables such as the frequency, intensity and duration of rainfall. The relationship among these variables can be expressed as an Intensity-Duration-Frequency (IDF) curve. The objectives of this study are to identify the best-fitting distribution for the rainfall data of Makassar city and to model the relationship between rainfall intensity, duration and frequency through IDF curves. The study uses annual maximum daily rainfall data from the Ujung Pandang rainfall station in Makassar for the period 1986-2015, collected at the Department of Water Resources Management in South Sulawesi province. Five distributions are considered: Gumbel, Generalized Extreme Value (GEV), Generalized Pareto (GPA), Generalized Logistic (GLO) and Pearson type III (P3). The results show that the rainfall data of Makassar city are best described by the GEV distribution. The IDF curves show that rainfall intensity decreases as rainfall duration increases, for all return periods considered. The results of this study are expected to be valuable information for designers of water management systems.


Introduction
A flood is a natural disaster that can cause losses to human life and the environment. Floods also contribute to the emergence of tropical diseases such as dengue fever (Syafruddin and Noorani, 2013). One way to anticipate the impact of flood events is to estimate the wetness of an area. Du et al. (2013) investigated the spatiotemporal variation of dry/wet conditions with the Standardized Precipitation Index (SPI) in Hunan, China. Sanusi and Ibrahim (2012) predicted wet class transitions using a log-linear model. Meanwhile, Cai (2010) estimated the wet and low-water states of precipitation with weighted Markov chain methods.
Probability distribution models for hydrological data, such as rainfall data, play a very important role in providing information about the behavior and characteristics of rainfall. Various probability distributions have been used to model the rainfall data of an area. Some climate researchers have used the gamma distribution for rainfall data (Aksoy, 2000; May, 2004) because of its ability to describe the characteristics of rainfall. Other probability distributions are also often used, such as the lognormal distribution (LN), Gumbel distribution, Pearson type III distribution (P3), Weibull distribution, Generalized Extreme Value distribution (GEV), Generalized Pareto distribution (GPA) and Generalized Logistic distribution (GLO) (Hosking and Wallis, 1997). Kysely and Picek (2007) found that the fitted GEV distribution suits extreme rainfall amounts in the northeast of the Czech Republic better than in other regions. The generalized Pareto distribution has been used to study extreme values (Singh and Guo, 1997). Similarly, Shabri and Ariff (2009) showed that the GLO distribution is the most appropriate for annual maximum rainfall data in Selangor. The LN distribution has also been used in agriculture and hydrology (Yang, 2000).
Meanwhile, the LP distribution, also known as the Gamma distribution with three parameters, has been used to describe annual one-day maximum rainfall in Ludhiana, Punjab (Kumar and Bhardwaj, 2015).
The rainfall Intensity-Duration-Frequency (IDF) relationship is one of the tools used in planning and design for flood events (Prodanovic and Simonovic, 2007). IDF curves allow the return period of an observed rainfall event to be estimated or, conversely, the rainfall amount corresponding to a given return period to be estimated for different aggregation times (Koutsoyiannis et al., 1998). Norlida et al. (2011) estimated the IDF curve using the Generalized Pareto distribution in the Klang region, Malaysia. Meanwhile, Soro et al. (2010) found that the Gumbel and Lognormal distributions are suitable for estimating the IDF curve in a tropical area of West Africa. In this study, the Gumbel, GEV, GPA, GLO and P3 distributions are used to model the rainfall Intensity-Duration-Frequency relationship in Makassar city.

Data and Study Area
In this study, daily rainfall amount data (in mm) from the Ujung Pandang rainfall station in Makassar for the period 1986-2015 were considered for analysis. The data were obtained from the Department of Water Resources Management in South Sulawesi province, Indonesia. The station was selected not only for the completeness of its data, but also for having the longest available record.

Probability Distribution Models
In this study, five distributions are fitted to the daily rainfall amounts, namely the Gumbel, GEV, GPA, GLO and P3 distributions. Li et al. (2015) stated that these probability distributions have the advantages of being simple, superior and popular in frequency analysis of extreme events. Their cumulative distribution functions are, respectively, as follows:

$$F(x) = \exp\left[-\exp\left(-\frac{x-\xi}{\alpha}\right)\right]$$

$$F(x) = \exp\left[-\left(1-\frac{\kappa(x-\xi)}{\alpha}\right)^{1/\kappa}\right]$$

$$F(x) = 1-\left(1-\frac{\kappa(x-\xi)}{\alpha}\right)^{1/\kappa}$$

$$F(x) = \left[1+\left(1-\frac{\kappa(x-\xi)}{\alpha}\right)^{1/\kappa}\right]^{-1}$$

$$F(x) = \frac{1}{\alpha\,\Gamma(\kappa)}\int_{\xi}^{x}\left(\frac{t-\xi}{\alpha}\right)^{\kappa-1}\exp\left(-\frac{t-\xi}{\alpha}\right)dt$$

Where:
$x$ = Daily rainfall amount (mm)
$\xi$ = Location parameter
$\alpha$ = Scale parameter ($\alpha > 0$)
$\kappa$ = Shape parameter
$\Gamma(\cdot)$ = The Gamma function

Parameter Estimation of Probability Distribution
Hosking and Wallis (1997) introduced the L-moments method for estimating the parameters of certain statistical distributions. The advantage of the L-moments method is that the parameter estimates are more reliable and more robust (Deka et al., 2009; Eslamian and Feizi, 2007). L-moments are summary statistics for probability distributions and data samples and are analogous to ordinary moments (Hosking and Wallis, 1997). They provide measures of location, dispersion, skewness, kurtosis and other aspects of the shape of probability distributions or data samples.
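L-moments are linear combinations of Probability Weighted Moments (PWM). Let $x_{1:n} \le \cdots \le x_{n:n}$ be the ordered sample, where $n$ is the sample size. Hosking and Wallis (1997) gave an estimator of the PWM, $\beta_r$, as follows:

$$\beta_r = \frac{1}{n}\sum_{j=r+1}^{n}\frac{(j-1)(j-2)\cdots(j-r)}{(n-1)(n-2)\cdots(n-r)}\,x_{j:n}$$

The first four L-moments are given by:

$$\lambda_1 = \beta_0$$

$$\lambda_2 = 2\beta_1 - \beta_0$$

$$\lambda_3 = 6\beta_2 - 6\beta_1 + \beta_0$$

$$\lambda_4 = 20\beta_3 - 30\beta_2 + 12\beta_1 - \beta_0$$

where $\lambda_1$ is the measure of location (L-mean) and $\lambda_2$ is the L-scale. Hosking and Wallis (1997) defined the L-moment ratios used in hydrological extreme analysis as follows:

$$\tau = \frac{\lambda_2}{\lambda_1}, \qquad \tau_3 = \frac{\lambda_3}{\lambda_2}, \qquad \tau_4 = \frac{\lambda_4}{\lambda_2}$$

where $\tau$ is the measure of the coefficient of variation (L-Cv), $\tau_3$ is the measure of skewness (L-Cs) and $\tau_4$ is the measure of kurtosis (L-Ck). Details on estimating the parameters of the distributions mentioned above using the L-moments method can be found in Hosking and Wallis (1997).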
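As a rough illustration of the estimators described above, the following Python sketch computes the sample PWMs, the first four L-moments and the L-moment ratios for a list of annual maxima. The function names (`sample_pwms`, `sample_l_moments`) and the example values are hypothetical and are not taken from the study's data; the weights follow the unbiased PWM estimator of Hosking and Wallis (1997).

```python
from math import comb


def sample_pwms(data, max_r=3):
    """Return the unbiased sample PWMs of orders 0..max_r."""
    x = sorted(data)  # x[j - 1] is the order statistic x_{j:n}
    n = len(x)
    if n <= max_r:
        raise ValueError("need more observations than the highest PWM order")
    pwms = []
    for r in range(max_r + 1):
        # (j-1)(j-2)...(j-r) / ((n-1)(n-2)...(n-r)) equals C(j-1, r) / C(n-1, r)
        total = sum(comb(j - 1, r) * x[j - 1] for j in range(r + 1, n + 1))
        pwms.append(total / (n * comb(n - 1, r)))
    return pwms


def sample_l_moments(data):
    """Return the first two L-moments and the ratios L-Cv, L-Cs and L-Ck."""
    b0, b1, b2, b3 = sample_pwms(data, max_r=3)
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return {"l1": l1, "l2": l2, "l_cv": l2 / l1, "l_cs": l3 / l2, "l_ck": l4 / l2}


if __name__ == "__main__":
    # Hypothetical annual maximum daily rainfall values (mm), for illustration only.
    example = [92.0, 105.0, 118.0, 130.0, 147.0, 160.0, 188.0, 215.0]
    print(sample_l_moments(example))
```

Writing the weights as ratios of binomial coefficients keeps the arithmetic in exact integers until the final division, which avoids accumulating rounding error in the products.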

Selection of the Fitted Distribution Models
In this study, the best probability distribution for the rainfall data is chosen based on three criteria: the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE) and the Correlation Coefficient (CC) (Zalina et al., 2002; Zin et al., 2009).
The Root Mean Square Error (RMSE) indicates how accurately a given probability distribution predicts the measured values. The smaller the RMSE, the more accurate the probability distribution. The RMSE is expressed as:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - Q(F_i)\right)^2}$$

Where:
$x_i$ = Annual maximum daily rainfall data (mm)
$n$ = Number of observations
$Q(F_i)$ = The $i$th quantile estimate for the corresponding probability distribution

The Mean Absolute Error (MAE) is the average absolute difference between the measured values and the predicted values. The best probability distribution model is the one whose MAE is closest to 0. The MAE is defined as:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - Q(F_i)\right|$$

Where:
$x_i$ = Annual maximum daily rainfall data (mm)
$n$ = Number of observations
$Q(F_i)$ = The $i$th quantile estimate for the corresponding probability distribution

The Correlation Coefficient (CC) describes how much of the dispersion of the measured values is explained by the predicted values. The closer the CC is to one, the better the fit of the probability distribution model. The CC is given as:

$$CC = \frac{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)\left(Q(F_i)-\bar{Q}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2\,\sum_{i=1}^{n}\left(Q(F_i)-\bar{Q}\right)^2}}$$

Where:
$x_i$ = Annual maximum daily rainfall data (mm)
$\bar{x}$ = Mean of the $x_i$
$\bar{Q}$ = Mean of the quantile estimates $Q(F_i)$

Furthermore, the most suitable probability distribution for the rainfall data was used to derive rainfall intensity estimates for various return periods.
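The three selection criteria can be sketched in a few lines of Python. This is a minimal illustration using the standard definitions of RMSE, MAE and the Pearson correlation coefficient; the function names are hypothetical, and `observed` and `predicted` stand for the sorted annual maxima and the matching quantile estimates Q(F_i), whose plotting positions F_i are not specified here.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error between observed values and quantile estimates."""
    n = len(observed)
    return sqrt(sum((x - q) ** 2 for x, q in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed values and quantile estimates."""
    n = len(observed)
    return sum(abs(x - q) for x, q in zip(observed, predicted)) / n


def correlation(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_q = sum(predicted) / n
    sxq = sum((x - mean_x) * (q - mean_q) for x, q in zip(observed, predicted))
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sqq = sum((q - mean_q) ** 2 for q in predicted)
    return sxq / sqrt(sxx * sqq)


def goodness_of_fit(observed, predicted):
    """Bundle the three criteria so candidate distributions can be compared."""
    return {
        "rmse": rmse(observed, predicted),
        "mae": mae(observed, predicted),
        "cc": correlation(observed, predicted),
    }
```

Calling `goodness_of_fit` once per candidate distribution gives a small table of scores; the preferred candidate is the one with the smallest RMSE and MAE and the CC closest to one.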

Results and Discussion
The data used in this study are the daily rainfall amounts recorded at the Ujung Pandang rainfall station, located in Makassar city, for the period 1986-2015. Table 1 shows that during 1986-2015 the maximum daily rainfall amount was 376 mm, which occurred in February 2000. February falls within the rainy season in Makassar city. Over this period, the average daily rainfall is 38 mm. Meanwhile, the longest rainfall duration, 30 days, occurred in December 1996. December is also the peak time of heavy rains in Makassar. Figure 1 shows that the average daily rainfall increases from November to January and decreases in February. These results indicate that Makassar city experiences the rainy season from November to April, with a peak from December to February. In contrast, Makassar city experiences the dry season from May to October, with a peak from July to September.
Meanwhile, the best-fitting probability distribution is the one with the MAE closest to zero, the smallest RMSE and the CC closest to one. Table 2 shows that the Generalized Extreme Value distribution (GEV) is the most suitable, compared with the other distributions, for daily rainfall at Ujung Pandang station.
The parameter values for the GEV probability distribution are given in Table 3. Based on this best-fitting model, the design rainfall was estimated for return periods of two years (T2), five years (T5), 10 years (T10), 30 years (T30), 50 years (T50) and 100 years (T100) (Table 4). These design rainfall values are then used to estimate the rainfall intensity for various rainfall durations and return periods, as shown in Table 5.
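The step from fitted GEV parameters to design rainfall can be sketched as follows. This is an illustration under stated assumptions, not the study's own computation: it uses the GEV form with location ξ, scale α and shape κ, Hosking's rational approximation for κ from the L-skewness (Hosking and Wallis, 1997), and the usual annual-maximum convention F = 1 - 1/T for a return period of T years. The function names and the L-moment values in the example are hypothetical and are not the values of Table 3.

```python
from math import exp, gamma, log


def gev_parameters(l1, l2, t3):
    """Estimate GEV (xi, alpha, kappa) from the L-mean, L-scale and L-skewness."""
    c = 2.0 / (3.0 + t3) - log(2.0) / log(3.0)
    kappa = 7.8590 * c + 2.9554 * c * c  # Hosking's approximation
    if abs(kappa) < 1e-8:
        raise ValueError("kappa is close to 0; use the Gumbel distribution instead")
    g = gamma(1.0 + kappa)
    alpha = l2 * kappa / ((1.0 - 2.0 ** (-kappa)) * g)
    xi = l1 - alpha * (1.0 - g) / kappa
    return xi, alpha, kappa


def gev_cdf(x, xi, alpha, kappa):
    """GEV cumulative distribution function F(x)."""
    y = 1.0 - kappa * (x - xi) / alpha
    if y <= 0.0:
        # Outside the support: below the lower bound (kappa < 0)
        # or above the upper bound (kappa > 0).
        return 0.0 if kappa < 0.0 else 1.0
    return exp(-(y ** (1.0 / kappa)))


def gev_quantile(f, xi, alpha, kappa):
    """Inverse of the GEV CDF for a non-exceedance probability 0 < f < 1."""
    return xi + alpha * (1.0 - (-log(f)) ** kappa) / kappa


def design_rainfall(return_period, xi, alpha, kappa):
    """Rainfall amount exceeded on average once every return_period years."""
    return gev_quantile(1.0 - 1.0 / return_period, xi, alpha, kappa)


if __name__ == "__main__":
    # Hypothetical L-moments (mm, mm, dimensionless), for illustration only.
    xi, alpha, kappa = gev_parameters(l1=120.0, l2=30.0, t3=0.25)
    for t in (2, 5, 10, 30, 50, 100):
        print(t, round(design_rainfall(t, xi, alpha, kappa), 1))
```

The explicit check on κ is there because the quantile formula divides by κ; in the limit κ → 0 the GEV reduces to the Gumbel distribution, which has its own closed-form quantile.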
The rainfall intensity values in Table 5 are then used to determine the rainfall Intensity-Duration-Frequency relationship. Based on the IDF curves in Fig. 2, the relationship between the rainfall Duration (D) and the rainfall Intensity (I) is an exponential function. Figure 3 also shows that the longer the rainfall duration, the smaller the rainfall intensity.
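One way such an exponential relationship could be fitted is sketched below. The functional form I = a·exp(-b·D) is an assumption made for illustration, since the fitted equation itself is not reproduced in this section; the coefficients a and b, the function names and the synthetic intensities are all hypothetical. The fit is an ordinary least-squares regression of ln(I) on D for a single return period.

```python
from math import exp, log


def fit_exponential_idf(durations, intensities):
    """Fit I = a * exp(-b * D) by linear least squares on ln(I)."""
    n = len(durations)
    log_i = [log(i) for i in intensities]
    mean_d = sum(durations) / n
    mean_y = sum(log_i) / n
    sxx = sum((d - mean_d) ** 2 for d in durations)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(durations, log_i))
    slope = sxy / sxx
    a = exp(mean_y - slope * mean_d)  # intercept of the log-linear fit
    b = -slope  # positive b means intensity falls as duration grows
    return a, b


def predict_intensity(duration, a, b):
    """Evaluate the fitted curve at a given duration."""
    return a * exp(-b * duration)


if __name__ == "__main__":
    # Synthetic intensities (mm/hour) for durations in hours, for illustration only.
    durations = [1.0, 2.0, 3.0, 6.0, 12.0, 24.0]
    intensities = [80.0 * exp(-0.15 * d) for d in durations]
    a, b = fit_exponential_idf(durations, intensities)
    print(a, b)
```

Fitting on the log scale weights relative rather than absolute errors, which may or may not be appropriate for design purposes; a nonlinear least-squares fit on the original scale is an alternative.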

Conclusion
The first purpose of this study is to determine the best-fitting distribution. Five distributions are considered, namely the Gumbel, Generalized Extreme Value (GEV), Generalized Pareto (GPA), Generalized Logistic (GLO) and Pearson type III (P3) distributions. Based on the results, the Generalized Extreme Value distribution (GEV) is the most suitable for describing the daily rainfall patterns in Makassar city.
The other aim of this study is to construct IDF curves using the GEV distribution to describe the relationship between intensity, duration and return period. Return periods are used as the measure of the frequency of rainfall occurrence. Based on the IDF model, the relationship takes the form of an exponential function in which longer rainfall durations are accompanied by lower rainfall intensities, for the various return periods.
In this study, the best statistical distribution of rainfall and the IDF model have been identified. Because studies on the probability distribution of extreme rainfall events in Makassar city are currently limited, these results could provide useful information for estimating the design rainfall in frequency analysis, in particular for extreme rainfall events.