Statistical Modelling of Botswana’s Monthly Maximum Wind Speed using a Four-Parameter Kappa Distribution

Corresponding Author: Oliver Moses, Okavango Research Institute, University of Botswana, Botswana. Tel: +267 6817203, Fax: +267 6861835, Email: omoses@ori.ub.bw

Abstract: Wind speed modelling has been key to many environmental and engineering applications, particularly in environmentally friendly wind power generation to meet energy demands. Efficient assessment of wind speed at different recurrence intervals requires the choice of a suitable statistical distribution and an unbiased method of parameter estimation. This study suggests the use of a four-parameter Kappa distribution, with its parameters estimated using the method of L-moments, to model Botswana’s monthly maximum wind speed data at six synoptic weather stations. These stations are Gaborone, Sir Seretse Khama Airport, Tsabong, Tshane, Gantsi and Maun, which are broadly spread across the country’s centres of economic activity. Reliable wind speed quantiles have been obtained for the selected stations and fall within the interval 13.80 to 21.69 m s⁻¹. Mean maximum wind speeds range between 12.65 and 14.97 m s⁻¹, with standard deviations ranging between 1.58 and 2.44 m s⁻¹. These results can reliably be used by environmentalists and technologists working in the energy sector in Botswana.


Introduction
Wind speed is an important factor in many sectors. For instance, it is a key parameter in the estimation of evapotranspiration for agricultural purposes (Valipour and Eslamian, 2014; Valipour, 2014a; 2014b; 2015a; 2015b; 2015c; 2015d). It is also important to engineers in the construction industry, who use it to calculate wind uplift loads (Liu, 1991; Dyrbye and Hansen, 1997; Rosowsky and Cheng, 1999; Kumar and Stathopoulos, 2000; Zhou et al., 2002). In this study, Botswana's wind speed is studied mainly for renewable energy applications (Chiodo, 2013; Mukhopadhyay et al., 2014). With the growing rate of urbanisation, Botswana's urban clusters have recently been experiencing energy shortages. It has therefore become necessary to explore alternative, environmentally friendly energy sources such as windmills to meet localised demands. This would reduce the use of energy from fossil fuels, the main sources of greenhouse gases. For efficient wind power evaluation, it is essential to model wind speed using a statistical distribution that represents the observations accurately, and to estimate the parameters of this distribution using an appropriate technique (Parida, 1999; Shabri and Jemain, 2010; Chiodo, 2013; Mukhopadhyay et al., 2014). Although researchers have suggested the use of the Weibull distribution (Akpinar and Akpinar, 2004; Azad et al., 2014), this may not be applicable in semi-arid regions where wind speeds are highly variable. In view of this, a four-parameter Kappa distribution, with its parameters estimated using the L-moments method, has been used to model Botswana's monthly maximum wind speeds. This distribution encompasses a family of distributions that can effectively describe the peculiarities in data variability (Hosking, 1986; 1990; 1994; Parida, 1999; Shabri and Jemain, 2010).
When used for parameter estimation, the L-moments procedure gives unbiased parameter estimates and hence yields unbiased wind speed quantiles, which can be used to develop growth curves at various recurrence intervals. Planners, designers and practising engineers can use this information to plan and produce wind power efficiently (Chiodo, 2013; Mukhopadhyay et al., 2014).
Botswana's monthly maximum wind speed data have not previously been modelled using a four-parameter Kappa distribution with its parameters estimated by the method of L-moments. This research is therefore the first of its kind in the country and uses one of the best techniques currently available.

Data
Historical monthly maximum wind speed data at 10 m height for the synoptic weather stations Maun, Gantsi, Tsabong, Sir Seretse Khama Airport (SSKA), Tshane and Gaborone (Fig. 1) were obtained from the Botswana Department of Meteorological Services. The stations were selected because of their data availability and because they are centres of economic activity. The lengths of the data sets vary from station to station, but all fall within the period 1960 to 2005.

Methods
Two aspects make statistical modelling successful. The first is the choice of an appropriate statistical distribution that can adequately describe the observations. The second is the choice of an appropriate parameter estimation technique, one that yields quantiles with the least bias and the least mean squared error. A good statistical model should meet both the descriptive and the predictive ability aspects (WMO, 1989), so that more meaningful answers can be deduced from it. Two-parameter (2-P) distributions such as the Normal, Exponential, Gamma or Gumbel can account for the descriptive ability aspect well, but fail to account for the predictive ability aspect (Parida, 1999). Three-parameter (3-P) distributions such as the Generalised Pareto, Generalised Extreme Value and Generalised Normal distributions are better in both aspects, but sometimes encounter problems with the conditions of separation. Hosking (1986; 1990; 1994) showed that the method of Probability Weighted Moments (Greenwood et al., 1979) or the method of L-moments gives unbiased parameter estimates and hence unbiased quantiles, even when the choice of the parent distribution is inappropriate. This means that one should choose a 3-P distribution rather than a 2-P distribution, since it yields the least biased quantiles with the least mean squared error. Parida (1999) found that the use of a four-parameter (4-P) distribution such as the 4-P Kappa can overcome the problem of an inappropriate 2-P or 3-P distribution, because the 4-P Kappa reduces to a 2-P or 3-P distribution depending on the values of its shape parameters (Table 1). A 4-P Kappa distribution is therefore able to account for the descriptive ability aspects of a given data set very well.
Table 1. Special cases of the 4-P Kappa distribution for particular values of h and k ([...] (1999))

h     k      Distribution
1     ≠ 0    3-P Generalised Pareto Distribution
0     ≠ 0    3-P Generalised Extreme Value Distribution
-1    ≠ 0    3-P Generalised Logistic Distribution
1     0      2-P Exponential Distribution
0     0      2-P Gumbel Distribution
-1    0      2-P Logistic Distribution
1     1      2-P Uniform Distribution (one form of Normal Distribution)
0     1      2-P Reverse Exponential Distribution (i.e., 1 - F(x) is exponential)

When its parameters are estimated using an L-moment procedure, the predictive ability aspects can also be adequately accounted for and the resulting quantiles are quite reliable. We modelled the monthly maximum wind speed data from different sites in Botswana using a 4-P Kappa distribution, with its parameters estimated using L-moment methods, to obtain unbiased quantiles at each selected site and at the desired recurrence intervals (T).

A 4-P Kappa distribution has the cumulative distribution function (Hosking and Wallis, 1997):

$$F(x) = \left\{1 - h\left[1 - \frac{k(x - u)}{\alpha}\right]^{1/k}\right\}^{1/h} \tag{1}$$

where u is the location parameter, α is the scale parameter, and h and k are shape parameters; the expression implicitly includes the continuous limits at h = 0 and k = 0. F(x) is the probability of non-exceedance and can also be expressed as F(x) = 1 - 1/T. Special cases of Equation 1 take the form of different distribution functions for different values of h and k, as shown in Table 1. The quantile function x(F) of the 4-P Kappa distribution can be expressed as:

$$x(F) = u + \frac{\alpha}{k}\left[1 - \left(\frac{1 - F^{h}}{h}\right)^{k}\right] \tag{2}$$

This equation is the inverse of the cumulative distribution function (Equation 1). The first four L-moments for a data sample of size n arranged in ascending order are obtained using the expressions (Hosking and Wallis, 1997):

$$\lambda_1 = E[X_{1:1}] \tag{3a}$$

$$\lambda_2 = \tfrac{1}{2}\,E[X_{2:2} - X_{1:2}] \tag{3b}$$

$$\lambda_3 = \tfrac{1}{3}\,E[X_{3:3} - 2X_{2:3} + X_{1:3}] \tag{3c}$$

$$\lambda_4 = \tfrac{1}{4}\,E[X_{4:4} - 3X_{3:4} + 3X_{2:4} - X_{1:4}] \tag{3d}$$

where $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$ are related to location, scale, shape and peakedness respectively. In Equation 3b, the subscripts 1:2 and 2:2 denote the smaller and the larger value respectively in a sample of size two drawn from the full set of observations made at a station. The subscripts in the other equations are read in the same way.
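As an illustration of Equations 1 and 2, the following minimal Python sketch evaluates the 4-P Kappa cumulative distribution function and its quantile function, including the continuous limits at h = 0 and k = 0. The function names and the parameter values in the usage note are hypothetical and are not taken from Table 2; the sketch also assumes that x lies inside the support of the distribution.

```python
import math

def kappa_quantile(F, u, alpha, k, h):
    """Quantile x(F) of the 4-P Kappa distribution (Equation 2), for 0 < F < 1."""
    # Inner term (1 - F**h) / h, with its continuous limit -ln(F) at h = 0.
    g = -math.log(F) if h == 0 else (1.0 - F ** h) / h
    # Outer term (1 - g**k) / k, with its continuous limit -ln(g) at k = 0.
    y = -math.log(g) if k == 0 else (1.0 - g ** k) / k
    return u + alpha * y

def kappa_cdf(x, u, alpha, k, h):
    """Non-exceedance probability F(x) (Equation 1), for x inside the support."""
    z = (x - u) / alpha
    # [1 - k z]**(1/k), with its continuous limit exp(-z) at k = 0.
    g = math.exp(-z) if k == 0 else (1.0 - k * z) ** (1.0 / k)
    # {1 - h g}**(1/h), with its continuous limit exp(-g) at h = 0.
    return math.exp(-g) if h == 0 else (1.0 - h * g) ** (1.0 / h)

def quantile_for_recurrence_interval(T, u, alpha, k, h):
    """Quantile at recurrence interval T, using F = 1 - 1/T."""
    return kappa_quantile(1.0 - 1.0 / T, u, alpha, k, h)
```

For example, `quantile_for_recurrence_interval(10, u=12.0, alpha=2.0, k=0.1, h=-0.5)` returns the quantile for an illustrative parameter set, and `kappa_cdf` applied to that value recovers F = 0.9. Setting h = 0 and k = 0 gives the Gumbel case listed in Table 1.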
It is convenient to present L-moments as L-moment ratios, since the ratios measure the shape of a distribution independently of its scale of measurement. The dimensionless third and fourth L-moment ratios $t_r$ are defined as:

$$t_r = \frac{\lambda_r}{\lambda_2}, \qquad r = 3, 4 \tag{4}$$

where $t_3$ is the L-coefficient of skewness (L-Sk) and $t_4$ is the L-coefficient of kurtosis (L-Ku). The L-coefficient of variation (L-Cv) is defined as:

$$\text{L-Cv} = \frac{\lambda_2}{\lambda_1} \tag{5}$$

The L-moment ratios $t_3$ and $t_4$ are functions of only the shape parameters h and k (Hosking, 1986; 1990; 1994; Parida, 1999). The parameters h and k are restricted by the conditions below (Equation 6a to 6d):

$$k > -1 \tag{6a}$$

$$hk > -1 \quad \text{if } h < 0 \tag{6b}$$

$$h > -1 \tag{6c}$$

$$k + 0.725h > -1 \tag{6d}$$

Existence of the L-moments is ensured by Equation 6a and 6b, while their uniqueness is ensured by Equation 6c and 6d. The first step of parameter estimation based on L-moment methods at a given station involves obtaining solutions for h and k that best describe $t_3$ and $t_4$ in the L-moment ratio diagram and that also satisfy Equation 6d. For practical purposes, the interval h ≥ -1 is the most useful one, while $k = (1 - 3t_3)/(1 + t_3)$ is the equation that is used to calculate k (Hosking, 1986; 1990; 1994). These estimated values of h and k are then used to estimate the parameters u and α in Equation 2.
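The L-moment ratios can be computed from a data sample as in the minimal Python sketch below. The text defines the L-moments through expectations of order statistics; as an assumption, this sketch uses the standard unbiased sample estimators based on probability weighted moments ($b_0$ to $b_3$), which may differ from the exact computation used in the study. The function name is hypothetical.

```python
def sample_l_moments(data):
    """Sample L-moments l1..l4 and the ratios L-Cv, t3 (L-Sk) and t4 (L-Ku)."""
    x = sorted(data)          # ascending order statistics x_{1:n} <= ... <= x_{n:n}
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")

    # Unbiased probability weighted moments:
    # b_r = (1/n) * sum_j [(j-1)(j-2)...(j-r) / ((n-1)(n-2)...(n-r))] * x_{j:n}
    b = [0.0, 0.0, 0.0, 0.0]
    for j, xj in enumerate(x, start=1):
        weight = 1.0
        b[0] += xj
        for r in range(1, 4):
            weight *= (j - r) / (n - r)
            b[r] += weight * xj
    b = [value / n for value in b]

    # L-moments as linear combinations of the probability weighted moments.
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]

    return {
        "l1": l1, "l2": l2, "l3": l3, "l4": l4,
        "L-Cv": l2 / l1,   # Equation 5
        "t3": l3 / l2,     # Equation 4, r = 3
        "t4": l4 / l2,     # Equation 4, r = 4
    }
```

Calling `sample_l_moments` on a station's monthly maximum wind speeds gives the $t_3$ and $t_4$ values that are located on the L-moment ratio diagram when solving for h and k.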

Results and Discussion
Estimated values of the mean maximum wind speeds for each station and their standard deviations, the period of the data, the location parameter u, the scale parameter α, and the shape parameters h and k of the 4-P Kappa distribution are shown in Table 2. The mean maximum wind speeds varied between 12.65 and 14.97 m s⁻¹, with their standard deviations varying between 1.58 and 2.44 m s⁻¹. The values of h and k in Table 2 show that only Tshane has approximately h = -1 and k = 0, suggesting that the underlying distribution is close to the 2-P logistic distribution (Table 1). Three of the other stations have approximately h = 0 and k ≠ 0, suggesting that the underlying distribution of their data is close to the 3-P generalised extreme value distribution.
The estimated parameters of the 4-P Kappa distribution were substituted into Equation 2 to obtain estimates of wind speed quantiles (Park et al., 2001) for each station, corresponding to the recurrence intervals (T): 10, 20, 50, 100, 200 and 500 years. These recurrence intervals were converted to the Gumbel reduced variate y using Equation 7 (Hosking and Wallis, 1997; Fowler and Kilsby, 2003):

$$y = -\ln(-\ln F) \tag{7}$$

where ln is the natural logarithm and F = 1 - 1/T is the non-exceedance probability. The computed wind speed quantiles were used to draw the growth curves presented in Fig. 2, with the Gumbel reduced variate on the horizontal axis. The Gumbel reduced variates 2.3, 3.0, 3.9, 4.6, 5.3 and 6.2 (on the horizontal axis of Fig. 2) correspond to the recurrence intervals (T): 10, 20, 50, 100, 200 and 500 years respectively.

The estimated wind speed quantiles at all the stations fall within the range 13.80 to 21.69 m s⁻¹, but the growth curves vary from station to station. The growth curve for Tsabong has the highest wind speed quantiles, ranging between 18.38 and 21.69 m s⁻¹, while Gantsi has the growth curve with the lowest wind speed quantiles, ranging between 13.80 and 16.82 m s⁻¹. Variations in wind speeds (and hence in growth curves) can be attributed to variations in the local conditions or surroundings (e.g., buildings, vegetation and other topographical features) of each station (Wegley et al., 1980; Larsson, 1986; Oke, 1987; Moses, 2007). Variations in wind speeds can also be attributed to the weather systems that influence the country's weather. The influence of these weather systems depends on how near to or far from the stations they are. It is worth noting that the weather in Botswana is influenced by synoptic scale weather systems such as the Indian Ocean high pressure cell, the Atlantic Ocean high pressure cell, surface lows, frontal systems, cut-off lows, the Inter Tropical Convergence Zone (ITCZ), high pressure cells, and easterly and westerly troughs.
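The conversion from recurrence interval to Gumbel reduced variate can be checked with a short Python sketch. It assumes the standard form of the reduced variate, y = -ln(-ln F) with F = 1 - 1/T, which reproduces the axis values quoted for Fig. 2; the function name is hypothetical.

```python
import math

def gumbel_reduced_variate(T):
    """Gumbel reduced variate y = -ln(-ln F) for recurrence interval T > 1."""
    F = 1.0 - 1.0 / T            # non-exceedance probability
    return -math.log(-math.log(F))

recurrence_intervals = (10, 20, 50, 100, 200, 500)
reduced_variates = [round(gumbel_reduced_variate(T), 1) for T in recurrence_intervals]
# reduced_variates == [2.3, 3.0, 3.9, 4.6, 5.3, 6.2]
```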
To check whether the estimated wind speed quantiles corresponded to the observations at each station, the wind speed quantiles at the recurrence intervals T = 10 and 20 years were compared with the observed wind speeds at the respective stations (Table 3). Suppose that the observations were $X_1, X_2, \ldots, X_N$, where $X_1 < X_2 < \ldots < X_N$, with N being the number of observations in the sample. The Weibull plotting formula (Shaw, 1983) was used to relate the observed wind speeds at T = 10 years and at T = 20 years to the estimated wind speed quantiles:

$$F(X_i) = \frac{i}{N + 1} \tag{8}$$

where i is the rank of the wind speed observation in the ordered data sample. At T = 10 years, the differences between the estimated wind speed quantiles and the observations range between 0.55 m s⁻¹ (Gaborone) and 1.93 m s⁻¹ (Tsabong), while at T = 20 years they range between 0.08 m s⁻¹ (SSKA) and 1.27 m s⁻¹ (Gaborone). The small magnitudes of these differences indicate that the estimated wind speed quantiles correspond well to the observations, and hence that the growth curves presented in Fig. 2 are reliable. The growth curves are therefore useful to professionals in the energy sector who are concerned with environmentally friendly alternative energy sources such as wind power.
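The comparison step can be sketched in Python as follows. The sketch assigns each ranked observation the Weibull plotting position i/(N + 1) and reads off the observed value at a given recurrence interval. The linear interpolation between neighbouring plotting positions is an assumption made for illustration, since the text does not say how the observed value at exactly T = 10 or 20 years was selected; the function names are hypothetical.

```python
def weibull_plotting_positions(observations):
    """Return (value, F, T) per observation, with F = i / (N + 1) for ascending rank i."""
    x = sorted(observations)
    n = len(x)
    positions = []
    for i, xi in enumerate(x, start=1):
        F = i / (n + 1)              # non-exceedance probability
        T = 1.0 / (1.0 - F)          # recurrence interval, from F = 1 - 1/T
        positions.append((xi, F, T))
    return positions

def observed_at_recurrence_interval(observations, T):
    """Observed value at recurrence interval T (assumed: linear interpolation in F)."""
    target = 1.0 - 1.0 / T
    pts = weibull_plotting_positions(observations)
    for (x0, F0, _), (x1, F1, _) in zip(pts, pts[1:]):
        if F0 <= target <= F1:
            return x0 + (x1 - x0) * (target - F0) / (F1 - F0)
    raise ValueError("recurrence interval outside the range covered by the sample")
```

The difference between the value returned by `observed_at_recurrence_interval` and the corresponding Kappa quantile is then the kind of quantity reported in Table 3.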

Conclusion
In the recent past, Botswana has been highly dependent on energy imports. This has made it necessary to explore alternative energy sources such as windmills to meet localised demands. For wind power applications, it is crucial to model wind speeds using an appropriate statistical distribution that can adequately describe the observations. In addition, the parameters of such a distribution need to be estimated using an appropriate technique. Monthly maximum wind speed data for Gaborone, Sir Seretse Khama Airport, Tsabong, Tshane, Gantsi and Maun have been modelled using a four-parameter Kappa distribution based on the L-moment procedure, which has made it possible to obtain reliable wind speed quantiles at recurrence intervals of 10, 20, 50, 100, 200 and 500 years. Growth curves have been drawn to display the estimated wind speed quantiles. All the growth curves have wind speed quantiles falling within the range 13.80 to 21.69 m s⁻¹. The estimated wind speed quantiles at the recurrence intervals T = 10 and 20 years corresponded well with the observations. Mean maximum wind speeds for each selected station have also been computed. They vary between 12.65 and 14.97 m s⁻¹, with their standard deviations varying between 1.58 and 2.44 m s⁻¹. The results of this study provide valuable information for many environmental and engineering sectors, including environmentally friendly wind power production. The close correspondence between the estimated wind speed quantiles and the observations implies that the results of the study can be extended to other regions of the country with climates similar to those of the selected stations. To improve the results of the study or to reduce uncertainties, it is recommended that similar future studies use reanalysis data to extend the wind data to longer periods (Schwartz and George, 1999; Brower et al., 2013; Jimenez et al., n.d.). Different reanalysis data sets will have to be compared to identify the most reliable one.