A Comparison of Spatio-Temporal Bayesian Models for Reconstruction of Rainfall Fields in a Cloud Seeding Experiment

In response to the drought experienced in Southern Italy a rain seeding project has been setup and developed during the years 1989-1994. The initiative was taken with the purpose of applying existing methods of rain enhancement technology to regions of south Italy including Puglia. The aim of this study is to provide statistical support for the evaluation of the experimental part of the project. In particular our aim is to reconstruct rainfall fields by combining two data sources: rainfall intensity as measured by ground raingauges and radar reflectivity. A difficulty in modeling the rainfall data here comes from rounding of many recorded rainguages. The rounding of the rainfall measurements make the data essentially discrete and models based on continuous distributions are not suitable for modeling these discrete data. In this study we extend two recently developed spatio-temporal models for continuous data to accommodate rounded rainfall measurements taking discrete values with positive probabilities. We use MCMC methods to implement the models and obtain forecasts in space and time together with their standard errors. We compare the two models using predictive Bayesian methods. The benefits of our modeling extensions are seen in accurate predictions of dry periods with no positive prediction standard errors.


INTRODUCTION
In this article our aim is to model a dataset on rain enhancing experiment through seeding operations conducted during the years 1989-1994 in the dry regions of south Italy starting with Puglia. The primary purpose is to reconstruct rainfall fields using spatiotemporal data obtained from a network of ground raingauges and data from a C-band digital weather radar. Radar raingauges are increasingly used to reconstruct rainfall fields since they are able to provide spatially continuous images of precipitation for short and regular time intervals; ground raingauges, on the other hand, provide more accurate and direct estimates of rainfall intensity.
Statistical spatio-temporal models are appropriate for our purposes due to the presence of spatio-temporal correlations in rainfall and radar data. Spatio-temporal modeling of rainfall data has received considerable attention in recent literature [1][2][3][4] . Many authors have also considered modeling the relationships between the radar reflectance and rainfall intensity. For example, Brown et al. [5] use multivariate time series models with state space representation incorporating continuous radar readings as covariates. Cassiraga et al. [6] model cross-correlation between radar and rainfall data using experimental surface variogram. Cornford [7] uses probabilistic models which treat the radar readings as noisy realizations of the underlying true precipitation process and builds a forecasting model for short-term predictions. Jordan et al. [8] develop a stochastic spacetime model for rainfall using the variations in the reflectivity-rainfall intensity (Z-R) relationships.
The Puglia region of South Italy is known to be very dry and the total amount of annual rainfall is usually small. As a result typically there is a huge number of zero rainfalls in any rainfall data set for this region. Moreover, after a rain seeding operation many rainfall amounts recorded by the available rainguages were rounded to the nearest 10th of a millimeter giving rise to essentially discrete data.
The occurrences of many discrete amount of rainfall in the data exclude the use of many currently available methods and models cited above since those are essentially developed to model continuous rainfall measurements. However, some authors [1][2][3] have developed methods for handling zero rainfalls using censoring mechanisms. Here the problem is to extend the model to accommodate more than one discrete rainfall value occurring with non-zero probability.
Another objective of this study is to develop methods for relating radar reflectance and rainfall intensity to reconstruct rainfall fields in the presence of the discrete rainfall amounts. In this study we do this by explicitly regressing rainfall data on the radar measurements in a spatio-temporal model. We believe that this method is novel for data obtained from a rain seeding experiment conducted in a very dry region such as southern Italy.
We consider two recently developed hierarchical Bayesian modeling approaches. The first approach is a separable and stationary Gaussian spatio-temporal model developed by Sahu et al. [9] for monitoring some air pollution levels. The second approach is a hierarchical space-time Bayesian Kriged-Kalman filtering (BKKF) model. The spatial prediction surface of the BKKF model is built using the well known method of kriging for optimum spatial prediction and the temporal effects are analyzed using the models underlying the Kalman filtering method.
We extend both the models to accommodate rounded rainfall measurements taking discrete values with positive probabilities. The full Bayesian models are implemented using MCMC techniques which enable us to obtain the optimal Bayesian forecasts in time and space. We compare the two modeling approaches using the mean-square error of predictions and some formal Bayesian model selection criteria.
The dataset: Our data come from the rain enhancement project carried on in the South of Italy (Fig. 1) during the period 1989-1994. This is a very dry region and the total amount of annual rainfall is usually very small (approximately 80 millimeter on the average per year during the study period). We consider a rainfall seeding operation conducted at 5:00AM on April 11, 1992 when 44 out of total 80 ground raingauges recorded amount of rainfall in 10 minutes interval; in addition data from a C-band digital weather radar, scanning the whole area every five minutes, are available. In this study we consider the data recorded every ten minutes from 5:10AM in the morning until 9:30AM. A subsequent seeding operation was performed at 9:30AM and we do not include the data recorded after 9:30AM on that day since our aim is to devise methodology for evaluating a single seeding operation. Data were available for many other seeding operations performed on other days. However, our previous investigation [10,11] in modeling those data have found a number of insurmountable problems in modeling the full data set, for example, (a) there was a large number of missing values due to many malfunctioning automatic rainguages (in some cases there were only 10 rainguages working properly); (b) there were extreme variability in meteorological conditions affecting the amounts of rainfall during different seeding operations.
On April 11, 1992 there were N=44 working rainguages in the study region. Let s i ,i=1,…,N denote the UTM x and y-coordinate of the locations. Out of these 44 sites we choose to set aside data from six sites for validation purposes. The validation sites were chosen judiciously so that those covered the entire study region. Thus we model data from the remaining 38 sites which will be denoted by s 1 , …, s n where n=38. The 44 locations together with a predictive grid of 2710 locations are shown in Fig. 2. We aim to perform spatial predictions in the grid. Together with the rainfall we have data from a Cband digital weather radar, scanning the whole area every five minutes. We shall model log rainfall data using log of the radar measurements as a covariate because there exist a well known linear relationship between them. Let r(s i ,t) denote the log of the radar measurements at site s i and at time t. Radar reflectivity is expressed in units of dBZ. This is a measure of the power scattered back to the radar by precipitation particles in the atmosphere. The power is a function of the distribution of raindrops size given by ( ) ( ) where the summation of the drop diameters (D) takes place over the volume of space sampled by the radar.
Exploratory analysis: Although the amount of rainfall variable is a continuous random variable, the readings from the rainguages were rounded to the nearest 10th of one millimeter. Moreover, the zero rainfalls were already replaced by 0.02 for obvious benefits in working with the log-scale. Table 1 provides the frequencies of the amount of rainfall. Due to this discreteness in the observations we shall use a censoring mechanism when modeling these data as continuous observations. There exist a well known linear relationship between logarithm of radar data and the logarithm of the actual amount of rainfall known as the Marshal and Palmer law [12] . We visualize the relationship in Fig. 3 where we have plotted the mean log rainfall for each distinct value of log-radar values. An approximate, though rather weak, linear relationship is seen in the graph.
In Fig. 4 we provide the mean amount of rainfall (over 27 measurements) in each of the 44 sites. There is no evidence of spatial trend for the site means. In fact we shall illustrate with the exponential covariance function for simplicity.
The site means show evidence of spatial variation. We investigate this using an empirical variogram of the data. We first obtain the residuals after fitting a regression line with r(s i ,t) as one covariate. We also remove any temporal variation and trend present in the residuals by explicit modeling or by creating successive differences. Let ( ) s i W ,t denote the residuals. We suppose that replications at location s i ,i=1, …, n since we have detrended the data. We now consider the average variogram defined by where d ij is the distance between the spatial locations s i and s j . The quantity γ(d ij ) is estimated by The empirical variogram cloud is obtained by plotting γ (d ij ) against d ij for the n(n-1)/2 possible pairs of locations.
In Fig. 5 we provide the variogram cloud and we super-impose a smooth loess curve (as obtained using the S-Plus function loess). This plot justifies our choice of the exponential spatial covariance function.
All the 44 sites have been classified into two areas called the target and control. The classification comes from the experimental design applied in the rain seeding project. The seeding operation is carried out in the target area, however, rainfall is measured both in the target area and the control area. For the April 11 experiment the target area was Bari and the control area was Canosa, marked as C in Fig. 1.
In order to investigate the temporal variation in the data set we show the time series plots of data for each of the 38 sites in Fig. 6. The Figure does not show a large amount of temporal variation. Moreover, there is not much difference between the time series plots of the sites in target and control areas; we have performed significance testing using linear models to confirm this conclusion. Thus our modeling approaches earlier will not differentiate between the sites from target and control areas. Spatio-temporal models Latent variables to model discrete data: As seen in Table 1 the amount of rainfall had been rounded to the nearest 10th of a millimeter. These discrete values occur with non-zero probabilities, but the actual rainfall is a continuous measurement falling between two discrete endpoints. This is a very common problem in modeling rainfall data with zero rainfall [1,2] . A common approach is to model the zeros by the values of a latent continuous variable below a threshold value (censoring).
The problem of discreteness is more severe for the current data as there are many rounded discrete values occurring with non-zero probabilities. Thus we extend the censoring mechanism for latent variable to include multiple discrete values. Let X(s,t) denote a continuous latent variable and let there be k particular values of log rainfall λ 1 , λ 2 , …, λ k which may occur with positive probabilities. Let c 1 , …, c k be constants such that We suppose that the observed data point at a site s at time t is given by otherwise. We suppose that the latent random variable X(s,t) for any observed rainfall bigger than 1 millimeter on the original scale is the actual log amount of rainfall. Henceforth, we model the latent variables X(s,t) rather than the observations Z(s,t) some of which have been rounded.

A Gaussian spatio-temporal random effect model:
We first assume the following hierarchical model: is an independent zero mean spatio-temporal process and the error term ε(s,t) is a white noise process assumed to follow N(0, µ β β = + s s (2) As mentioned before, for ν i ( ,t ) s , we adopt a separable covariance structure [13] . That is, In addition, the ρ's are taken to be exponential correlation functions, i.e., as we have decided to assume previously. We define Σ s and Σ t to be square matrices of order n and T, respectively with elements A Bayesian Kriged-Kalman model: We follow Sahu and Mardia [9] to construct a BKKF model for the rainfall data. In so doing we extend their approach to account for discreteness in the data and also add the radar data as a regressor. Let X t =(X(s 1 ,t),…, X(s n ,t))′ denote the n-dimensional latent random vector at time t; t=1,...,T. The first modeling assumption is the hierarchical model: where Y t =(Y(s 1 ,t),…, Y(s n ,t))′ is an unobserved but scientifically meaningful process (signal) and ε t is a white noise process. Thus we assume that the components of ε t are i.i.d. normal random variables with mean zero and unknown variance 2 ε σ .
The space-time process Y t is given by where the matrix H of order nxp is defined below, α t is the state vector of dimension p,  a and b, IG(a,b). As previously we choose a=2 and b=1 to have a proper but diffuse prior distribution with mean 1 and infinite variance. The regression parameter β 1 is given the flat normal prior distributions with mean 0 and variance 10 4 . The matrix H is obtained by using what are known as principal kriging functions [9,13] . In this implementation we take the fist column of H to be the unit vector, 1. The other columns are obtained as follows: We first obtain We now perform the spectral decomposition of B, We assume that Where 2a η is the assumed prior degrees of freedom (≥p) and b η is a known positive definite matrix. We say that X has the Wishart distribution W p (m,R) if its density is proportional to [14] . (Here tr(A) is the trace of a matrix A.) To obtain diffuse but proper prior distributions we choose a η =p/2 and following Sahu and Mardia [9] we take b η to be the 0.01 times the identity matrix.

Strategies for model choice:
Many graphical diagnostic methods are used to perform diagnostic checking and model validation [13] . Several validation statistics are also available [15] . In this article we shall use the following methods for model choice and validation.

Model choice:
To compare between two different models we shall use the following criterion based on Gelfand and Ghosh [16] and Laud and Ibrahim [17] .
The above validation measure does not take into account the spatial and temporal dependence between the observations. Hence we adopt the validation criterion developed by Sahu and Mardia [9] . Let Z orig denote the vector of 162 observations on the original scale for which we seek validation and oriĝ Z denote the predictions on the original scale and Σ denote the 162 dimensional estimated covariance matrix of oriĝ Z .
The validation criterion developed by Sahu and Mardia [9] is given by: The quantity D 2 will increase if there are large discrepancies between the predictions based on the model, oriĝ Z and the observed data, Z orig . The observed value of D 2 can be referred to the theoretical values of the c 2 distribution with 162 degrees of freedom. In our illustration we shall compare using both MSE and D 2 .
Using the MSE we can compare previous results obtained by Orasi and Jona Lasinio [10] .

Prediction details:
Our aim is to predict the amount of rainfall for all locations in a grid of m=2710 sites at any given time point t=1,...,T. The radar values (covariate) for these locations are available. Moreover, we need the prediction details to carry out the cross-validation for the six sites for which we have set aside data. Here, we provide the prediction details for a location s' at a time point t' for the two modeling approaches presented earlier.
The MCMC methods are first implemented for sampling from the posterior distributions. Subsequently, the predictive distributions are sampled by composition. The draws from the posterior enable draws from the predictive distribution of X(s',t'). This predictive distribution is model dependent and the details for obtaining draws from it are given below.
The sampled values from the predictive distribution of X(s',t') are then used to construct the predictive distribution of Z(s',t'). To implement this step we simply invert the censoring relationship given earlier, i.e. we choose the appropriate value of Z(s',t') by seeing the position of X(s',t') in the set of ordered values 1 2 < < < k c c c . Finally, to obtain the predictions on the original scale, we simply work with the exponential of the predictive realizations Z(s',t').
Predictive distribution for the Gaussian random effects model: Using (1) and (2), for a new location s' at time t', X(s',t') is conditionally independent of z given v(s',t'), with its distribution given by The posterior predictive distribution of X(s',t') is obtained by integrating over the unknown parameters in (12) with respect to the joint posterior distribution. MCMC samples from the posterior distribution enable us to perform the integration.
Note that in (12) we require a new v(s',t') conditional on the posterior samples at the observed locations s 1 , …,s n and at the time points t 1 , …,t T . Let V denote the vector of v(s, t) in n locations and T time points and ⊗denote the Kronecker product. For this we have: and ( ) ( ) The conditional mean and variance are very computationally expensive to calculate due to the dimensionality of V and the very large number of sites s' for which we require predictions. However, by fixing the decay parameters φ, the quantities ',t' s and C(s',t') given in (14) and (15) need only be calculated once; no updating is required in the MCMC.
Predictive distribution for the BKKF model: We use the models (4) and (5) to predict at location s' and at time t. We first obtain the spatial covariance matrix γ Σ * of order n+1 using the assumed covariogram (6). That is, Σ Σ Σ = Σ to the PMCC the random effect models are seen to be better than the BKKF models. Moreover, the censoring mechanism detailed previously is seen to be worthwhile since the random effect model (1) with this is better than the model without the censoring mechanism implemented. The random effect model is also better than the simple fixed effect model without the spatiotemporal process. Thus our best model is the Gaussian spatio-temporal random effect model with the latent variables to handle discrete data. We have checked the residual plots (not shown) for this model for all the 38 modeling sites. The plots do not show any recognizable pattern and the model seem to be adequate for the data.
We now return to the validation data for six sites which we have set-side. We consider validation at three time points 5:50AM, 6:40AM and 8:20AM. The 90% prediction intervals are plotted in Fig. 7. Only one out of 18 observation falls outside its prediction interval. As a result we conclude that the model has performed well in re-constructing the rainfall fields. For the remainder of this study we shall use this model for analysis. Parameter estimates: The MCMC trace plots of parameters of the adopted Gaussian random effects model is given in Fig. 8. The MCMC algorithm converges rapidly and mixes well. Table 3 provides the parameter estimates for the adopted random effect model. The regression coefficient β 1 for the radar measurements is seen to be significant and positive as expected. Furthermore, the estimated values of β 0 and β 1 are consistent with the approximate linear relationship we have seen in the scatter plot given in Fig. 3. The random effect variance, 2 ν σ is slightly larger than the error variance ε 2 which shows that the random effects explain more variation than the pure error. Predictions: Reconstructing the rainfall fields: We use the prediction details discussed to reconstruct the rainfall fields at different time points. We obtain the rainfall maps at time 5, 10 and 20 for illustration in Fig.  9-11. These time points correspond to 5:50AM, 6:40AM and 8:20AM respectively. From these figures we see precipitation moving from the south-west to the north-east. The south-east part of the region has remained consistently dry. The standard deviations of the predictions increases with the predicted amount of rainfall, although they are smaller near the observation sites. For the dry regions at any time point the standard deviations are approximately zero which implies that accurate predictions are possible. Effective modeling of discrete data using the censoring mechanism developed here has made this possible.

DISCUSSION
In this study we have compared two competitive spatio-temporal modeling approaches for rainfall data obtained from a cloud seeding experiment in the regions of south Italy including Puglia. The Gaussian random effect model is seen to perform better than the BKKF model. Each model has, a priori, a good reason to be chosen. The BKKF is a model that naturally extend in a Bayesian space-time setting the geostatistical approach which is usually considered quite sensible when treating rainfall-radar data [6,11,[18][19][20] .
It is able to easily account for non separable behavior of the space-time process and allow to include seasonal effects and covariates in an easily interpretable manner. The Gaussian random effect model with separable space-time covariance structure and the fixed effect model can be seen as alternatives to this approach when there is a reasonable suspect that the space-time process is indeed separable. Furthermore, our data are essentially discrete and this fact is not accounted for in any of the above models. We have developed a censoring method using a latent variable to handle multiple discrete (rounded) rainfall amounts and introduced it in both the models. The benefits of modeling of discrete data using the censoring mechanism are seen in more accurate predictions of dry periods with no positive prediction standard errors. Notice that the BKKF performs better then the fixed effect model but worse than the random effect one with or without censoring. A possible reason for this is that the spatio-temporal process is indeed separable and the BKKF is not a suitable model for such processes.