Inverse Correlations for Multiple Time Series and Gaussian Random Fields and Measures of Their Linear Determinism

For a discrete-time vector linear stationary process, {X(t)}, admitting forward and backward autoregressive representations, the variance matrix of an optimal linear interpolator of X(t), based on a knowledge of {X(t−j), j≠0}, is known to be given by Ri(0)⁻¹, where Ri(0) denotes the inverse variance of the process. Let A = I_s − Ri(0)⁻¹R(0)⁻¹, where R(0) denotes the variance matrix of {X(t)} and I_s an s×s identity matrix. A measure of linear interpolability of the process, called an index of linear determinism, may be constructed from the determinant, Det[I_s − A], of I_s − A = Ri(0)⁻¹R(0)⁻¹. An alternative measure is constructed by relating tr[Ri(0)⁻¹], the trace of Ri(0)⁻¹, to tr[R(0)]. The relationship between the matrix A and the corresponding matrix, P, obtained by considering only an optimal one-step linear predictor of X(t) from a knowledge of its infinite past, {X(t−j), j>0}, is also discussed. The possible role the inverse correlation function may have for model specification of a vector ARMA model is explored. Close parallels between the problem of interpolation for a stationary univariate two-dimensional Gaussian random field and that for time series are examined and an index of linear determinism for the former class of processes is also defined. An application of this index for model specification and diagnostic testing of a Gaussian Markov Random Field is investigated together with the question of its estimation from observed data. Results are illustrated by a simulation study.

Assumption 1: For each t∈N, X(t) = [X₁(t), X₂(t), …, X_s(t)]′ admits the following one-sided moving average representation:

X(t) = Σ_{j=0}^∞ B(j)ε(t−j),  B(0) = I_s,  (1)

where ′ denotes transposition and {ε(t)} is a purely random process: for all t, u∈N,

E{ε(t)} = 0,  E{ε(t)ε(u)′} = δ_{t−u}V_B,  (2)

V_B is nonsingular, δ_u = 1, u=0, δ_u = 0, u≠0, and the B(j) are (s×s) matrices satisfying Σ_{j=0}^∞ ||B(j)|| < ∞. The transfer function of the representation (1) and the matrix spectral density function of {X(t)} are then given respectively by B(z) = Σ_{j=0}^∞ B(j)z^j and

f(µ) = (2π)⁻¹B(e^{−iµ})V_B B(e^{−iµ})*,  −π < µ ≤ π,

where * denotes the conjugate transpose. We also require that {X(t)} admits a backward autoregressive representation,

Σ_{j=0}^∞ A(j)X(t−j) = ε(t),  A(0) = I_s,  (3)

and a forward autoregressive representation [1],

Σ_{j=0}^∞ C(j)X(t+j) = η(t),  C(0) = I_s,  (4)

in which {η(t)} is purely random, V_F = E{η(t)η(t)′} is nonsingular and Σ||A(j)|| < ∞, Σ||C(j)|| < ∞. We may, accordingly, write [2] f(µ) = (2π)⁻¹A(e^{−iµ})⁻¹V_B[A(e^{−iµ})⁻¹]*, where A(z) = Σ A(j)z^j. The inverse, [f(µ)]⁻¹, of f(µ) has all the essential properties of a spectral density function [4] and the inverse spectral density function of {X(t)} may be defined by fi(µ) = (2π)⁻²[f(µ)]⁻¹.
Let

Ri(v) = ∫_{−π}^{π} fi(µ)e^{ivµ} dµ

define the inverse covariance function of {X(t)} and ri(v) the inverse correlation function obtained by normalising Ri(v), where Ri(v)′ = Ri(−v). It follows that [3] fi(µ) admits a Fourier expansion:

fi(µ) = (2π)⁻¹ Σ_{v=−∞}^{∞} Ri(v)e^{−ivµ}.

For a univariate process, s=1, the role of the inverse correlation function for time series model identification is examined by several authors; see Cleveland [4], Chatfield [5], McClave [6] and Bhansali [7][8][9], among others. A somewhat different application of this function, namely for estimating the interpolation error variance and a related R² measure, is suggested by Battaglia and Bhansali [10].
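These definitions can be checked numerically for a univariate AR(1) process, for which the inverse covariances are known in closed form: Ri(0) = (1+φ²)/σ², Ri(1) = −φ/σ² and Ri(v) = 0 for |v| > 1. The sketch below assumes the common normalisation fi(µ) = [4π²f(µ)]⁻¹ and uses only numpy; it is an illustration of the definitions, not of any estimation procedure from the paper.

```python
import numpy as np

# Numerical check of the inverse covariance function for a univariate AR(1),
# X(t) = phi*X(t-1) + eps(t).  Conventions assumed in this sketch:
#   f(mu)  = (sigma^2/(2*pi)) * |1 - phi*exp(-i*mu)|^(-2)
#   fi(mu) = 1/(4*pi^2*f(mu)),  Ri(v) = integral of fi(mu)*exp(i*v*mu) d(mu)
phi, sigma2 = 0.5, 1.0
N = 4096
mu = 2 * np.pi * np.arange(N) / N          # frequency grid on [0, 2*pi)
f = (sigma2 / (2 * np.pi)) / np.abs(1 - phi * np.exp(-1j * mu)) ** 2
fi = 1.0 / (4 * np.pi ** 2 * f)

def Ri(v):
    # Riemann sum approximation of the Fourier coefficient of fi
    # (exact on a periodic grid, since fi is a trigonometric polynomial here)
    return float(np.real(np.mean(fi * np.exp(1j * v * mu))) * 2 * np.pi)

ri1 = Ri(1) / Ri(0)
# Expected: Ri(0) = (1+phi^2)/sigma^2, ri(1) = -phi/(1+phi^2), Ri(2) = 0
print(Ri(0), ri1, Ri(2))
```

The grid sum is exact here because fi is a trigonometric polynomial of degree one; for a general process a fine grid gives an approximation.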
In this paper, the potential role the inverse correlation function may have for model specification of a multivariate time series, as well as that of a Gaussian Markov random field, GMRF, is examined. Following Battaglia [13], an inverse process, {Yi(t)}, of {X(t)} is defined by relating this process to the error process, ζ(t), of the linear interpolator, and several related measures of linear interpolability, called measures of linear determinism, for multivariate time series are defined. An application of the inverse correlation function for specifying a vector autoregressive moving average model is also considered. The results here show that, because of the 'arrow of time' and the associated notions of 'past' and 'future', the role of this function for model identification of a multivariate time series is somewhat limited and possibly confined to special classes of models. However, by contrast, we show that the inverse correlations define the parameters of a GMRF and hence play a fundamental role in specifying its structure. Based on this observation, an R² measure of linear interpolability of Gaussian random fields is introduced and potential applications of this index for model assessment and criticism are discussed. The inverse correlations of a GMRF also define the correlation structure of the residuals of a model specified for a GMRF. We discuss therefore how this property may be applied for diagnostic testing of a fitted GMRF model. The question of how to estimate the index of linear determinism for a Gaussian random field is also considered and two different estimates of the index are defined, namely, a nonparametric estimate based on a 'window' estimate of the spectral density function of the field, and a parametric likelihood estimate. We finally provide some simulation results.

Inverse process and the linear interpolator:
Let {X(t)} satisfy Assumption 1 and suppose that X(t) is unknown for a fixed t. The question of how to construct an optimal linear estimate of X(t) from a knowledge of {X(t−j), j≠0} is known as the problem of linear interpolation [11]. Examples of situations where a question of this type arises include the problem of outlier detection and estimation of missing values for a multivariate time series, the analysis of spatial data collected over a narrow but long rectangular lattice [12] and spatio-temporal processes.
Properties of the inverse process: As is well known [11], for each t,

X̄(t) = −Ri(0)⁻¹ Σ_{u≠0} Ri(u)X(t−u)  (5)

provides an optimal linear interpolator of X(t), where the Ri(u) denote the inverse covariance function of {X(t)}; if ζ(t) = X(t) − X̄(t) denotes the error of the optimum linear interpolator, the interpolation error variance matrix is given by

E{ζ(t)ζ(t)′} = Ri(0)⁻¹.

Now, see Masani [14], the inverse process, {Yi(t)}, admits forward and backward linear representations.

Measures of linear determinism: It follows from (5) that E{ζ(t)X(t−v)′} = 0, for all v≠0, and hence that E{ζ(t)X̄(t)′} = 0. Thus, the individual components of ζ(t) and X̄(t) are mutually uncorrelated and we may write var{X(t)} = var{X̄(t)} + var{ζ(t)}.
This last equation provides an analysis of dispersion for X(t); see Rao [15] for a definition of this last concept. It decomposes the variability of X(t), as measured by its variance matrix, as a sum of the variance matrices of X̄(t) and ζ(t); the former may be thought of as the variability that could be explained from a knowledge of the complete past and future of X(t) and the latter as the unexplained variability due to the interpolation error. Let

A = cov{X̄(t), X(t)}R(0)⁻¹,  (8)

where cov{X̄(t), X(t)} = E[X(t)X̄(t)′], be a normalised measure of association between X(t) and its linear interpolator, X̄(t). We have A = I_s − Ri(0)⁻¹R(0)⁻¹.
If s=1 and X(t) is univariate, A = [corr{X(t), X̄(t)}]² [10], where corr{X̄(t), X(t)} is the standard correlation coefficient between X(t) and X̄(t), and in this sense A provides a multivariate generalisation of this univariate concept. Although, unless R(0) is diagonal, the elements of A do not equal the univariate correlations between the individual components of X(t) and X̄(t), we have the following inequality under Assumption 1: 0 ≤ A < I_s, where, for two s×s matrices, B and C, we use B < C to mean that C − B is a positive-definite matrix. Under Assumption 1, an explicit expression for A may also be written down by appealing to the representations (1) and (6). Hence, A = 0 if and only if B(j) = 0, all j > 0, and X(t) = ε(t) is a purely random process. Moreover, if ρ(C) denotes a suitable scalar function attached to a matrix C, it follows from the definition (8) that the larger ρ(A) is, the stronger is the association between X(t) and X̄(t). Hence, in this sense, A provides information about the linear interpolability of the process, {X(t)}, from a knowledge of its complete past and future, {X(t−v), v≠0}. Battaglia [13] suggests a multivariate index of linear determinism, D_A, constructed from the determinant, Det[I_s − A]; he also studies several properties of this index and relates D_A to analogous R² measures suggested previously in the literature.
An alternative measure of linear interpolability of X(t) may be constructed by appealing to the additivity property of the trace operator, tr[B+C] = tr[B] + tr[C], and utilising the decomposition (7). This leads to the following alternative index of linear determinism for a multivariate stationary process:

U_A = 1 − tr[Ri(0)⁻¹]/tr[R(0)].

This index relates Ri(0)⁻¹ and R(0) and, for s=1, it simplifies to a univariate index of linear determinism, A_U say, considered by Battaglia and Bhansali [10] and Battaglia [16]. Indeed, for s=1, an explicit expression for U_A may be written down in terms of the coefficients, b(j) and a(j), say, of the moving average and autoregressive representations, (1) and (3), for the process. We have, if s=1,

U_A = 1 − [Σ_{j=0}^∞ b(j)² Σ_{j=0}^∞ a(j)²]⁻¹,

and the interpolability of the process is seen to be a monotonic increasing function of the product of the sums of squares of the coefficients in both the moving average and autoregressive representations of {X(t)}. Next, under Assumption 1, we relate X̄(t) to the optimal one-step linear predictor, X̄₁(t), of X(t), based on the infinite past, {X(t−j), j≥1}. On arguing as above and noting that ε(t) = X(t) − X̄₁(t) is the one-step prediction error and E{ε(t)X(t−j)′} = 0, for all j≥1, it readily follows that, if

P = cov{X̄₁(t), X(t)}R(0)⁻¹

denotes a normalised measure of association between X(t) and its linear one-step predictor, X̄₁(t), we have P = I_s − V_B R(0)⁻¹. The question of how much additional information about X(t) is gained from the 'future values' {X(t+j), j≥1}, relative to the infinite past, {X(t−j), j≥1}, or equivalently from the current and future innovations, {ε(t+j), j≥0}, may thus be investigated by relating the measures P and A to each other. Battaglia [16] earlier examined this question for s=1, by assuming that X(t) follows an ARMA model, and below we generalise his results to s>1 and to the class of linear processes satisfying Assumption 1; also, for s=1, Bhansali [17] examines the asymptotic distribution of an autoregressive estimate of P and of related multistep predictability measures.
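As a quick numerical illustration of the univariate case, the sketch below evaluates U_A = 1 − [Σ_j b(j)² · Σ_j a(j)²]⁻¹ for an AR(1) with coefficient φ, for which b(j) = φ^j, a(0) = 1, a(1) = −φ and the index has the closed form 2φ²/(1+φ²); the truncation point of the moving average sum is an arbitrary choice of this sketch.

```python
# Univariate index of linear determinism U_A = 1 - 1/(sum(b_j^2)*sum(a_j^2))
# evaluated for an AR(1) with coefficient phi (b(j) = phi^j, a = (1, -phi)).
phi = 0.5
b = [phi ** j for j in range(200)]     # moving average coefficients (truncated)
a = [1.0, -phi]                        # autoregressive coefficients
UA = 1.0 - 1.0 / (sum(x * x for x in b) * sum(x * x for x in a))
closed_form = 2 * phi ** 2 / (1 + phi ** 2)   # AR(1) closed form
print(UA, closed_form)
```

The truncation error is of order φ⁴⁰⁰ and therefore negligible; the two printed values agree.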
It is readily seen that

A − P = (V_B − Ri(0)⁻¹)R(0)⁻¹.

Hence, A − P may be interpreted as a normalised measure of association between X(t) and its linear estimate based only on the current and future innovations, {ε(t+j), j≥0}. We have, for s=1,

A − P = (σ²/r(0))(1 − [Σ_{j=0}^∞ a(j)²]⁻¹),

and, for a fixed σ², the overall magnitude of the contribution of the future, {X(t+j), j≥1}, relative to that of the past, {X(t−j), j≥1}, is a monotonic increasing function of the sum of squares of the autoregressive coefficients, {a(j), j≥1}; by contrast, that of the past alone is a monotonic increasing function of the sum of squares of the moving average coefficients, {b(j), j≥1}.
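For a univariate AR(1) the comparison of A and P can be made fully explicit; the sketch below assumes the univariate conventions A = 1 − [r(0)ri(0)]⁻¹ and P = 1 − σ²/r(0) used above, with σ² = 1.

```python
# Interpolation index A versus one-step prediction index P for an AR(1),
# under the conventions A = 1 - 1/(r(0)*ri(0)) and P = 1 - sigma^2/r(0).
phi = 0.5
r0 = 1.0 / (1.0 - phi ** 2)      # variance of the AR(1)
ri0 = 1.0 + phi ** 2             # inverse variance of the AR(1)
A = 1.0 - 1.0 / (r0 * ri0)
P = 1.0 - 1.0 / r0
# Closed forms: A = 2*phi^2/(1+phi^2), P = phi^2,
# A - P = phi^2*(1-phi^2)/(1+phi^2), the gain from knowing the future.
print(A, P, A - P)
```

For φ = 0.5 this gives A = 0.4, P = 0.25, so knowing the future as well as the past explains an extra 15% of the variance.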
Vector ARMA models: Next, we examine the behaviour of the inverse process, {Yi(t)}, when the spectral density function of {X(t)} is a rational function and discuss possible applications of the inverse correlation function for specifying parsimonious parametric models for vector time series. Thus, suppose that {X(t)} satisfies the following assumption:

Assumption 2: {X(t)} is a discrete-time vector autoregressive-moving average process of order (p,q), VARMA(p,q),

L(B)X(t) = H(B)ε(t),  (11)

where B denotes the backshift operator, L(z) = I_s − Σ_{j=1}^p Φ(j)z^j, H(z) = I_s + Σ_{j=1}^q Θ(j)z^j, say, and the Φ(j) and Θ(j) are matrices of coefficients such that Det[L(z)] ≠ 0 and Det[H(z)] ≠ 0, |z| ≤ 1. The process then admits the transfer function

k(z) = L(z)⁻¹H(z).  (12)

As is well known, however, the problem of how to specify a VARMA model from its spectral density function is a delicate one and, in particular, Assumption 2 only specifies an equivalence class of VARMA models. If, for example, C(z) is any nonsingular matrix polynomial, the pair L̃(z) = C(z)L(z), H̃(z) = C(z)H(z) yields the same transfer function,

k(z) = L̃(z)⁻¹H̃(z),  (13)

and it is not possible to distinguish between the VARMA models with transfer functions specified by equations (12) and (13).
Although it is still possible to ensure that a given VARMA(p,q) model is identified [18], by prescribing a rule for choosing a unique member of the equivalence class, such a rule does not eliminate the possibility that a different VARMA model with the same spectral density function, but with a possibly different pair of L(z) and H(z) and also a different order, (p,q), may be found.
As a consequence, there has been much development, see Hannan and Deistler [19] , in specifying an echelon-form VARMA model by examining the dependence structure of the rows of a block-Hankel matrix with elements consisting either of the impulse response coefficients, B(j), or, equivalently [20] , of the covariances, R(u).
Under Assumption 2, the inverse spectral density function of {X(t)} is given by

fi(µ) ∝ L(e^{−iµ})*[H(e^{−iµ})V_B H(e^{−iµ})*]⁻¹L(e^{−iµ}).

Therefore, as noted by Battaglia [13], unless the matrix polynomials, L(z) and H(z), commute, L(z)H(z) = H(z)L(z), the inverse process, Yi(t), may not be expressed in the form of a standard VARMA model defined by equation (11). Although the class of VARMA models with commuting AR and MA polynomials is not sparse and, for example, includes the univariate ARMA as well as the pure VAR and pure VMA models as members, this particular requirement does seem to severely limit the possible applications of the inverse correlations for specifying a VARMA model.
Below we discuss how the inverse correlations may still be applied for specifying parsimonious parametric models with a rational impulse response function, k(z). Assumption 2 specifies [20] a two-fold restriction on the impulse response function, k(z), as follows: 1) k(z) is a rational matrix, in the sense that its elements are rational functions of z; 2) k(z) admits a left matrix fraction description, defined by equation (12). Condition 2) above ensures that a given k(z) defines an impulse response function of a VARMA model. It is possible, however, for a rational matrix to admit a right matrix fraction description, of the type satisfied by k(z)′ [21]. Indeed [22], the existence of a left matrix fraction description does not guarantee the existence of a right matrix fraction description and vice versa. Moreover, as the simple example of a VARMA(1,1) model shows, even when both descriptions are known to exist, it is not straightforward to express one description in terms of the other. The case of commuting matrix polynomials, L(z) and H(z), discussed above is, in this sense, exceptional since, under this hypothesis, k(z) admits both descriptions simultaneously and the parameters of both may conveniently be specified in terms of the parameters of only the left matrix fraction description.
The question of how to specify a suitable parametric model for a linear process whose impulse response function admits a right matrix fraction description has so far received little attention in the literature. Next, we show how the inverse correlations are useful in this case. The argument given below applies even when the polynomials L(z) and H(z) commute.
Suppose that k(z) admits a right matrix fraction description,

k(z) = H_R(z)L_R(z)⁻¹, say,

where L_R(z) and H_R(z) are matrix polynomials of finite degree with Det[L_R(z)] ≠ 0 and Det[H_R(z)] ≠ 0, |z| ≤ 1. It is readily seen that k(z) is a rational matrix and X(t) is a linear process satisfying Assumption 1. Although, as in Barnett [21], a left matrix fraction description of k(z) still exists and may be constructed from the Smith form of the associated polynomial matrix, the resulting matrices could be highly complex functions of the parameters of L_R(z) and H_R(z). A right matrix fraction description for k(z) could, however, be constructed by applying the existing methods of constructing a VARMA model to the inverse correlations of X(t), that is, to the corresponding inverse process, Yi(t). Indeed, it readily follows that the transfer function of the inverse process now admits a left matrix fraction description, with the roles of the autoregressive and moving average polynomials interchanged, and the inverse spectral density function is given by

fi(µ) ∝ [k(e^{−iµ})⁻¹]* V_B⁻¹ [k(e^{−iµ})⁻¹],  k(z)⁻¹ = L_R(z)H_R(z)⁻¹.

Thus, the inverse process, {Yi(t)}, follows a VARMA model, but running 'forward' in time. Moreover, a backward representation for the inverse process, {Yi(t)}, may be obtained if, instead of the backward representation, (3), the forward representation, (4), is considered. Hence, the standard techniques for specifying a VARMA model could be applied to the inverse correlations and a corresponding right matrix fraction description for {X(t)} obtained. Details are omitted to save space.
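The first-degree structure of the inverse spectral density is easy to verify numerically for a bivariate VAR(1), a special case in which the left and right fraction descriptions coincide. In the sketch below, constants are suppressed, so the inverse covariances Ri(v) are computed only up to a common factor; the parameter values are arbitrary illustrative choices.

```python
import numpy as np

# For a VAR(1), X(t) = Phi X(t-1) + eps(t), the inverse spectral density is
# proportional to (I - Phi' e^{i mu}) V^{-1} (I - Phi e^{-i mu}), a first-degree
# trigonometric matrix polynomial, so Ri(v) = 0 for |v| >= 2: the inverse
# process is moving-average-like.  All normalising constants are suppressed.
Phi = np.array([[0.5, 0.2], [-0.1, 0.3]])
V = np.array([[1.0, 0.3], [0.3, 2.0]])
Vinv = np.linalg.inv(V)
N = 2048
mu = 2 * np.pi * np.arange(N) / N
I = np.eye(2)

def Ri(v):
    acc = np.zeros((2, 2), dtype=complex)
    for m in mu:
        Az = I - Phi * np.exp(-1j * m)      # AR operator A(e^{-i mu})
        fi = Az.conj().T @ Vinv @ Az        # inverse spectral density, up to a constant
        acc += fi * np.exp(1j * v * m)
    return np.real(acc) / N                 # grid average over frequencies

print(np.round(Ri(1), 4))                   # equals -V^{-1} Phi
print(np.max(np.abs(Ri(2))))                # numerically zero
```

Expanding the product shows that the coefficient of e^{−iµ} is −V⁻¹Φ, which the grid average recovers exactly, while all coefficients of e^{±ivµ}, |v| ≥ 2, vanish.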

Inverse correlations and an index of linear determinism for Gaussian random fields:
Let {X(s,t), (s,t)∈N²} be a stationary Gaussian random field, where N² = N×N is a two-dimensional lattice, and denote by

f(µ₁,µ₂),  −π < µ₁, µ₂ ≤ π,  (14)

the spectral density function of this random field. Assume that the covariance function, R(u,v), is absolutely summable,

Σ_u Σ_v |R(u,v)| < ∞,  (15)

and f(µ₁,µ₂) is non-vanishing. Then

fi(µ₁,µ₂) = [(2π)⁴f(µ₁,µ₂)]⁻¹  (16)

defines the inverse spectral density function of {X(s,t)} and a result of Wiener shows that [23] fi(µ₁,µ₂) admits a Fourier expansion:

fi(µ₁,µ₂) = (2π)⁻² Σ_u Σ_v Ri(u,v)e^{−i(uµ₁+vµ₂)},

where the Ri(u,v) define the inverse covariance function of the field. Below we discuss how the results described above for multiple time series generalise to a stationary Gaussian random field and introduce an R² measure, called an Index of Linear Determinism, of linear interpolability of the field. The inverse correlation function of a Gaussian Markov random field is also considered by Yuan and Subba Rao [24]. The main focus of these authors' work is, however, slightly different from ours; for example, the question of how to measure the interpolability of a general Gaussian random field is not discussed there, and nor is the associated question of model validation, assessment and criticism of a Gaussian Markov Random Field model, GMRF for short, fitted to an observed realization of this field, especially when the stochastic structure of the process generating the data is unknown.

Linear interpolation and Gaussian Markov random fields:
Suppose that X(s,t) is unknown for a fixed (s,t)∈N².
Let

X̄(s,t) = −Σ_{(u,v)≠(0,0)} θ(u,v)X(s−u, t−v)  (17)

denote the best interpolator of X(s,t), in a linear least-squares sense, when {X(u,v), (u,v)≠(s,t)} are treated as known, and let

ζ(s,t) = X(s,t) − X̄(s,t)  (18)

denote the interpolation error. Then, since ζ(s,t) is uncorrelated with {X(u,v), (u,v)≠(s,t)}, the θ(u,v) are readily seen to be the solutions of the following equations, in which θ(0,0) = 1:

Σ_{(u,v)} θ(u,v)R(g−u, h−v) = 0,  (g,h) ≠ (0,0),  (19)

Σ_{(u,v)} θ(u,v)R(u,v) = E{ζ(s,t)²},  (20)

where E{ζ(s,t)²} denotes the interpolation error variance. Equations (19) and (20) may be solved by Fourier methods [2]. We have

θ(u,v) = ri(u,v),  (u,v) ≠ (0,0),  (21)

E{ζ(s,t)²} = 1/Ri(0,0),  (22)

where ri(u,v) = Ri(u,v)/Ri(0,0) denotes the inverse correlation function of the field. For a random field, unlike a time series, there is no distinction between 'the past' and 'the future' and X(s,t) may be thought of as being influenced by, i.e., correlated with, all its neighbouring sites, {X(s−u, t−v), (u,v)≠(0,0)}. The analysis described above, therefore, provides a theoretical framework for constructing suitable stochastic models for Gaussian random fields, in much the same manner as the classical Wiener-Kolmogorov prediction theory for time series provides the basis for ARMA and related finite-parameter time series models. Besag [25] suggested the use of (homogeneous) Gaussian Markov Random Field models, GMRF, specified by a covariance selection procedure, such that only a finite number of elements in the inverse of the covariance matrix of the field are non-zero and the rest vanish. The spectral density function for this class of models is given by, Besag and Moran [26],

f(µ₁,µ₂) = τ² / [(2π)² Σ_{(u,v)∈S∪{(0,0)}} θ(u,v)e^{−i(uµ₁+vµ₂)}],  (23)

where θ(0,0) = 1, θ(u,v) = θ(−u,−v), S denotes the set of nearest neighbours with which X(s,t) interacts and τ² is a constant.
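The identity θ(u,v) = ri(u,v) in (21) can be checked numerically for a first-order field. In the sketch below the inverse spectral density is proportional to the cosine polynomial in the denominator of the spectral density, so its Fourier coefficients should return the θ(u,v) exactly; normalising constants are suppressed and the parameter values are illustrative.

```python
import numpy as np

# For a first-order GMRF, f is proportional to
#   tau^2 / (1 + 2*th10*cos(mu1) + 2*th01*cos(mu2)),
# so fi is proportional to the cosine polynomial itself and the inverse
# correlations ri(u,v) = Ri(u,v)/Ri(0,0) should equal the theta(u,v).
th10, th01 = -0.2, -0.15
M = 256
k = 2 * np.pi * np.arange(M) / M
mu1, mu2 = np.meshgrid(k, k, indexing="ij")
lam = 1 + 2 * th10 * np.cos(mu1) + 2 * th01 * np.cos(mu2)   # positive here
fi = lam                      # inverse spectral density, up to a constant

def Ri(u, v):
    # Fourier coefficient of fi, by an exact grid average
    return float(np.real(np.mean(fi * np.exp(1j * (u * mu1 + v * mu2)))))

print(Ri(1, 0) / Ri(0, 0), Ri(0, 1) / Ri(0, 0), Ri(1, 1) / Ri(0, 0))
```

The first two ratios recover θ(1,0) and θ(0,1); the (1,1) inverse correlation vanishes because that lag lies outside the neighbour set of this model.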
From the results (21) and (22) given above, it readily follows [24] that a GMRF model so specified possesses the following important properties: (1) the parameters, θ(u,v), coincide with the inverse correlations, ri(u,v), of the field, so that ri(u,v) = 0 for all (u,v)∉S; (2) the interpolation error process has variance τ² and correlation function

corr{ζ(s,t), ζ(s−u, t−v)} = θ(u,v),  (24)

which vanishes for all (u,v)∉S.

Model specification and diagnostic testing for GMRF's: For observed spatial data, however, the set S of neighbouring sites with which a given X(s,t) interacts will usually be unknown, even though some prior information concerning how to formulate this set may well be available. Also, there may not be a 'true' Markovian model generating the observed data and an analyst may simply seek to postulate a GMRF which captures the main interactions that could be present in the data, explicitly recognising that the constructed model probably does not represent the true data generating process but merely provides a convenient approximation to it. These considerations suggest that a non-parametric R² measure which is applicable to a wide class of Gaussian random fields, and hence which could be estimated without assuming a GMRF model for the data, could be helpful in assessing the usefulness or otherwise of a specified model. A non-parametric R² measure of this type would, for example, provide a yardstick against which the actual R² measure of a postulated parametric model may be compared and the usefulness of the model judged. If, say, the estimated R² measure of a relatively parsimonious model is sufficiently close to the estimated non-parametric R² measure, then this model may be accepted as being suitable for the data. If, however, the two estimated R² measures do not appear close enough, then a more complex model could be fitted and the R² measure implied by this model estimated and compared with the estimated non-parametric measure. The analysis may be continued in this way until a suitable model is chosen. Moreover, the model specification procedure just described may be further strengthened by validating the chosen model from an analysis of its residuals.
For a GMRF, a suitable diagnostic testing procedure is suggested by the property (24) of the interpolation error process. Thus, the correlations of the residuals of a chosen model may be inspected and their overall structure checked to see whether they possess this particular property. This three-stage model specification, estimation and validation procedure just outlined is close in spirit to a popular three-stage procedure proposed earlier by Box and Jenkins [27] for time series analysis. It clearly retains some of the advantages of flexibility a procedure of this type enjoys; but, it also suffers from many of the disadvantages of a subjective model selection procedure. At the same time, however, the question of how to choose a suitable GMRF model for observed spatial data, with special emphasis on model assessment and diagnostic testing, has so far received little attention in the literature. For a time series, possible effects of fitting a misspecified model have been investigated by Bhansali [28] and the results given there suggest that, even for spatial data, the likely loss in efficiency due to adopting a model chosen according to the three-stage procedure outlined above may not be serious.
A linear interpolability measure for Gaussian random fields: It readily follows from (17) and (18) that [25], for all (s,t)∈N², we may write X(s,t) = X̄(s,t) + ζ(s,t), where X̄(s,t) is the optimal interpolator of X(s,t) for a Gaussian random field, conditional on a knowledge of {X(s−u, t−v), (u,v)≠(0,0)}, and ζ(s,t), the interpolation error, is uncorrelated with X̄(s,t). We thus have the following decomposition,

R(0,0) = var{X(s,t)} = var{X̄(s,t)} + var{ζ(s,t)},  (25)

where var{ζ(s,t)} = 1/Ri(0,0). Equation (25) provides an analysis of variance of X(s,t) and shows that, for all (s,t)∈N², the ratio

var{ζ(s,t)}/R(0,0) = [R(0,0)Ri(0,0)]⁻¹

measures the proportion of variance of X(s,t) that is "unexplained" by the optimal linear interpolator, X̄(s,t). Hence the ratio

A_F = 1 − [R(0,0)Ri(0,0)]⁻¹  (26)

measures the interpolability of the process and provides an R² measure of the amount of variability of X(s,t) that could be explained by a knowledge of all its neighbours, {X(s−u, t−v), (u,v)≠(0,0)}. It should be noted that, in common with all R² measures, 0 ≤ A_F < 1; A_F = 0 if and only if X(s,t) does not interact with its neighbours, implying that the process is purely random and R(s,t) = 0, all (s,t)≠(0,0). Conversely, if A_F > 0, then there is interaction between X(s,t) and its neighbours and A_F measures the strength of this interaction; the closer this index is to 1, the greater is the interaction, in the sense that the proportion of variability of X(s,t) that could be explained by a knowledge of all its neighbours increases.
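Numerically, on a fine frequency grid, R(0,0) is proportional to the arithmetic mean of the spectrum while the interpolation error variance 1/Ri(0,0) is proportional to its harmonic mean, with the same constant of proportionality, so the index reduces to A_F = 1 − HM/AM. The sketch below uses this (a consequence of (21)-(22) under the conventions assumed here) to evaluate A_F for a first-order field and to confirm that the index grows with the interaction strength.

```python
import numpy as np

# A_F = 1 - (harmonic mean)/(arithmetic mean) of the spectrum on a grid:
# the arithmetic mean is proportional to R(0,0) and the harmonic mean to the
# interpolation error variance 1/Ri(0,0), with matching constants.
def index_AF(th, M=512):
    k = 2 * np.pi * np.arange(M) / M
    mu1, mu2 = np.meshgrid(k, k, indexing="ij")
    lam = 1 + 2 * th * (np.cos(mu1) + np.cos(mu2))   # first-order model, theta equal
    spec = 1.0 / lam                                 # spectrum, up to a constant
    am = spec.mean()                                 # total variance (up to the constant)
    hm = 1.0 / (1.0 / spec).mean()                   # interpolation error variance
    return 1.0 - hm / am

af1, af2 = index_AF(-0.20), index_AF(-0.24)
print(af1, af2)     # stronger interaction gives the larger index
```

Both values lie strictly between 0 and 1, and the more strongly interacting field is the more interpolable one, as the discussion above requires.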
An additional application of the index, A_F, is for model assessment and criticism. Thus, for a Gaussian random field, let S denote the actual set of sites with which X(s,t) interacts, so that θ(u,v) = 0, for all (u,v)∉S and, provided S is not a null set, θ(u,v) ≠ 0, if (u,v)∈S. Here, S need not be a finite set and, in principle, it could coincide with N². Suppose now that for a given data set a GMRF model is actually chosen and let Z denote the finite set of sites specified by this model. As regards Z, the following three possibilities arise: a) Z⊂S, that is, Z is a subset of S, or S=Z but a more restricted model than the generating GMRF is chosen; b) S=Z, that is, Z coincides with S and the chosen model also coincides exactly with the generating GMRF; c) S⊂Z, that is, S is a subset of Z and Z includes a larger collection of sites than S, or S=Z but the chosen model contains more parameters than the generating GMRF. Possibilities a) and c) above together characterize the situations in which a misspecified model, which is either underparametrized or overparametrized, is fitted. By contrast, b) describes the situation in which the fitted model coincides with the actual data generating process. Now, let A_Z denote the value of the index (26) for the models specified according to possibilities a)-c) above, which may be calculated by subtracting from one the ratio of the residual error variance to the variance of the data. We have, if a) holds, A_Z ≤ A_F. Under a), however, an underparametrized model may still be chosen if the difference between A_Z and A_F is sufficiently small. Moreover, more parameters than necessary are estimated under c) and this could lead to an efficiency loss, because the increased variability due to the redundant parameters would inflate the interpolation error variance.
The model specification procedure outlined earlier may thus be implemented by working with a non-parametric estimate of A_F and a parametric estimate of A_Z, obtained from the fitted model, as substitutes for the actual unknown values of these two measures; we discuss this point further below.

Torus lattice processes:
The index of linear determinism, A_F, applies to stationary Gaussian random fields defined on infinite lattices. In practice, only a finite lattice, L, of size M×M will be observed. For this situation, the results may be extended [26] if we replace the infinite-lattice GMRF's by their analogues on M×M torus lattices, for which the top and bottom rows and the left and right columns of L become adjacent and the spectral density function, f(µ₁,µ₂), is defined only at the Fourier frequencies, µ_j = 2πk_j/M, k_j = 0, 1, …, M−1, j = 1, 2. For a torus lattice, the vector of the M² observations has covariance matrix τ²Q(θ)⁻¹. Here, Q(θ) is known as the potential matrix; its main diagonal elements are constrained to equal 1 and the off-diagonal elements have a zero-pattern structure defined by θ, the parameter vector, in accordance with the property (1) listed above, namely, θ(u,v) = 0, if (u,v)∉S and θ(u,v) ≠ 0 if (u,v)∈S.
Thus, for a GMRF defined over a torus lattice, the interpolation error variance, τ², is parametrised separately from the parameters, θ(u,v), which now define [9] the inverse correlations of this GMRF. The variance, R(0,0), of this GMRF may be obtained by following Fuller [29] and noting that the eigenvalues, λ_k, k = 1, …, M², of Q(θ) are given by a two-dimensional discrete Fourier transform. It thus follows that [26]

R(0,0) = (τ²/M²) Σ_{k=1}^{M²} λ_k⁻¹.  (27)

Hence, for a GMRF defined over a torus lattice, an index of linear determinism is defined by

A_L = 1 − τ²/R(0,0).  (28)

It is readily verified that the index A_L defined above has the same properties as the index A_F defined over an infinite lattice. Moreover, a parallel index may also be defined when the generating stochastic structure of the random variables defined over a torus lattice is unknown and a possibly mis-specified parameter vector, θ, is chosen by an analyst at the model estimation stage. The resulting index now plays the same role as the corresponding index, A_Z, defined above; the details are omitted.
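The torus-lattice index (28) can be evaluated directly from the eigenvalues of Q(θ), which for a torus are the values of a cosine polynomial at the Fourier frequencies. The sketch below does this for the second-order parameterisation used in the simulation study reported later; the circulant eigenvalue formula and the grid size are assumptions of this sketch.

```python
import numpy as np

# Index of linear determinism A_L = 1 - tau^2/R(0,0) on a torus, with
# R(0,0) = (tau^2/M^2) * sum over k of 1/lambda_k, the lambda_k being the
# eigenvalues of the potential matrix Q(theta), assumed here to be
#   lambda(mu1, mu2) = 1 + 2 * sum over (u,v) in S of theta(u,v)*cos(u*mu1 + v*mu2)
# at the Fourier frequencies (the standard circulant diagonalisation).
theta = {(1, 0): -0.1, (0, 1): -0.095, (1, 1): -0.092, (1, -1): -0.106}
tau2, M = 1.0, 512
k = 2 * np.pi * np.arange(M) / M
mu1, mu2 = np.meshgrid(k, k, indexing="ij")
lam = np.ones_like(mu1)
for (u, v), th in theta.items():
    lam += 2 * th * np.cos(u * mu1 + v * mu2)     # all eigenvalues positive here
R00 = tau2 * np.mean(1.0 / lam)
AL = 1.0 - tau2 / R00
print(AL)
```

Under these conventions the computed value should be close to the figure A_L = 0.1359 quoted for this parameterisation in the simulation study, which provides a check on the assumptions of the sketch.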

Estimation of the index of linear determinism for a Gaussian random field:
Suppose that {X(s,t)} is a Gaussian random field satisfying assumptions (14), (15) and (16), but only a finite realization, {X(s,t), s,t = 0,1,…,M}, is observed. An estimate of the variance, R(0,0), of this realization is given by the sample variance,

R̂(0,0) = (M+1)⁻² Σ_{s,t=0}^{M} [X(s,t) − m̂]²,  (29)

where m̂ denotes the sample mean. Also, as in Yuan and Subba Rao [24], a non-parametric smoothed periodogram estimate, f̂(µ₁,µ₂), of the spectral density function may be constructed from the observed realization. The corresponding non-parametric estimates of the inverse correlations are now given by:

r̂i(u,v) = R̂i(u,v)/R̂i(0,0),  R̂i(u,v) = ∫∫ [(2π)⁴f̂(µ₁,µ₂)]⁻¹ e^{i(uµ₁+vµ₂)} dµ₁ dµ₂,  (30)

and which also provide non-parametric estimates of the interpolation coefficients, θ(u,v), for a Gaussian random field, without invoking assumption (23), that is, without requiring that the field is Markovian. The corresponding non-parametric estimate of the index, A_F, may now be constructed from these estimates as

Â_F = 1 − [R̂(0,0)R̂i(0,0)]⁻¹,  (31)

where, for convenience, we suppress the dependence of Â_F on M.
On the assumption that a torus lattice is observed, a maximum likelihood procedure is applied for estimating the parameter vector, [τ², θ′]′, of a GMRF over the valid parameter space. As in Besag and Moran [26] and Kashyap and Chellapa [30], the potential matrix, Q(θ), is readily diagonalized with a two-dimensional discrete Fourier transform and the log-likelihood maximized over the valid parameter space numerically. Let Â_L denote the resulting parametric estimate of the index, obtained by substituting the estimated parameters in (27) and (28), and let ζ̂(s,t) denote the residuals of the fitted model. The residual covariances may then be estimated in the standard way [31], by using a formula analogous to (30) but with the residuals, ζ̂(s,t), in place of the observations. As discussed earlier, if the specified model is suitable for the observed data, then the non-parametric estimate, Â_F, and the parametric estimate, Â_L, should be close to each other, and the estimated residual correlations should display the structure implied by the property (24). If an estimated model does not possess the aforementioned properties, then a different model may be fitted and the procedure described above repeated until a suitable model is chosen. The three-stage model specification, estimation and validation procedure described earlier could thus be implemented when only a finite realization of a GMRF is observed.
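A minimal sketch of the non-parametric route: estimate the spectrum by a smoothed periodogram and form the index as one minus the harmonic-to-arithmetic mean ratio of the estimated spectrum, mirroring Â_F in (31). The uniform 7×7 smoothing kernel, the lattice size and the test spectra below are illustrative assumptions of this sketch, not the windows used in the paper.

```python
import numpy as np

# Non-parametric index estimate: smoothed 2-D periodogram, then
# A_hat = 1 - HM/AM over the frequency grid.
rng = np.random.default_rng(0)
M = 128

def index_estimate(X):
    I2 = np.abs(np.fft.fft2(X)) ** 2 / X.size            # 2-D periodogram
    S = sum(np.roll(np.roll(I2, i, 0), j, 1)             # circular 7x7 smoothing
            for i in range(-3, 4) for j in range(-3, 4)) / 49.0
    return 1.0 - (1.0 / np.mean(1.0 / S)) / np.mean(S)

# White noise: the estimated index should be near 0 (no linear determinism).
aw = index_estimate(rng.standard_normal((M, M)))
# A correlated field: white noise coloured by a first-order GMRF spectrum.
k = 2 * np.pi * np.arange(M) / M
mu1, mu2 = np.meshgrid(k, k, indexing="ij")
lam = 1 - 0.4 * (np.cos(mu1) + np.cos(mu2))
field = np.real(np.fft.ifft2(np.fft.fft2(rng.standard_normal((M, M))) / np.sqrt(lam)))
af = index_estimate(field)
print(aw, af)      # the correlated field scores markedly higher
```

For white noise the estimate is small but positive, a finite-sample bias of the kind discussed in the simulation results; the correlated field is clearly separated from it.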
As an alternative to a maximum likelihood procedure, a pseudo-likelihood estimation procedure, which permits a trade-off between efficiency and simplicity, may also be used for parameter estimation [26,32,33] .

SIMULATION RESULTS
For illustrating the efficacy of the three-stage model specification, estimation and validation procedure described earlier, especially with observed realizations of small to moderate sizes, M, a simulation study was carried out. The study was designed to give only an indication of the different types of behaviour the various estimates exhibit under particular conditions and it is not intended to be exhaustive.
A second-order GMRF with parameter values, θ(1,0) = −0.1, θ(0,1) = −0.095, θ(1,1) = −0.092, θ(1,−1) = −0.106, τ² = 1 and toroidal boundary conditions was simulated by using a discrete Fourier transform method described by Kashyap and Chellapa [30] and Dubes and Jain [34]. Two different values of M, namely M=64 and M=128, were used and, for each M, 1000 different realizations were generated. The parameters of the generated second-order model were estimated for each realization by maximum likelihood, ML, and also by a pseudo-likelihood procedure, PL, and the corresponding parametric estimate, Â_L, of the index of determinism, A_L, was computed. A non-parametric estimate (NP), Â_F, of the index was also obtained in accordance with (30) and (31); following Yuan and Subba Rao [24], the two-dimensional spectral density, f(µ₁,µ₂), was estimated with truncation point (25,25), but we used the Tukey-Hanning window instead of the related Parzen and Bartlett-Priestley windows used there.
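The discrete-Fourier-transform simulation step can be sketched as colouring white noise by λ_k^(−1/2), the λ_k being the eigenvalues of the potential matrix at the Fourier frequencies; the scaling conventions are assumptions of this sketch, while the parameter values are those of the study.

```python
import numpy as np

# Simulate one torus-lattice GMRF realization by colouring white noise in the
# frequency domain, then check that the sample variance approximates
# R(0,0) = (tau^2/M^2) * sum over k of 1/lambda_k.
theta = {(1, 0): -0.1, (0, 1): -0.095, (1, 1): -0.092, (1, -1): -0.106}
tau2, M = 1.0, 128
rng = np.random.default_rng(42)
k = 2 * np.pi * np.arange(M) / M
mu1, mu2 = np.meshgrid(k, k, indexing="ij")
lam = np.ones_like(mu1)
for (u, v), th in theta.items():
    lam += 2 * th * np.cos(u * mu1 + v * mu2)   # eigenvalues of Q(theta), all > 0
W = rng.standard_normal((M, M))
X = np.real(np.fft.ifft2(np.fft.fft2(W) / np.sqrt(lam / tau2)))
R00 = tau2 * np.mean(1.0 / lam)                 # theoretical variance R(0,0)
print(X.var(), R00)
```

Because λ is symmetric under negation of the frequencies, the transform of the real array W keeps its Hermitian symmetry after division by √λ, so the inverse transform is real up to rounding.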
For each value of M, the simulated means and variances of the parameters estimated by these three methods were evaluated over the generated 1000 replications and the corresponding values of the simulated bias and simulated standard errors were found; the former was calculated as the difference between the simulated mean of each parameter estimate and its true value and the latter as the square root of the simulated variance after dividing this variance by NRP=1000, where NRP denotes the total number of replications. Table 1 shows the simulated bias and simulated standard errors of the estimated parameters by the three methods for M=64 and Table 2 shows the same information for M=128.
For M=64, the performance of both the maximum likelihood and pseudo-likelihood methods is very similar, with only a slight difference in their respective standard deviations. The non-parametric method, based on the window spectral estimate, by contrast, is inferior to these two methods in terms of its bias, though there is little to choose between them in terms of their standard deviations. The bias of the non-parametric method, however, decreases substantially when the lattice size increases to M=128, though the size of this bias is still somewhat larger than that of the maximum likelihood and pseudo-likelihood methods. Table 3 shows, with M=64,128, the simulated means and standard errors of the estimated index for the three methods. The corresponding theoretical value of the index, A_L, for the parameterization used here may be evaluated using equation (28) and is given by A_L = 0.1359.
It may be gleaned that the simulated means of the two parametric estimates, even with M=64, are quite close to the theoretical value. The non-parametric estimate of the index, by contrast, is badly biased for M=64. However, for M=128, the bias of the nonparametric estimate is substantially smaller and comparable to that of the two parametric methods. This last result is not surprising since the window spectral estimation method is known to perform relatively poorly with realizations of small lengths.
The simulation results broadly support the model assessment and criticism procedure based on a comparison of the non-parametric and parametric estimates of the index of determinism described earlier.
The efficacy of the model validation procedure was also investigated by an analysis of the residuals obtained by fitting the full second-order model by maximum likelihood and computing the residual correlations. The numerical results are shown in Table 4. It may be seen that the behaviour of the simulated means of the residual correlations accords with the theoretical behaviour described. For the lags included in the set S, the mean residual correlations are quite close to the actual parameter values of the generated model. By contrast, for the lags not included in S, the simulated means are close to 0.
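The residual correlations at a spatial lag (u,v) can be estimated directly from the fitted residuals on the lattice. The sketch below is a minimal, assumed implementation (not the paper's code): it computes a simple non-circular sample autocorrelation over all site pairs that remain on the lattice after shifting.

```python
import numpy as np

def residual_correlation(res, u, v):
    """Sample correlation of lattice residuals at spatial lag (u, v).

    res: 2-D array of residuals on an M x M lattice.
    Correlates res[i, j] with res[i+u, j+v] over all valid sites
    (edge sites without a shifted partner are dropped), normalized
    by the lag-(0,0) sample variance.
    """
    res = np.asarray(res, dtype=float)
    res = res - res.mean()
    m, n = res.shape
    # index ranges so that both (i, j) and (i+u, j+v) lie on the lattice
    i0, i1 = max(0, -u), min(m, m - u)
    j0, j1 = max(0, -v), min(n, n - v)
    a = res[i0:i1, j0:j1]
    b = res[i0 + u:i1 + u, j0 + v:j1 + v]
    return float(np.mean(a * b) / np.mean(res ** 2))
```

For white-noise residuals the correlations at nonzero lags should be close to 0, while the lag-(0,0) value is 1 by construction, mirroring the behaviour reported for the lags outside S.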
For the same simulated realizations of the second-order model, we also applied the maximum likelihood procedure to fit three underparametrized models, Models 2, 3 and 4, described below, and considered the theoretical interpolation errors incurred when Model 1 is approximated in a linear least-squares sense by Models 2, 3 and 4, respectively.
Here, Model 2 is a two-parameter homogeneous first-order GMRF, while Models 3 and 4 are obtained from Model 1 by imposing certain isotropic restrictions. In particular, Model 3 is a two-parameter model with θ̃_1 describing the horizontal and vertical interactions and θ̃_2 the diagonal interactions. Model 4 is a further simplification of Model 3 and specifies only one parameter for all four interactions; it thus assumes that the magnitudes of these interactions are equal and that the first and second order effects may be captured simultaneously by just one parameter. Table 5 shows the simulated means and standard errors of the estimated indices, Â_L, in 1000 realizations, each of size M=128, for these three misspecified models. The simulated means for the two isotropic models, Models 3 and 4, are quite close to those shown in Table 3 for the generated Model 1, though the simulated means for Model 4 are slightly smaller than those for Model 3, possibly because the former uses fewer parameters than the latter. By contrast, for Model 2, the simulated means are noticeably smaller than the non-parametric estimate. The simulation results suggest that, for the parameterization used here, both Models 3 and 4 are probably as effective as the generated second-order model with four parameters for describing the spatial correlation structure of the data and for estimating the possible effects of the nearest neighbours. By contrast, the homogeneous first-order model is probably not as effective in capturing the diagonal interactions, and it may be improved upon by choosing either of the two isotropic models or the full second-order model.
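The parameter-tying schemes for the three restricted models can be made concrete with a small helper that expands each reduced parameter vector into the four second-order interaction coefficients. This is an illustrative sketch; the function name and the exact ordering of coefficients (horizontal, vertical, diagonal, anti-diagonal) are assumptions, and Model 2 is assumed to set both diagonal interactions to zero as a first-order model.

```python
def expand_parameters(model, params):
    """Map a reduced parameter vector to four interaction coefficients
    (horizontal, vertical, diagonal, anti-diagonal), following the text:

      "model1": four free parameters, returned unchanged
      "model2": first-order model; (t1, t2) for horizontal/vertical,
                diagonals set to zero (assumed)
      "model3": (t1, t2) -> t1 for horizontal and vertical,
                t2 for both diagonals
      "model4": (t,) -> a single value for all four interactions
    """
    if model == "model1":
        h, v, d1, d2 = params
    elif model == "model2":
        t1, t2 = params
        h, v, d1, d2 = t1, t2, 0.0, 0.0
    elif model == "model3":
        t1, t2 = params
        h, v, d1, d2 = t1, t1, t2, t2
    elif model == "model4":
        (t,) = params
        h = v = d1 = d2 = t
    else:
        raise ValueError(f"unknown model: {model}")
    return h, v, d1, d2
```

Fitting a restricted model then amounts to maximizing the likelihood over the reduced vector while evaluating the GMRF likelihood at the expanded coefficients.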
The model validation procedure described earlier was also applied to the residuals of Models 2, 3 and 4 by examining the simulated means of their residual correlations. The results are shown in Tables 6, 7 and 8. For Model 2, let S-Z={(u,v), u,v=±1} denote the set of sites included in S but excluded from Z={(0,-1),(0,1),(-1,0),(1,0)}. It may be seen that the residual correlations at all lags (u,v) such that (u,v)∈S-Z are non-negligible, especially relative to their values at lags {(u,v)∉S}. This finding accords with the procedure described earlier and provides further evidence against adopting Model 2 for data generated from Model 1. By contrast, the residual correlations for both Models 3 and 4 are negligible at lags {(u,v)∉S}, and the results provide further evidence that neither of these two models may be excluded as implausible, even when the observations are generated from Model 1.
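The diagnostic reasoning above, partitioning lags into Z, S-Z, and the lags outside S, and flagging non-negligible residual correlations in each set, can be sketched as follows. The threshold is an illustrative cut-off introduced for the example; the paper does not fix a formal test.

```python
def classify_lags(corr_by_lag, threshold):
    """Split residual-correlation lags into the sets used in the text.

    corr_by_lag: dict mapping a lag (u, v) to its residual correlation.
    Z is the first-order neighbourhood, S-Z the diagonal lags, and
    S = Z union S-Z the full second-order neighbourhood. A correlation
    is flagged as non-negligible when |corr| exceeds `threshold`
    (an assumed, illustrative cut-off).
    """
    Z = {(0, -1), (0, 1), (-1, 0), (1, 0)}
    S_minus_Z = {(u, v) for u in (-1, 1) for v in (-1, 1)}
    S = Z | S_minus_Z
    flagged = {lag for lag, c in corr_by_lag.items() if abs(c) > threshold}
    return {
        "in_Z": flagged & Z,
        "in_S_minus_Z": flagged & S_minus_Z,
        "outside_S": flagged - S,
    }
```

Under this scheme, a fitted Model 2 would be expected to leave flagged lags in S-Z (the unmodelled diagonal interactions), while Models 3 and 4 would leave none outside S.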
The simulation results thus broadly support the three-step procedure suggested in this paper for choosing a suitable GMRF model. This procedure is, however, not a model selection procedure and does not enable a unique model to be chosen from within the class of models being considered. Rather, it helps to identify plausible models which could validly be considered suitable for the observed data, and the final choice among them could well depend upon the actual purpose of the analysis. Further research is nevertheless needed to develop suitable statistical tests for a residual analysis of GMRFs and to derive the sampling distributions of the various measures of linear determinism proposed earlier.