A Systematic Procedure for Square Random Lattice Sampling

Corresponding author: Girish Chandra, Division of Forestry Statistics, Indian Council of Forestry Research and Education, Dehradun-248006, India. Email: gchandra23@yahoo.com

Abstract: This paper attempts to improve the method of 'Square Random Lattice Sampling' by suggesting a systematic procedure for achieving the goals of controls beyond stratification. The estimator of the mean and its variance are derived. The results show that the relative precision of the proposed procedure increases as the size of the latin squares and the number of latin squares along the main diagonal of the population increase. The procedure appears to be a good alternative to square random lattice sampling for large populations. Three examples are discussed to demonstrate the relative precision of the proposed method.


Introduction
Although the idea of lattice sampling was first discussed by Patterson (1954) and Yates (1946; 1960), the need for it had long been felt. The basic idea of lattice sampling is derived from 'controls beyond stratification'. In many situations, it is desirable to stratify the population on the basis of more than one stratification variable (multi-stratification), which often leads to more strata cells than can be accommodated in a one-way stratified design. For instance, when economic considerations restrict the number of primary selections to x, then x/2 is the upper limit on the number of ordinary strata that can be created for paired selection (i.e., when variance estimation is required), and x strata is the limit for a single selection per stratum. However, it is sometimes necessary to have greater control over the selections than can be expressed with only x strata. For example, a national sample of interviews in the United States may need to be restricted to 100 counties because of the cost of staffing counties, but satisfying the various requests for adequate control over the distribution of sample counties would require many more than 100 strata. These problems for national samples of counties are discussed by Frankel and Stock (1942) and Goodman and Kish (1950). A controlled selection of hospitals is described by Hess et al. (1961), and a controlled sample of cities used by the U.S. Bureau of Labor Statistics (1961) for its price index is described by Stigler (1961).
Even a generation ago, conflicting needs of controls and randomization were widely thought to be unconflicting, as can be seen in the debate between purposive and random methods in the Bulletin of the International Statistical Institute (1926) compiled by Jensen (1926). Demands for control often outrun the limits of stratified random sampling, leading to purposive selection and the abandonment of probability sampling altogether. Several methods have been proposed by different authors for imposing more controls within the requirements of probability sampling. In fact, any departure from Simple Random Sampling (SRS) can be considered a control. Stratified random sampling is one example of achieving control: it provides the opportunity to represent all the homogeneous subgroups within a heterogeneous population. Systematic sampling may be taken as the other extreme of achieving control, in which only k samples of the sample space (the preferred samples) remain, each with selection probability 1/k; all other samples have zero probability of selection. Similarly, two-stage sampling is another method of achieving control, in which the first-stage units are selected with some specified probabilities but the selections themselves are made on the basis of some prior restrictions (or controls). Following the lead of Patterson (1954) and Yates (1946; 1960), lattice sampling was further developed by Jessen and co-authors under the labels "Lattice Sampling", "Two-way Stratification", "Deep Stratification" and "Multi-stratification". The next section reviews these procedures for drawing a sample, which permit cross-classification restrictions to be met with fewer sample units than a one-way design.

Controlled Sampling Procedures: Lattice Sampling
The basic idea for increasing the precision of a sampling design is to impose some restrictions (or controls) while selecting the sample. These restrictions increase the selection probabilities of preferred combinations of units and consequently decrease the selection probabilities of non-preferred (undesirable) combinations. In addition to precision, Hess and Srikantan (1966) list three expected advantages of controlled sampling:
• Controls may be imposed to secure a proper distribution, geographically or otherwise, and to ensure an adequate sample size for subgroups that are domains of study
• Moderate reductions in the sampling errors of a multiplicity of characters can be secured simultaneously
• The sampling error in the global estimates of specified key variables can be reduced significantly

Different selection procedures have been used by different authors. Some of the important works in this field are discussed below. Frankel and Stock (1942) discussed multi-stratification techniques in detail and studied their uses in data collection. In particular, they considered the possibility of a sample design in which the latin square principle is used to reduce the number of sample units necessary to represent all strata. For example, suppose two criteria for stratification are used, say X and Y, such that L strata can be constructed from the X characteristic and, within each of these, L from the Y characteristic. If one relates the resulting pattern to a single treatment of an L × L latin square, it is clear that in a sample of L units each of the L X-strata will be represented, and likewise each of the L Y-strata. Further, Tepping et al. (1943) discussed such designs under the title "Deep Stratification" in some detail and compared their variances with the variances of single stratification sampling. Yates (1946) discussed the problem of selecting a two-way stratified sample.
Using ANOVA arguments, he suggested that the variance in two-way stratified sampling is smaller than that in one-way stratified sampling. The scheme in the later work of Yates (1953), however, resembles the design of Frankel and Stock (1942), in which the latin square principle is used to select sample units. He also discussed the framework of a three-dimensional sampling scheme in which the additional dimension (strata) of vertical level (or file) can be represented in the sample by choosing $r^2 L$ units out of the $L^3$ units in the population, if r units are replicated in each stratum. Patterson (1954) extended the work of Yates (1953), particularly with respect to the estimation of errors. He also suggested four methods for selecting samples with "control" on both sets of strata in two-way stratification. These four methods differ in the number of replications from each row and each column and in the type of lattice. All four methods give equal selection probabilities to each element in the population, so the sample mean is an unbiased estimator of the population mean under each of them. The methods differ, however, in their sampling variances and in their ability to provide suitable estimates of those variances. Goodman and Kish (1950) developed a procedure under the name "Controlled Selection", which assigns a probability of selection to each of several possible samples so that 'preferred' combinations of units are given a higher probability than 'non-preferred' combinations. Bryant et al. (1960) proposed a procedure under the title "Two-way Stratification" in which there are two stratifying criteria, both of which are desirable in a sample design. They used this procedure only when two variables are used for stratification and each has the same number of levels.
In this situation, the number of permitted observations may be less than the number of strata formed by the usual double stratification technique. They also showed that the method is particularly useful if the stratification effects are additive in the analysis of variance sense. Bryant (1961) discussed various examples under "Multi-dimensional Controlled Selection" in which the number of strata cells exceeds the permissible sample size. In a survey of fish catch, for instance, he used four stratification factors, viz. location, time of day, part of the summer season and type of day; the other examples are treated likewise. He also extended the work of Bryant et al. (1960) to the well-known Pole Mountain study (1961), for which he developed the sampling and estimation technique. This study had four dimensions and a sample size of n = 46, so a four-dimensional population cube of size $46^4$ is required. The sample of cells is chosen in four dimensions so that once a row, column, level or flat is selected, it cannot be chosen again.
The techniques discussed above are based on equal probability of selection for each stratum cell. Jessen (1973) developed a technique under the name "Probability Lattice Sampling", in which the strata cells have unequal probabilities of selection, proportional to the size of the cells. Both the equal probability and the unequal probability methods are covered in Jessen's (1969; 1970; 1973; 1975) papers, which are compiled in Jessen (1978). We therefore restrict ourselves to Jessen's work. The lattice sampling procedure is first summarized in the following subsections.

Equal Probability Method
When only one variable is used for stratification and each stratum (row and column) has an equal number of levels, say L, there are $L^2$ strata cells. In this situation the sample size is n = rL, where r cells are to be selected from each row (column) of a square lattice of order L × L. This procedure is called simple random lattice sampling. As a simple case, suppose there are 16 elements arranged in 4 rows and 4 columns, and we wish to draw a sample of size 4 (r = 1) from this 4×4 universe. The corresponding lattice design, simple random sampling with simple stratification (using rows or columns as strata), is shown in Fig. 2.1.
An extension of simple random lattices is square random lattices. When two variables (row and column) are used for stratification and each has the same number of levels, say L, there are $L^2$ cells in the design. Jessen (1975) proposed two techniques for drawing samples of size n = rL, where r cells are selected from each level of each variable (dimension). The first is the 'general lattice' technique, which starts by selecting r cells at random from the L cells of the first row and then, likewise at random, r cells from the L cells of the second row. This process continues until r cells have been selected from some column, after which the next row contains only L-1 permissible cells. The selection continues in this way until the last row has been covered, so that exactly r cells are selected from each column.
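The general lattice selection can be sketched in Python as follows. This is a minimal illustration and not Jessen's published algorithm: the function name `general_lattice`, the restart-on-dead-end strategy and the `max_tries` parameter are assumptions introduced here, and the restart rule is not claimed to give every valid lattice the same selection probability.

```python
import random

def general_lattice(L, r, rng=None, max_tries=10_000):
    """Pick r cells per row of an L x L frame so that every column
    also ends up with exactly r cells (0-based (row, col) pairs)."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        col_count = [0] * L            # cells already taken in each column
        sample = []
        for g in range(L):
            # a column stays permissible until it holds r selected cells
            permissible = [h for h in range(L) if col_count[h] < r]
            if len(permissible) < r:
                break                  # dead end: restart the whole draw
            for h in rng.sample(permissible, r):
                col_count[h] += 1
                sample.append((g, h))
        else:
            # all L rows were filled, so each column holds exactly r cells
            return sorted(sample)
    raise RuntimeError("no valid lattice found within max_tries")

cells = general_lattice(6, 2, random.Random(1))
print(cells)
```

Because rL cells are selected in total and no column may exceed r, a draw that completes all rows necessarily has r cells in every column, so no separate column check is needed at the end.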
With this selection procedure, the variance of the sample mean per element, $\bar{y}$, is given by Equation 2.1:

where $y_{gh}$ is the value of the characteristic (y) under study for the cell in row g and column h. Generally, this scheme is more efficient than, or very close to, SRS or simple stratified sampling (with either rows or columns as the stratification variable) (Jessen, 1975).
The alternative technique, the 'latin lattice', is based on the assumption that L/r is a positive integer. Then r × r squares, known as 'latin squares', are designated along the diagonal of the L × L square. If L is odd and r is even, the diagonal can be filled by using a combination of r × r complete squares and (r + k) × (r + k) incomplete squares for an appropriate value of k. This scheme (before randomization) may be improved by permuting rows and columns randomly to obtain the after-randomization design. The selection scheme for these methods for a double lattice (r = 2) is illustrated in Fig. 2.2.
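A short Python sketch of the latin lattice construction is given here for the case where L/r is a positive integer. The function name `latin_lattice` and the use of independent uniform permutations for rows and columns are assumptions made for illustration; the incomplete-square case (L odd, r even) is not covered.

```python
import random

def latin_lattice(L, r, rng=None):
    """Return the before- and after-randomization designs as lists of
    0-based (row, col) cells; assumes L is a multiple of r."""
    if L % r != 0:
        raise ValueError("this sketch assumes L/r is a positive integer")
    rng = rng or random.Random()
    # before randomization: every cell of each r x r square on the diagonal
    before = [(g, h)
              for b in range(0, L, r)
              for g in range(b, b + r)
              for h in range(b, b + r)]
    # after randomization: relabel rows and columns independently at random
    row_perm = rng.sample(range(L), L)
    col_perm = rng.sample(range(L), L)
    after = sorted((row_perm[g], col_perm[h]) for g, h in before)
    return before, after

before, after = latin_lattice(6, 2, random.Random(3))
print(before)
print(after)
```

Permuting rows and columns moves the selected cells but keeps r cells in every row and every column, which is why the after-randomization design still satisfies n = rL.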
Although both methods give identical results, the latin lattice technique is preferred because additional degrees of freedom (r degrees of freedom) are available for the estimate of the variance (Jessen, 1975). When the two variables used for stratification have unequal numbers of levels, the square lattice concept can be extended to rectangular lattices of, say, R rows and C columns. Jessen (1978, chap. 11) discussed the method of selecting n = rt cells from such a design, t being the larger of R and C. In this case, the latins generated along the diagonal will be rectangular or square, depending on the dimensions of the rectangular population under consideration.

Unequal Probability Method
Sometimes the cells of a cross-classification contain unequal numbers of units. It is then often desirable to sample the cells in a manner that reflects this unevenness. Such situations may arise where the sampling is done in two stages: first a sample of cells is chosen, and then a sample of units is drawn from the selected cells. Jessen (1970; 1973; 1975) discussed the situation where the cells of a two-way or three-way cross-classification do not contain the same number of population units and the sample is to be selected under the desired conditions; this is termed an 'unequal probability lattice'.
Lattices constructed in this situation are often called 'simple probability lattices'. They can be extended to square probability lattices and cubic probability lattices by adding a dimension, although the procedure becomes more complex with three or more dimensions. If accuracy can be increased by extending stratification from one-way to two-way, then further gains appear possible by going on to three or even more dimensions, provided that there are sensible factors to stratify on.

Marginal Stratification
In both methods discussed above, it is assumed that the cell sizes are known in advance. When they are not, but the margin sizes are known, some form of stratification can still be attempted using the information on margin sizes. This case arises when information is available on a series of one-way classifications but not on the two-way or crossed classification. Tiwari and Nigam (1998) developed a procedure for selecting samples for random and probability lattices in square and rectangular frames without the restrictions n = rL and n = rt imposed by Jessen (1978) for square and rectangular lattices respectively. They also suggested variance estimators for both cases.

Systematic Square Random Lattice Sampling
When using either the general lattice or the latin lattice technique under square random lattice sampling, the following two limitations are encountered:
• There is no systematic procedure for obtaining the sample from each row and each column; the selection is therefore arbitrary, time consuming and costly.
• Once selection has taken place in the first (r-1) rows, the general lattice method requires a check at every step that r units (cells) are being selected from each column. Also, owing to the nature of the latin lattice method (before randomization), some bias may occur. The latin lattice method (after randomization), although preferable to the general lattice method, involves a large number of steps, and there is no definite rule for permuting the rows and columns (starting from the before-randomization design).

In the present paper, we address these two issues in Jessen's work and suggest a convenient procedure that selects samples by systematic sampling after applying general lattice sampling with r = 1 to the latin squares along the main diagonal. The following subsections deal with the proposed method and its comparison with SRS and square random lattice sampling.

The Proposed Plan
As noted above, the 'latin lattice' method is preferred over the 'general lattice' method because additional degrees of freedom are available for estimating the variance. In this section, we propose an improved latin lattice method which overcomes the two limitations discussed above.
Let us consider a two-way population frame consisting of N = L × L units. A sample of size n is to be drawn from this population using two stratification variables (row and column) and satisfying two basic restrictions. With L = rk, the whole population may be partitioned into r equal parts by rows and by columns. This results in $r^2$ latin squares of the same size k × k, which are shown in matrix form (Equation 3.1) as:

$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ A_{21} & A_{22} & \cdots & A_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{pmatrix} \qquad (3.1)$$

where $A_{ij}$ is the sub-matrix of $a_{ij}$'s whose first element is $a_{(i-1)k+1,\,(j-1)k+1}$. The proposed selection procedure consists of two steps:

Step 1: Select k units from each of the latin squares placed along the main diagonal of A, i.e., $A_{11}, A_{22}, \ldots, A_{rr}$, using the general lattice technique (i.e., r = 1).
Step 2: Starting from the units selected in Step 1, select every kth unit by systematic sampling along the corresponding rows and columns of the whole population frame. This step covers the remaining latin squares, in addition to those along the main diagonal covered in Step 1.
To illustrate this procedure, let us consider a 6×6 population $A = (a_{ij})_{6 \times 6}$ with r = 2 and k = 3. This gives a total of 4 latin squares of size 3×3. In the first step of this example, the units $a_{13}$, $a_{21}$, $a_{32}$, $a_{44}$, $a_{56}$ and $a_{65}$ are selected from the latin squares $A_{11}$ and $A_{22}$ using the general lattice method. Step 2 starts from the units selected in step 1 and selects every 3rd (k = 3) unit systematically along each row and each column of matrix A. The units selected under step 1 and step 2 are shown in Fig. 3. In this particular example, step 2 is not applicable to $A_{22}$ because no further cells are available for selection.
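The two steps can be sketched in Python as below. The sketch rests on one reading of step 2, stated here as an assumption: from each step-1 unit, every kth unit is taken forwards only (to the right along its row and downwards along its column, without wrapping around), which is consistent with the remark that step 2 does not apply to the last diagonal square. The function name `systematic_lattice` and the `perms` argument are hypothetical, and the output has not been checked against Fig. 3.

```python
def systematic_lattice(perms, k):
    """perms[b][i] is the column, within diagonal square b, of the unit
    chosen in its i-th row at step 1 (0-based). Returns the full sample
    as 1-based (row, col) pairs, matching the a_ij notation."""
    r = len(perms)
    sample = set()
    for b, perm in enumerate(perms):
        for i, j in enumerate(perm):
            g, h = b * k + i, b * k + j          # step-1 unit in A_bb
            sample.add((g + 1, h + 1))
            for m in range(1, r - b):            # step 2: every k-th unit
                sample.add((g + 1, h + m * k + 1))   # along the row
                sample.add((g + m * k + 1, h + 1))   # along the column
    return sorted(sample)

# Step-1 units of the 6 x 6 example:
# a13, a21, a32 in A11 and a44, a56, a65 in A22
units = systematic_lattice([[2, 0, 1], [0, 2, 1]], k=3)
print(units)
```

Under this reading, the step-1 units of the example lead to the additional units $a_{16}$, $a_{24}$, $a_{35}$ (along rows) and $a_{43}$, $a_{51}$, $a_{62}$ (along columns), giving n = rL = 12 units with two in every row and two in every column.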
The total number of possible samples under the proposed procedure depends on the arrangement of cells in the latin squares $A_{11}, A_{22}, \ldots, A_{rr}$. Since step 1 uses the general lattice technique with r = 1, there are k! possible combinations of cells that can be selected from each of these latin squares. Hence, in general, the total number of possible samples is $(k!)^r$.
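The count can be checked by brute-force enumeration for small r and k. The Python sketch below assumes the forward-only reading of step 2 (every kth unit to the right along the row and downwards along the column, without wrap-around); `all_samples` is a hypothetical helper name. It also reports the share of samples that contain each cell, which under this reading is the same for every cell.

```python
from itertools import permutations, product
from math import factorial

def all_samples(r, k):
    """Enumerate every sample of the sketch as a frozenset of
    0-based (row, col) cells, one per choice of step-1 permutations."""
    samples = []
    for perms in product(permutations(range(k)), repeat=r):
        s = set()
        for b, perm in enumerate(perms):
            for i, j in enumerate(perm):
                g, h = b * k + i, b * k + j
                s.add((g, h))                    # step 1
                for m in range(1, r - b):        # step 2 (forward only)
                    s.add((g, h + m * k))
                    s.add((g + m * k, h))
        samples.append(frozenset(s))
    return samples

r, k = 2, 3
L = r * k
samples = all_samples(r, k)
print(len(samples), factorial(k) ** r)   # enumerated count vs (k!)^r
print(len(set(samples)))                 # number of distinct samples
# share of samples containing each cell of the L x L frame
share = {(g, h): sum((g, h) in s for s in samples) / len(samples)
         for g in range(L) for h in range(L)}
print(min(share.values()), max(share.values()))
```

For r = 2 and k = 3 the enumeration yields 36 distinct samples, and each cell appears in one third (1/k) of them, so the sample mean is unbiased for the population mean when the samples are equally likely.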

Mean and Variances
For convenience in the mathematical calculations, the matrix A (Equation 3.1) can be rewritten as Equation 3.2, in which the latin squares are numbered consecutively by rows. This is done by substituting 1 for the first subscript and {r(first subscript - 1) + second subscript} for the second subscript, so that $A_{ij}$ becomes $A_{1,\,r(i-1)+j}$: Now, we use the following notations: All the latin squares under the proposed procedure are random and each may be taken as a single unit; the variance of $\bar{y}$ is therefore calculated in parallel with SRS and is written as: This implies that the proposed procedure is always more efficient than SRS for large values of r and k. For small values of r and k, the proposed procedure may sometimes be weaker than SRS in terms of relative precision (as shown in Example 1 in the next section). Equation 3.6 also indicates that as the within-latin-square variance increases and (or) the between-latin-square variance decreases, the Relative Precision (RP) with respect to SRS increases rapidly.
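For small frames, the variance of $\bar{y}$ under the design can also be obtained numerically by enumerating every possible sample, which gives an independent check on any closed-form expression. The Python sketch below is not the paper's formula. It assumes the forward-only reading of step 2 (every kth unit to the right and downwards, without wrap-around), takes RP as the ratio of the SRS variance to the design variance, and uses a hypothetical 6×6 frame drawn from a standard normal distribution with a fixed seed (not the paper's Table 2). The name `sample_means` is introduced here for illustration.

```python
import random
from itertools import permutations, product
from statistics import mean, pvariance, variance

def sample_means(y, r, k):
    """Mean of every equally likely sample of the sketch for frame y."""
    means = []
    for perms in product(permutations(range(k)), repeat=r):
        cells = set()
        for b, perm in enumerate(perms):
            for i, j in enumerate(perm):
                g, h = b * k + i, b * k + j
                cells.add((g, h))                # step 1
                for m in range(1, r - b):        # step 2 (forward only)
                    cells.add((g, h + m * k))
                    cells.add((g + m * k, h))
        means.append(mean(y[g][h] for g, h in cells))
    return means

r, k = 2, 3
L = r * k
n, N = r * L, L * L
rng = random.Random(0)
y = [[rng.gauss(0, 1) for _ in range(L)] for _ in range(L)]  # hypothetical data

means = sample_means(y, r, k)
flat = [v for row in y for v in row]
var_design = pvariance(means)                  # exact variance over all samples
var_srs = (1 - n / N) * variance(flat) / n     # SRS without replacement
rp = var_srs / var_design                      # RP with respect to SRS (assumed definition)
print(var_design, var_srs, rp)
```

The enumeration grows as $(k!)^r$, so this check is practical only for small frames; for larger ones the same function could be applied to a random subset of step-1 permutations to approximate the variance.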

Comparison with Square Random Lattice Sampling
The variance of $\bar{y}$ under square random lattice sampling is given in Equation 2.1, in which we have: We know that L = rk.
For k ≥ r, it is clear that the first term of (3.7) is greater than the first term of (3.8) as: This comparison shows that the RP with respect to square random lattice sampling increases as k increases beyond r.

Numerical Examples
In this section, we have considered three examples (4×4, 6×6 and 9×9) to show the performance of the proposed systematic square random lattice sampling in comparison with SRS and square random lattice sampling.

Example 1
This small example with r = k = 2 is borrowed from Jessen (1975). The values of these variances show that, for small values of r and k, the proposed procedure does not perform well in comparison with SRS and square random lattice sampling. This performance can be expressed in terms of RP as: RP with respect to square random lattice sampling is:

Example 2
This example consists of a 6×6 frame simulated from the standard normal distribution using SPSS software, from which a sample of size 12 is selected with r = 2 and k = 3. The data are shown in Table 2.
For this example, we get:

Example 3
A higher-order population of size 9×9 is generated from the uniform distribution U(0, 10), as shown in Table 3. For this example, the sample size is taken as 27 with r = k = 3.
The final variances come out to be:

Conclusion
We propose a method of selecting samples under a two-way stratification structure of the population. In many forest surveys, the forest area under study (generally a forest stand, which may be a national park, biosphere reserve, etc., with an area of more than 500 sq. km) is very large. Sampling is usually done by the quadrat method (with quadrats of, say, 0.1 hectare) under two-way stratification. In such situations, the numbers of rows and columns increase rapidly as the forest area increases. A systematic procedure is therefore required that does not involve checking the number of units selected from each row and column at every step. The present procedure may be helpful in such situations, when the population size is large. It requires only the latin squares placed along the main diagonal of the population; systematic sampling is then applied to select the units from the other latin squares. A second advantage for large populations is that the relative precision of the proposed procedure increases as the population size increases. We recommend that the size of the latin squares be larger than the number of latin squares placed along the main diagonal of the population. Finally, the proposed procedure may be regarded as a good alternative to the square random lattice sampling procedure proposed by Jessen (1975) for large populations.