ADAPTIVE CLUSTER SAMPLING USING AUXILIARY VARIABLE

This study examines estimators of the population mean in adaptive cluster sampling that use information from an auxiliary variable. The estimators considered are the classical ratio estimator, ratio estimators using the population coefficient of variation and the coefficient of kurtosis of the auxiliary variable, the regression estimator and the difference estimator. Simulations showed that the difference estimator had the smallest estimated mean square error when compared to the ratio estimators and the regression estimator.


INTRODUCTION
Adaptive cluster sampling, proposed by Thompson (1990), is an efficient method for sampling rare and hidden clustered populations. In adaptive cluster sampling, an initial sample of units is selected by simple random sampling. If the value of the variable of interest for a sampled unit satisfies a pre-specified condition $C = \{i : y_i \ge c\}$, then the unit's neighborhood is also added to the sample. If any of the units that are "adaptively" added also satisfy the condition C, then their neighborhoods are added to the sample as well. This process continues until no more units that satisfy the condition are found. The set of all units selected, together with all neighboring units that satisfy the condition, is called a network. The adaptively added units that do not satisfy the condition are called edge units. A network and its associated edge units are called a cluster. If a unit is selected in the initial sample and does not satisfy the condition C, then it forms a network of only one unit. A neighborhood must be defined such that if unit i is in the neighborhood of unit j, then unit j is in the neighborhood of unit i. In this study, the neighborhood of a unit is defined as the four spatially adjacent units, that is, the units to the left, right, top and bottom of that unit, as shown in Fig. 1.
Figure 1 illustrates an example of a network. The unit with a star is the initial unit selected. The condition for adaptively adding units is a value greater than or equal to 1. Units that are to the left, right, top and bottom of one another make up a neighborhood. The units with gray shading form a single network. The units with bold numbers are the edge units of the network. The network and its edge units make up a cluster.
Sometimes other variables are related to the variable of interest y, and they provide additional information for estimating the population mean. Using an auxiliary variable is a common way to improve the precision of estimates of a population mean. This study examines estimators of the population mean in adaptive cluster sampling that use an auxiliary variable. The estimators are compared in a simulation study.
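As a rough illustration of the procedure described above, the following Python sketch grows the network and edge units around one initial unit on a rectangular grid, using the four-neighbor definition and a condition of the form y ≥ c. The function name `grow_cluster`, the parameter `threshold` and the toy grid are assumptions made for this example; they do not come from the study.

```python
from collections import deque

def grow_cluster(grid, start, threshold=1):
    """Return (network, edge_units) for the cluster grown from `start`.

    grid is a list of rows of y-values; units are (row, column) tuples.
    A unit satisfies the condition when its y-value is >= threshold.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    r0, c0 = start
    if grid[r0][c0] < threshold:
        # A unit that fails the condition is a network of size one
        # and triggers no further sampling.
        return {start}, set()
    network, edges = {start}, set()
    queue = deque([start])
    while queue:
        r, col = queue.popleft()
        # Four spatially adjacent units: top, bottom, left, right.
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, col + dc
            if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                continue
            unit = (nr, nc)
            if grid[nr][nc] >= threshold:
                if unit not in network:
                    network.add(unit)
                    queue.append(unit)
            else:
                edges.add(unit)  # sampled, but does not satisfy the condition
    return network, edges

# Hypothetical 4 x 4 population of y-values.
grid = [
    [0, 0, 0, 0],
    [0, 2, 3, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 5],
]
network, edges = grow_cluster(grid, (1, 1), threshold=1)
print(sorted(network))
print(sorted(edges))
```

In this toy grid the unit with value 5 is only diagonally adjacent to the network, so under the four-neighbor definition it is not reached from the starting unit.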

Simple Random Sampling Using Auxiliary Variable
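Let y be the variable of interest defined on a finite population consisting of a set of N units $\{u_1, u_2, \ldots, u_N\}$ indexed by their labels $S = \{1, 2, \ldots, N\}$. Associated with unit i are the variable of interest $y_i$ and the auxiliary variable $x_i$. The population mean of y is:

$$\mu_y = \frac{1}{N}\sum_{i=1}^{N} y_i$$

The ratio estimate of the population mean of y is:

$$\bar{y}_R = \frac{\bar{y}}{\bar{x}}\,\mu_x$$

where $\bar{y}$ and $\bar{x}$ are the sample means and $\mu_x$ is the population mean of x. The approximate MSE of the ratio estimate of the population mean of y is:

$$MSE(\bar{y}_R) \approx \frac{N-n}{Nn}\left(S_y^2 + R^2 S_x^2 - 2R S_{xy}\right)$$

where n is the sample size, $R = \mu_y/\mu_x$ is the population ratio, $S_x^2$ is the population variance of the auxiliary variable, $S_y^2$ is the population variance of the variable of interest and $S_{xy}$ is the population covariance between the auxiliary variable and the variable of interest.
Sisodia and Dwivedi (1981) suggested the ratio estimator for the population mean of y as:

$$\bar{y}_{R\_SD} = \bar{y}\,\frac{\mu_x + C_x}{\bar{x} + C_x}$$

The approximate MSE of $\bar{y}_{R\_SD}$ is:

$$MSE(\bar{y}_{R\_SD}) \approx \frac{N-n}{Nn}\,\mu_y^2\left[C_y^2 + \alpha^2 C_x^2 - 2\alpha\rho_{xy} C_x C_y\right]$$

where $\alpha = \dfrac{\mu_x}{\mu_x + C_x}$, $C_x$ is the population coefficient of variation of the auxiliary variable, $C_y$ is the population coefficient of variation of the variable of interest and $\rho_{xy}$ is the coefficient of correlation between the auxiliary variable and the variable of interest.
Singh and Kakran (1993) suggested the ratio estimator for the population mean of y as:

$$\bar{y}_{R\_SK} = \bar{y}\,\frac{\mu_x + \beta_2(x)}{\bar{x} + \beta_2(x)}$$

The approximate MSE of $\bar{y}_{R\_SK}$ is:

$$MSE(\bar{y}_{R\_SK}) \approx \frac{N-n}{Nn}\,\mu_y^2\left[C_y^2 + \delta^2 C_x^2 - 2\delta\rho_{xy} C_x C_y\right]$$

where $\delta = \dfrac{\mu_x}{\mu_x + \beta_2(x)}$ and $\beta_2(x)$ is the population coefficient of kurtosis of the auxiliary variable.
Upadhyaya and Singh (1999) considered both the coefficient of variation and the coefficient of kurtosis in ratio estimators as:

$$\bar{y}_{R\_UK1} = \bar{y}\,\frac{\mu_x\beta_2(x) + C_x}{\bar{x}\beta_2(x) + C_x}, \qquad \bar{y}_{R\_UK2} = \bar{y}\,\frac{\mu_x C_x + \beta_2(x)}{\bar{x} C_x + \beta_2(x)}$$

The approximate MSE of $\bar{y}_{R\_UK2}$ is:

$$MSE(\bar{y}_{R\_UK2}) \approx \frac{N-n}{Nn}\,\mu_y^2\left[C_y^2 + \omega_2^2 C_x^2 - 2\omega_2\rho_{xy} C_x C_y\right]$$

where $\omega_2 = \dfrac{\mu_x C_x}{\mu_x C_x + \beta_2(x)}$. The approximate MSE of $\bar{y}_{R\_UK1}$ has the same form with $\omega_2$ replaced by $\omega_1 = \dfrac{\mu_x\beta_2(x)}{\mu_x\beta_2(x) + C_x}$.
The regression estimate of the population mean of y is:

$$\bar{y}_{lr} = \bar{y} + b\,(\mu_x - \bar{x})$$

where $b = s_{xy}/s_x^2$ is the sample regression coefficient. The approximate MSE of $\bar{y}_{lr}$ is:

$$MSE(\bar{y}_{lr}) \approx \frac{N-n}{Nn}\,S_y^2\left(1 - \rho_{xy}^2\right)$$

The difference estimate of the population mean of y is:

$$\bar{y}_D = \bar{y} + (\mu_x - \bar{x})$$

Let n denote the initial sample size and v denote the final sample size. Let $\psi_i$ denote the network that includes unit i and $m_i$ be the number of units in that network. The initial sample of units is selected by simple random sampling without replacement.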
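The estimators under simple random sampling listed in this section can be computed side by side from one sample. The sketch below is only illustrative: the function name `srs_mean_estimators` and the dictionary keys are hypothetical, and the ratio-type forms for the Sisodia and Dwivedi, Singh and Kakran, and Upadhyaya and Singh estimators are the commonly cited ones, assumed here rather than taken from the study. The population quantities `mu_x`, `c_x` and `beta2_x` are assumed to be known.

```python
from statistics import mean

def srs_mean_estimators(x, y, mu_x, c_x, beta2_x):
    """Estimate the population mean of y from paired sample values (x, y).

    mu_x, c_x and beta2_x are the population mean, coefficient of variation
    and coefficient of kurtosis of the auxiliary variable (assumed known).
    """
    x_bar, y_bar = mean(x), mean(y)
    # Sample regression slope b = s_xy / s_x^2 (the n - 1 divisors cancel).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return {
        "ratio": y_bar * mu_x / x_bar,
        "ratio_SD": y_bar * (mu_x + c_x) / (x_bar + c_x),
        "ratio_SK": y_bar * (mu_x + beta2_x) / (x_bar + beta2_x),
        "ratio_UK1": y_bar * (mu_x * beta2_x + c_x) / (x_bar * beta2_x + c_x),
        "ratio_UK2": y_bar * (mu_x * c_x + beta2_x) / (x_bar * c_x + beta2_x),
        "regression": y_bar + b * (mu_x - x_bar),
        "difference": y_bar + (mu_x - x_bar),
    }

# Hypothetical sample and population values, for illustration only.
est = srs_mean_estimators(x=[1, 2, 3], y=[2, 4, 6], mu_x=3, c_x=1, beta2_x=2)
for name, value in est.items():
    print(f"{name:>10}: {value:.4f}")
```

Every estimator adjusts the sample mean of y by how far the sample mean of x falls from the known population mean of x, so they all reduce to the sample mean of y when the two coincide.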

Adaptive Cluster Sampling
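The Hansen-Hurwitz estimator of the population mean for the variable of interest can be written as (Thompson, 1990; Thompson and Seber, 1996):

$$\bar{y}_{ac} = \frac{1}{n}\sum_{i=1}^{n} (w_y)_i$$

where $(w_y)_i = \dfrac{1}{m_i}\sum_{j \in \psi_i} y_j$ is the mean of the y-values in the network that includes unit i.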
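A minimal sketch of the Hansen-Hurwitz computation follows, assuming the networks of the initially sampled units have already been identified. The function name `hansen_hurwitz_mean`, the dictionary layout and the toy values are hypothetical choices for this example.

```python
def hansen_hurwitz_mean(initial_units, network_of, y):
    """Average the network means of the n initially sampled units.

    initial_units: labels of the units in the initial sample
    network_of:    dict mapping a unit label to the labels in its network
    y:             dict mapping a unit label to its y-value
    """
    network_means = []
    for i in initial_units:
        members = network_of[i]
        network_means.append(sum(y[j] for j in members) / len(members))
    return sum(network_means) / len(network_means)

# Hypothetical population of five units; units 2 and 3 form one network.
y = {1: 0, 2: 4, 3: 6, 4: 0, 5: 0}
network_of = {1: [1], 2: [2, 3], 3: [2, 3], 4: [4], 5: [5]}
estimate = hansen_hurwitz_mean([1, 2], network_of, y)
print(estimate)
```

Edge units do not enter the computation: only the networks that intersect the initial sample contribute.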

Proposed Estimators in Adaptive Cluster Sampling
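Let $(w_x)_i = \dfrac{1}{m_i}\sum_{j \in \psi_i} x_j$ be the mean of the x-values in the network that includes unit i, and let $\bar{w}_y$ and $\bar{w}_x$ denote the means of the network means $(w_y)_i$ and $(w_x)_i$ over the n units of the initial sample.
The ratio estimator for the population mean of y in adaptive cluster sampling based on Sisodia and Dwivedi (1981) is:

$$\bar{y}_{R\_SD\_ac} = \bar{w}_y\,\frac{\mu_x + C_{wx}}{\bar{w}_x + C_{wx}}$$

Using the Taylor series method, the approximate MSE of this estimator is:

$$MSE(\bar{y}_{R\_SD\_ac}) \approx \frac{N-n}{Nn}\,\mu_y^2\left[C_{wy}^2 + \alpha_w^2 C_{wx}^2 - 2\alpha_w\rho_{wxy} C_{wx} C_{wy}\right]$$

Where:

$$\alpha_w = \frac{\mu_x}{\mu_x + C_{wx}}$$

And $C_{wx}$ and $C_{wy}$ are the population coefficients of variation of $w_x$ and $w_y$, and $\rho_{wxy}$ is the coefficient of correlation between $w_x$ and $w_y$.
The ratio estimator for the population mean of y in adaptive cluster sampling based on Singh and Kakran (1993) is:

$$\bar{y}_{R\_SK\_ac} = \bar{w}_y\,\frac{\mu_x + \beta_2(w_x)}{\bar{w}_x + \beta_2(w_x)}$$

and $\beta_2(w_x)$ is the population coefficient of kurtosis of $w_x$. The Taylor series method is used for this estimator in the same way to obtain the MSE:

$$MSE(\bar{y}_{R\_SK\_ac}) \approx \frac{N-n}{Nn}\,\mu_y^2\left[C_{wy}^2 + \delta_w^2 C_{wx}^2 - 2\delta_w\rho_{wxy} C_{wx} C_{wy}\right], \qquad \delta_w = \frac{\mu_x}{\mu_x + \beta_2(w_x)}$$

The ratio estimators for the population mean of y in adaptive cluster sampling based on Upadhyaya and Singh (1999) are:

$$\bar{y}_{R\_UK1\_ac} = \bar{w}_y\,\frac{\mu_x\beta_2(w_x) + C_{wx}}{\bar{w}_x\beta_2(w_x) + C_{wx}}, \qquad \bar{y}_{R\_UK2\_ac} = \bar{w}_y\,\frac{\mu_x C_{wx} + \beta_2(w_x)}{\bar{w}_x C_{wx} + \beta_2(w_x)}$$

Their approximate MSEs have the same form as above, with $\alpha_w$ replaced by $\omega_{w1}$ and $\omega_{w2}$ respectively, Where:

$$\omega_{w1} = \frac{\mu_x\beta_2(w_x)}{\mu_x\beta_2(w_x) + C_{wx}}, \qquad \omega_{w2} = \frac{\mu_x C_{wx}}{\mu_x C_{wx} + \beta_2(w_x)}$$

The regression estimate of the population mean of y in adaptive cluster sampling is:

$$\bar{y}_{lr\_ac} = \bar{w}_y + b_w\,(\mu_x - \bar{w}_x)$$

where $b_w$ is the sample regression coefficient of $w_y$ on $w_x$.
The difference estimate of the population mean of y in adaptive cluster sampling is:

$$\bar{y}_{D\_ac} = \bar{w}_y + (\mu_x - \bar{w}_x)$$

The approximate MSE of $\bar{y}_{D\_ac}$ is:

$$MSE(\bar{y}_{D\_ac}) \approx \frac{N-n}{Nn}\left(S_{wy}^2 + S_{wx}^2 - 2S_{wxy}\right)$$

where $S_{wy}^2$ and $S_{wx}^2$ are the population variances of $w_y$ and $w_x$ and $S_{wxy}$ is the population covariance between $w_x$ and $w_y$.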
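The adaptive estimators work on network means rather than on the raw unit values. The sketch below replaces each unit's x- and y-value by the mean of its network and then evaluates an MSE for the difference estimator in the usual simple-random-sampling form, (N − n)/(Nn) · (S²_wy + S²_wx − 2S_wxy), applied to those network means. That form is an assumption made for this example, as are the function names `network_means` and `difference_mse` and the toy population.

```python
def network_means(values, networks):
    """Replace each unit's value by the mean of the network it belongs to.

    values:   list of unit values, indexed 0 .. N-1
    networks: list of lists of unit indices that partition the population
    """
    w = [0.0] * len(values)
    for net in networks:
        m = sum(values[j] for j in net) / len(net)
        for j in net:
            w[j] = m
    return w

def difference_mse(wx, wy, n):
    """MSE of the difference estimator for initial sample size n,
    computed from the population network means wx and wy."""
    N = len(wx)
    mx, my = sum(wx) / N, sum(wy) / N
    s_x2 = sum((a - mx) ** 2 for a in wx) / (N - 1)
    s_y2 = sum((b - my) ** 2 for b in wy) / (N - 1)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(wx, wy)) / (N - 1)
    return (N - n) / (N * n) * (s_y2 + s_x2 - 2 * s_xy)

# Hypothetical population of six units; units 1 and 2 form one network.
x = [0, 2, 4, 0, 0, 0]
y = [0, 4, 6, 0, 0, 0]
networks = [[0], [1, 2], [3], [4], [5]]
wx, wy = network_means(x, networks), network_means(y, networks)
mse = difference_mse(wx, wy, n=2)
print(wx, wy, mse)
```

When the network means of x track those of y closely, the covariance term offsets most of the two variance terms, which is the mechanism behind the gain from the auxiliary variable.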

Simulation Study
In this section, the simulated x-values and y-values from Pochai (2008) are studied. The populations are shown in Fig. 2-3 and summary statistics for these populations are given in Table 1.
For each iteration, an initial sample of units is selected by simple random sampling without replacement. The y-values are observed in order to determine the sample networks, and the x-values are then obtained for each sample network. The condition for adding units to the sample is C = {y: y > 0}.
For each estimator, 5,000 iterations were performed to obtain an accurate estimate. Initial sample sizes of n = 5, 10, 15, 20, 30 and 40 were used. The estimated mean square error of the estimated mean is:

$$\widehat{MSE}(\bar{y}) = \frac{1}{5000}\sum_{i=1}^{5000}\left(\bar{y}_i - \mu_y\right)^2$$

where $\bar{y}_i$ is the value of the estimator in the ith iteration.
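The estimated mean square error is the average squared deviation of the per-iteration estimates from the true population mean. The sketch below shows that calculation on a small hypothetical population, using the plain sample mean under simple random sampling without replacement as a stand-in estimator. It does not use the Pochai (2008) data or the adaptive design, and the names `estimated_mse` and `simulate_sample_mean` are assumptions made for this example.

```python
import random

def estimated_mse(estimates, mu_y):
    """Average squared deviation of the estimates from the true mean mu_y."""
    return sum((e - mu_y) ** 2 for e in estimates) / len(estimates)

def simulate_sample_mean(population, n, iterations=5000, seed=0):
    """Estimated MSE of the sample mean over repeated SRS draws of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    mu_y = sum(population) / len(population)
    estimates = []
    for _ in range(iterations):
        sample = rng.sample(population, n)  # SRS without replacement
        estimates.append(sum(sample) / n)
    return estimated_mse(estimates, mu_y)

# Hypothetical rare, clustered y-values (not the data used in the study).
population = [0, 0, 0, 0, 3, 5, 0, 0, 8, 0]
mse_by_n = {n: simulate_sample_mean(population, n) for n in (2, 5, 10)}
for n, mse in mse_by_n.items():
    print(n, mse)
```

To compare estimators as in the study, the body of the loop would be replaced by the adaptive design and the estimator of interest, with the same `estimated_mse` summary applied to each.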

CONCLUSION
Adaptive cluster sampling is an efficient method for sampling rare and hidden clustered populations. The estimated MSEs of the estimators in Table 2 show that the difference estimator had the smallest estimated mean square error when compared to the ratio estimators and the regression estimator.
The ratio estimator of the population mean, $\bar{y}_{R\_ac}$, had a smaller estimated mean square error than the ratio estimators using information on $C_{wx}$ and $\beta_2(w_x)$.
The estimator of the population mean that did not use the auxiliary variable had a higher estimated mean square error than the estimators that used the auxiliary variable.

ACKNOWLEDGEMENT