The simple random sampling scheme provides a random sample in which every unit in the population has an equal probability of selection. Under certain circumstances, more efficient estimators are obtained by assigning unequal probabilities of selection to the units in the population. This type of sampling is known as a varying probability sampling scheme.
If Y is the variable under study and X is an auxiliary variable related to Y, then in the most commonly used varying probability scheme, the units are selected with probability proportional to the value of X, called the size. This is termed probability proportional to a given measure of size (pps) sampling. If the sampling units vary considerably in size, then SRS does not take into account the possible importance of the larger units in the population. A large unit, i.e., a unit with a large value of Y, contributes more to the population total than the units with smaller values. It is therefore natural to expect that a selection scheme which assigns a higher probability of inclusion in the sample to the larger units than to the smaller units would provide more efficient estimators than a scheme which gives equal probability to all the units. This is accomplished through pps sampling.
Note that the “size” considered is the value of the auxiliary variable X and not the value of the study variable Y. For example, in an agricultural survey, the yield depends on the area under cultivation. Bigger areas are likely to have larger yields and so contribute more towards the population total, so the area can be taken as the measure of size. The cultivated area for a previous period can also be taken as the size while estimating the yield of a crop. Similarly, in an industrial survey, the number of workers in a factory can be considered as the measure of size when studying the industrial output of that factory.
Difference between the methods of SRS and varying probability scheme: In SRS, the probability of drawing a specified unit at any given draw is the same. In a varying probability scheme, the probability of drawing a specified unit differs from draw to draw. It may appear that pps sampling would give biased estimators, as the larger units are over-represented and the smaller units are under-represented in the sample. This does happen when the sample mean, which gives equal weight to all the units, is used as an estimator of the population mean. If, instead of giving equal weights to all the units, the sample observations are suitably weighted at the estimation stage by taking the probabilities of selection into account, then it is possible to obtain unbiased estimators.
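The effect of weighting can be checked exactly for a single pps draw. The following is a minimal sketch, assuming selection probabilities $P_i = X_i / \sum_k X_k$ and the weight $1/(N P_i)$; the function name `pps_expectations` and the three-unit data set are hypothetical and serve only as an illustration.

```python
from fractions import Fraction

def pps_expectations(y, x):
    """Exact expectations of two estimators of the mean for ONE pps draw.

    y, x: study values and integer sizes of the population units
    (illustrative values only, not taken from the text).
    """
    n_units = len(y)
    total_x = sum(x)
    # Selection probabilities proportional to size: P_i = X_i / sum(X).
    p = [Fraction(xi, total_x) for xi in x]
    pop_mean = Fraction(sum(y), n_units)
    # Unweighted: the drawn value itself is used as the estimate.
    e_unweighted = sum(pi * yi for pi, yi in zip(p, y))
    # Weighted: the drawn value is divided by N * P_i before use.
    e_weighted = sum(pi * Fraction(yi) / (n_units * pi) for pi, yi in zip(p, y))
    return pop_mean, e_unweighted, e_weighted

y = [10, 20, 60]   # hypothetical study values
x = [1, 2, 7]      # hypothetical sizes
pop_mean, e_unweighted, e_weighted = pps_expectations(y, x)
print(pop_mean, e_unweighted, e_weighted)
```

In this toy population the unweighted value overshoots the population mean because the large unit is drawn most often, while the weighted value has expectation equal to the population mean.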
In pps sampling, there are two possibilities to draw the sample, i.e., with replacement and without replacement.
Selection of units with replacement: The probability of selection of a unit does not change, so the probability of selecting a specified unit is the same at every draw. There is no redistribution of the probabilities after a draw.
Selection of units without replacement: The probability of selection of a unit changes from draw to draw, and the probabilities are redistributed after each draw.
PPS without replacement (WOR) is more complex than PPS with replacement (WR). We consider both cases separately.
PPS sampling with replacement (WR): First we discuss the two methods to draw a sample with PPS and WR.
1. Cumulative total method: The procedure of selecting a simple random sample of size n consists of
- associating the natural numbers from 1 to N with the N units in the population and
- then selecting those n units whose serial numbers correspond to a set of n numbers, each less than or equal to N, drawn from a random number table.
In selection of a sample with varying probabilities, the procedure is to associate with each unit a set of consecutive natural numbers, the size of the set being proportional to the desired probability.
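If $X_1, X_2, \ldots, X_N$ are positive integers proportional to the probabilities assigned to the N units in the population, then one possible way is to associate with each unit the cumulative total of the sizes up to and including that unit. The units are then selected based on the values of the cumulative totals. This is illustrated in the following table:

| Unit | Size | Cumulative total |
|---|---|---|
| 1 | $X_1$ | $T_1 = X_1$ |
| 2 | $X_2$ | $T_2 = X_1 + X_2$ |
| ... | ... | ... |
| i | $X_i$ | $T_i = \sum_{k=1}^{i} X_k$ |
| ... | ... | ... |
| N | $X_N$ | $T_N = \sum_{k=1}^{N} X_k$ |

Select a random number R between 1 and $T_N$ using a random number table. If $T_{i-1} < R \le T_i$ (with $T_0 = 0$), then the ith unit is selected. The procedure is repeated n times to get a sample of size n.
In this case, the probability of selection of the ith unit is
$$P_i = \frac{T_i - T_{i-1}}{T_N} = \frac{X_i}{T_N} \propto X_i .$$
Note that $T_N$ is the population total of the sizes, which remains constant.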
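The cumulative total method can be sketched in a few lines. This is an illustrative sketch, not a definitive implementation: it assumes positive integer sizes, replaces the random number table with Python's `random` module, and returns 0-based unit indices. The function name `cumulative_total_sample` and its `rng` parameter are hypothetical.

```python
import bisect
import itertools
import random

def cumulative_total_sample(sizes, n, rng=random):
    """Draw n unit indices (0-based) with replacement.

    Each draw selects unit i with probability sizes[i] / sum(sizes).
    `sizes` is assumed to hold positive integers.
    """
    totals = list(itertools.accumulate(sizes))   # T_1, T_2, ..., T_N
    sample = []
    for _ in range(n):
        r = rng.randint(1, totals[-1])           # random number in 1..T_N, inclusive
        # bisect_left returns the first index whose cumulative total is >= r,
        # i.e. the unit i with T_{i-1} < r <= T_i.
        sample.append(bisect.bisect_left(totals, r))
    return sample

# Hypothetical sizes; the seed is fixed only to make the run repeatable.
print(cumulative_total_sample([2, 5, 10, 4], n=3, rng=random.Random(1)))
```

The binary search avoids scanning the table of cumulative totals at each draw, but the totals themselves must still be computed once for all N units.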
Drawback: This procedure involves writing down the successive cumulative totals. This is time consuming and tedious if the number of units in the population is large.
This problem is overcome in Lahiri’s method.
2. Lahiri’s method:
Let $M = \max_{i = 1, 2, \ldots, N} X_i$, i.e., the maximum of the sizes of the N units in the population, or some convenient number greater than this maximum.
The sampling procedure has the following steps:
1. Select a pair of random numbers $(i, j)$ such that $1 \le i \le N$, $1 \le j \le M$.
2. If $j \le X_i$, then the ith unit is selected; otherwise the pair is rejected and another pair of random numbers is chosen.
3. To get a sample of size n, this procedure is repeated till n units are selected.
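The three steps above translate directly into an accept/reject loop. The sketch below is a minimal illustration, assuming positive integer sizes and using Python's `random` module in place of a random number table; the name `lahiri_sample` and the parameters `m` and `rng` are hypothetical.

```python
import random

def lahiri_sample(sizes, n, m=None, rng=random):
    """Draw n unit labels (1-based) with replacement by Lahiri's method.

    sizes: positive integer sizes X_1..X_N.
    m: the bound M; defaults to max(sizes), any larger integer also works.
    """
    n_units = len(sizes)
    m = max(sizes) if m is None else m
    sample = []
    while len(sample) < n:
        i = rng.randint(1, n_units)   # step 1: random unit label, 1..N
        j = rng.randint(1, m)         # step 1: random number, 1..M
        if j <= sizes[i - 1]:         # step 2: accept the unit if j <= X_i
            sample.append(i)
        # otherwise the pair is rejected and a new pair is drawn
    return sample                     # step 3: stop once n units are accepted

# Hypothetical sizes; the seed is fixed only to make the run repeatable.
print(lahiri_sample([2, 5, 10, 4], n=3, rng=random.Random(7)))
```

Only the size of the unit picked by the first random number is looked up at each trial, so no cumulative totals are needed.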
Now we see how this method ensures that the probabilities of selection of units are varying and are proportional to size.
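The probability of selection of the ith unit depends on two possible outcomes:
- either it is selected at the first trial,
- or it is selected at a subsequent trial preceded by ineffective trials.
The probability that a given trial selects the ith unit is
$$P_i^* = P(\text{first number is } i)\, P(j \le X_i \mid i) = \frac{1}{N} \cdot \frac{X_i}{M} .$$
Probability that no unit is selected at a trial:
$$Q = \frac{1}{N} \sum_{i=1}^{N} \left(1 - \frac{X_i}{M}\right) = 1 - \frac{\bar{X}}{M}, \qquad \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i .$$
Probability that unit i is selected at a given draw (all previous trials being ineffective):
$$P_i^* + Q P_i^* + Q^2 P_i^* + \cdots = \frac{P_i^*}{1 - Q} = \frac{X_i / (NM)}{\bar{X} / M} = \frac{X_i}{N \bar{X}} = \frac{X_i}{\sum_{k=1}^{N} X_k} \propto X_i .$$
Thus the probability of selection of unit i is proportional to the size $X_i$. So this method generates a pps sample.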
Advantages:
1. It does not require writing down all cumulative totals for each unit.
2. The sizes of all the units need not be known beforehand. We need only some number at least as large as the maximum size, together with the sizes of those units picked by the first random number of each pair (the number between 1 and N).
Disadvantage: It results in a waste of time and effort whenever units get rejected.
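The probability of rejection at a trial is $1 - \dfrac{\bar{X}}{M}$, where $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$ is the mean size.
The expected number of trials required to draw one unit is $\dfrac{M}{\bar{X}}$.
This number is large if M is much larger than $\bar{X}$.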
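To see how costly the rejections can be, the rejection probability $1 - \bar{X}/M$ and the expected number of trials per selected unit $M/\bar{X}$ can be computed directly. The sketch below uses exact fractions; the name `lahiri_efficiency` and the skewed set of sizes are hypothetical and chosen only to make the effect visible.

```python
from fractions import Fraction

def lahiri_efficiency(sizes, m=None):
    """Return (rejection probability, expected trials per selected unit).

    sizes: positive integer sizes X_1..X_N.
    m: the bound M used in Lahiri's method; defaults to max(sizes).
    """
    m = max(sizes) if m is None else m
    mean_size = Fraction(sum(sizes), len(sizes))   # X-bar
    accept = mean_size / m                         # P(a trial selects some unit)
    return 1 - accept, 1 / accept

# Hypothetical population: four small units and one very large unit.
reject_prob, expected_trials = lahiri_efficiency([1, 1, 1, 1, 16])
print(reject_prob, expected_trials)
```

With one dominant unit the mean size is far below the maximum, so most trials are rejected; when the sizes are nearly equal the method wastes very few trials.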
Example: Consider the following data set on the number of workers and the output of 10 factories. We illustrate the selection of units using the cumulative total method.
Selection of sample using cumulative total method:
1. First draw:
- Draw a random number between 1 and 64.
- Suppose it is 23
- $T_4 < 23 < T_5$
- Unit 5 is selected and $Y_5 = 8$ enters the sample.
2. Second draw:
- Draw a random number between 1 and 64
- Suppose it is 38
- $T_7 < 38 < T_8$
- Unit 8 is selected and $Y_8 = 17$ enters the sample
- and so on.
- This procedure is repeated till the sample of required size is obtained.
Selection of sample using Lahiri’s Method
In this case, the largest size is $M = 14$.
So we need to select a pair of random numbers $(i, j)$ such that $1 \le i \le 10$, $1 \le j \le 14$.
Following table shows the sample obtained by Lahiri’s scheme:
and so on. Here $(y_3, y_9)$ are selected into the sample.
Varying probability scheme with replacement: Estimation of population mean
Let Yi : value of study variable for the ith unit of the population, i = 1, 2,…,N.
Xi : known value of auxiliary variable (size) for the ith unit of the population.
Pi : probability of selection of ith unit in the population at any given draw and is proportional to size Xi .
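Consider the varying probability scheme with replacement for a sample of size n. Let $y_r$ be the value of the rth observation on the study variable in the sample and $p_r$ be its initial probability of selection. Define
$$z_r = \frac{y_r}{N p_r}, \qquad r = 1, 2, \ldots, n,$$
then
$$\bar{z} = \frac{1}{n} \sum_{r=1}^{n} z_r$$
is an unbiased estimator of the population mean $\bar{Y}$, the variance of $\bar{z}$ is
$$\mathrm{Var}(\bar{z}) = \frac{\sigma_z^2}{n}, \qquad \sigma_z^2 = \sum_{i=1}^{N} P_i \left( \frac{Y_i}{N P_i} - \bar{Y} \right)^2,$$
and an unbiased estimate of the variance of $\bar{z}$ is
$$\frac{s_z^2}{n} = \frac{1}{n(n-1)} \sum_{r=1}^{n} (z_r - \bar{z})^2 .$$
Proof: Note that $z_r$ can take any one of the N values $Z_1, Z_2, \ldots, Z_N$, where $Z_i = Y_i / (N P_i)$, with corresponding initial probabilities $P_1, P_2, \ldots, P_N$, respectively. So
$$E(z_r) = \sum_{i=1}^{N} Z_i P_i = \sum_{i=1}^{N} \frac{Y_i}{N P_i} P_i = \bar{Y} .$$
Thus
$$E(\bar{z}) = \frac{1}{n} \sum_{r=1}^{n} E(z_r) = \bar{Y} .$$
So $\bar{z}$ is an unbiased estimator of the population mean $\bar{Y}$.
Since the draws are independent under sampling with replacement, the variance of $\bar{z}$ is
$$\mathrm{Var}(\bar{z}) = \frac{1}{n^2} \sum_{r=1}^{n} \mathrm{Var}(z_r) .$$
Now
$$\mathrm{Var}(z_r) = E(z_r - \bar{Y})^2 = \sum_{i=1}^{N} (Z_i - \bar{Y})^2 P_i = \sigma_z^2 .$$
Thus
$$\mathrm{Var}(\bar{z}) = \frac{1}{n^2} \cdot n \sigma_z^2 = \frac{\sigma_z^2}{n} .$$
To show that $s_z^2 / n$ is an unbiased estimator of the variance of $\bar{z}$, consider
$$(n-1) E(s_z^2) = E\left[ \sum_{r=1}^{n} z_r^2 - n \bar{z}^2 \right] = n(\sigma_z^2 + \bar{Y}^2) - n\left( \frac{\sigma_z^2}{n} + \bar{Y}^2 \right) = (n-1) \sigma_z^2 ,$$
so that $E(s_z^2) = \sigma_z^2$ and $E(s_z^2 / n) = \sigma_z^2 / n = \mathrm{Var}(\bar{z})$.
Note: If $P_i = 1/N$ for all i, then $\bar{z} = \bar{y}$ and $\mathrm{Var}(\bar{z}) = \frac{1}{nN} \sum_{i=1}^{N} (Y_i - \bar{Y})^2$, which is the same as in the case of SRSWR.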
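Estimation of population total: An estimate of the population total $Y_{tot} = \sum_{i=1}^{N} Y_i$ is
$$\hat{Y}_{tot} = \frac{1}{n} \sum_{r=1}^{n} \frac{y_r}{p_r} = N \bar{z}, \qquad \bar{z} = \frac{1}{n} \sum_{r=1}^{n} \frac{y_r}{N p_r} .$$
Taking expectation, we get
$$E(\hat{Y}_{tot}) = \frac{1}{n} \sum_{r=1}^{n} \sum_{i=1}^{N} \frac{Y_i}{P_i} P_i = \frac{1}{n} \sum_{r=1}^{n} Y_{tot} = Y_{tot} .$$
Thus $\hat{Y}_{tot}$ is an unbiased estimator of the population total. Its variance is
$$\mathrm{Var}(\hat{Y}_{tot}) = N^2 \mathrm{Var}(\bar{z}) = \frac{1}{n} \sum_{i=1}^{N} P_i \left( \frac{Y_i}{P_i} - Y_{tot} \right)^2 = \frac{1}{n} \left[ \sum_{i=1}^{N} \frac{Y_i^2}{P_i} - Y_{tot}^2 \right] .$$
An estimate of the variance is
$$\widehat{\mathrm{Var}}(\hat{Y}_{tot}) = \frac{1}{n(n-1)} \sum_{r=1}^{n} \left( \frac{y_r}{p_r} - \hat{Y}_{tot} \right)^2 .$$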
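The with-replacement estimators of the mean and total, together with their variance estimates, can be sketched as follows. This is a minimal illustration assuming the weighted variate $z_r = y_r / (N p_r)$ and at least two draws; the function name `pps_wr_estimates`, the dictionary keys, and the sample values are hypothetical.

```python
from fractions import Fraction

def pps_wr_estimates(y, p, n_units):
    """Estimates from a pps with-replacement sample.

    y: observed study values y_r, one per draw.
    p: initial selection probabilities p_r of the drawn units.
    n_units: population size N.
    """
    n = len(y)
    if n < 2:
        raise ValueError("need at least two draws to estimate the variance")
    z = [yr / (n_units * pr) for yr, pr in zip(y, p)]      # z_r = y_r / (N p_r)
    z_bar = sum(z) / n                                     # estimate of the mean
    s2_z = sum((zr - z_bar) ** 2 for zr in z) / (n - 1)    # sample variance of z
    var_mean = s2_z / n                                    # estimated Var(z_bar)
    return {
        "mean": z_bar,
        "var_mean": var_mean,
        "total": n_units * z_bar,                          # N * z_bar
        "var_total": n_units ** 2 * var_mean,
    }

# Hypothetical sample of n = 2 draws from a population of N = 4 units.
est = pps_wr_estimates(
    y=[Fraction(10), Fraction(30)],
    p=[Fraction(1, 10), Fraction(1, 5)],
    n_units=4,
)
print(est["mean"], est["total"], est["var_total"])
```

Exact fractions are used here only to keep the arithmetic transparent; the same function works with floating-point inputs.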
Varying probability scheme without replacement
In varying probability scheme without replacement, when the initial probabilities of selection are unequal, then the probability of drawing a specified unit of the population at a given draw changes with the draw. Generally, the sampling WOR provides a more efficient estimator than sampling WR. The estimators for population mean and variance are more complicated. So this scheme is not commonly used in practice, especially in large scale sample surveys with small sampling fractions.
Let $U_i$: the ith unit,
$P_i$: probability of selection of $U_i$ at the first draw, $i = 1, 2, \ldots, N$,
$P_i(r)$: probability of selecting $U_i$ at the rth draw,
so that $P_i(1) = P_i$.
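Consider
$P_i(2)$ = probability of selection of $U_i$ at the 2nd draw.
Such an event can occur in the following possible ways:
$U_i$ is selected at the 2nd draw when some other unit $U_j$ ($j \ne i$) is selected at the 1st draw. Once $U_j$ has been removed, $U_i$ is selected at the 2nd draw with probability $P_i / (1 - P_j)$.
So $P_i(2)$ can be expressed as
$$P_i(2) = \sum_{j \ne i} P_j \frac{P_i}{1 - P_j} = P_i \left[ \sum_{j=1}^{N} \frac{P_j}{1 - P_j} - \frac{P_i}{1 - P_i} \right] .$$
In general $P_i(2) \ne P_i(1)$; the two coincide for every i when $P_i = 1/N$. $P_i(2)$ will, in general, be different for each $i = 1, 2, \ldots, N$. So the expectation of $y_r / p_r$, where $y_r$ is the value observed at the rth draw and $p_r$ is its initial probability of selection, will change with successive draws.
This makes the varying probability scheme WOR more complex. Only $y_1 / p_1$ will provide an unbiased estimator of the population total $\sum_{i=1}^{N} Y_i$. In general, $y_r / p_r$ for $r \ge 2$ will not provide an unbiased estimator of the population total.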
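The change in selection probability between the first and second draws can be computed exactly. The sketch below assumes the second-draw probability $P_i(2) = P_i \left[ \sum_j P_j/(1-P_j) - P_i/(1-P_i) \right]$; the function name `second_draw_probabilities` and the example probabilities are hypothetical.

```python
from fractions import Fraction

def second_draw_probabilities(p):
    """Return P_i(2) for each unit under sampling without replacement.

    p: initial selection probabilities P_1..P_N (each < 1, summing to 1).
    """
    s = sum(pj / (1 - pj) for pj in p)              # sum over all j of P_j / (1 - P_j)
    return [pi * (s - pi / (1 - pi)) for pi in p]   # drop the j = i term, scale by P_i

unequal = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # hypothetical P_i
equal = [Fraction(1, 3)] * 3                                 # equal probabilities

p2_unequal = second_draw_probabilities(unequal)
p2_equal = second_draw_probabilities(equal)
print(p2_unequal)
print(p2_equal)
```

With unequal initial probabilities the second-draw probabilities differ from the first-draw ones (the largest unit becomes less likely, having probably been removed already), whereas with equal probabilities they stay at $1/N$.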
Ordered estimates
To overcome the difficulty of the expectation changing with each draw, associate a new variate with each draw such that its expectation is equal to the population value of the variate under study. Such estimators take into account the order of the draw. They are called the ordered estimates. The order in which the values were obtained at previous draws affects the unbiasedness of the estimator of the population mean.
We consider the ordered estimators proposed by Des Raj, first for the case of two draws and then generalize the result.