Sampling distributions, CSIR-NET Mathematical Sciences

Suppose that we draw all possible samples of size n from a given population. Suppose further that we compute a statistic (e.g., a mean, proportion, or standard deviation) for each sample. The probability distribution of this statistic is called a sampling distribution, and the standard deviation of that distribution is called the standard error.

 

Variability of a Sampling Distribution

The variability of a sampling distribution is measured by its variance or its standard deviation. The variability of a sampling distribution depends on three factors:

  • N: The number of observations in the population.

  • n: The number of observations in the sample.

  • The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution has roughly the same standard error whether we sample with or without replacement. On the other hand, if the sample represents a significant fraction of the population (say, more than 1/20), the standard error will be meaningfully smaller when we sample without replacement.
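
The effect of the sampling fraction can be checked with a small simulation. The sketch below is illustrative only: the population (200 values drawn from a normal distribution), the seed, the sample sizes, and the helper name `empirical_standard_error` are assumptions chosen for the example and do not come from the text.

```python
import random
import statistics


def empirical_standard_error(population, n, replace, trials, rng):
    """Estimate the standard error of the mean by repeated sampling."""
    means = []
    for _ in range(trials):
        if replace:
            sample = rng.choices(population, k=n)  # with replacement
        else:
            sample = rng.sample(population, n)     # without replacement
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical population of size N = 200 (an assumption for this sketch).
population = [rng.gauss(50, 10) for _ in range(200)]

ratios = {}
for n in (5, 100):  # sampling fractions n/N = 0.025 and 0.5
    se_with = empirical_standard_error(population, n, True, 5000, rng)
    se_without = empirical_standard_error(population, n, False, 5000, rng)
    ratios[n] = se_without / se_with
    print(f"n = {n:3d}  with: {se_with:.3f}  without: {se_without:.3f}  "
          f"ratio: {ratios[n]:.3f}")
```

With the small sampling fraction the two standard errors should be nearly equal, while with half the population sampled the without-replacement value should be clearly smaller.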

 

Sampling Distribution of the Mean

Suppose we draw all possible samples of size n from a population of size N. Suppose further that we compute a mean score for each sample. In this way, we create a sampling distribution of the mean.

We know the following about the sampling distribution of the mean. The mean of the sampling distribution (μx) is equal to the mean of the population (μ). And the standard error of the sampling distribution (σx) is determined by the standard deviation of the population (σ), the population size (N), and the sample size (n). These relationships are shown in the equations below:

μx = μ

σx = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]

In the standard error formula, the factor sqrt[ (N - n ) / (N - 1) ] is called the finite population correction or fpc. When the population size is very large relative to the sample size, the fpc is approximately equal to one; and the standard error formula can be approximated by:

σx = σ / sqrt(n)

You often see this "approximate" formula in introductory statistics texts. As a general rule, it is safe to use the approximate formula when the sample size is no bigger than 1/20 of the population size.
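
For a very small population, the relationships for the mean can be verified by listing every possible sample. The sketch below assumes sampling without replacement and uses a made-up population of six values; the variable names are illustrative. Note that σ here is the population standard deviation (divisor N), which is what `statistics.pstdev` computes.

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8, 10, 12]  # hypothetical population, N = 6
N = len(population)
n = 3                              # sample size

mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population SD, divisor N

# Every possible sample of size n drawn without replacement.
sample_means = [statistics.fmean(s)
                for s in itertools.combinations(population, n)]

mean_of_means = statistics.fmean(sample_means)
se_enumerated = statistics.pstdev(sample_means)
se_formula = (sigma / math.sqrt(n)) * math.sqrt((N - n) / (N - 1))

print(f"mean of sample means = {mean_of_means:.4f}, population mean = {mu:.4f}")
print(f"SE by enumeration    = {se_enumerated:.4f}, SE by formula = {se_formula:.4f}")
```

Because n/N = 1/2 in this toy case, the finite population correction matters a great deal; dropping it would overstate the standard error.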

 

Sampling Distribution of the Proportion

In a population of size N, suppose that the probability of the occurrence of an event (dubbed a "success") is P; and the probability of the event's non-occurrence (dubbed a "failure") is Q. From this population, suppose that we draw all possible samples of size n. And finally, within each sample, suppose that we determine the proportion of successes p and failures q. In this way, we create a sampling distribution of the proportion.

We find that the mean of the sampling distribution of the proportion (μp) is equal to the probability of success in the population (P). And the standard error of the sampling distribution (σp) is determined by the standard deviation of the population (σ), the population size, and the sample size. These relationships are shown in the equations below:

μp = P

σp = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]

σp = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]

where σ = sqrt[ PQ ].

Like the formula for the standard error of the mean, the formula for the standard error of the proportion uses the finite population correction, sqrt[ (N - n ) / (N - 1) ]. When the population size is very large relative to the sample size, the fpc is approximately equal to one; and the standard error formula can be approximated by:

σp = sqrt[ PQ/n ]

You often see this "approximate" formula in introductory statistics texts. As a general rule, it is safe to use the approximate formula when the sample size is no bigger than 1/20 of the population size.
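
The formulas for the proportion can be sketched as a small helper. The function name `se_proportion`, its optional `N` argument, and the example numbers are assumptions made for illustration. The check at the end enumerates every sample (without replacement) from a made-up population of 4 successes and 6 failures.

```python
import itertools
import math
import statistics


def se_proportion(P, n, N=None):
    """Standard error of a sample proportion.

    If N is given, the finite population correction is applied;
    otherwise the approximate formula sqrt(PQ/n) is used.
    """
    Q = 1 - P
    se = math.sqrt(P * Q / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se


# Approximate versus corrected value for a hypothetical case.
print(se_proportion(0.3, 100))          # no correction
print(se_proportion(0.3, 100, N=1000))  # with fpc, slightly smaller

# Exhaustive check on a tiny population: 1 = success, 0 = failure.
population = [1] * 4 + [0] * 6
N, n = len(population), 3
P = sum(population) / N
proportions = [sum(s) / n for s in itertools.combinations(population, n)]

print(statistics.fmean(proportions), P)                        # mean equals P
print(statistics.pstdev(proportions), se_proportion(P, n, N))  # SE matches
```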

 

Central Limit Theorem

The central limit theorem states that the sampling distribution of the mean of independent, identically distributed random variables with finite variance will be normal or nearly normal if the sample size is large enough.

How large is "large enough"? The answer depends on two factors.

  • Requirements for accuracy. The more closely the sampling distribution needs to resemble a normal distribution, the more sample points will be required.

  • The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when the population distribution is roughly bell-shaped. Others recommend a sample size of at least 40. But if the original population is distinctly not normal (e.g., is badly skewed, has multiple peaks, and/or has outliers), researchers like the sample size to be even larger.
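
A simulation can show how sample size affects the shape of the sampling distribution. The sketch below is one possible illustration, not a rule for choosing n: it assumes a strongly skewed population (exponential with rate 1), a fixed seed, and uses skewness as a rough measure of departure from a symmetric bell shape. The helper `skewness` and the chosen sample sizes are assumptions for the example.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 4000           # number of simulated samples per sample size

skew_by_n = {}
for n in (1, 30, 300):
    # Each entry is the mean of one sample of size n from a skewed population.
    means = [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
             for _ in range(trials)]
    skew_by_n[n] = skewness(means)
    print(f"n = {n:3d}  skewness of sample means = {skew_by_n[n]:.3f}")
```

The skewness of the sample means should shrink toward zero as n grows, which is consistent with the sampling distribution becoming more nearly normal.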

 


How to Choose Between T-Distribution and Normal Distribution

The t-distribution and the normal distribution can both be used with statistics that have a bell-shaped distribution. This suggests that we might use either the t-distribution or the normal distribution to analyze sampling distributions. Which should we choose?

Guidelines exist to help you make that choice. Some focus on the population standard deviation.

  • If the population standard deviation is known, use the normal distribution.

  • If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

  • If the sample size is large, use the normal distribution. (See the discussion above in the section on the Central Limit Theorem to understand what is meant by a "large" sample.)

  • If the sample size is small, use the t-distribution.

In practice, researchers employ a mix of the above guidelines. On this site, we use the normal distribution when the population standard deviation is known and the sample size is large. We might use either distribution when the population standard deviation is unknown and the sample size is very large. We use the t-distribution when the sample size is small, unless the underlying distribution is not normal. The t-distribution should not be used with small samples from populations that are not approximately normal.
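
The mixed guideline in the preceding paragraph can be written as a small decision helper. This is only a sketch under stated assumptions: the function name `choose_distribution` is hypothetical, the cutoff of 30 for a "large" sample is borrowed from the Central Limit Theorem discussion and is not a fixed rule, and "large" and "very large" are treated as the same threshold for simplicity.

```python
def choose_distribution(n, sigma_known, roughly_normal, large_n=30):
    """Suggest a distribution following the mixed guideline above.

    n              -- sample size
    sigma_known    -- True if the population standard deviation is known
    roughly_normal -- True if the population is approximately normal
    large_n        -- assumed cutoff for a "large" sample (not a fixed rule)
    """
    if n < 1:
        raise ValueError("sample size must be at least 1")
    if n < large_n:
        # Small sample: t only if the population is approximately normal.
        return "t" if roughly_normal else "neither"
    # Large sample: normal if sigma is known, otherwise either may be used.
    return "normal" if sigma_known else "either"


print(choose_distribution(100, sigma_known=True, roughly_normal=False))
print(choose_distribution(12, sigma_known=False, roughly_normal=True))
print(choose_distribution(12, sigma_known=False, roughly_normal=False))
```

A return value of "neither" corresponds to the warning above about small samples from populations that are not approximately normal.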

The document Sampling distributions, CSIR-NET Mathematical Sciences | Mathematics for IIT JAM, GATE, CSIR NET, UGC NET is a part of the Mathematics Course Mathematics for IIT JAM, GATE, CSIR NET, UGC NET.