Definition
The order statistics of a random sample X1,...,Xn are the sample values placed in ascending order. They are denoted by X(1),...,X(n).
The order statistics are random variables that satisfy X(1) ≤ X(2) ≤ ··· ≤ X(n). The following are some statistics that are easily defined in terms of the order statistics.
The sample range, R = X(n) − X(1), is the distance between the smallest and largest observations. It is a measure of the dispersion in the sample and should reflect the dispersion in the population.
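The sample median, which we will denote by M, is a number such that approximately one-half of the observations are less than M and one-half are greater. In terms of order statistics, M is defined by
$$
M = \begin{cases}
X_{((n+1)/2)} & \text{if } n \text{ is odd,} \\
\left( X_{(n/2)} + X_{(n/2+1)} \right)/2 & \text{if } n \text{ is even.}
\end{cases}
$$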
The median is a measure of location that might be considered an alternative to the sample mean. One advantage of the sample median over the sample mean is that it is less affected by extreme observations.
For any number p between 0 and 1, the (100p)th sample percentile is the observation such that approximately np of the observations are less than this observation and n(1−p) of the observations are greater. The 50th percentile is the sample median, the 25th percentile is called the lower quartile, and the 75th percentile is called the upper quartile. A measure of dispersion that is sometimes used is the interquartile range, the distance between the lower and upper quartiles.
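The statistics defined above can be read directly off the sorted sample. The following is a minimal Python sketch, not a reference implementation. The function names are illustrative, and the percentile rule used here (take X(k) with k = ⌈np⌉) is only one of several common conventions, chosen as an assumption for this example.

```python
import math

def order_statistics(sample):
    """Return the sample values in ascending order: X(1), ..., X(n)."""
    return sorted(sample)

def sample_range(sample):
    """R = X(n) - X(1)."""
    xs = order_statistics(sample)
    return xs[-1] - xs[0]

def sample_median(sample):
    """Middle order statistic for odd n, average of the two middle ones for even n."""
    xs = order_statistics(sample)
    n = len(xs)
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]            # X((n+1)/2), shifted to 0-based indexing
    return (xs[n // 2 - 1] + xs[n // 2]) / 2   # (X(n/2) + X(n/2+1)) / 2

def sample_percentile(sample, p):
    """(100p)th percentile taken as X(k) with k = ceil(n*p), clamped to 1..n.

    Assumption: this rounding rule is one convention among several.
    """
    xs = order_statistics(sample)
    n = len(xs)
    k = min(max(math.ceil(n * p), 1), n)
    return xs[k - 1]

# Small sample with one extreme observation (40).
data = [7, 2, 9, 4, 5, 3, 8, 6, 40]
lower_quartile = sample_percentile(data, 0.25)
upper_quartile = sample_percentile(data, 0.75)
interquartile_range = upper_quartile - lower_quartile
mean = sum(data) / len(data)
```

In this sample the extreme value 40 pulls the mean above 9, while the median stays at 6, which illustrates why the median is less affected by extreme observations.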
Theorem 5.3.4
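Let X1,...,Xn be a random sample from a discrete distribution with pmf fX(xi) = pi, where x1 < x2 < ··· are the possible values of X in ascending order. Define
$$
P_0 = 0, \quad P_1 = p_1, \quad P_2 = p_1 + p_2, \quad \ldots, \quad P_i = p_1 + p_2 + \cdots + p_i, \quad \ldots
$$
Let X(1),...,X(n) denote the order statistics from the sample. Then
$$
P(X_{(j)} \le x_i) = \sum_{k=j}^{n} \binom{n}{k} P_i^{k} (1 - P_i)^{n-k}
$$
and
$$
P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[ P_i^{k} (1 - P_i)^{n-k} - P_{i-1}^{k} (1 - P_{i-1})^{n-k} \right].
$$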
Proof: Fix i, and let Y be a random variable that counts the number of X1,...,Xn that are less than or equal to xi. For each of X1,...,Xn, call the event {Xj ≤ xi} a “success” and {Xj > xi} a “failure”. Then Y is the number of successes in n trials, and the probability of a success is Pi = P(Xj ≤ xi). Thus, Y ∼ binomial(n, Pi). The event {X(j) ≤ xi} is equivalent to {Y ≥ j}; that is, at least j of the sample values are less than or equal to xi. The first equation is therefore the binomial probability P(Y ≥ j), and the second follows from P(X(j) = xi) = P(X(j) ≤ xi) − P(X(j) ≤ x_{i−1}). □
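The binomial argument in the proof can be checked numerically. The sketch below is illustrative only: it computes P(X(j) ≤ xi) as the binomial tail P(Y ≥ j) and compares the resulting pmf with a brute-force enumeration for a small case. The fair-die example and all function names are assumptions introduced for this check.

```python
from fractions import Fraction
from itertools import product
from math import comb

def order_stat_cdf(j, n, P_i):
    """P(X(j) <= x_i) = P(Y >= j), where Y ~ binomial(n, P_i)."""
    return sum(comb(n, k) * P_i**k * (1 - P_i)**(n - k) for k in range(j, n + 1))

def order_stat_pmf(j, n, P_i, P_prev):
    """P(X(j) = x_i) = P(X(j) <= x_i) - P(X(j) <= x_{i-1})."""
    return order_stat_cdf(j, n, P_i) - order_stat_cdf(j, n, P_prev)

def brute_force_pmf(j, n, values, probs):
    """Enumerate every possible sample and accumulate the pmf of the j-th smallest value."""
    pmf = {v: Fraction(0) for v in values}
    for idx in product(range(len(values)), repeat=n):
        weight = Fraction(1)
        for t in idx:
            weight *= probs[t]
        jth_smallest = sorted(values[t] for t in idx)[j - 1]
        pmf[jth_smallest] += weight
    return pmf

# Hypothetical example: median (j = 2) of n = 3 rolls of a fair die.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
n, j = 3, 2

# Cumulative probabilities P_0 = 0, P_1, ..., P_6.
cumulative = [Fraction(0)]
for p in probs:
    cumulative.append(cumulative[-1] + p)

formula_pmf = {
    values[i - 1]: order_stat_pmf(j, n, cumulative[i], cumulative[i - 1])
    for i in range(1, len(values) + 1)
}
enumerated_pmf = brute_force_pmf(j, n, values, probs)
```

Exact rational arithmetic (`Fraction`) is used so that the formula and the enumeration can be compared for equality rather than within a floating-point tolerance.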
Theorem 5.4.4
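Let X(1),...,X(n) denote the order statistics of a random sample, X1,...,Xn, from a continuous population with cdf FX(x) and pdf fX(x). Then the pdf of X(j) is
$$
f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!} \, f_X(x) \, [F_X(x)]^{j-1} \, [1 - F_X(x)]^{n-j}.
$$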
Example (Uniform order statistics pdf)
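Let X1,...,Xn be iid uniform(0,1), so fX(x) = 1 for x ∈ (0,1) and FX(x) = x for x ∈ (0,1). Thus, the pdf of the jth order statistic is
$$
f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!} \, x^{j-1} (1-x)^{n-j} = \frac{\Gamma(n+1)}{\Gamma(j)\,\Gamma(n-j+1)} \, x^{j-1} (1-x)^{(n-j+1)-1}
$$
for x ∈ (0,1). Hence, X(j) ∼ Beta(j, n−j+1). From this we can deduce that
$$
E\,X_{(j)} = \frac{j}{n+1}
$$
and
$$
\operatorname{Var} X_{(j)} = \frac{j\,(n-j+1)}{(n+1)^2 (n+2)}.
$$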
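As a numerical sanity check on the Beta(j, n−j+1) result, the sketch below simulates the jth order statistic of n uniform(0,1) draws and compares the sample mean and variance with the standard Beta(a, b) moments, a/(a+b) and ab/((a+b)²(a+b+1)). The choices n = 5, j = 2, the seed, and the number of replications are arbitrary assumptions for illustration.

```python
import random
from math import factorial

def uniform_order_stat_pdf(x, j, n):
    """pdf of X(j) for an iid uniform(0,1) sample of size n."""
    if not 0 < x < 1:
        return 0.0
    c = factorial(n) / (factorial(j - 1) * factorial(n - j))
    return c * x**(j - 1) * (1 - x)**(n - j)

def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b)**2 * (a + b + 1))
    return mean, var

def simulate_order_stat(j, n, reps, seed=0):
    """Draw `reps` independent copies of X(j) from uniform(0,1) samples of size n."""
    rng = random.Random(seed)
    return [sorted(rng.random() for _ in range(n))[j - 1] for _ in range(reps)]

n, j = 5, 2
draws = simulate_order_stat(j, n, reps=20000)
sim_mean = sum(draws) / len(draws)
sim_var = sum((d - sim_mean)**2 for d in draws) / (len(draws) - 1)

# Beta(j, n - j + 1) gives mean j/(n+1) and variance j(n-j+1)/((n+1)^2 (n+2)).
theory_mean, theory_var = beta_mean_var(j, n - j + 1)
```

With these settings the theoretical mean is 2/6 and the theoretical variance is 8/252; the simulated values should agree with them up to Monte Carlo error.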
Theorem 5.4.6
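Let X(1),...,X(n) denote the order statistics of a random sample, X1,...,Xn, from a continuous population with cdf FX(x) and pdf fX(x). Then the joint pdf of X(i) and X(j), 1 ≤ i < j ≤ n, is
$$
f_{X_{(i)},X_{(j)}}(u,v) = \frac{n!}{(i-1)!\,(j-1-i)!\,(n-j)!} \, f_X(u) \, f_X(v) \, [F_X(u)]^{i-1} \, [F_X(v) - F_X(u)]^{j-1-i} \, [1 - F_X(v)]^{n-j}
$$
for −∞ < u < v < ∞.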
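The joint pdf of three or more order statistics could be derived using similar but even more involved arguments. Perhaps the most useful of these other pdfs is fX(1),...,X(n)(x1,...,xn), the joint pdf of all the order statistics, which is given by
$$
f_{X_{(1)},\ldots,X_{(n)}}(x_1,\ldots,x_n) =
\begin{cases}
n! \, f_X(x_1) \cdots f_X(x_n) & \text{if } -\infty < x_1 < \cdots < x_n < \infty, \\
0 & \text{otherwise.}
\end{cases}
$$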
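The joint densities of Theorem 5.4.6 and of the full set of order statistics can be evaluated directly once the population pdf and cdf are available as functions. The sketch below is a hedged illustration: the function names, the uniform(0,1) test population, and the crude midpoint-rule integration are all assumptions made for this example.

```python
from math import factorial

def joint_pdf_pair(u, v, i, j, n, pdf, cdf):
    """Joint pdf of (X(i), X(j)), 1 <= i < j <= n, evaluated at (u, v)."""
    if not u < v:
        return 0.0
    c = factorial(n) / (factorial(i - 1) * factorial(j - 1 - i) * factorial(n - j))
    return (c * pdf(u) * pdf(v)
            * cdf(u)**(i - 1)
            * (cdf(v) - cdf(u))**(j - 1 - i)
            * (1 - cdf(v))**(n - j))

def joint_pdf_all(xs, pdf):
    """Joint pdf of (X(1), ..., X(n)): n! * prod f(x_k) on x_1 < ... < x_n, else 0."""
    if any(a >= b for a, b in zip(xs, xs[1:])):
        return 0.0
    value = float(factorial(len(xs)))
    for x in xs:
        value *= pdf(x)
    return value

# Uniform(0,1) population, used here only as a convenient test case.
def unif_pdf(x):
    return 1.0 if 0 < x < 1 else 0.0

def unif_cdf(x):
    return min(max(x, 0.0), 1.0)

def grid_integral(i, j, n, m=200):
    """Midpoint-rule approximation of the integral of the pair density over the unit square."""
    h = 1.0 / m
    total = 0.0
    for a in range(m):
        u = (a + 0.5) * h
        for b in range(m):
            v = (b + 0.5) * h
            total += joint_pdf_pair(u, v, i, j, n, unif_pdf, unif_cdf) * h * h
    return total

# Minimum and maximum of n = 3 uniforms: the density reduces to 6 (v - u) on u < v.
min_max_density = joint_pdf_pair(0.2, 0.7, 1, 3, 3, unif_pdf, unif_cdf)
approx_total_mass = grid_integral(2, 3, 4)
```

The grid integral should come out close to 1, as any joint density must; the small shortfall comes from the grid cells that straddle the line u = v.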