Various Modes of Convergence
Definitions
A sequence of random variables {Xn} converges in probability to a random variable X if, for every ε > 0,
P(|Xn − X| > ε) → 0 as n → ∞.
We write Xn →p X or plim Xn = X.
{Xn} converges in distribution to X if
Fn(z) → F(z) as n → ∞
for all z ∈ R such that z is a continuity point of F, where Fn and F denote the distribution functions of Xn and X. We write Xn →d X.
{Xn} converges almost surely to X if
P(ω : Xn(ω) → X(ω) as n → ∞) = 1.
We write Xn →a.s. X.
{Xn} converges in Lr-norm (in r-th mean) to X if
E|Xn − X|^r → 0 as n → ∞.
We denote this by Xn →Lr X. If r = 2, it is called mean square convergence and denoted as Xn →m.s. X.
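As a concrete illustration of mean square convergence (the random variables here are a standard choice, not taken from the original text): if X1, ..., Xn are i.i.d. with mean µ and variance σ² < ∞, the sample mean X̄n = n^{-1} Σ_{i=1}^n Xi satisfies

$$ E(\bar X_n - \mu)^2 = \operatorname{Var}(\bar X_n) = \frac{\sigma^2}{n} \longrightarrow 0, $$

so X̄n →m.s. µ.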
Relationship among various modes of convergence
In general: almost sure convergence implies convergence in probability; convergence in Lr-norm implies convergence in probability (for any r > 0); and convergence in probability implies convergence in distribution. None of the reverse implications holds in general, as the following counterexamples show.
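To see why convergence in Lr-norm implies convergence in probability (a one-line argument using Markov's inequality, Proposition 10 below):

$$ P(|X_n - X| > \varepsilon) \;\le\; \frac{E|X_n - X|^r}{\varepsilon^r} \;\longrightarrow\; 0 \quad \text{for every } \varepsilon > 0. $$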
Example 1 Convergence in distribution does not imply convergence in probability.
⇒ Let Ω = {ω1, ω2, ω3, ω4} and assign equal probability 1/4 to each point. Define the random variables Xn and X such that
Xn(ω1) = Xn(ω2) = 1, Xn(ω3) = Xn(ω4) = 0 for all n,
X(ω1) = X(ω2) = 0, X(ω3) = X(ω4) = 1.
Then Xn and X have the same distribution. Since Fn(x) = F(x) for all n, it is trivial that Xn →d X. However, note that |Xn(ω) − X(ω)| = 1 for all n and ω. Hence, P(|Xn − X| > ε) = 1 for any 0 < ε < 1, and Xn does not converge in probability to X.
Example 2 Convergence in probability does not imply almost sure convergence.
⇒ Consider the sequence of independent random variables {Xn} such that
P(Xn = 1) = 1/n, P(Xn = 0) = 1 − 1/n.
Obviously, for any 0 < ε < 1, we have
P(|Xn − 0| > ε) = P(Xn = 1) = 1/n → 0.
Hence Xn →p 0. In order to show that Xn does not converge to 0 almost surely, we need the following lemma.

Lemma 3 Let An(ε) = {ω : |Xn(ω) − X(ω)| > ε} and Bm(ε) = ∪_{n≥m} An(ε). Then Xn →a.s. X if and only if P(Bm(ε)) → 0 as m → ∞ for every ε > 0.

Proof. Let C = {ω : Xn(ω) → X(ω) as n → ∞} and A(ε) = {ω : ω ∈ An(ε) i.o.}. Then, P(C) = 1 if and only if P(A(ε)) = 0 for all ε > 0. Moreover, Bm(ε) is a decreasing sequence of events with Bm(ε) ↓ A(ε) as m → ∞, and so P(A(ε)) = 0 if and only if P(Bm(ε)) → 0 as m → ∞.

Continuing the counterexample, by independence we have, for 0 < ε < 1,
P(Bm(ε)) = 1 − P(Xn = 0 for all n ≥ m) = 1 − ∏_{n≥m} (1 − 1/n) = 1 for every m,
since ∏_{n=m}^{N} (1 − 1/n) = (m − 1)/N → 0 as N → ∞. Hence, P(Bm(ε)) does not converge to 0, and Xn does not converge to 0 almost surely.
Example 4 Convergence in probability does not imply convergence in Lr-norm.
⇒ Let {Xn} be a sequence of random variables such that
P(Xn = e^n) = 1/n, P(Xn = 0) = 1 − 1/n.
Then, for any ε > 0 we have
P(|Xn − 0| > ε) = 1/n → 0.
Hence, Xn →p 0. However, for each r > 0,
E|Xn − 0|^r = e^{rn} · (1/n) → ∞ as n → ∞.
Hence, Xn does not converge to 0 in Lr-norm for any r > 0.
Some useful theorems
Theorem 5 Let {Xn} be a sequence of random vectors with a fixed finite number of elements, and let g be a real-valued function continuous at a constant vector point α. If Xn →p α, then g(Xn) →p g(α).
⇒ By continuity of g at α, for any ε > 0 we can find δ > 0 such that ||Xn − α|| < δ implies |g(Xn) − g(α)| < ε. Therefore,
P(||Xn − α|| < δ) ≤ P(|g(Xn) − g(α)| < ε).
Since Xn →p α, the left-hand side tends to 1, and hence P(|g(Xn) − g(α)| < ε) → 1, i.e., g(Xn) →p g(α).
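As a simple application of Theorem 5 (with g chosen here for concreteness): if plim X̄n = µ, then taking the continuous function g(x) = x² gives

$$ \operatorname{plim} \bar X_n^{\,2} = \mu^2 . $$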
Theorem 6 Suppose that Xn →d X and Yn →p α, where α is non-stochastic. Then,
(i) Xn + Yn →d X + α,
(ii) XnYn →d αX,
(iii) Xn/Yn →d X/α, provided α ≠ 0.
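A standard use of Theorem 6 (sketched as an illustration): if √n(X̄n − µ) →d N(0, σ²) and sn →p σ with σ > 0, then part (iii) gives the studentized statistic

$$ \frac{\sqrt{n}\,(\bar X_n - \mu)}{s_n} \xrightarrow{\;d\;} N(0, 1). $$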
Theorem 7 Let {Xn} be a sequence of random vectors with a fixed finite number of elements, and let g be a continuous real-valued function. If Xn →d X, then g(Xn) →d g(X).
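For example, since g(x) = x² is continuous, Zn →d N(0, 1) implies

$$ Z_n^2 \xrightarrow{\;d\;} \chi^2(1). $$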
Theorem 8 Suppose that Xn − Yn →p 0 and Yn →d Y. Then Xn →d Y.
Inequalities frequently used in large sample theory
Proposition 9 (Chebyshev's inequality) For ε > 0,
P(|X − E(X)| ≥ ε) ≤ Var(X)/ε².
Proposition 10 (Markov's inequality) For ε > 0 and p > 0,
P(|X| ≥ ε) ≤ E|X|^p / ε^p.
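Chebyshev's inequality follows from Markov's inequality by applying it to the variable X − E(X) with p = 2:

$$ P(|X - E(X)| \ge \varepsilon) \;\le\; \frac{E|X - E(X)|^2}{\varepsilon^2} \;=\; \frac{\operatorname{Var}(X)}{\varepsilon^2}. $$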
Proposition 11 (Jensen’s inequality) If a function φ is convex on an interval I containing the support of a random variable X, then
φ(E(X)) ≤ E(φ(X)).
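For instance, taking φ(x) = x², which is convex on all of R, Jensen's inequality gives

$$ (E(X))^2 \le E(X^2), $$

which is just the statement that Var(X) = E(X²) − (E(X))² ≥ 0.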
Proposition 12 (Cauchy-Schwarz inequality) For random variables X and Y,
E|XY| ≤ (E(X²))^{1/2} (E(Y²))^{1/2}.
Proposition 13 (Hölder's inequality) For any p ≥ 1,
E|XY| ≤ (E|X|^p)^{1/p} (E|Y|^q)^{1/q},
where q satisfies 1/p + 1/q = 1.
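Note that the Cauchy-Schwarz inequality is the special case p = q = 2 of Hölder's inequality:

$$ E|XY| \le (E|X|^2)^{1/2} (E|Y|^2)^{1/2}. $$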
Proposition 14 (Liapounov's inequality) If r > p > 0,
(E|X|^p)^{1/p} ≤ (E|X|^r)^{1/r}.
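Liapounov's inequality can be derived from Jensen's inequality: since φ(y) = y^{r/p} is convex for r > p > 0, applying it to the random variable |X|^p gives

$$ (E|X|^p)^{r/p} \;\le\; E\bigl[(|X|^p)^{r/p}\bigr] \;=\; E|X|^r, $$

and taking r-th roots yields the result.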
Proposition 15 (Minkowski's inequality) For r ≥ 1,
(E|X + Y|^r)^{1/r} ≤ (E|X|^r)^{1/r} + (E|Y|^r)^{1/r}.
Proposition 16 (Loève's cr inequality) For r > 0,
E|Σ_{i=1}^m Xi|^r ≤ cr Σ_{i=1}^m E|Xi|^r,
where cr = 1 when 0 < r ≤ 1, and cr = m^{r−1} when r > 1.
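For example, with m = 2 and r = 2 the inequality reads

$$ E|X_1 + X_2|^2 \;\le\; 2\bigl(E|X_1|^2 + E|X_2|^2\bigr), $$

which also follows directly from (a + b)² ≤ 2a² + 2b².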
Laws of Large Numbers
Given restrictions on the dependence, heterogeneity, and moments of a sequence of random variables {Xi}, a law of large numbers states that the sample mean converges in some mode to a parameter value (typically the population mean). When the convergence is in the probability sense, we call it a weak law of large numbers; when it is in the almost sure sense, it is called a strong law of large numbers.
Theorem 17 (Kolmogorov SLLN I) Let {Xi} be a sequence of independently and identically distributed random variables. Then X̄n = n^{-1} Σ_{i=1}^n Xi →a.s. µ if and only if E|Xi| < ∞ and E(Xi) = µ.
Remark 18 The above theorem requires the existence of the first moment only. However, the restriction on dependence and heterogeneity is quite severe: the theorem requires an i.i.d. (random) sample, which is rarely the case in econometrics. Note that the theorem is stated in necessary and sufficient form. Since almost sure convergence always implies convergence in probability, the theorem can also be stated as X̄n →p µ; it is then a weak law of large numbers.
Theorem 19 (Kolmogorov SLLN II) Let {Xi} be a sequence of independently distributed random variables with finite variances Var(Xi) = σi² < ∞. If Σ_{i=1}^∞ σi²/i² < ∞, then X̄n − µ̄n →a.s. 0, where µ̄n = n^{-1} Σ_{i=1}^n E(Xi).
Remark 20 Here we allow for heterogeneity of the distributions in exchange for the existence of second moments, but the variables still have to be independent. The intuition for the summation condition is that the variances must not grow too fast, so that the variance of the sample mean shrinks to zero.
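As a quick check of the summation condition: if the variances are uniformly bounded, say σi² ≤ σ² for all i, then

$$ \sum_{i=1}^{\infty} \frac{\sigma_i^2}{i^2} \;\le\; \sigma^2 \sum_{i=1}^{\infty} \frac{1}{i^2} \;=\; \frac{\sigma^2 \pi^2}{6} \;<\; \infty, $$

so the SLLN applies; by contrast, variances growing like σi² = c·i² violate the condition.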
Theorem 21 (Markov SLLN) Let {Xi} be a sequence of independently distributed random variables with finite means E(Xi) = µi < ∞. If for some δ > 0,
Σ_{i=1}^∞ E|Xi − µi|^{1+δ} / i^{1+δ} < ∞,
then X̄n − µ̄n →a.s. 0.
Remark 22 When δ = 1, the theorem collapses to Kolmogorov SLLN II. Here we do not need the existence of the second moment; all we need is the existence of a moment of order (1 + δ) for some δ > 0.
Theorem 23 (Ergodic theorem) Let {Xi} be a (strictly) stationary and ergodic sequence with E|Xi| < ∞. Then,
X̄n →a.s. E(Xi) = µ.
Remark 24 By stationarity, we have E(Xi) = µ for all i, and ergodicity enables us, roughly speaking, to estimate µ by the sample mean of a single realization of the process. Both stationarity and ergodicity are restrictions on the dependence structure, which can be quite severe for econometric data.
Theorem 25 (McLeish) Let {Xi} be a sequence with uniform mixing coefficients of size r/(2r − 1), r ≥ 1, or strong mixing coefficients of size r/(r − 1), r > 1, with finite means E(Xi) = µi. If for some δ > 0, E|Xi|^{r+δ} ≤ Δ < ∞ for all i, then X̄n − µ̄n →a.s. 0.

A sequence {Xt} is called a martingale difference sequence if
E(Xt | Ft−1) = 0 for all t,
where Ft−1 = σ(Xt−1, Xt−2, ···), i.e., the information up to time (t − 1).

Theorem 26 (Chow) Let {Xi} be a martingale difference sequence. If for some r ≥ 1,
Σ_{i=1}^∞ E|Xi|^{2r} / i^{1+r} < ∞,
then X̄n →a.s. 0.
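A concrete example of a martingale difference sequence that is not an independent sequence (a standard illustration; the process here is not from the original text): let {εt} be i.i.d. with E(εt) = 0 and E(εt²) < ∞, and set Xt = εt εt−1. Since εt is independent of (εt−1, εt−2, ...),

$$ E(X_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots) \;=\; \varepsilon_{t-1}\, E(\varepsilon_t) \;=\; 0, $$

and hence E(Xt | Ft−1) = 0 by the law of iterated expectations, because Ft−1 = σ(Xt−1, Xt−2, ···) is contained in σ(εt−1, εt−2, ···). Yet Xt and Xt−1 both involve εt−1, so the sequence is not independent in general.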
Central Limit Theorems
Theorem 27 (Lindeberg-Levy CLT) Let {Xi} be a sequence of i.i.d. random variables with E(Xi) = µ and Var(Xi) = σ² < ∞. Then
√n (X̄n − µ)/σ →d N(0, 1).

Remark 28 The conclusion of the theorem can also be written as √n(X̄n − µ) →d N(0, σ²). We require the existence of the second moment even though we have an i.i.d. sample (compare this with the LLN, which needs only the first moment).
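For example, if the Xi are i.i.d. Bernoulli(p), then µ = p and σ² = p(1 − p), so Theorem 27 gives

$$ \sqrt{n}\,(\bar X_n - p) \xrightarrow{\;d\;} N\bigl(0,\; p(1-p)\bigr). $$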
Theorem 29 (Lindeberg-Feller CLT) Let {Xi} be a sequence of independently distributed random variables with E(Xi) = µi, Var(Xi) = σi² < ∞, and distribution functions Fi(x). Let σ̄n² = n^{-1} Σ_{i=1}^n σi². Then
√n (X̄n − µ̄n)/σ̄n →d N(0, 1) and max_{1≤i≤n} σi²/(n σ̄n²) → 0
if and only if, for every ε > 0,
(1/(n σ̄n²)) Σ_{i=1}^n ∫_{|x−µi| > ε√n σ̄n} (x − µi)² dFi(x) → 0 as n → ∞.
Remark 30 The displayed condition is called the "Lindeberg condition". It controls the tail behavior of the Xi so that the scaled sample mean has a proper limiting distribution; identical distributions are not needed here. However, the condition is difficult to verify in practice, and the search for a sufficient condition for the Lindeberg condition leads to the following CLT.
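As a sanity check (a standard verification), in the i.i.d. case with variance σ² the Lindeberg condition reduces to

$$ \frac{1}{\sigma^2}\, E\!\left[(X_1-\mu)^2\, \mathbf{1}\{|X_1-\mu| > \varepsilon \sqrt{n}\,\sigma\}\right] \;\longrightarrow\; 0, $$

which holds by dominated convergence, so Theorem 29 contains the Lindeberg-Levy CLT as a special case.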
Theorem 31 (Liapounov CLT) Let {Xi} be a sequence of independently distributed random variables with E(Xi) = µi and Var(Xi) = σi² < ∞. If for some δ > 0, E|Xi − µi|^{2+δ} ≤ Δ < ∞ for all i, and σ̄n² ≥ σ² > 0 for all n sufficiently large, then √n(X̄n − µ̄n)/σ̄n →d N(0, 1).

Remark 32 We can show that the moment restrictions in the theorem are enough to obtain the Lindeberg condition.
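The key step (a standard argument, using the notation of Theorem 31 above) is the bound: for any c > 0,

$$ \int_{|x-\mu_i| > c} (x-\mu_i)^2\, dF_i(x) \;\le\; \frac{1}{c^{\delta}} \int_{|x-\mu_i| > c} |x-\mu_i|^{2+\delta}\, dF_i(x) \;\le\; \frac{E|X_i-\mu_i|^{2+\delta}}{c^{\delta}}. $$

With c = ε√n σ̄n, the averaged left-hand side in the Lindeberg condition is at most Δ/(σ² (ε√n σ)^δ) → 0.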
Theorem 33 (CLT for stationary ergodic martingale differences) Let {Xi} be a stationary and ergodic martingale difference sequence with E(Xi²) = σ² < ∞. Then √n X̄n →d N(0, σ²).

Remark 34 The above theorem allows some dependence structure but retains homogeneity through stationarity and ergodicity.
Theorem 35 (Wooldridge-White) Let {Xi} be a sequence with uniform mixing coefficients of size r/(2(r − 1)), r ≥ 2, or strong mixing coefficients of size r/(r − 2), r > 2, with E(Xi) = µi and E|Xi|^r ≤ Δ < ∞ for all i. If in addition σ̄n² = Var(√n X̄n) satisfies σ̄n² ≥ σ² > 0 for all n sufficiently large, then √n(X̄n − µ̄n)/σ̄n →d N(0, 1).

Remark 36 The above CLT is quite general in the sense that it allows dependence and heterogeneity structures reasonable enough for econometric data. However, as the statement of the theorem shows, its conditions are impractical to verify in applications.