Donsker's Theorem Proof Explained
Donsker's Theorem: An Overview
Donsker’s Theorem, in the form discussed here, is a fundamental result in probability theory that connects the empirical distribution function of a sequence of independent and identically distributed (i.i.d.) random variables with the Brownian bridge process. The name is also used for the closely related Donsker invariance principle (the functional central limit theorem), which says that rescaled partial sums of i.i.d. random variables converge to Brownian motion. Both results are named after Monroe D. Donsker, who proved them in the early 1950s.
Statement of Donsker's Theorem
Let X_1, X_2, \ldots be a sequence of i.i.d. real-valued random variables with common cumulative distribution function F(x) = \mathbb{P}(X_1 \leq x). No moment assumptions are needed. Define the empirical distribution function as
\[F_n(x) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{\{X_i \leq x\}},\]
where \mathbf{1}_{\{X_i \leq x\}} is the indicator function. Donsker’s Theorem states that the stochastic process
\[\sqrt{n} (F_n(x) - F(x))\]
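converges in distribution, as a random function of x, to the process B_0(F(x)) as n \to \infty, where B_0(t), 0 \leq t \leq 1, is a standard Brownian bridge and F(x) is the cumulative distribution function of the random variables X_i. The convergence concerns whole sample paths, in the Skorokhod space of right-continuous functions with left limits, not just the value at a single x.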
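To make the limit concrete, the following is a minimal simulation sketch, assuming Uniform(0, 1) samples (so F(x) = x) and NumPy. It evaluates \sqrt{n} (F_n(x) - F(x)) at two points and compares the sample covariance with the Brownian bridge covariance \min(s, t) - st. The function name `empirical_process` and the parameter choices are illustrative assumptions, not part of the theorem.

```python
import numpy as np

def empirical_process(samples, points, cdf):
    """Evaluate sqrt(n) * (F_n(x) - F(x)) at each x in `points`.

    `samples` has shape (reps, n); the result has shape (reps, len(points)).
    """
    n = samples.shape[1]
    # F_n(x) is the fraction of each sample that is <= x.
    f_n = (samples[:, :, None] <= points[None, None, :]).mean(axis=1)
    return np.sqrt(n) * (f_n - cdf(points))

rng = np.random.default_rng(0)
n, reps = 500, 4000
points = np.array([0.3, 0.7])
samples = rng.uniform(size=(reps, n))  # Uniform(0, 1), so F(x) = x
values = empirical_process(samples, points, cdf=lambda x: x)

# Sample covariance across replications (columns are the two points).
simulated_cov = np.cov(values, rowvar=False)

# Brownian bridge covariance: Cov(B_0(s), B_0(t)) = min(s, t) - s * t.
s, t = points
bridge_cov = np.array([[s * (1 - s), s - s * t],
                       [s - s * t, t * (1 - t)]])

print(np.round(simulated_cov, 3))
print(bridge_cov)
```

The two matrices should agree up to Monte Carlo error. Matching covariances at finitely many points is only part of the statement; the theorem also asserts convergence of the whole path.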
Proof of Donsker's Theorem
The proof of Donsker’s Theorem is based on the following steps:
- Representation of the empirical distribution function: We can represent the empirical distribution function F_n(x) as
\[F_n(x) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{\{X_i \leq x\}} = \frac{1}{n} \sum_{i=1}^n Y_i(x),\]
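where Y_i(x) = \mathbf{1}_{\{X_i \leq x\}}. For each fixed x, the Y_i(x) are i.i.d. Bernoulli random variables with mean F(x) and variance F(x)(1 - F(x)), so F_n(x) is an average of i.i.d. bounded terms.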
- Convergence of the finite-dimensional distributions: We need to show that the finite-dimensional distributions of the stochastic process \sqrt{n} (F_n(x) - F(x)) converge to those of the limit process B_0(F(x)). That is, for any k and any x_1, \ldots, x_k, the joint distribution of \sqrt{n} (F_n(x_1) - F(x_1)), \ldots, \sqrt{n} (F_n(x_k) - F(x_k)) converges to the joint distribution of B_0(t_1), \ldots, B_0(t_k), where t_j = F(x_j). This follows from the multivariate central limit theorem applied to the i.i.d. bounded vectors (Y_i(x_1), \ldots, Y_i(x_k)): their covariance is F(\min(x_i, x_j)) - F(x_i) F(x_j), which equals the Brownian bridge covariance \min(t_i, t_j) - t_i t_j.
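- Tightness of the stochastic process: We need to show that the sequence of processes \sqrt{n} (F_n(x) - F(x)) is tight, meaning that for any \epsilon > 0 there exists a compact set K in the function space of sample paths (not in \mathbb{R}) such that the process lies in K with probability greater than 1 - \epsilon for every n. A standard sufficient condition controls the oscillations of the paths: for every \epsilon > 0,
\[\lim_{\delta \to 0} \limsup_{n \to \infty} \mathbb{P} \left( \sup_{|F(x) - F(y)| < \delta} \left| \sqrt{n} (F_n(x) - F(x)) - \sqrt{n} (F_n(y) - F(y)) \right| > \epsilon \right) = 0.\]
The process \sqrt{n} (F_n(x) - F(x)) is not itself a martingale in x. However, after reducing to Uniform(0, 1) samples via X_i = F^{-1}(U_i), the ratio (F_n(t) - t) / (1 - t) is a martingale in t on [0, 1), so a martingale maximal inequality can be used to bound the oscillations. Moment bounds on the increments of the process are a common alternative.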
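The oscillation condition behind tightness can be explored numerically. The sketch below is an illustration under assumptions rather than a proof: it uses Uniform(0, 1) samples, evaluates the empirical process on a fixed grid, and measures the largest increment over grid points at most \delta apart. The helper names `uniform_empirical_process` and `oscillation` are hypothetical.

```python
import numpy as np

def uniform_empirical_process(samples, grid):
    """Return sqrt(n) * (F_n(t) - t) on `grid` for Uniform(0, 1) samples."""
    n = samples.size
    # With side="right", searchsorted counts the sample points <= t.
    counts = np.searchsorted(np.sort(samples), grid, side="right")
    return np.sqrt(n) * (counts / n - grid)

def oscillation(path, max_lag):
    """Largest |path[i] - path[j]| over grid indices with 0 < |i - j| <= max_lag."""
    return max(
        np.abs(path[lag:] - path[:-lag]).max()
        for lag in range(1, max_lag + 1)
    )

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 201)  # grid spacing 0.005
path = uniform_empirical_process(rng.uniform(size=2000), grid)

w_small = oscillation(path, max_lag=2)   # delta = 0.01
w_large = oscillation(path, max_lag=40)  # delta = 0.2
print(w_small, w_large)
```

On any single path the oscillation over the smaller window cannot exceed the one over the larger window. Tightness is the stronger statement that the small-window oscillation becomes small in probability, uniformly in large n, as \delta \to 0. A grid only approximates the supremum over all pairs of points.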
Key Lemmas and Theorems
The proof of Donsker’s Theorem relies on several key lemmas and theorems, including:
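- The Central Limit Theorem: If X_1, X_2, \ldots is a sequence of i.i.d. random variables with mean 0 and variance 1, then the normalized sum n^{-1/2} \sum_{i=1}^n X_i converges in distribution to a standard normal distribution as n \to \infty. The Lindeberg-Feller version extends this to triangular arrays of independent summands that need not be identically distributed, provided the Lindeberg condition is satisfied. For the finite-dimensional distributions, the multivariate form applied to vectors of indicators is what is needed.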
- The martingale inequality (Doob’s maximal inequality): If M_1, \ldots, M_n is a martingale, then for any \epsilon > 0,
\[\mathbb{P} \left( \max_{1 \leq k \leq n} |M_k| \geq \epsilon \right) \leq \frac{1}{\epsilon} \mathbb{E} \left[ |M_n| \right].\]
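As a sanity check on the maximal inequality, here is a small Monte Carlo sketch. It assumes the simplest martingale, a symmetric \pm 1 random walk, and compares the estimated probability that the running maximum of |M_k| reaches a level with Doob’s bound \mathbb{E}[|M_n|] / \epsilon. The walk length, level, and number of replications are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, reps, level = 100, 20000, 20.0

# Each row is one path of a symmetric +/-1 random walk, which is a martingale.
steps = rng.choice([-1, 1], size=(reps, n_steps))
walks = steps.cumsum(axis=1)

# Left side: estimated P(max_{k <= n} |M_k| >= level).
lhs = np.mean(np.abs(walks).max(axis=1) >= level)

# Right side: Doob's bound E|M_n| / level, with E|M_n| estimated from the paths.
rhs = np.mean(np.abs(walks[:, -1])) / level

print(lhs, rhs)
```

The estimated probability should sit well below the bound here. The inequality is not sharp for this walk, but it only requires the martingale property and a first moment.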
Applications of Donsker's Theorem
Donsker’s Theorem has several applications in statistics and probability theory, including:
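- Goodness-of-fit tests: When F is continuous, Donsker’s Theorem and the continuous mapping theorem imply that \sqrt{n} \sup_x |F_n(x) - F(x)| converges in distribution to \sup_{0 \leq t \leq 1} |B_0(t)|, whose distribution does not depend on F. This is the basis of the Kolmogorov-Smirnov goodness-of-fit test.
- Confidence bands: The same limit yields asymptotic confidence bands for the unknown distribution function F, of the form F_n(x) \pm c / \sqrt{n}, where c is a quantile of \sup_{0 \leq t \leq 1} |B_0(t)|.
- Bootstrap methods: Donsker’s Theorem, together with its analogue for the resampled empirical process, is used to justify bootstrap approximations to the distribution of statistics that depend on F_n.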
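The goodness-of-fit application can be sketched in a few lines. The example below assumes a continuous hypothesized distribution function and uses the classical series for the supremum of the absolute Brownian bridge, \mathbb{P}(\sup_t |B_0(t)| > z) = 2 \sum_{k \geq 1} (-1)^{k-1} e^{-2 k^2 z^2}, to get an asymptotic p-value. The function names `ks_statistic` and `kolmogorov_sf` are hypothetical, and the truncation at 100 terms is an assumption that is adequate unless z is very close to 0.

```python
import numpy as np

def ks_statistic(samples, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous hypothesized `cdf`."""
    x = np.sort(samples)
    n = x.size
    f = cdf(x)
    # The supremum is attained just before or at an order statistic.
    d_plus = np.max(np.arange(1, n + 1) / n - f)
    d_minus = np.max(f - np.arange(0, n) / n)
    return max(d_plus, d_minus)

def kolmogorov_sf(z, terms=100):
    """Approximate P(sup |B_0| > z) by a truncated alternating series."""
    k = np.arange(1, terms + 1)
    series = 2.0 * np.sum((-1.0) ** (k - 1) * np.exp(-2.0 * k**2 * z**2))
    # Clip guards against truncation error when z is very small.
    return float(np.clip(series, 0.0, 1.0))

rng = np.random.default_rng(3)
n = 1000
uniform_cdf = lambda x: x

# Data that really are Uniform(0, 1): the null hypothesis holds.
null_samples = rng.uniform(size=n)
p_null = kolmogorov_sf(np.sqrt(n) * ks_statistic(null_samples, uniform_cdf))

# Squared uniforms have distribution function sqrt(x), so the null is false.
alt_samples = rng.uniform(size=n) ** 2
p_alt = kolmogorov_sf(np.sqrt(n) * ks_statistic(alt_samples, uniform_cdf))

print(p_null, p_alt)
```

The p-value is asymptotic: it relies on the limit for large n and on F being continuous. For small samples or discrete data, an exact or simulation-based reference distribution is a safer choice.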
Notes
📝 Note: The proof of Donsker's Theorem is a complex and technical argument that requires a good understanding of probability theory and mathematical analysis.
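📝 Note: The empirical-process form of Donsker's Theorem requires no moment assumptions on the random variables X_i, only that they are i.i.d. The mean 0, variance 1 normalization belongs to the partial-sum form of the theorem (the invariance principle), in which the rescaled random walk S_{\lfloor nt \rfloor} / \sqrt{n} converges in distribution to Brownian motion. There, a general mean and finite variance are handled by centering and scaling.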
Final Thoughts
In conclusion, Donsker’s Theorem connects the empirical distribution function of a sequence of i.i.d. random variables with the Brownian bridge process. Its proof combines convergence of the finite-dimensional distributions with tightness of the empirical process, and it underlies goodness-of-fit tests, confidence bands, and bootstrap methods in statistics.
What is the statement of Donsker’s Theorem?
Donsker’s Theorem states that the stochastic process \sqrt{n} (F_n(x) - F(x)) converges in distribution to B_0(F(x)) as n \to \infty, where B_0 is a standard Brownian bridge, F_n(x) is the empirical distribution function, and F(x) is the cumulative distribution function of the random variables X_i.
What is the proof of Donsker’s Theorem based on?
The proof of Donsker’s Theorem is based on the representation of the empirical distribution function, convergence of the finite-dimensional distributions, and tightness of the stochastic process.
What are the applications of Donsker’s Theorem?
Donsker’s Theorem has several applications in statistics and probability theory, including goodness-of-fit tests such as the Kolmogorov-Smirnov test, confidence bands for the distribution function, and the justification of bootstrap methods.