Empirical processes: theory and applications.

*(English)* Zbl 0741.60001
Regional Conference Series in Probability and Statistics. 2. Hayward, CA, Alexandria, VA: Institute of Mathematical Statistics, American Statistical Association. viii, 86 p. (1990).

Given a sequence \(\xi_ 1,\xi_ 2,\ldots\) of random elements (r.e.) in an arbitrary sample space \(X=(X,{\mathfrak X})\) (that is, \({\mathcal A},{\mathfrak X}\)-measurable maps \(\xi_ i:\Omega\to X\) defined on some basic probability space \((\Omega,{\mathcal A},\mathbb{P})\), where \({\mathcal A}\) and \({\mathfrak X}\) denote \(\sigma\)-fields of subsets of \(\Omega\) and \(X\), respectively), the empirical measure \(P_ n\) on \({\mathfrak X}\) puts mass \(1/n\) at each of the random observations \(\xi_ 1(\omega),\ldots,\xi_ n(\omega)\) in \(X\). Each \(\mathfrak X\)-measurable real-valued function \(f\) on \(X\) determines a random variable (r.v.) \(P_ nf\equiv\int_ XfdP_ n\), i.e.
\[
(P_ nf)(\omega):=(1/n)\sum_{i\leq n}f(\xi_ i(\omega)),\quad \omega\in\Omega. \tag{1}
\]
If the \(\xi_ i\) are independent and identically distributed with law \(P\) on \({\mathfrak X}\) (that is, \(P(B):=\mathbb{P}(\xi_ i\in B)\) for \(B\in{\mathfrak X}\)), then, for fixed \(f\) with \(Pf^ 2\equiv\int_ Xf^ 2dP<\infty\), the averages \(P_ nf\) satisfy the strong law of large numbers and the normalized sequence \(\nu_ n(f):=n^{1/2}(P_ nf-Pf)\), \(n\in\mathbb{N}\), satisfies the central limit theorem.
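As a small numerical illustration of definition (1) and of \(\nu_ n(f)\), here is a minimal Python sketch using only the standard library. The setup is an assumption made for illustration and is not taken from the monograph: \(P\) is the uniform law on \([0,1]\) and \(f(x)=x^2\), so that \(Pf=1/3\); the helper names `empirical_mean` and `nu_n` are hypothetical.

```python
import math
import random

def empirical_mean(f, sample):
    """P_n f: the average of f over the observations, as in (1)."""
    return sum(f(x) for x in sample) / len(sample)

def nu_n(f, sample, Pf):
    """nu_n(f) = sqrt(n) * (P_n f - P f), given the population value Pf."""
    n = len(sample)
    return math.sqrt(n) * (empirical_mean(f, sample) - Pf)

def f(x):
    # Illustrative choice (an assumption): P f = 1/3 under Uniform[0, 1].
    return x * x

rng = random.Random(0)                      # fixed seed for reproducibility
sample = [rng.random() for _ in range(10_000)]
Pn_f = empirical_mean(f, sample)            # close to 1/3 by the law of large numbers
z = nu_n(f, sample, 1.0 / 3.0)              # approximately centred normal by the CLT
print(Pn_f, z)
```

For a single fixed \(f\) this is only the classical setting; the uniform versions discussed in the review concern the supremum of such quantities over a class \({\mathcal F}\).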

The theory of empirical processes \(\nu_ n=(\nu_ n(f))_{f\in{\mathcal F}}\) indexed by various classes \({\mathcal F}\) of real-valued functions \(f\) on \(X\) seeks to generalize these classical results so that they hold uniformly (in some sense) for \(f\in{\mathcal F}\). This monograph provides a beautiful exposition of the present state of the art in the theory of empirical processes, based on the author’s own contributions and on ideas originally due to Richard Dudley. Special emphasis is placed on handling typical and important applications in statistics and econometrics.

Contents: 1. Introduction; 2. Symmetrization and Conditioning; 3. Chaining; 4. Packing and Covering in Euclidean Spaces; 5. Stability; 6. Convex Hulls; 7. Maximal Inequalities; 8. Uniform Laws of Large Numbers; 9. Convergence in Distribution and Almost Sure Representation; 10. Functional Central Limit Theorems; 11. Least Absolute Deviations Estimators for Censored Regressions; 12. Random Convex Sets; 13. Estimation from Censored Data; 14. Biased Sampling.

Following the author’s Introduction, the various topics can be summarized as follows: In asymptotic problems, \({\mathcal F}\) is often a parametric family, \(\{f(\cdot,t):t\in T\}\), where the parameter space \(T\) is not necessarily finite-dimensional. Writing \(f_ i(\omega,t)\) instead of \(f(\xi_ i(\omega),t)\) in (1) and considering instead

\[ S_ n(\omega,t):=\sum_{i\leq n}f_ i(\omega,t), \tag{1'} \] one eliminates at the same time an unnecessary notational distinction between empirical and partial-sum processes indexed by \(T\), bringing both closer to the theory for sums of independent r.e.’s in a Banach space. In the present notes, however, the author concentrates on problems and methods that are usually identified as belonging to empirical process theory.

The general problem attacked in Sections 2-7 is that of finding probabilistic bounds for \[ \Delta_ n(\omega):=\sup_{t\in T}| S_ n(\omega,t)-\mathbb{P} S_ n(\cdot,t)|, \quad\omega\in\Omega, \] where \((S_ n(\cdot,t))_{t\in T}\), with \(S_ n(\omega,t)\) defined by (\(1'\)), is a sum of independent processes indexed by \(T\). For a general convex, increasing function \(\Phi\) on \(\mathbb{R}^ +\), a bound for \(\mathbb{P}\Phi(\Delta_ n)\) is derived in Section 2 by introducing a more variable process \[ L_ n(\sigma,\omega):=\sup_{t\in T}\left|\sum_{i\leq n}\sigma_ if_ i(\omega,t)\right| \] by means of a Rademacher sequence \(\sigma=\{\sigma_ 1,\ldots,\sigma_ n\}\) of independent r.v.’s \(\sigma_ i\), each taking only the values \(+1\) and \(-1\), both with probability 1/2, such that \(\sigma\) is independent of the sequence of processes \(\{(f_ i(\cdot,t))_{t\in T},\) \(1\leq i\leq n\}\). It is shown that \[ \mathbb{P}\Phi(\Delta_ n)\leq\mathbb{P}\Phi(2L_ n)\equiv\mathbb{P}_ \omega\mathbb{P}_ \sigma\Phi(2\sup _{{\mathbf f}\in{\mathcal F}_ \omega}|\sigma\cdot {\mathbf f}|), \] where, for fixed \(\omega\in\Omega\), \[ {\mathcal F}_ \omega:=\{{\mathbf f}=(f_ 1(\omega,t),\ldots,f_ n(\omega,t)):t\in T\}\hbox { and } \sigma\cdot{\mathbf f}:=\sum_{i\leq n}\sigma_ if_ i(\omega,t). \] With \(\omega\) held fixed, the inner expectation, with respect to \(\mathbb{P}_ \sigma\), involves a very simple process, namely \(\sigma\cdot{\mathbf f}\), indexed by a (random) subset \({\mathcal F}_ \omega\) of \(\mathbb{R}^ n\). Absorbing the factor 2 into the function \(\Phi\), the problem thus becomes that of finding bounds for \(\mathbb{P}_ \sigma\Phi(\sup_{{\mathbf f}\in{\mathcal F}}|\sigma\cdot{\mathbf f}|)\) for various \(\Phi\) and various subsets \({\mathcal F}\) of \(\mathbb{R}^ n\). This is done in Section 3.
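The symmetrization inequality can be checked numerically in a simple case. The following Python sketch (standard library only) is an illustration under assumptions that do not come from the monograph: \(\Phi(x)=x\), the \(\xi_ i\) are uniform on \([0,1]\), \(f_ i(\omega,t)\) is the indicator of \(\{\xi_ i(\omega)\leq t\}\), and \(T\) is a finite grid in \((0,1]\), so that \(\mathbb{P}S_ n(\cdot,t)=nt\). The function names are hypothetical, and the Monte Carlo averages only estimate \(\mathbb{P}\Delta_ n\) and \(\mathbb{P}L_ n\).

```python
import random

def delta_n(xs, grid):
    """sup over t in grid of |S_n(t) - n*t| for f(x, t) = 1{x <= t}, x ~ Uniform[0, 1]."""
    n = len(xs)
    return max(abs(sum(1 for x in xs if x <= t) - n * t) for t in grid)

def l_n(xs, signs, grid):
    """sup over t in grid of |sum_i sigma_i * 1{x_i <= t}|."""
    return max(abs(sum(s for x, s in zip(xs, signs) if x <= t)) for t in grid)

def simulate(n=50, reps=400, seed=1):
    """Monte Carlo estimates of E[Delta_n] and E[L_n]."""
    rng = random.Random(seed)
    grid = [k / 20 for k in range(1, 21)]   # finite index set T (an assumption)
    total_delta = 0.0
    total_l = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        signs = [rng.choice((-1, 1)) for _ in range(n)]   # Rademacher signs, independent of xs
        total_delta += delta_n(xs, grid)
        total_l += l_n(xs, signs, grid)
    return total_delta / reps, total_l / reps

mean_delta, mean_l = simulate()
# Symmetrization with Phi(x) = x predicts E[Delta_n] <= 2 * E[L_n].
print(mean_delta, 2 * mean_l)
```

In this example the estimated left-hand side should come out well below the right-hand side, which reflects that the symmetrized process is "more variable" than the centred one.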

The main inequality of Section 3 is a bound for the Orlicz norm \(\|\sup_{{\mathbf f}\in{\mathcal F}}|\sigma\cdot{\mathbf f}|\|_ \Psi\) (with the particular function \(\Phi(x)\equiv\Psi(x):=(1/5)\exp(x^ 2)\)) in terms of the so-called packing numbers of \({\mathcal F}\). [The packing number \(D(\varepsilon,{\mathcal F})\) is the largest number of points that can be packed into \({\mathcal F}\) with each pair at least \(\varepsilon\) apart; if \(\Phi\) is a convex, increasing function on \(\mathbb{R}^ +\) with \(0\leq\Phi(0)<1\), the Orlicz norm \(\| Z\|_ \Phi\) of an r.v. \(Z\) is defined by \(\| Z\|_ \Phi:=\inf\{C>0:\mathbb{P}\Phi(| Z|/C)\leq 1\}\).] In this way the study of maximal inequalities for \(\Delta_ n\) is transformed into a study of the geometry of the set \({\mathcal F}_ \omega\).
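Both notions in the bracketed remark can be made concrete. The Python sketch below (standard library only) is a hedged illustration, not code from the monograph: it uses the convention \(\| Z\|_ \Psi=\inf\{C>0:\mathbb{P}\Psi(| Z|/C)\leq 1\}\) with \(\Psi(x)=(1/5)\exp(x^ 2)\), restricts to finitely supported \(Z\), and measures distances in the Euclidean metric. The names `orlicz_norm_psi` and `greedy_packing` are hypothetical, and the greedy selection only yields a lower bound for \(D(\varepsilon,{\mathcal F})\), not its exact value in general.

```python
import math

def psi(x):
    """Psi(x) = (1/5) * exp(x^2)."""
    return math.exp(x * x) / 5.0

def orlicz_norm_psi(values, probs, iterations=100):
    """Bisection for inf{C > 0 : E Psi(|Z|/C) <= 1}, Z taking values[i] with probability probs[i]."""
    bound = max(abs(v) for v in values)
    if bound == 0.0:
        return 0.0
    def mean_psi(c):
        return sum(p * psi(abs(v) / c) for v, p in zip(values, probs))
    # If |Z| <= bound, then C = bound / sqrt(log 5) is feasible, since Psi(sqrt(log 5)) = 1.
    hi = bound / math.sqrt(math.log(5.0))
    lo = hi
    while mean_psi(lo) <= 1.0:      # shrink until the constraint fails
        lo /= 2.0
    for _ in range(iterations):     # E Psi(|Z|/C) is decreasing in C
        mid = (lo + hi) / 2.0
        if mean_psi(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

def greedy_packing(points, eps):
    """Greedily keep points that are at least eps away from all points kept so far."""
    chosen = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen

# A single Rademacher sign has |Z| = 1, so its norm is 1 / sqrt(log 5).
rademacher_norm = orlicz_norm_psi([-1.0, 1.0], [0.5, 0.5])
separated = greedy_packing([(0.0,), (1.0,), (2.0,), (3.0,)], 2.0)
print(rademacher_norm, len(separated))
```

The Rademacher example has a closed form because \(| Z|\) is constant, which makes it a convenient sanity check for the bisection.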

Section 4 presents the connection between packing numbers and the combinatorial methods that have evolved from the approach of Vapnik and Červonenkis. It develops the idea that a bounded set \({\mathcal F}\) in \(\mathbb{R}^ n\) that has a weak property shared by \(V\)-dimensional subspaces should have packing numbers like those of a bounded subset of \(\mathbb{R}^ V\). Sections 5-7 elaborate upon this idea, with Section 7 summarizing the results in the form of several maximal inequalities for \(\Delta_ n\).
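A standard toy case conveys the combinatorial flavour of this connection. The following Python sketch (standard library only) is an illustration chosen here, not an example taken from the monograph: for the half-line indicators \(f(x,t)=1\{x\leq t\}\), it enumerates the distinct vectors in \({\mathcal F}_ \omega\subset\{0,1\}^ n\) and compares their number with \(2^ n\). The function name `picked_out_vectors` is hypothetical.

```python
import random

def picked_out_vectors(xs):
    """Distinct vectors (1{x_i <= t})_i obtained as t ranges over the real line."""
    # The vector changes only when t crosses an observation, so it suffices to try
    # one threshold below all observations and one at each observation.
    thresholds = [min(xs) - 1.0] + sorted(xs)
    return {tuple(1 if x <= t else 0 for x in xs) for t in thresholds}

rng = random.Random(2)
xs = [rng.random() for _ in range(12)]
vectors = picked_out_vectors(xs)
# For distinct observations there are n + 1 such vectors, far fewer than 2**n.
print(len(vectors), 2 ** len(xs))
```

Polynomial rather than exponential growth of this count is the kind of property that, in the Vapnik-Červonenkis approach, translates into bounds on packing numbers.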

Section 8 transforms the maximal inequalities into simple conditions for uniform analogues of the law of large numbers, and Sections 9-10 transform them into uniform analogues of the central limit theorem (functional limit theorems that are descendants of Donsker’s Theorem for the empirical distribution function on the real line); here the approach depends heavily on the method of almost sure representations. In the last four sections the theory is applied efficiently to the situations described in their titles.

This excellent monograph can be strongly recommended to anyone interested in the theory of empirical processes and its applications; it is a pity that it does not cover the bootstrapping of empirical measures, an area culminating in the elegant results of E. Giné and J. Zinn [Ann. Probab. 18, No. 2, 851-869 (1990; Zbl 0706.62017)].

Reviewer: P. Gänßler (München)

##### MSC:

| Code | Description |
| --- | --- |
| 60-02 | Research exposition (monographs, survey articles) pertaining to probability theory |
| 62-02 | Research exposition (monographs, survey articles) pertaining to statistics |
| 62J02 | General nonlinear regression |
| 60F05 | Central limit and other weak theorems |
| 60F15 | Strong limit theorems |
| 60F17 | Functional limit theorems; invariance principles |
| 60B12 | Limit theorems for vector-valued random variables (infinite-dimensional case) |
| 62E20 | Asymptotic distribution theory in statistics |