Weak convergence and empirical processes. With applications to statistics.

*(English)* Zbl 0862.60002
Springer Series in Statistics. New York, NY: Springer. xvi, 508 p. (1996).

Let \((S,{\mathcal S},P)\) be a probability space, let \(X_i:S^{\mathbb N}\to S\) be the coordinate functions and let \(P_n=\sum^n_{i=1}\delta_{X_i}/n\), \(n\in\mathbb{N}\), be the empirical measures for the ‘data’ \(X_i\). The pathbreaking paper of V. N. Vapnik and A. Ya. Chervonenkis [Theory Probab. Appl. 16, 264-280 (1971); translation from Teor. Veroyatn. Primen. 16, 264-279 (1971; Zbl 0247.60005)] on a.s. convergence of \((P_n-P)(C)\) uniformly in \(C\) over general classes \(\mathcal C\) of measurable subsets of \(S\) originated a new branch of modern empirical process theory whose object of study is the empirical measure viewed as a process indexed by a class of sets or a class of functions, the novelty being the generality of the setup: the space \(S\) need not be \({\mathbf R}\) or \({\mathbf R}^d\), the class \(\mathcal C\) is not necessarily the class of half-lines, etc. One seeks to obtain limit theorems, exponential bounds and rates for the empirical process, uniform over the class \(\mathcal C\) or the class \(\mathcal F\). The development of this theory requires a generalization of the theory of convergence of random variables and vectors to random elements taking values in not necessarily separable metric spaces (e.g., the metric space \(\ell^\infty({\mathcal F})\) of all bounded functions \({\mathcal F}\to{\mathbf R}\)). Perhaps more importantly, it also requires a wealth of other more probabilistic material, much of it developed earlier for the study of sample path properties of Gaussian processes and for probability theory in separable Banach spaces. And it turns out that the empirical process theory thus developed has found many uses in asymptotic statistics. Although some problems still remain, this branch of empirical process theory has now reached maturity after 25 years of strong development.
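The classical special case of this setup can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes \(S=[0,1]\), \(P\) the uniform distribution, and \(\mathcal C\) the class of half-lines \((-\infty,t]\), so that \(\sup_{C\in\mathcal C}|(P_n-P)(C)|\) is the Kolmogorov-Smirnov distance. The names `sup_deviation` and `uniform_cdf` are illustrative only.

```python
import random


def sup_deviation(sample, cdf):
    """Return sup over t of |P_n(-inf, t] - P(-inf, t]| for a continuous CDF."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        p = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order
        # statistic, so the supremum is approached at or just before a data point.
        worst = max(worst, i / n - p, p - (i - 1) / n)
    return worst


def uniform_cdf(t):
    """CDF of the uniform distribution on [0, 1] (assumed P for this sketch)."""
    return min(1.0, max(0.0, t))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are reproducible
    for n in (100, 10_000):
        sample = [rng.random() for _ in range(n)]
        print(n, sup_deviation(sample, uniform_cdf))
```

By the Glivenko-Cantelli theorem the printed deviation tends to zero almost surely as \(n\) grows; the general theory replaces the half-lines by an arbitrary class \(\mathcal C\) or \(\mathcal F\), for which no such simple formula for the supremum is available.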
The book of van der Vaart and Wellner constitutes an excellent account of general empirical process theory and its applications, and comes precisely at the right time. Previous partial surveys of the theory include Gaenssler’s 1983 IMS Lecture Notes, D. Pollard’s book “Convergence of stochastic processes” (1984; Zbl 0544.60045), the lecture notes of R. M. Dudley [in: École d’Été de probabilités de Saint Flour XII-1982, Lect. Notes Math. 1097, 1-142 (1984; Zbl 0554.60029)], those of the reviewer and J. Zinn [in: Probability and Banach spaces, Lect. Notes Math. 1221, 50-113 (1986; Zbl 0605.60026)] and of D. Pollard [“Empirical processes: Theory and applications” (1990; Zbl 0741.60001)] and, somewhat tangentially, a chapter in M. Ledoux and M. Talagrand’s book “Probability in Banach spaces: Isoperimetry and processes” (1991; Zbl 0748.60004).

The book under review is divided into three parts: Stochastic convergence, Empirical processes and Statistical applications, and it also has an appendix with important miscellanea such as inequalities and Gaussian processes. The first part, ‘Stochastic convergence’ (about 80 pages), constitutes a complete account of the theory of convergence in law, a.s. and in probability of not necessarily measurable random elements with values in not necessarily separable metric spaces. It contains the necessary integral calculus of non-measurable functions, the main properties of Hoffmann-Jørgensen’s definition of convergence in law of non-measurable random elements, Dudley’s almost uniform convergence, etc. Some of the highlights of this exposition are a new version of Prokhorov’s theorem, with the added notion of ‘asymptotic measurability’, and the Skorokhod-Dudley-Wichura theorem.

The second part, ‘Empirical processes’ (about 200 pages), develops the theory of empirical processes. The tools of symmetrization and randomization, exponential inequalities for sums of independent variables, and maximal inequalities and entropy bounds are first developed and then applied to prove uniform laws of large numbers (Glivenko-Cantelli theorems), uniform central limit theorems (Donsker theorems), multiplier clt’s and rates of convergence. The two main types of hypotheses, namely Vapnik-Červonenkis type and bracketing type, are presented in great detail, with many useful examples. Uniformity not only in \(f\in{\mathcal F}\), but also in \(P\), both for the lln and the clt, is also treated. This part of the book also has a chapter devoted to sharp exponential bounds [mainly from M. Talagrand, Ann. Probab. 22, No. 1, 28-76 (1994; Zbl 0798.60051)]. The inclusion of so many examples, of the sharp exponential inequalities and of a refined form of Dudley’s theorem on the covering numbers of VC classes [D. Haussler, J. Comb. Theory, Ser. A 69, No. 2, 217-232 (1995; Zbl 0818.60005)] is among the salient features of this exposition. Neither necessary conditions for the lln and the clt nor the law of the iterated logarithm are treated; although these are very important parts of the theory, they are less applicable than most of the material chosen by the authors.

The third part, ‘Statistical applications’ (about 150 pages), includes \(M\) and \(Z\) estimators, rates of convergence of \(M\)-estimators with applications e.g. to regression, the bootstrap, the delta method, independence empirical processes, contiguity, and convolution and minimax theorems. Perhaps the first application of modern empirical process theory in statistics was that of D. Pollard [Econ. Theory 1, 295-314 (1985)] to \(M\)-estimation. The authors expand on this, including the subsequent work by J. Kim and D. Pollard [Ann. Stat. 18, No. 1, 191-219 (1990; Zbl 0703.62063)], and present very nice and interesting examples, such as Grenander’s estimator for monotone densities, the shorth estimator, etc. They also present recent work, mostly of van de Geer [Report 93-06, Univ. of Leiden] and L. Birgé and P. Massart [Probab. Theory Relat. Fields 97, No. 1/2, 113-150 (1993; Zbl 0805.62037)], on the method of sieves and minimum contrast estimators. The bootstrap of empirical processes [mainly, the reviewer and J. Zinn, Ann. Probab. 18, No. 2, 851-869 (1990; Zbl 0706.62017), and J. Præstgaard and the second author, Ann. Probab. 21, No. 4, 2053-2086 (1993; Zbl 0792.62038)] is presented very thoroughly. This presentation relies much less on randomization than the original works, and it substantially improves on their treatment of measurability. The chapter on the delta method is very complete and contains very significant examples such as the Wilcoxon statistic, the Nelson-Aalen estimator of the cumulative hazard function for censored data, quantiles, copula functions, Kaplan-Meier and the product integral, etc. The usefulness of Hadamard differentiability is emphasized (see R. M. Dudley, e.g. [Ann. Stat. 22, No. 1, 1-20 (1994; Zbl 0816.62039)], on Fréchet differentiability with respect to \(p\)-variation norms, with rates). The version of Prokhorov’s theorem from Part 1 finds a good application in the proofs of general forms of the convolution and minimax theorems.

There are six appendices: the first and sixth with a collection of useful inequalities, the second with a complete list of all the facts on Gaussian processes used throughout the text, the third on Rademacher processes, the fourth with a proof of one of Talagrand’s isoperimetric inequalities for product spaces, and the fifth with some central limit theorems in \({\mathbf R}\). Every chapter contains a list of exercises, and every part ends with bibliographical notes.

The preface of this book explains what it tries to do: “The first goal is to give an exposition of certain modes of stochastic convergence…” “A second goal is to use the weak convergence theory background developed in Part 1 to present an account of major components of the modern theory of empirical processes indexed by classes of sets and functions.” “Our third goal is to illustrate the usefulness of modern weak convergence theory and modern empirical process theory for statistics by a wide variety of applications.” The authors have amply succeeded: the book does exactly what it sets out to do.

The book is of immediate interest to mathematical statisticians, particularly to those working on asymptotic theory or using (not necessarily asymptotic) properties of the empirical process. It is also of interest to probabilists and to less mathematical statisticians. This book is very appropriate both as a reference and as a graduate textbook (for several courses). This is not the place to make recommendations; so, I will only say that I am keeping this book in my office, on the shelf closest to my desk.

Reviewer: E.Giné (College Station)

##### MSC:

60-02 | Research exposition (monographs, survey articles) pertaining to probability theory |

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |