Partially linear models.

*(English)* Zbl 0968.62006
Heidelberg: Physica-Verlag. x, 203 p. (2000).

This monograph deals with partially linear regression models of the form
\[
Y_i=X_i^T\beta+g(T_i)+\varepsilon_i,\quad i=1,\dots,n,
\]
where \(X_i=(x_{i1},\dots, x_{ip})^T\) and \(T_i=(t_{i1},\dots,t_{id})^T\) are vectors of explanatory variables, and \((X_i,T_i)\) are either independent and identically distributed (i.i.d.) random design points or fixed design points. Here \(\beta=(\beta_1,\dots,\beta_p)^T\) is a vector of unknown parameters, \(g(\cdot)\) is an unknown function from \(\mathbb R^d\) to \(\mathbb R^1\), and \(\varepsilon_1,\dots,\varepsilon_n\) are independent random errors with mean zero and finite variance. Partially linear models of this form are semiparametric models since they contain both parametric and nonparametric components. They are more flexible than standard linear models, allow easier interpretation of the effect of each variable, and may be preferred to completely nonparametric regression because of the well-known “curse of dimensionality”: the parametric components can be estimated at the rate of \(\sqrt{n}\), while the estimation precision of the nonparametric function decreases rapidly as the dimension of the nonlinear variable increases.
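To make the model concrete, the following is a minimal sketch of one common way to fit it when \(d=1\): smooth \(X\) and \(Y\) on \(T\) with nonparametric weights, regress the residuals of \(Y\) on the residuals of \(X\) to estimate \(\beta\), and then smooth \(Y-X^T\hat\beta\) to estimate \(g\). The Gaussian kernel, the bandwidth `h`, the simulated data, and the function names `nw_weights` and `fit_plm` are assumptions made for illustration; they are not taken from the monograph or from XploRe.

```python
import numpy as np

def nw_weights(t, h):
    """Nadaraya-Watson weight matrix: row i holds the weights used to smooth at t[i]."""
    u = (t[:, None] - t[None, :]) / h
    k = np.exp(-0.5 * u**2)  # Gaussian kernel (an illustrative choice)
    return k / k.sum(axis=1, keepdims=True)

def fit_plm(x, y, t, h):
    """Estimate beta and g(t_i) in y = x @ beta + g(t) + error (scalar t)."""
    w = nw_weights(t, h)
    x_tilde = x - w @ x          # remove the part of x explained by t
    y_tilde = y - w @ y          # remove the part of y explained by t
    beta, *_ = np.linalg.lstsq(x_tilde, y_tilde, rcond=None)
    g_hat = w @ (y - x @ beta)   # smooth the partial residuals to recover g
    return beta, g_hat

# Simulated example (all values hypothetical).
rng = np.random.default_rng(0)
n = 500
t = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=(n, 2))
beta_true = np.array([1.5, -0.7])
y = x @ beta_true + np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=n)

beta_hat, g_hat = fit_plm(x, y, t, h=0.05)
print("beta_hat:", beta_hat)
```

The estimate of \(\beta\) uses only the part of \(X\) not explained by \(T\), which is why the parametric rate can survive the presence of the unknown function \(g\).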

The main objectives of this monograph are: (i) to present a number of theoretical results for the estimators of both parametric and nonparametric components, and (ii) to illustrate the proposed estimation and testing procedures on several simulated and real data sets using \(XploRe\) – The Interactive Statistical Computing Environment, available on the website: http://www.xplore-stat.de. The emphasis is on methodologies rather than on theory, with a particular focus on applications of partially linear regression techniques to various statistical problems. These problems include least squares regression, asymptotically efficient estimation, bootstrap resampling, censored data analysis, linear measurement error models, nonlinear measurement models, and nonlinear and nonparametric time series models. The authors’ description of the structure of the monograph is the following:

Chapter 2 considers a simple partially linear model. An estimation procedure for the parametric component of the partially linear model is established based on the nonparametric weight sum. Section 2.1 mainly provides asymptotic theory and an estimation procedure for the parametric component with heteroscedastic errors. In this section, the least squares estimator \(\beta_{LS}\) is modified to the weighted least squares estimator \(\beta_{WLS}\). For constructing \(\beta_{WLS}\) we employ split-sample techniques. The asymptotic normality of \(\beta_{WLS}\) is then derived. Three different variance functions are discussed and estimated. The selection of the smoothing parameters involved in the nonparametric weight sum is discussed in Subsection 2.1.3, and a simulation comparison is carried out in Subsection 2.1.4. A modified estimation procedure for the case of censored data is given in Section 2.2. Based on a modification of the Kaplan-Meier estimator, synthetic data and an estimator of \(\beta\) are constructed. We then establish the asymptotic normality of the resulting estimator of \(\beta\). We also examine its finite-sample behavior through a simulated example. Bootstrap approximations are given in Section 2.3.
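For orientation, a weighted least squares estimator of this kind is commonly written as below. The notation is an assumption of this note rather than a quotation from the monograph: \(\omega_{nj}(\cdot)\) denotes nonparametric weight functions, \(\widetilde X_i=X_i-\sum_{j=1}^n\omega_{nj}(T_i)X_j\) and \(\widetilde Y_i=Y_i-\sum_{j=1}^n\omega_{nj}(T_i)Y_j\) are the partial residuals, and \(\gamma_i=1/\sigma_i^2\) is the inverse of the error variance of the \(i\)-th observation.
\[
\beta_{WLS}=\Bigl(\sum_{i=1}^n\gamma_i\widetilde X_i\widetilde X_i^T\Bigr)^{-1}\sum_{i=1}^n\gamma_i\widetilde X_i\widetilde Y_i .
\]
With \(\gamma_i\equiv 1\) this reduces to the unweighted least squares form. In practice the \(\sigma_i^2\) are unknown, so the \(\gamma_i\) have to be replaced by estimates.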

Chapter 3 discusses the estimation of the nonparametric component without the restriction of constant variance. Convergence and asymptotic normality of the nonparametric estimate are given in Sections 3.2 and 3.3. The estimation methods proposed in this chapter are illustrated through examples in Section 3.4, in which the estimator (1.2.3) is applied to the analysis of the relationship between the logarithm of earnings and labour market experience.

In Chapter 4, we consider both linear and nonlinear variables with measurement errors. An estimation procedure and asymptotic theory for the case where the linear variables are measured with error are given in Section 4.1. The common estimator given in (1.2.2) is modified by applying the so-called “correction for attenuation”, which removes the inconsistency caused by measurement errors. The modified estimator is still asymptotically normal, like (1.2.2), but with a more complicated form of the asymptotic variance. Section 4.2 discusses the case where the nonlinear variables are measured with error. Our conclusion shows that asymptotic normality depends heavily on the distribution of the measurement error when \(T\) is measured with error. Examples and numerical discussions are presented to support the theoretical results.
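The effect of a correction for attenuation can be illustrated with a small simulation. The sketch below assumes a scalar linear covariate observed as \(W=X+U\) with known measurement error variance \(\sigma_u^2\), and subtracts \(n\sigma_u^2\) from the denominator of the naive least squares estimator. The kernel, bandwidth, simulated data, and all variable names are hypothetical choices for illustration; this is one standard form of the correction and is not claimed to reproduce the monograph's estimator in detail.

```python
import numpy as np

def nw_weights(t, h):
    """Nadaraya-Watson weight matrix with a Gaussian kernel (illustrative choice)."""
    u = (t[:, None] - t[None, :]) / h
    k = np.exp(-0.5 * u**2)
    return k / k.sum(axis=1, keepdims=True)

# Simulated data: x is the true covariate, w_obs is what is actually observed.
rng = np.random.default_rng(1)
n = 2000
beta_true = 2.0
sigma_u = 0.5                      # measurement error std. dev., assumed known
t = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
w_obs = x + sigma_u * rng.normal(size=n)
y = beta_true * x + np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=n)

# Partial residuals after smoothing on t.
s = nw_weights(t, h=0.05)
w_tilde = w_obs - s @ w_obs
y_tilde = y - s @ y

# Naive estimator ignores the measurement error and is biased towards zero.
beta_naive = (w_tilde @ y_tilde) / (w_tilde @ w_tilde)

# Corrected estimator removes the error variance from the denominator.
beta_corrected = (w_tilde @ y_tilde) / (w_tilde @ w_tilde - n * sigma_u**2)

print("naive:", beta_naive, "corrected:", beta_corrected)
```

In this setup the naive estimate is shrunk by roughly the factor \(1/(1+\sigma_u^2)\), since the true covariate has unit variance, while the corrected estimate is centred near the true value.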

Chapter 5 discusses several relatively theoretical topics. The law of the iterated logarithm (LIL) and the Berry-Esseen bounds for the parametric component are established. Section 5.3 constructs a class of asymptotically efficient estimators. Two classes of efficiency concepts are introduced. The well-known Bahadur asymptotic efficiency, which considers the exponential rate of the tail probability, and second order asymptotic efficiency are discussed in detail in Sections 5.4 and 5.5, respectively. The results of this chapter show that the LS estimate can be modified to have both Bahadur asymptotic efficiency and second order asymptotic efficiency even when the parametric and nonparametric components are dependent. The estimation of the error distribution is also investigated in Section 5.6.

Chapter 6 generalizes the case studies in previous chapters to partially linear time series models and establishes asymptotic results as well as small sample studies. First, we present several data-based test statistics to determine which model should be chosen to model a partially linear dynamical system. Second, we propose a cross-validation (CV) based criterion to select the optimum linear subset for a partially linear regression model. Third, we investigate the problem of selecting the optimum bandwidth for a partially linear autoregressive model. Finally, we summarize recent developments in a general class of additive stochastic regression models.

This book is available as an e-book on the website: http://www.i-xplore.de.

Reviewer: M.P.Moklyachuk (Kyïv)

##### MSC:

62-02 | Research exposition (monographs, survey articles) pertaining to statistics |

62G08 | Nonparametric regression and quantile regression |

62J05 | Linear regression; mixed models |

62F10 | Point estimation |

62G20 | Asymptotic properties of nonparametric inference |