zbMATH — the first resource for mathematics

The dimensionality reduction principle for generalized additive models. (English) Zbl 0603.62050
Consider an exponential family of distributions with distribution functions of the form \[ \int_{-\infty}^{x}e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy) \] with a real parameter \(\eta\) and a \(\sigma\)-finite measure \(\nu\) on \(\mathbb{R}\). Under suitable assumptions their expectations are given by \(b_3(\eta):=-b_2'(\eta)/b_1'(\eta)\). Now assume that for a random vector \((Y,X)\) with values in \(\mathbb{R}\times [0,1]^J\) (\(J\in\mathbb{N}\)), the conditional distribution of \(Y\) given \(X=x\) belongs to this exponential family with \(\eta=f(x)\), \(x\in [0,1]^J\), and hence \(E(Y\mid X=x)=b_3(f(x))\). [This is an exponential response model, see e.g. S. J. Haberman, ibid. 5, 815-841 (1977; Zbl 0368.62019); in the case of linear \(f\) we have a generalized linear model as in J. A. Nelder and R. W. M. Wedderburn, J. R. Stat. Soc., Ser. A 135, 370-384 (1972).]
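The formula for \(b_3\) follows from a short calculation, sketched here under the assumptions that differentiation under the integral sign is permitted and that \(b_1'(\eta)\neq 0\). Since the densities integrate to one, \[ \int e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy)=1 \quad\Longrightarrow\quad \int \{b_1'(\eta)y+b_2'(\eta)\}e^{b_1(\eta)y+b_2(\eta)}\,\nu(dy)=0, \] that is, \(b_1'(\eta)E_\eta(Y)+b_2'(\eta)=0\) and hence \(E_\eta(Y)=-b_2'(\eta)/b_1'(\eta)=b_3(\eta)\). Two standard instances (illustrations, not taken from the paper under review): for the Bernoulli family with \(b_1(\eta)=\eta\), \(b_2(\eta)=-\log(1+e^{\eta})\) one gets \(b_3(\eta)=e^{\eta}/(1+e^{\eta})\); for the Poisson family with \(b_1(\eta)=\eta\), \(b_2(\eta)=-e^{\eta}\) one gets \(b_3(\eta)=e^{\eta}\).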
It is shown that, under suitable assumptions on the conditional distribution, on \(f\) and on the density \(g(\cdot)\) of \(X\), the expected log-likelihood \[ \Delta(a)=\int \{b_1(a(x))b_3(f(x))+b_2(a(x))\}g(x)\,dx \] can be maximized with respect to \[ a\in\mathcal{A}=\Bigl\{a(x_1,\dots,x_J)=a_0+\sum_{j=1}^{J}a_j(x_j)\ \text{s.t.}\ E(a(X))=a_0,\ E(a_j(X_j))=0,\ 1\leq j\leq J\Bigr\} \] (Theorem 1).
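The maximization of \(\Delta(a)\) over additive functions can be illustrated numerically. The following is a minimal sketch, not the construction used in the paper: it assumes \(J=2\), a uniform density \(g=1\), the Bernoulli family (\(b_1(\eta)=\eta\), \(b_2(\eta)=-\log(1+e^{\eta})\)), a midpoint grid in place of the integral, and a simple block coordinate ascent. The helper names `delta`, `best_additive` and `as_function` are hypothetical.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Illustrative Bernoulli family in the review's notation (an assumption):
# b1(eta) = eta, b2(eta) = -log(1 + e^eta), hence b3 = -b2'/b1' = sigmoid.
def b1(eta):
    return eta

def b2(eta):
    return -math.log1p(math.exp(eta))

def b3(eta):
    return sigmoid(eta)

def grid(m):
    """Midpoints of m equal cells of [0, 1]."""
    return [(k + 0.5) / m for k in range(m)]

def delta(a, f, m):
    """Midpoint-rule value of Delta(a) for J = 2 and uniform density g = 1."""
    xs = grid(m)
    total = sum(b1(a(s, t)) * b3(f(s, t)) + b2(a(s, t)) for s in xs for t in xs)
    return total / (m * m)

def best_additive(f, m=20, sweeps=200):
    """Block coordinate ascent for a0 + a1(x1) + a2(x2) on the grid (hypothetical helper)."""
    xs = grid(m)
    mu = [[b3(f(s, t)) for t in xs] for s in xs]
    a0, u, v = 0.0, [0.0] * m, [0.0] * m
    idx = range(m)

    def resid(i, j):
        # Derivative of the integrand with respect to a at grid cell (i, j).
        return mu[i][j] - sigmoid(a0 + u[i] + v[j])

    for _ in range(sweeps):
        # Step 4 = 1/L: the sigmoid's derivative is at most 1/4, so each step ascends.
        a0 += 4.0 * sum(resid(i, j) for i in idx for j in idx) / (m * m)
        u = [u[i] + 4.0 * sum(resid(i, j) for j in idx) / m for i in idx]
        v = [v[j] + 4.0 * sum(resid(i, j) for i in idx) / m for j in idx]
        # Enforce E(a_j(X_j)) = 0 by moving the component means into a0.
        mean_u, mean_v = sum(u) / m, sum(v) / m
        u = [ui - mean_u for ui in u]
        v = [vj - mean_v for vj in v]
        a0 += mean_u + mean_v
    return a0, u, v

def as_function(a0, u, v):
    """Piecewise-constant additive function defined by the grid values."""
    m = len(u)
    cell = lambda x: min(int(x * m), m - 1)
    return lambda s, t: a0 + u[cell(s)] + v[cell(t)]

if __name__ == "__main__":
    f = lambda s, t: 4.0 * s * t - 1.0  # a non-additive response function
    a = as_function(*best_additive(f))
    print(delta(a, f, 20), delta(f, f, 20))
```

For each fixed \(x\) the integrand is maximized at \(a(x)=f(x)\), so \(\Delta(a)\leq\Delta(f)\) for every \(a\); when \(f\) is itself additive, the sketch recovers \(f\) on the grid, and otherwise it returns an additive function with a strictly smaller value of \(\Delta\).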
The maximizer is called the best additive approximation to the response function \(f\). Besides being easier to interpret than a general approximation, it can be estimated from a sample \((Y_i,X_i)_{i=1}^{n}\) with an accuracy that does not deteriorate with increasing dimension \(J\); furthermore, the rate of convergence is optimal in the \(L_2\) sense (Theorem 2). [For a related result for regression functions, see the author, Additive regression and other nonparametric models, Ann. Stat. 13, 689-705 (1985).] To show this, a spline estimator obtained by maximizing an empirical log-likelihood is used.
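A natural candidate for the empirical log-likelihood (an assumption about its form, suggested by the definition of \(\Delta\); the review does not state it) is \[ \ell_n(a)=\frac{1}{n}\sum_{i=1}^{n}\{b_1(a(X_i))Y_i+b_2(a(X_i))\}, \] since \(E(Y\mid X=x)=b_3(f(x))\) gives \[ E\,\ell_n(a)=E\{b_1(a(X))E(Y\mid X)+b_2(a(X))\}=\int \{b_1(a(x))b_3(f(x))+b_2(a(x))\}g(x)\,dx=\Delta(a). \] Maximizing \(\ell_n\) over additive functions whose components are splines would then yield a sample analogue of the best additive approximation.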
Reviewer: U. Stadtmüller

62G20 Asymptotic properties of nonparametric inference
62G05 Nonparametric estimation