
Penalized spline approaches for functional logit regression. (English) Zbl 1367.62183

Summary: The multicollinearity problem that arises when estimating a functional logit model can be solved by using a set of functional principal components as predictor variables. However, the functional parameter estimated by functional principal component logit regression is often nonsmooth and therefore difficult to interpret. To solve this problem, this paper proposes several penalized spline estimations of the functional logit model. All of them are based on smoothed functional PCA and/or a discrete penalty in the log-likelihood criterion, expressed in terms of B-spline expansions of the sample curves and the functional parameter. Simulation and an application to real data are used to evaluate how accurately these smoothing approaches estimate the functional parameter, and how their classification performance compares with that of unpenalized functional PCA and LDA-PLS. Leave-one-out cross-validation and generalized cross-validation are adapted to select the smoothing parameter and the number of principal components or basis functions associated with the considered approaches.
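To make the idea of a discrete penalty in the log-likelihood concrete, the following is a minimal sketch and not the authors' implementation. It assumes the model logit(p_i) = alpha + ∫ x_i(t) beta(t) dt, with beta(t) expanded in a cubic B-spline basis and a second-order difference penalty on its coefficients. As a simplification, the integral is approximated by a Riemann sum over an equally spaced grid instead of using basis expansions of the sample curves, and smoothed functional PCA is not included. The function names (`bspline_basis`, `fit_penalized_functional_logit`), the simulated curves, and the two values of the smoothing parameter `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def bspline_basis(t, n_basis, degree=3):
    """B-spline basis on equally spaced knots over [t.min(), t.max()] (Cox-de Boor)."""
    lo, hi = t.min(), t.max()
    n_seg = n_basis - degree
    h = (hi - lo) / n_seg
    knots = lo + h * np.arange(-degree, n_seg + degree + 1)
    # Degree 0: indicator functions of the half-open knot intervals.
    B = np.zeros((t.size, knots.size - 1))
    for j in range(knots.size - 1):
        B[:, j] = (t >= knots[j]) & (t < knots[j + 1])
    # Assign the right endpoint to the last interior interval.
    at_hi = t == hi
    B[at_hi, :] = 0.0
    B[at_hi, degree + n_seg - 1] = 1.0
    # Raise the degree one step at a time.
    for d in range(1, degree + 1):
        Bn = np.zeros((t.size, knots.size - 1 - d))
        for j in range(knots.size - 1 - d):
            left = (t - knots[j]) / (knots[j + d] - knots[j])
            right = (knots[j + d + 1] - t) / (knots[j + d + 1] - knots[j + 1])
            Bn[:, j] = left * B[:, j] + right * B[:, j + 1]
        B = Bn
    return B

def fit_penalized_functional_logit(X, y, t, n_basis=12, lam=1.0,
                                   diff_order=2, n_iter=100, tol=1e-8):
    """Maximize loglik(theta) - 0.5 * lam * ||D b||^2 by damped Newton steps."""
    Phi = bspline_basis(t, n_basis)
    dt = t[1] - t[0]                      # assumes an equally spaced grid
    Z = (X @ Phi) * dt                    # Riemann sum for the integral of x_i(t) * phi_k(t)
    Z1 = np.column_stack([np.ones(len(y)), Z])
    D = np.diff(np.eye(n_basis), n=diff_order, axis=0)
    P = np.zeros((n_basis + 1, n_basis + 1))
    P[1:, 1:] = lam * D.T @ D             # the intercept is not penalized

    def pen_loglik(theta):
        eta = Z1 @ theta
        return np.sum(y * eta - np.logaddexp(0.0, eta)) - 0.5 * theta @ P @ theta

    theta = np.zeros(n_basis + 1)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Z1 @ theta, -30, 30)))
        grad = Z1.T @ (y - p) - P @ theta
        H = Z1.T @ (Z1 * (p * (1 - p))[:, None]) + P
        step = np.linalg.solve(H, grad)
        s = 1.0                           # halve the step if the criterion decreases
        while pen_loglik(theta + s * step) < pen_loglik(theta) and s > 1e-4:
            s *= 0.5
        theta = theta + s * step
        if np.max(np.abs(s * step)) < tol:
            break
    alpha, b = theta[0], theta[1:]
    return alpha, b, Phi @ b              # intercept, coefficients, beta(t) on the grid

# Hypothetical simulated data: noisy sinusoidal curves and a sinusoidal beta(t).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
n = 300
scores = rng.normal(size=(n, 2))
X = (scores[:, [0]] * np.sin(2 * np.pi * t)
     + scores[:, [1]] * np.cos(2 * np.pi * t)
     + 0.2 * rng.normal(size=(n, t.size)))
beta_true = 4.0 * np.sin(2 * np.pi * t)
eta_true = (X * beta_true).sum(axis=1) * (t[1] - t[0])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

# A small and a large smoothing parameter, to compare the resulting estimates.
_, b_wiggly, beta_wiggly = fit_penalized_functional_logit(X, y, t, lam=0.1)
_, b_smooth, beta_smooth = fit_penalized_functional_logit(X, y, t, lam=100.0)
```

In this sketch `lam` is fixed by hand. The summary indicates that in the paper the smoothing parameter is selected by adapted leave-one-out or generalized cross-validation, which would correspond here to refitting over a grid of `lam` values and comparing a validation criterion.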

MSC:

62H25 Factor analysis and principal components; correspondence analysis
62G08 Nonparametric regression and quantile regression
62J12 Generalized linear models (logistic models)
65C60 Computational problems in statistics (MSC2010)

Software:

fda (R)
