
A uniform framework for the combination of penalties in generalized structured models. (English) Zbl 1414.62321

Summary: Penalized estimation has become an established tool for regularization and model selection in regression models. A variety of penalties with specific features are available, and effective algorithms have been proposed for specific penalties. However, little is available for fitting models with a combination of different penalties. When modeling the rent data of Munich, as in our application, the various types of predictors call for a combination of a Ridge, a group Lasso and a Lasso-type penalty within one model. We propose to approximate penalties that are (semi-)norms of scalar linear transformations of the coefficient vector in generalized structured models, so that penalties of various kinds can be combined in one model. The approach is general enough to embed the Lasso, the fused Lasso, the Ridge, the smoothly clipped absolute deviation (SCAD) penalty, the elastic net and many more penalties. The computation is based on conventional penalized iteratively re-weighted least squares algorithms and is hence easy to implement; new penalties can be incorporated quickly. The approach extends to penalties with vector-based arguments. There are several possibilities to choose the penalty parameter(s), and a software implementation is available. Some illustrative examples show promising results.
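The key computational idea described above, replacing each (semi-)norm penalty term by a local quadratic approximation around the previous iterate so that every step reduces to a ridge-type weighted least-squares solve, can be sketched as follows. This is a minimal illustration for the Gaussian case only; the function name, the constant `c`, and the defaults are our own choices, not the paper's implementation, which handles general GLMs via penalized IRLS:

```python
import numpy as np

def lqa_penalized_ls(X, y, A, lam, n_iter=50, c=1e-8):
    """Penalized least squares with penalty lam * sum_j |a_j' beta|,
    where the a_j are the rows of A (A = identity gives the Lasso;
    difference rows give a fused-Lasso-type penalty).

    Each |a_j' beta| is replaced by its local quadratic approximation
    around the previous iterate, so every iteration is a ridge-type
    weighted least-squares solve."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # unpenalized start
    for _ in range(n_iter):
        # LQA weights w_j = 1 / sqrt((a_j' beta)^2 + c); c avoids
        # division by zero as coefficients are shrunk toward zero
        w = 1.0 / np.sqrt((A @ beta) ** 2 + c)
        P = lam * (A.T * w) @ A          # quadratic penalty matrix sum_j w_j a_j a_j'
        beta = np.linalg.solve(X.T @ X + P, X.T @ y)
    return beta
```

Because the penalty enters only through the rows of `A` (and could be given row-wise tuning parameters), penalties of different types can in principle be stacked into one such matrix, which is the combination idea the summary describes.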

MSC:

62J07 Ridge regression; shrinkage estimators (Lasso)
62J12 Generalized linear models (logistic models)

References:

[1] Antoniadis, A.; Fan, J., Regularization of wavelet approximations, J Am Stat Assoc, 96, 939-967, (2001) · Zbl 1072.62561
[2] Bondell, HD; Reich, BJ, Simultaneous factor selection and collapsing levels in ANOVA, Biometrics, 65, 169-177, (2009) · Zbl 1159.62048
[3] Claeskens, G.; Hjort, NL, Minimizing average risk in regression models, Econom Theory, 24, 493-527, (2008) · Zbl 1284.62454
[4] de Rooi, J.; Eilers, P., Deconvolution of pulse trains with the L0 penalty, Analytica Chimica Acta, 705, 218-226, (2011)
[5] Donoho, D.; Elad, M., Optimally sparse representation in general (nonorthogonal) dictionaries via \(l^1\) minimization, Proc Natl Acad Sci, 100, 2197-2202, (2003) · Zbl 1064.94011
[6] Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R., Least angle regression, Ann Stat, 32, 407-499, (2004) · Zbl 1091.62054
[7] Eilers, PHC; Marx, BD, Flexible smoothing with B-splines and penalties, Stat Sci, 11, 89-121, (1996) · Zbl 0955.62562
[8] Fahrmeir L, Belitz C, Biller C, Brezger A, Heim S, Hennerfeind A, Jerak A (2007) Statistik. Dokumentation und Analysen, Landeshauptstadt München, Sozialreferat, Amt für Wohnen und Migration
[9] Fahrmeir, L.; Kneib, T.; Konrath, S., Bayesian regularisation in structured additive regression: a unifying perspective on shrinkage, smoothing and predictor selection, Stat Comput, 20, 203-219, (2010)
[10] Fahrmeir, L.; Kneib, T.; Lang, S., Penalized structured additive regression for space-time data: a bayesian perspective, Stat Sinica, 14, 715-745, (2004) · Zbl 1073.62025
[11] Fahrmeir L, Tutz G (2001) Multivariate statistical modelling based on generalized linear models. Springer, New York · Zbl 0980.62052
[12] Fan, J.; Li, R., Variable selection via nonconcave penalized likelihood and its oracle properties, J Am Stat Assoc, 96, 1348-1360, (2001) · Zbl 1073.62547
[13] Frank, IE; Friedman, JH, A statistical view of some chemometrics regression tools, Technometrics, 35, 109-135, (1993) · Zbl 0775.62288
[14] Friedman JH, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1-22 (glmnet, R package version 1.9-8)
[15] Gertheiss, J.; Hogger, S.; Oberhauser, C.; Tutz, G., Selection of ordinally scaled independent variables with applications to international classification of functioning core sets, J R Stat Soc Ser C Appl Stat, 60, 377-395, (2011)
[16] Gertheiss, J.; Tutz, G., Sparse modeling of categorial explanatory variables, Ann Appl Stat, 4, 2150-2180, (2010) · Zbl 1220.62092
[17] Gertheiss, J.; Tutz, G., Regularization and model selection with categorial effect modifiers, Stat Sinica, 22, 957-982, (2012) · Zbl 1257.62078
[18] GIMP Team (2012) GNU Image Manipulation Program. http://www.gimp.org
[19] Goeman, JJ, L1 penalized estimation in the Cox proportional hazards model, Biom J, 52, 70-84, (2010) · Zbl 1207.62185
[20] Hastie T, Efron B (2013) lars: Least angle regression, Lasso and forward stagewise. R package version 1.2
[21] Hoerl, AE; Kennard, RW, Ridge regression: biased estimation for nonorthogonal problems, Technometrics, 12, 55-67, (1970) · Zbl 0202.17205
[22] Kneib T, Heinzl F, Brezger A, Bove DS, Klein N (2014) BayesX: R utilities accompanying the software package BayesX. R package version 0.2-6
[23] Koch, I., On the asymptotic performance of median smoothers in image analysis and nonparametric regression, Ann Stat, 24, 1648-1666, (1996) · Zbl 0867.62031
[24] Marx, BD; Eilers, PHC, Direct generalized additive modeling with penalized likelihood, J Comput Stat Data Anal, 28, 193-209, (1998) · Zbl 1042.62580
[25] McCullagh P, Nelder JA (1983) Generalized linear models. Chapman & Hall, London · Zbl 0588.62104
[26] Meier L (2013) grplasso: Fitting user specified models with group Lasso penalty. R package version 0.4-5
[27] Meier L, van de Geer S, Bühlmann P (2008) The group Lasso for logistic regression. J R Stat Soc Ser B Stat Methodol 70(1):53-71 · Zbl 1400.62276
[28] Meier, L.; van de Geer, S.; Bühlmann, P., High-dimensional additive modeling, Ann Stat, 37, 3779-3821, (2009) · Zbl 1360.62186
[29] Oelker M-R (2015) gvcm.cat: Regularized categorial effects/categorial effect modifiers in GLMs. R package version 1.9
[30] Osborne, MR; Turlach, BA, A homotopy algorithm for the quantile regression lasso and related piecewise linear problems, J Comput Graph Stat, 20, 972-987, (2011)
[31] R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. R version 3.1.3 (2015-03-09)
[32] Rippe, RCA; Meulman, JJ; Eilers, PHC, Visualization of genomic changes by segmented smoothing using an \(l_0\) penalty, PLoS One, 7, 1-14, (2012)
[33] Tibshirani, R., Regression shrinkage and selection via the LASSO, J R Stat Soc Ser B Stat Methodol, 58, 267-288, (1996) · Zbl 0850.62538
[34] Tibshirani, R.; Saunders, M.; Rosset, S.; Zhu, J.; Knight, K., Sparsity and smoothness via the fused LASSO, J R Stat Soc Ser B Stat Methodol, 67, 91-108, (2005) · Zbl 1060.62049
[35] Ulbricht J (2010) Variable selection in generalized linear models. Dissertation, Department of Statistics, Ludwig-Maximilians-Universität München, Verlag Dr. Hut
[36] Verbyla, AP; Cullis, BR; Kenward, MG; Welham, SJ, The analysis of designed experiments and longitudinal data by using smoothing splines, J R Stat Soc Ser C (Appl Stat), 48, 269-311, (1999) · Zbl 0956.62062
[37] Wang, H.; Leng, C., A note on adaptive group lasso, J Comput Stat Data Anal, 52, 5277-5286, (2008) · Zbl 1452.62524
[38] Wood, S., Stable and efficient multiple smoothing parameter estimation for generalized additive models, J Am Stat Assoc, 99, 673-686, (2004) · Zbl 1117.62445
[39] Wood S (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. J R Stat Soc Ser B 73(1):3-36 (mgcv, R package version 1.8-4)
[40] Yuan, M.; Lin, Y., Model selection and estimation in regression with grouped variables, J R Stat Soc Ser B Stat Methodol, 68, 49-67, (2006) · Zbl 1141.62030
[41] Zou, H., The adaptive LASSO and its oracle properties, J Am Stat Assoc, 101, 1418-1429, (2006) · Zbl 1171.62326
[42] Zou, H.; Hastie, T., Regularization and variable selection via the elastic net, J R Stat Soc Ser B Stat Methodol, 67, 301-320, (2005) · Zbl 1069.62054