A cost-sensitive constrained Lasso. (English) Zbl 07363868

Summary: The Lasso has become a benchmark data analysis procedure, and numerous variants have been proposed in the literature. Although Lasso formulations are stated so that overall prediction error is optimized, they give no full control over prediction accuracy for particular individuals of interest. In this work we propose a novel version of the Lasso in which quadratic performance constraints are added to Lasso-based objective functions, with threshold values set to bound the prediction errors in the different groups of interest (not necessarily disjoint). As a result, a constrained sparse regression model is defined by a nonlinear optimization problem. This cost-sensitive constrained Lasso has a direct application in heterogeneous samples where data are collected from distinct sources, as is standard in many biomedical contexts. Both theoretical properties and empirical studies concerning the new method are explored in this paper. In addition, two illustrations of the method in biomedical and sociological contexts are considered.
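
As a reading aid, the kind of problem the summary describes can be sketched as follows. This is only one plausible way to write it, under assumed notation: the symbols \(X\), \(y\), \(\beta\), \(\lambda\), \(n_k\), \(f_k\) and \(K\) are illustrative and are not taken from the paper, whose exact objective and scaling may differ.

```latex
% Sketch only: a Lasso objective with one quadratic performance constraint per group.
% X (n x p design matrix), y (responses), lambda >= 0 (sparsity penalty) are assumed notation.
% (X_k, y_k) are the n_k observations in group k; groups may overlap.
% f_k > 0 is the threshold bounding the mean squared prediction error in group k.
\begin{aligned}
\min_{\beta \in \mathbb{R}^p} \quad
  & \frac{1}{n}\,\lVert y - X\beta \rVert_2^2 + \lambda\,\lVert \beta \rVert_1 \\
\text{subject to} \quad
  & \frac{1}{n_k}\,\lVert y_k - X_k\beta \rVert_2^2 \le f_k,
  \qquad k = 1, \dots, K.
\end{aligned}
```

In a formulation of this shape, each constraint is a convex quadratic in \(\beta\), which is consistent with the summary's description of a nonlinear optimization problem. Thresholds \(f_k\) chosen too small can make the feasible set empty, so in practice they would have to be set at or above the error levels attainable in each group.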


62-07 Data analysis (statistics) (MSC2010)
62H12 Estimation in multivariate analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
62P99 Applications of statistics


Software: UCI-ml; Gurobi

