Sparse logistic regression with \(L_p\) penalty for biomarker identification. (English) Zbl 1166.62314

Summary: We propose a novel method for sparse logistic regression with non-convex regularization \(L_p\) \((p<1)\). Based on a smooth approximation, we develop several fast algorithms for learning a classifier that is applicable to high-dimensional data sets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an \(L_p\) and elastic net (\(L_e\)) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those in the literature. Our computational results show that \(L_p\) logistic regression \((p<1)\) outperforms \(L_1\) logistic regression and SCAD SVM. Software is available upon request from the first author.
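
To illustrate the general idea of fitting a logistic model with a smoothed \(L_p\) penalty, the following is a minimal sketch and not the authors' algorithm. The summary does not specify which smooth approximation is used, so the surrogate \(\sum_j (w_j^2+\varepsilon)^{p/2}\) for \(\sum_j |w_j|^p\), the plain gradient descent with backtracking line search, and all function names and default values (`lam`, `p`, `eps`) are assumptions made for illustration only.

```python
import numpy as np


def smoothed_lp_penalty(w, p=0.5, eps=1e-4):
    # Assumed smooth surrogate for sum_j |w_j|^p: sum_j (w_j^2 + eps)^(p/2).
    return np.sum((w ** 2 + eps) ** (p / 2.0))


def smoothed_lp_grad(w, p=0.5, eps=1e-4):
    # Derivative of (w^2 + eps)^(p/2) with respect to w, elementwise.
    return p * w * (w ** 2 + eps) ** (p / 2.0 - 1.0)


def logistic_loss(w, X, y):
    # Mean negative log-likelihood for labels y in {0, 1}.
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)


def logistic_grad(w, X, y):
    z = X @ w
    prob = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    return X.T @ (prob - y) / len(y)


def objective(w, X, y, lam, p, eps):
    return logistic_loss(w, X, y) + lam * smoothed_lp_penalty(w, p, eps)


def gradient(w, X, y, lam, p, eps):
    return logistic_grad(w, X, y) + lam * smoothed_lp_grad(w, p, eps)


def fit_smoothed_lp_logistic(X, y, lam=0.05, p=0.5, eps=1e-4, n_iter=200):
    # Gradient descent with Armijo backtracking; because the penalty is
    # non-convex, this only finds a local minimizer of the smoothed objective.
    w = np.zeros(X.shape[1])
    f = objective(w, X, y, lam, p, eps)
    for _ in range(n_iter):
        g = gradient(w, X, y, lam, p, eps)
        step = 1.0
        while step > 1e-12:
            w_new = w - step * g
            f_new = objective(w_new, X, y, lam, p, eps)
            if f_new <= f - 1e-4 * step * (g @ g):
                w, f = w_new, f_new
                break
            step *= 0.5
        else:
            break  # no step gave sufficient decrease; stop early
    return w


if __name__ == "__main__":
    # Synthetic data (hypothetical): only features 0 and 3 carry signal.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 8))
    y = (2.0 * X[:, 0] - 1.5 * X[:, 3] > 0).astype(float)
    print(np.round(fit_smoothed_lp_logistic(X, y), 3))
```

In a setting like the one the summary describes, `lam` and `p` would be tuned over a grid by AUC rather than fixed as here. Coefficients that the penalty shrinks toward zero would then be thresholded to obtain the selected features, since a smoothed penalty does not produce exact zeros on its own.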

MSC:

62J12 Generalized linear models (logistic models)
65C60 Computational problems in statistics (MSC2010)
92C40 Biochemistry, molecular biology