Inter-class sparsity based discriminative least square regression. (English) Zbl 1439.62163

Summary: Least squares regression is a very popular supervised classification method. However, two main issues greatly limit its performance. First, it focuses only on fitting the input features to the corresponding output labels and ignores the correlations among samples. Second, the label matrix it uses, i.e., the zero-one label matrix, is inappropriate for classification. To solve these problems and improve performance, this paper presents a novel method for multi-class classification, inter-class sparsity based discriminative least square regression (ICS_DLSR). Unlike other methods, the proposed method requires the transformed samples to share a common sparsity structure within each class. To this end, an inter-class sparsity constraint is introduced into the least squares regression model so that the margins between samples from the same class are greatly reduced while those between samples from different classes are enlarged. In addition, an error term with a row-sparsity constraint is introduced to relax the strict zero-one label matrix, which gives the method more flexibility in learning the discriminative transformation matrix. Together, these factors encourage the method to learn a more compact and discriminative transformation for regression, and thus it has the potential to perform better than other methods. Extensive experimental results show that the proposed method achieves the best performance among the compared methods for multi-class classification.
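
The summary does not state the objective function, so the following is only a minimal sketch of how the three ingredients it names (a least squares fit to a relaxed label matrix, a per-class sparsity penalty on the transformed samples, and a row-sparse error term) might be combined. The exact form of the objective, the use of the l2,1 norm for both sparsity terms, the added ridge term, and all names (`l21_norm`, `ics_dlsr_objective`, `lam1`, `lam2`, `lam3`) are assumptions made for illustration, not the authors' published formulation.

```python
import numpy as np

def l21_norm(M):
    """Sum of the Euclidean norms of the rows of M (a row-sparsity penalty)."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

def ics_dlsr_objective(Q, E, X, Y, labels, lam1, lam2, lam3):
    """Evaluate an assumed ICS_DLSR-style objective.

    X: (d, n) samples stored as columns; Y: (c, n) zero-one label matrix;
    Q: (c, d) transformation matrix; E: (c, n) error term relaxing Y;
    labels: length-n array of class indices.
    """
    labels = np.asarray(labels)
    # Least squares fit of the transformed samples to the relaxed labels Y + E.
    fit = 0.5 * np.linalg.norm(Y + E - Q @ X, "fro") ** 2
    # Ridge term on Q (an assumed regularizer, not mentioned in the summary).
    ridge = 0.5 * lam1 * np.linalg.norm(Q, "fro") ** 2
    # Per-class row sparsity: transformed samples of one class are pushed
    # to share the same set of nonzero rows.
    inter_class = sum(l21_norm(Q @ X[:, labels == k]) for k in np.unique(labels))
    # Row-sparse error term.
    return fit + ridge + lam2 * inter_class + lam3 * l21_norm(E)
```

Under these assumptions, a model of this kind would typically be fitted by alternating updates of `Q` and `E`; the sketch only evaluates the objective and does not implement an optimizer.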

MSC:

62J02 General nonlinear regression
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62M45 Neural nets and related approaches to inference from stochastic processes
68T05 Learning and adaptive systems in artificial intelligence
