×

zbMATH — the first resource for mathematics

Block clustering based on difference of convex functions (DC) programming and DC algorithms. (English) Zbl 1418.62253
Summary: We investigate difference of convex functions (DC) programming and the DC algorithm (DCA) to solve the block clustering problem in the continuous framework, which traditionally requires solving a hard combinatorial optimization problem. DC reformulation techniques and exact penalty in DC programming are developed to build an appropriate equivalent DC program of the block clustering problem. They lead to an elegant and explicit DCA scheme for the resulting DC program. Computational experiments show the robustness and efficiency of the proposed algorithm and its superiority over standard algorithms such as two-mode K-means, two-mode fuzzy clustering, and block classification EM.

MSC:
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90C25 Convex programming
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Banfield, J. D., & Raftery, A. E. (1993). Model-based gaussian and non gaussian clustering. Biometrics, 49, 803-821. , · Zbl 0794.62034
[2] Bezdek, J. C. (1973). Fuzzy mathematics in pattern classification. Unpublished doctoral dissertation, Cornell University, Ithaca, NY.
[3] Bezdek, J. C. (1981). Pattern recognition with fuzzy objective function algorithm. New York: Plenum Press. , · Zbl 0503.68069
[4] Bhattacharjee, A., Richards, W. G., Staunton, J., Li, C., Monti, S., Vasa, P., (2001). Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA, 98(24), 13790-13795. ,
[5] Bradley, B. S., & Mangasarian, O. L. (1998). Feature selection via concave minimization and support vector machines. In J. Shavlik (Ed.), Machine Learning Proceedings of the Fifteenth International Conferences (pp. 82-90). San Francisco: Morgan Kaufmann.
[6] Celeux, G., & Govaert, G. (1992). A classification EM algorithm for clustering and two stochastic versions. Comput. Statist. Data Anal., 14(3), 315-332. , · Zbl 0937.62605
[7] Celeux, G., & Govaert, G. (1993). Comparison of the mixture and the classification maximum likelihood in cluster analysis. J. Stat. Comput. Simul., 47, 127-146. ,
[8] Collobert, R., Sinz, F., Weston, J., & Bottou, L. (2006). Trading convexity for scalability. In Proceedings of the 23rd International Conference on Machine Learning. New York: ACM. ,
[9] Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1), 1-38. · Zbl 0364.62022
[10] Govaert, G., & Nadif, M. (2003). Clustering with block mixture models. Pattern Recognition, 36, 463-473. , · Zbl 1452.62444
[11] Govaert, G., & Nadif, M. (2008). Block clustering with Bernoulli mixture models: Comparison of different approaches. Computational Statistics and Data Analysis, 52, 3233-3245. , · Zbl 1452.62444
[12] Hartigan, J. A. (1975). Clustering algorithms. New York: Wiley. · Zbl 0372.62040
[13] Krause, N., & Singer, Y. (2004). Leveraging the margin more carefully. In Proceedings of the International Conference on Machine Learning ICML’04. San Mateo, CA: Morgan Kaufmann. ,
[14] Le Thi, H. A. (N.d.). DC Programming and DCA. http://lita.sciences.univ-metz.fr/ lethi. · Zbl 1322.90072
[15] Le Thi, H. A. (1997). Contribution à l’optimisation non convexe et l’optimisation globale: Théorie, algorithmes et applications. Habilitation à Diriger des Recherches, Université de Rouen.
[16] Le Thi, H. A., Belghiti, T., & Pham Dinh, T. (2006). A new efficient algorithm based on DC programming and DCA for clustering. Journal of Global Optimization, 37, 593-608. · Zbl 1198.90327
[17] Le Thi, H. A., Le, H. M., Nguyen, V. V., & Pham Dinh, T. (2008). A DC programming approach for feature selection in support vector machine learning. Journal of Advances in Data Analysis and Classification, 2(3), 259-278. , · Zbl 1284.90057
[18] Le Thi, H. A., Le, H. M., & Pham Dinh, T. (2006). Optimization based DC programming and DCA for hierarchical clustering. European Journal of Operational Research, 183, 1067-1085. · Zbl 1149.90117
[19] Le Thi, H. A., Le, H. M., & Pham Dinh, T. (2007). Fuzzy clustering based on nonconvex optimisation approaches using difference of convex (DC) functions algorithms. Journal of Advances in Data Analysis and Classification, 2, 1-20. · Zbl 1301.90072
[20] Le Thi, H. A., Nguyen, V. V., & Ouchani, S. (2008). Gene selection for cancer classification using DCA. In Proceedings of the 4th International Conference on Advanced Data Mining and Classification (pp. 62-72). New York: Springer-Verlag. ,
[21] Le Thi, H. A., & Pham Dinh, T. (1997). Solving a class of linearly constrained indefinite quadratic problems by DC algorithms. Journal of Global Optimization, 11(3), 253-285. , · Zbl 0905.90131
[22] Le Thi, H. A., & Pham Dinh, T. (2005). The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Annals of Operations Research, 133, 23-46. , · Zbl 1116.90122
[23] Le Thi, H. A., Pham Dinh, T., & Huynh, V. N. (2012). Exact penalty techniques in DC programming. Journal of Global Optimization, 52(3), 509-535. , · Zbl 1242.49037
[24] Liu, Y., & Shen, X. (2006). Multicategory ##img##-learning. Journal of the American Statistical Association, 101, 500-509. , · Zbl 1119.62341
[25] Liu, Y., Shen, X., & Doss, H. (2005). Multicategory ##img##-learning and support vector machine: Computational tools. Journal of Computational and Graphical Statistics, 14, 219-236. ,
[26] Mechelen, V., Bock, H. H., & De Boeck, P. (2004). Two-mode clustering methods: A structured overview. Statistical Methods in Medical Research, 13, 363-394. , · Zbl 1053.62078
[27] Neumann, J., Schnörr, C., & Steidl, G. (2004). SVM-based feature selection by direct objective minimisation. In Proc. of 26th DAGM Symposium (pp. 212-219). New York: Springer-Verlag. ,
[28] Nutt, C. L., Mani, D. R., Betensky, R. A., Tamayo, P., Cairncross, J. G., Ladd, C., (2003). Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. Cancer Res., 63(7), 1602-1607.
[29] Ong, C. S., & Le Thi, H. A. (2011). Learning with sparsity by difference of convex functions algorithm. J. Optimization Methods and Software. doi:10.1080/10556788.2011.652630
[30] Pham Dinh, T., & Le Thi, H. A. (1998). DC optimization algorithms for solving the trust region subproblem. SIAM J. Optimization, 8, 476-505. , · Zbl 0913.65054
[31] Pomeroy, S. L., Tamayo, P., Gaasenbeek, M., Sturla, L. M, Angelo, M., McLaughlin, M. E., (2002). Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature, 415, 436-442. ,
[32] Ramaswamy, S., Tamayo, P., Rifkin, R., Mukherjee, S., Yeang, C. H., Angelo, M., (2001). Multiclass cancer diagnosis using tumor gene expression signatures. Proc. Natl. Acad. Sci. USA, 98, 15149-15154. ,
[33] Rosmalen, J. V., Groenen, P. J. F., Trejos, J., & Castillo, W. (2005). Global optimization strategies for two-mode clustering (Rep., EI-2005-33). Econometric Institute. · Zbl 1337.62145
[34] Shen, X., Tseng, G. C., Zhang, X., & Wong, W. H. (2003). ##img##-learning. Journal of American Statistical Association, 98, 724-734. , · Zbl 1052.62095
[35] Symons, M. J. (1981). Clustering criteria and multivariate normal mixtures. Biometrics, 37, 35-43. , · Zbl 0473.62048
[36] Weber, S., Schüle, T., & Schnörr, C. (2005). Prior learning and convex-concave regularization of binary tomography. Electr. Notes in Discr. Math., 20, 313-327. , · Zbl 1179.68192
[37] Yuille, A. L., & Rangarajan, A. (2002). The convex concave procedure (CCCP). In T. G. Dietterich, S. Becker, & Z. Ghahramani (Eds.), Advances in neural information processing systems, 14. Cambridge, MA: MIT Press. · Zbl 1022.68112
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.