×

An efficient algorithm for maximal margin clustering. (English) Zbl 1230.90133

Summary: Maximal margin based frameworks have emerged as a powerful tool for supervised learning. The extension of these ideas to the unsupervised case, however, is problematic since the underlying optimization entails a discrete component. In this paper, we first study the computational complexity of maximal hard margin clustering and show that the hard margin clustering problem can be \(precisely\) solved in \(O(n ^{d+2})\) time where \(n\) is the number of the data points and \(d\) is the dimensionality of the input data. However, since it is well known that many datasets commonly “express” themselves primarily in far fewer dimensions, our interest is in evaluating if a careful use of dimensionality reduction can lead to practical and effective algorithms. We build upon these observations and propose a new algorithm that gradually increases the number of features used in the separation model in each iteration, and analyze the convergence properties of this scheme. We report on promising numerical experiments based on a “truncated” version of this approach. Our experiments indicate that for a variety of datasets, good solutions equivalent to those from other existing techniques can be obtained in significantly less time.

MSC:

90C09 Boolean programming
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Vapnik V.: The Nature of Statistical Learning Theory. Springer, New York (1995) · Zbl 0833.62008
[2] Schölkopf B., Smola A.: Learning with Kernels Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge (2002)
[3] Xu, L., Neufeld, J., Larson, B., Schuurmans, D.: Maximum margin clustering. In: Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge (2005)
[4] Xu, L., Schuurmans, D.: Unsupervised and semi-supervised multi-class support vector machines. In: Proceedings of the 20th National Conference on Artificial Intelligence (AAAI) (2005)
[5] Valizadegan, H., Jin, R.: Generalized maximum margin clustering and unsupervised kernel learning. In: Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge (2006)
[6] Bennett, K., Demiriz, A.: Semi-supervised support vector machines. In: Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge (1998)
[7] Bie, T. D., Cristianini, N.: Convex methods for transduction. In: Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge (2004) · Zbl 1104.68601
[8] De Bie T., Cristianini N.: Fast SDP relaxations of graph cut clustering, transduction, and other combinatorial problems. J. Mach. Learn. Res. 7, 1409–1436 (2006) · Zbl 1222.68179
[9] Vandenberghe L., Boyd S.: Semidefinite programming. SIAM Rev. 38, 49–95 (1996) · Zbl 0845.65023 · doi:10.1137/1038003
[10] Zhang, K., Tsang, I.W., Kwok, J.T.: Maximum margin clustering made practical. In: International Conference on Machine learning (ICML), pp. 1119–1126 (2007)
[11] Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems (NIPS). MIT Press, Cambridge (2002)
[12] Prokopyev O.A., Busygin S., Pardalos P.M.: An optimization based approach for data classification. Optim. Methods Softw. 2, 3–9 (2007) · Zbl 1116.62068
[13] Xu R., Wunsch D.: Survey of clustering algorithms. IEEE Trans. Neural Netw. 16, 645–648 (2005) · doi:10.1109/TNN.2005.845141
[14] Sherali H.D., Desai J.: A global optimization rlt-based approach for solving the hard clustering problem. J. Global Optim. 32, 281–306 (2005) · Zbl 1123.62045 · doi:10.1007/s10898-004-2706-7
[15] Sherali H.D., Desai J.: A global optimization rlt-based approach for solving the fuzzy clustering approach. J. Global Optim. 33, 597–615 (2005) · Zbl 1097.90072 · doi:10.1007/s10898-004-7390-0
[16] Butenko S., Chaovalitwongse W., Pardalos P.M.: Clustering Challenges in Biological Networks. World Scientific, Singapore (2009)
[17] Bradley P.S., Mangasarian O.L.: k-plane clustering. J. Global Optim. 16, 23–32 (2000) · Zbl 0990.90135 · doi:10.1023/A:1008324625522
[18] Du D., Jung Y., Park H., Drake B.L.: A decision criterion for the optimal number of clusters in hierarchical clustering. J. Global Optim. 25, 91–111 (2003) · doi:10.1023/A:1021394316112
[19] Shi J., Malik J.: Normalized cut and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22, 888–905 (2000) · Zbl 05111961 · doi:10.1109/34.868688
[20] Dhillon, I.S., Guan, Y., Kulis B.: Kernel k-means, spectral clustering and normalized cuts. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp. 551–556 (2004)
[21] Chen, H., Peng, J.: 0–1 semidefinite programming for graph-cut clustering: modelling and approximation. In: Pardalos, P.M., Hansen, P. (eds.) Data Mining and Mathematical Programming. CRM Proceedings and Lecture Notes of the American Mathematical Society, pp. 15–40 (2008) · Zbl 1213.90183
[22] Johnson W. B., Lindenstrauss J.: Extensions of lipshitz mapping into hilbert space. Contemp. Math. 26, 189–206 (1984) · Zbl 0539.46017
[23] Musicant, D. R.: NDC: normally distributed clustered datasets (1998) http://www.cs.wisc.edu/dmi/svm/ndc/
[24] Achlioptas, D.: Database-friendly random projections. In: ACM Symposium on Principles of Database Systems (PODS), pp. 274–281 (2001)
[25] Graham, D. B., Allinson, N. M.: In: Face Recognition: From Theory to Applications, vol. 163, chapter Characterizing Virtual Eigensignatures for General Purpose Face Recognition. NATO ASI Series F, Computer and Systems Sciences, pp. 446–456 (1998)
[26] Samaria, F., Harter, A.: Parameterisation of a stochastic model for human face identification. In: Proceedings of 2nd IEEE Workshop on Applications of Computer Vision (1994)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.