Convexity-based clustering criteria: theory, algorithms, and applications in statistics. (English) Zbl 1058.62051

Summary: This paper deals with the construction of optimum partitions \({\mathcal B}= (B_1,\dots, B_m)\) of \(\mathbb{R}^p\) for a clustering criterion which is based on a convex function of the class centroids \(E[X\mid X\in B_i]\) as a generalization of the classical SSQ clustering criterion for \(n\) data points. We formulate a dual optimality problem involving two sets of variables and derive a maximum-support-plane (MSP) algorithm for constructing a (sub-)optimum partition as a generalized \(k\)-means algorithm. We present various modifications of the basic criterion and describe the corresponding MSP algorithm. It is shown that the method can also be used for solving optimality problems in classical statistics (maximizing Csiszár’s \(\varphi\)-divergence) and for simultaneous classification of the rows and columns of a contingency table.
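
The alternation behind a maximum-support-plane iteration can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the criterion has the form "maximize \(\sum_i P(B_i)\,\phi(E[X\mid X\in B_i])\)" for a convex \(\phi\), and that each step assigns a point \(x\) to the class whose support plane \(t_i(x)=\phi(z_i)+\nabla\phi(z_i)^\top(x-z_i)\) at the current centroid \(z_i\) is largest. The names `msp_cluster`, `phi_sq` and `grad_phi_sq` are hypothetical. With the assumed choice \(\phi(z)=\|z\|^2\) one has \(t_i(x)=\|x\|^2-\|x-z_i\|^2\), so the rule reduces to the nearest-centroid step of ordinary \(k\)-means.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def phi_sq(z: Sequence[float]) -> float:
    # Assumed convex function: phi(z) = ||z||^2 (recovers the SSQ criterion).
    return dot(z, z)


def grad_phi_sq(z: Sequence[float]) -> Vector:
    # Gradient of ||z||^2.
    return [2.0 * v for v in z]


def centroid(points: Sequence[Sequence[float]]) -> Vector:
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def msp_cluster(
    data: Sequence[Sequence[float]],
    init_centers: Sequence[Sequence[float]],
    phi: Callable[[Sequence[float]], float] = phi_sq,
    grad_phi: Callable[[Sequence[float]], Vector] = grad_phi_sq,
    max_iter: int = 100,
) -> Tuple[List[int], List[Vector]]:
    """Hypothetical sketch: alternate support-plane assignment and centroid update."""
    centers = [list(c) for c in init_centers]
    labels: List[int] = []
    for _ in range(max_iter):
        # Support plane of phi at each centroid z: t(x) = phi(z) + grad(z).(x - z)
        planes = [(phi(z), grad_phi(z), z) for z in centers]
        new_labels = []
        for x in data:
            scores = [
                p + dot(g, [xi - zi for xi, zi in zip(x, z)])
                for p, g, z in planes
            ]
            # Assign x to the class with the maximum support plane value.
            new_labels.append(max(range(len(scores)), key=scores.__getitem__))
        if new_labels == labels:
            break  # partition is stationary
        labels = new_labels
        # Update each centroid as the mean of its class; empty classes keep theirs.
        for i in range(len(centers)):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:
                centers[i] = centroid(members)
    return labels, centers
```

For example, `msp_cluster([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]], [[0.0, 0.0], [10.0, 10.0]])` separates the two well-spaced pairs of points. Passing a different convex `phi` together with its gradient changes the assignment rule without changing the loop.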


62H30 Classification and discrimination; cluster analysis (statistical aspects)
90C90 Applications of mathematical programming