A new theoretical framework for \(K\)-means-type clustering. (English) Zbl 1085.68132

Chu, Wesley (ed.) et al., Foundations and advances in data mining. Berlin: Springer (ISBN 3-540-25057-3/hbk). Studies in Fuzziness and Soft Computing 180, 79-96 (2005).
Summary: One of the fundamental clustering problems is to assign \(n\) points into \(k\) clusters based on the Minimal Sum-of-Squares (MSSC), which is known to be NP-hard. In this paper, by using matrix arguments, we first model MSSC as a so-called 0-1 Semi-Definite Programming (SDP). The classical \(K\)-means algorithm can be interpreted as a special heuristic for the underlying 0-1 SDP. Moreover, the 0-1 SDP model can be further approximated by the relaxed and polynomially solvable linear and semidefinite programming. This opens new avenues for solving MSSC. The 0-1 SDP model can be applied not only to MSSC, but also to other scenarios of clustering as well. In particular, we show that the recently proposed normalized \(k\)-cut and spectral clustering can also be embedded into the 0-1 SDP model in various kernel spaces.
For the entire collection see [Zbl 1073.68004].


68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition