A robust method for cluster analysis. (English) Zbl 1064.62074

Summary: Let there be given a contaminated list of \(n\) \(\mathbb{R}^d\)-valued observations coming from \(g\) different, normally distributed populations with a common covariance matrix. We compute the ML-estimator with respect to a certain statistical model with \(n-r\) outliers for the parameters of the \(g\) populations; it detects outliers and simultaneously partitions their complement into \(g\) clusters. It turns out that the estimator unites both the minimum-covariance-determinant rejection method and the well-known pooled determinant criterion of cluster analysis. We also propose an efficient algorithm for approximating this estimator and study its breakdown points for mean values and pooled within-groups sum of squares and products matrices.
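
As a hedged illustration of how the two criteria combine, the objective of such an estimator can be sketched as follows. The notation (\(R\), \(C_j\), \(\bar{x}_j\), \(W\)) is assumed here for illustration and does not appear in the summary: \(R \subseteq \{1,\dots,n\}\) with \(|R| = r\) is the set of retained observations, and \(C_1,\dots,C_g\) is a partition of \(R\) into \(g\) clusters.

\[
\bar{x}_j = \frac{1}{|C_j|}\sum_{i \in C_j} x_i, \qquad
W(C_1,\dots,C_g) = \sum_{j=1}^{g} \sum_{i \in C_j} (x_i - \bar{x}_j)(x_i - \bar{x}_j)^{\top}, \qquad
\min_{R,\,(C_1,\dots,C_g)} \det W(C_1,\dots,C_g).
\]

Under this reading, \(g = 1\) gives the minimum-covariance-determinant criterion applied to \(r\) of the \(n\) points, and \(r = n\) gives the pooled determinant criterion for a partition of all observations.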


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62F35 Robustness and adaptive procedures (parametric inference)
65C60 Computational problems in statistics (MSC2010)