On complexity of some problems of cluster analysis of vector sequences.

*(Russian, English)*Zbl 1324.68047
Diskretn. Anal. Issled. Oper. 20, No. 2, 47-57 (2013); translation in J. Appl. Ind. Math. 7, No. 3, 363-369 (2013).

Summary: NP-completeness of two clustering (partition) problems is proved for a finite sequence of Euclidean vectors. In the optimization versions of both problems it is required to partition the elements of the sequence into a fixed number of clusters minimizing the sum of squares of the distances from the cluster elements to their centers. In the first problem the sizes of clusters are the part of input, while in the second they are unknown (they are the variables for optimization). Except for the center of one (special) cluster, the center of each cluster is the mean value of all vectors contained in it. The center of the special cluster is zero. Also, the partition must satisfy the following condition: The difference between the indices of two consecutive vectors in every nonspecial cluster is bounded below and above by two given constants.

##### MSC:

68Q17 | Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.) |

62H30 | Classification and discrimination; cluster analysis (statistical aspects) |