A semi-supervised competitive agglomeration algorithm based on dot density and applications in image clustering. (Chinese. English summary) Zbl 1324.68161

Summary: Competitive agglomeration is a classic clustering algorithm that determines the number of clusters automatically. During its iterations it identifies and discards spurious cluster centers until the number of remaining clusters suits the sample data. This avoids the effect of a badly chosen parameter on the clustering result, and no exact cluster number has to be specified in advance. However, the algorithm ignores known information, which is scarce but commonly present in sample data. This information matters for the clustering result, and making proper use of it helps to improve clustering accuracy. Moreover, the algorithm uses the Euclidean distance as its similarity function. Although this distance is cheap to compute and widely used, it is suited only to spherical clusters and tends to partition data sets into clusters of equal size. Given the diversity of sample data that may need to be clustered, these limitations restrict the scope of application of the algorithm. To address these problems, a semi-supervised term is introduced to enhance the partitioning capability of the membership matrix; it gives the algorithm a learning ability that lets it make full use of the known information in the sample data. In addition, a distance correction based on dot density is constructed. The dot density reflects the importance of a point in the clustering and is used to adjust the Euclidean distance, so that the distance no longer drives the result toward an equal partition. Finally, a semi-supervised algorithm based on density is proposed. Four images, divided into two groups (artificial images and real images), are used to examine the segmentation performance, and three other algorithms are used for comparison. The segmentation results and the performance comparison show that the proposed algorithm obtains more accurate center values and better clustering results.
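
As a rough illustration of two ideas named in the summary, the sketch below shows (a) how clusters with too little support might be discarded during the iterations and (b) one simple way to estimate a per-point density. It is not the authors' method: the summary gives no formulas, so the function names, the fuzzy-cardinality threshold `min_cardinality`, and the radius-based density estimate are all assumptions made for illustration only.

```python
import numpy as np


def prune_clusters(U, centers, min_cardinality):
    """Discard clusters whose fuzzy cardinality is below a threshold.

    U        : (c, n) membership matrix; each column sums to 1.
    centers  : (c, d) array of cluster centers.
    min_cardinality : hypothetical threshold, not taken from the paper.
    """
    cardinality = U.sum(axis=1)              # N_i = sum over points of u_ij
    keep = cardinality >= min_cardinality    # boolean mask of surviving clusters
    U_kept = U[keep]
    # Renormalize so each point's remaining memberships sum to 1 again;
    # the small floor avoids division by zero for unsupported points.
    col_sums = np.maximum(U_kept.sum(axis=0, keepdims=True), 1e-12)
    return U_kept / col_sums, centers[keep]


def point_density(X, radius):
    """Assumed density estimate: fraction of all points (the point itself
    included) lying within `radius` of each point of X, shape (n, d)."""
    diffs = X[:, None, :] - X[None, :, :]          # pairwise differences
    dist = np.sqrt((diffs ** 2).sum(axis=-1))      # pairwise Euclidean distances
    return (dist <= radius).mean(axis=1)
```

A density value of this kind could then be used to rescale the Euclidean distance between a point and a center; how the paper actually combines the two is not stated in the summary.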

MSC:

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence
68U10 Computing methodologies for image processing