×

A fast implementation of the ISODATA clustering algorithm. (English) Zbl 1118.65015

Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. The authors present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a \(kd\)-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. They also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point.
The authors provide both theoretical and empirical justification that their modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. They also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that the present approach has significantly lower running times.

MSC:

65D18 Numerical aspects of computer graphics, image analysis, and computational geometry
62H30 Classification and discrimination; cluster analysis (statistical aspects)
86A32 Geostatistics
65C60 Computational problems in statistics (MSC2010)
94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] DOI: 10.1007/PL00009387 · Zbl 0910.68215 · doi:10.1007/PL00009387
[2] DOI: 10.1145/293347.293348 · Zbl 1065.68650 · doi:10.1145/293347.293348
[3] DOI: 10.1137/S0097539702416402 · Zbl 1105.68118 · doi:10.1137/S0097539702416402
[4] DOI: 10.1145/361002.361007 · Zbl 0306.68061 · doi:10.1145/361002.361007
[5] L. Bottou and Y. Bengio, Advances in Neural Information Processing Systems 7, eds. G. Tesauro and D. Touretzky (MIT Press, Cambridge, MA, 1995) pp. 585–592.
[6] Feller W., An Introduction to Probability Theory and Its Applications (1968) · Zbl 0155.23101
[7] Forgey E., Biometrics 21 pp 768–
[8] DOI: 10.1145/355744.355745 · Zbl 0364.68037 · doi:10.1145/355744.355745
[9] Fukunaga K., Introduction to Statistical Pattern Recognition (1990) · Zbl 0711.62052
[10] Garey M. R., Computers and Intractability: A Guide to the Theory of NP-Completeness (1979) · Zbl 0411.68039
[11] DOI: 10.1007/978-1-4615-3626-0 · doi:10.1007/978-1-4615-3626-0
[12] DOI: 10.1016/0304-3975(85)90224-5 · Zbl 0567.62048 · doi:10.1016/0304-3975(85)90224-5
[13] DOI: 10.1007/s00454-004-2822-7 · Zbl 1094.68103 · doi:10.1007/s00454-004-2822-7
[14] DOI: 10.1007/s00453-004-1127-9 · Zbl 1069.68108 · doi:10.1007/s00453-004-1127-9
[15] Hochbaum D. S., Approximation Algorithms for NP-hard Problems (1997) · Zbl 1368.68010
[16] Jain A. K., Algorithms for Clustering Data (1988) · Zbl 0665.62061
[17] Jensen J. R., Introductory Digital Image Processing: A Remote Sensing Perspective (1996)
[18] DOI: 10.1109/TPAMI.2002.1017616 · doi:10.1109/TPAMI.2002.1017616
[19] DOI: 10.1016/j.comgeo.2004.03.003 · Zbl 1077.68109 · doi:10.1016/j.comgeo.2004.03.003
[20] DOI: 10.1002/9780470316801 · Zbl 1345.62009 · doi:10.1002/9780470316801
[21] DOI: 10.1007/978-3-642-88163-3 · doi:10.1007/978-3-642-88163-3
[22] DOI: 10.1109/TIT.1982.1056489 · Zbl 0504.94015 · doi:10.1109/TIT.1982.1056489
[23] DOI: 10.1023/A:1009735908398 · doi:10.1023/A:1009735908398
[24] DOI: 10.1007/s004540010019 · Zbl 0959.68126 · doi:10.1007/s004540010019
[25] DOI: 10.1109/TKDE.2002.1033770 · Zbl 05109550 · doi:10.1109/TKDE.2002.1033770
[26] DOI: 10.1214/aop/1176993713 · Zbl 0502.62055 · doi:10.1214/aop/1176993713
[27] DOI: 10.1007/978-3-662-03978-6 · doi:10.1007/978-3-662-03978-6
[28] Selim S. Z., IEEE Trans. Patt. Anal. Mach. Intell. 6 pp 81–
[29] DOI: 10.1007/PL00009311 · Zbl 0878.68131 · doi:10.1007/PL00009311
[30] Tou J. T., Pattern Recognition Principles (1974) · Zbl 0299.68058
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.