Design of blurring mean-shift algorithms for data classification. (English) Zbl 1349.62267

Summary: The mean-shift algorithm is an iterative method of mode seeking and data clustering based on the kernel density estimator. The blurring mean-shift is an accelerated version that uses the original data only in the first step and then re-smooths the previous estimates. It converges to local centroids but may suffer from asymptotic bias, which depends fundamentally on the design of its smoothing components. This paper develops nearest-neighbor implementations and data-driven techniques of bandwidth selection, which enhance the clustering performance of the blurring method. These solutions can be applied to the whole class of mean-shift algorithms, including the iterative local mean method. Extensive simulation experiments and applications to well-known data sets show that the blurring estimator performs well relative to other algorithms.
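The blurring step described in the summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes a Gaussian kernel with a single fixed bandwidth, and the function names (`blurring_mean_shift`, `label_modes`) and parameters (`bandwidth`, `merge_radius`, `tol`) are hypothetical names chosen for illustration. The nearest-neighbor and data-driven bandwidth choices developed in the paper are not reproduced here.

```python
import numpy as np


def blurring_mean_shift(X, bandwidth, max_iter=100, tol=1e-6):
    """Gaussian blurring mean-shift with a fixed bandwidth (illustrative sketch).

    The original data X enter only as the starting values; every later
    iteration re-smooths the previous estimates Y with weights computed
    from Y itself.
    """
    Y = np.asarray(X, dtype=float).copy()
    for _ in range(max_iter):
        # Pairwise squared distances among the current estimates.
        diff = Y[:, None, :] - Y[None, :, :]
        d2 = np.sum(diff ** 2, axis=-1)
        # Gaussian kernel weights; each row is normalized below.
        W = np.exp(-0.5 * d2 / bandwidth ** 2)
        Y_new = (W @ Y) / W.sum(axis=1, keepdims=True)
        # Stop when no point moves by more than tol.
        shift = np.max(np.linalg.norm(Y_new - Y, axis=1))
        Y = Y_new
        if shift < tol:
            break
    return Y


def label_modes(Y, merge_radius):
    """Group converged points: points closer than merge_radius share a label."""
    centers = []
    labels = np.empty(len(Y), dtype=int)
    for i, y in enumerate(Y):
        for j, c in enumerate(centers):
            if np.linalg.norm(y - c) < merge_radius:
                labels[i] = j
                break
        else:
            # No existing center is close enough, so start a new cluster.
            centers.append(y)
            labels[i] = len(centers) - 1
    return labels, np.array(centers)
```

A typical call would be `Y = blurring_mean_shift(X, bandwidth=1.0)` followed by `labels, centers = label_modes(Y, merge_radius=0.5)`. The early stopping rule matters in this sketch: with a Gaussian kernel the blurred points keep drifting toward each other, so iterating far beyond the point where clusters have collapsed would eventually merge them.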

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
62G05 Nonparametric estimation
62G07 Density estimation
68T10 Pattern recognition, speech recognition
