The \(k\)-means algorithm for 3D shapes with an application to apparel design. (English) Zbl 1414.62295

Summary: Clustering of objects according to their shapes is of key importance in many scientific fields. In this paper we focus on the case where the shape of an object is represented by a configuration matrix of landmarks. It is well known that this shape space has a finite-dimensional Riemannian manifold structure (non-Euclidean), which makes it difficult to work with. Papers about clustering on this space are scarce in the literature. The foundation of the \(k\)-means algorithm is the fact that the sample mean is the value that minimizes the sum of squared Euclidean distances from each point to the centroid of the cluster to which it belongs, so our idea is to integrate Procrustes-type distances and the Procrustes mean into the \(k\)-means algorithm to adapt it to the shape analysis context. As far as we know, there have been just two attempts in this direction. In this paper we propose to adapt the classical Lloyd \(k\)-means algorithm to the context of statistical shape analysis, focusing on the three-dimensional case. We present a study comparing its performance with the Hartigan-Wong \(k\)-means algorithm, which had previously been adapted to the field of statistical shape analysis. We demonstrate the better performance of the Lloyd version and, finally, we propose adding a trimming procedure. We apply both to a 3D database obtained from an anthropometric survey of the Spanish female population conducted in Spain in 2006. The algorithms presented in this paper are available in the Anthropometry R package, whose most current version is always available from the Comprehensive R Archive Network.
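The idea described in the summary can be illustrated with a minimal sketch: a Lloyd-type loop in which the Euclidean distance is replaced by a Procrustes distance and the cluster centroid by a Procrustes mean, with an optional trimming step. This is not the code of the Anthropometry R package. The function names (`preshape`, `align`, `procrustes_dist`, `procrustes_mean`, `lloyd_shapes`), the use of the partial Procrustes distance, the fixed number of averaging iterations, and the random initialization are all assumptions made for illustration. NumPy is assumed to be available.

```python
import numpy as np


def preshape(X):
    """Remove location and scale: center the landmarks, scale to unit norm."""
    Xc = X - X.mean(axis=0)
    return Xc / np.linalg.norm(Xc)


def align(X, Y):
    """Rotate preshape X onto preshape Y (proper rotation, no reflection)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))  # force det(R) = +1
    return X @ U @ D @ Vt


def procrustes_dist(X, Y):
    """Partial Procrustes distance between two preshapes (assumed choice)."""
    return np.linalg.norm(align(X, Y) - Y)


def procrustes_mean(shapes, n_iter=10):
    """Approximate Procrustes mean by repeated align-and-average steps."""
    mean = shapes[0]
    for _ in range(n_iter):
        aligned = np.array([align(S, mean) for S in shapes])
        mean = preshape(aligned.mean(axis=0))
    return mean


def lloyd_shapes(configs, k, alpha=0.0, n_iter=20, seed=0):
    """Lloyd-type k-means on shapes; alpha is the trimmed fraction (0 = none).

    configs: sequence of (landmarks x 3) arrays with matching landmarks.
    Returns (labels, centers); trimmed shapes get the label -1.
    """
    rng = np.random.default_rng(seed)
    shapes = np.array([preshape(np.asarray(X, dtype=float)) for X in configs])
    n = len(shapes)
    centers = shapes[rng.choice(n, size=k, replace=False)]
    n_trim = int(np.floor(alpha * n))
    prev = None
    for _ in range(n_iter):
        # Assignment step: nearest center in Procrustes distance.
        dists = np.array([[procrustes_dist(S, C) for C in centers]
                          for S in shapes])
        labels = dists.argmin(axis=1)
        if n_trim > 0:
            # Trimming: mark the n_trim shapes farthest from their center.
            labels[np.argsort(dists.min(axis=1))[-n_trim:]] = -1
        if prev is not None and np.array_equal(labels, prev):
            break
        prev = labels
        # Update step: Procrustes mean of each non-empty cluster.
        centers = np.array([
            procrustes_mean(shapes[labels == j]) if np.any(labels == j)
            else centers[j]
            for j in range(k)
        ])
    return labels, centers
```

A call such as `labels, centers = lloyd_shapes(configs, k=3, alpha=0.05)` would cluster a list of landmark configurations into three groups while setting aside the 5% of shapes farthest from any center. Because the initial centers are drawn at random, a practical run would normally repeat the procedure from several seeds and keep the best result.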


62H35 Image analysis in multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)


[1] Alemany S, González JC, Nácher B, Soriano C, Arnáiz C, Heras H (2010) Anthropometric survey of the Spanish female population aimed at the apparel industry. In: Proceedings of the 2010 Intl Conference on 3D Body Scanning Technologies, Lugano, Switzerland, pp 1-10
[2] Amaral, G.; Dore, L.; Lessa, R.; Stosic, B., k-means algorithm in statistical shape analysis, Commun Stat Simul Comput, 39, 1016-1026, (2010) · Zbl 1192.62160
[3] Anderberg M (1973) Cluster analysis for applications. Academic Press, New York · Zbl 0299.62029
[4] Best, D.; Fisher, N., Efficient simulation of the von Mises distribution, J R Stat Soc Ser C (Appl Stat), 28, 152-157, (1979) · Zbl 0435.62021
[5] Bhattacharya, R.; Patrangenaru, V., Nonparametric estimation of location and dispersion on Riemannian manifolds, J Stat Plann Inference, 108, 23-35, (2002) · Zbl 1031.62024
[6] Bhattacharya, R.; Patrangenaru, V., Large sample theory of intrinsic and extrinsic sample means on manifolds, Ann Stat, 31, 1-29, (2003) · Zbl 1020.62026
[7] Bock, HH; Brito, P. (ed.); Bertrand, P. (ed.); Cucumel, G. (ed.); Carvalho, F. (ed.), Clustering methods: a history of k-means algorithms, 161-172, (2007), Berlin Heidelberg · Zbl 1181.68229
[8] Bock, HH, Origins and extensions of the k-means algorithm in cluster analysis, Electron J Hist Prob Stat, 4, 1-18, (2008)
[9] Cai, X.; Li, Z.; Chang, CC; Dempsey, P., Analysis of alignment influence on 3-D anthropometric statistics, Tsinghua Sci Technol, 10, 623-626, (2005)
[10] Chernoff H (1970) Metric considerations in cluster analysis. In: Proc. 6th Berkeley Symposium on Mathematical Statistics and Probability. University of California Press, pp 621-629
[11] Chung, M.; Lin, H.; Wang, MJJ, The development of sizing systems for Taiwanese elementary- and high-school students, Int J Ind Ergon, 37, 707-716, (2007)
[12] Claude J (2008) Morphometrics with R. Use R! Springer, New York · Zbl 1166.62081
[13] Dryden IL, Mardia KV (1998) Statistical shape analysis. Wiley, Chichester · Zbl 0901.62072
[14] Dryden IL (2012) Shapes package. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org, contributed package
[15] European Committee for Standardization. European Standard EN 13402-2: Size system of clothing. Primary and secondary dimensions (2002)
[16] Fletcher, P.; Lu, C.; Pizer, S.; Joshi, S., Principal geodesic analysis for the study of nonlinear statistics of shape, IEEE Trans Med Imaging, 23, 995-1005, (2004)
[17] Fréchet, M., Les éléments aléatoires de nature quelconque dans un espace distancié, Ann Inst Henri Poincare Prob Stat, 10, 215-310, (1948) · Zbl 0035.20802
[18] García-Escudero, LA; Gordaliza, A., Robustness properties of k-means and trimmed k-means, J Am Stat Assoc, 94, 956-969, (1999) · Zbl 1072.62547
[19] Georgescu V (2009) Clustering of fuzzy shapes by integrating Procrustean metrics and full mean shape estimation into k-means algorithm. In: IFSA-EUSFLAT Conference (Lisbon, Portugal), pp 1679-1684
[20] Hand, DJ; Krzanowski, WJ, Optimising k-means clustering results with standard software packages, Comput Stat Data Anal, 49, 969-973, (2005) · Zbl 1429.62244
[21] Hartigan JA, Wong MA (1979) A K-means clustering algorithm. Appl Stat 28(1):100-108
[22] Hastie T, Tibshirani R, Friedman J (2008) The elements of statistical learning. Springer, New York
[23] Ibáñez, MV; Vinué, G.; Alemany, S.; Simó, A.; Epifanio, I.; Domingo, J.; Ayala, G., Apparel sizing using trimmed PAM and OWA operators, Expert Syst Appl, 39, 10512-10520, (2012)
[24] Jain, AK, Data clustering: 50 years beyond k-means, Pattern Recognit Lett, 31, 651-666, (2010)
[25] Kanungo, T.; Mount, DM; Netanyahu, NS; Piatko, C.; Silverman, R.; Wu, AY, An efficient k-means clustering algorithm: analysis and implementation, IEEE Trans Pattern Anal Mach Intell, 24, 881-892, (2002)
[26] Karcher, H., Riemannian center of mass and mollifier smoothing, Commun Pure Appl Math, 30, 509-541, (1977) · Zbl 0354.57005
[27] Kaufman L, Rousseeuw P (1990) Finding groups in data: an introduction to cluster analysis. Wiley, New York · Zbl 1345.62009
[28] Kendall, D., The diffusion of shape, Adv Appl Prob, 9, 428-430, (1977)
[29] Kendall DG, Barden D, Carne T, Le H (2009) Shape and shape theory. Wiley, Chichester · Zbl 0940.60006
[30] Kendall, WS, Probability, convexity, and harmonic maps with small image I: uniqueness and fine existence, Proc Lond Math Soc, 3, 371-406, (1990) · Zbl 0675.58042
[31] Kent, J.; Mardia, K., Consistency of Procrustes estimators, J R Stat Soc Ser B, 59, 281-290, (1997) · Zbl 0890.62041
[32] Kobayashi S, Nomizu K (1969) Foundations of differential geometry, vol 2. Wiley, Chichester · Zbl 0175.48504
[33] Lawing, A.; Polly, P., Geometric morphometrics: recent applications to the study of evolution and development, J Zool, 280, 1-7, (2010)
[34] Le, H., On the consistency of Procrustean mean shapes, Adv Appl Prob, 30, 53-63, (1998) · Zbl 0906.60007
[35] Lloyd SP (1957) Least squares quantization in PCM. Bell Telephone Labs memorandum, Murray Hill, NJ. Reprinted in: IEEE Trans Information Theory IT-28 (1982) 2:129-137
[36] MacQueen J (1967) Some methods for classification and analysis of multivariate observations. In: Proc 5th Berkeley Symp Math Statist Probab, 1965/66, vol 1. Univ of California Press, Berkeley, pp 281-297
[37] Nazeer KAA, Sebastian MP (2009) Improving the accuracy and efficiency of the k-means clustering algorithm. In: Proceedings of the World Congress on Engineering (London, UK), pp 1-5
[38] Ng R, Ashdown S, Chan A (2007) Intelligent size table generation. Sen’i Gakkaishi (J Soc Fiber Sci Technol Jpn) 63(11):384-387
[39] Pennec X (2006) Intrinsic statistics on Riemannian manifolds: basic tools for geometric measurements. J Math Imaging Vis 25(1):127-154
[40] Qiu W, Joe H (2013) ClusterGeneration: random cluster generation (with specified degree of separation). http://CRAN.R-project.org/package=clusterGeneration, R package version 1.3.1
[41] R Development Core Team (2014) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org, ISBN 3-900051-07-0
[42] Rohlf, JF, Shape statistics: Procrustes superimpositions and tangent spaces, J Classif, 16, 197-223, (1999) · Zbl 0954.62077
[43] S-plus original by Ulric Lund and R port by Claudio Agostinelli (2012) CircStats: Circular Statistics, from “Topics in circular Statistics” (2001). http://CRAN.R-project.org/package=CircStats, R package version 0.2-4
[44] Simmons K (2002) Body shape analysis using three-dimensional body scanning technology. PhD thesis, North Carolina State University
[45] Small C (1996) The statistical theory of shape. Springer, New York · Zbl 0859.62087
[46] Sokal R, Sneath PH (1963) Principles of numerical taxonomy. Freeman, San Francisco
[47] Steinhaus, H., Sur la division des corps matériels en parties, Bull Acad Pol Sci, IV, 801-804, (1956) · Zbl 0079.16403
[48] Steinley, D., K-means clustering: a half-century synthesis, Br J Math Stat Psychol, 59, 1-34, (2006)
[49] Stoyan LA, Stoyan H (1995) Fractals, random shapes and point fields. Wiley, Chichester
[50] Theodoridis S, Koutroumbas K (1999) Pattern recognition. Academic, New York
[51] Veitch D, Fitzgerald C et al (2013) Sizing up Australia—the next step. Safe Work Australia, Canberra
[52] Vinué G, Epifanio I, Simó A, Ibáñez MV, Domingo J, Ayala G (2014) Anthropometry: an R Package for analysis of anthropometric data. http://CRAN.R-project.org/package=Anthropometry, R package version 1.0
[53] Woods, R., Characterizing volume and surface deformations in an atlas framework: theory, applications, and implementation, NeuroImage, 18, 769-788, (2003)
[54] Zheng, R.; Yu, W.; Fan, J., Development of a new Chinese bra sizing system based on breast anthropometric measurements, Int J Ind Ergon, 37, 697-705, (2007)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.