Archetypal shapes based on landmarks and extension to handle missing data. (English) Zbl 1416.62326

Summary: Archetype analysis and archetypoid analysis are extended to shapes, with the aim of finding representative shapes. Archetypal shapes are pure (extreme) shapes. We focus on the case where the shape of an object is represented by a configuration matrix of landmarks. Because shape space is not a vector space, we work in the tangent space, i.e., the linearization of shape space about the mean shape. Each observation is then approximated by a convex combination of either archetypoids, which are actual observations, or archetypes, which are themselves convex combinations of observations in the data set. As in the usual multivariate case, these tools can contribute to the understanding of shapes, since they lie somewhere between clustering and matrix factorization methods. A new simplex visualization tool is also proposed to provide a picture of the archetypal analysis results. We further propose new algorithms for performing archetypal analysis with missing data, together with their extension to incomplete shapes. A well-known data set is used to illustrate the methodologies developed, and the proposed methodology is applied to an apparel design problem for children.
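
The two steps named in the summary, linearizing shapes in a tangent space and then writing an observation as a convex combination of representative shapes, can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes 2D landmarks stored as complex numbers, uses the first shape as the tangent pole in place of the Procrustes mean, and finds the weights by projected gradient descent, a technique swapped in here for simplicity. All function names are hypothetical.

```python
import math

def preshape(config):
    """Center a 2D configuration (complex landmarks) and scale it to unit size."""
    centroid = sum(config) / len(config)
    centered = [z - centroid for z in config]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]

def tangent_coords(config, pole):
    """Rotate a preshape onto the pole, then project onto the tangent space at the pole."""
    z = preshape(config)
    s = sum(m.conjugate() * zj for m, zj in zip(pole, z))
    rot = s.conjugate() / abs(s)  # optimal rotation; assumes s != 0
    aligned = [rot * zj for zj in z]
    # After alignment the component along the pole is |s| (real); remove it.
    return [a - abs(s) * m for a, m in zip(aligned, pole)]

def as_real_vector(tangent):
    """Flatten complex tangent coordinates into a real vector."""
    return [part for v in tangent for part in (v.real, v.imag)]

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def convex_weights(x, archetypes, iterations=5000):
    """Simplex weights minimizing ||x - sum_k w_k * archetypes[k]||^2 (projected gradient)."""
    k = len(archetypes)
    # Step 1 / ||A||_F^2 is a safe bound on 1 / L; assumes not all archetypes are zero.
    step = 1.0 / sum(a * a for arch in archetypes for a in arch)
    w = [1.0 / k] * k
    for _ in range(iterations):
        residual = [sum(w[j] * archetypes[j][d] for j in range(k)) - x[d]
                    for d in range(len(x))]
        grad = [sum(r * a for r, a in zip(residual, archetypes[j])) for j in range(k)]
        w = project_to_simplex([wj - step * g for wj, g in zip(w, grad)])
    return w

# Toy data (made up for illustration): three quadrilaterals with four landmarks each.
shapes = [
    [0 + 0j, 1 + 0j, 1 + 1j, 0 + 1j],      # square
    [0 + 0j, 2 + 0j, 2 + 1j, 0 + 1j],      # wide rectangle
    [0 + 0j, 1 + 0j, 1.5 + 1j, 0.5 + 1j],  # parallelogram
]
pole = preshape(shapes[0])  # assumption: stand-in for the Procrustes mean shape
archetypes = [as_real_vector(tangent_coords(s, pole)) for s in shapes]

# Build a target as a known convex combination in the tangent space, then recover it.
true_weights = [0.2, 0.5, 0.3]
target = [sum(w * a[d] for w, a in zip(true_weights, archetypes))
          for d in range(len(archetypes[0]))]
weights = convex_weights(target, archetypes)
print([round(w, 4) for w in weights])
```

In this sketch the three toy shapes play the role of archetypoids, since they are actual observations. Archetypes would instead be convex combinations of the tangent-space observations, which requires an outer optimization over a second set of simplex weights that is omitted here.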


62H25 Factor analysis and principal components; correspondence analysis
62H11 Directional data; spatial statistics
62H30 Classification and discrimination; cluster analysis (statistical aspects)

