Principal component analysis for compositional data vectors. (English) Zbl 1329.65036

Summary: Since Aitchison's founding work, compositional data analysis has attracted growing attention. Principal component analysis (PCA), a powerful technique for exploratory analysis, has been extended to compositional data. Most existing work, however, applies PCA with the parts of a composition as variables; this paper instead develops PCA for compositional data vectors. Using the algebraic operators of the simplex, the PCA procedure is derived and reduced to the computation of inner products. Properties of the resulting principal components are also investigated. Two real-data examples illustrate the merits of the proposed PCA for compositional data vectors.
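
For orientation, the simplex operators and inner product mentioned in the summary can be sketched as below. This is a minimal illustration of the standard Aitchison geometry (closure, perturbation, powering, and the inner product computed through the centred log-ratio transform), not the paper's own algorithm. The function names and the sample compositions are assumptions introduced here.

```python
import math

def closure(x):
    """Rescale strictly positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure([a ** alpha for a in x])

def clr(x):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_inner(x, y):
    """Aitchison inner product as the Euclidean dot product of clr vectors."""
    return sum(a * b for a, b in zip(clr(x), clr(y)))

# Illustrative three-part compositions (made-up values).
x = closure([0.2, 0.3, 0.5])
y = closure([0.6, 0.3, 0.1])
z = closure([0.1, 0.1, 0.8])

# The inner product is linear with respect to perturbation and powering.
lhs = aitchison_inner(perturb(x, y), z)
rhs = aitchison_inner(x, z) + aitchison_inner(y, z)
scaled = aitchison_inner(power(x, 2.0), z)
```

Because the clr transform maps perturbation to ordinary addition and powering to ordinary scaling, `lhs` and `rhs` agree up to rounding, and `scaled` equals twice `aitchison_inner(x, z)`. This is what allows a PCA on the simplex to be expressed through inner products.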

MSC:

62-08 Computational methods for problems pertaining to statistics
62H25 Factor analysis and principal components; correspondence analysis

Software:

fda (R)
