Correlation between compositional parts based on symmetric balances. (English) Zbl 1369.86020

Summary: Correlation coefficients are most popular in statistical practice for measuring pairwise variable associations. Compositional data, carrying only relative information, require a different treatment in correlation analysis. For identifying the association between two compositional parts in terms of their dominance with respect to the other parts in the composition, symmetric balances are constructed, which capture all relative information in the form of aggregated logratios of both compositional parts of interest. The resulting coordinates have the form of logratios of individual parts to a (weighted) “average representative” of the other parts, and thus, they clearly indicate how the respective parts dominate in the composition on average. The balances form orthonormal coordinates, and thus, the standard correlation measures relying on the Euclidean geometry can be used to measure the association. Simulation studies provide deeper insight into the proposed approach, and allow for comparisons with alternative measures. An application from geochemistry (Kola moss) indicates that correlations based on symmetric balances serve as a sensitive tool to reveal underlying geochemical processes.
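The construction described in the summary can be illustrated with a small sketch. It is not taken from the paper: it assumes that a pair of orthonormal coordinates treating two parts symmetrically can be obtained by symmetric (Löwdin) orthogonalization of the two unit centred-logratio directions of those parts. The function names `symmetric_contrasts`, `coordinate`, and `pearson`, and the toy data, are hypothetical and serve only as illustration.

```python
import math


def symmetric_contrasts(D, i=0, j=1):
    """Two orthonormal log-contrast vectors, symmetric in parts i and j.

    Assumption (not from the source): symmetric orthogonalization of the
    unit clr directions of parts i and j in a D-part composition.
    """
    if D < 3 or i == j:
        raise ValueError("need at least three parts and two distinct indices")
    scale = math.sqrt((D - 1) / D)  # norm of (e_i - 1/D)
    b_i = [((1.0 if k == i else 0.0) - 1.0 / D) / scale for k in range(D)]
    b_j = [((1.0 if k == j else 0.0) - 1.0 / D) / scale for k in range(D)]
    r = -1.0 / (D - 1)  # inner product of the two unit directions
    # Entries of the inverse square root of the Gram matrix [[1, r], [r, 1]].
    a = 0.5 * (1 / math.sqrt(1 + r) + 1 / math.sqrt(1 - r))
    b = 0.5 * (1 / math.sqrt(1 + r) - 1 / math.sqrt(1 - r))
    v_i = [a * p + b * q for p, q in zip(b_i, b_j)]
    v_j = [b * p + a * q for p, q in zip(b_i, b_j)]
    return v_i, v_j


def coordinate(x, v):
    """Log-contrast of composition x with zero-sum coefficient vector v."""
    return sum(w * math.log(p) for w, p in zip(v, x))


def pearson(u, w):
    """Plain Pearson correlation of two equally long sequences."""
    mu, mw = sum(u) / len(u), sum(w) / len(w)
    cov = sum((p - mu) * (q - mw) for p, q in zip(u, w))
    su = sum((p - mu) ** 2 for p in u)
    sw = sum((q - mw) ** 2 for q in w)
    return cov / math.sqrt(su * sw)


# Made-up toy compositions with four parts (illustration only, not real data).
toy = [
    [10, 20, 30, 40],
    [12, 18, 35, 35],
    [8, 25, 27, 40],
    [15, 15, 30, 40],
    [11, 22, 33, 34],
    [9, 19, 36, 36],
]

v1, v2 = symmetric_contrasts(4, 0, 1)
z1 = [coordinate(x, v1) for x in toy]
z2 = [coordinate(x, v2) for x in toy]
r_sym = pearson(z1, z2)  # association between parts 1 and 2
print(r_sym)
```

Because each coefficient vector sums to zero, the coordinates do not change when a composition is rescaled (for example from proportions to percentages), so the resulting correlation depends only on relative information.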

MSC:

86A32 Geostatistics

Software:

R
