zbMATH — the first resource for mathematics

Linear regression with compositional explanatory variables. (English) Zbl 07263531
Summary: Compositional explanatory variables should not be used directly in a linear regression model, because any inference statistic can become misleading. While various approaches to this problem have been proposed, here an approach based on the isometric logratio (ilr) transformation is used. It turns out that the resulting model is easy to handle, and that parameter estimation can be done as in ordinary linear regression. Moreover, it is possible to use the ilr variables for inference statistics in order to obtain an appropriate interpretation of the model.
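
As a rough illustration of the idea in the summary, the sketch below maps each composition to ilr coordinates and then fits an ordinary least squares model on those coordinates. It is not the authors' code. It assumes one common choice of orthonormal basis (often called pivot coordinates), z_i = sqrt((D-i)/(D-i+1)) * ln(x_i / geometric mean of x_{i+1}, ..., x_D); the function names `ilr_pivot` and `ols`, the example compositions, and the response values are all hypothetical and made up for the example.

```python
import math

def ilr_pivot(x):
    """Map a composition x (all parts > 0) to D-1 ilr coordinates.

    Assumed basis: z_i = sqrt(k/(k+1)) * (ln x_i - mean of ln x_j, j > i),
    where k is the number of parts after part i.
    """
    D = len(x)
    if D < 2 or any(p <= 0 for p in x):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(p) for p in x]
    z = []
    for i in range(D - 1):
        rest = logs[i + 1:]            # logs of the parts after part i
        k = len(rest)
        mean_rest = sum(rest) / k      # log of their geometric mean
        z.append(math.sqrt(k / (k + 1)) * (logs[i] - mean_rest))
    return z

def ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y.

    An intercept column is added; the intercept is returned first.
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in X) for b in range(p)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(p)]
    for c in range(p):                 # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Made-up 3-part compositions (illustration only, not data from the paper)
comps = [
    [0.20, 0.30, 0.50],
    [0.60, 0.30, 0.10],
    [0.10, 0.70, 0.20],
    [0.40, 0.40, 0.20],
    [0.25, 0.15, 0.60],
]
Z = [ilr_pivot(c) for c in comps]

# Synthetic, noise-free response that is linear in the ilr coordinates
y = [2.0 + 1.5 * z1 - 0.5 * z2 for z1, z2 in Z]

beta = ols(Z, y)   # [intercept, coefficient of z1, coefficient of z2]
print(beta)
```

Because the ilr coordinates are ordinary real vectors, any standard regression routine could replace the small `ols` helper here. Under this assumed basis only the first coordinate carries all the relative information about the first part, so a different ordering of the parts gives different coordinates and coefficients for the same fitted model.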

MSC:
62-XX Statistics
References:
[1] Aitchison, J. 1986. The Statistical Analysis of Compositional Data, London: Chapman & Hall. · Zbl 0688.62004
[2] Aitchison, J. and Bacon-Shone, J. 1984. Log contrast models for experiments with mixtures. Biometrika, 71(2): 323-330.
[3] Aitchison, J., Barceló-Vidal, C., Martín-Fernández, J. A. and Pawlowsky-Glahn, V. 2000. Logratio analysis and compositional distance. Math. Geol., 32: 271-275. · Zbl 1101.86309
[4] Billheimer, D., Guttorp, P. and Fagan, W. 2001. Statistical interpretation of species composition. J. Am. Stat. Assoc., 96: 1205-1214. · Zbl 1073.62573
[5] Buccianti, A., Egozcue, J. J. and Pawlowsky-Glahn, V. 2008. Another look at the chemical relationships in the dissolved phase of complex river systems. Math. Geosci., 40: 475-488. · Zbl 1153.86337
[6] Buccianti, A., Mateu-Figueras, G. and Pawlowsky-Glahn, V. 2006. “Frequency distributions and natural laws in geochemistry”. In Compositional Data Analysis in the Geosciences: From Theory to Practice, Edited by: Buccianti, A., Mateu-Figueras, G. and Pawlowsky-Glahn, V. 175-189. London: Geological Society, Special Publications 264.
[7] Egozcue, J. J. 2009. Reply to “On the Harker variation diagrams” by J.A. Cortés. Math. Geosci., 41: 829-834. · Zbl 1178.86018
[8] Egozcue, J. J. and Pawlowsky-Glahn, V. 2005. Groups of parts and their balances in compositional data analysis. Math. Geol., 37: 795-828. · Zbl 1177.86018
[9] Egozcue, J. J. and Pawlowsky-Glahn, V. 2006. “Simplicial geometry for compositional data”. In Compositional Data Analysis in the Geosciences: From Theory to Practice, Edited by: Buccianti, A., Mateu-Figueras, G. and Pawlowsky-Glahn, V. 145-160. London: Geological Society, Special Publications 264. · Zbl 1156.86307
[10] Egozcue, J. J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C. 2003. Isometric logratio transformations for compositional data analysis. Math. Geol., 35: 279-300. · Zbl 1302.86024
[11] Filzmoser, P., Hron, K. and Reimann, C. 2009. Univariate analysis of environmental (compositional) data: Problems and possibilities. Sci. Total Environ., 407: 6100-6108.
[12] Fišerová, E. and Hron, K. 2010. On interpretation of orthonormal coordinates for compositional data. Math. Geosci., 43: 455-468.
[13] Fišerová, E., Kubáček, K. and Kunderová, P. 2007. Linear Statistical Models - Regularity and Singularities, Praha: Academia.
[14] Hoerl, A. E. and Kennard, R. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12: 55-67. · Zbl 0202.17205
[15] Hron, K., Templ, M. and Filzmoser, P. 2010. Imputation of missing values for compositional data using classical and robust methods. Comput. Stat. Data Anal., 54: 3095-3107. · Zbl 1284.62049
[16] Kubáček, L., Kubáčková, L. and Volaufová, J. 1995. Statistical Models with Linear Structures, Bratislava: Veda.
[17] Maronna, R., Martin, R. D. and Yohai, V. J. 2006. Robust Statistics: Theory and Methods, New York: John Wiley. · Zbl 1094.62040
[18] Pawlowsky-Glahn, V. and Egozcue, J. J. 2001. Geometric approach to statistical analysis on the simplex. Stoch. Environ. Res. Risk Assessment, 15: 384-398. · Zbl 0987.62001
[19] R Development Core Team. 2010. R: A Language and Environment for Statistical Computing, Vienna, Austria: R Foundation for Statistical Computing.
[20] Scheffé, H. 1958. Experiments with mixtures. J. Roy. Stat. Soc. - B, 20: 344-360. · Zbl 0088.12401
[21] Scheffé, H. 1963. The simplex-centroid design for experiments with mixtures. J. Roy. Stat. Soc. - B, 25: 235-263. · Zbl 0133.12301
[22] Varmuza, K. and Filzmoser, P. 2009. Introduction to Multivariate Statistical Analysis in Chemometrics, Boca Raton, FL: CRC Press.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.