Adaptation of interval PCA to symbolic histogram variables. (English) Zbl 1255.62173

Summary: This paper adapts symbolic interval principal component analysis (PCA) to histogram data. We propose two methodologies. The first involves three steps: coding the bins of the histograms, performing an ordinary PCA on the means of the variables, and representing the dispersion of the symbolic observations, which we call concepts. To represent the dispersion of these concepts, we propose transforming the histograms into intervals. We then suggest projecting the hypercubes, or the interval lengths, associated with each concept onto the principal axes of the ordinary PCA of the means. The second methodology applies the same three steps in combination with angular transformations.
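The three steps of the first methodology can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: bins are coded by the numeric values in `codes`, each histogram is turned into the interval mean ± k standard deviations (the summary does not say which transformation the paper uses), and the function names are hypothetical. The hypercube projection uses the standard fact that a box projects onto a linear axis as an interval whose endpoints can be computed variable by variable. The angular transformation of the second methodology is not modelled.

```python
import numpy as np


def histogram_means(freqs, codes):
    """Mean of each histogram.

    freqs: array (n_concepts, n_variables, n_bins) of relative frequencies.
    codes: array (n_bins,) of numeric bin codes (an assumed coding).
    """
    return freqs @ codes  # shape (n_concepts, n_variables)


def histogram_intervals(freqs, codes, k=1.0):
    """Turn each histogram into the interval [mean - k*sd, mean + k*sd].

    This transformation is an assumption for illustration only.
    """
    mean = freqs @ codes
    var = freqs @ codes**2 - mean**2
    sd = np.sqrt(np.clip(var, 0.0, None))  # guard against tiny negatives
    return mean - k * sd, mean + k * sd


def pca_of_means(means, n_axes=2):
    """Ordinary (covariance) PCA of the matrix of means via SVD."""
    center = means.mean(axis=0)
    _, _, vt = np.linalg.svd(means - center, full_matrices=False)
    axes = vt[:n_axes].T  # shape (n_variables, n_axes), orthonormal columns
    return center, axes


def project_hypercubes(lo, hi, center, axes):
    """Project each concept's hypercube onto the principal axes.

    For every axis, each variable contributes either its lower or its upper
    bound, whichever gives the smaller (resp. larger) coordinate.
    """
    a = (lo - center)[:, :, None] * axes[None]
    b = (hi - center)[:, :, None] * axes[None]
    return np.minimum(a, b).sum(axis=1), np.maximum(a, b).sum(axis=1)


# Toy data: 5 concepts, 3 histogram variables, 4 bins coded 1..4.
rng = np.random.default_rng(0)
freqs = rng.dirichlet(np.ones(4), size=(5, 3))
codes = np.arange(1, 5, dtype=float)

means = histogram_means(freqs, codes)
lo, hi = histogram_intervals(freqs, codes)
center, axes = pca_of_means(means)
proj_min, proj_max = project_hypercubes(lo, hi, center, axes)
print(proj_max - proj_min)  # interval lengths of each concept on each axis
```

In this sketch the score of each concept's mean always lies inside its projected interval, so the interval length on an axis can be read as the dispersion of that concept along the axis.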


62H25 Factor analysis and principal components; correspondence analysis


ade4; SODAS; CRAN; FactoMineR

