Clustering the rows and columns of a contingency table. (English) Zbl 0652.62053

Summary: A number of ways of investigating heterogeneity in a two-way contingency table are reviewed. In particular, we consider chi-square decompositions of the Pearson chi-square statistic with respect to the nodes of a hierarchical clustering of the rows and/or the columns of the table. A cut-off point which indicates “significant clustering” may be defined on the binary trees associated with the respective row and column cluster analyses. This approach provides a simple graphical procedure which is useful in interpreting a significant chi-square statistic of a contingency table.
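
As a rough illustration of the decomposition described in the summary, the following Python sketch clusters the rows of a small table by repeatedly pooling the pair of rows whose merger loses the least Pearson chi-square, and records that loss at each node of the resulting binary tree. It is not the authors' procedure: the greedy merging rule, the function names pearson_chi2 and cluster_rows, and the counts in the example table are all assumptions made for illustration, and the cut-off for "significant clustering" is not reproduced. The sketch only shows that the losses over the nodes add up to the chi-square statistic of the full table.

```python
from itertools import combinations

import numpy as np


def pearson_chi2(table):
    """Pearson chi-square statistic of a two-way table of counts."""
    t = np.asarray(table, dtype=float)
    expected = np.outer(t.sum(axis=1), t.sum(axis=0)) / t.sum()
    return float(((t - expected) ** 2 / expected).sum())


def cluster_rows(table):
    """Greedy row agglomeration (hypothetical helper, assumes positive margins).

    At each step the two rows whose pooling reduces chi-square the least are
    merged. Returns one (labels_a, labels_b, chi2_loss) tuple per tree node.
    """
    t = np.asarray(table, dtype=float)
    rows = [t[i] for i in range(t.shape[0])]
    labels = [(i,) for i in range(t.shape[0])]
    nodes = []
    while len(rows) > 1:
        current = pearson_chi2(np.vstack(rows))
        best = None
        for a, b in combinations(range(len(rows)), 2):
            # Table obtained by pooling rows a and b, all other rows unchanged.
            rest = [r for k, r in enumerate(rows) if k not in (a, b)]
            loss = current - pearson_chi2(np.vstack(rest + [rows[a] + rows[b]]))
            if best is None or loss < best[0]:
                best = (loss, a, b)
        loss, a, b = best
        nodes.append((labels[a], labels[b], loss))
        pooled_row, pooled_label = rows[a] + rows[b], labels[a] + labels[b]
        rows = [r for k, r in enumerate(rows) if k not in (a, b)] + [pooled_row]
        labels = [l for k, l in enumerate(labels) if k not in (a, b)] + [pooled_label]
    return nodes


# Invented counts, for illustration only.
table = [[20, 10, 5],
         [18, 12, 6],
         [5, 10, 25],
         [4, 9, 30]]

nodes = cluster_rows(table)
for left, right, loss in nodes:
    print(f"merge {left} + {right}: chi-square lost = {loss:.3f}")
print(f"sum over nodes = {sum(n[2] for n in nodes):.3f}")
print(f"chi-square of full table = {pearson_chi2(table):.3f}")
```

Because a table collapsed to a single row has a chi-square of zero, the node losses necessarily sum to the statistic of the original table. The same function applied to the transposed table gives the corresponding tree for the columns.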


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H17 Contingency tables


[1] BENZECRI, J.-P. (1973), L’Analyse des Données, Tome (Vol.) 1 – La Taxinomie, Tome 2 – L’Analyse des Correspondances, Paris: Dunod.
[2] BENZECRI, J.-P., and CAZES, P. (1978), “Probleme sur la classification,” Cahiers de L’Analyse des Données, 3, 95–101.
[3] CLEVELAND, W.S., and RELLES, D.A. (1975), “Clustering by Identification with Special Application to Two-way Tables of Counts,” Journal of the American Statistical Association, 70, 626–630.
[4] EVERITT, B., Cluster Analysis, London: Heinemann. · Zbl 0507.62060
[5] GABRIEL, K.R. (1966), “Simultaneous Test Procedures for Multiple Comparisons on Categorical Data,” Journal of the American Statistical Association, 61, 1081–1096.
[6] GILULA, Z. (1986), “Grouping and Association in Contingency Tables: An Exploratory Canonical Correlation Approach,” Journal of the American Statistical Association, 81, 773–779. · Zbl 0648.62061
[7] GILULA, Z., and HABERMAN, S.J. (1986), “Canonical Analysis of Contingency Tables by Maximum Likelihood,” Journal of the American Statistical Association, 81, 780–788. · Zbl 0623.62047
[8] GILULA, Z., and KRIEGER, A.M. (1983), “The Decomposability and Monotonicity of Pearson’s Chi-Square for Collapsed Contingency Tables with Applications,” Journal of the American Statistical Association, 78, 176–180. · Zbl 0546.62030
[9] GOLD, R.Z. (1963), “Tests Auxiliary to χ² Tests in a Markov Chain,” Annals of Mathematical Statistics, 34, 56–74. · Zbl 0114.09102
[10] GOODMAN, L.A. (1964), “Simultaneous Confidence Intervals for Contrasts Among Multinomial Populations,” Annals of Mathematical Statistics, 35, 716–725. · Zbl 0227.62025
[11] GOODMAN, L.A. (1965), “On Simultaneous Confidence Intervals for Multinomial Proportions,” Technometrics, 7, 247–254. · Zbl 0131.17701
[12] GOODMAN, L.A. (1985), “The Analysis of Cross-Classified Data Having Ordered and/or Unordered Categories: Association Models, Correlation Models, and Asymmetry Models for Contingency Tables with or without Missing Entries,” Annals of Statistics, 13, 10–69. · Zbl 0613.62070
[13] GOVAERT, G. (1984), “Classification Simultanée de Tableaux Binaires,” in Data Analysis and Informatics 3, eds. E. Diday, M. Jambu, L. Lebart, J. Pages, and R. Tomassone, Amsterdam: North Holland, 223–236.
[14] GREENACRE, M.J. (1984), Theory and Applications of Correspondence Analysis, London: Academic Press. · Zbl 0555.62005
[15] GUTTMAN, L. (1971), “Measurement as Structural Theory,” Psychometrika, 36, 329–347.
[16] HIROTSU, C. (1983), “Defining the Pattern of Association in Two-way Contingency Tables,” Biometrika, 70, 579–589. · Zbl 0534.62036
[17] JAMBU, M. (1978), Classification Automatique pour L’Analyse des Données, 1 – Méthodes et Algorithmes, Paris: Dunod. · Zbl 0419.62057
[18] JAMBU, M., and LEBEAUX, M.O. (1983), Cluster Analysis and Data Analysis, Amsterdam: North Holland. · Zbl 0521.62054
[19] LANCE, G.N., and WILLIAMS, W.T. (1967), “A General Theory of Classificatory Sorting Strategies. 1. Hierarchical Systems,” Computer Journal, 9, 373–380.
[20] LEBART, L. (1975), Validité des Résultats en Analyse des Données, Paris: CREDOC-DGRST.
[21] LEBART, L., MORINEAU, A., and WARWICK, K. (1984), Multivariate Descriptive Statistical Analysis, New York: Wiley. · Zbl 0658.62069
[22] O’NEILL, M.E. (1981), “A Note on the Canonical Correlations from Contingency Tables,” Australian Journal of Statistics, 23, 58–66. · Zbl 0492.62045
[23] PEARSON, E.S., and HARTLEY, H.O. (1972), Biometrika Tables for Statisticians, Volume 2, Cambridge, England: Cambridge University Press. · Zbl 0255.62003
[24] QUESENBERRY, C.P., and HURST, D.C. (1964), “Large Sample Simultaneous Confidence Intervals for Multinomial Proportions,” Technometrics, 6, 191–195. · Zbl 0129.32605
[25] SNEE, R.D. (1974), “Graphical Display of Two-way Contingency Tables,” American Statistician, 28, 9–12. · Zbl 0361.62037
[26] THARU, J., and WILLIAMS, W.T. (1966), “Concentration of Entries in Binary Arrays,” Nature, 210, 549.
[27] WARD, J.H. (1963), “Hierarchical Grouping to Optimize an Objective Function,” Journal of the American Statistical Association, 58, 236–244.
[28] WISHART, D. (1969), “An Algorithm for Hierarchical Classifications,” Biometrics, 25, 165–170.