×

Hierarchical clustering of asymmetric networks. (English) Zbl 1414.91298

Summary: This paper considers networks where relationships between nodes are represented by directed dissimilarities. The goal is to study methods that, based on the dissimilarity structure, output hierarchical clusters, i.e., a family of nested partitions indexed by a connectivity parameter. Our construction of hierarchical clustering methods is built around the concept of admissible methods, which are those that abide by the axioms of value – nodes in a network with two nodes are clustered together at the maximum of the two dissimilarities between them – and transformation – when dissimilarities are reduced, the network may become more clustered but not less. Two particular methods, termed reciprocal and nonreciprocal clustering, are shown to provide upper and lower bounds in the space of admissible methods. Furthermore, alternative clustering methodologies and axioms are considered. In particular, modifying the axiom of value such that clustering in two-node networks occurs at the minimum of the two dissimilarities entails the existence of a unique admissible clustering method. Finally, the developed clustering methods are implemented to analyze the internal migration in the United States.

MSC:

91C20 Clustering in the social and behavioral sciences
62H30 Classification and discrimination; cluster analysis (statistical aspects)
PDF BibTeX XML Cite
Full Text: DOI arXiv

References:

[1] Ackerman M, Ben-David S (2008) Measures of clustering quality: a working set of axioms for clustering. In: Neural Info. Process. Syst. (NIPS), pp 121-128
[2] Bach F, Jordan M (2004) Learning spectral clustering. In: Neural Info. Process. Syst. (NIPS), pp 305-312
[3] Ben-David S, Von Luxburg U, Pál D (2006) A sober look at clustering stability. In: Conf. Learning Theory (COLT), pp 5-19 · Zbl 1143.68520
[4] Boyd, JP, Asymmetric clusters of internal migration regions of France, IEEE Trans Syst Man Cybern, 2, 101-104, (1980)
[5] Burago D, Burago Y, Ivanov S (2001) A course in metric geometry, AMS Graduate Studies in Math., vol 33. American Mathematical Society, Providence · Zbl 0981.51016
[6] Carlsson, G.; Mémoli, F., Characterization, stability and convergence of hierarchical clustering methods, J Mach Learn Res, 11, 1425-1470, (2010) · Zbl 1242.62050
[7] Carlsson G, Mémoli F (2010b) Multiparameter hierarchical clustering methods. In: Conf. Intl. Fed. Classif. Soc. (IFCS). Springer, Berlin, pp 63-70
[8] Carlsson, G.; Mémoli, F., Classifying clustering schemes, Found Comput Math, 13, 221-252, (2013) · Zbl 1358.62057
[9] Carlsson G, Mémoli F, Ribeiro A, Segarra S (2013a) Alternative axiomatic constructions for hierarchical clustering of asymmetric networks. In: Global Conf. on Signal and Info. Process. (GlobalSIP), pp 791-794
[10] Carlsson G, Memoli F, Ribeiro A, Segarra S (2013b) Axiomatic construction of hierarchical clustering in asymmetric networks. In: Intl. Conf. on Acoustics, Speech and Signal Process. (ICASSP), pp 5219-5223
[11] Carlsson, G.; Mémoli, F.; Ribeiro, A.; Segarra, S., Hierarchical quasi-clustering methods for asymmetric networks, JMLR W&CP: Int Conf Mach Learn, 32, 352-360, (2014)
[12] Chino, N., A brief survey of asymmetric MDS and some open problems, Behaviormetrika, 39, 127-165, (2012)
[13] Chino, N.; Shiraiwa, K., Geometrical structures of some non-distance models for asymmetric MDS, Behaviormetrika, 20, 35-47, (1993)
[14] Choi JI, Jain M, Srinivasan K, Levis P, Katti S (2010) Achieving single channel, full duplex wireless communication. In: Proc. Intl. Conf. Mobile Comp. and Netw. ACM, pp 1-12
[15] Chung FR (1997) Spectral graph theory, vol 92. American Mathematical Soc, Providence · Zbl 0867.05046
[16] Guyon I, Von Luxburg U, Williamson RC (2009) Clustering: science or art. In: NIPS 2009 wksp. on clustering theory
[17] Hu, TC, The maximum capacity route problem, Oper Res, 9, 898-900, (1961)
[18] Hubert, L., Min and max hierarchical clustering using asymmetric similarity measures, Psychometrika, 38, 63-72, (1973) · Zbl 0251.92014
[19] Jain A, Dubes RC (1988) Algorithms for clustering data. Prentice Hall Advanced Reference Series, Prentice Hall Inc
[20] Kleinberg JM (2002) An impossibility theorem for clustering. In: Neural Info. Process. Syst. (NIPS), pp 446-453
[21] Lance, GN; Williams, WT, A general theory of classificatory sorting strategies 1: hierarchical systems, Comput J, 9, 373-380, (1967)
[22] Meila M, Pentney W (2007) Clustering by weighted cuts in directed graphs. SIAM Intl Conf Data Mining, pp 135-144
[23] Müllner D (2011) Modern hierarchical, agglomerative clustering algorithms. ArXiv e-prints arXiv:1109.2378
[24] Murtagh F (1985) Multidimensional clustering algorithms. Compstat Lectures. Physica, Vienna · Zbl 0601.62085
[25] Newman, M.; Girvan, M., Community structure in social and biological networks, Proc Natl Acad Sci, 99, 7821-7826, (2002) · Zbl 1032.91716
[26] Newman, M.; Girvan, M., Finding and evaluating community structure in networks, Phys Rev E, 69, 026113, (2004)
[27] Ng A, Jordan M, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Neural Info. Process. Syst. (NIPS), pp 849-856
[28] Okada, A.; Iwamoto, T., University enrollment flow among the Japanese prefectures: a comparison before and after the joint first stage achievement test by asymmetric cluster analysis, Behaviormetrika, 23, 169-185, (1996)
[29] Pentney W, Meila M (2005) Spectral clustering of biological sequence data. In: Ntnl. Conf. Artificial Intel., pp 845-850
[30] Saito T, Yadohisa H (2004) Data analysis of asymmetric structures: advanced approaches in computational statistics. CRC Press, Boca Raton · Zbl 1067.62006
[31] Sato, Y., An analysis of sociometric data by MDS in Minkowski space, Stat Theory Data Anal, II, 385-396, (1988) · Zbl 0738.92024
[32] Shi, J.; Malik, J., Normalized cuts and image segmentation, IEEE Trans Pattern Anal Mach Intell, 22, 888-905, (2000)
[33] Slater, P., Hierarchical internal migration regions of France, IEEE Trans Syst Man Cybern, 4, 321-324, (1976)
[34] Slater, P., A partial hierarchical regionalization of 3140 US counties on the basis of 1965-1970 intercounty migration, Environ Plan A, 16, 545-550, (1984)
[35] Smith Z, Chowdhury S, Mémoli, (2016) Hierarchical representations of network data with optimal distortion bounds. In: Asilomar Conf. Signals, Systems and Computers, pp 1773-1777
[36] Tarjan, RE, An improved algorithm for hierarchical clustering using strong components, Inf Process Lett, 17, 37-41, (1983) · Zbl 0509.68059
[37] Vicari, D., Classification of asymmetric proximity data, J Classif, 31, 386-420, (2014) · Zbl 1360.62359
[38] Vicari D (2015) CLUSKEXT: clustering model for skew-symmetric data including external information. Adv Data Anal Classif 1-22
[39] Luxburg, U., A tutorial on spectral clustering, Stat Comput, 17, 395-416, (2007)
[40] Von Luxburg U, Ben-David S (2005) Towards a statistical theory of clustering. In: PASCAL wksp. on statistics and optimization of clustering
[41] Xu, R.; Wunsch, D., Survey of clustering algorithms, IEEE Trans Neural Netw, 16, 645-678, (2005)
[42] Zadeh RB, Ben-David S (2009) A uniqueness theorem for clustering. In: Conf. Uncert. Artif. Intell. (UAI), pp 639-646
[43] Zhao, Y.; Karypis, G., Hierarchical clustering algorithms for document datasets, Data Min Knowl Discov, 10, 141-168, (2005)
[44] Zhou D, Schölkopf B, Hofmann T (2005) Semi-supervised learning on directed graphs. In: Neural Info. Process. Syst. (NIPS), pp 1633-1640
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.