×

Discovering patterns in time-varying graphs: a triclustering approach. (English) Zbl 1416.62375

Summary: This paper introduces a novel technique to track structures in time varying graphs. The method uses a maximum a posteriori approach for adjusting a three-dimensional co-clustering of the source vertices, the destination vertices and the time, to the data under study, in a way that does not require any hyper-parameter tuning. The three dimensions are simultaneously segmented in order to build clusters of source vertices, destination vertices and time segments where the edge distributions across clusters of vertices follow the same evolution over the time segments. The main novelty of this approach lies in that the time segments are directly inferred from the evolution of the edge distribution between the vertices, thus not requiring the user to make any a priori quantization. Experiments conducted on artificial data illustrate the good behavior of the technique, and a study of a real-life data set shows the potential of the proposed approach for exploratory data analysis.

MSC:

62H99 Multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence

Software:

PMTK; GraphScope
PDF BibTeX XML Cite
Full Text: DOI arXiv

References:

[1] Bekkerman R, El-Yaniv R, McCallum A (2005) Multi-way distributional clustering via pairwise interractions. In: ICML, pp 41-48
[2] Borgatti, SP, A comment on doreian’s regular equivalence in symmetric structures, Soc Netw, 10, 265-271, (1988)
[3] Boullé M (2011) Data grid models for preparation and modeling in supervised learning. In: Guyon I, Cawley G, Dror G, Saffari A (eds) Hands-on pattern recognition: challenges in machine learning, vol 1. Microtome Publishing, pp 99-130
[4] Casteigts, A.; Flocchini, P.; Quattrociocchi, W.; Santoro, N., Time-varying graphs and dynamic networks, Int J Parallel Emerg Distrib Syst, 27, 387-408, (2012)
[5] Cover TM, Thomas JA (2006) Elements of information theory, 2nd edn. Wiley, New York · Zbl 1140.94001
[6] Dhillon IS, Mallela S, Modha D (2003) Information-theoretic co-clustering. In: KDD ’03, pp 89-98
[7] Erdős, P.; Rényi, A., On random graphs. I, Publ Math, 6, 290-297, (1959) · Zbl 0092.15705
[8] Fortunato, S., Community detection in graphs, Phys Rep, 486, 75-174, (2010)
[9] Goldenberg, A.; Zheng, AX; Fienberg, S.; Airoldi, EM, A survey of statistical network models, Found Trends Mach Learn, 2, 129-233, (2009) · Zbl 1184.68030
[10] Grünwald P (2007) The minimum description length principle. Mit Press, Cambridge
[11] Guigourès R, Boullé M, Rossi F (2012) A triclustering approach for time evolving graphs. In: Co-clustering and applications, IEEE 12th international conference on data mining workshops (ICDMW 2012), Brussels, Belgium, pp 115-122. doi:10.1109/ICDMW.2012.61
[12] Hansen, P.; Mladenovic, N., Variable neighborhood search: principles and applications, Eur J Oper Res, 130, 449-467, (2001) · Zbl 0981.90063
[13] Hartigan, J., Direct clustering of a data matrix, J Am Stat Assoc, 67, 123-129, (1972)
[14] Hintze, JL; Nelson, RD, Violin plots: a box plot-density trace synergism, Am Stat, 52, 181-184, (1998)
[15] Hopcroft, J.; Khan, O.; Kulis, B.; Selman, B., Tracking evolving communities in large linked networks, PNAS, 101, 5249-5253, (2004)
[16] Kemp C, Tenenbaum J (2006) Learning systems of concepts with an infinite relational model. In: AAAI’06
[17] Lang KJ (2009) Information theoretic comparison of stochastic graph models: some experiments. In: WAW, pp 1-12 · Zbl 1207.05183
[18] Li, Y.; Jain, A., Classification of text documents, Comput J, 41, 537-546, (1998) · Zbl 0920.68048
[19] Lin, J., Divergence measures based on the Shannon entropy, IEEE Trans Inf Theory, 37, 145-151, (1991) · Zbl 0712.94004
[20] Murphy KP (2012) Machine learning: a probabilistic perspective. MIT Press, Cambridge · Zbl 1295.68003
[21] Nadel SF (1957) The theory of social structure. Cohen & West, London
[22] Nadif M, Govaert G (2010) Model-based co-clustering for continuous data. In: ICMLA, pp 175-180 · Zbl 1187.62117
[23] Nowicki, K.; Snijders, T., Estimation and prediction for stochastic blockstructures, J Am Stat Assoc, 96, 1077-1087, (2001) · Zbl 1072.62542
[24] Palla, G.; Derenyi, I.; Farkas, I.; Vicsek, T., Uncovering the overlapping community structure of complex networks in nature and society, Nature, 435, 814-818, (2005)
[25] Palla, G.; Barabási, AL; Vicsek, T., Quantifying social group evolution, Nature, 446, 664-667, (2007)
[26] Rege M, Dong M, Fotouhi F (2006) Co-clustering documents and words using bipartite isoperimetric graph partitioning. In: ICDM, pp 532-541
[27] Schaeffer, S., Graph clustering, Comput Sci Rev, 1, 27-64, (2007) · Zbl 1302.68237
[28] Schepers, J.; Mechelen, I.; Ceulemans, E., Three-mode partitioning, Comput Stat Data Anal, 51, 1623-1642, (2006) · Zbl 1157.62447
[29] Shannon, CE, A mathematical theory of communication, Bell Syst Tech J, 27, 379-423, (1948) · Zbl 1154.94303
[30] Slonim, N.; Tishby, N., Agglomerative information bottleneck, Adv Neural Inf Process Syst, 12, 617-623, (1999)
[31] Strehl, A.; Ghosh, J., Cluster ensembles—a knowledge reuse framework for combining multiple partition, JMLR, 3, 583-617, (2003) · Zbl 1084.68759
[32] Sun J, Faloutsos C, Papadimitriou S, Yu P (2007) Graphscope: parameter-free mining of large time-evolving graphs. In: KDD ’07, pp 687-696
[33] Mechelen, I.; Bock, HH; Boeck, P., Two-mode clustering methods: a structured overview, Stat Methods Med Res, 13, 363-394, (2004) · Zbl 1053.62078
[34] White, DR; Reitz, KP, Graph and semigroup homomorphisms on networks of relations, Soc Netw, 5, 193-324, (1983)
[35] White, H.; Boorman, S.; Breiger, R., Social structure from multiple networks: I. blockmodels of roles and positions, Am J Sociol, 81, 730-780, (1976)
[36] Xing, EP; Fu, W.; Song, L., A state-space mixed membership blockmodel for dynamic network tomography, Ann Appl Stat, 4, 535-566, (2010) · Zbl 1194.62133
[37] Zhao L, Zaki M (2005) Tricluster: an effective algorithm for mining coherent clusters in 3d microarray data. In: SIGMOD conference, pp 694-705
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.