×

The geometry of continuous latent space models for network data. (English) Zbl 1429.62433

Summary: We review the class of continuous latent space (statistical) models for network data, paying particular attention to the role of the geometry of the latent space. In these models, the presence/absence of network dyadic ties are assumed to be conditionally independent given the dyads’ unobserved positions in a latent space. In this way, these models provide a probabilistic framework for embedding network nodes in a continuous space equipped with a geometry that facilitates the description of dependence between random dyadic ties. Specifically, these models naturally capture homophilous tendencies and triadic clustering, among other common properties of observed networks. In addition to reviewing the literature on continuous latent space models from a geometric perspective, we highlight the important role the geometry of the latent space plays on properties of networks arising from these models via intuition and simulation. Finally, we discuss results from spectral graph theory that allow us to explore the role of the geometry of the latent space, independent of network size. We conclude with conjectures about how these results might be used to infer the appropriate latent space geometry from observed networks.

MSC:

62M45 Neural nets and related approaches to inference from stochastic processes
05C90 Applications of graph theory
62H30 Classification and discrimination; cluster analysis (statistical aspects)
PDFBibTeX XMLCite
Full Text: DOI arXiv Euclid

References:

[1] Abbe, E., Bandeira, A. S. and Hall, G. (2016). Exact recovery in the stochastic block model. IEEE Trans. Inform. Theory 62 471-487. · Zbl 1359.94047 · doi:10.1109/TIT.2015.2490670
[2] Abu-Ata, M. and Dragan, F. F. (2016). Metric tree-like structures in real-world networks: An empirical study. Networks 67 49-68.
[3] Airoldi, E. M., Blei, D. M., Fienberg, S. E., Goldenberg, A., Xing, E. P. and Zheng, A. X. (2008a). Statistical Network Analysis: Models, Issues, and New Directions: ICML 2006 Workshop on Statistical Network Analysis, Pittsburgh, PA, USA, June 29, 2006, Revised Selected Papers 4503. Springer, Berlin.
[4] Airoldi, E. M., Blei, D. M., Fienberg, S. E. and Xing, E. P. (2008b). Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 9 1981-2014. · Zbl 1225.68143
[5] Aitchison, J., Barceló-Vidal, C., Martín-Fernández, J. and Pawlowsky-Glahn, V. (2000). Logratio analysis and compositional distance. Mathematical Geology 32 271-275. · Zbl 1101.86309
[6] Aldecoa, R., Orsini, C. and Krioukov, D. (2015). Hyperbolic graph generator. Comput. Phys. Commun. 196 492-496. · Zbl 1360.68638 · doi:10.1016/j.cpc.2015.05.028
[7] Amini, A. A., Chen, A., Bickel, P. J. and Levina, E. (2013). Pseudo-likelihood methods for community detection in large sparse networks. Ann. Statist. 41 2097-2122. · Zbl 1277.62166 · doi:10.1214/13-AOS1138
[8] Asta, D. and Shalizi, C. R. (2014). Geometric network comparison. Under review. Preprint. Available at arXiv:1411.1350.
[9] Barabási, A.-L. and Albert, R. (1999). Emergence of scaling in random networks. Science 286 509-512. · Zbl 1226.05223 · doi:10.1126/science.286.5439.509
[10] Bartholomew, D., Knott, M. and Moustaki, I. (2011). Latent Variable Models and Factor Analysis: A Unified Approach, 3rd ed. Wiley Series in Probability and Statistics. Wiley, Chichester. · Zbl 1266.62040
[11] Belkin, M. and Niyogi, P. (2005). Towards a theoretical foundation for Laplacian-based manifold methods. In Learning Theory. Lecture Notes in Computer Science 3559 486-500. Springer, Berlin. · Zbl 1137.68521
[12] Bickel, P. J. and Chen, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA. 106 21068-21073. · Zbl 1359.62411 · doi:10.1073/pnas.0907096106
[13] Butts, C. T. (2008). Network: A package for managing relational data in R. J. Stat. Softw. 24.
[14] Butts, C. T. (2016). sna: Tools for Social Network Analysis. R package version 2.4. https://CRAN.R-project.org/package=sna.
[15] Celinska, D. and Kopczynski, E. (2017). Programming languages in GitHub: A visualization in hyperbolic plane. In Proceedings of the Eleventh International AAAI Conference on Web and Social Media (ICWSM 2017) 727-728. Association for the Advancement of Artificial Intelligence. Menlo Park, CA.
[16] Chen, K. and Lei, J. (2018). Network cross-validation for determining the number of communities in network data. J. Amer. Statist. Assoc. 113 241-251. · Zbl 1398.62159 · doi:10.1080/01621459.2016.1246365
[17] Clauset, A., Moore, C. and Newman, M. E. J. (2008). Hierarchical structure and the prediction of missing links in networks. Nature 453 98-101.
[18] Clauset, A., Newman, M. E. and Moore, C. (2004). Finding community structure in very large networks. Phys. Rev. E 70 066111.
[19] Coleman, J. S. (1964). Introduction to Mathematical Sociology. Free Press, New York.
[20] Csardi, G. and Nepusz, T. (2006). The igraph software package for complex network research. InterJournal Complex Systems 1695.
[21] Davis, J. A. (1970). Clustering and hierarchy in interpersonal relations: Testing two graph theoretical models on 742 sociomatrices. Am. Sociol. Rev. 843-851.
[22] Diaconis, P. and Janson, S. (2008). Graph limits and exchangeable random graphs. Rend. Mat. Appl. (7) 28 33-61. · Zbl 1162.60009
[23] Doveton, H. (1998). Beyond the perfect martini: Teaching the mathematics of petrological logs. In Proceedings of IAMG98, The Fourth Annual Conference of the International Association for Mathematical Geology 71-75. ACM Press/Addison-Wesley Co., De Frede, Naples.
[24] Eash, R., Chon, K., Lee, Y. and Boyce, D. (1979). Equilibrium traffic assignment on an aggregated highway network for sketch planning. Transportation Research 13 243-257.
[25] Erdős, P. and Rényi, A. (1960). On the evolution of random graphs. Magy. Tud. Akad. Mat. Kut. Intéz. Közl. 5 17-61. · Zbl 0103.16301
[26] Fienberg, S. E. and Wasserman, S. S. (1981). Categorical data analysis of single sociometric relations. Sociol. Method. 12 156-192.
[27] Frank, O. and Strauss, D. (1986). Markov graphs. J. Amer. Statist. Assoc. 81 832-842. · Zbl 0607.05057 · doi:10.1080/01621459.1986.10478342
[28] Gao, C., Lu, Y. and Zhou, H. H. (2015). Rate-optimal graphon estimation. Ann. Statist. 43 2624-2652. · Zbl 1332.60050 · doi:10.1214/15-AOS1354
[29] Gilbert, E. N. (1959). Random graphs. Ann. Math. Stat. 30 1141-1144. · Zbl 0168.40801 · doi:10.1214/aoms/1177706098
[30] Goodreau, S. M., Kitts, J. A. and Morris, M. (2009). Birds of a feather, or friend of a friend? Using exponential random graph models to investigate adolescent social networks. Demography 46 103-125.
[31] Handcock, M. S., Raftery, A. E. and Tantrum, J. M. (2007). Model-based clustering for social networks. J. Roy. Statist. Soc. Ser. A 170 301-354.
[32] Hein, M., Audibert, J.-Y. and von Luxburg, U. (2007). Graph Laplacians and their convergence on random neighborhood graphs. J. Mach. Learn. Res. 8 1325-1368. · Zbl 1222.68215
[33] Hoff, P. D. (2005). Bilinear mixed-effects models for dyadic data. J. Amer. Statist. Assoc. 100 286-295. · Zbl 1117.62353 · doi:10.1198/016214504000001015
[34] Hoff, P. (2008). Modeling homophily and stochastic equivalence in symmetric relational data. In Advances in Neural Information Processing Systems 657-664.
[35] Hoff, P. D. (2009). Multiplicative latent factor models for description and prediction of social networks. Computational and Mathematical Organization Theory 15 261-272.
[36] Hoff, P. D., Raftery, A. E. and Handcock, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 1090-1098. · Zbl 1041.62098 · doi:10.1198/016214502388618906
[37] Holland, P. W. and Leinhardt, S. (1970). A method for detecting structure in sociometric data. Amer. J. Sociol. 492-513.
[38] Holland, P. W. and Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs. J. Amer. Statist. Assoc. 76 33-65. · Zbl 0457.62090 · doi:10.1080/01621459.1981.10477598
[39] Holly, J. E. (2001). Pictures of ultrametric spaces, the \(p\)-adic numbers, and valued fields. Amer. Math. Monthly 108 721-728. · Zbl 1039.12003 · doi:10.1080/00029890.2001.11919803
[40] Ibragimov, Z. (2014). A hyperbolic filling for ultrametric spaces. Comput. Methods Funct. Theory 14 315-329. · Zbl 1322.30022 · doi:10.1007/s40315-014-0050-6
[41] Ivriĭ, V. Ja. (1980). The second term of the spectral asymptotics for a Laplace-Beltrami operator on manifolds with boundary. Funktsional. Anal. i Prilozhen. 14 25-34. · Zbl 0448.58024
[42] James, C. (1990). Foundations of Social Theory. Belknap, Cambridge, MA.
[43] Kolaczyk, E. D. and Csárdi, G. (2014). Statistical Analysis of Network Data with R. Use R! Springer, New York. · Zbl 1290.62002
[44] Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A. and Boguñá, M. (2010). Hyperbolic geometry of complex networks. Phys. Rev. E (3) 82 036106, 18.
[45] Krivitsky, P. N., Handcock, M. S., Raftery, A. E. and Hoff, P. D. (2009). Representing degree distributions, clustering, and homophily in social networks with latent cluster random effects models. Soc. Netw. 31 204-213.
[46] Kunegis, J. (2013). KONECT—The Koblenz Network Collection.
[47] Lamping, J., Rao, R. and Pirolli, P. (1995). A focus+ context technique based on hyperbolic geometry for visualizing large hierarchies. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems 401-408. ACM Press/Addison-Wesley, New York.
[48] Lazarsfeld, P. F., Henry, N. W. and Anderson, T. W. (1968). Latent Structure Analysis 109. Houghton Mifflin, Boston.
[49] Lichnerowicz, A. (1958). Geometrie des groupes de transformations, Dunod, Paris, 1958. Zentralblatt MATH 96. · Zbl 0096.16001
[50] Lovász, L. (2012). Large Networks and Graph Limits. American Mathematical Society Colloquium Publications 60. Amer. Math. Soc., Providence, RI. · Zbl 1292.05001
[51] Madore, D. Hyperbolic maze. Available at http://www.madore.org/ david/math/hyperbolic-maze.html. Hyperbolic maze.
[52] McCormick, T. H. and Zheng, T. (2015). Latent surface models for networks using aggregated relational data. J. Amer. Statist. Assoc. 110 1684-1695. · Zbl 1373.62579 · doi:10.1080/01621459.2014.991395
[53] Minhas, S., Hoff, P. D. and Ward, M. D. (2016). Inferential approaches for network analyses: AMEN for latent factor models. Preprint. Available at arXiv:1611.00460.
[54] Munzner, T. (1997). H3: Laying out large directed graphs in 3D hyperbolic space. In IEEE Symposium on Information Visualization, 1997. Proceedings 2-10. IEEE Press, New York.
[55] Newman, M. E. and Girvan, M. (2004). Finding and evaluating community structure in networks. Phys. Rev. E 69 026113.
[56] Nickel, C. L. M. (2009). Random dot product graphs a model for social networks. Ph.D. dissertation. Johns Hopkins Univ., Baltimore, MD.
[57] Opsahl, T. (2009). Structure and evolution of weighted networks. Ph.D thesis, Queen Mary, Univ. London.
[58] Padgett, J. F. (1994). Marriage and Elite Structure in Renaissance Florence, 1282-1500. Social Science History Association.
[59] Pao, H., Coppersmith, G. A. and Priebe, C. E. (2011). Statistical inference on random graphs: Comparative power analyses via Monte Carlo. J. Comput. Graph. Statist. 20 395-416.
[60] Pattison, P. and Robins, G. (2002). Neighborhood-based models for social networks. Sociol. Method. 32 301-337.
[61] R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
[62] Rapoport, A. (1953). Spread of information through a population with socio-structural bias. I. Assumption of transitivity. Bull. Math. Biophys. 15 523-533.
[63] Robins, G., Pattison, P., Kalish, Y. and Lusher, D. (2007). An introduction to exponential random graph (p∗) models for social networks. Soc. Netw. 29 173-191.
[64] Rogue, Z. and Rogue, T. (2011). HyperRogue. Available at http://www.roguetemple.com/z/hyper/.
[65] Saldaña, D. F., Yu, Y. and Feng, Y. (2017). How many communities are there? J. Comput. Graph. Statist. 26 171-181.
[66] Sarkar, P. and Moore, A. W. (2006). Dynamic social network analysis using latent space models. In Advances in Neural Information Processing Systems 1145-1152.
[67] Schweinberger, M. and Snijders, T. A. (2003). Settings in social networks: A measurement model. Sociol. Method. 33 307-341.
[68] Sewell, D. K. and Chen, Y. (2015). Latent space models for dynamic networks. J. Amer. Statist. Assoc. 110 1646-1657. · Zbl 1373.62580 · doi:10.1080/01621459.2014.988214
[69] Simmel, G. (1950). The sociology of Georg Simmel. In Individual and Society. 3-84. Free Press, New York.
[70] Smith, A. (2017). Statistical methodology for multiple networks. PhD thesis, The Ohio State Univ., Columbus, OH.
[71] Snijders, T. A. (2002). Markov chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure 3 1-40.
[72] Snijders, T. A. B. and Nowicki, K. (1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure. J. Classification 14 75-100. · Zbl 0896.62063 · doi:10.1007/s003579900004
[73] Snijders, T. A., Pattison, P. E., Robins, G. L. and Handcock, M. S. (2006). New specifications for exponential random graph models. Sociol. Method. 36 99-153.
[74] Solé, R. V. and Valverde, S. (2004). Information theory of complex networks: On evolution and architectural constraints. In Complex Networks. Lecture Notes in Physics 650 189-207. Springer, Berlin. · Zbl 1109.90068
[75] Spearman, C. (1904). “General intelligence,” objectively determined and measured. The American Journal of Psychology 15 201-292.
[76] Ting, D., Huang, L. and Jordan, M. (2011). An analysis of the convergence of graph Laplacians. Preprint. Available at arXiv:1101.5435.
[77] van der Linden, W. J. and Hambleton, R. K., eds. (2013). Handbook of Modern Item Response Theory. Springer, New York. · Zbl 0872.62099
[78] van Duijn, M. A. J., Snijders, T. A. B. and Zijlstra, B. J. H. (2004). \(p_2\): A random effects model with covariates for directed graphs. Stat. Neerl. 58 234-254. · Zbl 1050.62117 · doi:10.1046/j.0039-0402.2003.00258.x
[79] Wasserman, S. and Pattison, P. (1996). Logit models and logistic regressions for social networks. I. An introduction to Markov graphs and \(p\). Psychometrika 61 401-425. · Zbl 0866.92029 · doi:10.1007/BF02294547
[80] Watts, D. J. (1999a). Networks, dynamics, and the small-world phenomenon. Amer. J. Sociol. 105 493-527.
[81] Watts, D. J. and Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature 393 440-442. · Zbl 1368.05139 · doi:10.1038/30918
[82] Weyl, H. (1911). Sur la distribution asymptotique des valeurs propres. Nouvelles de la Société des Sciences sur Göttingen, Mathematical-Physical Class 110-117.
[83] Wolfe, P. J. and Olhede, S. C. (2013). Nonparametric graphon estimation. Preprint. Available at arXiv:1309.5936.
[84] Young, S. J. and Scheinerman, E. R. (2007). Random dot product graph models for social networks. In Algorithms and Models for the Web-Graph. Lecture Notes in Computer Science 4863 138-149. Springer, Berlin. · Zbl 1136.05322
[85] Zachary, W. W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33 452-473.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.