Consistency and asymptotic normality of stochastic block models estimators from sampled data. (English) Zbl 1454.62286

This paper concerns the statistical analysis of networks with missing data. The authors first present the stochastic block model (SBM) and missing data for the SBM, then give examples of sampling designs, define the complete-observed log-likelihood, and introduce parametric models together with a set of assumptions on the parameter space. They introduce the notions of identifiability of the SBM, permutation, equivalence and parameter symmetry, as well as distance, the set of local assignments, c-regular assignments, class distinctness and the confusion matrix. Local asymptotic normality of the complete-observed model is proved. The main result states that the observed-likelihood ratio behaves like the complete-likelihood ratio, up to a bounded multiplicative factor. As a consequence, the asymptotic behavior of the maximum likelihood estimator (MLE) and of the variational estimator (VE) for the incomplete-data models is investigated. Appendix A contains proofs of technical results and Appendix B proofs related to the main results. Appendix C contains results on sub-exponential random variables and Appendix D results on likelihood ratios of assignments.
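For readers unfamiliar with the setting, the following minimal Python sketch illustrates a binary SBM observed through random dyad sampling, one simple sampling design in which dyads are missing completely at random. It is an illustration under stated assumptions, not the authors' construction: the function names (`simulate_sbm`, `sample_dyads`, `complete_observed_loglik`), the parameter values, and the reading of the "complete-observed log-likelihood" as the joint log-likelihood of the latent labels and the observed dyads (with sampling terms dropped) are all hypothetical choices made here.

```python
import math
import random


def simulate_sbm(n, alpha, pi, rng):
    """Draw labels Z_i ~ Categorical(alpha) and a symmetric binary adjacency matrix."""
    z = rng.choices(range(len(alpha)), weights=alpha, k=n)
    x = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability depends only on the blocks of the two endpoints.
            edge = 1 if rng.random() < pi[z[i]][z[j]] else 0
            x[i][j] = x[j][i] = edge
    return z, x


def sample_dyads(n, rho, rng):
    """Assumed design: each dyad {i, j} is observed independently with probability rho."""
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            obs = 1 if rng.random() < rho else 0
            r[i][j] = r[j][i] = obs
    return r


def complete_observed_loglik(z, x, r, alpha, pi):
    """Log-likelihood of the labels and of the observed dyads only (sampling terms omitted)."""
    n = len(z)
    ll = sum(math.log(alpha[k]) for k in z)
    for i in range(n):
        for j in range(i + 1, n):
            if r[i][j]:  # unobserved dyads contribute nothing
                p = pi[z[i]][z[j]]
                ll += math.log(p) if x[i][j] else math.log(1.0 - p)
    return ll


if __name__ == "__main__":
    rng = random.Random(0)
    alpha = [0.5, 0.5]                # hypothetical block proportions
    pi = [[0.8, 0.1], [0.1, 0.6]]     # hypothetical connectivity matrix
    z, x = simulate_sbm(30, alpha, pi, rng)
    r = sample_dyads(30, 0.7, rng)
    print(complete_observed_loglik(z, x, r, alpha, pi))
```

With every dyad unobserved, the function reduces to the log-probability of the labels alone, which gives a quick sanity check on the implementation.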

MSC:

62M45 Neural nets and related approaches to inference from stochastic processes
62D10 Missing data
