×

A constrained \(\ell1\) minimization approach for estimating multiple sparse Gaussian or nonparanormal graphical models. (English) Zbl 1460.62092

Summary: Identifying context-specific entity networks from aggregated data is an important task, arising often in bioinformatics and neuroimaging applications. Computationally, this task can be formulated as jointly estimating multiple different, but related, sparse undirected graphical models (UGM) from aggregated samples across several contexts. Previous joint-UGM studies have mostly focused on sparse Gaussian graphical models (sGGMs) and can’t identify context-specific edge patterns directly. We, therefore, propose a novel approach, SIMULE (detecting \(\underline{\mathrm{S}}\)hared and \(\underline{\mathrm{I}}\)ndividual parts of \(\underline{\mathrm{MUL}}\)tiple graphs \(\underline{\mathrm{E}}\)xplicitly) to learn multi-UGM via a constrained \(\ell1\) minimization. SIMULE automatically infers both specific edge patterns that are unique to each context and shared interactions preserved among all the contexts. Through the \(\ell \)1 constrained formulation, this problem is cast as multiple independent subtasks of linear programming that can be solved efficiently in parallel. In addition to Gaussian data, SIMULE can also handle multivariate Nonparanormal data that greatly relaxes the normality assumption that many real-world applications do not follow. We provide a novel theoretical proof showing that SIMULE achieves a consistent result at the rate \(O(\log (Kp)/n_{tot})\). On multiple synthetic datasets and two biomedical datasets, SIMULE shows significant improvement over state-of-the-art multi-sGGM and single-UGM baselines (SIMULE implementation and the used datasets https://github.com/QData/SIMULE).

MSC:

62H22 Probabilistic graphical models
62H12 Estimation in multivariate analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
62P10 Applications of statistics to biology and medical sciences; meta analysis
68T05 Learning and adaptive systems in artificial intelligence
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Antoniadis, A., & Fan, J. (2011). Regularization of wavelet approximations. Journal of the American Statistical Association, 96(455), 939-967. · Zbl 1072.62561 · doi:10.1198/016214501753208942
[2] Banerjee, O., El Ghaoui, L., & d’Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. The Journal of Machine Learning Research, 9, 485-516. · Zbl 1225.68149
[3] Boyd, S. P., & Vandenberghe, L. (2004). Convex optimization. Cambridge: Cambridge University Press. · Zbl 1058.90049 · doi:10.1017/CBO9780511804441
[4] Buchman, D., Schmidt, M., Mohamed, S., Poole, D., & de Freitas, N. (2012). On sparse, spectral and other parameterizations of binary probabilistic models. In AISTATS (pp. 173-181)
[5] Cai, T., Liu, W., & Luo, X. (2011). A constrained 1 minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494), 594-607. · Zbl 1232.62087 · doi:10.1198/jasa.2011.tm10155
[6] Candes, E., & Tao, T. (2007). The Dantzig selector: Statistical estimation when p is much larger than n. The Annals of Statistics, 35(6), 2313-2351. · Zbl 1139.62019 · doi:10.1214/009053606000001523
[7] Caruana, R. (1997). Multitask learning. Machine Learning, 28(1), 41-75. · doi:10.1023/A:1007379606734
[8] Cheng, C., Yan, K. K., Hwang, W., Qian, J., Bhardwaj, N., Rozowsky, J., et al. (2011). Construction and analysis of an integrated regulatory network derived from high-throughput sequencing data. PLoS Computational Biology, 7(11), e1002190. · doi:10.1371/journal.pcbi.1002190
[9] Chiquet, J., Grandvalet, Y., & Ambroise, C. (2011). Inferring multiple graphical structures. Statistics and Computing, 21(4), 537-553. · Zbl 1221.62085 · doi:10.1007/s11222-010-9191-2
[10] Da Wei Huang, B. T. S., & Lempicki, R. A. (2008). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols, 4(1), 44-57. · doi:10.1038/nprot.2008.211
[11] Danaher, P., Wang, P., & Witten, D. M. (2013). The joint graphical lasso for inverse covariance estimation across multiple classes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(2), 373-397. · Zbl 07555455 · doi:10.1111/rssb.12033
[12] Di Martino, A., Yan, C. G., Li, Q., Denio, E., Castellanos, F. X., Alaerts, K., et al. (2014). The autism brain imaging data exchange: Towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular Psychiatry, 19(6), 659-667. · doi:10.1038/mp.2013.78
[13] ENCODE Project Consortium. (2011). A user’s guide to the encyclopedia of DNA elements (ENCODE). PLoS Biology,9(4), e1001046.
[14] ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414), 57-74. · Zbl 1191.62101
[15] Evgeniou, T., & Pontil, M. (2004). Regularized multi-task learning. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 109-117). ACM.
[16] Fan, J., Han, F., & Liu, H. (2014). Challenges of big data analysis. National Science Review,. doi:10.1093/nsr/nwt032. · doi:10.1093/nsr/nwt032
[17] Fan, J., Liao, Y., & Mincheva, M. (2013). Large covariance estimation by thresholding principal orthogonal complements. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75(4), 603-680. · Zbl 1411.62138 · doi:10.1111/rssb.12016
[18] Fazayeli, F., & Banerjee, A. (2016). Generalized direct change estimation in ising model structure. arXiv preprint arXiv:1606.05302.
[19] Friedman, J., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432-441. · Zbl 1143.62076 · doi:10.1093/biostatistics/kxm045
[20] Guo, J., Levina, E., Michailidis, G., & Zhu, J. (2011). Joint estimation of multiple graphical models. Biometrika,. doi:10.1093/biomet/asq060. · Zbl 1214.62058 · doi:10.1093/biomet/asq060
[21] Han, F., Liu, H., & Caffo, B. (2013). Sparse median graphs estimation in a high dimensional semiparametric model. arXiv preprint arXiv:1310.3223. · Zbl 1391.62128
[22] Hara, S., & Washio, T. (2013). Learning a common substructure of multiple graphical Gaussian models. Neural Networks, 38, 23-38. · Zbl 1328.68165 · doi:10.1016/j.neunet.2012.11.004
[23] Hastie, T., Tibshirani, R., Friedman, J., Hastie, T., Friedman, J., & Tibshirani, R. (2009). The elements of statistical learning. Berlin: Springer. · Zbl 1273.62005 · doi:10.1007/978-0-387-84858-7
[24] Höfling, H., & Tibshirani, R. (2009). Estimation of sparse binary pairwise markov networks using pseudo-likelihoods. The Journal of Machine Learning Research, 10, 883-906. · Zbl 1245.62121
[25] Honorio, J., & Samaras, D. (2010). Multi-task learning of Gaussian graphical models. In Proceedings of the 27th international conference on machine learning (ICML-10) (pp. 447-454). · Zbl 1221.62085
[26] Hsieh, C. J., Sustik, M. A., Dhillon, I. S., & Ravikumar, P. D. (2011). Sparse inverse covariance matrix estimation using quadratic approximation. In NIPS (pp. 2330-2338).
[27] Huang, S., Li, J., Sun, L., Ye, J., Fleisher, A., Wu, T., et al. (2010). Learning brain connectivity of alzheimer’s disease by sparse inverse covariance estimation. NeuroImage, 50(3), 935-949. · doi:10.1016/j.neuroimage.2009.12.120
[28] Ideker, T., & Krogan, N. J. (2012). Differential network biology. Molecular Systems Biology, 8(1), 565.
[29] Kelly, C., Biswal, B. B., Craddock, R. C., Castellanos, F. X., & Milham, M. P. (2012). Characterizing variation in the functional connectome: Promise and pitfalls. Trends in Cognitive Sciences, 16(3), 181-188. · doi:10.1016/j.tics.2012.02.001
[30] Kolar, M., Song, L., Ahmed, A., Xing, E. P., et al. (2010). Estimating time-varying networks. The Annals of Applied Statistics, 4(1), 94-123. · Zbl 1189.62142 · doi:10.1214/09-AOAS308
[31] Krumsiek, J., Suhre, K., Illig, T., Adamski, J., & Theis, F. J. (2011). Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data. BMC Systems Biology, 5(1), 21. · doi:10.1186/1752-0509-5-21
[32] Lam, C., & Fan, J. (2009). Sparsistency and rates of convergence in large covariance matrix estimation. Annals of Statistics, 37(6B), 4254. · Zbl 1191.62101 · doi:10.1214/09-AOS720
[33] Lauritzen, S. L. (1996). Graphical models. Oxford: Oxford University Press. · Zbl 0907.62001
[34] Levina, E., Rothman, A., Zhu, J., et al. (2008). Sparse estimation of large covariance matrices via a nested lasso penalty. The Annals of Applied Statistics, 2(1), 245-263. · Zbl 1137.62338 · doi:10.1214/07-AOAS139
[35] Liu, H., Han, F., & Zhang, C. (2012). Transelliptical graphical models. In Advances in Neural Information Processing Systems (pp. 809-817). · Zbl 1072.62561
[36] Liu, H., Lafferty, J., & Wasserman, L. (2009). The nonparanormal: Semiparametric estimation of high dimensional undirected graphs. The Journal of Machine Learning Research, 10, 2295-2328. · Zbl 1235.62035
[37] Liu, H., Wang, L., & Zhao, T. (2014). Sparse covariance matrix estimation with eigenvalue constraints. Journal of Computational and Graphical Statistics, 23(2), 439-459. · doi:10.1080/10618600.2013.782818
[38] Liu, S., Quinn, J. A., Gutmann, M. U., & Sugiyama, M. (2013). Direct learning of sparse changes in Markov networks by density ratio estimation. In Joint European conference on machine learning and knowledge discovery in databases (pp. 596-611). Springer.
[39] Ma, S., Gong, Q., & Bohnert, H. J. (2007). An arabidopsis gene network based on the graphical Gaussian model. Genome Research, 17(11), 1614-1625. · doi:10.1101/gr.6911207
[40] Mardia, K. V., Kent, J. T., & Bibby, J. M. (1980). Multivariate analysis. London: Academic Press. · Zbl 0432.62029
[41] McCall, M. N., Uppal, K., Jaffee, H. A., Zilliox, M. J., & Irizarry, R. A. (2011). The gene expression barcode: Leveraging public data repositories to begin cataloging the human and murine transcriptomes. Nucleic Acids Research, 39(suppl 1), D1011-D1015. · doi:10.1093/nar/gkq1259
[42] Meinshausen, N., & Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, 34(3), 1436-1462. · Zbl 1113.62082 · doi:10.1214/009053606000000281
[43] Min, M. R., Ning, X., Cheng, C., & Gerstein, M. (2014). Interpretable sparse high-order Boltzmann machines. In Proceedings of the seventeenth international conference on artificial intelligence and statistics (pp. 614-622).
[44] Mohan, K., London, P., Fazel, M., Lee, S. I., & Witten, D. (2013). Node-based learning of multiple Gaussian graphical models. arXiv preprint arXiv:1303.5145. · Zbl 1318.62181
[45] Monti, R. P., Anagnostopoulos, C., & Montana, G. (2015). Learning population and subject-specific brain connectivity networks via mixed neighborhood selection. arXiv preprint arXiv:1512.01947. · Zbl 1383.62282
[46] Negahban, S., Yu, B., Wainwright, M. J., & Ravikumar, P. K. (2009). A unified framework for high-dimensional analysis of \[m\] m-estimators with decomposable regularizers. In Advances in Neural Information Processing Systems (pp. 1348-1356). · Zbl 1331.62350
[47] Ng, B., Varoquaux, G., Poline, J. B., & Thirion, B. (2013). A novel sparse group Gaussian graphical model for functional connectivity estimation. In Information processing in medical imaging (pp. 256-267). Springer. · Zbl 1437.62595
[48] Orchard, S., Ammari, M., Aranda, B., Breuza, L., Briganti, L., Broackes-Carter, F., et al. (2013). The MIntAct project IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Research,. doi:10.1093/nar/gkt1115. · doi:10.1093/nar/gkt1115
[49] Pang, H., Liu, H., & Vanderbei, R. (2014). The fastclime package for linear programming and large-scale precision matrix estimation in R. Journal of Machine Learning Research, 15, 489-493. · Zbl 1318.90002
[50] Prasad, T. K., Goel, R., Kandasamy, K., Keerthikumar, S., Kumar, S., Mathivanan, S., et al. (2009). Human protein reference database 2009 update. Nucleic Acids Research, 37(suppl 1), D767-D772. · doi:10.1093/nar/gkn892
[51] Qiu, H., Han, F., Liu, H., & Caffo, B. (2013). Joint estimation of multiple graphical models from high dimensional time series. arXiv preprint arXiv: 1311.0219. · Zbl 1414.62379
[52] Ripley, B. D. (2009). Stochastic simulation (Vol. 316). London: Wiley. · Zbl 1113.65003
[53] Rothman, A. J. (2012). Positive definite estimators of large covariance matrices. Biometrika, 99(3), 733-740. · Zbl 1437.62595 · doi:10.1093/biomet/ass025
[54] Rothman, A. J., Bickel, P. J., Levina, E., Zhu, J., et al. (2008). Sparse permutation invariant covariance estimation. Electronic Journal of Statistics, 2, 494-515. · Zbl 1320.62135 · doi:10.1214/08-EJS176
[55] Schmidt, M., & Murphy, K. (2010). Convex structure learning in log-linear models: Beyond pairwise potentials. In Proceedings of the international conference on artificial intelligence and statistics (AISTATS).
[56] Stark, C., Breitkreutz, B. J., Reguly, T., Boucher, L., Breitkreutz, A., & Tyers, M. (2006). Biogrid: A general repository for interaction datasets. Nucleic Acids Research, 34(suppl 1), D535-D539. · doi:10.1093/nar/gkj109
[57] Sugiyama, M., Kanamori, T., Suzuki, T., du Plessis, M. C., Liu, S., & Takeuchi, I. (2013). Density-difference estimation. Neural Computation, 25(10), 2734-2775. · Zbl 1418.62147 · doi:10.1162/NECO_a_00492
[58] Sun, L., Patel, R., Liu, J., Chen, K., Wu, T., Li, J., et al. (2009). Mining brain region connectivity for Alzheimer’s disease study via sparse inverse covariance estimation. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 1335-1344). ACM. · Zbl 1072.62561
[59] The Cancer Genome Atlas Research Network. (2011). Integrated genomic analyses of ovarian carcinoma. Nature, 474(7353), 609-615. · Zbl 1368.62181
[60] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 267-288. · Zbl 0850.62538
[61] Wainwright, M. J., & Jordan, M. I. (2006). Log-determinant relaxation for approximate inference in discrete Markov random fields. IEEE Transactions on Signal Processing, 54(6), 2099-2109. · Zbl 1374.94616 · doi:10.1109/TSP.2006.874409
[62] Yang, E., Lozano, A. C., & Ravikumar, P. K. (2014). Elementary estimators for graphical models. In Advances in neural information processing systems (pp. 2159-2167).
[63] Yuan, M., & Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika, 94(1), 19-35. · Zbl 1142.62408 · doi:10.1093/biomet/asm018
[64] Zhang, B., & Wang, Y. (2012). Learning structural changes of Gaussian graphical models in controlled experiments. arXiv preprint arXiv:1203.3532.
[65] Zhang, C. H. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of statistics, 38(2), 894-942. · Zbl 1183.62120 · doi:10.1214/09-AOS729
[66] Zhang, Y., & Schneider, J. G. (2010). Learning multiple tasks with a sparse matrix-normal penalty. In Advances in neural information processing systems (pp. 2550-2558).
[67] Zhu, Y., Shen, X., & Pan, W. (2014). Structural pursuit over multiple undirected graphs. Journal of the American Statistical Association, 109(508), 1683-1696. · Zbl 1368.62181 · doi:10.1080/01621459.2014.921182
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.