
High-dimensional Poisson structural equation model learning via \(\ell_1\)-regularized regression. (English) Zbl 1434.68445

Summary: In this paper, we develop a new approach to learning high-dimensional Poisson structural equation models from observational data alone, without strong assumptions such as faithfulness or a sparse moralized graph. A key component of our method is to decouple the ordering estimation and the parent search, so that each problem can be addressed efficiently using \(\ell_1\)-regularized regression and the moments relation. We show that a sample size of \(n = \Omega(d^2 \log^9 p)\) is sufficient for our polynomial-time Moments Ratio Scoring (MRS) algorithm to recover the true directed graph, where \(p\) is the number of nodes and \(d\) is the maximum indegree. We verify through simulations that our algorithm is statistically consistent in the high-dimensional \(p > n\) setting and performs well compared to the state-of-the-art ODS, GES, and MMHC algorithms. We also demonstrate on real multivariate count data that our MRS algorithm is better suited to estimating DAG models for count data than other methods used for discrete data.
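
The moments relation underlying the ordering step can be illustrated by the factorial moments of the Poisson distribution. The display below is only a sketch of the general overdispersion principle for Poisson DAG models (in the spirit of overdispersion scoring), not necessarily the exact moments-ratio statistic used by MRS. If node \(j\) is conditionally Poisson given its parents,
\[
X_j \mid X_{\mathrm{pa}(j)} \sim \mathrm{Poisson}\bigl(\lambda_j(X_{\mathrm{pa}(j)})\bigr)
\quad\Longrightarrow\quad
\mathbb{E}\bigl[X_j(X_j-1) \mid X_{\mathrm{pa}(j)}\bigr] = \lambda_j(X_{\mathrm{pa}(j)})^2 = \mathbb{E}\bigl[X_j \mid X_{\mathrm{pa}(j)}\bigr]^2,
\]
and, marginalizing over the parents, Jensen's inequality gives
\[
\frac{\mathbb{E}[X_j(X_j-1)]}{\mathbb{E}[X_j]^2}
= \frac{\mathbb{E}\bigl[\lambda_j(X_{\mathrm{pa}(j)})^2\bigr]}{\bigl(\mathbb{E}[\lambda_j(X_{\mathrm{pa}(j)})]\bigr)^2} \ge 1,
\]
with equality exactly when \(\lambda_j\) is degenerate, which under the model singles out source nodes. Roughly, conditioning on suitable sets of already-ordered nodes restores such equalities step by step, so an ordering can be estimated by comparing moment ratios, while the parents of each node are then searched among its predecessors via \(\ell_1\)-regularized Poisson regression (e.g., with glmnet-style coordinate-descent solvers).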

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62F07 Statistical ranking and selection procedures
62H12 Estimation in multivariate analysis
Full Text: arXiv Link

References:

[1] Peter Bühlmann, Jonas Peters, Jan Ernest, et al. CAM: Causal additive models, high-dimensional order search and penalized regression. The Annals of Statistics, 42(6):2526-2556, 2014. · Zbl 1309.62063
[2] David M Chickering, Dan Geiger, David Heckerman, et al. Learning Bayesian networks is NP-hard. Technical report, Citeseer, 1994. · Zbl 0831.68096
[3] David Maxwell Chickering. Learning Bayesian networks is NP-complete. In Learning from Data, pages 121-130. Springer, 1996.
[4] David Maxwell Chickering. Optimal structure identification with greedy search. The Journal of Machine Learning Research, 3:507-554, 2003. · Zbl 1084.68519
[5] Kenji Doya. Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press, 2007. · Zbl 1137.91015
[6] Mathias Drton, Wenyu Chen, and Y Samuel Wang. On causal discovery with equal variance assumption. arXiv preprint arXiv:1807.03419, 2018. · Zbl 1435.62166
[7] Jerome Friedman, Trevor Hastie, and Rob Tibshirani. glmnet: Lasso and elastic-net regularized generalized linear models. R package version, 1(4), 2009.
[8] Jerome Friedman, Trevor Hastie, and Rob Tibshirani. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1):1, 2010. · Zbl 1109.62302
[9] Nir Friedman, Michal Linial, Iftach Nachman, and Dana Pe'er. Using Bayesian networks to analyze expression data. Journal of Computational Biology, 7(3-4):601-620, 2000.
[10] Michael Friendly. Lahman: Sean 'Lahman' Baseball Database, 2017. URL https://CRAN.R-project.org/package=Lahman. R package version 6.0-0.
[11] Morten Frydenberg. The chain graph Markov property. Scandinavian Journal of Statistics, pages 333-353, 1990. · Zbl 0713.60013
[12] Asish Ghoshal and Jean Honorio. Information-theoretic limits of Bayesian network structure learning. In Artificial Intelligence and Statistics, pages 767-775, 2017a.
[13] Asish Ghoshal and Jean Honorio. Learning identifiable Gaussian Bayesian networks in polynomial time and sample complexity. In Advances in Neural Information Processing Systems, pages 6457-6466, 2017b.
[14] Asish Ghoshal and Jean Honorio. Learning linear structural equation models in polynomial time and sample complexity. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, pages 1466-1475, 2018.
[15] David Heckerman, Dan Geiger, and David M Chickering. Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3):197-243, 1995. · Zbl 0831.68096
[16] Patrik O Hoyer, Dominik Janzing, Joris M Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems, pages 689-696, 2009.
[17] David I Inouye, Eunho Yang, Genevera I Allen, and Pradeep Ravikumar. A review of multivariate distributions for count data derived from the Poisson distribution. Wiley Interdisciplinary Reviews: Computational Statistics, 9(3), 2017.
[18] Jinzhu Jia, Fang Xie, and Lihu Xu. Sparse Poisson regression with penalized weighted score function. arXiv preprint arXiv:1703.03965, 2017. · Zbl 1431.62300
[19] Jeffrey O Kephart and Steve R White. Directed-graph epidemiological models of computer viruses. In Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy, pages 343-359. IEEE, 1991.
[20] Steffen L Lauritzen. Graphical Models. Oxford University Press, 1996. · Zbl 0907.62001
[21] Po-Ling Loh and Peter Bühlmann. High-dimensional learning of linear causal networks via inverse covariance estimation. The Journal of Machine Learning Research, 15(1):3065-3105, 2014. · Zbl 1318.68148
[22] Nicolai Meinshausen and Peter Bühlmann. High-dimensional graphs and variable selection with the lasso. The Annals of Statistics, pages 1436-1462, 2006. · Zbl 1113.62082
[23] Gunwoong Park and Hyewon Park. Identifiability of generalized hypergeometric distribution (GHD) directed acyclic graphical models. In Proceedings of Machine Learning Research, pages 158-166, 2019.
[24] Gunwoong Park and Garvesh Raskutti. Learning large-scale Poisson DAG models based on overdispersion scoring. In Advances in Neural Information Processing Systems, pages 631-639, 2015. · Zbl 1470.68156
[25] Gunwoong Park and Garvesh Raskutti. Learning quadratic variance function (QVF) DAG models via overdispersion scoring (ODS). Journal of Machine Learning Research, 18(224):1-44, 2018. · Zbl 1470.68156
[26] Jonas Peters and Peter Bühlmann. Identifiability of Gaussian structural equation models with equal error variances. Biometrika, 101(1):219-228, 2014. · Zbl 1285.62005
[27] Jonas Peters, Joris Mooij, Dominik Janzing, and Bernhard Schölkopf. Identifiability of causal graphs using functional models. In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, pages 589-598, Corvallis, OR, USA, July 2011. AUAI Press.
[28] Pradeep Ravikumar, Martin J Wainwright, Garvesh Raskutti, and Bin Yu. High-dimensional covariance estimation by minimizing \(\ell_1\)-penalized log-determinant divergence. Electronic Journal of Statistics, 5:935-980, 2011. · Zbl 1274.62190
[29] Shohei Shimizu, Patrik O Hoyer, Aapo Hyvärinen, and Antti Kerminen. A linear non-Gaussian acyclic model for causal discovery. The Journal of Machine Learning Research, 7:2003-2030, 2006. · Zbl 1222.68304
[30] Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Aapo Hyvärinen, Yoshinobu Kawahara, Takashi Washio, Patrik O Hoyer, and Kenneth Bollen. DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model. Journal of Machine Learning Research, 12(Apr):1225-1248, 2011. · Zbl 1280.68195
[31] Peter Spirtes. Directed cyclic graphical representations of feedback models. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, pages 491-498. Morgan Kaufmann Publishers Inc., 1995.
[32] Peter Spirtes, Clark N Glymour, and Richard Scheines. Causation, Prediction, and Search. MIT Press, 2000. · Zbl 0806.62001
[33] Ioannis Tsamardinos and Constantin F Aliferis. Towards principled feature selection: Relevancy, filters and wrappers. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics. Morgan Kaufmann Publishers: Key West, FL, USA, 2003.
[34] Ioannis Tsamardinos, Laura E Brown, and Constantin F Aliferis. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1):31-78, 2006. · Zbl 1470.68192
[35] Caroline Uhler, Garvesh Raskutti, Peter Bühlmann, and Bin Yu. Geometry of the faithfulness assumption in causal inference. The Annals of Statistics, pages 436-463, 2013. · Zbl 1267.62068
[36] Martin J Wainwright, John D Lafferty, and Pradeep K Ravikumar. High-dimensional graphical model selection using \(\ell_1\)-regularized logistic regression. In Advances in Neural Information Processing Systems, pages 1465-1472, 2006. · Zbl 1189.62115
[37] Ying-Wooi Wan, Genevera I Allen, Yulia Baker, Eunho Yang, Pradeep Ravikumar, Matthew Anderson, and Zhandong Liu. XMRF: an R package to fit Markov networks to high-throughput genetics data. BMC Systems Biology, 10(3):69, 2016.
[38] Eunho Yang, Pradeep K Ravikumar, Genevera I Allen, and Zhandong Liu. On Poisson graphical models. In Advances in Neural Information Processing Systems, pages 1718-1726, 2013. · Zbl 1351.62111
[39] Eunho Yang, Pradeep Ravikumar, Genevera I Allen, and Zhandong Liu. Graphical models via univariate exponential family distributions. Journal of Machine Learning Research, 16(1):3813-3847, 2015. · Zbl 1351.62111
[40] Jiji Zhang and Peter Spirtes.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references of the original paper as accurately as possible without claiming completeness or perfect matching.