×

Learning directed acyclic graph SPNs in sub-quadratic time. (English) Zbl 1433.68342

Summary: In this paper, we present Prometheus, a graph partitioning based algorithm that creates multiple variable decompositions efficiently for learning Sum-Product Network structures across both continuous and discrete domains. Prometheus proceeds by creating multiple candidate decompositions that are represented compactly with an acyclic directed graph in which common parts of different decompositions are shared. It eliminates the correlation threshold hyperparameter often used in other structure learning techniques, allowing Prometheus to learn structures that are robust in low data regimes. Prometheus outperforms other structure learning techniques in 30 discrete and continuous domains. We also extend Prometheus to exploit sparsity in correlations between features in order to obtain an efficient sub-quadratic algorithm (w.r.t. the number of features) that scales better to high dimensional datasets.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
62H22 Probabilistic graphical models

Software:

QUIC; STL-10 dataset
PDFBibTeX XMLCite
Full Text: DOI

References:

[1] Abbe, E.; Bandeira, A. S.; Hall, G., Exact recovery in the stochastic block model, IEEE Trans. Inf. Theory, 62, 471-487 (2016) · Zbl 1359.94047
[2] Adel, T.; Balduzzi, D.; Ghodsi, A., Learning the structure of sum-product networks via an SVD-based algorithm, (UAI (2015)), 32-41
[3] Bandeira, A. S.; Dobriban, E.; Mixon, D. G.; Sawin, W. F., Certifying the restricted isometry property is hard, IEEE Trans. Inf. Theory, 59, 3448-3450 (2013) · Zbl 1364.94109
[4] Bansal, N.; Blum, A.; Chawla, S., Correlation clustering, Mach. Learn., 56, 89-113 (2004) · Zbl 1089.68085
[5] Baraniuk, R.; Davenport, M.; DeVore, R.; Wakin, M., A simple proof of the restricted isometry property for random matrices, Constr. Approx., 28, 253-263 (2008) · Zbl 1177.15015
[6] Blum, A.; Hopcroft, J.; Kannan, R., Foundations of Data Science (2016), Vorabversion eines Lehrbuchs
[7] Buluç, A.; Meyerhenke, H.; Safro, I.; Sanders, P.; Schulz, C., Recent advances in graph partitioning, (Algorithm Engineering (2016), Springer), 117-158
[8] Candes, E. J., The restricted isometry property and its implications for compressed sensing, C. R. Math., 346, 589-592 (2008) · Zbl 1153.94002
[9] Cheng, W. C.; Kok, S.; Pham, H. V.; Chieu, H. L.; Chai, K. M.A., Language modeling with sum-product networks, (Annual Conf. of the Int. Speech Communication Association (2014))
[10] Choi, A.; Darwiche, A., On relaxing determinism in arithmetic circuits, (Proceedings of the 34th International Conference on Machine Learning, vol. 70 (2017), JMLR.org), 825-833
[11] Coates, A.; Ng, A.; Lee, H., An analysis of single-layer networks in unsupervised feature learning, (Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (2011)), 215-223
[12] Cormen, T. H., Introduction to Algorithms (2009), MIT Press
[13] Darwiche, A., A differential approach to inference in Bayesian networks, J. ACM, 50, 280-305 (2003) · Zbl 1325.68226
[14] Dennis, A.; Ventura, D., Learning the architecture of sum-product networks using clustering on variables, (NIPS (2012))
[15] Dennis, A. W.; Ventura, D., Greedy structure search for sum-product networks, (IJCAI (2015)), 932-938
[16] Dinh, L.; Sohl-Dickstein, J.; Bengio, S., Density estimation using real NVP (2016), arXiv preprint
[17] Eldar, Y. C.; Kutyniok, G., Compressed Sensing: Theory and Applications (2012), Cambridge University Press
[18] Feldmann, A. E.; Foschini, L., Balanced partitions of trees and applications, Algorithmica, 71, 354-376 (2015) · Zbl 1315.68201
[19] Fiduccia, C. M.; Mattheyses, R. M., A linear-time heuristic for improving network partitions, (Papers on Twenty-Five Years of Electronic Design Automation, ACM (1988)), 241-247
[20] Friesen, A. L.; Domingos, P., Recursive decomposition for nonconvex optimization, (Proceedings of the 24th International Joint Conference on Artificial Intelligence (2015)), 253-259
[21] Gens, R.; Domingos, P., Discriminative learning of sum-product networks, (Advances in Neural Information Processing Systems (2012)), 3239-3247
[22] Gens, R.; Domingos, P., Learning the structure of sum-product networks, (ICML (2013)), 873-880
[23] Hartmann, T., Discriminative Convolutional Sum-Product Networks on GPU (2014), Rheinische Friedrich-Wilhelms-Universität: Rheinische Friedrich-Wilhelms-Universität Bonn, Ph.D. thesis
[24] Hsieh, C. J.; Sustik, M. A.; Dhillon, I. S.; Ravikumar, P., Quic: quadratic approximation for sparse inverse covariance estimation, J. Mach. Learn. Res., 15, 2911-2947 (2014) · Zbl 1319.65048
[25] Hsu, W.; Kalra, A.; Poupart, P., Online structure learning for sum-product networks with Gaussian leaves (2017), arXiv preprint
[26] Jaini, P.; Rashwan, A.; Zhao, H.; Liu, Y.; Banijamali, E.; Chen, Z.; Poupart, P., Online algorithms for sum-product networks with continuous variables, (Conference on Probabilistic Graphical Models (2016)), 228-239
[27] Karger, D. R.; Klein, P. N.; Tarjan, R. E., A randomized linear-time algorithm to find minimum spanning trees, J. ACM, 42, 321-328 (1995) · Zbl 0886.68079
[28] Kruskal, J. B., On the shortest spanning subtree of a graph and the traveling salesman problem, Proc. Am. Math. Soc., 7, 48-50 (1956) · Zbl 0070.18404
[29] Lee, S. W.; Heo, M. O.; Zhang, B. T., Online incremental structure learning of sum-product networks, (NIPS (2013)), 220-227
[30] Liang, Y.; Bekker, J.; Van den Broeck, G., Learning the structure of probabilistic sentential decision diagrams, (Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI) (2017))
[31] Liang, Y.; Van den Broeck, G., Learning logistic circuits, AAAI Proc., 1, 3 (2019)
[32] Lin, S.; Kernighan, B. W., An effective heuristic algorithm for the traveling-salesman problem, Oper. Res., 21, 498-516 (1973) · Zbl 0256.90038
[33] Liu, Z.; Luo, P.; Wang, X.; Tang, X., Deep learning face attributes in the wild, (Proceedings of the IEEE International Conference on Computer Vision (2015)), 3730-3738
[34] Mazumdar, A.; Saha, B., Clustering with noisy queries, (Advances in Neural Information Processing Systems (2017)), 5788-5799
[35] Molina, A.; Vergari, A.; Di Mauro, N.; Natarajan, S.; Esposito, F.; Kersting, K., Mixed sum-product networks: a deep architecture for hybrid domains (2018), AAAI paper
[36] Papadimitriou, C. H.; Steiglitz, K., The max-flow, min-cut theorem, (Combinatorial Optimization: Algorithms and Complexity (1998), Dover), 120-128
[37] Papamakarios, G.; Pavlakou, T.; Murray, I., Masked autoregressive flow for density estimation, (Advances in Neural Information Processing Systems (2017)), 2338-2347
[38] Peharz, R.; Geiger, B. C.; Pernkopf, F., Greedy part-wise learning of sum-product networks, (Machine Learning and Knowledge Discovery in Databases (2013), Springer), 612-627
[39] Poon, H.; Domingos, P., Sum-product networks: a new deep architecture, (2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops) (2011), IEEE), 689-690
[40] Rahman, T.; Gogate, V., Merging strategies for sum-product networks: from trees to graphs, (UAI (2016))
[41] Rooshenas, A.; Lowd, D., Learning sum-product networks with direct and indirect variable interactions, (ICML (2014)), 710-718
[42] Salakhutdinov, R.; Hinton, G. E., Deep Boltzmann machines, (AISTATS (2009)), 448-455
[43] Shi, J.; Malik, J., Normalized cuts and image segmentation (2000), Departmental Papers (CIS) 107
[44] Shwartz, O.; Nadler, B., Detecting the large entries of a sparse covariance matrix in sub-quadratic time, Inf. Inference, 5, 304-330 (2016) · Zbl 1386.94037
[45] Trapp, M.; Peharz, R.; Skowron, M.; Madl, T.; Pernkopf, F.; Trappl, R., Structure inference in sum-product networks using infinite sum-product trees, (NIPS Workshop on Practical Bayesian Nonparametrics (2016))
[46] Vergari, A.; Di Mauro, N.; Esposito, F., Simplifying, regularizing and strengthening sum-product network structure learning, (ECML-PKDD (2015)), 343-358
[47] Wilcoxon, F., Some rapid approximate statistical procedures, Ann. N.Y. Acad. Sci., 808-814 (1950)
[48] Fattahi, S.; Zhang, R. Y.; Sojoudi, S., Linear-time algorithm for learning large-scale sparse graphical models, IEEE Access, 7, 12658-12672 (2019)
[49] Zhao, H.; Melibari, M.; Poupart, P., On the relationship between sum-product networks and Bayesian networks, (ICML (2015))
[50] Zhao, H.; Poupart, P.; Gordon, G. J., A unified approach for learning the parameters of sum-product networks, (Advances in Neural Information Processing Systems (2016)), 433-441
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.