×

Learning discrete decomposable graphical models via constraint optimization. (English) Zbl 1505.62200

Summary: Statistical model learning problems are traditionally solved using either heuristic greedy optimization or stochastic simulation, such as Markov chain Monte Carlo or simulated annealing. Recently, there has been an increasing interest in the use of combinatorial search methods, including those based on computational logic. Some of these methods are particularly attractive since they can also be successful in proving the global optimality of solutions, in contrast to stochastic algorithms that only guarantee optimality at the limit. Here we improve and generalize a recently introduced constraint-based method for learning undirected graphical models. The new method combines perfect elimination orderings with various strategies for solution pruning and offers a dramatic improvement both in terms of time and memory complexity. We also show that the method is capable of efficiently handling a more general class of models, called stratified/labeled graphical models, which have an astronomically larger model space.

MSC:

62-08 Computational methods for problems pertaining to statistics
62H22 Probabilistic graphical models
62H17 Contingency tables
68N17 Logic programming
68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)

Software:

CPLEX; Sat4j
PDFBibTeX XMLCite
Full Text: DOI Link

References:

[1] Bartlett, M., Cussens, J.: Advances in Bayesian network learning using integer programming. In: Proceedings of the 29th International Conference on Uncertainty in Artificial Intelligence, pp. 182-191. AUAI Press (2013)
[2] Berg, J., Järvisalo, M., Malone, B.: Learning optimal bounded treewidth Bayesian networks via maximum satisfiability. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp. 86-95. JMLR.org (2014)
[3] Boutilier, C., Friedman, N., Goldszmidt, M., Koller, D.: Context-specific independence in Bayesian networks. In: Proceedings of the 12th International Conference on Uncertainty in Artificial Intelligence, pp. 115-123. Morgan Kaufmann (1996)
[4] Brewka, G., Eiter, T., Truszczyński, M.: Answer set programming at a glance. Commun. ACM 54(12), 92-103 (2011) · doi:10.1145/2043174.2043195
[5] Chandran, L.S., Ibarra, L., Ruskey, F., Sawada, J.: Generating and characterizing the perfect elimination orderings of a chordal graph. Theor. Comput. Sci. 307(2), 303-317 (2003) · Zbl 1070.68116 · doi:10.1016/S0304-3975(03)00221-4
[6] Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the 3rd Annual ACM Symposium on Theory of Computing, pp. 151-158. ACM Press (1971) · Zbl 0253.68020
[7] Corander, J.: Labelled graphical models. Scand. J. Stat. 30(3), 493-508 (2003) · Zbl 1034.62049 · doi:10.1111/1467-9469.00344
[8] Corander, J., Ekdahl, M., Koski, T.: Parallel interacting MCMC for learning of topologies of graphical models. Data Min. Knowl. Discov. 17(3), 431-456 (2008) · doi:10.1007/s10618-008-0099-9
[9] Corander, J., Janhunen, T., Rintanen, J., Nyman, H., Pensar, J.: Learning chordal Markov networks by constraint satisfaction. In: Proceedings of the 27th Annual Conference on Neural Information Processing Systems, pp. 1349-1357. NIPS Foundation (2013)
[10] Cussens, J.: Bayesian network learning by compiling to weighted MAX-SAT. In: Proceedings of the 24th International Conference on Uncertainty in Artificial Intelligence, pp. 105-112. AUAI Press (2008)
[11] Dawid, A.P., Lauritzen, S.L.: Hyper-Markov laws in the statistical analysis of decomposable graphical models. Ann. Stat. 21(3), 1272-1317 (1993) · Zbl 0815.62038 · doi:10.1214/aos/1176349260
[12] Dellaportas, P., Forster, J.J.: Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models. Biometrika 86(3), 615-633 (1999) · Zbl 0949.62050 · doi:10.1093/biomet/86.3.615
[13] Dirac, G.A.: On rigid circuit graphs. Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg 25(1-2), 71-76 (1961) · Zbl 0098.14703 · doi:10.1007/BF02992776
[14] Eriksen, P.S.: Context specific interaction models. Technical Report, Department of Mathematical Sciences, Aalborg University (1999)
[15] Eriksen, P.S.: Decomposable log-linear models. Technical Report, Department of Mathematical Sciences, Aalborg University (2005)
[16] Friedman, N., Goldszmidt, M.: Learning Bayesian networks with local structure. In: Proceedings of the 12th International Conference on Uncertainty in Artificial Intelligence, pp. 252-262. Morgan Kaufmann (1996) · Zbl 0910.68176
[17] Galinier, P., Habib, M., Paul, C.: Chordal graphs and their clique graphs. In: Proceedings of the 21st International Workshop on Graph-Theoretic Concepts in Computer Science, pp. 358-371. Springer (1995)
[18] Gebser, M., Janhunen, T., Rintanen, J.: Answer set programming as SAT modulo acyclicity. In: Proceedings of the 21st European Conference on Artificial Intelligence, pp. 351-356. IOS Press (2014) · Zbl 1366.68017
[19] Gebser, M., Kaufmann, B., Schaub, T.: Conflict-driven answer set solving: From theory to practice. Artificial Intelligence 187-188, 52-89 (2012) · Zbl 1251.68060 · doi:10.1016/j.artint.2012.04.001
[20] Giudici, P., Castello, R.: Improving Markov chain Monte Carlo model search for data mining. Mach. Learn. 50(1-2), 127-158 (2003) · Zbl 1050.68120 · doi:10.1023/A:1020202028934
[21] Giudici, P., Green, P.J.: Decomposable graphical Gaussian model determination. Biometrika 86(4), 785-801 (1999) · Zbl 0940.62019 · doi:10.1093/biomet/86.4.785
[22] Golumbic, M.C.: Algorithmic Graph Theory and Perfect Graphs. Academic Press, New York (1980) · Zbl 0541.05054
[23] Gomory, R.E.: Outline of an algorithm for integer solutions to linear programs. Bull. Am. Math. Soc. 64(5), 275-278 (1958) · Zbl 0085.35807 · doi:10.1090/S0002-9904-1958-10224-4
[24] Højsgaard, S.: Split models for contingency tables. Comput. Stat. Data Anal. 42(4), 621-645 (2003) · Zbl 1429.62202 · doi:10.1016/S0167-9473(02)00119-6
[25] Højsgaard, S.: Statistical inference in context specific interaction models for contingency tables. Scand. J. Stat. 31(1), 143-158 (2004) · Zbl 1051.62053 · doi:10.1111/j.1467-9469.2004.00378.x
[26] IBM Corporation: IBM ILOG CPLEX Optimization Studio CP Optimizer User’s Manual, version 12 release 6.0 edn. (2013)
[27] Johnson, D.S.: Approximation algorithms for combinatorial problems. J. Comput. Syst. Sci. 9(3), 256-278 (1974) · Zbl 0296.65036 · doi:10.1016/S0022-0000(74)80044-9
[28] Kangas, K., Koivisto, M., Niinimäki, T.: Learning chordal Markov networks by dynamic programming. In: Proceedings of the 28th Annual Conference on Neural Information Processing Systems, pp. 2357-2365. NIPS Foundation (2014)
[29] Kass, R., Raftery, A.: Bayes factors. J. Am. Stat. Assoc. 90(430), 773-795 (1995) · Zbl 0846.62028 · doi:10.1080/01621459.1995.10476572
[30] Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. The MIT Press, Cambridge (2009) · Zbl 1183.68483
[31] Lauritzen, S.L.: Graphical Models. Oxford University Press, Oxford (1996) · Zbl 0907.62001
[32] Le Berre, D., Parrain, A.: The Sat4j library, release 2.2 system description. J. Satisf. Boolean Model. Comput. 7, 59-64 (2010)
[33] Lifschitz, V.: Answer set programming and plan generation. Artif. Intell. 138(1-2), 39-54 (2002) · Zbl 0995.68020 · doi:10.1016/S0004-3702(02)00186-8
[34] Madigan, D., Raftery, A.E.: Model selection and accounting for model uncertainty in graphical models using Occam’s window. J. Am. Stat. Assoc. 89(428), 1535-1546 (1994) · Zbl 0814.62030 · doi:10.1080/01621459.1994.10476894
[35] Manquinho, V., Marques-Silva, J., Planes, J.: Algorithms for weighted Boolean optimization. In: Proceedings of the 12th International Conference on Theory and Applications of Satisfiability Testing, pp. 495-508. Springer (2009) · Zbl 1247.68258
[36] Marek, V., Truszczyński, M.: Stable models and an alternative logic programming paradigm. In: The Logic Programming Paradigm: A 25-Year Perspective, pp. 375-398. Springer (1999) · Zbl 0979.68524
[37] Martins, R., Manquinho, V., Lynce, I.: Parallel search for maximum satisfiability. AI Commun. 25(2), 75-95 (2012) · Zbl 1309.90057
[38] Niemelä, I.: Logic programming with stable model semantics as a constraint programming paradigm. Ann. Math. Artif. Intell. 25(3-4), 241-273 (1999) · Zbl 0940.68018
[39] Nyman, H., Pensar, J., Koski, T., Corander, J.: Stratified graphical models-context-specific independence in graphical models. Bayesian Anal. 9(4), 883-908 (2014) · Zbl 1327.62030
[40] Parviainen, P., Farahani, H.S., Lagergren, J.: Learning bounded tree-width Bayesian networks using integer linear programming. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp. 751-759. JMLR.org (2014)
[41] Pensar, J., Nyman, H., Koski, T., Corander, J.: Labeled directed acyclic graphs: a generalization of context-specific independence in directed graphical models. Data Min. Knowl. Discov. 29(2), 503-533 (2015) · Zbl 1403.68206 · doi:10.1007/s10618-014-0355-0
[42] Rose, D.J.: Triangulated graphs and the elimination process. J. Math. Anal. Appl. 32(3), 597-609 (1970) · Zbl 0216.02602 · doi:10.1016/0022-247X(70)90282-9
[43] Sebastiani, R.; Tomasi, S.; Gramlich, B. (ed.); Miller, D. (ed.); Sattler, U. (ed.), Optimization in SMT with LA(Q) cost functions, 484-498 (2012), Heidelberg · Zbl 1358.68264 · doi:10.1007/978-3-642-31365-3_38
[44] Silander, T., Kontkanen, P., Myllymäki, P.: On sensitivity of the MAP Bayesian network structure to the equivalent sample size parameter. In: Proceedings of the The 23rd Conference on Uncertainty in Artificial Intelligence (UAI-2007), pp. 360-367. AUAI Press (2007)
[45] Silander, T., Roos, T., Kontkanen, P., Myllymäki, P.: Factorized NML criterion for learning Bayesian network structures. In: Proceedings 4th European Workshop on Probabilistic Graphical Models (PGM-2008) (2008)
[46] Silander, T., Roos, T., Myllymäki, P.: Learning locally minimax optimal Bayesian networks. Int. J. Approx. Reason. 51(5), 544-557 (2010) · doi:10.1016/j.ijar.2010.01.012
[47] Sinz, C.: Towards an optimal CNF encoding of Boolean cardinality constraints. In: Proceedings of the 11th International Conference on Principles and Practice of Constraint Programming, pp. 827-831. Springer (2005) · Zbl 1153.68488
[48] Whittaker, J.: Graphical Models in Applied Multivariate Statistics. Wiley, New York (1990) · Zbl 0732.62056
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.