
The role of local partial independence in learning of Bayesian networks. (English) Zbl 1344.68192

Summary: Bayesian networks are one of the most widely used tools for modeling multivariate systems. It has been demonstrated that more expressive models, which can capture additional structure in each conditional probability table (CPT), may enjoy improved predictive performance over traditional Bayesian networks despite having fewer parameters. Here we investigate this phenomenon for models of varying degrees of expressiveness on extensive synthetic and real data. To characterize the regularities within CPTs in terms of independence relations, we introduce the notion of partial conditional independence (PCI) as a generalization of the well-known concept of context-specific independence (CSI). To model the structure of the CPTs, we use different graph-based representations that are convenient from a learning perspective. In addition to the previously studied decision trees and graphs, we introduce the concept of PCI-trees as a natural extension of CSI-based trees. To identify plausible models, we use the Bayesian score in combination with a greedy search algorithm. A comparison against ordinary Bayesian networks shows that models with local structures generally enjoy parametric sparsity and improved out-of-sample predictive performance; however, it is often necessary to regulate the model fit with an appropriate model-structure prior to avoid overfitting during learning. The tree structures, in particular, lead to high-quality models and suggest considerable potential for further exploration.
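
The following is a minimal sketch (not the authors' implementation) of the idea behind scoring such local CPT structures: a CSI- or PCI-style regularity makes several parent configurations of a node share the same conditional distribution, so the local structure can be viewed as a partition of the parent configurations, and each part contributes a standard Dirichlet-multinomial (BDeu-like) term to the local marginal likelihood. The function names, the encoding of the partition, and the hyperparameter alpha are illustrative assumptions; the paper's exact score, priors, and tree/graph representations may differ.

```python
import numpy as np
from scipy.special import gammaln

def part_score(counts, alpha=1.0):
    """Log marginal likelihood of one part's counts under a Dirichlet(alpha/r) prior."""
    r = len(counts)                      # cardinality of the child variable X
    a = np.full(r, alpha / r)            # symmetric Dirichlet pseudo-counts
    n = np.asarray(counts, dtype=float)
    return (gammaln(a.sum()) - gammaln(a.sum() + n.sum())
            + np.sum(gammaln(a + n) - gammaln(a)))

def local_score(data_x, data_parents, partition, alpha=1.0):
    """Local Bayesian score of a node given a partition of its parent configurations.

    `partition` maps each parent configuration (a tuple of parent values) to a
    part id; configurations in the same part share one conditional distribution,
    which is how CSI/PCI-type regularities reduce the number of free parameters.
    """
    r = int(data_x.max()) + 1
    counts = {}                          # part id -> count vector over values of X
    for x, pa in zip(data_x, map(tuple, data_parents)):
        counts.setdefault(partition[pa], np.zeros(r))[x] += 1
    return sum(part_score(c, alpha) for c in counts.values())
```

A greedy learner in this spirit would start from the trivial partition in which every parent configuration is its own part (an ordinary Bayesian network CPT) and repeatedly merge parts, or grow/prune a tree over the parents, accepting a move whenever this score, combined with a structure prior that penalizes overly rich local structures, improves.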

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

UCI-ml; bnlearn
