
Shapley homology: topological analysis of sample influence for neural networks. (English) Zbl 1473.62410

Summary: Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (i.i.d.). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. Our proposed framework consists of two main parts: homology analysis, where we compute the Betti number of the target topological space, and Shapley value calculation, where we distribute the topological features of a complex built from the data points among the individual points. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Furthermore, we provide a preliminary discussion of the connection between Shapley homology and the Vapnik-Chervonenkis dimension. Empirical studies show that, when zero-dimensional Shapley homology is used on neighboring graphs, samples with higher influence scores have a greater impact on the accuracy of neural networks that determine graph connectivity, and that regular grammars with higher entropy values are more difficult to learn.
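To make the two stages concrete, the following Python sketch (ours, not the authors' code) instantiates the zero-dimensional case: the characteristic function of a coalition of samples is the 0th Betti number (number of connected components) of its epsilon-neighborhood graph, Shapley values are computed by brute-force enumeration over coalitions, and the entropy is taken after normalizing the scores into a probability measure. The function names, the choice of a Euclidean epsilon-graph, and the clipping of negative marginals are illustrative assumptions.

import itertools
import math

import numpy as np


def betti_0(points: np.ndarray, eps: float) -> int:
    """0th Betti number of the eps-neighborhood graph on `points`,
    i.e., its number of connected components (union-find)."""
    n = len(points)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in itertools.combinations(range(n), 2):
        if np.linalg.norm(points[i] - points[j]) <= eps:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
    return len({find(i) for i in range(n)})


def shapley_influence(points: np.ndarray, eps: float) -> np.ndarray:
    """Exact Shapley value of every sample for the set function
    v(S) = betti_0(S); brute force over all coalitions, so O(2^n)."""
    n = len(points)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # coalition sizes 0 .. n-1 excluding sample i
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for S in itertools.combinations(others, k):
                v_s = betti_0(points[list(S)], eps) if S else 0
                v_si = betti_0(points[list(S) + [i]], eps)
                phi[i] += weight * (v_si - v_s)
    return phi  # by efficiency, phi.sum() equals betti_0 of the full sample


def influence_entropy(phi: np.ndarray) -> float:
    """Shannon entropy of the influence scores normalized to a probability
    measure. Negative marginals are clipped here for illustration; the
    paper's construction yields a proper probability measure directly."""
    p = np.clip(phi, 0.0, None)
    p = p / p.sum()
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(8, 2))  # 8 points keep 2^7 coalitions per point tractable
    scores = shapley_influence(pts, eps=0.8)
    print(scores.round(3), influence_entropy(scores))

Exact enumeration is exponential in the sample size, so anything beyond toy data would require sampling-based Shapley approximations or structure-exploiting shortcuts.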

MSC:

62R40 Topological data analysis
62R30 Statistics on manifolds
68T05 Learning and adaptive systems in artificial intelligence

Software:

shap

References:

[1] Anirudh, R., Thiagarajan, J. J., Sridhar, R., & Bremer, T. (2017). Influential sample selection: A graph signal processing approach. arXiv:1711.05407.
[2] Bubenik, P. (2015). Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16(1), 77-102. · Zbl 1337.68221
[3] Carlsson, G., & Gabrielsson, R. B. (2018). Topological approaches to deep learning. CoRR, abs/1811.01122. · Zbl 1441.68220
[4] Carlsson, G., Ishkhanov, T., De Silva, V., & Zomorodian, A. (2008). On the local behavior of spaces of natural images. International Journal of Computer Vision, 76(1), 1-12. · Zbl 1477.68463
[5] Chazal, F., & Michel, B. (2017). An introduction to topological data analysis: Fundamental and practical aspects for data scientists. CoRR, abs/1710.04019.
[6] Chen, J., Song, L., Wainwright, M. J., & Jordan, M. I. (2018). L-Shapley and c-Shapley: Efficient model interpretation for structured data. CoRR, abs/1808.02610.
[7] Chen, X., Liu, C., Li, B., Lu, K., & Song, D. (2017). Targeted backdoor attacks on deep learning systems using data poisoning. CoRR, abs/1712.05526.
[8] Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. In Proceedings of the Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (pp. 103-111). Stroudsburg, PA: Association for Computational Linguistics.
[9] Cohen-Steiner, D., Kong, W., Sohler, C., & Valiant, G. (2018). Approximating the spectrum of a graph. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1263-1271). New York: ACM.
[10] Conrad, K. (2008). Group actions. https://www.math.uconn.edu/~kconrad/blurbs/grouptheory/gpaction.pdf
[11] Dai, H., Dai, B., & Song, L. (2016). Discriminative embeddings of latent variable models for structured data. arXiv:1603.05629.
[12] Dai, H., Li, H., Tian, T., Huang, X., Wang, L., Zhu, J., & Song, L. (2018). Adversarial attack on graph structured data. arXiv:1806.02371.
[13] Datta, A., Sen, S., & Zick, Y. (2016). Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In Proceedings of the IEEE Symposium on Security and Privacy (pp. 598-617). Piscataway, NJ: IEEE.
[14] De la Higuera, C. (2010). Grammatical inference: Learning automata and grammars. Cambridge: Cambridge University Press. · Zbl 1227.68112
[15] Edelsbrunner, H., & Harer, J. (2008). Persistent homology: A survey. Contemporary Mathematics, 453, 257-282. · Zbl 1145.55007
[16] Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.
[17] Gunning, D. (2017). Explainable artificial intelligence (XAI). Arlington, VA: Defense Advanced Research Projects Agency.
[18] Hanneke, S. (2016). The optimal sample complexity of PAC learning. Journal of Machine Learning Research, 17(1), 1319-1333. · Zbl 1360.68679
[19] Hatcher, A. (2005). Algebraic topology. Cambridge: Cambridge University Press. · Zbl 1044.55001
[20] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
[21] Hofer, C., Kwitt, R., Niethammer, M., & Uhl, A. (2017). Deep learning with topological signatures. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems, 30 (pp. 1633-1643). Red Hook, NY: Curran.
[22] Ishigami, Y., & Tani, S. (1997). VC-dimensions of finite automata and commutative finite automata with k letters and n states. Discrete Applied Mathematics, 74(2), 123-134. · Zbl 0870.68101
[23] Khoury, M., & Hadfield-Menell, D. (2018). On the geometry of adversarial examples. CoRR, abs/1811.00525.
[24] Koh, P. W., & Liang, P. (2017). Understanding black-box predictions via influence functions. In Proceedings of the 34th International Conference on Machine Learning (pp. 1885-1894).
[25] Lee, D., & Yoo, C. D. (2019). Learning to augment influential data. https://openreview.net/forum?id=BygIV2CcKm
[26] Li, C., Ovsjanikov, M., & Chazal, F. (2014). Persistence-based structural recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1995-2002). Piscataway, NJ: IEEE.
[27] Lundberg, S. M., & Lee, S. (2017). A unified approach to interpreting model predictions. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems, 30 (pp. 4768-4777). Red Hook, NY: Curran.
[28] Marsden, A. (2013). Eigenvalues of the Laplacian and their relationship to the connectedness of a graph. University of Chicago lecture notes.
[29] Narahari, Y. (2012). The Shapley value. https://scholar.google.com/scholar?oi=bibs&hl=zh-CN&cluster=12828497160209961416 · Zbl 1293.76128
[30] Newman, M. E. (2005). A measure of betweenness centrality based on random walks. Social Networks, 27(1), 39-54.
[31] Ren, M., Zeng, W., Yang, B., & Urtasun, R. (2018). Learning to reweight examples for robust deep learning. In Proceedings of the 35th International Conference on Machine Learning (pp. 4331-4340).
[32] Rezaei, S. S. C. (2013). Entropy and graphs. arXiv:1311.5632.
[33] Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135-1144). New York: ACM.
[34] Rote, G., & Vegter, G. (2006). Computational topology: An introduction. In J.-D. Boissonnat & M. Teillaud (Eds.), Effective computational geometry for curves and surfaces (pp. 277-312). Berlin: Springer. · Zbl 1116.65030
[35] Schapira, P. (2001). Categories and homological algebra. Paris: Société mathématique de France.
[36] Tomita, M. (1982). Dynamic construction of finite-state automata from examples using hill-climbing. In Proceedings of the Fourth Annual Conference of the Cognitive Science Society (pp. 105-108). Austin, TX: Cognitive Science Society.
[37] Turner, K., Mukherjee, S., & Boyer, D. M. (2014). Persistent homology transform for modeling shapes and surfaces. Information and Inference: A Journal of the IMA, 3(4), 310-344. · Zbl 06840289
[38] Vapnik, V. N. (2000). The nature of statistical learning theory (2nd ed.). Berlin: Springer. · Zbl 0934.62009
[39] Wang, Q., Zhang, K., Liu, X., & Giles, C. L. (2018). Verification of recurrent neural networks through rule extraction. arXiv:1811.06029.
[40] Wang, Q., Zhang, K., Ororbia II, A. G., Xing, X., Liu, X., & Giles, C. L. (2018). A comparative study of rule extraction for recurrent neural networks. arXiv:1801.05420.
[41] Wang, Q., Zhang, K., Ororbia II, A. G., Xing, X., Liu, X., & Giles, C. L. (2018). An empirical evaluation of rule extraction from recurrent neural networks. Neural Computation, 30(9), 2568-2591.
[42] Wang, T., Zhu, J., Torralba, A., & Efros, A. A. (2018). Dataset distillation. CoRR, abs/1811.10959.
[43] Wang, Y., & Chaudhuri, K. (2018). Data poisoning attacks against online learning. CoRR, abs/1808.08994.
[44] Weiss, G., Goldberg, Y., & Yahav, E. (2017). Extracting automata from recurrent neural networks using queries and counterexamples. arXiv:1711.09576.
[45] Williams, S. G. (2004). Introduction to symbolic dynamics. Proceedings of Symposia in Applied Mathematics, 60, 1-12. , · Zbl 1072.37015
[46] Yeh, C., Kim, J. S., Yen, I. E., & Ravikumar, P. (2018). Representer point selection for explaining deep neural networks. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, & R. Garnett (Eds.), Advances in neural information processing systems, 31 (pp. 9311-9321). Red Hook, NY: Curran.