×

Faster shift-reduce constituent parsing with a non-binary, bottom-up strategy. (English) Zbl 1478.68415

Summary: An increasingly wide range of artificial intelligence applications rely on syntactic information to process and extract meaning from natural language text or speech, with constituent trees being one of the most widely used syntactic formalisms. To produce these phrase-structure representations from sentences in natural language, shift-reduce constituent parsers have become one of the most efficient approaches. Increasing their accuracy and speed is still one of the main objectives pursued by the research community so that artificial intelligence applications that make use of parsing outputs, such as machine translation or voice assistant services, can improve their performance. With this goal in mind, we propose in this article a novel non-binary shift-reduce algorithm for constituent parsing. Our parser follows a classical bottom-up strategy but, unlike others, it straightforwardly creates non-binary branchings with just one \(\mathsf{Reduce}\) transition, instead of requiring prior binarization or a sequence of binary transitions, allowing its direct application to any language without the need of further resources such as percolation tables. As a result, it uses fewer transitions per sentence than existing transition-based constituent parsers, becoming the fastest such system and, as a consequence, speeding up downstream applications. Using static oracle training and greedy search, the accuracy of this novel approach is on par with state-of-the-art transition-based constituent parsers and outperforms all top-down and bottom-up greedy shift-reduce systems on the Wall Street Journal section from the English Penn Treebank and the Penn Chinese Treebank. Additionally, we develop a dynamic oracle for training the proposed transition-based algorithm, achieving further improvements in both benchmarks and obtaining the best accuracy to date on the Penn Chinese Treebank among greedy shift-reduce parsers.

MSC:

68T50 Natural language processing
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abdi, A.; Idris, N.; Alguliev, R. M.; Aliguliyev, R. M., Automatic summarization assessment through a combination of semantic and syntactic information for intelligent educational systems, Inf. Process. Manag., 51, 4, 340-358 (2015)
[2] Aharoni, Goldberg, Towards string-to-tree neural machine translation, (Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (2017), Association for Computational Linguistics), 132-140
[3] Bloomfield, Language, 1030-1035 (1933), University of Chicago Press
[4] Branavan, S. R.K.; Silver, D.; Barzilay, R., Learning to win by reading manuals in a Monte-Carlo framework, (Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT ’11, Stroudsburg, PA, USA (2011), Association for Computational Linguistics), 268-277
[5] Chali, Y.; Hasan, S. A.; Joty, S. R., Improving graph-based random walks for complex question answering using syntactic, shallow semantic and extended string subsequence kernels, Inf. Process. Manag., 47, 6, 843-855 (2011)
[6] Charniak, E., Tree-bank grammars, (Proceedings of AAAI/IAAI (1996)), 1031-1036
[7] Chen, D.; Manning, C. D., A fast and accurate dependency parser using neural networks, (Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL (2014)), 740-750
[8] Chomsky, N., Three models for the description of language, IRE Trans. Inf. Theory, IT-2, 113-124 (1956) · Zbl 0156.25401
[9] Coavoux, M.; Crabbé, B., Neural greedy constituent parsing with dynamic oracles, (Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany (2016), Association for Computational Linguistics), 172-182
[10] Collins, M., Three generative, lexicalised models for statistical parsing, (Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL) and the 8th Conference of the European Chapter of the Association for Computational Linguistics. Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL) and the 8th Conference of the European Chapter of the Association for Computational Linguistics, EACL (1997)), 16-23
[11] Crabbé, B., Multilingual discriminative lexicalized phrase structure parsing, (Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal (2015), Association for Computational Linguistics), 1847-1856
[12] Cross, J.; Huang, L., Incremental parsing with minimal features using bi-directional lstm, (Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (2016), Association for Computational Linguistics), 32-37
[13] Cross, J.; Huang, L., Span-based constituency parsing with a structure-label system and provably optimal dynamic oracles, (Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (2016), Association for Computational Linguistics), 1-11
[14] Dyer, C.; Ballesteros, M.; Ling, W.; Matthews, A.; Smith, N. A., Transition-based dependency parsing with stack long short-term memory, (Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (2015), Association for Computational Linguistics), 334-343
[15] Dyer, C.; Kuncoro, A.; Ballesteros, M.; Smith, N. A., Recurrent neural network grammars, (Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2016), Association for Computational Linguistics), 199-209
[16] Eriguchi, A.; Hashimoto, K.; Tsuruoka, Y., Tree-to-sequence attentional neural machine translation, (Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2016), Association for Computational Linguistics), 823-833
[17] Fang, L.; Tuan, L. A.; Hui, S. C.; Wu, L., Syntactic based approach for grammar question retrieval, Inf. Process. Manag., 54, 2, 184-202 (2018)
[18] Fernández-González, D.; Gómez-Rodríguez, C., Non-projective dependency parsing with non-local transitions, (Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), NAACL-HLT, New Orleans, Louisiana, USA, June 1-6, 2018 (2018)), 693-700
[19] Finkel, J. R.; Manning, C. D., Joint parsing and named entity recognition, (Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL’ 09, Stroudsburg, PA, USA (2009), Association for Computational Linguistics), 326-334
[20] Gaddy, D.; Stern, M.; Klein, D., What’s going on in neural constituency parsers? An analysis, (Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018 (2018), Association for Computational Linguistics), 999-1010
[21] Gildea, D.; Jurafsky, D., Automatic labeling of semantic roles, Comput. Linguist., 28, 3, 245-288 (2002)
[22] Goldberg, Y.; Nivre, J., A dynamic oracle for arc-eager dependency parsing, (Proceedings of COLING 2012. Proceedings of COLING 2012, Mumbai, India (2012), Association for Computational Linguistics), 959-976
[23] Goldberg, Y.; Nivre, J., Training deterministic parsers with non-deterministic oracles, Trans. Assoc. Comput. Linguistics, 1, 403-414 (2013)
[24] Gómez-Rodríguez, C., Finding the smallest binarization of a CFG is NP-hard, J. Comput. Syst. Sci., 80, 4, 796-805 (2014) · Zbl 1285.68084
[25] Gómez-Rodríguez, C.; Alonso-Alonso, I.; Vilares, D., How important is syntactic parsing accuracy? An empirical evaluation on rule-based sentiment analysis, Artif. Intell. Rev. (2017)
[26] Graves, A.; Schmidhuber, J., Framewise phoneme classification with bidirectional lstm and other neural network architectures, Neural Netw., 5-6 (2005)
[27] R. Higashinaka, N. Kobayashi, T. Hirano, C. Miyazaki, T. Meguro, T. Makino, Y. Matsuo, Syntactic filtering and content-based retrieval of twitter sentences for the generation of system utterances in dialogue systems, 2013.; R. Higashinaka, N. Kobayashi, T. Hirano, C. Miyazaki, T. Meguro, T. Makino, Y. Matsuo, Syntactic filtering and content-based retrieval of twitter sentences for the generation of system utterances in dialogue systems, 2013.
[28] Hochreiter, S.; Schmidhuber, J., Long short-term memory, Neural Comput., 9, 8, 1735-1780 (1997)
[29] Jie, Z.; Muis, A.; Lu, W., Efficient dependency-guided named entity recognition (2017)
[30] K, V.; Gupta, D., Unmasking text plagiarism using syntactic-semantic based natural language processing techniques: comparisons, analysis and challenges, Inf. Process. Manag., 54, 3, 408-432 (2018)
[31] Kahane, S.; Mazziotta, N., Syntactic polygraphs. a formalism extending both constituency and dependency, (Proceedings of the 14th Meeting on the Mathematics of Language. Proceedings of the 14th Meeting on the Mathematics of Language, MoL 2015, Chicago, USA (2015), Association for Computational Linguistics), 152-164 · Zbl 1376.68142
[32] Kiperwasser, E.; Goldberg, Y., Simple and accurate dependency parsing using bidirectional LSTM feature representations, Trans. Assoc. Comput. Linguistics, 4, 313-327 (2016)
[33] Kitaev, N.; Klein, D., Constituency parsing with a self-attentive encoder, (Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, ACL 2018, Melbourne, Australia, July 15-20, 2018 (2018)), 2675-2685
[34] Klein, D.; Manning, C. D., Accurate unlexicalized parsing, (Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics. Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, ACL (2003)), 423-430
[35] Kuncoro, A.; Ballesteros, M.; Kong, L.; Dyer, C.; Neubig, G.; Smith, N. A., What do recurrent neural network grammars learn about syntax?, (Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers (2017), Association for Computational Linguistics), 1249-1258
[36] Li, J.; Xiong, D.; Tu, Z.; Zhu, M.; Zhang, M.; Zhou, G., Modeling source syntax for neural machine translation, (Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2017), Association for Computational Linguistics), 688-697
[37] Liu, J.; Zhang, Y., In-order transition-based constituent parsing, Trans. Assoc. Comput. Linguistics, 5, 413-424 (2017)
[38] Liu, J.; Zhang, Y., Shift-reduce constituent parsing with neural lookahead features, Trans. Assoc. Comput. Linguistics, 5, 45-58 (2017)
[39] Liu, L.; Zhu, M.; Shi, S., Improving sequence-to-sequence constituency parsing (2018)
[40] Ma, C.; Liu, L.; Tamura, A.; Zhao, T.; Sumita, E., Deterministic attention for sequence-to-sequence constituent parsing (2017)
[41] Marcus, M. P.; Santorini, B.; Marcinkiewicz, M. A., Building a large annotated corpus of English: the Penn Treebank, Comput. Linguist., 19, 313-330 (1993)
[42] Mi, H.; Huang, L., Shift-reduce constituency parsing with dynamic programming and pos tag lattice, (Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado (2015), Association for Computational Linguistics), 1030-1035
[43] Miceli Barone, A. V.; Attardi, G., Non-projective dependency-based pre-reordering with recurrent neural network for machine translation, (Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (2015), Association for Computational Linguistics), 846-856
[44] Mueller, T.; Schmid, H.; Schütze, H., Efficient higher-order crfs for morphological tagging, (Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (2013), Association for Computational Linguistics), 322-332
[45] Ng, V., Supervised noun phrase coreference research: the first fifteen years, (Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL ’10, Stroudsburg, PA, USA (2010), Association for Computational Linguistics), 1396-1411
[46] Nivre, J., An efficient algorithm for projective dependency parsing, (Proceedings of the 8th International Workshop on Parsing Technologies. Proceedings of the 8th International Workshop on Parsing Technologies, IWPT (2003)), 149-160
[47] Nivre, J.; Hall, J.; Nilsson, J., Memory-based dependency parsing, (Proceedings of the 8th Conference on Computational Natural Language Learning (2004)), 49-56
[48] Nivre, J.; Hall, J.; Nilsson, J., Memory-based dependency parsing, (Proceedings of the 8th Conference on Computational Natural Language Learning. Proceedings of the 8th Conference on Computational Natural Language Learning, CoNLL-2004 (2004)), 49-56
[49] Petrov, S.; Klein, D., Improved inference for unlexicalized parsing, (Proceedings of Human Language Technologies: the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Proceedings of Human Language Technologies: the Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT (2007)), 404-411
[50] Rajpurkar, P.; Zhang, J.; Lopyrev, K.; Liang, P., Squad: 100,000+ questions for machine comprehension of text, (Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (2016), Association for Computational Linguistics), 2383-2392
[51] Sagae, K.; Lavie, A., A classifier-based parser with linear run-time complexity, (Proceedings of the 9th International Workshop on Parsing Technologies. Proceedings of the 9th International Workshop on Parsing Technologies, IWPT (2005)), 125-132
[52] Saif, H.; He, Y.; Fernandez, M.; Alani, H., Contextual semantics for sentiment analysis of Twitter, Inf. Process. Manag., 52, 1, 5-19 (2016), emotion and Sentiment in Social and Expressive Media
[53] Stern, M.; Andreas, J.; Klein, D., A minimal span-based neural constituency parser, (Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, ACL 2017, Vancouver, Canada, July 30-August 4 (2018)), 3772-3782
[54] Swayamdipta, S.; Thomson, S.; Lee, K.; Zettlemoyer, L.; Dyer, C.; Smith, N. A., Syntactic scaffolds for semantic structures, (Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (2018), Association for Computational Linguistics), 3772-3782
[55] Toutanova, K.; Klein, D.; Manning, C. D.; Singer, Y., Feature-rich part-of-speech tagging with a cyclic dependency network, (Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1, NAACL ’03, Stroudsburg, PA, USA (2003), Association for Computational Linguistics), 173-180
[56] Vinyals, O.; Kaiser, L.; Koo, T.; Petrov, S.; Sutskever, I.; Hinton, G., Grammar as a foreign language, (Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2. Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, NIPS’15, Cambridge, MA, USA (2015), MIT Press), 2773-2781
[57] Wang, X.; Pham, H.; Yin, P.; Neubig, G., A tree-based decoder for neural machine translation, (Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31-November 4, 2018 (2018)), 4772-4777
[58] Wang, Z.; Mi, H.; Xue, N., Feature optimization for constituent parsing via neural networks, (Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Volume 1: Long Papers. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Volume 1: Long Papers, ACL 2015, July 26-31, 2015, Beijing, China (2015)), 1138-1147
[59] Watanabe, T.; Sumita, E., Transition-based neural constituent parsing, (Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, ACL 2015, 26-31 July 2015, Bejing, China (2015)), 1169-1179
[60] Wu, S.; Zhang, D.; Yang, N.; Li, M.; Zhou, M., Sequence-to-dependency neural machine translation, (Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (2017), Association for Computational Linguistics), 698-707
[61] Xiao, T.; Zhu, J.; Zhang, C.; Liu, T., Syntactic skeleton-based translation, (Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16 (2016), AAAI Press), 2856-2862
[62] Xue, N.; Xia, F.; Chiou, f. D.; Palmer, M., The Penn Chinese treebank: phrase structure annotation of a large corpus, Nat. Lang. Eng., 11, 2, 207-238 (2005)
[63] Yamada, H.; Matsumoto, Y., Statistical dependency analysis with support vector machines, (Proceedings of 8th International Workshop on Parsing Technologies. Proceedings of 8th International Workshop on Parsing Technologies, IWPT 2003 (2003), ACL/SIGPARSE), 195-206
[64] Yu, M.; Gormley, M. R.; Dredze, M., Combining word embeddings and feature embeddings for fine-grained relation extraction, (Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2015), Association for Computational Linguistics), 1374-1379
[65] Zhang, Y.; Clark, S., A tale of two parsers: investigating and combining graph-based and transition-based dependency parsing, (Proceedings of the Conference on Empirical Methods in Natural Language Processing. Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP (2008)), 562-571
[66] Zhang, Y.; Clark, S., Transition-based parsing of the Chinese treebank using a global discriminative model, (Proceedings of the 11th International Conference on Parsing Technologies. Proceedings of the 11th International Conference on Parsing Technologies, IWPT ’09, Stroudsburg, PA, USA (2009), Association for Computational Linguistics), 162-171
[67] Zhu, M.; Zhang, Y.; Chen, W.; Zhang, M.; Zhu, J., Fast and accurate shift-reduce constituent parsing, (Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Volume 1: Long Papers, ACL 2013, 4-9 August 2013, Sofia, Bulgaria (2013)), 434-443
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.