zbMATH — the first resource for mathematics

Simple, functional, sound and complete parsing for all context-free grammars. (English) Zbl 1350.68164
Jouannaud, Jean-Pierre (ed.) et al., Certified programs and proofs. First international conference, CPP 2011, Kenting, Taiwan, December 7–9, 2011. Proceedings. Berlin: Springer (ISBN 978-3-642-25378-2/pbk). Lecture Notes in Computer Science 7086, 103-118 (2011).
Summary: Parsers for context-free grammars can be implemented directly and naturally in a functional style known as “combinator parsing”, using recursion following the structure of the grammar rules. However, naive implementations fail to terminate on left-recursive grammars, and despite extensive research the only complete parsers for general context-free grammars are constructed using other techniques such as Earley parsing. Our main contribution is to show how to construct simple, sound and complete parser implementations directly from grammar specifications, for all context-free grammars, based on combinator parsing. We then construct a generic parser generator and show that generated parsers are sound and complete. The formal proofs are mechanized using the HOL4 theorem prover. Memoized parsers based on our approach are polynomial-time in the size of the input. Preliminary real-world performance testing on highly ambiguous grammars indicates our parsers are faster than those generated by the popular Happy parser generator.
For the entire collection see [Zbl 1226.68005].

68Q42 Grammars and rewriting systems
68T15 Theorem proving (deduction, resolution, etc.) (MSC2010)
CompCert; HOL; TRX
Full Text: DOI
[1] Happy, a parser generator for Haskell, http://www.haskell.org/happy/
[2] Barthwal, A., Norrish, M.: Verified, Executable Parsing. In: Castagna, G. (ed.) ESOP 2009. LNCS, vol. 5502, pp. 160–174. Springer, Heidelberg (2009) · Zbl 1234.68359 · doi:10.1007/978-3-642-00590-9_12
[3] Danielsson, N.A.: Total parser combinators. In: Hudak, P., Weirich, S. (eds.) ICFP, pp. 285–296. ACM (2010) · Zbl 1323.68330 · doi:10.1145/1863543.1863585
[4] Earley, J.: An efficient context-free parsing algorithm. Commun. ACM 13(2), 94–102 (1970) · Zbl 0185.43401 · doi:10.1145/362007.362035
[5] Ford, B.: Packrat parsing: simple, powerful, lazy, linear time, functional pearl. In: ICFP 2002: Proceedings of the Seventh ACM SIGPLAN International Conference on Functional Programming, New York, NY, USA, vol. 37/9, pp. 36–47. ACM (2002) · Zbl 1322.68040 · doi:10.1145/581478.581483
[6] Frost, R.A., Hafiz, R., Callaghan, P.: Parser Combinators for Ambiguous Left-Recursive Grammars. In: Hudak, P., Warren, D.S. (eds.) PADL 2008. LNCS, vol. 4902, pp. 167–181. Springer, Heidelberg (2008) · Zbl 05243243 · doi:10.1007/978-3-540-77442-6_12
[7] Frost, R.A., Hafiz, R., Callaghan, P.C.: Modular and efficient top-down parsing for ambiguous left-recursive grammars. In: IWPT 2007: Proceedings of the 10th International Conference on Parsing Technologies, Morristown, NJ, USA, pp. 109–120. Association for Computational Linguistics (2007) · doi:10.3115/1621410.1621425
[8] Hafiz, R., Frost, R.A.: Lazy Combinators for Executable Specifications of General Attribute Grammars. In: Carro, M., Peña, R. (eds.) PADL 2010. LNCS, vol. 5937, pp. 167–182. Springer, Heidelberg (2010) · Zbl 05656277 · doi:10.1007/978-3-642-11503-5_15
[9] Hutton, G.: Higher-order functions for parsing. J. Funct. Program. 2(3), 323–343 (1992) · Zbl 0817.68097 · doi:10.1017/S0956796800000411
[10] Kasami, T.: An efficient recognition and syntax analysis algorithm for context-free languages. Technical Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, Massachusetts (1965)
[11] Koprowski, A., Binsztok, H.: TRX: A Formally Verified Parser Interpreter. In: Gordon, A.D. (ed.) ESOP 2010. LNCS, vol. 6012, pp. 345–365. Springer, Heidelberg (2010) · Zbl 1260.68194 · doi:10.1007/978-3-642-11957-6_19
[12] Kuno, S.: The predictive analyzer and a path elimination technique. Commun. ACM 8(7), 453–462 (1965) · Zbl 0133.39609 · doi:10.1145/364995.365689
[13] Leroy, X.: Formal verification of a realistic compiler. Communications of the ACM (April 2009) · Zbl 05747873 · doi:10.1145/1538788.1538814
[14] Pratt, V.R.: Top down operator precedence. In: Proceedings ACM Symposium on Principles Prog. Languages (1973) · Zbl 0329.68028 · doi:10.1145/512927.512931
[15] Ridge, T.: Simple, functional, sound and complete parsing for all context-free grammars (2010) unpublished draft, http://www.cs.le.ac.uk/ tr61 · Zbl 1350.68164
[16] Tomita, M.: Efficient Parsing for Natural Language: A Fast Algorithm for Practical Systems. Kluwer, Boston (1986) · doi:10.1007/978-1-4757-1885-0
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.