Abstract
We present a new sound and complete axiomatization of regular expression containment. It consists of the conventional axiomatization of concatenation, alternation, empty set and (the singleton set containing) the empty string as an idempotent semiring, the fixed- point rule E* = 1 + E × E* for Kleene-star, and a general coinduction rule as the only additional rule.
Our axiomatization gives rise to a natural computational interpretation of regular expressions as simple types that represent parse trees, and of containment proofs as coercions. This gives the axiom- atization a Curry-Howard-style constructive interpretation: Containment proofs do not only certify a language-theoretic contain- ment, but, under our computational interpretation, constructively transform a membership proof of a string in one regular expression into a membership proof of the same string in another regular expression.
We show how to encode regular expression equivalence proofs in Salomaa's, Kozen's and Grabmayer's axiomatizations into our containment system, which equips their axiomatizations with a computational interpretation and implies completeness of our axiomatization. To ensure its soundness, we require that the computational interpretation of the coinduction rule be a hereditarily total function. Hereditary totality can be considered the mother of syn- tactic side conditions: it "explains" their soundness, yet cannot be used as a conventional side condition in its own right since it turns out to be undecidable.
We discuss application of regular expressions as types to bit coding of strings and hint at other applications to the wide-spread use of regular expressions for substring matching, where classical automata-theoretic techniques are a priori inapplicable.
Neither regular expressions as types nor subtyping interpreted coercively are novel per se. Somewhat surprisingly, this seems to be the first investigation of a general proof-theoretic framework for the latter in the context of the former, however.
Supplemental Material
- M. Abadi and M. P. Fiore. Syntactic considerations on recursive types. In Proc. 1996 IEEE 11th Annual Symp. on Logic in Computer Science (LICS), New Brunswick, New Jersey. IEEE Computer Society Press, June 1996. Google Scholar
Digital Library
- R. M. Amadio and L. Cardelli. Subtyping recursive types. AC. Transactions on Programming Languages and Systems (TOPLAS), 15(4):575--631, September 1993 Google Scholar
Digital Library
- V. Antimirov. Rewriting regular inequalities. In Proc. 10th International Conference, FCT '95 Dresden, Germany, volume 965 of Lecture Notes in Computer Science (LNCS), pages 116--125. Springer-Verlag, August 1995. Google Scholar
Digital Library
- V. Antimirov. Partial derivatives of regular expressions and finite automaton constructions. Theor. Comput. Sci., 155(2):291--319, 1996. ISSN 0304-3975. doi: http://dx.doi.org/10.1016/0304--3975(95)00182-4. Google Scholar
Digital Library
- V. M. Antimirov and P. D. Mosses. Rewriting extended regular expressions. Theor. Comput. Sci., 143(1):51--72, 1995. doi:http://dx.doi.org/10.1016/0304-3975(95)80024-4 Google Scholar
Digital Library
- J. S. Auerbach, C. Barton, M. Chu-Carroll, and M. Raghavachari. Mockingbird: Flexible stub compilation from pairs of declarations. In ICDCS, pages 393--402, 1999. Google Scholar
Digital Library
- C. Brabrand and J. Thomsen. Typed and unambiguous pattern matching on strings using regular expressions. In Proc. 12th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP), 2010. Google Scholar
Digital Library
- M. Brandt and F. Henglein. Coinductive axiomatization of recursive type equality and subtyping. Fundamenta Informaticae, 33:309--338, 1998. Google Scholar
Digital Library
- J. A. Brzozowski. Derivatives of regular expressions. J. ACM, 11(4):481--494, 1964. ISSN 0004-5411. doi: http://doi.acm.org/10.1145/321239.321249. Google Scholar
Digital Library
- H. Chen and R. Pucella. A coalgebraic approach to Kleene algebra with tests. Theor. Comput. Sci., 327(1--2):23--44, 2004. Google Scholar
Digital Library
- N. Chomsky. On certain formal properties of grammars*. Information and control, 2(2):137--167, 1959.Google Scholar
Cross Ref
- J. H. Conway. Regular Algebra and Finite Machines. Printed in GB by William Clowes & Sons Ltd, 1971. ISBN 0-412-10620-5.Google Scholar
- R. Di Cosmo, F. Pottier, and D. Remy. Subtyping recursive types modulo associative commutative products. In Proc. Seventh International Conference on Typed Lambda Calculi and Applications (TLCA 2005), 2005. Google Scholar
Digital Library
- M. Fiore. Isomorphisms of generic recursive polynomial types. SIGPLAN Not., 39(1):77--88, 2004. ISSN 0362-1340. doi.http://doi.acm.org/10.1145/982962.964008 Google Scholar
Digital Library
- M. P. Fiore. A coinduction principle for recursive data types based on bisimulation. Information and Computation, 127:186--198, 1996. Conference version: Proc. 8th Annual IEEE Symp. on Logic in Computer Science (LICS), 1993, pp. 110--119.Google Scholar
- A. Frisch and L. Cardelli. Greedy regular expression matching. In Proc. 31st International Colloquium on Automata, Languages and Programming (ICALP), volume 3142 of Lecture notes in computer science, pages 618--629, Turku, Finland, July 2004. Springer.Google Scholar
- V. Gapeyev, M. Y. Levin, and B. C. Pierce. Recursive subtyping revealed. J. Funct. Program., 12(6):511--548, 2002. Google Scholar
Digital Library
- A. Ginzburg. A procedure for checking equality of regular expressions. J. ACM, 14(2):355--362, 1967. ISSN 0004-5411. doi:http://doi.acm.org/10.1145/321386.321399 Google Scholar
Digital Library
- C. Grabmayer. Using proofs by coinduction to find "traditional" proofs. In Proc. 1st Conference on Algebra and Coalgebra in Computer Science (CALCO), number 3629 in Lecture Notes in Computer Science (LNCS). Springer, September 2005. Google Scholar
Digital Library
- J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979. Google Scholar
Digital Library
- H. Hosoya, A. Frisch, and G. Castagna. Parametric polymorphism for XML. In J. Palsberg and M. Abadi, editors, POPL, pages 50--62. ACM, 2005a. ISBN 1-58113-830-X. Google Scholar
Digital Library
- H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. ACM Trans. Program. Lang. Syst., 27(1):46--90, 2005b. Google Scholar
Digital Library
- Institute of Electrical and Electronics Engineers (IEEE). Standard for information technology---Portable Operating System Interface (POSIX) --- Part 2 (Shell and utilities), Section 2.8 (Regular expression notation). New York, 1992. IEEE Standard 1003.2.Google Scholar
- P. Jansson and J. Jeuring. Polyp---a polytypic programming language extension. In Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), page 482. ACM, 1997. Google Scholar
Digital Library
- S. C. Kleene. Representation of events in nerve nets and finite automata. Automata Studies, 1956.Google Scholar
Cross Ref
- D. Kozen. A completeness theorem for Kleene algebras and the algebra of regular events. Information and Computation, 110(2):366--390, May 1994. Google Scholar
Digital Library
- D. Kozen. Kleene algebra with tests. Transactions on Programming Languages and Systems, 19(3):427--443, May 1997. Google Scholar
Digital Library
- D. Kozen. On the coalgebraic theory of Kleene algebra with tests. Technical report, Computing and Information Science, Cornell University, March 2008. URL http://hdl.handle.net/1813/10173.Google Scholar
- D. Kozen, J. Palsberg, and M. Schwartzbach. Efficient recursive subtyping. Mathematical Structures in Computer Science (MSCS), 5(1), 1995. Conference version presented at the 20th Annual ACM SIGPLAN SIGACT Symp. on Principles of Programming Languages (POPL), 1993. Google Scholar
Digital Library
- D. Krob. A complete system of b-rational identities. In M. Paterson, editor, ICALP, volume 443 of Lecture Notes in Computer Science, pages 60--73. Springer, 1990. ISBN 3-540-52826-1. Google Scholar
Digital Library
- K. Z. M. Lu and M. Sulzmann. Rewriting regular inequalities. In Proc. Second Asian Symposium, APLAS 2004, Taipei, Taiwan, November 4-6, 2004, volume 3302 of Lecture Notes in Computer Science (LNCS), pages 57--73. Springer, November 2004.Google Scholar
- R. Milner. A complete inference system for a class of regular behaviours. J. Comput. Syst. Sci., 28(3):439--466, 1984.Google Scholar
Cross Ref
- G. C. Necula and S. P. Rahul. Oracle-based checking of untrusted software. In POPL, pages 142--154, 2001. Google Scholar
Digital Library
- L. Nielsen. A coinductive axiomatization of XML subtyping. Graduate term project report, DIKU, University of Copenhagen, 2008.Google Scholar
- V. Pratt. Action logic and pure induction. In Proc. Logics in AI: European Workshop JELIA, volume 478 of Lecture Notes in Computer Science (LNCS), pages 97--120. Springer, 1990. Google Scholar
Digital Library
- G. Rosu and J. Goguen. Circular coinduction. In Proc. International Joint Conference on Automated Reasoning, 2000.Google Scholar
- G. Rosu and D. Lucanu. Circular coinduction: A proof theoretical foundation. In Proc. 3rd Conference on Algebra and Coalgebra in Computer Science (CALCO), number 5728 in Lecture Notes in Computer Science (LNCS), pages 127--144. Springer, September 2009. ISBN 978-3-642-03740-5. Google Scholar
Digital Library
- J. J. M. M. Rutten. Automata and coinduction (an exercise in coalgebra). In D. Sangiorgi and R. de Simone, editors, CONCUR, volume 1466 of Lecture Notes in Computer Science, pages 194--218. Springer, 1998. ISBN 3-540-64896-8. Google Scholar
Digital Library
- A. Salomaa. Two complete axiom systems for the algebra of regular events. J. ACM, 13(1):158--169, 1966. ISSN 0004-5411. doi:http://doi.acm.org/10.1145/321312.321326. Google Scholar
Digital Library
- J. Seward. Bzip. URL http://www.bzip.org/.Google Scholar
- A. Silva, M. M. Bonsangue, and J. J. M. M. Rutten. Non-deterministic Kleene coalgebras. Logical Methods in Computer Science, 6(3), 2010. URL http://arxiv.org/abs/1007.3769.Google Scholar
- M. Sulzmann and K. Z. M. Lu. XHaskell - adding regular expression types to Haskell. In O. Chitil, Z. Horv´ath, and V. Zsók, editors, IFL, volume 5083 of Lecture Notes in Computer Science, pages 75--92. Springer, 2007. ISBN 978-3-540-85372-5. Google Scholar
Digital Library
- G. Winskel. The Formal Semantics of Programming Languages: An Introduction. Foundations of Computing series. MIT Press, Feb. 1993. ISBN 0-262-23169-7. Google Scholar
Digital Library
Index Terms
Regular expression containment: coinductive axiomatization and computational interpretation
Recommendations
Regular expression containment: coinductive axiomatization and computational interpretation
POPL '11: Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languagesWe present a new sound and complete axiomatization of regular expression containment. It consists of the conventional axiomatization of concatenation, alternation, empty set and (the singleton set containing) the empty string as an idempotent semiring, ...
Coinductive Axiomatization of Recursive Type Equality and Subtyping
We present new sound and complete axiomatizations of type equality and subtype inequality for a first-order type language with regular recursive types. The rules are motivated by coinductive characterizations of type containment and type equality via ...
A regular expression matching circuit: Decomposed non-deterministic realization with prefix sharing and multi-character transition
This paper shows a compact realization of regular expression matching circuits on FPGAs. First, the given regular expression is converted into a non-deterministic finite automaton (NFA) by the modified McNaughton-Yamada method. Second, to reduce the ...







Comments