skip to main content
research-article

Regular expression containment: coinductive axiomatization and computational interpretation

Published:26 January 2011Publication History
Skip Abstract Section

Abstract

We present a new sound and complete axiomatization of regular expression containment. It consists of the conventional axiomatization of concatenation, alternation, empty set and (the singleton set containing) the empty string as an idempotent semiring, the fixed- point rule E* = 1 + E × E* for Kleene-star, and a general coinduction rule as the only additional rule.

Our axiomatization gives rise to a natural computational interpretation of regular expressions as simple types that represent parse trees, and of containment proofs as coercions. This gives the axiom- atization a Curry-Howard-style constructive interpretation: Containment proofs do not only certify a language-theoretic contain- ment, but, under our computational interpretation, constructively transform a membership proof of a string in one regular expression into a membership proof of the same string in another regular expression.

We show how to encode regular expression equivalence proofs in Salomaa's, Kozen's and Grabmayer's axiomatizations into our containment system, which equips their axiomatizations with a computational interpretation and implies completeness of our axiomatization. To ensure its soundness, we require that the computational interpretation of the coinduction rule be a hereditarily total function. Hereditary totality can be considered the mother of syn- tactic side conditions: it "explains" their soundness, yet cannot be used as a conventional side condition in its own right since it turns out to be undecidable.

We discuss application of regular expressions as types to bit coding of strings and hint at other applications to the wide-spread use of regular expressions for substring matching, where classical automata-theoretic techniques are a priori inapplicable.

Neither regular expressions as types nor subtyping interpreted coercively are novel per se. Somewhat surprisingly, this seems to be the first investigation of a general proof-theoretic framework for the latter in the context of the former, however.

Skip Supplemental Material Section

Supplemental Material

35-mpeg-4.mp4

References

  1. M. Abadi and M. P. Fiore. Syntactic considerations on recursive types. In Proc. 1996 IEEE 11th Annual Symp. on Logic in Computer Science (LICS), New Brunswick, New Jersey. IEEE Computer Society Press, June 1996. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. M. Amadio and L. Cardelli. Subtyping recursive types. AC. Transactions on Programming Languages and Systems (TOPLAS), 15(4):575--631, September 1993 Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. V. Antimirov. Rewriting regular inequalities. In Proc. 10th International Conference, FCT '95 Dresden, Germany, volume 965 of Lecture Notes in Computer Science (LNCS), pages 116--125. Springer-Verlag, August 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. V. Antimirov. Partial derivatives of regular expressions and finite automaton constructions. Theor. Comput. Sci., 155(2):291--319, 1996. ISSN 0304-3975. doi: http://dx.doi.org/10.1016/0304--3975(95)00182-4. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. V. M. Antimirov and P. D. Mosses. Rewriting extended regular expressions. Theor. Comput. Sci., 143(1):51--72, 1995. doi:http://dx.doi.org/10.1016/0304-3975(95)80024-4 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. J. S. Auerbach, C. Barton, M. Chu-Carroll, and M. Raghavachari. Mockingbird: Flexible stub compilation from pairs of declarations. In ICDCS, pages 393--402, 1999. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. C. Brabrand and J. Thomsen. Typed and unambiguous pattern matching on strings using regular expressions. In Proc. 12th International ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP), 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. M. Brandt and F. Henglein. Coinductive axiomatization of recursive type equality and subtyping. Fundamenta Informaticae, 33:309--338, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. A. Brzozowski. Derivatives of regular expressions. J. ACM, 11(4):481--494, 1964. ISSN 0004-5411. doi: http://doi.acm.org/10.1145/321239.321249. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. H. Chen and R. Pucella. A coalgebraic approach to Kleene algebra with tests. Theor. Comput. Sci., 327(1--2):23--44, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. N. Chomsky. On certain formal properties of grammars*. Information and control, 2(2):137--167, 1959.Google ScholarGoogle ScholarCross RefCross Ref
  12. J. H. Conway. Regular Algebra and Finite Machines. Printed in GB by William Clowes & Sons Ltd, 1971. ISBN 0-412-10620-5.Google ScholarGoogle Scholar
  13. R. Di Cosmo, F. Pottier, and D. Remy. Subtyping recursive types modulo associative commutative products. In Proc. Seventh International Conference on Typed Lambda Calculi and Applications (TLCA 2005), 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Fiore. Isomorphisms of generic recursive polynomial types. SIGPLAN Not., 39(1):77--88, 2004. ISSN 0362-1340. doi.http://doi.acm.org/10.1145/982962.964008 Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. M. P. Fiore. A coinduction principle for recursive data types based on bisimulation. Information and Computation, 127:186--198, 1996. Conference version: Proc. 8th Annual IEEE Symp. on Logic in Computer Science (LICS), 1993, pp. 110--119.Google ScholarGoogle Scholar
  16. A. Frisch and L. Cardelli. Greedy regular expression matching. In Proc. 31st International Colloquium on Automata, Languages and Programming (ICALP), volume 3142 of Lecture notes in computer science, pages 618--629, Turku, Finland, July 2004. Springer.Google ScholarGoogle Scholar
  17. V. Gapeyev, M. Y. Levin, and B. C. Pierce. Recursive subtyping revealed. J. Funct. Program., 12(6):511--548, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. A. Ginzburg. A procedure for checking equality of regular expressions. J. ACM, 14(2):355--362, 1967. ISSN 0004-5411. doi:http://doi.acm.org/10.1145/321386.321399 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. C. Grabmayer. Using proofs by coinduction to find "traditional" proofs. In Proc. 1st Conference on Algebra and Coalgebra in Computer Science (CALCO), number 3629 in Lecture Notes in Computer Science (LNCS). Springer, September 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. Hopcroft and J. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. H. Hosoya, A. Frisch, and G. Castagna. Parametric polymorphism for XML. In J. Palsberg and M. Abadi, editors, POPL, pages 50--62. ACM, 2005a. ISBN 1-58113-830-X. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. H. Hosoya, J. Vouillon, and B. C. Pierce. Regular expression types for XML. ACM Trans. Program. Lang. Syst., 27(1):46--90, 2005b. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Institute of Electrical and Electronics Engineers (IEEE). Standard for information technology---Portable Operating System Interface (POSIX) --- Part 2 (Shell and utilities), Section 2.8 (Regular expression notation). New York, 1992. IEEE Standard 1003.2.Google ScholarGoogle Scholar
  24. P. Jansson and J. Jeuring. Polyp---a polytypic programming language extension. In Proceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), page 482. ACM, 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. S. C. Kleene. Representation of events in nerve nets and finite automata. Automata Studies, 1956.Google ScholarGoogle ScholarCross RefCross Ref
  26. D. Kozen. A completeness theorem for Kleene algebras and the algebra of regular events. Information and Computation, 110(2):366--390, May 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. D. Kozen. Kleene algebra with tests. Transactions on Programming Languages and Systems, 19(3):427--443, May 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. D. Kozen. On the coalgebraic theory of Kleene algebra with tests. Technical report, Computing and Information Science, Cornell University, March 2008. URL http://hdl.handle.net/1813/10173.Google ScholarGoogle Scholar
  29. D. Kozen, J. Palsberg, and M. Schwartzbach. Efficient recursive subtyping. Mathematical Structures in Computer Science (MSCS), 5(1), 1995. Conference version presented at the 20th Annual ACM SIGPLAN SIGACT Symp. on Principles of Programming Languages (POPL), 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. D. Krob. A complete system of b-rational identities. In M. Paterson, editor, ICALP, volume 443 of Lecture Notes in Computer Science, pages 60--73. Springer, 1990. ISBN 3-540-52826-1. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. K. Z. M. Lu and M. Sulzmann. Rewriting regular inequalities. In Proc. Second Asian Symposium, APLAS 2004, Taipei, Taiwan, November 4-6, 2004, volume 3302 of Lecture Notes in Computer Science (LNCS), pages 57--73. Springer, November 2004.Google ScholarGoogle Scholar
  32. R. Milner. A complete inference system for a class of regular behaviours. J. Comput. Syst. Sci., 28(3):439--466, 1984.Google ScholarGoogle ScholarCross RefCross Ref
  33. G. C. Necula and S. P. Rahul. Oracle-based checking of untrusted software. In POPL, pages 142--154, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. L. Nielsen. A coinductive axiomatization of XML subtyping. Graduate term project report, DIKU, University of Copenhagen, 2008.Google ScholarGoogle Scholar
  35. V. Pratt. Action logic and pure induction. In Proc. Logics in AI: European Workshop JELIA, volume 478 of Lecture Notes in Computer Science (LNCS), pages 97--120. Springer, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. G. Rosu and J. Goguen. Circular coinduction. In Proc. International Joint Conference on Automated Reasoning, 2000.Google ScholarGoogle Scholar
  37. G. Rosu and D. Lucanu. Circular coinduction: A proof theoretical foundation. In Proc. 3rd Conference on Algebra and Coalgebra in Computer Science (CALCO), number 5728 in Lecture Notes in Computer Science (LNCS), pages 127--144. Springer, September 2009. ISBN 978-3-642-03740-5. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. J. J. M. M. Rutten. Automata and coinduction (an exercise in coalgebra). In D. Sangiorgi and R. de Simone, editors, CONCUR, volume 1466 of Lecture Notes in Computer Science, pages 194--218. Springer, 1998. ISBN 3-540-64896-8. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. A. Salomaa. Two complete axiom systems for the algebra of regular events. J. ACM, 13(1):158--169, 1966. ISSN 0004-5411. doi:http://doi.acm.org/10.1145/321312.321326. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. J. Seward. Bzip. URL http://www.bzip.org/.Google ScholarGoogle Scholar
  41. A. Silva, M. M. Bonsangue, and J. J. M. M. Rutten. Non-deterministic Kleene coalgebras. Logical Methods in Computer Science, 6(3), 2010. URL http://arxiv.org/abs/1007.3769.Google ScholarGoogle Scholar
  42. M. Sulzmann and K. Z. M. Lu. XHaskell - adding regular expression types to Haskell. In O. Chitil, Z. Horv´ath, and V. Zsók, editors, IFL, volume 5083 of Lecture Notes in Computer Science, pages 75--92. Springer, 2007. ISBN 978-3-540-85372-5. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. G. Winskel. The Formal Semantics of Programming Languages: An Introduction. Foundations of Computing series. MIT Press, Feb. 1993. ISBN 0-262-23169-7. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Regular expression containment: coinductive axiomatization and computational interpretation

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader
          About Cookies On This Site

          We use cookies to ensure that we give you the best experience on our website.

          Learn more

          Got it!