Regenerate: a language generator for extended regular expressions

Published: 07 April 2020

Abstract

Regular expressions are part of every programmer’s toolbox. They are used for a wide variety of language-related tasks and there are many algorithms for manipulating them. In particular, matching algorithms that detect whether a word belongs to the language described by a regular expression are well explored, yet new algorithms appear frequently. However, there is no satisfactory methodology for testing such matchers. We propose a testing methodology which is based on generating positive as well as negative examples of words in the language. To this end, we present a new algorithm to generate the language described by a generalized regular expression with intersection and complement operators. The complement operator allows us to generate both positive and negative example words from a given regular expression. We implement our generator in Haskell and OCaml and show that its performance is more than adequate for testing.
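
The testing idea in the abstract can be illustrated with a minimal sketch in Python. This is not the paper's algorithm and not its Haskell or OCaml implementation: it is a brute-force enumerator that lists the words of each length by structural recursion, and it only scales to small alphabets and short words. The tuple-based syntax tree and the names `words`, `lang`, and `disagreements` are assumptions made for this example. The sketch shows how a complement operator yields negative examples, and how both kinds of examples can be checked against a matcher (here Python's `re` module).

```python
import re
from itertools import product

# Hypothetical syntax tree for extended regular expressions, as tuples:
#   ("eps",)  ("chr", c)  ("alt", r, s)  ("cat", r, s)
#   ("star", r)  ("and", r, s)  ("not", r)


def words(alphabet, n):
    """All words of length exactly n over the alphabet."""
    return {"".join(p) for p in product(alphabet, repeat=n)}


def lang(r, alphabet, n):
    """Words of length exactly n in the language of r (brute force)."""
    tag = r[0]
    if tag == "eps":
        return {""} if n == 0 else set()
    if tag == "chr":
        return {r[1]} if n == 1 else set()
    if tag == "alt":
        return lang(r[1], alphabet, n) | lang(r[2], alphabet, n)
    if tag == "and":
        return lang(r[1], alphabet, n) & lang(r[2], alphabet, n)
    if tag == "not":
        # Complement relative to all words of this length.
        return words(alphabet, n) - lang(r[1], alphabet, n)
    if tag == "cat":
        # Split the word into a prefix of length k and a suffix of length n - k.
        return {u + v
                for k in range(n + 1)
                for u in lang(r[1], alphabet, k)
                for v in lang(r[2], alphabet, n - k)}
    if tag == "star":
        if n == 0:
            return {""}
        # Peel off a nonempty first factor so the recursion terminates.
        return {u + v
                for k in range(1, n + 1)
                for u in lang(r[1], alphabet, k)
                for v in lang(r, alphabet, n - k)}
    raise ValueError(f"unknown node: {tag!r}")


def disagreements(r, pattern, alphabet, max_len):
    """Words up to max_len on which re.fullmatch(pattern) disagrees with r.

    Each entry is (word, expected_membership).
    """
    compiled = re.compile(pattern)
    bad = []
    for n in range(max_len + 1):
        for w in sorted(lang(r, alphabet, n)):            # positive examples
            if compiled.fullmatch(w) is None:
                bad.append((w, True))
        for w in sorted(lang(("not", r), alphabet, n)):   # negative examples
            if compiled.fullmatch(w) is not None:
                bad.append((w, False))
    return bad
```

For instance, with `r = ("star", ("cat", ("chr", "a"), ("star", ("chr", "b"))))`, which denotes `(ab*)*`, calling `disagreements(r, r"(ab*)*", "ab", 4)` returns an empty list, while passing the deliberately wrong pattern `r"(ab)*"` reports the positive examples that the pattern rejects.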



Published in

ACM SIGPLAN Notices, Volume 53, Issue 9
GPCE '18
September 2018
214 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3393934

GPCE 2018: Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences
November 2018
214 pages
ISBN: 9781450360456
DOI: 10.1145/3278122

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
