Abstract
Regular expressions are part of every programmer’s toolbox. They are used for a wide variety of language-related tasks and there are many algorithms for manipulating them. In particular, matching algorithms that detect whether a word belongs to the language described by a regular expression are well explored, yet new algorithms appear frequently. However, there is no satisfactory methodology for testing such matchers. We propose a testing methodology which is based on generating positive as well as negative examples of words in the language. To this end, we present a new algorithm to generate the language described by a generalized regular expression with intersection and complement operators. The complement operator allows us to generate both positive and negative example words from a given regular expression. We implement our generator in Haskell and OCaml and show that its performance is more than adequate for testing.
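The testing idea can be illustrated with a minimal sketch. It is a naive enumerator written for illustration, not the generator the abstract refers to. The tuple encoding of expressions, the two-letter alphabet `SIGMA`, and the names `words` and `examples` are all assumptions introduced here. The sketch enumerates a language one word length at a time, so the complement at length n is simply every word of length n minus the words of the operand at that length. Python's `re.fullmatch` stands in for the matcher under test.

```python
from functools import lru_cache
from itertools import product
import re

SIGMA = "ab"  # assumed alphabet for this sketch


@lru_cache(maxsize=None)
def words(r, n):
    """Return the set of words of length exactly n in the language of r.

    r is a nested tuple (hypothetical encoding): ("empty",), ("eps",),
    ("sym", c), ("alt", r1, r2), ("and", r1, r2), ("not", r1),
    ("cat", r1, r2), ("star", r1).
    """
    tag = r[0]
    if tag == "empty":
        return frozenset()
    if tag == "eps":
        return frozenset({""}) if n == 0 else frozenset()
    if tag == "sym":
        return frozenset({r[1]}) if n == 1 else frozenset()
    if tag == "alt":
        return words(r[1], n) | words(r[2], n)
    if tag == "and":
        return words(r[1], n) & words(r[2], n)
    if tag == "not":
        # Complement relative to all words of length n over SIGMA.
        everything = frozenset("".join(p) for p in product(SIGMA, repeat=n))
        return everything - words(r[1], n)
    if tag == "cat":
        # Split the length n between the two operands in every possible way.
        return frozenset(u + v
                         for i in range(n + 1)
                         for u in words(r[1], i)
                         for v in words(r[2], n - i))
    if tag == "star":
        if n == 0:
            return frozenset({""})
        # The first factor is non-empty, so the recursion terminates.
        return frozenset(u + v
                         for i in range(1, n + 1)
                         for u in words(r[1], i)
                         for v in words(r, n - i))
    raise ValueError(f"unknown constructor: {tag}")


def examples(r, max_len):
    """Positive and negative example words up to max_len, shortest first."""
    def upto(e):
        found = (w for n in range(max_len + 1) for w in words(e, n))
        return sorted(found, key=lambda w: (len(w), w))
    return upto(r), upto(("not", r))


# a*b, checked against Python's re module as the matcher under test.
A_STAR_B = ("cat", ("star", ("sym", "a")), ("sym", "b"))
pos, neg = examples(A_STAR_B, 3)
assert all(re.fullmatch("a*b", w) for w in pos)
assert not any(re.fullmatch("a*b", w) for w in neg)
print(pos)  # ['b', 'ab', 'aab']
```

Working by word length keeps every intermediate set finite, which is what makes the complement computable here. The cost grows exponentially with the length bound, so this sketch only suits short words and small alphabets.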
Regenerate: a language generator for extended regular expressions
GPCE 2018: Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences