Abstract
Cody, Hazel, and Theo, two experienced Haskell programmers and an expert in automata theory, develop an elegant Haskell program for matching regular expressions: (i) the program is purely functional; (ii) it is overloaded over arbitrary semirings, which not only solves the ordinary matching problem but also supports other applications, such as computing leftmost longest matchings or the number of matchings, all with a single algorithm; (iii) it is more powerful than other matchers, as it can be used for parsing every context-free language by taking advantage of laziness.
The developed program is based on an old technique to turn regular expressions into finite automata which makes it efficient both in terms of worst-case time and space bounds and actual performance: despite its simplicity, the Haskell implementation can compete with a recently published professional C++ program for the same problem.
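The idea of overloading a matcher over semirings can be illustrated with a minimal sketch. The code below is a naive, specification-style matcher that tries every way of splitting the input, so it takes exponential time; it is not the efficient automaton-based program the abstract describes. All names (`Semiring`, `Reg`, `sym`, `match`, `splits`, `parts`, `sumS`, `prodS`) are assumptions chosen for illustration.

```haskell
-- Naive weighted regular-expression matcher (illustrative sketch only).
-- It enumerates all splits of the input, so it is exponential in the worst case.

-- A semiring: an "addition" with unit zero and a "multiplication" with unit one.
class Semiring s where
  zero, one    :: s
  (<+>), (<.>) :: s -> s -> s

-- Booleans give ordinary matching: does the word match at all?
instance Semiring Bool where
  zero  = False
  one   = True
  (<+>) = (||)
  (<.>) = (&&)

-- Integers count the number of distinct ways the word matches.
instance Semiring Int where
  zero  = 0
  one   = 1
  (<+>) = (+)
  (<.>) = (*)

-- Regular expressions whose symbols carry a weight function.
data Reg c s
  = Eps
  | Sym (c -> s)
  | Alt (Reg c s) (Reg c s)
  | Seq (Reg c s) (Reg c s)
  | Rep (Reg c s)

-- A symbol that weighs its own character as one and every other as zero.
sym :: (Semiring s, Eq c) => c -> Reg c s
sym c = Sym (\x -> if x == c then one else zero)

-- All ways to cut a list into a prefix and a suffix.
splits :: [a] -> [([a], [a])]
splits xs = [splitAt n xs | n <- [0 .. length xs]]

-- All ways to cut a list into a sequence of non-empty pieces.
parts :: [a] -> [[[a]]]
parts []       = [[]]
parts [c]      = [[[c]]]
parts (c : cs) = concat [[(c : p) : ps, [c] : p : ps] | p : ps <- parts cs]

sumS, prodS :: Semiring s => [s] -> s
sumS  = foldr (<+>) zero
prodS = foldr (<.>) one

-- The weight of a word under a regular expression, in any semiring.
match :: Semiring s => Reg c s -> [c] -> s
match Eps       u   = if null u then one else zero
match (Sym f)   [c] = f c
match (Sym _)   _   = zero
match (Alt p q) u   = match p u <+> match q u
match (Seq p q) u   = sumS [match p u1 <.> match q u2 | (u1, u2) <- splits u]
match (Rep r)   u   = sumS [prodS [match r ui | ui <- ps] | ps <- parts u]
```

With these definitions, the same `match` function answers different questions depending on the result type: `match (Rep (Alt (sym 'a') (sym 'b'))) "abba" :: Bool` evaluates to `True`, while `match (Seq (Rep (sym 'a')) (Rep (sym 'a'))) "aa" :: Int` evaluates to `3`, the number of ways to divide `"aa"` between the two starred subexpressions.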
Index Terms
A play on regular expressions: functional pearl
ICFP '10: Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming