
Pattern Matching with Variables: Efficient Algorithms and Complexity Results

Published: 11 February 2020

Abstract

A pattern α (i.e., a string of variables and terminals) matches a word w if w can be obtained by uniformly replacing the variables of α by terminal words. The corresponding matching problem, i.e., deciding whether a given pattern matches a given word, is NP-complete in general, but can be solved in polynomial time for restricted classes of patterns. We present efficient algorithms for the matching problem with respect to patterns with a bounded number of repeated variables and patterns with a structural restriction on the order of variables. Furthermore, we show that it is NP-complete to decide, for a given number k and a word w, whether w can be factorised into k distinct factors. As an immediate consequence of this hardness result, the injective version of the matching problem (i.e., different variables are replaced by different words) is NP-complete even for very restricted classes of patterns.
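
To make the definitions concrete, the following is a minimal backtracking sketch of the matching problem and its injective version. It is not one of the algorithms presented in the article, and it takes exponential time in the worst case. Several conventions are assumptions made for illustration only: the function name `match_pattern` is hypothetical, uppercase letters in the pattern are treated as variables and all other characters as terminals, and variables must be replaced by non-empty words.

```python
from typing import Dict, Optional


def match_pattern(pattern: str, word: str,
                  injective: bool = False) -> Optional[Dict[str, str]]:
    """Return a substitution under which `pattern` matches `word`, or None.

    Assumed conventions (not taken from the article): uppercase letters are
    variables, every other character is a terminal, and each variable is
    replaced by a non-empty word. With injective=True, different variables
    must be replaced by different words.
    """

    def solve(i: int, j: int, subst: Dict[str, str]) -> Optional[Dict[str, str]]:
        # i: position in the pattern, j: position in the word.
        if i == len(pattern):
            return dict(subst) if j == len(word) else None
        sym = pattern[i]
        if not sym.isupper():
            # Terminal symbol: it must agree with the next letter of the word.
            if j < len(word) and word[j] == sym:
                return solve(i + 1, j + 1, subst)
            return None
        if sym in subst:
            # Repeated variable: its earlier image must occur again here.
            image = subst[sym]
            if word.startswith(image, j):
                return solve(i + 1, j + len(image), subst)
            return None
        # First occurrence of a variable: try every non-empty image.
        for end in range(j + 1, len(word) + 1):
            image = word[j:end]
            if injective and image in subst.values():
                continue  # image already used by another variable
            subst[sym] = image
            result = solve(i + 1, end, subst)
            if result is not None:
                return result
            del subst[sym]
        return None

    return solve(0, 0, {})


if __name__ == "__main__":
    print(match_pattern("XaX", "bab"))                # {'X': 'b'}
    print(match_pattern("XY", "aa"))                  # {'X': 'a', 'Y': 'a'}
    print(match_pattern("XY", "aa", injective=True))  # None
```

Under these conventions, asking whether a word can be factorised into k distinct factors is the same as asking whether a pattern of k pairwise different variables matches it injectively. For example, `match_pattern("XYZ", "aaab", injective=True)` finds the factorisation a, aa, b, while the same pattern has no injective match on "aaa".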

