research-article
Open Access

On the Expressive Power of String Constraints

Published: 11 January 2023

Abstract

We investigate properties of strings that are expressible by canonical types of string constraints. Specifically, we consider a landscape of 20 logical theories whose syntax is built around combinations of four common elements of string constraints: language membership (e.g., for regular languages), concatenation, equality between string terms, and equality between string lengths. For a variable x and a formula f from a given theory, we consider the set of values that x may take as part of a satisfying assignment, or in other words, the property that f expresses through x. Since we consider string-based logics, this set is a formal language. First, we consider the relative expressive power of different combinations of string constraints by comparing the classes of languages expressible in the corresponding theories, and we establish a mostly complete picture in this regard. Second, we consider the problem of deciding whether the language or property expressed by a variable and formula in one theory can be expressed in another theory. We establish several negative results that are relevant to preprocessing and normalisation of string constraints in practice. Some of our results have strong connections to important open problems regarding word equations and the theory of string solving.
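To make the notion of "the language a formula expresses through a variable" concrete, the following is a minimal brute-force sketch. The formula used here is a hypothetical illustration and is not taken from the article: it combines the four constraint elements named above (x = u·v, u ∈ a*, v ∈ b*, |u| = |v|), and the helper names `words`, `formula`, and `expressed_language` are assumptions introduced only for this example. The search is bounded by word length, so it only approximates the expressed language from below.

```python
from itertools import product
import re

ALPHABET = "ab"  # assumed two-letter alphabet for the illustration


def words(max_len):
    """Yield every word over ALPHABET of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)


def formula(x, u, v):
    """Hypothetical formula f(x, u, v) using all four constraint elements."""
    return (
        x == u + v                                # string equality + concatenation
        and re.fullmatch(r"a*", u) is not None    # regular language membership
        and re.fullmatch(r"b*", v) is not None    # regular language membership
        and len(u) == len(v)                      # equality between string lengths
    )


def expressed_language(max_len):
    """Values of x (up to max_len) that extend to a satisfying assignment."""
    candidates = list(words(max_len))
    found = (
        x
        for x in candidates
        if any(formula(x, u, v) for u in candidates for v in candidates)
    )
    return sorted(found, key=lambda w: (len(w), w))


if __name__ == "__main__":
    # Prints the bounded slice of the language expressed through x.
    print(expressed_language(4))
```

Under these assumptions, the values found for x are exactly the words of the form aⁿbⁿ within the length bound, a language that is not regular even though each membership constraint in the formula is.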

Published in

Proceedings of the ACM on Programming Languages, Volume 7, Issue POPL
January 2023
2196 pages
EISSN: 2475-1421
DOI: 10.1145/3554308

Copyright © 2023 Owner/Author

Publisher

Association for Computing Machinery
New York, NY, United States
