Abstract
We investigate properties of strings that are expressible by canonical types of string constraints. Specifically, we consider a landscape of 20 logical theories whose syntax is built around combinations of four common elements of string constraints: language membership (e.g. for regular languages), concatenation, equality between string terms, and equality between string lengths. For a variable x and a formula f from a given theory, we consider the set of values that may be substituted for x as part of a satisfying assignment, in other words, the property that f expresses through x. Since we consider string-based logics, this set is a formal language. First, we study the relative expressive power of different combinations of string constraints by comparing the classes of languages expressible in the corresponding theories, and we establish a mostly complete picture in this regard. Second, we consider the problem of deciding whether the language or property expressed by a variable and formula in one theory can also be expressed in another theory. We establish several negative results that are relevant to the preprocessing and normalisation of string constraints in practice. Some of our results have strong connections to important open problems regarding word equations and the theory of string solving.
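To make the notion of "the language f expresses through x" concrete, the following is a minimal illustrative sketch, not taken from the paper. It assumes a two-letter alphabet and a length bound, so it only approximates the expressed language on short words. The names `words`, `expressed_language`, and the example predicates are hypothetical and chosen for illustration only.

```python
from itertools import product

ALPHABET = "ab"  # assumed alphabet, for illustration only


def words(max_len):
    """Yield every word over ALPHABET of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(ALPHABET, repeat=n):
            yield "".join(letters)


def expressed_language(formula, max_len):
    """Bounded approximation of the language a formula expresses through x.

    `formula(x, y)` is a hypothetical predicate over two string variables.
    A word x is kept if some y of length at most max_len satisfies it,
    i.e. if x is part of a satisfying assignment within the bound.
    """
    candidates = list(words(max_len))
    return {x for x in candidates if any(formula(x, y) for y in candidates)}


# Equality between string terms with concatenation: x = y.y
# Through x, this expresses the squares {ww}.
squares = expressed_language(lambda x, y: x == y + y, 4)

# Length equality plus regular membership: |x| = |y| and y in (ab)*
# Through x, this expresses "x has even length".
even_length = expressed_language(
    lambda x, y: len(x) == len(y) and y == "ab" * (len(y) // 2), 4
)
```

Under these assumptions, `squares` contains words such as `abab` but not `abba`, while `even_length` contains exactly the words of even length up to the bound. The second language is regular even though the formula uses a length constraint, which is the kind of cross-theory comparison the abstract describes.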
On the Expressive Power of String Constraints