skip to main content
research-article
Open Access

What is decidable about string constraints with the ReplaceAll function

Authors Info & Claims
Published:27 December 2017Publication History
Skip Abstract Section

Abstract

The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a ``pattern'' string constant/variable/regular expression by a ``replacement'' string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers.

Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted.

Skip Supplemental Material Section

Supplemental Material

whatsdecidable.webm

References

  1. Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Bui Phi Diep, Lukás Holík, Ahmed Rezine, and Philipp Rümmer. 2017. Flatten and conquer: a framework for efficient analysis of string constraints. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2017, Barcelona, Spain, June 18-23, 2017. 602–617. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Parosh Aziz Abdulla, Mohamed Faouzi Atig, Yu-Fang Chen, Lukás Holík, Ahmed Rezine, Philipp Rümmer, and Jari Stenman. 2014. String Constraints for Verification. In CAV. 150–166. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Rajeev Alur and Pavol Cerný. 2010. Expressiveness of streaming string transducers. In IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2010, December 15-18, 2010, Chennai, India. 1–12.Google ScholarGoogle Scholar
  4. Christel Baier and Joost-Pieter Katoen. 2008. Principles of Model Checking (Representation and Mind Series). The MIT Press.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Davide Balzarotti, Marco Cova, Viktoria Felmetsger, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, and Giovanni Vigna. 2008. Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications. In 2008 IEEE Symposium on Security and Privacy (S&P 2008), 18-21 May 2008, Oakland, California, USA. 387–401. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Nikolaj Bjørner, Nikolai Tillmann, and Andrei Voronkov. 2009. Path feasibility analysis for string-manipulating programs. In TACAS. 307–321. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. J Richard Büchi and Steven Senger. 1990. Definability in the existential theory of concatenation and undecidable extensions of this theory. In The Collected Works of J. Richard Büchi. Springer, 671–683.Google ScholarGoogle Scholar
  8. Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, and Dawson R. Engler. 2006. EXE: Automatically Generating Inputs of Death. In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS ’06). ACM, New York, NY, USA, 322–335. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Przemyslaw Daca, Thomas A. Henzinger, and Andrey Kupriyanov. 2016. Array Folds Logic. In Computer Aided Verification -28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part II. 230–248. Google ScholarGoogle ScholarCross RefCross Ref
  10. Loris D’Antoni and Margus Veanes. 2013. Static Analysis of String Encoders and Decoders. In VMCAI. 209–228. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Vijay Ganesh, Mia Minnes, Armando Solar-Lezama, and Martin C. Rinard. 2012. Word Equations with Length Constraints: What’s Decidable?. In Hardware and Software: Verification and Testing - 8th International Haifa Verification Conference, HVC 2012, Haifa, Israel, November 6-8, 2012. Revised Selected Papers. 209–226. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Patrice Godefroid, Nils Klarlund, and Koushik Sen. 2005. DART: Directed Automated Random Testing. SIGPLAN Not. 40, 6 (June 2005), 213–223. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Google. 2015. Closure Templates. https://developers.google.com/closure/templates/ . Referred July 2017.Google ScholarGoogle Scholar
  14. Pieter Hooimeijer, Benjamin Livshits, David Molnar, Prateek Saxena, and Margus Veanes. 2011. Fast and Precise Sanitizer Analysis with BEK. In USENIX Security Symposium.Google ScholarGoogle Scholar
  15. John E. Hopcroft and Jeffrey D. Ullman. 1979. Introduction to Automata Theory, Languages and Computation. Addison-Wesley.Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Artur Jez. 2017. Word equations in linear space. CoRR abs/1702.00736 (2017). http://arxiv.org/abs/1702.00736Google ScholarGoogle Scholar
  17. Christoph Kern. 2014. Securing the tangled web. Commun. ACM 57, 9 (2014), 38–47. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Adam Kiezun et al. 2012. HAMPI: A solver for word equations over strings, regular expressions, and context-free grammars. ACM Trans. Softw. Eng. Methodol. 21, 4 (2012), 25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. James C. King. 1976. Symbolic Execution and Program Testing. Commun. ACM 19, 7 (1976), 385–394. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Jan Lehnardt and contributors. 2015. mustache.js. https://github.com/janl/mustache.js/ . Referred July 2017.Google ScholarGoogle Scholar
  21. Tianyi Liang, Andrew Reynolds, Cesare Tinelli, Clark Barrett, and Morgan Deters. 2014. A DPLL(T) Theory Solver for a Theory of Strings and Regular Expressions. In CAV. 646–662.Google ScholarGoogle Scholar
  22. Anthony W. Lin and Pablo Barceló. 2016. String Solving with Word Equations and Transducers: Towards a Logic for Analysing Mutation XSS. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’16). Springer, 123–136. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Gennady S Makanin. 1977. The problem of solvability of equations in a free semigroup. Sbornik: Mathematics 32, 2 (1977), 129–198. Google ScholarGoogle ScholarCross RefCross Ref
  24. Yuri V. Matiyasevich. 1993. Hilbert’s Tenth Problem. MIT Press, Cambridge, MA, USA.Google ScholarGoogle Scholar
  25. K. L. McMillan. 1993. Symbolic model checking. Kluwer. Google ScholarGoogle ScholarCross RefCross Ref
  26. Wojciech Plandowski. 2004. Satisfiability of word equations with constants is in PSPACE. J. ACM 51, 3 (2004), 483–496. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Prateek Saxena, Devdatta Akhawe, Steve Hanna, Feng Mao, Stephen McCamant, and Dawn Song. 2010. A Symbolic Execution Framework for JavaScript. In 31st IEEE Symposium on Security and Privacy, S&P 2010, 16-19 May 2010, Berleley/Oakland, California, USA. 513–528. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Klaus U. Schulz. 1990. Makanin’s Algorithm for Word Equations - Two Improvements and a Generalization. In Word Equations and Related Topics, First International Workshop, IWWERT ’90, Tübingen, Germany, October 1-3, 1990, Proceedings. 85–150. Google ScholarGoogle ScholarCross RefCross Ref
  29. Koushik Sen, Swaroop Kalasapur, Tasneem G. Brutch, and Simon Gibbs. 2013. Jalangi: a selective record-replay and dynamic analysis framework for JavaScript. In Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE’13, Saint Petersburg, Russian Federation, August 18-26, 2013. 488–498. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Minh-Thai Trinh, Duc-Hiep Chu, and Joxan Jaffar. 2014. S3: A Symbolic String Solver for Vulnerability Detection in Web Applications. In CCS. 1232–1243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Minh-Thai Trinh, Duc-Hiep Chu, and Joxan Jaffar. 2016. Progressive Reasoning over Recursively-Defined Strings. In Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part I. Springer, 218–240. Google ScholarGoogle ScholarCross RefCross Ref
  32. Margus Veanes, Pieter Hooimeijer, Benjamin Livshits, David Molnar, and Nikolaj Bjørner. 2012. Symbolic finite state transducers: algorithms and applications. In POPL. 137–150. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Hung-En Wang, Tzung-Lin Tsai, Chun-Han Lin, Fang Yu, and Jie-Hong R. Jiang. 2016. String Analysis via Automata Manipulation with Logic Circuit Representation. In Computer Aided Verification - 28th International Conference, CAV 2016, Toronto, ON, Canada, July 17-23, 2016, Proceedings, Part I (Lecture Notes in Computer Science), Vol. 9779. Springer, 241–260. Google ScholarGoogle ScholarCross RefCross Ref
  34. Chris Wanstrath. 2009. Mustache: Logic-less Templates. https://mustache.github.io/ . Referred July 2017.Google ScholarGoogle Scholar
  35. Jeff Williams, Jim Manico, and Neil Mattatall. 2017. XSS Prevention Cheat Sheet. https://www.owasp.org/index.php/XSS_ (Cross_Site_Scripting)_Prevention_Cheat_Sheet . Referred July 2017.Google ScholarGoogle Scholar
  36. Fang Yu, Muath Alkhalaf, Tevfik Bultan, and Oscar H. Ibarra. 2014. Automata-based Symbolic String Analysis for Vulnerability Detection. Form. Methods Syst. Des. 44, 1 (2014), 44–70. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Yunhui Zheng, Xiangyu Zhang, and Vijay Ganesh. 2013. Z3-str: a Z3-based string solver for web application analysis. In ESEC/SIGSOFT FSE. 114–124. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. What is decidable about string constraints with the ReplaceAll function

              Recommendations

              Comments

              Login options

              Check if you have access through your login credentials or your institution to get full access on this article.

              Sign in

              Full Access

              PDF Format

              View or Download as a PDF file.

              PDF

              eReader

              View online with eReader.

              eReader
              About Cookies On This Site

              We use cookies to ensure that we give you the best experience on our website.

              Learn more

              Got it!