Abstract
We study the decidability of satisfiability for string logics with concatenation and finite-state transducers as atomic operations. Although restricting to one type of operation yields decidability, little is known about the decidability of the combined theory, which is especially relevant when analysing security vulnerabilities of dynamic web pages in a more realistic browser model. On the one hand, word equations (string logic with concatenation) cannot precisely capture sanitisation functions (e.g. htmlescape) and implicit browser transductions (e.g. innerHTML mutations). On the other hand, transducers have the reverse problem: they can model sanitisation functions and browser transductions, but not string concatenation. Naively combining word equations and transducers easily leads to an undecidable logic. Our main contribution is to show that the "straight-line fragment" of the logic is decidable, with complexity ranging from PSPACE to EXPSPACE. The fragment can express the program logics of straight-line string-manipulating programs with concatenation and transduction as atomic operations, which arise in bounded model checking and dynamic symbolic execution. We demonstrate that the logic can naturally express the constraints required for analysing mutation XSS in web applications. Finally, the logic remains decidable in the presence of length, letter-counting, regular, indexOf, and disequality constraints.
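To make the shape of a straight-line constraint concrete, the following is a minimal sketch of a program that uses one transduction and one concatenation, followed by a regular constraint on the result. All names (`ESCAPE`, `html_escape`, `straight_line`, `find_witness`) are hypothetical, the escape table is a simplified stand-in for htmlescape, and the bounded enumeration is only an illustration: it is not the decision procedure of the paper, which handles inputs of unbounded length.

```python
import itertools
import re

# Assumed, simplified escape table: a one-state transducer that maps
# each input letter to an output string (identity on other letters).
ESCAPE = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"}


def html_escape(s):
    """Apply the letter-to-string transduction to s."""
    return "".join(ESCAPE.get(c, c) for c in s)


def straight_line(x):
    """Straight-line program: every variable is assigned exactly once."""
    y = html_escape(x)          # atomic operation 1: transduction
    z = "<b>" + y + "</b>"      # atomic operation 2: concatenation
    return z


def find_witness(alphabet, max_len, bad_pattern):
    """Bounded search for an input x whose output z matches bad_pattern.

    Returns the first such x in length-then-alphabet order, or None if
    no input of length <= max_len over the given alphabet qualifies.
    """
    bad = re.compile(bad_pattern)
    for n in range(max_len + 1):
        for letters in itertools.product(alphabet, repeat=n):
            x = "".join(letters)
            if bad.search(straight_line(x)):
                return x
    return None
```

For example, `find_witness("<>a", 3, r"<a>")` searches all inputs of length at most 3 over the letters `<`, `>`, and `a` for one that injects the tag `<a>` into the output. A bounded search of this kind can only report the absence of a witness up to the chosen length, which is why decidability of the unbounded problem matters.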
String solving with word equations and transducers: towards a logic for analysing mutation XSS
POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages