research-article

A decision procedure for subset constraints over regular languages

Published: 15 June 2009

Abstract

Reasoning about string variables, in particular program inputs, is an important aspect of many program analyses and testing frameworks. Program inputs invariably arrive as strings, and are often manipulated using high-level string operations such as equality checks, regular expression matching, and string concatenation. It is difficult to reason about these operations because they are not well integrated into current constraint solvers.

We present a decision procedure that solves systems of equations over regular language variables. Given such a system of constraints, our algorithm finds satisfying assignments for the variables in the system. We define this problem formally and render a mechanized correctness proof of the core of the algorithm. We evaluate its scalability and practical utility by applying it to the problem of automatically finding inputs that cause SQL injection vulnerabilities.
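To make the flavor of such constraints concrete, here is a minimal sketch of a much simpler case than the one the paper addresses: a single variable v constrained by v ⊆ L1 and v ⊆ L2, where L1 and L2 are regular languages given as DFAs. This is not the authors' algorithm. It only illustrates the standard product construction for intersection followed by a breadth-first search for a witness string. The `DFA` class, the helper names, and the example languages are all assumptions made for this illustration.

```python
from collections import deque


class DFA:
    """Hypothetical minimal DFA: a missing transition means 'reject'."""

    def __init__(self, start, accepting, delta):
        self.start = start
        self.accepting = set(accepting)
        self.delta = delta  # dict mapping (state, symbol) -> state


def accepts(dfa, word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = dfa.start
    for sym in word:
        if (state, sym) not in dfa.delta:
            return False
        state = dfa.delta[(state, sym)]
    return state in dfa.accepting


def intersect(a, b):
    """Product construction: the result accepts strings accepted by both."""
    start = (a.start, b.start)
    delta, accepting = {}, set()
    seen, queue = {start}, deque([start])
    while queue:
        p, q = queue.popleft()
        if p in a.accepting and q in b.accepting:
            accepting.add((p, q))
        for (state, sym), p_next in a.delta.items():
            # Follow a symbol only if both machines can read it.
            if state != p or (q, sym) not in b.delta:
                continue
            nxt = (p_next, b.delta[(q, sym)])
            delta[((p, q), sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return DFA(start, accepting, delta)


def witness(dfa):
    """Return a shortest accepted string, or None if the language is empty."""
    seen, queue = {dfa.start}, deque([(dfa.start, "")])
    while queue:
        state, word = queue.popleft()
        if state in dfa.accepting:
            return word
        for (src, sym), nxt in dfa.delta.items():
            if src == state and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + sym))
    return None


# Example languages over the alphabet {a, b} (illustrative only).
ends_in_b = DFA(0, {1}, {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1})
even_length = DFA(0, {0}, {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0})
only_a = DFA(0, {0}, {(0, "a"): 0})

# v ⊆ ends_in_b and v ⊆ even_length: satisfiable, so a witness exists.
sat_witness = witness(intersect(ends_in_b, even_length))
# v ⊆ ends_in_b and v ⊆ only_a: no nonempty assignment, so no witness.
unsat_witness = witness(intersect(ends_in_b, only_a))
```

In this restricted setting, a non-`None` witness shows that the variable can be assigned a nonempty language satisfying both subset constraints. Constraints involving concatenation of several variables, which the paper targets, need more machinery than this sketch provides.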


Published in

ACM SIGPLAN Notices, Volume 44, Issue 6 (PLDI '09), June 2009, 478 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1543135

PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2009, 492 pages
ISBN: 9781605583921
DOI: 10.1145/1542476

Copyright © 2009 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
