Abstract
We introduce a general way to locate programmer mistakes that are detected by static analyses. The program analysis is expressed in a general constraint language that is powerful enough to model type checking, information flow analysis, dataflow analysis, and points-to analysis. Programmer mistakes show up as unsatisfiable constraints. Given an unsatisfiable system of constraints, both satisfiable and unsatisfiable constraints are analyzed to identify the program expressions most likely to be the cause of unsatisfiability. The likelihood of different error explanations is evaluated under the assumption that the programmer’s code is mostly correct, so the simplest explanations are chosen, following Bayesian principles. For analyses that rely on programmer-stated assumptions, the diagnosis also identifies assumptions likely to have been omitted. The new error diagnosis approach has been implemented as a tool called SHErrLoc, which is applied to three very different program analyses, including type inference for the highly expressive type system implemented by the Glasgow Haskell Compiler, which features type classes, Generalized Algebraic Data Types (GADTs), and type families. The effectiveness of the approach is evaluated using previously collected programs containing errors. The results show that, compared to existing compilers and other tools, SHErrLoc consistently identifies the location of programmer errors significantly more accurately, without any language-specific heuristics.
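The ranking idea described above can be illustrated with a toy sketch. This is not SHErrLoc's algorithm or API: the function name `rank_explanations`, the path representation, and the two weights are all assumptions made for illustration. Each constraint path is modeled as the set of program expressions it passes through, plus a flag saying whether it is satisfiable. A candidate explanation is a set of expressions that touches every unsatisfiable path; it is penalized for its size (mostly correct code suggests few errors) and for each satisfiable path it also appears on (expressions used in satisfiable paths are less suspicious).

```python
from itertools import combinations


def rank_explanations(paths, size_weight=3.0, sat_weight=1.0, max_size=2):
    """Rank candidate error explanations, cheapest (most plausible) first.

    paths: list of (frozenset of expression names, satisfiable flag).
    size_weight and sat_weight are hypothetical tuning constants.
    """
    unsat = [exprs for exprs, ok in paths if not ok]
    sat = [exprs for exprs, ok in paths if ok]
    # Only expressions on some unsatisfiable path can explain the failure.
    candidates = sorted(set().union(*unsat)) if unsat else []
    ranked = []
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            # An explanation must touch every unsatisfiable path.
            if not all(chosen & path for path in unsat):
                continue
            # Count satisfiable paths that also use a blamed expression.
            sat_hits = sum(1 for path in sat if chosen & path)
            cost = size_weight * len(chosen) + sat_weight * sat_hits
            ranked.append((cost, combo))
    ranked.sort()
    return ranked


# Hypothetical input: expression "x" lies on both failing paths and on no
# satisfiable path, while "f" and "g" each also appear on a satisfiable path.
example_paths = [
    (frozenset({"x", "f"}), False),
    (frozenset({"x", "g"}), False),
    (frozenset({"f", "h"}), True),
    (frozenset({"g", "h"}), True),
]
best_cost, best_explanation = rank_explanations(example_paths)[0]
print(best_explanation)  # ('x',)
```

In this example the single expression `x` is ranked first: it alone accounts for both unsatisfiable paths and never occurs on a satisfiable one, whereas blaming `f` and `g` together requires two errors and implicates two satisfiable paths.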
SHErrLoc: A Static Holistic Error Locator