research-article
Open Access

Generating precise error specifications for C: a zero shot learning approach

Published: 10 October 2019

Abstract

In C programs, error specifications, which specify the value range that each function returns to indicate failures, are widely used to check and propagate errors for the sake of reliability and security. Various kinds of C analyzers employ error specifications for different purposes, e.g., to detect error handling bugs, yet a general approach for generating precise specifications is still missing. This limits the applicability of those tools.

In this paper, we solve this problem by developing a machine learning-based approach named MLPEx. It generates error specifications by analyzing only the source code, and is thus general. We propose a novel machine learning paradigm based on transfer learning, so that MLPEx requires only a one-time, minimal data-labeling effort from us (as the tool developers) and no manual labeling effort from users. To improve the accuracy of the generated error specifications, MLPEx extracts and exploits project-specific information. We evaluate MLPEx on 10 projects, including 6 libraries and 4 applications. An investigation of 3,443 functions and 17,750 paths reveals that MLPEx generates error specifications with a precision of 91% and a recall of 94%, significantly higher than those of state-of-the-art approaches. To further demonstrate the usefulness of the generated error specifications, we use them to detect 57 bugs in 5 tested projects.
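
To make the notion of an error specification concrete, the following minimal C sketch models one as a closed range of return values that signal failure, as described in the abstract. It is an illustration only, not MLPEx's representation or output: the `ErrorSpec` type, the `find_spec` and `indicates_failure` helpers, and the `parse_config` entry are hypothetical names introduced here. The only real-world fact assumed is that POSIX `open` returns -1 on failure.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical representation: an error specification maps a function name
 * to the closed range [lo, hi] of integer return values that signal failure. */
typedef struct {
    const char *function;
    long lo;
    long hi;
} ErrorSpec;

/* Illustrative entries only; these are not results produced by MLPEx. */
static const ErrorSpec SPECS[] = {
    { "open",         -1,       -1 },  /* POSIX open() returns -1 on failure */
    { "parse_config", LONG_MIN, -1 },  /* hypothetical: any negative value is an error */
};

/* Look up the specification for a function name; NULL if none is known. */
const ErrorSpec *find_spec(const char *function) {
    assert(function != NULL);
    for (size_t i = 0; i < sizeof SPECS / sizeof SPECS[0]; i++) {
        if (strcmp(SPECS[i].function, function) == 0) {
            return &SPECS[i];
        }
    }
    return NULL;
}

/* True if the return value falls inside the failure range of the spec. */
bool indicates_failure(const ErrorSpec *spec, long ret) {
    assert(spec != NULL && spec->lo <= spec->hi);
    return ret >= spec->lo && ret <= spec->hi;
}
```

An analyzer that consumes such specifications could, for example, call `indicates_failure(find_spec("open"), fd)` on the values a call site tests, and report call sites where the failure range is never checked.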

Supplemental Material

a160-wu

Presentation at OOPSLA '19

