Abstract
C tools, such as source browsers, bug finders, and automated refactorings, need to process two languages: C itself and the preprocessor. The latter improves expressivity through file includes, macros, and static conditionals. But it operates only on tokens, making it hard to even parse both languages. This paper presents a complete, performant solution to this problem. First, a configuration-preserving preprocessor resolves includes and macros yet leaves static conditionals intact, thus preserving a program's variability. To ensure completeness, we analyze all interactions between preprocessor features and identify techniques for correctly handling them. Second, a configuration-preserving parser generates a well-formed AST with static choice nodes for conditionals. It forks new subparsers when encountering static conditionals and merges them again after the conditionals. To ensure performance, we present a simple algorithm for table-driven Fork-Merge LR parsing and four novel optimizations. We demonstrate the effectiveness of our approach on the x86 Linux kernel.
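As a rough illustration of what "configuration-preserving" means, the following toy sketch expands object-like macros but keeps each static conditional as a choice node with one branch per outcome. It is not SuperC's algorithm: the names `Choice`, `preserve`, and `project` are hypothetical, only `#define`, `#ifdef`, `#else`, and `#endif` are recognized, tokens are assumed to be whitespace-separated, and macros defined inside conditionals (one of the feature interactions the paper analyzes) are deliberately not handled correctly.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Choice:
    """Hypothetical static choice node: one branch per outcome of an #ifdef."""
    condition: str
    then_branch: List["Node"]
    else_branch: List["Node"]


Node = Union[str, Choice]


def preserve(lines):
    """Expand object-like macros, but keep #ifdef/#else/#endif as Choice nodes."""
    macros = {}
    pos = 0

    def block(stop):
        nonlocal pos
        out = []
        while pos < len(lines):
            line = lines[pos].strip()
            word = line.split()[0] if line else ""
            if word in stop:
                return out  # leave the stop directive for the caller
            pos += 1
            if word == "#define":
                parts = line.split(None, 2)
                macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            elif word == "#ifdef":
                cond = line.split()[1]
                then_branch = block({"#else", "#endif"})
                else_branch = []
                if lines[pos].strip() == "#else":
                    pos += 1
                    else_branch = block({"#endif"})
                pos += 1  # skip the matching #endif
                out.append(Choice(cond, then_branch, else_branch))
            elif line:
                # Token-wise macro expansion on whitespace-separated tokens.
                out.append(" ".join(macros.get(t, t) for t in line.split()))
        return out

    return block(set())


def project(nodes, defined):
    """Select one configuration by resolving every Choice node."""
    out = []
    for node in nodes:
        if isinstance(node, Choice):
            taken = node.then_branch if node.condition in defined else node.else_branch
            out.extend(project(taken, defined))
        else:
            out.append(node)
    return out


source = [
    "#define SIZE 64",
    "#ifdef CONFIG_SMP",
    "int cpus = SIZE ;",
    "#else",
    "int cpus = 1 ;",
    "#endif",
    "int done ;",
]
tree = preserve(source)
```

Here `tree` holds both variants of the declaration under a single `Choice` node, and `project(tree, {"CONFIG_SMP"})` recovers what an ordinary preprocessor would emit for that one configuration. A real tool would track full presence conditions (boolean formulas over configuration variables) rather than a single macro name.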
SuperC: parsing all of C by taming the preprocessor
PLDI '12: Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation