Clog: A Declarative Language for C Static Code Checkers

We present Clog, a declarative language for describing static code checkers for C. Unlike other extensible state-of-the-art checker frameworks, Clog enables powerful interprocedural checkers without exposing the underlying program representation: Clog checkers consist of Datalog-style recursive rules that access the program under analysis via syntactic pattern matching and control flow edges only. We have implemented Clog on top of Clang, using a custom Datalog evaluation strategy that piggy-backs on Clang's AST matching facilities while working around Clang's limitations to achieve our design goal of representation independence. Our experiments demonstrate that Clog can concisely express a wide variety of checkers for different security vulnerabilities, with performance that is similar to Clang's own analyses and highly competitive on real-world programs.


Introduction
While the C programming language enforces certain correctness properties that all C programs must satisfy, C programmers have been using supplementary static checkers to enforce additional constraints for most of the language's existence: the release of C in 1973 [26] was followed by the release of lint only 4 years later [15].
Since then, static program analysis has made substantial advances. Modern software engineers can draw from frameworks that offer interprocedural data flow analyses (e.g., Phasar [28]), complex call graph and points-to analyses with intricate interdependencies (e.g., cclyzer [4]) and efficient memory models via separation logic (Infer [25]). In practical software development, developers may also adopt checker frameworks that trade precision or soundness for speed, for easier integration into the build workflow, such as the Clang Static Analyzer1, clang-tidy2 and CppCheck3.
However, most of these tools (Phasar, Infer, Clang Static Analyzer, clang-tidy, CppCheck) are not easily extensible: they only supply a fixed set of built-in general-purpose checkers. Thus, they are not designed to tackle the increasing reliance of modern software on external libraries, to support custom checks for internal APIs, or to incorporate custom, project-specific tweaks to existing analyses. Adding a new analysis or customizing an existing one depends on the internal representation of the program used by the tool (usually an abstract syntax tree (AST) or a 3-address intermediate representation (IR)), and these internal representations often evolve as the underlying tool itself evolves. For example, Clang offers a command-line interface for AST pattern matching (clang-query) that can find program locations corresponding to a user-defined pattern, but users must express these patterns in Clang's idiosyncratic pattern language.
Several recent tools, Coccinelle [18], CodeQL [3], and cclyzer, therefore explicitly offer domain-specific languages that enable software engineers to supply their own bug patterns or to tweak existing ones. We observe that these tools split bug detection into two phases: (1) a first phase that moves from the concrete domain (AST or IR) to an abstract domain, and (2) a second phase that combines information in the abstract domain to derive conclusions, computing fixpoints as needed. Both cclyzer and CodeQL provide a declarative, Datalog-style analysis description that facilitates the development of custom analyses, but only in the second phase of the analyzer.

Figure 1. Examples for the goto check. Case (a) is a back-jump. Case (b) is a forward jump to the single label in the function. Case (c) is a forward jump to a label followed by a return.
The analyses are still dependent on the internal representation of the analyzed program, be it an AST or an IR. Coccinelle's code pattern language, meanwhile, allows developers to specify syntactic patterns that resemble C code, with various pattern extensions, e.g., for intraprocedural control flow dependencies between patterns. More complex connections between patterns (e.g., those that require fixpoints) require custom Python or OCaml code.
In this paper, we introduce Clog, a declarative language that combines Datalog-style reasoning with Coccinelle-style syntactic pattern matching over the C language. For the first analysis phase, syntactic patterns allow us to describe C code patterns without exposing the internal representation, unlike cclyzer and CodeQL. For the second analysis phase, Datalog-style reasoning allows us to freely combine syntactic patterns across arbitrary flow and dependency edges, avoiding the need to escape to scripting languages, as in Coccinelle, and enabling integration with the existing body of work on points-to analysis and context-sensitivity in Datalog.
To illustrate our approach, consider a warning common to many static code checkers: the use of the goto statement (Figure 1). Although tools vary in what they consider an admissible use of goto, both clang-tidy and CodeQL agree on discouraging back-jumps (Figure 1a).
In Figure 2 we show a possible approach for implementing a back-jump checker in CodeQL. The checker collects all goto statements in a relation GotoStmt and all label statements in LabelStmt (line 1), from these relations selects the goto and label statements that refer to the same label (line 3), compares their source locations (line 4), and, if successful, generates a report (line 6). While this checker provides a concise description of the check, it still relies on the GotoStmt and LabelStmt relations, which are defined externally. In contrast to CodeQL, Clog aims to hide implementation details internal to the analysis system, such as the AST nodes and their attributes, and introduces syntactic patterns to match arbitrary terms of the analyzed programs, including, but not limited to, terms representing single AST nodes.
Figure 3 depicts the Clog implementation of the same goto check, where we use syntactic patterns instead of built-in relations. Following Datalog notation, :- represents logical right-to-left implication (⇐) and commas represent conjunction. The syntactic pattern in line 2 matches all goto statements in the program, binding each to the variable g and its label to $label. Analogously, the pattern in line 3 matches all label statements l with the same label, $label. To distinguish program names from Datalog variables, we prefix variables used inside patterns with the $ sign; thus $label can bind to any label in the program and $s to any statement. Since these are variables of the analysis program, we call them metavariables. On line 4, the program compares source line numbers and, if this succeeds, adds the tuple (g, l) to the WarnBackwardGoto relation.
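Operationally, such a rule behaves like a relational join. As an illustration only, the following Python sketch evaluates the back-jump rule over hand-written fact tuples; the node IDs ("g1", "l1", ...), labels and line numbers are hypothetical stand-ins for the output of Clog's pattern matcher:

```python
# Hand-written facts standing in for Clog's matcher output; the node IDs,
# labels and line numbers are hypothetical.
goto_stmts = [("g1", "retry", 12), ("g2", "done", 20)]   # (node, label, line)
label_stmts = [("l1", "retry", 5), ("l2", "done", 25)]   # (node, label, line)

# WarnBackwardGoto(g, l) :- g is a goto to $label, l is a label $label,
#                           and l's source line is not after g's.
warn_backward_goto = {
    (g, l)
    for (g, g_label, g_line) in goto_stmts
    for (l, l_label, l_line) in label_stmts
    if g_label == l_label and l_line <= g_line
}
print(warn_backward_goto)  # {('g1', 'l1')}: only the jump to "retry" is backward
```

The join over the shared label plays the role of the conjunction in the rule body; the line comparison filters the joined tuples.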
Syntactic patterns are not limited to single statements; we can freely compose them as long as the result is a statement, expression, declaration or definition. For example, we can write a program that detects labels placed directly before a return statement. Such a program may warn about missing cleanup code or hint that a return can be used directly instead of a goto (Figure 1c).
To enable our framework to handle industrial-quality code, including large code bases and C language extensions, we have implemented its frontend on top of Clang. This allows us to combine state-of-the-art Datalog evaluation techniques with a custom pattern embedding strategy that offloads parts of syntactic pattern matching and semantic analysis to Clang's own pattern matching and analysis facilities, on demand. Our contributions are:
1. Clog, a declarative language that combines syntactic patterns and Datalog-style reasoning for program analysis of C programs;
2. A prototype implementation of Clog4;
3. An execution strategy that allows us to automatically offload parts of most Clog analyses to Clang;
4. A comparison of speed and quality between 5 analyses implemented in Clog and the Clang Static Analyzer, with a validated artifact [11].

The Clog Language
We introduce the Clog language through two examples that illustrate both the language and the workflow that we use to develop static checkers in Clog. We first show how we can construct recursive inclusion-based analyses (Section 2.1) to catch misuses of a typical internal API, and then show how Clog's built-in knowledge about control flow (Section 2.2) can help identify violations of an API protocol for an external library. Sections 2.3-2.6 then give a full overview of the language.

Recursive Patterns: Arena Allocators
To illustrate how Clog can help developers find bugs related to internal APIs, we look at arena allocators [7]. These custom allocators can speed up memory allocation and (bulk) deallocation. Since arenas can be used to store temporary data structures that have a shorter lifetime than the program, pointers allocated in them may coexist alongside pointers allocated using malloc. However, calling free on an arena-allocated pointer is undefined behavior, so it is important that the two kinds of pointers are not confused in the program. In Figure 4 we provide three examples of such API misuse.

4 https://github.com/lu-cs-sde/clog

1 ExprPointsToArena(e) :- e ⟨..aalloc(..)..⟩.
In Clog, body literals may also be syntactic patterns. For example, in Figure 5, line 1 states that if e is a specific element of the program under analysis that has the syntactic form ⟨..aalloc(..)..⟩ (where '..' is a wildcard that matches any sequence of elements), then we must conclude that ExprPointsToArena(e) is true. Syntactic patterns like e ⟨..aalloc(..)..⟩ behave like predicates quantified over the entire program under analysis, that is: a syntactic pattern matches if there exists a substitution of the pattern's metavariables with terms from the analyzed program such that the resulting term is present in the analyzed program.
We follow most Datalog dialects in further extending the language with comparison predicates, represented inline by operators (e.g. >, ==), pure functions (e.g. src_line_start, which maps a program term to its source line) and stratified negation (using the '!' operator).
To catch the first misuse case (line 5), we describe an intraprocedural, flow-insensitive analysis in Figure 5.
In line 2, we also add cast expressions to the same relation. Line 3 introduces a new relation, VarPointsToArena, which contains all the pointer variables pointing to arena-allocated memory. First, we add all variables that are initialized by an expression that points to an arena. In line 4, we handle variable assignments, which are matched by the ⟨..$p = $e..⟩ pattern, where $p is bound to an identifier and we use the built-in function decl to look up the corresponding declaration, p. If the look-up is successful and the right-hand side of the assignment points to an arena, then we deduce that the variable p also points to an arena. Line 6 contains the dual of the previous rule, stating that if a variable points to an arena, then a reference to that variable, ⟨..$p..⟩, is an expression pointing to an arena. We observe that there is a circular dependency between the VarPointsToArena and ExprPointsToArena relations; this is handled by the fixpoint semantics of Datalog. Finally, in line 8, we collect all the problematic calls to free on an arena-allocated pointer in the relation FreeOfArenaPtr.
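To see how fixpoint semantics resolves the circular dependency between the two relations, the following Python sketch runs a naive fixpoint over toy facts. The expression and variable IDs are invented for illustration (they model a program like `p = aalloc(a); q = p; free(q);`); Clog's actual evaluation is semi-naive and operates on Clang AST nodes:

```python
# Toy facts standing in for pattern-match results (initializer rule omitted):
arena_calls = {"e1"}                        # e1 is the aalloc(...) expression
assigns = [("p", "e1"), ("q", "e2")]        # <..$p = $e..> matches
var_refs = [("e2", "p"), ("e3", "q")]       # expression ei references variable
free_calls = [("f1", "e3")]                 # free($p) with argument e3

expr_pta, var_pta = set(arena_calls), set()
changed = True
while changed:                              # naive fixpoint iteration
    changed = False
    for var, expr in assigns:               # VarPointsToArena from assignments
        if expr in expr_pta and var not in var_pta:
            var_pta.add(var); changed = True
    for expr, var in var_refs:              # ExprPointsToArena from variable uses
        if var in var_pta and expr not in expr_pta:
            expr_pta.add(expr); changed = True

# FreeOfArenaPtr: calls to free whose argument points into an arena.
free_of_arena = {call for call, arg in free_calls if arg in expr_pta}
print(sorted(var_pta), free_of_arena)  # ['p', 'q'] {'f1'}
```

Note that deriving `q` requires a second iteration: the fact `ExprPointsToArena(e2)` only becomes available after `VarPointsToArena(p)` is derived, which is exactly the mutual recursion the fixpoint resolves.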
As hinted earlier, this checker does not catch cases where the allocation and the call to free happen in different functions. To catch the error on line 9, we need to track the flow of values through function calls. We achieve this by appending two rules to the program:

9 Call(call, $actual, $formal) :- call ⟨..$c(.., $actual, ..)..⟩,

The first rule (line 9) defines the Call predicate, which contains the formal and actual argument pairs for each call expression. We use the index function to retrieve the index of a term in a list, in this case the index of the actual and formal argument. The second rule (line 13) states that if an actual argument is an expression pointing to an arena, then the formal argument points to an arena.
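The pairing of actuals and formals by index, and the propagation rule across the call boundary, can be sketched as follows. All call sites, callees and argument names here are hypothetical, standing in for matched program terms:

```python
# Hypothetical call site c1 and callee foo, modeling:
#   foo(x, buf);   with   void foo(int n, char *dst) { ... }
actuals = {"c1": ["a_x", "a_buf"]}           # call expression -> actual args
formals = {"foo": ["f_n", "f_dst"]}          # function definition -> formals
callee_of = {"c1": "foo"}                    # simplified decl() lookup

# Call(call, actual, formal): actual and formal share the same index.
call_rel = [(c, a, formals[callee_of[c]][i])
            for c, args in actuals.items()
            for i, a in enumerate(args)]

# Rule on line 13: an arena-pointing actual makes its formal arena-pointing.
expr_points_to_arena = {"a_buf"}
var_points_to_arena = {f for _, a, f in call_rel if a in expr_points_to_arena}
print(call_rel)               # [('c1', 'a_x', 'f_n'), ('c1', 'a_buf', 'f_dst')]
print(var_points_to_arena)    # {'f_dst'}
```

The index-based join mirrors Clog's use of the built-in index function to align each actual argument with its formal counterpart.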
The rule on line 14 defines the Return predicate, which maps all call expressions to the expressions returned by the $callee. On line 16 we illustrate the use of restricted syntactic patterns, where the pattern @$callee ⟨..return $val;..⟩ matches only the return statements from $callee.
We observe that we can develop the checker incrementally, by appending new rules, to transform it into an interprocedural analysis. Although the checker is far from being sound or complete, it already covers interesting cases and it is a starting point for incremental development, for example by adding call-site sensitivity. Adding various flavors of context-sensitivity to Datalog analyses has already been demonstrated by tools like Doop [6], thus we do not explore this direction as part of this work.

Control Flow: API Protocol for MPI
A common restriction on library APIs is that functions need to be called in a certain order. For example, the OpenMPI library expects that each call to the non-blocking send function MPI_Isend is followed by a call to MPI_Wait (or similar). In Figure 6 we show four examples of API misuse, where the calls to MPI_Isend are not paired with calls to MPI_Wait on the same request handle, req.
In Figure 7 we implement a checker to detect these cases of API misuse. First, the checker identifies the involved API calls using syntactic patterns (lines 1 and 3) and defines two predicates, ISend and Wait, which enumerate these call expressions (c) together with the variable that stores the request handle (req). In the next two rules (lines 5-7), we inductively define the ISendChain relation, which maps an ISend call to its control-flow successors that are not Wait calls with the same request handle. For defining the ISendChain relation we rely on the CFG_SUCC built-in predicate, which maps s to all its control-flow successors t. On line 9, we emit a warning whenever there exists a control-flow path on which two ISend calls with the same handle occur without a Wait in between (Figure 6, bad2). On line 10 we emit a warning if, starting from an ISend call, we can build a path that reaches a function exit but does not contain a matching Wait call. To identify the function exits we rely on the enclosing_function built-in function, which maps a term to its enclosing function definition, and on the CFG_EXIT predicate, which maps a function definition (f) to all its exits (exit).
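The inductive definition of ISendChain amounts to a reachability computation over the CFG that stops at matching Wait calls. The following Python sketch models this on a toy CFG (all node names and the handle are hypothetical, modeling the bad2 case from Figure 6):

```python
# A toy CFG for the "bad2" case: two MPI_Isend calls on the same request
# handle with no MPI_Wait in between. Node names are hypothetical.
cfg_succ = {"s1": ["n1"], "n1": ["s2"], "s2": ["exit"], "exit": []}
isend_handle = {"s1": "req", "s2": "req"}   # ISend(call, handle)
wait_handle = {}                            # Wait(call, handle): none here

# ISendChain(c, t): t is CFG-reachable from the ISend call c without
# passing a Wait on the same handle (cf. lines 5-7 of Figure 7).
chain = {(c, c) for c in isend_handle}
frontier = set(chain)
while frontier:
    new = set()
    for c, s in frontier:
        for t in cfg_succ.get(s, []):
            if wait_handle.get(t) != isend_handle[c] and (c, t) not in chain:
                new.add((c, t))
    chain |= new
    frontier = new

# Line 9: a second ISend on the same handle is reachable without a Wait.
warn_double_isend = {(c, t) for (c, t) in chain
                     if t != c and isend_handle.get(t) == isend_handle[c]}
# Line 10: the chain reaches the function exit without a matching Wait.
warn_missing_wait = sorted(c for (c, t) in chain if t == "exit")
print(warn_double_isend, warn_missing_wait)  # {('s1', 's2')} ['s1', 's2']
```

Adding a `Wait` fact for a node between `s1` and `s2` would cut the chain there and silence both warnings for `s1`.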

Language Overview
As we have seen earlier, the Clog language is an extension of standard Datalog. We extend the syntax for body literals to allow for pattern literals, and the syntax for terms to allow for function application and a special form for predicate references. In Figure 8 we give the full syntax of the language.
The Clog language lacks explicit type declarations, but its predicates are statically typed. We rely instead on monomorphic type inference to deduce the types of predicates and variables. The possible types for variables are: Integer, String, ASTNode and PredRef. The ASTNode type represents program terms, thus all metavariables have this type. PredRef is the type of predicate references.

Pattern Literals
Pattern literals are predicates over the abstract syntax tree of the analyzed program. A pattern literal @s r ⟨..C..⟩ consists of a syntactic pattern, ⟨..C..⟩, and two optional parts: the root, r, and the subterm restriction, @s. The syntactic pattern matches terms in the analyzed program. When a match occurs, the metavariables in the pattern are bound, as is the optional root variable, r, which binds the whole matched term.
The optional subterm restriction, @s, restricts the matching to a strict subterm of the term s. For example,

⟨..while ($cond) $body..⟩, @$body ret ⟨..return $e;..⟩

matches all return statements occurring in the body of a while loop, binding them to the variable ret and their respective return expressions to $e. The pattern ⟨..C..⟩ can have any of the following syntactic categories: expression, statement, declaration or function definition. Terms in the pattern are either concrete, following the C grammar, or they are left abstract and replaced with a metavariable. Metavariables can be used in place of concrete terms from the following syntactic categories: identifier, init-declarator, parameter-declaration, expression, statement. In places where a list of these is required, but the list elements are not relevant, a gap (..) can be used. For example, ⟨..printf(.., $e, ..)..⟩ matches any function call to printf and binds its arguments, in turn, to $e. The occurrence of the call expression printf("Hello %s!", name) results in two pattern matches: one where $e binds the string literal "Hello %s!" and another where it binds the identifier name.
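The gap semantics for the printf example can be modeled directly: with a gap on each side of $e, the metavariable may bind any single argument, so a call with n arguments yields n matches. A minimal sketch (the tuple encoding of calls is an assumption for illustration):

```python
# Sketch of gap semantics for the pattern <.. printf(.., $e, ..) ..>.
def match_printf_arg(call):
    """Yield one binding of $e per argument of a printf call."""
    callee, args = call
    if callee != "printf":
        return
    for arg in args:            # the gaps (..) absorb the surrounding args
        yield {"$e": arg}

call = ("printf", ['"Hello %s!"', "name"])
bindings = list(match_printf_arg(call))
print(bindings)  # [{'$e': '"Hello %s!"'}, {'$e': 'name'}]
```

Each binding corresponds to one tuple contributed by the pattern literal to the enclosing rule's body.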

Built-in Predicates
Our implementation assigns special semantics to a set of predicates. These can be grouped into three categories: control-flow predicates, I/O predicates and infinite predicates.
Control-flow predicates expose the intra-procedural control-flow graph to the Clog program. While traversing the control-flow graph is achievable by using only syntactic patterns, this increases the verbosity of the code; therefore we expose the following predicates:
• CFG_SUCC(s, t) maps the term s to its successors t in the control-flow graph. Since the C language leaves the order of evaluation unspecified in some cases (e.g. subexpressions, function arguments), the CFG_SUCC relation represents only one of the possible orderings. The variable s must be bound by other literals in the same clause.
• CFG_EXIT(f, exit) maps the function definition f to all its exits, exit. The variable f must be bound by other literals in the same clause.
The I/O predicates enable the Clog program to read or write relations in a tabular format (CSV or SQLite3); they are most frequently used to read analysis parameters or to output analysis results. Like the I/O predicates, the infinite predicates are identical to the ones used by JavaDL, and therefore we refer the reader to [12]. As syntactic sugar, Clog provides infix notations for comparison (==, <=, etc.) and variable binding (=).

Built-in Functions
Clog provides a set of predefined functions. These functions are free of side effects and may be used inside the operands of the comparison predicates or as the right operand of the = predicate. Clog requires that the arguments to these functions are either other expressions (i.e. arithmetic or function application) or bound variables. The purpose of these functions is to expose properties of the analyzed program that are not expressible through syntactic patterns.
2.6.1 Type and Name Analysis Functions. While type and name analysis for C can be expressed as a Datalog program, doing so bloats the analysis program and hinders readability. Therefore, Clog exposes these semantic properties through predefined functions5.
• type(e): the type of the expression e.
• decl(id): the declaration of the identifier id.
Because not all C constructs have a type and, for incomplete C programs, not all identifiers have a declaration, the type and name functions are partial. In such cases, they evaluate to a special value, undef.

2.6.2 Program Structure Functions. Clog provides convenience functions that enable the traversal of the program structure:
• parent(t) retrieves the parent term of t, if it exists.
• enclosing_function(t) retrieves the enclosing function of the term t, if it exists.
2.6.3 Names. Clog programs are generic over the set of names, and thus identifiers can be replaced with metavariables inside syntactic patterns. However, metavariables can bind terms that may or may not have a name; for example, the pattern ⟨..$l + $r..⟩ matches the expression a + 1 and only the metavariable $l binds an identifier. To retrieve a name, we introduce a function name(t) that maps the term t to its name if it is an identifier or a named declaration, or to the empty string otherwise.
2.6.4 Control-Flow Functions. In addition to the control-flow predicates, Clog defines the cfg_entry(t) function that maps a term t to the entry node of its control-flow graph.
Terms that have a CFG are function definitions, statements and expressions.
2.6.5 Source Location Functions. To support report generation, Clog defines a set of functions that retrieve the source location of a program term t: src_line_start(t), src_col_start(t), src_file(t), etc.

Overview
We built the Clog prototype from two major components:
1. A Datalog implementation that we extended with support for syntactic patterns, functions and built-in predicates.
2. A Clang library (Clang-Clog), which provides support for pattern matching, CFG predicates and built-in functions.
We chose Clang as the parser for the analyzed program, rather than writing our own C parser, because we wanted Clog to be able to analyze real-world C programs that may use language extensions beyond the standard C grammar. Moreover, by choosing a mature compiler such as Clang, we also gain access to other standard compiler analyses, such as name and type analysis and CFG construction.
In its implementation, Clog reuses infrastructure from previous systems that combine syntactic pattern matching and Datalog: MetaDL [10] and JavaDL [12]. Both of these tools implement syntactic pattern matching in Datalog and translate the AST of the entire analyzed program to Datalog relations. We experimented with the same approach in an early Clog prototype, but it proved impractical: for C programs the AST contains nodes for all (transitively) included files, and serializing the AST into Datalog relations dominated the running time of Clog. Instead, we opted to use Clang's own AST matching infrastructure, provided by the LibASTMatchers6 library. This way, Clog can perform pattern matching directly on the Clang AST and only serialize to relations the AST fragments that match.
LibASTMatchers provides a domain-specific language (DSL) for building up a tree of matchers from a predefined set of base matchers. For example, an exact matcher for the variable declaration int x; is

varDecl(hasName("x"), hasType(qualType(isInteger())), unless(hasInitializer(anything()))).bind("$m")

where varDecl matches a VarDecl AST node; hasName and hasType are predicates on its name and type; unless is a matcher that succeeds when its inner matcher fails (in this case, hasInitializer). The bind attribute specifies a name for the matching AST nodes.
To use the LibASTMatchers library, Clog translates the syntactic patterns to AST matchers. However, the Clog pattern grammar and the Clang abstract grammar have been designed for different purposes. On one hand, the goal for the Clog pattern grammar is to be close to the C grammar even after adding metavariables. On the other hand, the Clang grammar is optimized for compiler analyses.
To a C programmer, the pattern ⟨..$t $f, *$g..⟩ matches two variable declarations without any initializer: a variable $f with type $t and another variable, $g, with type pointer-to-$t. In terms of the C grammar, this means that the type of $g is split between two non-terminals: the type specifier $t and the declarator *$g. However, the Clang AST represents this using two disjoint entities: a VarDecl which has a QualType as its type attribute. This is reflected in the LibASTMatchers DSL as the matcher

varDecl(hasType(qualType(pointsTo(qualType().bind("$t")))), unless(hasInitializer(anything()))).bind("$g")

In the AST matcher, as in the Clang abstract grammar, the qualType matcher is an attribute of varDecl, while in the C grammar these nodes reside in different subtrees of the declaration non-terminal. This means that building the Clang AST matchers in the parser's semantic actions is not feasible, so Clog uses three intermediate translation steps. First, it parses the syntactic patterns to an internal AST, which contains the same non-terminals as the C11 grammar, extended with metavariables. Secondly, it translates the internal AST to a pseudo-Clang AST, which contains the same AST nodes as the Clang AST, but also allows concrete nodes from abstract grammatical categories such as expression (clang::Expr), statement (clang::Stmt), declaration (clang::Decl) or qualified type (clang::QualType) to be marked as being a metavariable or a gap. Thirdly, Clog traverses the pseudo-Clang AST and generates the AST matchers.

Limitations.
A fundamental limitation of Clog is that it matches the syntactic patterns against Clang ASTs, which means that metavariables always bind Clang AST nodes. Thus, we configured the pattern grammar generation to allow metavariables only for the terminals and non-terminals that have a corresponding Clang AST node. For example, Clog rejects patterns that have metavariables in place of qualifiers, such as ⟨..$q int *$f..⟩, because Clang represents qualifiers as fields of the clang::QualType node. However, the pattern ⟨..const int *$f..⟩ is valid, because Clog can check that the AST node bound by $f is a pointer to a const-qualified integer. While the use of the Clang AST may enable future extensions of Clog to analyze C++ programs, this comes with challenges. The most significant challenge is that the language must introduce a distinction between terms existing in the source and terms arising from template instantiation, auto type deduction and default methods.

Clog at Runtime
Figure 9 provides a runtime view of the Clog implementation. The evaluation of a Clog program starts with the parsing of the analysis code; the parsing of syntactic patterns is deferred to the pattern parser. The semantic analysis phase then ensures that the program is well formed. This is followed by a plan generation phase, where Clog generates an evaluation plan according to the semi-naive evaluation strategy [31]. Plan evaluation relies on the Clang-Clog library, which handles the parsing of the analyzed sources and builds the ASTs for all sources. In addition to building the ASTs, it supports pattern matching through Clang's LibASTMatchers library, the evaluation of built-in functions and CFG queries.
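The core idea of semi-naive evaluation is to join only the tuples derived in the previous round (the delta) against the base relations, instead of re-deriving everything from scratch each round. As a minimal, illustrative Python sketch (transitive closure standing in for an arbitrary recursive Clog relation):

```python
# Semi-naive evaluation of transitive closure: each round joins only the
# newly derived tuples (the delta) with the edge relation.
def transitive_closure(edges):
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Only delta tuples can contribute tuples not already derived.
        new = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = new - closure
        closure |= delta
    return closure

edges = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(edges)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Naive evaluation would instead re-join the full closure with the edges each round, repeating all earlier derivations; the delta-based formulation is what makes recursive rules practical on large fact sets.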

Pattern Parser.
The Clog parser defers the parsing of patterns to an Earley parser [29], which is capable of handling general context-free grammars and ambiguity. Support for ambiguity is necessary, since the patterns lack context to disambiguate cases such as t * a; or f(x);8. The pattern grammar follows the C grammar given in Annex A of the C11 language standard [1]. This grammar is automatically extended when Clog is built, to accept metavariables and gaps for a configurable set of non-terminals. The result of the pattern parsing phase is an AST with nodes from the internal grammar.

Pattern Translation.
During the pattern translation phase, Clog translates the pattern ASTs from the internal grammar to a pseudo-Clang grammar.

Matcher Generation.
In this phase, Clog traverses the pseudo-Clang AST and generates matchers in the LibAST-Matchers DSL.
The translation from the pattern ASTs to the AST matchers is not always one-to-one. Due to ambiguities in the C language, Clog generates two AST matchers for the pattern ⟨..$t *$p..⟩: one for the pointer declaration and another for the multiplication. This is reflected in the evaluation plan as a disjunction of two literals.

Matcher Registration.
In this phase, Clog registers the matchers with the Clang-Clog library. The Clang-Clog library parses the matcher DSL, builds the AST matchers and returns a unique identifier for each matcher, which the Datalog engine uses during the plan evaluation phase to retrieve the results of the match. Clog registers each matcher only once, to avoid the cost of parsing the matcher DSL each time it evaluates a pattern literal.
The Clang-Clog library supports three kinds of pattern matchers:
• node matcher: attempts to match one given node;
• subtree matcher: finds all matches in all descendants of one given node;
• global matcher: finds all matches across all ASTs.
All pattern literals without a subtree restriction or a root node, of the shape ⟨..C..⟩, correspond to global matchers. For the ones that have a root literal but not a subtree restriction, r ⟨..C..⟩, Clog generates a node matcher if the root variable r is bound by other literals in the clause; otherwise, it generates a global matcher. Patterns with a subtree restriction, @s r ⟨..C..⟩, result in subtree matchers.

Plan Generation and Evaluation.
In this phase, Clog translates the Datalog rules to an evaluation plan. The operations of this plan are similar to those of the Relational Algebra Machine used by the Soufflé Datalog engine [27].
In contrast to the approach in [12], where the entire AST of the analyzed program is materialized as Datalog relations, in Clog we have implemented an on-demand approach, where we only materialize the AST nodes that are matched by the syntactic patterns and the ones that are exposed through the CFG predicates and built-in functions.
In addition, for pattern literals, we optimize for cases where the root variable of the pattern is bound by a literal occurring earlier in the clause.In this case, Clog runs a node matcher on the root node, instead of a global matcher on the entire AST.In the current implementation, the plan generator preserves the order of literals in the clause, with the exception of the infinite and binding predicates, which it may reorder.
For the CFG_SUCC and CFG_EXIT predicates, the Clang-Clog library lazily computes the CFGs only for functions for which these predicates are queried.In effect, this means that the CFG is computed only for functions relevant to the analysis.
The evaluation of the analysis proceeds with running the global matchers. This is handled by the Clang-Clog library using Clang's AST matching API, which runs all the global matchers in a single traversal over the whole AST. This stage is followed by the execution of the Datalog plan. In the final stage, Clog writes the output relations to disk.

Evaluation
In this section we ask the following questions:
RQ1: How well can Clog express code checkers with good precision and recall?
RQ2: How does the execution time of Clog compare to other tools?
To answer these questions, we implemented several checkers in Clog and ran them on synthetic benchmarks and on real programs. We compared our results against the Clang Static Analyzer (CSA). We chose CSA because both CSA and Clog have access to the same underlying AST, and we wanted to assess analysis expressivity and speed without comparing the underlying AST representations. To compare with CSA9, we ran the relevant checkers using the clang-tidy frontend. We used an AMD EPYC 7713P 64-core processor with 504 GB RAM for our evaluation.

Synthetic Benchmarks
In line with earlier work [8, 32], we adopted the Juliet 1.3 test suite10 in our evaluation. Juliet is a collection of synthetic tests aimed at assessing static analysis tools.

Experimental Setup.
To evaluate the expressivity of Clog, we attempted to implement checkers for the first 15 weaknesses listed in the 2023 CWE Top 25 Most Dangerous Software Weaknesses published by MITRE11 and ran these checkers on Juliet. For the cases where the Juliet suite did not contain tests for a specific weakness, we exploited the hierarchical organization of the CWE database and used the tests for the weakness' direct descendants. We present our mapping from listed CWEs to Juliet test sets in Table 1.
For each Juliet test set corresponding to a CWE, we implemented a checker in Clog. We explicitly list the enabled checkers in Table 2. In our analysis development, we followed an iterative process in which we aimed to increase each checker's recall without letting its precision drop below that of the corresponding Clang checker.
4.1.2 Discussion. In Table 2 we present a comparison of the precision, recall and running times of the CSA and Clog checkers.
To detect CWE-78 OS Command Injection we implemented an inter-procedural data-flow analysis.CSA did not produce any warnings, even though we configured the taint analysis to use the same propagation rules as Clog.We adapted the same checker for CWE-134 Uncontrolled Format String, with a different set of taint sources and sinks.
To achieve high recall on the test sets CWE-121 Stack-Based Buffer Overflow, CWE-122 Heap-Based Buffer Overflow, CWE-124 Buffer Underwrite, CWE-126 Buffer Overread and CWE-127 Buffer Underread, the checkers need to perform constant propagation or numerical-domain analysis. Both of these analyses are difficult to encode in Datalog without general lattice support, which is a limitation of our implementation.
These checkers also brought to light a mismatch between the pattern grammar and the Clang AST grammar. The pattern for matching array declarations, ⟨.. $t $a[$n] ..⟩, allows an expression for the array length $n. However, the Clang AST contains two nodes for arrays: one for fixed-length and one for variable-length arrays. The fixed-length array node has no child node encoding the length; instead, it stores the length directly as an integer field. In effect, there is no AST node for the metavariable $n to bind, which breaks our assumption that each metavariable binds an AST node. Fortunately, this is not a fundamental limitation of our translation scheme, since it can be circumvented by adding a custom AST matcher that instantiates a constant integer AST node when matching the size of a fixed-length array.
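Concretely, the two array forms below illustrate the mismatch; ConstantArrayType and VariableArrayType are the corresponding Clang node names, and the function names are ours.

```c
#include <stddef.h>

/* Both declarations match the pattern <.. $t $a[$n] ..>, but in the Clang
 * AST only the variable-length array gives $n a node to bind. */
size_t fixed_len(void) {
    int buf[16];      /* ConstantArrayType: the length 16 is an integer
                       * field of the node, not a child expression */
    return sizeof buf / sizeof buf[0];
}

size_t vla_len(int n) {
    int buf[n];       /* VariableArrayType: 'n' is a child expression node,
                       * so $n can bind to it */
    return sizeof buf / sizeof buf[0];
}
```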
In the CWE-416 Use After Free checker we implemented an intra-procedural data-flow analysis. An intra-procedural analysis proved sufficient because freed pointers are reported as soon as they are passed as arguments or used as return values.
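A minimal instance of the pattern this checker flags, with function names of ours chosen for illustration:

```c
#include <stdlib.h>
#include <string.h>

/* CWE-416: the freed pointer later escapes as a return value, which the
 * intra-procedural checker reports without needing to follow the caller. */
char *stale_copy(const char *src) {
    char *p = malloc(strlen(src) + 1);
    if (p == NULL)
        return NULL;
    strcpy(p, src);
    free(p);
    return p;        /* use after free: flagged */
}

/* Fixed variant: the buffer handed to the caller is not freed here. */
char *fresh_copy(const char *src) {
    char *p = malloc(strlen(src) + 1);
    if (p == NULL)
        return NULL;
    strcpy(p, src);
    return p;        /* caller owns p and frees it */
}
```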
To detect CWE-476 Null Pointer Dereference we implemented an inter-procedural data-flow analysis. To reduce false positives, we added rules that exclude paths dominated by a null test of a variable. One of these rules defines a predicate, NotNullPath, which marks that a variable d is not null on paths starting from s.
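For illustration, the dereference below sits on a path dominated by a successful null test, so a refinement such as NotNullPath suppresses the report; the function name is ours.

```c
#include <stddef.h>

/* The null test dominates every path reaching the loop, so the variable q
 * is known to be non-null there and the dereference q[n] is not reported. */
int safe_len(const char *q) {
    if (q == NULL)
        return -1;    /* all paths below have q != NULL */
    int n = 0;
    while (q[n] != '\0')
        n++;
    return n;
}
```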
To achieve good precision and recall on the CWE-190 Integer Overflow test set, the analysis again needs to perform numerical-domain analysis, the same Clog limitation we observed for the buffer-access checkers.

In Table 3 we list the sizes of the Clog programs, counted as numbers of rules. We were able to implement all checkers concisely: none required more than 35 rules, and most used 20-30. We also note that, for all checkers, the number of syntactic patterns is close to the number of rules, which shows that patterns are a well-utilized language feature.
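As an illustration of the numeric reasoning CWE-190 demands, a sound pre-check for signed addition must compare operands against the type's bounds, a relation over a numeric domain rather than over a finite set of syntactic facts. The helper below is ours, for illustration.

```c
#include <limits.h>

/* CWE-190: deciding whether a + b overflows requires reasoning about the
 * numeric ranges of a and b, which pure Datalog relations over syntactic
 * facts cannot express without lattice support. */
int add_overflows(int a, int b) {
    return (b > 0 && a > INT_MAX - b) ||
           (b < 0 && a < INT_MIN - b);
}
```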
Looking back at RQ1, we conclude that Clog is expressive enough to encode high-precision checkers for typical data-flow analyses, including inter-procedural ones. However, we encountered difficulties in achieving good recall with the checkers for buffer accesses. This class of vulnerabilities highlighted a limitation of Clog: it cannot express analyses that use lattices other than the power set. Fortunately, this is a known limitation of Datalog and is addressed by approaches orthogonal to ours [22,30].
Turning to RQ2, we observe in Table 2 that the running times of Clog are about 2-3 times slower than those of CSA. This is not surprising given that we implemented Clog partly as a Java application and partly as a native library: the two do not share a heap, so all pattern-matching results must be copied from the native heap to the Java heap. Despite this, the fact that Clog is fully declarative enables further optimization approaches, such as parallelizing the evaluation engine or incrementalization in the style of IncA [30] or JavaDL [12].

Realistic Workloads
Inspired by a study on the effectiveness of static C code analyzers [21], we reused programs from the Magma v1.1 fuzzing benchmark [14] as targets for our testing. Magma contains a set of programs with known vulnerabilities and their respective fixes. Conveniently, these fixes can be enabled or disabled through a preprocessor symbol. To avoid assumptions about how the location of a fix corresponds to the location of a report, we adopt a differential approach: we compare the reports between the fixed and the faulty versions of a program.
From the checkers we implemented for the Juliet benchmark, we selected those with good recall and full precision that also have corresponding vulnerabilities in the Magma programs: CWE-416 and CWE-476.
From the Magma programs, we selected openssl, sqlite, libxml2 and libpng. We were not able to properly extract the compilation commands for libtiff and php, and we discarded poppler for being mostly a C++ project. In Table 4 we present the results of running the Clog and CSA analyses on the selected Magma programs. For CWE-476, both checkers discovered a real issue in sqlite, while for openssl only CSA succeeded. For the sqlite issue, CSA generates one report, while Clog generates four: CSA reports only the first dereference of a null pointer, while Clog reports all of them. In the openssl case, our implementation of the null pointer dereference checker does not handle uninitialized variables, while CSA does. This is a limitation of our analysis set, but not of Clog itself, since an uninitialized-variable checker is another instance of data-flow analysis, a class of analyses we have shown Clog can express. For CWE-416, both CSA and Clog fail to find any use-after-free vulnerabilities, even though one is present in each of the libxml2, sqlite and openssl programs. In contrast to our results on the Juliet test suite, the running times for Clog are significantly better than those for CSA.

Related Work
Declarative approaches to the static analysis of program semantics have a rich history based on a variety of techniques, including attribute grammars [17], pattern-matching on algebraic data types [2], term rewriting [16], logical queries [23], flow-sensitive types [13], and combinations such as Cobalt's use of integrated modal logic and term rewriting [20].
Most approaches operate on some representation of the AST and/or CFG, often involving intermediate code; custom extensions must then follow these (generally tool-specific) abstractions [3,24,28]. This is a well-understood challenge in the field of API protocol checking [5,13], which attempts to identify API-specific bugs. Some tools try to address this challenge by encoding analysis rules in an internal DSL (e.g., embedding them inside API code via Java annotations [5]). However, these approaches are limited by the host language's annotation facilities.
External DSLs based on syntactic pattern matching therefore offer an appealing alternative. For C, Coccinelle [18] offers facilities for code matching and transformation based on syntactic patterns. In contrast to Clog, Coccinelle's pattern DSL requires the user to explicitly declare metavariables and their syntactic categories. Coccinelle provides a scripting interface (Python and OCaml) that users can combine with pattern matching to build custom analyses and bug-finding tools [19]. This scripting interface serves approximately the same purpose as the Datalog language in Clog.
Other tools that combine syntactic pattern matching with logical queries include SOUL [9], which combines AST pattern matching with Prolog-style logic programs to analyze Java programs, though SOUL restricts patterns to five predefined syntactic categories, and JavaDL [12], which is closest to our work in spirit in that it offers Datalog-style rules and syntactic pattern matching for Java programs. Clog reuses parts of the JavaDL infrastructure, specifically the grammar transformation mechanism and JavaDL's Datalog implementation. Unlike Clog, JavaDL offers no built-in predicates for CFG traversal, which limits its ability to express flow-sensitive analyses. Internally, JavaDL performs pattern matching by encoding ASTs as tables of Datalog facts and relying on an external Datalog engine, while Clog translates syntactic patterns into queries for the Clang AST matching mechanism and materializes only the results of these queries as Datalog tuples.
Clog is not the first Datalog-style tool for C: cclyzer [4] implements declarative points-to analyses for C and C++ in Datalog. Compared to Clog, which works on the AST, the cclyzer analyses are implemented in terms of LLVM IR instructions. Like JavaDL, cclyzer relies on the high-performance Datalog engine Soufflé [27]. Despite its wide adoption for program analysis tasks [4,6], Soufflé itself does not extend the Datalog language with features particularly geared towards program analysis.

Conclusions
We have shown that the Clog language can express powerful custom checkers for C code without exposing program representation internals. While our experiences suggest that implementing a language like Clog on top of Clang may require nontrivial internal plumbing to align program structure and control flow, we find that the cost of this abstraction is modest: Clog-based checkers may even run faster than Clang's own checkers while delivering competitive results. Like most Datalog-based approaches, Clog is limited to boolean (power-set) lattices but scales to a wide variety of practical analyses, including interprocedural and data-flow analyses. Overall, we argue that Clog's combination of language simplicity, expressivity, and performance makes it uniquely suited for building custom C code checkers.

Figure 2. Implementation of a goto check in CodeQL

Figure 3. Implementation of a goto check in Clog

Figure 4. Wrong uses of an arena allocator API

Figure 8. The syntax of the Clog language

Figure 9. Phases of a Clog checker evaluation

Table 1. A mapping from common weaknesses (CWEs) to Juliet test sets. We omit CWEs that are not relevant to C programs.
The CWE-190 limitation also affects the test sets for CWE-121, CWE-122, CWE-124, CWE-126 and CWE-127. Detecting cases of CWE-123 Write-What-Where Condition requires heap modeling, which is out of the scope of this work; CSA also fails to report any warnings on this test set. The test sets for CWE-114 and CWE-785 use the Windows API, while CWE-23 and CWE-36 contain only C++ sources; we did not implement checkers for these test sets.

Table 3. Predicate, rule and pattern literal counts for Clog programs

Table 4. CSA and Clog report numbers and running times on Magma test programs. V-columns refer to the vulnerable version, F-columns to the fixed version.