ParDiff: Practical Static Differential Analysis of Network Protocol Parsers

Countless devices all over the world are connected by networks and communicate via network protocols. Just like common software, protocol implementations suffer from bugs, many of which only cause silent data corruption instead of crashes. Hence, existing automated bug-finding techniques focused on memory safety, such as fuzzing, can hardly detect them. In this work, we propose a static differential analysis called ParDiff to find protocol implementation bugs, especially silent ones hidden in message parsers. Our key observation is that a network protocol often has multiple implementations and any semantic discrepancy between them may indicate bugs. However, different implementations are often written in disparate styles, e.g., using different data structures or written with different control structures, making it challenging to directly compare two implementations of even the same protocol. To exploit this observation and effectively compare multiple protocol implementations, ParDiff (1) automatically extracts finite state machines from programs to represent protocol format specifications, and (2) then leverages bisimulation and SMT solvers to find fine-grained and semantic inconsistencies between them. We have extensively evaluated ParDiff using 14 network protocols. The results show that ParDiff outperforms both differential symbolic execution and differential fuzzing tools. To date, we have detected 41 bugs with 25 confirmed by developers.


INTRODUCTION
Network protocols are crucial for systems that require communication, such as robotic systems and the Internet of Things, to name a few. Network protocols specify the formats of communication messages and the steps that multiple parties must follow in order to communicate. Different implementations have been independently developed for the same network protocol, because of either historical reasons or specific design requirements such as reducing the energy consumption for an embedded system. For example, there are hundreds of different implementations for more than 15 protocols in the Bluetooth family.1,2 The message parser is a critical component of a protocol implementation, responsible for parsing network messages and checking their validity. Unfortunately, network message parsers are error-prone. Numerous bugs have been discovered, which often result in severe system failures, security breaches, and data loss. For example, Heartbleed [Heartbleed 2020], a buffer over-read bug discovered in the Transport Layer Security (TLS) protocol parser of OpenSSL in 2014, affects millions of web servers around the world.3 This vulnerability has allowed attackers to extract sensitive information from servers due to a missing check of the message buffer bounds.
Among numerous bugs in message parsers, a number of them are silent data corruptions that violate protocol-specific properties but do not cause crashes. For example, the vulnerability CVE-2022-26129,4 which is detected by our approach and detailed in the next section, reads data beyond a protocol-specified range in an oversized buffer but does not access data beyond the buffer bounds. As such, it is not a common buffer over-read bug that can cause system crashes. Traditional static or dynamic bug-finding techniques, e.g., symbolic execution [Cadar et al. 2008; Shi et al. 2018; Wei et al. 2023; Xie and Aiken 2005], model checking [Ball et al. 2011; Cho et al. 2013; Musuvathi and Engler 2004], or fuzzing [Godefroid et al. 2012; Haller et al. 2013; Huang et al. 2020], cannot detect it unless they are provided with protocol-specific oracles. Unfortunately, such specific oracles are often not available or entail substantial and error-prone manual effort, because network protocols are usually specified in natural-language documents and lack formal specifications.
To address the oracle problem, differential analysis [Arnaboldi 2023; Badihi et al. 2020; Johnson et al. 2011; Ma et al. 2018; Mora et al. 2018; Person et al. 2008; Verdoolaege et al. 2012] and testing [Bao et al. 2016; Zou et al. 2021] could be useful as they can effectively find domain-specific bugs by comparing different versions or implementations of a system. However, existing approaches have notable limitations. First, in terms of static analysis, approaches like graph differentiation [Johnson et al. 2011] and differential symbolic analysis [Person et al. 2008; Rutledge and Orso 2022; Verdoolaege et al. 2012] are limited to analyzing syntactically similar programs, e.g., programs evolved from the same base version. If programs are substantially different in their syntactic structures, these approaches may produce considerable false positives. Second, in terms of dynamic analysis, while existing differential testing (or fuzzing) [Arnaboldi 2023; Churchill et al. 2019] is capable of finding semantic differences in independent projects, its effectiveness strongly depends on the quality of its inputs, and it often suffers from low code coverage, leading to many false negatives. Additionally, differential testing requires significant manual effort in bug diagnosis, since it is not designed to provide hints on bug locations. Therefore, developers must closely examine execution discrepancies and identify problematic code locations for each input.
ParDiff addresses the limitations of prior work by abstracting protocol implementations into a high-level, unified representation. The unified representation permits precise comparison of fine-grained program behaviors and, in turn, low-effort bug localization.
To identify semantic differences in two syntactically disparate implementations, ParDiff automatically transforms a protocol implementation into a finite-state machine (FSM) that characterizes the message formats. In this abstracted FSM, state transitions are conditioned by first-order logic formulas, which represent format constraints, and ordered by the indices of a message buffer. This high-level abstraction makes ParDiff effective, disregarding the syntactic disparity of multiple protocol implementations. Furthermore, the transformation discards all implementation details irrelevant to parsing and, thus, makes it possible for our tool to handle large programs.
To find bugs, ParDiff combines a bisimulation algorithm and SMT solvers to compare the FSMs extracted from different implementations. The bisimulation algorithm attempts to align state transitions in two FSMs so that we can compare the parsing status in a fine-grained way. That is, our differential analysis checks if the constraints on each pair of aligned state transitions are equivalent. Given that these implementations should follow the same protocol specification, the FSMs should precisely align with each other. Hence, any discrepancy in state transitions between these FSMs indicates a potential bug. This approach effectively breaks down the comparison of large, intricate formulas into multiple comparisons of smaller, manageable ones, enhancing the accuracy and precision of the analysis.
Finally, to reduce the manual effort in bug localization, ParDiff records source code locations during the process of format constraint abstraction. Consequently, every disparity detected in the FSMs can be directly traced back to its source code position, facilitating the precise identification of underlying issues. Notably, compared to common differential testing techniques, this process eliminates the need to scrutinize execution discrepancies across numerous inputs, significantly reducing the human effort to diagnose and localize bugs.
Contributions. In summary, we make the following contributions.
• We propose a novel static differential analysis for locating hidden bugs in protocol parsers.
- It features a path constraint reduction algorithm to isolate the message parsing logic from other functionalities and to generate protocol format constraints.
- It translates format constraints to a compact representation, i.e., constrained finite-state machines, for precise alignment between different implementations.
- It leverages a bisimulation algorithm that precisely projects the differences in the constrained finite-state machines to buggy code at a fine-grained level.
• We implement the proposed approach in ParDiff,5 a practical static differential analyzer for protocol parsers. We have evaluated ParDiff on 14 real-world protocols. The result shows that ParDiff is efficient and capable of analyzing two disparate implementations of a protocol in one minute on average. We have detected a total of 41 bugs, with 25 confirmed or fixed by the developers. Conventional differential symbolic execution cannot finish the analysis due to the path explosion problem. Differential testing can only detect 3 of them in the same time budget (i.e., less than 2 minutes) and can only detect 25 in 24 hours (i.e., using over 720× time cost).
Organization. Section 2 presents a real-world bug to motivate our solution. Section 3 provides an overview of our approach. Section 4 presents the detailed design of ParDiff. Section 5 presents the evaluation, where ParDiff is compared to both static and dynamic analysis tools. Section 6 discusses the related work. Section 7 concludes the paper.

MOTIVATING EXAMPLE
Before diving into the technical details, let us first use a real-world bug found by ParDiff to illustrate what bugs our new approach can discover and the limitations of existing work.

Buffer Accesses Offending Protocol-Specified Bound
We consider a bug found in an implementation of the BABEL network routing protocol (RFC 8966) [Chroboczek and Schinazi 2023]. The program mistakenly accesses a buffer at an index that is out of the bound specified by the protocol, but still within the memory-safe bound due to an over-sized allocation. Therefore, traditional crash-focused approaches face difficulty in detecting it.
Figure 1 shows two BABEL implementations that exhibit this bug. The second implementation (Figure 1b) is the reference implementation [Chroboczek 2023]. The two implementations define different APIs and exhibit substantial syntactic and semantic differences, highlighted in green for syntactic differences and red for semantic differences.
Line 15 of Figure 1a manifests the bug: it checks a less constrained condition, diverging from the condition at line 17 in Figure 1b. As shown at the bottom of Figure 1a, a BABEL packet consists of a prefix, multiple messages, and a suffix. The function parse_update_subtlv in Figure 1a parses a message a, whose length is alen, into a list of type-length-value (TLV) items iteratively. Each loop iteration parses one TLV item. A TLV consists of a type field type (line 5), a length field len (line 12), and a value field, e.g., channels (line 23), whose length is specified by the length field.
The incorrect check (line 15) allows the parser to read bytes beyond the delimited length of the current message (line 23), i.e., extruding into the next message or the suffix of the current message. This does not cause a buffer over-read error in the usual sense, because it does not exceed the allocated bound of the whole buffer. However, it is semantically not allowed to use the content of the second message to interpret the first, as all messages should be independent.
To see a concrete failing case, let us consider a message where a[0]=2, a[1]=2, and alen=2, as shown at the bottom of Figure 1a. Since the length of the message is 2 and a TLV item in the message contains at least two bytes, i.e., the type and the length bytes, we can conclude that the message a contains only one TLV item whose length is 2. This further implies that the TLV item contains an empty value. However, the len field (i.e., a[1]) mistakenly specifies that there is a two-byte value. The incorrect check at line 15 fails to recognize this inconsistency, causing two bytes from the following message to be undesirably copied at line 23. In contrast, the second implementation in (b) correctly checks i + len + 2 <= alen at line 17 and, thus, can filter out this invalid message.
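To make the two checks concrete, the following is a hypothetical Python sketch of the TLV loop, not the actual C sources; the function names, the simplified TLV layout, and the trailing bytes standing in for the next message are all ours.

```python
def parse_buggy(a, alen):
    """Weaker bound check, in the spirit of line 15 of Figure 1a: i + len <= alen."""
    i, values = 0, []
    while i < alen:
        if i + 2 > alen:            # need the type and length header bytes
            return None
        typ, length = a[i], a[i + 1]
        if i + length > alen:       # BUG: forgets the 2 header bytes
            return None
        # may copy bytes that lie beyond alen but inside the oversized buffer
        values.append(bytes(a[i + 2:i + 2 + length]))
        i += 2 + length
    return values

def parse_correct(a, alen):
    """Stronger bound check, in the spirit of line 17 of Figure 1b: i + len + 2 <= alen."""
    i, values = 0, []
    while i < alen:
        if i + 2 > alen:
            return None
        typ, length = a[i], a[i + 1]
        if i + length + 2 > alen:   # header bytes are counted in
            return None
        values.append(bytes(a[i + 2:i + 2 + length]))
        i += 2 + length
    return values

# Failing case from the text: a[0]=2, a[1]=2, alen=2; the two bytes after
# index alen belong to the *next* message in the same oversized buffer.
buf = [2, 2, 0xAA, 0xBB]
print(parse_buggy(buf, 2))      # leaks b'\xaa\xbb' from beyond the message
print(parse_correct(buf, 2))    # rejects the message: None
```

Running the sketch shows the semantic discrepancy directly: the buggy variant silently returns data taken from outside the current message, while the reference-style check refuses the input.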

Why Existing Works Fail
Traditional methods, including symbolic execution (e.g., [Cadar et al. 2008; Shi et al. 2018; Wei et al. 2023; Xie and Aiken 2005]), model checking (e.g., [Ball et al. 2011; Cho et al. 2013; Musuvathi and Engler 2004]), and fuzzing (e.g., [Godefroid et al. 2012; Haller et al. 2013; Huang et al. 2020]), cannot effectively detect this bug. Although the buggy program accesses bytes beyond the range of a message, it does not access bytes beyond the whole buffer. That is, it does not violate common memory-safety properties but a domain-specific correctness oracle, which is either unavailable or very expensive to obtain in practice.
To address the domain-specific oracle problem, dynamic and static differential analyses have been proposed in the literature. Differential testing or fuzzing [Arnaboldi 2023; Reen and Rossow 2020; Zou et al. 2021] feeds the same input into different implementations and compares their execution behaviors. For example, DPIFuzz [Reen and Rossow 2020] encodes the runtime program behavior into a hash and then compares the hashes generated by different implementations on the same inputs. However, such techniques can hardly find this bug due to the following limitations. First, fuzzing techniques require high-quality seed inputs so that they can achieve high code coverage and generate (partially) valid protocol messages that can pass all preceding validation checks before reaching the buggy code. That is, in the aforementioned example (Figure 1), the message should be able to pass the checks at lines 6, 10, 15, and 20 in the first implementation. Since fuzzing techniques are mostly coverage-driven, relying on random mutations, such a simple search strategy is unlikely to generate inputs satisfying the aforementioned conditions. In addition, they may generate a large number of difference-inducing inputs. To comprehend the diverging execution behaviors, humans need to execute each input from the entry function of the message parsers.
This additional effort significantly increases the difficulty of identifying bugs from the differences. Unlike fuzzing, differential symbolic analysis [Badihi et al. 2020; Mora et al. 2018; Person et al. 2008; Ramos and Engler 2011; Rutledge and Orso 2022; Verdoolaege et al. 2012], as a static technique, does not rely on seed inputs and can achieve high coverage of program paths and infer precise constraints over a packet. Due to the well-known path explosion issue in symbolic execution, these techniques often assume that two implementations have few differences and, thus, are primarily used to analyze multiple versions of the same software [Person et al. 2008]. In the example of Figure 1 and in more general situations, two protocol implementations can come from completely different codebases, containing a large number of differences in the code. Such settings break the assumption of differential symbolic analysis and make it impractical.

OUR APPROACH IN A NUTSHELL
In this section, we present an overview of ParDiff and illustrate its capability of detecting the bug in the motivating example.We also discuss how inherent challenges in our approach are addressed.

Overview
ParDiff automatically extracts message-parsing logic from different protocol implementations into finite state machines (FSMs) constrained by message-parsing properties. By lifting programs to FSMs, unessential implementation differences are abstracted away. Then, ParDiff constructs a bisimulation relation [Gentilini et al. 2003] between two FSMs corresponding to two different implementations. The discrepancy between the two FSMs can be reified to a concrete input, indicating a potential bug. Checking discrepancy or equivalence at the FSM level allows us to efficiently compare two implementations in quasi-linear steps and locate buggy code at a fine-grained level. Compared to existing work that attempts to find subtle bugs in protocol parsers, ParDiff has the following advantages:
• Unlike conventional static differential analysis that suffers from scalability issues, ParDiff abstracts away format-irrelevant code and, thus, is capable of handling large programs.
• Unlike conventional dynamic differential analysis that suffers from low code coverage or requires manual effort to recover protocol formats, ParDiff automatically infers precise protocol formats and can reach deep code guarded by complex conditions.
• ParDiff generates inputs that are precisely mapped to buggy source code locations, significantly reducing the human effort to diagnose and localize bugs.

How ParDiff Works
Despite these advantages, it is technically challenging to realize ParDiff in practice due to path explosion and the fine-grained comparison of path conditions. Next, we briefly discuss the steps of ParDiff, together with our solutions to the inherent challenges.
Stage 1: Extracting Protocol Format Constraints (detailed in Section 4.1). A message parser imposes constraints on input messages. The constraints in an execution path encode a message format. Ideally, ParDiff should exhaustively enumerate all paths to generate complete protocol formats. However, this is challenging in practice, because there are an exponential number of paths for any nontrivial program, which is known as the path explosion problem. Moreover, protocol parsers may contain auxiliary code, i.e., routines that are not related to parsing or that only exist in one parser but not the others. Taking these auxiliary routines into account leads to false positives in comparison.
To mitigate path explosion and reduce false positives, ParDiff must select only a finite number of paths that are critical for identifying discrepancies or establishing the equivalence between two parsers. To this end, we first adopt loop unrolling and state merging, as commonly used in symbolic execution and bounded model checking. However, this is not enough, since the extracted format constraints may still contain constraints irrelevant to the parsing logic. Thus, ParDiff also filters out constraints that are not imposed on the input buffer.
In the example of Figure 1, we unroll the loop once and only extract constraints imposed on the input buffer a. In Figure 1(a), one execution path p1 goes through line numbers 4 → 6 → 8 → 4 → 28 and is constrained by φ1, i.e., 0 < alen ∧ a[0] = 0 ∧ 1 ≥ alen. The constraint forms a valid format for the set of messages that only contain a single byte of zero. Similarly, the constraint φ2 of a second path p2 through the loop body indicates a valid format where a[0] = 2. We can also extract similar constraints φ1′ and φ2′ from the second implementation (Figure 1(b)).

Stage 2: Comparing Protocol Format Constraints (detailed in Section 4.2). Assume φ and φ′ are two format constraints obtained from their corresponding implementations. To find differences between the two protocol implementations, it seems straightforward to use an SMT solver to check if φ ≠ φ′ is satisfiable. A satisfying assignment indicates an input message that can be parsed in one implementation but not the other, thereby revealing a semantic difference. However, this monolithic satisfying assignment does not directly inform us which specific lines of the program contribute to the semantic discrepancy. Additionally, to expose all possible discrepancies, we must keep querying the SMT solver for various assignments. Hence, this coarse-grained use of the SMT solver is ineffective in locating the discrepancy between the two implementations (see Section 5.4).
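To make the coarse-grained check above concrete, here is a toy stand-in: two hypothetical format constraints, named phi1 and phi2 by us, compared by brute-force enumeration over a tiny input domain in place of an SMT solver.

```python
from itertools import product

# Two hypothetical format constraints over the first two message bytes
# and the message length (the predicates are illustrative, not extracted
# from the real parsers).
def phi1(a, alen):   # weaker check, like i + len <= alen
    return alen >= 2 and a[1] <= alen

def phi2(a, alen):   # stronger check, like i + len + 2 <= alen
    return alen >= 2 and a[1] + 2 <= alen

def find_discrepancy(max_byte=4, max_len=4):
    """Brute-force stand-in for the SMT query 'is phi1 != phi2 satisfiable?'."""
    for alen in range(max_len + 1):
        for a in product(range(max_byte + 1), repeat=2):
            if phi1(a, alen) != phi2(a, alen):
                return a, alen   # witness: parsed by one side but not the other
    return None

print(find_discrepancy())   # ((0, 1), 2)
```

The witness tells us *that* the constraints differ, but, as the text notes, a single monolithic answer like this does not say *which* atomic conditions (and hence which source lines) are responsible.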
To precisely locate the buggy statements in the protocol parser, ParDiff must identify the specific constraints that lead to semantic differences. This entails finding a matching between two format constraints so that ParDiff can safely ignore those constraints that do not lead to semantic differences. To this end, ParDiff transforms each format constraint into an FSM, which encodes the spatial parsing logic of the protocol format. Edges are conditioned by constraints imposed on the input messages. Transitions between states are ordered by the constrained buffer indices. That is, if one state transition is constrained by the i-th element of a network message, its immediately subsequent state transitions should be constrained by the (i+1)-th element. This ensures that we can align two FSMs by the order in which they constrain the input memory buffer.
Figure 2 illustrates partial FSMs, which are produced based on the format constraints collected from the two implementations in Figure 1. Each state transition describes how one or multiple consecutive bytes are parsed, and each path from the start state to the final state describes how a valid message is parsed.
Stage 3: Locating Implementation Differences and Bugs (detailed in Section 4.3). Given two FSMs produced in the previous stage, we leverage bisimulation [Gentilini et al. 2003] to compare them, which can efficiently find nonequivalent state transitions. For example, in Figure 2, we can establish a bisimulation between S0 → S1 and S0′ → S1′, since their transition conditions are equivalent; however, the transitions S1 → S3 and S1′ → S3′ do not bisimulate each other. Note that the two FSMs in Figure 2 are already isomorphic modulo transition conditions. However, in practice, two FSMs can differ a lot in both structure and transition conditions, and they can still
be compared by bisimulation. Since each state transition in the FSMs contains only constraints over a few message bytes, when bisimulation fails, we can easily locate the implementation differences according to the differences in the state-transition constraints. For example, the aforementioned state transitions differ in the constraints a[1] ≤ alen and a[1] + 2 ≤ alen, which correspond to the buggy code at line 15 and line 17 in the first and the second implementations, respectively.
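The transition-level comparison can be sketched as follows. The FSM encoding, state names, and the brute-force equivalence check (standing in for SMT equivalence queries) are our own simplifications; the two toy FSMs are already aligned pairwise, whereas the real algorithm computes the alignment by bisimulation.

```python
from itertools import product

# Toy partial FSMs in the spirit of Figure 2: transitions map a
# (source, target) pair to a predicate over (a, alen).
FSM1 = {("S0", "S1"): lambda a, alen: a[0] == 2,
        ("S1", "S3"): lambda a, alen: a[1] <= alen}          # weaker (buggy) check

FSM2 = {("S0'", "S1'"): lambda a, alen: a[0] == 2,
        ("S1'", "S3'"): lambda a, alen: a[1] + 2 <= alen}    # stronger check

def equivalent(p, q, max_byte=4, max_len=4):
    """Decide predicate equivalence over a small finite domain
    (a stand-in for an SMT equivalence query)."""
    return all(p(a, alen) == q(a, alen)
               for alen in range(max_len + 1)
               for a in product(range(max_byte + 1), repeat=2))

def diff_transitions(fsm1, fsm2):
    """Align transitions pairwise and report the pairs whose conditions
    are not equivalent, i.e., the pairs that fail to bisimulate."""
    return [(e1, e2)
            for e1, e2 in zip(sorted(fsm1), sorted(fsm2))
            if not equivalent(fsm1[e1], fsm2[e2])]

print(diff_transitions(FSM1, FSM2))   # [(('S1', 'S3'), ("S1'", "S3'"))]
```

Because each transition carries only a small local constraint, the mismatching pair points directly at the two diverging checks, which is exactly the fine-grained localization the text describes.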

DESIGN
This section discusses the details of the three stages in our approach: (1) collecting protocol format constraints (Section 4.1), (2) translating format constraints into a finite state machine (Section 4.2), and (3) comparing state machines to locate possible bugs in different implementations (Section 4.3). At the end of this section, we discuss the soundness of our design (Section 4.4).

Collecting Format Constraints
To locate the implementation differences, ParDiff first collects format-relevant path constraints from multiple protocol implementations. To avoid path explosion, ParDiff unrolls loops (and recursive function calls) up to a constant number of times, and merges path constraints at the join point of multiple paths, instead of enumerating each path in a program. To remove format-irrelevant constraints, ParDiff ignores the constraints not imposed on the input buffer.

4.1.1 Language of Protocol Parsers.
To illustrate how ParDiff works, we use a small language that models a protocol parser. The syntax of the small language is defined in Figure 3. In the language, the entry of a protocol parser is a function parse that takes at least two parameters as inputs. Following the motivating example (see Figure 1), the parameter a is an array of bytes representing the network message to parse. The parameter alen represents the length of the array. The function may also take additional arguments to configure the parser. Values in the language can be either a variable or a constant. An expression can be a value, a binary expression, or an array access of the input message. A program statement in the parser can be an assignment to a variable, an abort statement (modeling errors, e.g., flog_err in Figure 1), a conditional statement, or a sequential composition of two statements. In our language, abort terminates the program. In a parser, the message a is often read-only. Thus, we do not have statements to modify the input message.
The small language is loop-free; when implementing ParDiff, we follow the standard approach in bounded model checking [Biere et al. 2009] to unroll loops. While loop unrolling may introduce unsoundness, ParDiff is effective in terms of bug detection (instead of sound verification), as shown in our evaluation. The language also does not model pointers. In the implementation, we utilize an off-the-shelf pointer analysis [Sui et al. 2011]. The language does not model function calls for the sake of simplicity, since we can inline all functions into the main parser. Before performing the analysis, ParDiff users need to annotate the protocol entry function, the input message buffer a, and its length alen. In practice, identifying their location is relatively straightforward, since the parser entry usually closely follows certain network system calls (e.g., recv, recvfrom, and recvmsg), as shown in prior works [Caballero et al. 2009; Shi et al. 2023].
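As a sketch, the grammar of Figure 3 might be modeled with an AST like the following; all class and field names are our own, since the paper gives only the grammar.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Var:          # a program variable
    name: str

@dataclass
class Const:        # an integer constant
    val: int

@dataclass
class Arr:          # a[e]: read one byte of the input message
    index: "Expr"

@dataclass
class Bin:          # e1 op e2
    op: str
    lhs: "Expr"
    rhs: "Expr"

Expr = Union[Var, Const, Arr, Bin]

@dataclass
class Assign:       # v := e
    var: str
    expr: Expr

@dataclass
class Abort:        # models error exits such as flog_err
    pass

@dataclass
class If:           # if (c) s1 else s2
    cond: Expr
    then: "Stmt"
    els: "Stmt"

@dataclass
class Seq:          # s1 ; s2
    first: "Stmt"
    second: "Stmt"

Stmt = Union[Assign, Abort, If, Seq]

# if (a[0] + 3 > alen) abort else x := a[1]
prog = If(Bin(">", Bin("+", Arr(Const(0)), Const(3)), Var("alen")),
          Abort(),
          Assign("x", Arr(Const(1))))
print(prog.cond.op)
```

This mirrors the restrictions stated in the text: there are no loops, no pointers, no function calls, and no statement form that writes to the message a.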
4.1.2 Format Constraints. Given a program in the language, we collect path constraints relevant to protocol formats, referred to as the format constraint, via static symbolic analysis. The analysis maps each variable to an abstract value v̂ in Figure 4. Specifically, we use ⊤ to denote a format-irrelevant value, i.e., a value that does not depend on the input buffer or its length. We use a[v̂] to represent a byte in a message and alen the message length. The ite construct represents an if-then-else constraint. With the abstract values defined, we then define the output of the analysis, i.e., the format constraint, below.
Definition 1 (Format Constraint). A format constraint φ is a formula over a set of atomic branching conditions6 in a parser, satisfying the following requirements: (1) Each atomic branching condition in φ is a formula over a[v̂] and alen, without any format-irrelevant values. (2) Negating any atomic branching condition corresponds to a possible failure in the parsing procedure, i.e., triggering the abort statement.
• For requirement (1), a format constraint does not contain any format-irrelevant values (i.e., ⊤). During the analysis, a format-irrelevant value is derived from a variable that is neither data-dependent nor control-dependent on the protocol message and, thus, is an implementation-specific variable, e.g., the variable status in Example 1 below.
• For requirement (2), each atomic condition in a format constraint must be related to some validity check in the code. That is, negating the condition could lead to a parsing failure, i.e., triggering the abort statement. Otherwise, the condition is not guarding against invalid formats and, thus, serves other purposes that are irrelevant to parsing. For instance, for debugging, a parser implementation may include a branching statement if (a[0] > 0) { print(. . .); }. Since there are no abort statements in either branch, such a branching condition (a[0] > 0) does not imply the validity of a protocol message. Thus, ParDiff will exclude it from the format constraints.
Example 1. Figure 5 shows a parser, which has three inputs: the message a, its length alen, and a variable status that denotes the system status and is not related to parsing. The right part of the figure shows the structure of the network message. The first a[0] + 3 bytes can be split into four fields. The first field uses one byte and denotes the length of the third field, the data field. The second field uses one byte and determines if the system runs in debugging mode. The third is the data field, whose length is a[0]. The fourth is the ctrl field, whose value must be 1. The parser contains four branching conditions at Lines 4, 7, 11, and 15. The format constraint is a[0] + 3 ≤ alen ∧ a[a[0] + 2] = 1, derived from the conditions at Line 4 and Line 15, as discussed in the next example. The condition at Line 7 is not relevant because it is not related to any byte in the input message. The condition at Line 11 is not relevant, either: while the condition checks if a[1] = 0, neither branch aborts. That is, the validity of the message is not related to the value of a[1]. ✷

4.1.3 Collecting Format Constraints via Path Constraint Reduction. The static analysis for collecting format constraints is described by the inference rules in Figure 6. The inference rules define how each statement s in our language updates the program state in the form of ⌊A, φ⌋ ⇝ ⌊A′, φ′⌋.
Here, ⌊A, φ⌋ and ⌊A′, φ′⌋ are the program states before and after the execution of statement s, respectively. In a program state, φ is the collected format constraint, while the abstract store A maps a program variable to its abstract value. Additionally, the lookup operation on A is also defined for constants: A(c) = c for any constant c, indicating that constants retain their values. A(v ↦ v̂) represents the abstract store updated with the variable v now bound to a new abstract value v̂. The rule Init initializes the program state by mapping the variable alen to the abstract value alen and mapping all variables not related to network messages to a format-irrelevant value. The assignment rules AssignVal, AssignBin, and AssignArr are quite standard. For instance, in the rule AssignBin, if the abstract values of the variables v1 and v2 are v̂1 and v̂2, respectively, the resulting abstract value will be v̂1 ⊕ v̂2. These rules follow the exact semantics of the statements.
The rule Abort resets the path constraint to false because abort terminates the program and cannot reach the exit of the program. The rule Sequencing means that we analyze the program statements in order, using the postcondition of s1 as the precondition of s2.
The rule Branching assumes that the abstract value of the branching condition is ĉ; given the path constraint φ before the if-statement, the analysis result of the true branch is ⌊A1, φ1⌋ and the result of the false branch is ⌊A2, φ2⌋. The program state after a branching structure falls into three cases. In the first two cases, one of the branches cannot reach the join point after the branching. Thus, we directly use the analysis result of the other branch that can reach the join point. A branch may not be able to reach the join point for two possible reasons: (1) the path constraint of that branch is unsatisfiable, or (2) the branch contains an abort statement that terminates the program. In the third case, where both branches can reach the join point, we merge the two abstract stores, guarding the values of variables using the ite operator: if the condition ĉ is true, the abstract value of a variable v is A1(v); otherwise, it is A2(v). The path constraint is the disjunction of the path constraints from the two branches.
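The three cases of the Branching rule can be sketched as follows, under toy representation choices that are entirely ours: an abstract store is a dict, constraints are strings or tuples, and an aborted or infeasible branch carries the constraint False.

```python
def merge(cond, state_true, state_false):
    """Merge the program states of the two branches at the join point."""
    (A1, phi1), (A2, phi2) = state_true, state_false
    if phi1 is False:   # case 1: true branch aborted or infeasible
        return A2, phi2
    if phi2 is False:   # case 2: false branch aborted or infeasible
        return A1, phi1
    # case 3: both branches reach the join point -> guard differing
    # variable values with ite and disjoin the path constraints
    merged = {v: (A1.get(v) if A1.get(v) == A2.get(v)
                  else ("ite", cond, A1.get(v), A2.get(v)))
              for v in set(A1) | set(A2)}
    return merged, ("or", phi1, phi2)

# True branch aborts, as at Line 4 of Figure 5: the false branch's
# state survives unchanged.
A, phi = merge("a[0]+3 > alen",
               ({"x": 1}, False),
               ({"x": 2}, "a[0]+3 <= alen"))
print((A, phi))
```

Calling merge with two live branches instead, e.g. merge("c", ({"x": 1}, "p"), ({"x": 2}, "q")), yields the ite-guarded store {"x": ("ite", "c", 1, 2)} with the disjoined constraint ("or", "p", "q").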
Fig. 6. Inference rules for collecting format constraints.
To compute a format constraint and remove irrelevant branching conditions, we apply the inference rules in Figure 6 together with a set of simplification rules, including but not limited to:
• Simplify ĉ ∨ ¬ĉ into true by the law of excluded middle, and simplify ĉ ∨ ĉ or ĉ ∧ ĉ into ĉ.
• . . .
The first and second simplification rules define the operations on a format-irrelevant value ⊤. Intuitively, the first simplification rule preserves format-relevant constraints while the second one removes irrelevant ones. The two rules together ensure that the first requirement in Definition 1 is satisfied. The third and fourth simplification rules ensure that the second requirement in Definition 1 is satisfied. That is, given a branching condition ĉ, if neither the true branch nor the false branch aborts the program, this branching condition is not format-relevant. Thus, the fourth rule simplifies ĉ ∨ ¬ĉ into true. The third rule facilitates the use of the fourth rule. We illustrate the analysis procedure in the following example.
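The excluded-middle simplification that erases conditions like the Line-11 check can be sketched over a toy tuple-based formula representation; the representation and the names TRUE, Not, Or, And are ours, not ParDiff's.

```python
TRUE = ("true",)

def Not(f):
    return ("not", f)

def Or(f, g):
    if f == g:                        # c ∨ c → c
        return f
    if g == Not(f) or f == Not(g):    # c ∨ ¬c → true (excluded middle)
        return TRUE
    return ("or", f, g)

def And(f, g):
    if f == g:                        # c ∧ c → c
        return f
    return ("and", f, g)

# The Line-11 condition of Figure 5: neither branch aborts, so the
# disjunction of the two branch constraints collapses and the
# condition vanishes from the format constraint.
c = ("atom", "a[1] == 0")
print(Or(c, Not(c)))    # ('true',)
print(And(c, c))        # ('atom', 'a[1] == 0')
```

In the actual analysis the same collapse happens on merged path constraints, which is how branching conditions without an aborting branch drop out of the final format constraint.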
Example 2 (Continued). Figure 5 shows the ①-⑨ steps of computing the format constraint. Initially, the abstract value of status is set to a format-irrelevant value ⊤. At Line 4, the path constraint of the true branch is a[0] + 3 > alen (Step ①), which is then set to false due to the program-terminating abort statement (Step ②). The initial path constraint of the false branch is a[0] + 3 ≤ alen (Step ③).
In Step ⑥, we compute the merged path constraint. Since the true branch at Line 4 cannot reach the join point, the constraint at Line 10 inherits the one from the false branch, i.e., a[0] + 3 ≤ alen. Similarly, the path constraints at Lines 11 and 12 are a[0] + 3 ≤ alen ∧ a[1] = 0 and a[0] + 3 ≤ alen ∧ a[1] ≠ 0. After Line 12, we merge them into a[0] + 3 ≤ alen according to the third and fourth simplification rules. As such, we get the format-relevant path constraint a[0] + 3 ≤ alen at Line 13. Similarly, after Line 15, we get one more constraint, i.e., a[a[0] + 2] = 1.

Lemma. Given a program in the language defined in Figure 3, the static analysis produces a sound format constraint.
Proof. (Sketch.) The correctness of this lemma is implied by two facts. First, all inference rules in Figure 6 model the exact semantics of the program statements. Second, the simplification rules applied to the path constraint do not change its semantic meaning. □

From Format Constraints to FSM
As discussed in Section 2, it is challenging to locate implementation differences at a fine-grained level by directly comparing two constraints either syntactically or semantically. Thus, we transform a format constraint into an FSM. The FSM specifies how a network message is parsed byte by byte (or field by field), as well as the constraints each byte or field needs to satisfy (note: a field is multiple consecutive bytes). As such, the problem of comparing two protocol implementations is reduced to the problem of comparing two FSMs, which can then be addressed by bisimulation [Sangiorgi 1998] and provides two advantages (see Section 4.3): (1) the implementation differences can be located at a fine-grained level; (2) the comparison can complete in quasi-linear steps [Gentilini et al. 2003].
In what follows, we first discuss how to transform an ordered format constraint into an FSM, then detail how we construct ordered format constraints from disordered format constraints.

4.2.1 From Ordered Format Constraint to FSM. We define the ordered format constraint below.

Definition 2 (Ordered Format Constraint) A format constraint is ordered if and only if, for any
In the definition, a top-level byte [·] is used to form other format constraints instead of being used to compute byte indices. For example, given the format constraint [0] + 3 ≤ alen ∧ [[0] + 2] = 1, we say [0] in the first sub-constraint is a top-level byte, but it is not at the top level in the second sub-constraint. Intuitively, if constraints are faithfully extracted from a parser implementation, ordered constraints mean that the parser processes bytes in the message buffer in order (e.g., parsing byte 0 before parsing byte 1).
Example 3 The format constraint,

Definition 3 (FSM Format) An FSM format contains a set of states and a set of transitions between states. It represents the entire message format in a sequential order.
• State: Each state is represented by Sᵢ, where i ≥ 0. The initial state S₀ symbolizes the condition before any part of the message is parsed.
• State transition: A transition is represented as a tuple (Sᵢ, φ, Sⱼ), where i ≥ 0, j > i, and φ is a format constraint. The transition from state Sᵢ to state Sⱼ occurs iff the constraint φ is satisfied.
Algorithm 1 shows how we recursively transform an ordered format constraint into the defined FSM representation. We will explain how to turn unordered constraints into ordered ones later in Section 4.2.2. As a convention, transitions in a path of an FSM should parse bytes in a network message in order. That is, given two consecutive transitions (S, φ, S′) and (S′, φ′, S″), φ and φ′ should respectively constrain two exclusive ranges of bytes, and the bytes in φ precede the bytes in φ′. Given a disjunctive constraint, Algorithm 1 creates an FSM for each sub-formula in the constraint and returns the union of these FSMs (line 3). Given a conjunctive constraint, Algorithm 1 creates an FSM for each sub-formula and concatenates them by connecting each final state in an FSM to each start state in the next FSM, with a transition constraint true (line 5). Given an atomic constraint φ that does not contain any connectives, i.e., ∧ or ∨, we create a single state transition using φ as the transition constraint (line 7).

Algorithm 1: Build FSM from Ordered Format Constraint.
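The recursive construction described above can be sketched as follows; the tuple-based constraint encoding and the fresh-state numbering are our own illustrative choices, not Algorithm 1's exact pseudocode:

```python
import itertools

_fresh = itertools.count()  # generator of fresh state names

def fsm(constraint):
    """Return (starts, transitions, finals) for an ordered format
    constraint, encoded as ('atom', text) | ('or', c1, c2) | ('and', c1, c2).
    Transitions are (src, constraint_text, dst) triples."""
    kind = constraint[0]
    if kind == "atom":
        # a single state transition labeled with the atomic constraint
        src, dst = next(_fresh), next(_fresh)
        return {src}, {(src, constraint[1], dst)}, {dst}
    if kind == "or":
        # union of the two sub-FSMs (line 3 of Algorithm 1)
        s1, t1, f1 = fsm(constraint[1])
        s2, t2, f2 = fsm(constraint[2])
        return s1 | s2, t1 | t2, f1 | f2
    if kind == "and":
        # concatenation (line 5): every final state of the first FSM is
        # connected to every start state of the second with a 'true' edge
        s1, t1, f1 = fsm(constraint[1])
        s2, t2, f2 = fsm(constraint[2])
        bridge = {(f, "true", s) for f in f1 for s in s2}
        return s1, t1 | t2 | bridge, f2
    raise ValueError(kind)
```

For a disjunction of two atoms conjoined with a third atom, this yields two start states, one final state, and bridging "true" transitions, matching the union-then-concatenate shape described above.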
Example 6. This example illustrates how we translate the format constraint ([0] = 0 ∨ [0] = 1) ∧ ([1] = 2 ∨ [1] = 3) into an FSM using Algorithm 1. As illustrated in Figure 7(a), the FSMs M₁ and M₂ are respectively generated for [0] = 0 and [0] = 1. The FSM M is the union of the two FSMs. The FSM M′ is created in a similar way and is connected to the FSM M. ✷

Lemma 4.2. The FSM produced by Algorithm 1 is an equivalent representation of the input format constraint φ. That is, assuming the conjunction of the state-transition constraints on the k-th path of the FSM (i.e., the transitions from the start state to a final state) is φₖ, we have φ ≡ φ₁ ∨ · · · ∨ φₙ, where k ranges over all n paths.
Proof. Given any format constraint φ, we use φₖ to represent the conjunction of all constraints in the k-th FSM path (i.e., the state transitions from the start state to a final state).
Base Case: If a format constraint φ is an atomic constraint without any connectives ∧ or ∨, Algorithm 1 returns from Line 7, where it generates a single state transition constrained by φ. In this case, the lemma apparently holds.
Induction: Consider two format constraints φ and ψ, as well as their corresponding FSMs, denoted fsm(φ) and fsm(ψ), which contain m and n paths, respectively. Let us assume that the lemma holds for both. That is, we have φ ≡ φ₁ ∨ · · · ∨ φₘ and ψ ≡ ψ₁ ∨ · · · ∨ ψₙ.
Induction Case (1): Consider the format constraint φ ∨ ψ. As shown by line 3 in Algorithm 1, fsm(φ ∨ ψ) is the union of fsm(φ) and fsm(ψ) and thus contains m + n paths. Therefore, we have φ ∨ ψ ≡ (φ₁ ∨ · · · ∨ φₘ) ∨ (ψ₁ ∨ · · · ∨ ψₙ). Thus, if the lemma is correct for φ and ψ, it is also correct for φ ∨ ψ.
Induction Case (2): Consider the format constraint φ ∧ ψ. As shown by line 5 in Algorithm 1, we have fsm(φ ∧ ψ) = fsm(φ) ⊕ fsm(ψ). In fsm(φ ∧ ψ), all final states of fsm(φ) are connected to all start states of fsm(ψ). Hence, fsm(φ ∧ ψ) contains m × n paths, each constrained by φᵢ ∧ ψⱼ, where 1 ≤ i ≤ m and 1 ≤ j ≤ n. Therefore, we have φ ∧ ψ ≡ (φ₁ ∨ · · · ∨ φₘ) ∧ (ψ₁ ∨ · · · ∨ ψₙ), which by distributivity equals the disjunction of all φᵢ ∧ ψⱼ. Thus, if the lemma is correct for φ and ψ, it is also correct for φ ∧ ψ.

Summary: if the lemma holds for φ and ψ, it also holds for φ ∧ ψ and φ ∨ ψ. Thus, the lemma holds. □

As shown above, the generated FSM may be non-deterministic because the same transition constraint may lead from one state to different states (e.g., the transitions connecting M and M′ in Figure 7(a)). As a post-processing step, we follow existing automata theories [Khoussainov and Nerode 2012] to simplify each FSM and make it deterministic (as in Figure 7(b)). Note that since the constraints are ordered to begin with, the state transition paths in the generated FSM are also ordered. This essentially allows us to align transition paths across FSMs from multiple implementations (e.g., aligning the transitions related to parsing the same byte in the message buffer) and conduct effective bisimulation (see Section 4.3).
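The determinization step is standard automata theory; a textbook subset-construction sketch (treating syntactically equal transition constraints as equal alphabet symbols, which is a simplifying assumption of this illustration) looks like:

```python
from collections import defaultdict

def determinize(starts, transitions):
    """Textbook subset construction: merge the target states reachable via
    the same transition label, so that no label maps one state set to two
    different successor sets."""
    start = frozenset(starts)
    dfa, work = {}, [start]
    while work:
        state_set = work.pop()
        if state_set in dfa:
            continue
        by_label = defaultdict(set)
        for src, label, dst in transitions:
            if src in state_set:
                by_label[label].add(dst)
        dfa[state_set] = {lab: frozenset(d) for lab, d in by_label.items()}
        work.extend(dfa[state_set].values())
    return start, dfa
```

Running it on two transitions that carry the same constraint label out of one state merges their targets into a single DFA state, mirroring how the nondeterministic transitions connecting M and M′ in Figure 7(a) are resolved in Figure 7(b).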

4.2.2 Reordering Format Constraints.
In practice, a protocol parser may not rigorously follow the stream order to parse a message. As such, the format constraint collected in the first stage may not be ordered. In such cases, we employ Algorithm 2 to rewrite an arbitrary format constraint into an equivalent but ordered one.
In the algorithm, Lines 5-11 perform the main operation to reorder sub-formulas in a conjunctive constraint. Lines 5-7 are straightforward and transform a constraint like [1] > 1 ∧ [0] > 1 by switching the positions of [0] > 1 and [1] > 1. Lines 8-10 deal with a special case where two sub-formulas cannot be reordered by switching positions, for instance, when an atomic constraint refers to multiple bytes at the top level, so that no switch of positions can make the constraint ordered. In this case, we regard the two sub-formulas as a single atomic constraint by replacing ∧ with an equivalent operator &. Intuitively, we group constraints that cannot be ordered into groups such that the groups themselves can be ordered. For instance, the algorithm may regard three atomic constraints over the first three bytes as one group; the resulting groups can then be ordered by switching their positions. As such, the generated FSM satisfies the property that it parses a message in the stream order: in this example, it parses the first three bytes using the first state transition and then the fourth byte using the second transition.

Lemma 4.5. Algorithm 3 guarantees to find the differences between a pair of state transitions in two FSMs if the two FSMs are not equivalent [Gentilini et al. 2003].
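The grouping-and-sorting idea behind Algorithm 2 can be sketched as follows; the representation of atomic constraints as (byte-index set, text) pairs and the fusing criterion via index overlap are simplifying assumptions of ours, not the algorithm's exact mechanics:

```python
def top_level_bytes(atom):
    """Byte indices an atomic constraint reads at the top level; atoms are
    modeled here as (set_of_byte_indices, text) pairs."""
    return atom[0]

def reorder(conjuncts):
    """Sketch of the reordering idea: fuse conjuncts whose top-level byte
    indices overlap into one group (the '&' operator in the text), then
    order the groups by the first byte they constrain."""
    groups = []
    for atom in conjuncts:
        for group in groups:
            used = set().union(*(top_level_bytes(a) for a in group))
            if top_level_bytes(atom) & used:   # cannot be strictly ordered
                group.append(atom)
                break
        else:
            groups.append([atom])
    return sorted(groups, key=lambda g: min(min(top_level_bytes(a)) for a in g))

# b[0] + b[1] > 1 touches bytes 0 and 1, so it fuses with b[0] > 1,
# while b[2] == 0 stays in its own, later group.
multi = ({0, 1}, "b[0] + b[1] > 1")
single = ({0}, "b[0] > 1")
later = ({2}, "b[2] == 0")
```

After grouping, each group can become one FSM transition over its byte range, so the resulting transitions parse the message in stream order.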
Example 7. Consider the two FSMs in our motivating example (see Figure 2). We input the two start states, S₀ and S′₀, into Algorithm 3. For the two outgoing transitions from the state S₀, we can respectively find two transitions in the other FSM from the state S′₀ such that the transition constraints are equivalent (lines 6-7 in Algorithm 3). We then bi-simulate the two FSMs from S₁ and S′₁ (lines 8-9 in Algorithm 3). Since the two outgoing transitions from S₁ and S′₁ are not equivalent, we find the differences in the transition constraints, i.e., [1] ≤ alen vs. [1] + 2 ≤ alen, to locate the implementation differences. ✷

From FSM Differences to Implementation Differences and Bugs. Since the transition constraints obtained from the last step correspond to branching conditions in the code, we can then locate the differences between the two implementations. For easier localization, we keep a record of the source code positions at which these constraints are generated during the first stage. As demonstrated in our motivating example (Section 2), such differences often imply bugs in protocol parsers. Our evaluation shows that ParDiff found 41 bugs, with 25 confirmed or fixed (Section 5.6).
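The lock-step comparison used in Example 7 can be sketched as follows; the dictionary-based FSM encoding and the equiv callback (which would be an SMT equivalence check in ParDiff rather than string equality) are illustrative assumptions:

```python
def bisim_diff(fsm_a, fsm_b, state_a, state_b, equiv, seen=None):
    """Walk two FSMs in lock step from the given states, pairing outgoing
    transitions whose constraints `equiv` judges equal, and report every
    transition constraint that cannot be paired."""
    if seen is None:
        seen = set()
    if (state_a, state_b) in seen:      # avoid revisiting aligned pairs
        return []
    seen.add((state_a, state_b))
    diffs = []
    unmatched_b = list(fsm_b.get(state_b, []))
    for cons_a, succ_a in fsm_a.get(state_a, []):
        for cons_b, succ_b in unmatched_b:
            if equiv(cons_a, cons_b):   # matched: bisimulate the successors
                unmatched_b.remove((cons_b, succ_b))
                diffs += bisim_diff(fsm_a, fsm_b, succ_a, succ_b, equiv, seen)
                break
        else:
            diffs.append((cons_a, None))        # nothing equivalent in B
    diffs += [(None, cons_b) for cons_b, _ in unmatched_b]
    return diffs

# Two toy FSMs in the spirit of Example 7: they agree on the first byte
# but diverge on the bound they check against the message length.
A = {"S0": [("b[0] == 1", "S1")], "S1": [("b[1] <= alen", "S2")]}
B = {"S0'": [("b[0] == 1", "S1'")], "S1'": [("b[1] + 2 <= alen", "S2'")]}
```

Here the first transitions pair up, so the walk descends to S₁/S′₁, where the mismatched length checks surface as the reported difference.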

Soundness
Lemmas 4.1-4.5 state that, given protocol parsers written in the small language defined in Figure 3, our approach is sound and guaranteed to find differences between the parsers. The lemmas prove the following facts: • Lemma 4.1 states that we generate a sound format constraint from the protocol parser; • Lemmas 4.2-4.4 state that we translate each format constraint to an FSM, which is an equivalent representation of the format constraint; • Lemma 4.5 states that, by bisimulation, we can find the differing state transitions if two FSMs are not equivalent.
In practice, we have to handle common program structures not included in the abstract language, which leads to a soundy [Livshits et al. 2015] implementation of ParDiff. In other words, ParDiff shares the same reasonable assumptions and standard approaches (to handle challenging program structures) with previous bug-finding techniques, e.g., [Babic and Hu 2008; Shi et al. 2018; Xie and Aiken 2005]. For example, in our implementation, we unroll each loop twice in the control flow graphs and call graphs. Following the aforementioned bug-finding techniques, we currently do not model inline assembly or call statements that invoke non-standard library APIs. For pointer analysis, we adopt Sui et al. [2011]'s approach to resolve pointer relations. The soundy (i.e., reasonably unsound) operations in the implementation do lead to false positives and negatives, which, however, are acceptable as we show in the evaluation. In total, ParDiff lets us find 41 bugs in mature protocol implementations.

EVALUATION
We implement our tool ParDiff on top of the LLVM (12.0.1) compiler infrastructure [Lattner and Adve 2004] and the Z3 (4.8.12) SMT solver [De Moura and Bjørner 2008]. The current implementation of ParDiff works on protocol parsers written in C, but our approach generalizes to other common languages. Given a parser, we compile its source code into LLVM bitcode and send the bitcode to ParDiff for further processing. In ParDiff, Z3 is used to represent abstract values as symbolic expressions and to check the equivalence of constraints. With this implementation, we design a series of experiments to answer the following four research questions:
• RQ1: How effective is each stage of ParDiff?
• RQ2: How efficient are the three stages of ParDiff?
• RQ3: How effective is ParDiff in detecting bugs?
• RQ4: What are the root causes of the discovered bugs?

Experimental Setup
To create a set of protocols with multiple implementations, we refer to an index of open-source routing platforms [contributors 2022]. From this list, we apply a set of criteria to refine our selection: the projects must be implemented in the C programming language and actively maintained within the past year. This results in the selection of FRRouting [community 2023] and BIRD [Martin et al. 2023], two open-source routing protocol suites. Next, for every protocol implemented in a suite, we find an alternate implementation that also meets the aforementioned criteria for comparison. We thereby obtain a dataset consisting of ten network protocols, listed in the first ten rows of Table 1. Moreover, we incorporate four well-known protocols from the TCP/IP protocol suite. In total, we get a dataset with 14 network protocols, each with two distinct implementations. The protocol and codebase information is shown in Table 1.
To answer RQ1, we run ParDiff on our dataset and test the effectiveness of each stage. To evaluate the effectiveness of path constraint reduction (Stage 1), we record the number of LLVM instructions involved in computing the format constraint, with and without the constraint reduction algorithm. To evaluate the effectiveness of FSM generation and simplification (Stage 2), we record the number of FSM nodes and edges before and after FSM simplification. We then check the differences generated by the bisimulation to identify bugs (Stage 3). For each protocol we test, we record the number of differences, the number of differences caused by bugs, the number of differences caused by implementation options, and the number of unique bugs implied by the differences, in order to gauge the effectiveness of ParDiff. Additionally, we record the number of atomic branch conditions (cf. Definition 1) within each difference as a measure of the human effort required for bug localization.
To answer RQ2, for each protocol, we record the execution time of each stage, together with the total time of our analysis, to measure the efficiency of our tool.
To answer RQ3, we evaluate our approach by comparing it to both static and dynamic analysis tools. In terms of static analysis, we consider existing static differential analyses, such as differential symbolic execution [Lahiri et al. 2012, 2013; Person et al. 2008, 2011; Ramos and Engler 2015; Rutledge and Orso 2022]. These techniques are indeed effective for identifying discrepancies across different versions of the same software by leveraging shared structures and code snippets. However, as discussed previously, they struggle with comparing independent protocol implementations, because the shared code structures that enable the efficiency of differential symbolic execution are mostly absent. As such, these tools need exhaustive and independent analysis of each implementation and fail to finish the analysis due to the path explosion problem. Thus, we omit the evaluation of these tools.
We also compare our fine-grained approach with the coarse-grained use of SMT solvers. The coarse-grained way monolithically queries the satisfiability of two format constraints φ₁ and φ₂, i.e., checks whether φ₁ ≠ φ₂ is satisfiable. If the query is satisfiable, we count the number of atomic branch conditions (cf. Definition 1) in both implementations, which is a measure of the manual effort to precisely locate the root-cause (sub)condition in the program leading to the divergence.
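For illustration, the coarse-grained baseline reduces to one satisfiability question: does any input make φ₁ and φ₂ disagree? The sketch below brute-forces this over a tiny finite domain instead of posing an SMT query; the predicates f1 and f2 are hypothetical stand-ins of ours, not constraints from either codebase:

```python
def coarse_diff(f1, f2, domain):
    """Coarse-grained check: report whether any input makes the two format
    constraints disagree, i.e., whether f1 != f2 is satisfiable. Here the
    constraints are plain Python predicates checked by brute force over a
    small finite domain; the real baseline hands the disequality to an
    SMT solver such as Z3 as a single monolithic query."""
    return any(f1(msg) != f2(msg) for msg in domain)

# Hypothetical stand-ins for two parsers' checks of byte 1 against the
# message length, in the spirit of the motivating example.
f1 = lambda msg: msg[1] <= len(msg)
f2 = lambda msg: msg[1] + 2 <= len(msg)
```

A satisfiable answer only says the implementations differ somewhere; pinpointing which atomic condition causes it is exactly the manual effort the fine-grained FSM comparison avoids.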
As for dynamic analysis tools, we compare our tool with a differential fuzzing tool specially designed for protocol parsers, i.e., DPIFuzz [Reen and Rossow 2020]. While there are a few other differential fuzzing tools (e.g., [Petsios et al. 2017; Yang et al. 2021; Zou et al. 2021]), they are mostly domain-specific and have special designs for input generation and mutation, which cannot be directly applied to general network protocol parsers. Therefore, we select DPIFuzz, whose mutators are specifically designed for network protocol packets. We use the packet-level mutations and the execution behaviors (including aborting or returning in advance, return values of protocol parsers, etc.) as fuzzing feedback. We run DPIFuzz with the first ten protocols in our dataset (Table 1), excluding IPv4, IPv6, ICMP, and ICMP6, which are kernel-space implementations (note that DPIFuzz is a user-space fuzzer that does not support fuzzing kernel code). For each protocol, we execute DPIFuzz in two settings: first, we run it for the same duration as ParDiff (that is, if ParDiff operates for x seconds, we also run DPIFuzz for x seconds), repeating this procedure ten times to account for random factors; second, we run each protocol for 24 hours, repeated three times.
To answer RQ4, we analyze the root causes of all bugs we found and group them into three categories. We also provide case studies and discuss their potential impacts.

RQ1: Effectiveness of Each Stage in ParDiff
Stage 1: Collecting Format Constraints via Path Constraint Reduction. The numbers of LLVM instructions involved in computing format constraints, with and without our reduction algorithm, are shown in Figure 8. For all protocol implementations, the number of instructions decreases significantly after applying path constraint reduction, by 99.57% on average. This indicates that the path constraint reduction process is effective in reducing the complexity of the codebases.

Stage 2: From Format Constraints to FSM with Simplification. Table 2 indicates that the FSM simplification process in Stage 2 further reduces the complexity of each FSM after path constraint reduction in Stage 1. On average, there is an approximately 30.0% reduction in the number of states and a 23.06% reduction in the number of state transitions. The most significant reduction is seen in the second FSM of the OSPFv2 protocol, with approximately 60% fewer nodes and 60% fewer edges.
Stage 3: Locating Implementation Differences. For each FSM difference we generate, we manually identify whether it is a true difference, i.e., a real semantic difference. We consider the other differences to be false positives, which are due to inaccuracies of our tool, such as the inherent limitations of pointer analysis and loop analysis. Additionally, for each true difference, we identify whether it is a bug or is due to implementation options allowed by protocol specifications. As shown in Table 4(a), ParDiff successfully identified 41 unique bugs in 14 network protocols, with a precision of 92.8%. For each identified bug, we generated either a bug report or a fix patch. At the time of submission, 25 of these bugs had been confirmed by the developers. Notably, we have patched 11 of these bugs, and the patches have already been merged or approved by the developers in the corresponding open-source repositories.
Each FSM difference is made up of several atomic branching conditions (see Definition 1). We record the number of atomic branch conditions in each difference and show the result in the last column of Table 4(a). Among the 14 protocols, there are 827 atomic conditions within 264 differences. As such, developers need to examine 3.13 atomic conditions on average for every difference detected. Since each atomic condition corresponds to one source code line, developers would examine approximately 3.13 source lines per difference (assuming they are familiar with the protocols).

RQ2: Efficiency of Each Stage in ParDiff
We executed ParDiff on all protocols in our dataset and recorded the total analysis time, as well as the time ParDiff took for each of the three stages separately. As presented in Figure 9, ParDiff completes the analysis in an average of 50.48 seconds, spending 74.15% of its time extracting formats from the source code, around 22.37% generating FSMs, and only 1.97% conducting the differencing. This supports our claim that, by applying path constraint reduction in Stage 1 and FSM simplification in Stage 2, ParDiff can differentiate formats and generate differences very quickly.

RQ3-1: Comparing ParDiff to Coarse-grained Use of SMT Solvers
As shown in Table 3, for each protocol, the SMT solver determines that the two constraints, referred to as φ₁ and φ₂, are not equivalent. However, comparing these constraints directly merely confirms the presence of at least one difference between the two implementations. To precisely locate the origins of these discrepancies or potential bugs, we must delve deeper and inspect each individual atomic condition within both φ₁ and φ₂. The count of these atomic conditions for each implementation is recorded in Table 3. Across the 14 protocols, there are a total of 2515 atomic conditions for 14 identified differences. Hence, developers need to carefully check approximately 179.6 lines per difference to pinpoint potential bugs. This number significantly exceeds the 3.13 lines per difference provided by our tool, representing a significant challenge that requires nontrivial human effort.

RQ3-2: Comparing ParDiff to DPIFuzz
We compare the two tools from two perspectives: the number of discovered bugs and the efficiency of bug finding.
Bugs Identified. As shown in Table 4, our tool detects a total of 41 bugs, 40 of which are found in the first ten network protocols, while DPIFuzz can only detect 25 bugs (detected by at least one of the three runs). We observe that DPIFuzz has difficulty finding bugs along long program paths due to the difficulty of generating inputs that satisfy all the complex constraints in deep paths. It is worth mentioning that, as shown in Table 5, bugs with ID#42 to 45 can be triggered by at least one run of DPIFuzz but are not detected by our tool. This is because, in the current implementation, if our tool detects a difference on a path, it stops bisimulation on that path. Hence, we miss the opportunity of finding other bugs on that path after that difference. This could be solved by forcing the tool to continue bisimulation along the path, which we leave for future work. Additionally, our tool exhibits higher stability in bug detection. In contrast, the bugs DPIFuzz can trigger are quite random: only 12 of the 25 bugs are detected in all three runs. Besides, we record the time at which each bug is detected by ParDiff and DPIFuzz. In particular, we record the time taken by DPIFuzz to trigger the bugs in each of its three runs, using T/O to indicate a timeout, i.e., the bug is not detected in that execution. The results show that DPIFuzz spends much longer than ParDiff to detect each bug. Moreover, due to the innate randomness of fuzzing tools, DPIFuzz cannot stably detect most bugs.

Efficiency. We compare the efficiency of DPIFuzz with ParDiff in two settings: running DPIFuzz with the same time budget as ParDiff, and running DPIFuzz for 24 hours.
We first study the capability of finding bugs given equal time for both tools. For each protocol, we execute DPIFuzz ten times, matching the total run time of ParDiff in each iteration. We keep track of the number of inputs and the bugs identified in every individual run. The mean and standard deviation of the results are shown in the left half of Table 4(b), which indicates that DPIFuzz can hardly find any bug given the same time as our tool. Please note that we realize this is due to the different nature of the tools, and these numbers are for reference only.

RELATED WORK
Differential Symbolic Analysis. Symbolic-execution-based differential testing tools primarily focus on syntactically similar programs (e.g., programs evolved from the same base version). These tools often rely on matching code snippets and data structures to reduce the symbolic execution overhead. They identify similar or divergent code segments through static analysis [Person et al. 2008, 2011; Rutledge and Orso 2022], branch divergence during symbolic execution [Cadar and Palikareva 2014; Palikareva et al. 2016], or CFG patterns [Malík and Vojnar 2021]. However, these tools can struggle if programs from independent projects are substantially different in their syntactic structures. In contrast, ParDiff provides a complementary approach for analyzing different implementations, even when their syntactic structures vary significantly.
Existing differential model checking techniques [Ferreira et al. 2021; Fiterău-Broștean et al. 2016] are mainly used to compare high-level state-machine representations of protocols. These are effective in pinpointing operational differences between different protocol versions or implementations. For instance, Prognosis [Ferreira et al. 2021] models state transitions and message exchange patterns (e.g., the TCP 3-way handshake) of TCP, as well as QUIC's flow control mechanisms, for comparison. While these methods are well-suited for studying the overall logic of protocol implementations, they might not thoroughly check the robustness of individual parsers when faced with malformed or unexpected input.

Fuzzing and Differential Fuzzing. Traditional fuzzing tools, like BooFuzz [Pereyda 2023], primarily aim to uncover crashes or assertion failures, rather than semantic bugs. Differential fuzzing explores potential behavioral divergence between two programs and is capable of detecting semantic bugs. However, these techniques [Yang et al. 2021; Zou et al. 2021] encounter several challenges. First, their effectiveness strongly depends on the quality of the inputs, and they may suffer from low code coverage. Second, substantial human effort is needed to identify and locate bugs from a large number of inputs. Lastly, these techniques tend to be slow, often taking several hours or days to converge. To mitigate these challenges, we propose ParDiff, a high-coverage bug detection technique with efficient bug identification and localization.

Hybrid Techniques. Symbolic execution and fuzzing can be combined to identify semantic differences between two program versions. HyDiff [Noller et al. 2020] leverages both fuzzing and dynamic symbolic execution to find regression bugs. These tools also suffer from the limitations of applying differential symbolic execution to syntactically disparate implementations.

Input Grammar Synthesis. Input grammar synthesis techniques [Bastani et al. 2017; Gopinath et al. 2020; Lin et al. 2010] are widely used to generate grammars describing the expected syntactic structure of program inputs. These grammars help ensure that inputs adhere to specified formats and can be valuable for generating test inputs. However, these techniques often suffer from performance issues [Bendrissou et al. 2022]. Furthermore, input grammar synthesis techniques typically target context-free grammars, which imposes certain limitations on their applicability. In contrast, protocol formats include semantic constraints among different bytes in a protocol message. These constraints may involve arithmetic operations, bit-level manipulations, or context-dependent rules that go beyond the capabilities of simple context-free grammars.

CONCLUSION
This work proposes ParDiff to detect bugs hidden in network protocol parsers, which can hardly be detected by conventional tools. ParDiff extracts normalized protocol formats as finite state machines from different implementations of the same protocol and leverages differential analysis to locate bugs in the code. ParDiff successfully detects 41 semantic bugs, with 25 confirmed or fixed.
Fig. 1. Motivating example. (a) A buggy implementation. The circled numbers and the message structure provide an example of how the buggy code allows reading bytes outside the scope of the message when parsing it. (b) A correct implementation.

Fig. 5. The left part shows an example of collecting format constraints. The right part shows the structure of the network message to parse.
ParDiff: Practical Static Differential Analysis of Network Protocol Parsers. Proc. ACM Program. Lang., Vol. 8, No. OOPSLA1, Article 137. Publication date: April 2024.

Lemma 4.3. The input and output format constraints of Algorithm 2 are equivalent.

Proof. (Sketch.) Observe that in the algorithm, we only exchange the positions of two sub-constraints, e.g., φ and ψ, if they are in a conjunctive formula. Apparently, φ ∧ ψ is equivalent to ψ ∧ φ. Thus, the input and output format constraints of Algorithm 2 are equivalent. □

Lemma 4.4. The output format constraint of Algorithm 2 is ordered.

Fig. 8. Instructions used to compute format constraints, with and without constraint reduction (CR).

Figure 10(b) shows part of a pull request (ID#12 in Table 5) that we submitted to fix a bug in FRRouting's BABEL implementation. The buggy version misses some constraint checks when dealing with MESSAGE REQUEST packets, leading to incomplete input validation. This pull request has already been merged by the developers.
which can be inferred from the branching conditions in line 4.

Qingkai Shi, Xuwei Liu, Xiangzhe Xu, Le Yu, Congyu Liu, Guannan Wei, and Xiangyu Zhang

Table 1. Protocols and Their Codebases for Evaluation.

Table 2. Statistics of FSM node and edge numbers. Notably, the large differences in the RADV and IPv6 protocols are due to incomplete implementations in the corresponding codebases.

Table 3. Comparison with the coarse-grained use of an SMT solver. (All constraints are treated as a single formula per query.)

Table 5. Network protocol bug list. We use T/O to indicate timeout and N/A for not applicable.