Fuzzing Configurations of Program Options - RCR Report

This artifact contains the source code and instructions to reproduce the evaluation results of the article “Fuzzing Configurations of Program Options.” The source code includes the configuration grammars for six target programs, the scripts to generate configuration stubs, and the scripts to post-process fuzzing results. The README of the artifact includes the steps to prepare the experimental environment on a clean Ubuntu machine and step-by-step commands to reproduce the evaluation experiments. A VirtualBox image with ConfigFuzz properly set up is also included.


OVERVIEW

Article
While many real-world programs are shipped with configurations to enable/disable functionalities, existing fuzzers have mostly been applied to test single configurations of these programs. This article presents ConfigFuzz to explore the potential benefits of fuzzing configurations of program options. ConfigFuzz fuzzes the program configurations and the input data at the same time, by separating a program's input space into two parts: the configuration bytes and data bytes. Program options are encoded into the configuration bytes in a transformed program to allow a fuzzer's mutation operators to decide when and how to mutate the program's configurations during the fuzzing campaign.
In the article, we first assess how program configurations affect fuzzing performance by performing an empirical study that ran AFL [1] on three common, configurable fuzzing targets: FFmpeg,
nm, and gif2png (Section 3). We observed that different configurations contributed disproportionately to code coverage, while almost every individual configuration enabled some unique code to be reached.
We instantiated ConfigFuzz on six configurable, common fuzzing targets and integrated their executions in FuzzBench [5]. In our evaluation, ConfigFuzz outperformed two baseline fuzzers in four targets (xmllint [9], gif2png [6], cxxfilt [3], and FFmpeg [4]), while the results were mixed in the other targets (nm [7] and objdump [8]) due to program size and configuration space. We also analyzed the options fuzzed by ConfigFuzz and how they affect the performance.

Artifact
The artifact has been made available at https://figshare.com/articles/software/Supplementary_artifact_for_the_paper_Fuzzing_Configurations_of_Program_Options_/20792062 under the CC-BY 4.0 license. The DOI of the artifact is 10.6084/m9.figshare.20792062. The content of the artifact is organized as follows:
• benchmarks: It contains all the ConfigFuzz and evaluation baseline setups used in the article. Each setup is constructed as a FuzzBench benchmark and can be fuzzed with FuzzBench.
• ConfigFuzz: It contains the implementation of ConfigFuzz (i.e., configuration stub generation), post-processing scripts for the experimental results generated from FuzzBench, and the scripts for producing the plots and tables presented in Section 5 of the article.
• configfuzz.ova: It is a VirtualBox image with ConfigFuzz properly set up.
• setup_fuzzbench.sh: It is a script that automatically downloads FuzzBench, applies ConfigFuzz.patch, and replaces the FuzzBench benchmarks with those in benchmarks.

PREREQUISITES AND REQUIREMENTS
This artifact requires a machine with at least 16 GB of RAM and 200 GB of disk space, preferably running a clean Ubuntu 20.04 installation. Alternatively, configfuzz.ova provides a virtual machine with ConfigFuzz properly set up.

STEPS TO REPRODUCE
The experimental results can be reproduced in four steps with the artifact. The following paragraphs describe each step at a high level; the detailed process and commands are described in README.md.
Set up the experimental environment. The instructions in README.md include commands to install the system dependencies, including Clang and Python 3.9. ConfigFuzz depends on FuzzBench to run fuzzing experiments. Setting up FuzzBench can be done with the setup_fuzzbench.sh script and the corresponding commands are also included in README.md. This step can be skipped when running ConfigFuzz with the virtual image configfuzz.ova.
Generate FuzzBench benchmarks with ConfigFuzz. This step enables FuzzBench to fuzz target programs transformed by ConfigFuzz. Users first need to manually rename the main function in the target program to fuzz_main; the entry files can be found in the entry-files folder. Depending on the baseline or ConfigFuzz setup (introduced in Section 5.1.2), users then locate the corresponding configuration grammar in the Grammar folder. The modified entry file and the grammar are given as inputs to a script in ConfigFuzz, which instruments the configuration stub into the entry file.
Run fuzzing experiments with FuzzBench. This step follows the same process as running the original FuzzBench, including preparing a config file and running the FuzzBench script to start the experiments. The evaluation in the article uses five trials and a 24-hour timeout, as introduced in Section 5.1.4. We evaluated ConfigFuzz with five different setups and two baselines on AFL [1] and AFL++ [10], as discussed in Section 5.1.3. The FuzzBench benchmarks for all the baselines are automatically prepared in the first step.
Post-process fuzzing results. This artifact reproduces the results reported in Section 5, namely the coverage growth plots (Figures 5-9) and the option distribution tables (Tables 6 and 7). In this step, users run the scripts in ConfigFuzz to calculate coverage and generate the corresponding results. By generating Figures 5-9 with the artifact, users can reproduce the two results reported in Section 5. First, we claimed that ConfigFuzz outperformed the baselines on xmllint, gif2png, cxxfilt, and FFmpeg, while Baseline-2-way and/or Baseline-def achieved higher coverage on nm and objdump; this should be supported by comparing the line coverage of ConfigFuzz to that of the baselines in the coverage growth plots. Second, when comparing different setups of ConfigFuzz, we found that ConfigFuzz-max did not always produce the highest coverage among the ConfigFuzz settings, and that ConfigFuzz-2 outperformed ConfigFuzz-1 in most cases; comparing line coverage between the ConfigFuzz setups in the coverage growth plots should lead to a consistent conclusion. Finally, we investigated the distributions of options generated by ConfigFuzz to gain insights into the fuzzing process. Distributions similar to those in Tables 6 and 7 should be observed in the reproduced option distribution tables.