Name: Magma Experiment for Code/Bug-based Coverage Benchmarking
Creator: Edmond
Published: 2026-01-12 14:20:35
License: No description available

Download link:

https://edmond.mpg.de/citation?persistentId=doi:10.17617/3.UQGK4A


Resource description:

<h1>Magma Experiment for Code- and Bug-based Coverage Benchmarking</h1> <h2>Overview</h2> <p> This dataset contains the <b>raw experimental artifacts</b> for the Magma portion of our study on the concordance (split-half reliability) of <b>coverage-based</b> and <b>bug-based</b> fuzzer benchmarking procedures. </p> <p> The artifacts are intentionally released at a low level (queues, logs, and bug information) so that others can <b>reproduce</b>, <b>audit</b>, and <b>recompute</b> outcomes under alternative analysis choices. </p> <p> Note: This dataset is <b>not linked directly from the paper</b>. The paper links to a companion code repository, and that repository links to this dataset. </p> <hr/> <h2>Experimental Setup (Summary)</h2> <ul> <li><b>Benchmark suite:</b> Magma v1.2.0</li> <li><b>Benchmarks used:</b> 18 fuzz drivers (benchmarks) across multiple programs</li> <li><b>Fuzzers:</b> 8 fuzzers</li> <li><b>Trials:</b> 20 independent trials per (fuzzer, benchmark) combination (subject to supported combinations)</li> <li><b>Campaign length:</b> 23 hours per trial</li> <li><b>Bug evaluation:</b> Magma canary-based ground truth; we count <b>triggered bugs</b></li> <li><b>Coverage evaluation:</b> branch coverage (LLVM tooling)</li> </ul> <hr/> <h2>What This Dataset Contains</h2> <p> This dataset includes the raw outputs needed to compute bug-based results directly and to compute coverage-based results via offline replay: </p> <ul> <li><b>Execution queues</b> produced during fuzzing campaigns (per fuzzer, benchmark, trial)</li> <li><b>Raw bug information</b> from Magma’s canary-based bug reporting (including logs/outputs per run)</li> <li><b>Per-run metadata</b> required to group results and reproduce the analysis workflow</li> </ul> <hr/> <h2>Important: How Coverage Results Are Obtained</h2> <p> <b>Coverage is not precomputed in this dataset.</b> This dataset includes the execution queues 
and other artifacts, but <b>coverage values must be derived</b> using the companion code repository. </p> <p> To extract coverage: </p> <ol> <li>Obtain the companion code repository referenced by the paper.</li> <li>Configure it to point to this dataset on disk.</li> <li>Run the provided pipeline to replay the execution queues and compute <b>branch coverage</b> via LLVM tooling.</li> </ol> <p> If you download only this dataset without the companion code, you can still analyze <b>bug-based</b> outcomes, but you <b>cannot</b> reproduce the paper’s coverage-based results. </p> <hr/> <h2>Directory Layout</h2> <p> The dataset is organized hierarchically by <b>fuzzer</b>, <b>benchmark</b>, and <b>trial</b>. The exact naming is intended to match the assumptions of the companion analysis scripts. </p> <p> Conceptually: </p> <pre> &lt;fuzzer&gt;/&lt;benchmark&gt;/&lt;trial&gt;/... </pre> <p> Each trial directory corresponds to one 23-hour campaign run and contains the artifacts produced by that run (e.g., queues, logs, bug-triggering information, and auxiliary metadata). </p> <hr/> <h2>File Count</h2> <p> At release time, this dataset contains exactly <b>2,860 files</b>. </p> <hr/> <h2>Known Omissions / Unsupported Combinations</h2> <ul> <li><b>SymCC × exif</b> is not present because SymCC does not compile on the <b>exif</b> benchmark; therefore, no such run exists.</li> </ul> <hr/> <h2>Data Completeness Note (Minimal)</h2> <p> A small number of runs (seven) were re-executed after a temporary server failure to complete the expected trial matrix, but were excluded from the paper. The released dataset is intended to be <b>complete</b> for the supported fuzzer–benchmark combinations. </p> <hr/> <h2>Relationship to Paper and Code</h2> <ul> <li><b>Paper:</b> defines the research questions, metrics, and statistical analysis. 
</li> <li><b>Companion code repository:</b> performs coverage extraction (queue replay), aggregation, ranking, and concordance computations.</li> <li><b>This dataset:</b> provides the raw artifacts consumed by that code.</li> </ul> <hr/> <h2>Intended Use</h2> <ul> <li>Reproduction of bug-based outcomes and reanalysis of trial variability</li> <li>Recomputation of coverage-based outcomes via replay using the companion code</li> <li>Alternative analyses of bug and code coverage</li> </ul> <p> This dataset is <b>not</b> a pre-aggregated results table; it is raw experimental output. </p> <hr/> <h2>License</h2> <p> This dataset is released under <b>CC BY 4.0</b>. </p> <hr/> <h2>Citation</h2> <p> Please cite this dataset via its DOI (doi:10.17617/3.UQGK4A). When referencing the methodology or results, also cite the accompanying paper, <i>In Bugs We Trust? On Measuring the Randomness of a Fuzzer Benchmarking Outcome</i>. </p>
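<p> As a rough illustration of how the trial matrix described above could be audited after download, the following Python sketch walks the conceptual <code>&lt;fuzzer&gt;/&lt;benchmark&gt;/&lt;trial&gt;</code> layout and reports every (fuzzer, benchmark) combination whose number of trial entries differs from the stated 20. It is not part of the companion code repository. The function names are hypothetical, the actual directory and file names in the dataset may differ, and the sketch assumes that each entry directly below a benchmark directory represents one trial (either a directory or a single archive file). </p>

```python
from collections import Counter
from pathlib import Path

# Number of trials per (fuzzer, benchmark) combination, as stated in the
# dataset description.
EXPECTED_TRIALS = 20


def count_trials(root):
    """Count trial entries per (fuzzer, benchmark) under root.

    Assumption: the dataset follows the conceptual layout
    fuzzer/benchmark/trial, and every entry directly below a benchmark
    directory is one trial (a directory or a single archive file).
    """
    counts = Counter()
    for fuzzer_dir in sorted(Path(root).iterdir()):
        if not fuzzer_dir.is_dir():
            continue  # skip stray files at the top level
        for benchmark_dir in sorted(fuzzer_dir.iterdir()):
            if not benchmark_dir.is_dir():
                continue  # skip stray files at the fuzzer level
            trials = sum(1 for _ in benchmark_dir.iterdir())
            counts[(fuzzer_dir.name, benchmark_dir.name)] = trials
    return counts


def incomplete_combinations(counts, expected=EXPECTED_TRIALS):
    """Return the combinations whose trial count differs from expected."""
    return {key: n for key, n in counts.items() if n != expected}
```

<p> A typical call would be <code>incomplete_combinations(count_trials("path/to/dataset"))</code>, where the path is a placeholder for the local download location. An unsupported combination such as SymCC × exif has no directory at all, so it will not appear in the result; it has to be detected by comparing the keys of <code>count_trials</code> against the expected list of fuzzers and benchmarks. </p>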

Application scenarios: