Dynamic Slicing of WebAssembly Binaries

Source: NIAID Data Ecosystem (indexed 2026-05-01)

Download link:

https://zenodo.org/record/8025835

Description:

This is the replication package that accompanies the paper titled "Dynamic Slicing of WebAssembly Binaries".

# Slices dataset

## Generating the dataset

The dynamic slices have been generated with [P-ORBS](https://syed-islam.github.io/research/program-analysis/#observation-based-program-slicing-orbs). The steps and scripts to generate the dynamic slices are included in the `slicing-steps/` directory. These steps also describe how to generate the `stats.csv` file that is included in this dataset.

The static slices have been generated with [wassail](https://github.com/acieroid/wassail). The scripts to generate the static slices are included in the current directory (`generate-static-slices.sh`, which relies on `run_wassail.sh`). These steps generate the file `static-stats.csv` that is included in this dataset.

The `original-sizes.csv` file, included in this dataset, can be generated as follows:

```sh
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
echo 'slice,original fn slice' > evaluation/original-sizes.csv
sed -E 's|^(.*) subjects-wasm-extract-slice/[^/]*/([^/]*)/.*$|\2,\1|' counts.txt >> evaluation/original-sizes.csv
```

The numbers in Table 1 of the paper (the list of programs in the dataset along with their sizes) can be generated as follows.
For the WebAssembly files, we can count the function size:

```sh
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```

or the full program size:

```sh
find subjects-wasm-extract-slice -name t.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```

## Structure of the dataset

The dataset is structured as follows:

- `subjects/` contains the instrumented `.c` source code, along with scripts to generate the dynamic slices. The original source code can be obtained by removing the line `printf("\nORBS:%x\n....`.
- `subjects-wasm-extract-slice/` contains the original WebAssembly programs to slice. Each program has two files: `t.wat` is the full binary file, and `name.c.wat` is the binary code of the function containing the slicing criterion.
- `all_slices/` contains the slices. For example, program `adpcm_ah1_254_expr` has the following files:
  - `adpcm/adpcm_ah1_254_expr/EWS_adpcm.wat`: the EWS slice
  - `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat`: the SEW slice
  - `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat`: the ESW slice
  - `adpcm/adpcm_ah1_254_expr/static_adpcm.wat.slice`: the SWS slice

The other files are produced by intermediary steps and can be ignored.
They are:

- `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat.orig`: original (unsliced) binary *file* from which SW and ESW slices are computed
- `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat.orig`: original (unsliced) binary *function* from which SEW slice is computed
- `adpcm/adpcm_ah1_254_expr/SW_adpcm.wat`: slice of entire binary file from which ESW slice is extracted
- `adpcm/adpcm_ah1_254_expr/WS_adpcm.wat`: compiled (binary) version of dynamic C slice from which EWS slice is extracted

# Research questions

## RQ1

The script `./RQ1.py` found in the `evaluation/` directory generates:

- Figure 3 (`time.pdf`)
- The mean, min, max, and stddev of the times
- The number of slices computed in under 10, 100, 1000, and 10000 seconds

## RQ2

The script `./RQ2.py` found in the `evaluation/` directory generates:

- Figure 4 (`loc.pdf`)
- The mean, median, min, max, and stddev of the sizes
- The largest differences between the approaches
- The number of slices larger than the original program

## RQ3 and RQ4

The process for these research questions is manual and requires comparing slices; it cannot be automated. We did make heavy use of `diff --side-by-side` in this analysis.
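
For readers without GNU `diff` at hand, the kind of side-by-side comparison mentioned for RQ3 and RQ4 can be approximated in Python. The following is only a minimal sketch built on the standard `difflib` module; it is not the authors' tooling. The function name `side_by_side` and the sample instruction lists are hypothetical and chosen for illustration.

```python
import difflib


def side_by_side(left_lines, right_lines):
    """Align two slices line by line, in the spirit of `diff --side-by-side`.

    Returns (left, marker, right) rows, where marker is " " for identical
    lines, "<" for a line only on the left, ">" for a line only on the
    right, and "|" for a line that differs between the two sides.
    """
    rows = []
    matcher = difflib.SequenceMatcher(a=left_lines, b=right_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        left = list(left_lines[i1:i2])
        right = list(right_lines[j1:j2])
        if tag == "equal":
            rows.extend((l, " ", r) for l, r in zip(left, right))
            continue
        # Pad the shorter side with None so unmatched lines stay visible.
        height = max(len(left), len(right))
        left += [None] * (height - len(left))
        right += [None] * (height - len(right))
        for l, r in zip(left, right):
            marker = "<" if r is None else ">" if l is None else "|"
            rows.append((l, marker, r))
    return rows


# Illustrative instruction sequences, not taken from the dataset.
ews_like = ["local.get 0", "i32.const 1", "i32.add"]
sew_like = ["local.get 0", "i32.add"]

for left, marker, right in side_by_side(ews_like, sew_like):
    print(f"{left or '':<14}{marker} {right or ''}")
```

In practice the two line lists would come from a pair of slice files for the same program, for example `EWS_adpcm.wat` and `SEW_adpcm.wat`, read with `pathlib.Path(...).read_text().splitlines()`. Interpreting the differences remains a manual step.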

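The summary that RQ1 reports (mean, min, max, stddev, and the number of slices computed under each time threshold) could be reproduced along the following lines. This is a sketch under stated assumptions, not the contents of `RQ1.py`: the function `summarize_times` is hypothetical, the times are made-up values in seconds, and it assumes a sample standard deviation, which the description above does not specify. The column layout of `stats.csv` is not described here, so loading it is left out.

```python
import statistics


def summarize_times(times, thresholds=(10, 100, 1000, 10000)):
    """Return summary statistics and per-threshold counts for slicing times."""
    summary = {
        "mean": statistics.mean(times),
        "min": min(times),
        "max": max(times),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "stddev": statistics.stdev(times),
    }
    # Count how many slices finished strictly below each threshold.
    below = {limit: sum(1 for t in times if t < limit) for limit in thresholds}
    return summary, below


# Made-up slicing times in seconds, for illustration only.
example_times = [2.0, 8.0, 50.0, 500.0, 5000.0]
summary, below = summarize_times(example_times)
print(summary)
print(below)
```

The counts are cumulative, so a slice that finishes in 8 seconds is counted under every threshold.
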
Created:

2023-07-17
