Dynamic Slicing of WebAssembly Binaries
Download link: https://zenodo.org/record/8025835
Description:
This replication package accompanies the paper "Dynamic Slicing of WebAssembly Binaries".
# Slices dataset
## Generating the dataset
The dynamic slices have been generated with [P-ORBS](https://syed-islam.github.io/research/program-analysis/#observation-based-program-slicing-orbs).
The steps and scripts to generate the dynamic slices are included in the `slicing-steps/` directory.
These steps also describe how to generate the `stats.csv` file that is included in this dataset.
The static slices have been generated with [wassail](https://github.com/acieroid/wassail).
The scripts to generate the static slices are included in the current directory (`generate-static-slices.sh` which relies on `run_wassail.sh`).
These steps generate the file `static-stats.csv` that is included in this dataset.
The `original-sizes.csv` file, included in this dataset under `evaluation/`, can be generated as follows:
```sh
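# Count the size of every function-level .wat file (requires c_count on PATH).
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
# Write the CSV header, then one "slice,size" row per counted file.
echo 'slice,original fn slice' > evaluation/original-sizes.csv
sed -E 's|^(.*) subjects-wasm-extract-slice/[^/]*/([^/]*)/.*$|\2,\1|' counts.txt >> evaluation/original-sizes.csv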
```
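The `sed` step above keeps two fields from each line of `counts.txt`: the text before the path (the count reported by `c_count`) and the second directory component under `subjects-wasm-extract-slice/` (the slice name). A minimal Python sketch of the same transformation follows, reusing the regular expression from the command. The function name `to_csv_row` and the sample line are illustrative assumptions, not part of the replication package.

```python
import re

# Same pattern as the sed command: group 1 is everything before the path
# (the count), group 2 is the second directory under the root (the slice name).
LINE_RE = re.compile(r"^(.*) subjects-wasm-extract-slice/[^/]*/([^/]*)/.*$")

def to_csv_row(line):
    """Return 'slice,count' for one line of counts.txt, or None if no match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    count, slice_name = match.group(1), match.group(2)
    return f"{slice_name},{count}"

# Hypothetical input line; real c_count output may be formatted differently.
sample = "120 subjects-wasm-extract-slice/adpcm/adpcm_ah1_254_expr/adpcm.c.wat"
print(to_csv_row(sample))
```

Lines that do not match the pattern yield `None` here, whereas `sed` would pass them through unchanged.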
The numbers in Table 1 of the paper (the list of programs in the dataset along with their sizes) can be generated as follows.
For the WebAssembly files, we can count the function size:
```sh
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```
or the full program size:
```sh
find subjects-wasm-extract-slice -name t.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```
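In both variants, the `sed` step rewrites each line to `<count> <program>` before piping it to `evaluation/table1-wasm-mean.py`. That script is not reproduced here. As a rough sketch only, assuming it averages the counts per program, the aggregation could look like this (`mean_per_program` and the sample lines are hypothetical):

```python
from collections import defaultdict
from statistics import mean

def mean_per_program(lines):
    """Average the counts of '<count> <program>' lines, grouped by program."""
    counts = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        count, program = parts
        counts[program].append(int(count))
    return {program: mean(values) for program, values in counts.items()}

# Hypothetical input, in the shape produced by the sed command above.
sample = ["100 adpcm", "140 adpcm", "30 other"]
print(mean_per_program(sample))
```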
## Structure of the dataset
The dataset is structured as follows:
- `subjects/` contains the instrumented `.c` source code, along with scripts to generate the dynamic slices.
  The original source code can be obtained by removing the line `printf("\nORBS:%x\n....`.
- `subjects-wasm-extract-slice/` contains the original WebAssembly programs to slice. Each program has two files: `t.wat` is the full binary file, and `name.c.wat` is the binary code of the function containing the slicing criterion.
- `all_slices/` contains the slices. For example, program `adpcm_ah1_254_expr` has the following files:
  - `adpcm/adpcm_ah1_254_expr/EWS_adpcm.wat`: the EWS slice
  - `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat`: the SEW slice
  - `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat`: the ESW slice
  - `adpcm/adpcm_ah1_254_expr/static_adpcm.wat.slice`: the SWS slice

  The other files are produced by intermediary steps and can be ignored. They are:
  - `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat.orig`: original (unsliced) binary *file* from which SW and ESW slices are computed
  - `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat.orig`: original (unsliced) binary *function* from which SEW slice is computed
  - `adpcm/adpcm_ah1_254_expr/SW_adpcm.wat`: slice of entire binary file from which ESW slice is extracted
  - `adpcm/adpcm_ah1_254_expr/WS_adpcm.wat`: compiled (binary) version of dynamic C slice from which EWS slice is extracted
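
As an illustration of this layout, the following sketch builds the paths of the four slices for one slicing criterion. It assumes that the listed paths are relative to `all_slices/` and that the naming pattern of the `adpcm_ah1_254_expr` example holds for the other programs. `slice_paths` is a hypothetical helper, not a script shipped with the dataset.

```python
from pathlib import PurePosixPath

def slice_paths(program, criterion, root="all_slices"):
    """Map each slice kind to its expected path under `root`."""
    base = PurePosixPath(root) / program / criterion
    paths = {kind: base / f"{kind}_{program}.wat" for kind in ("EWS", "SEW", "ESW")}
    # The SWS (static) slice uses a different file name.
    paths["SWS"] = base / f"static_{program}.wat.slice"
    return paths

for kind, path in slice_paths("adpcm", "adpcm_ah1_254_expr").items():
    print(kind, path)
```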
# Research questions
## RQ1
The script `./RQ1.py` found in the `evaluation/` directory generates:
- Figure 3 (time.pdf)
- The mean, min, max, and stddev of the times
- How many slices are computed below 10, 100, 1000, and 10000 seconds
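
The statistics listed above can be computed from a plain list of slicing times. The sketch below is not `RQ1.py` itself. It assumes the times are already loaded as seconds (the column layout of `stats.csv` is not described here) and uses the sample standard deviation, which may differ from the script's choice. The sample values are invented.

```python
from statistics import mean, stdev

def summarize_times(times, thresholds=(10, 100, 1000, 10000)):
    """Return summary statistics and, per threshold, how many times fall below it."""
    summary = {
        "mean": mean(times),
        "min": min(times),
        "max": max(times),
        "stddev": stdev(times),  # sample standard deviation (an assumption)
    }
    below = {t: sum(1 for x in times if x < t) for t in thresholds}
    return summary, below

# Invented times in seconds, for illustration only.
summary, below = summarize_times([4.0, 52.0, 730.0, 8100.0])
print(summary, below)
```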
## RQ2
The script `./RQ2.py` found in the `evaluation/` directory generates:
- Figure 4 (loc.pdf)
- The mean, median, min, max, and stddev of the sizes
- The largest differences between the approaches
- The number of slices larger than the original program
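
Two of these quantities, the slices larger than the original program and the largest difference between two approaches, can be sketched as follows. This is not `RQ2.py`: the dictionary layout, the helper names, and the sample sizes are all assumptions made for illustration.

```python
def larger_than_original(sizes):
    """List (criterion, approach) pairs whose slice is larger than the original."""
    return [
        (criterion, approach)
        for criterion, row in sizes.items()
        for approach, size in row.items()
        if approach != "original" and size > row["original"]
    ]

def largest_difference(sizes, first, second):
    """Return the criterion with the largest size gap between two approaches."""
    return max(sizes, key=lambda c: abs(sizes[c][first] - sizes[c][second]))

# Invented sizes (lines of code), for illustration only.
sizes = {
    "criterion_a": {"original": 100, "EWS": 40, "SEW": 35, "ESW": 38, "SWS": 90},
    "criterion_b": {"original": 50, "EWS": 55, "SEW": 20, "ESW": 22, "SWS": 45},
}
print(larger_than_original(sizes))
print(largest_difference(sizes, "EWS", "SEW"))
```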
## RQ3 and RQ4
These research questions were answered manually by comparing slices.
This process cannot be automated.
We made heavy use of `diff --side-by-side` in this analysis.
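
For a programmatic first pass before a manual comparison, Python's standard `difflib` module can list the lines on which two slices differ. This is only a sketch and is not part of the authors' process, which relied on `diff --side-by-side`. The helper name and the WebAssembly snippets are invented.

```python
import difflib

def changed_lines(slice_a, slice_b):
    """Return the lines that appear in only one of the two slices, with +/- markers."""
    diff = difflib.ndiff(slice_a.splitlines(), slice_b.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]

# Invented WebAssembly text snippets, for illustration only.
a = "local.get 0\ni32.const 1\ni32.add"
b = "local.get 0\ni32.add"
print(changed_lines(a, b))
```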
Created: 2023-07-17



