
Dynamic Slicing of WebAssembly Binaries

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/8025835
Description:
This is the replication package that accompanies the paper titled "Dynamic Slicing of WebAssembly Binaries".

# Slices dataset

## Generating the dataset

The dynamic slices have been generated with [P-ORBS](https://syed-islam.github.io/research/program-analysis/#observation-based-program-slicing-orbs). The steps and scripts to generate the dynamic slices are included in the `slicing-steps/` directory. These steps also describe how to generate the `stats.csv` file that is included in this dataset.

The static slices have been generated with [wassail](https://github.com/acieroid/wassail). The scripts to generate the static slices are included in the current directory (`generate-static-slices.sh`, which relies on `run_wassail.sh`). These steps generate the file `static-stats.csv` that is included in this dataset.

The `original-size.csv` file, included in this dataset, can be generated as follows:

```sh
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
echo 'slice,original fn slice' > evaluation/original-sizes.csv
sed -E 's|^(.*) subjects-wasm-extract-slice/[^/]*/([^/]*)/.*$|\2,\1|' counts.txt >> evaluation/original-sizes.csv
```

The numbers in Table 1 of the paper (the list of programs in the dataset along with their sizes) can be generated as follows.
For the WebAssembly files, we can count the function size:

```sh
find subjects-wasm-extract-slice -name \*.c.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```

or the full program size:

```sh
find subjects-wasm-extract-slice -name t.wat -exec c_count {} \; | grep subjects > counts.txt
sed -E 's|^(.*) subjects-wasm-extract-slice/([^/]*)/.*$|\1 \2|' counts.txt | python evaluation/table1-wasm-mean.py
```

## Structure of the dataset

The dataset is structured as follows:

  - `subjects/` contains the instrumented `.c` source code, along with scripts to generate the dynamic slices. The original source code can be obtained by removing the line `printf("\nORBS:%x\n....`.
  - `subjects-wasm-extract-slice/` contains the original WebAssembly programs to slice. Each program has two files: `t.wat` is the full binary file, and `name.c.wat` is the binary code of the function containing the slicing criterion.
  - `all_slices/` contains the slices. For example, program `adpcm_ah1_254_expr` has the following files:
    - `adpcm/adpcm_ah1_254_expr/EWS_adpcm.wat`: the EWS slice
    - `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat`: the SEW slice
    - `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat`: the ESW slice
    - `adpcm/adpcm_ah1_254_expr/static_adpcm.wat.slice`: the SWS slice

    The other files are produced by intermediary steps and can be ignored.
    They are:
    - `adpcm/adpcm_ah1_254_expr/ESW_adpcm.wat.orig`: original (unsliced) binary *file* from which SW and ESW slices are computed
    - `adpcm/adpcm_ah1_254_expr/SEW_adpcm.wat.orig`: original (unsliced) binary *function* from which SEW slice is computed
    - `adpcm/adpcm_ah1_254_expr/SW_adpcm.wat`: slice of entire binary file from which ESW slice is extracted
    - `adpcm/adpcm_ah1_254_expr/WS_adpcm.wat`: compiled (binary) version of dynamic C slice from which EWS slice is extracted

# Research questions

## RQ1

The script `./RQ1.py` found in the `evaluation/` directory generates:

  - Figure 3 (time.pdf)
  - The mean, min, max, and stddev of the times
  - How many slices are computed below 10, 100, 1000, and 10000 seconds

## RQ2

The script `./RQ2.py` found in the `evaluation/` directory generates:

  - Figure 4 (loc.pdf)
  - The mean, median, min, max, and stddev of the sizes
  - The largest differences between the approaches
  - The number of slices larger than the original program

## RQ3 and RQ4

The process for these research questions is manual and requires comparing slices. It cannot be automated. We did make heavy use of `diff --side-by-side` in this analysis.
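
For readers who want to repeat the manual comparison of RQ3 and RQ4 from Python, the following is a minimal sketch, not part of the replication package. It assumes that every program follows the naming pattern of the single `adpcm_ah1_254_expr` example given under "Structure of the dataset"; the helper names `slice_paths` and `compare_slices` are hypothetical. It uses a unified diff from the standard `difflib` module as a stand-in for `diff --side-by-side`, so the output layout differs from what the authors looked at.

```python
from pathlib import Path
import difflib

# File-name patterns for the four slice kinds, generalized from the
# adpcm_ah1_254_expr example (an assumption: other programs may differ).
SLICE_FILES = {
    "EWS": "EWS_{subject}.wat",
    "SEW": "SEW_{subject}.wat",
    "ESW": "ESW_{subject}.wat",
    "static": "static_{subject}.wat.slice",
}


def slice_paths(root, subject, program):
    """Map each slice kind to its expected path under `root` (e.g. all_slices/)."""
    base = Path(root) / subject / program
    return {
        kind: base / pattern.format(subject=subject)
        for kind, pattern in SLICE_FILES.items()
    }


def compare_slices(lines_a, lines_b, name_a="a", name_b="b"):
    """Return a unified diff between two slices given as lists of lines."""
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )


if __name__ == "__main__":
    paths = slice_paths("all_slices", "adpcm", "adpcm_ah1_254_expr")
    ews, sew = paths["EWS"], paths["SEW"]
    # Only diff when both files are present, so the sketch runs without the dataset.
    if ews.exists() and sew.exists():
        diff = compare_slices(
            ews.read_text().splitlines(),
            sew.read_text().splitlines(),
            str(ews),
            str(sew),
        )
        print("\n".join(diff))
```

Run from the dataset root, the script prints the differences between the EWS and SEW slices of one program. Interpreting those differences remains the manual step described above.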
Created: 2023-07-17