MSR 2023 Dataset

Source: DataCite Commons. Updated 2023-03-13; indexed 2024-08-18.
Download link:
https://figshare.com/articles/dataset/MSR_2023_Dataset/22264000
Description:
This is the artifact that accompanies the paper "Wasmizer: Curating WebAssembly-driven Projects on GitHub".

It contains:

- the scripts we used to produce a dataset of WebAssembly binaries
- a dataset of WebAssembly binaries (8915 `.wasm` files and 1384 `.wat` files)
- the WASMIZER tool, which can be used to automatically collect a new dataset with refined information

# Scripts

The `Scripts/` folder contains the scripts that we used to collect our dataset in December 2022. Since then, we have refined our scripts into a tool, WASMIZER, described at the end of this README and available in the `WASMIZER/` folder.

The `Scripts/` folder has three subfolders:

- `Collector/` contains the scripts that collect GitHub projects that may use WebAssembly as a compilation target.
- `Compiler/` tries to compile a project and extracts the `.wasm` and `.wat` files found after compilation.
- `SmellsChecker/` contains the checkers for the smells used in the case study of the paper.

Each of these directories contains a README providing more details.

# Dataset

The `Dataset/` folder contains our dataset of WebAssembly binaries collected in December 2022. It is structured as follows:

- The `wasm/` folder contains WebAssembly binaries in their binary format (`.wasm`).
- The `wat/` folder contains WebAssembly binaries in their textual format (`.wat`).

The basename of each file is its SHA checksum. Each binary is accompanied by a `.meta` file of the same name, containing the project and the path within the project where the binary was found.
For example, for the binary `00047ad76615715bb2b36fa2102135b8dc32ac3c17f3488451168f808e2039f0.wasm`, there is a `00047ad76615715bb2b36fa2102135b8dc32ac3c17f3488451168f808e2039f0.meta` file containing:

```
./JuiceFV-Emscripten_OpenGL/application/dependencies/lib_sources/GLM/test/gtx/test-gtx_easing.wasm
```

This indicates that we found a WebAssembly binary in the `JuiceFV/Emscripten_OpenGL` GitHub project, at location `application/dependencies/lib_sources/GLM/test/gtx/test-gtx_easing.wasm`.

Some metadata was not collected when this dataset was gathered and is therefore missing from this initial snapshot. However, we have bundled a tool, described below, that scrapes repositories, extracts `.wasm` files after compilation, and collects more metadata.

# WASMIZER

The WASMIZER tool is provided in the `WASMIZER/` directory. It is a refined version of the scripts we used to collect the dataset. More details can be found in `WASMIZER/README.md`. To obtain the latest version, run `git pull` from the `WASMIZER/` directory, or access the repository online at [https://github.com/arash-mazidi/WASMIZER](https://github.com/arash-mazidi/WASMIZER).

The tool is deployed and regularly pushes newly found projects, `.wasm` files, and `.wat` files to the following shared folder: https://tucloud.tu-clausthal.de/index.php/s/MMRQMEZm66GRGXI (password: wasmizer).
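The `.meta` convention described in the Dataset section can be read programmatically. The following is a minimal sketch, not part of the artifact: the names `MetaEntry`, `parse_meta_line`, and `digest_matches_name` are hypothetical. It rests on two assumptions that the README does not confirm: that the first hyphen in the leading directory name separates the GitHub owner from the repository (ambiguous when an owner name itself contains a hyphen), and that the "SHA checksum" is SHA-256, which the 64-hex-digit file names suggest.

```python
import hashlib
from pathlib import PurePosixPath
from typing import NamedTuple


class MetaEntry(NamedTuple):
    owner: str  # GitHub account, e.g. "JuiceFV"
    repo: str   # repository name, e.g. "Emscripten_OpenGL"
    path: str   # location of the binary inside the repository


def parse_meta_line(line: str) -> MetaEntry:
    """Split one `.meta` line into owner, repository, and in-project path."""
    text = line.strip()
    if text.startswith("./"):
        text = text[2:]
    # The first path component is the flattened "owner-repo" directory.
    project_dir, _, path = text.partition("/")
    # ASSUMPTION: the first hyphen separates owner from repository.
    # This is wrong for owners whose names contain a hyphen.
    owner, _, repo = project_dir.partition("-")
    return MetaEntry(owner, repo, path)


def digest_matches_name(file_name: str, data: bytes) -> bool:
    """Check that a file's basename equals the digest of its content."""
    # ASSUMPTION: the checksum is SHA-256 (64 hex digits); the README only says "SHA".
    expected = PurePosixPath(file_name).stem.lower()
    return hashlib.sha256(data).hexdigest() == expected
```

Applied to the example above, `parse_meta_line` yields the owner `JuiceFV`, the repository `Emscripten_OpenGL`, and the path `application/dependencies/lib_sources/GLM/test/gtx/test-gtx_easing.wasm`. To check a binary on disk, pass its file name and the bytes read from it to `digest_matches_name`.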
Provider:
figshare
Created:
2023-03-13