Replication package for "Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort?"

DataCite Commons | Updated 2025-06-01 | Indexed 2024-08-18
Download link:
https://figshare.com/articles/dataset/Replication_package_for_Control_and_Data_Flow_in_Security_Smell_Detection_for_Infrastructure_as_Code_Is_It_Worth_the_Effort_/21929856/1
Resource description:
This replication package contains the source code of the prototype implementation of GASEL, a security smell detector for Ansible code, together with analysis scripts, notebooks, and derived data.

Note that throughout this package the prototype implementation goes by its work-in-progress title, "scansible", whereas the paper refers to it as GASEL.

The raw git repositories used in the empirical analysis are not provided in this package due to privacy and licensing concerns. Instead, we provide the scripts and data needed to clone them for replication.

Contents

* `scansible.tar.gz`: Source code of the GASEL prototype implementation.
* `scripts.tar.gz`: Scripts used in the empirical study to collect and prepare the dataset, run the analyses on the dataset, and post-process the results for RQ1.
* `Ansible smells.ipynb`: Notebook containing the statistical analyses for RQ2 through RQ4.
* `data.tar.gz`: Data generated by the analyses on the dataset, excluding raw repositories.
* `source_data.tar.gz`: Source data to start the analyses, containing pointers to the analysed repositories.
* `iac-security.tar.gz` and `docker-compose.yml`: Docker image and Docker Compose file, respectively, for isolated replication.

Replication Instructions

1. Download `source_data.tar.gz`, `iac-security.tar.gz`, `docker-compose.yml`, and `Ansible smells.ipynb`.
2. Download the Andromeda dataset, available at https://doi.org/10.6084/m9.figshare.13664519, and unzip its contents into `source_data/Andromeda`. The expected file structure is `source_data/Andromeda/GalaxyMetadata/*.yaml`.
4. Extract `source_data.tar.gz` and create an additional empty `data` directory. The latter is used to store the results.
5. Decompress the `iac-security.tar.gz` file: `gunzip iac-security.tar.gz`
6. Load the Docker image: `docker load < iac-security.tar`
7. Run the Docker image: `docker compose run -e GITHUB_AUTH_TOKEN= -d --rm --name iac-security scripts`
   * The Docker Compose file automatically pulls and runs an accompanying RedisGraph image, which GASEL needs to perform its detection.
   * The value of `GITHUB_AUTH_TOKEN` is left empty in the command above. Be sure to set it to a GitHub authentication token. This token requires no special scopes and is only used to query the number of stars of the considered repositories.
8. Attach to the running Docker container: `docker attach iac-security`
9. Your terminal is now attached to a shell in the Docker container, from which you can run the analysis scripts:
   1. Run `./prepare_dataset.sh` to prepare the dataset: merging the datasets, applying additional filtering criteria, cloning the repositories, and enumerating their Ansible files. This may take a while to complete because of the large number of repositories that need to be cloned.
   2. Run `./scan_dataset.sh` to run GASEL and the two baseline tools on the dataset. This may take multiple hours to finish.
   3. Run `./postprocess_results.sh` to create a sample of reports for manual validation.
   4. The resulting data is stored in the `data/` directory.
   5. After manual validation, run `python postprocess_results/calculate_precision_recall.py` to summarise each detector's precision and recall for each smell. You may need to activate the appropriate virtual environment first: `. .venv/bin/activate`.
   6. After manual validation, `python postprocess_results/prepare_reasons.py` and `python postprocess_results/summarise_reasons.py` can be used to manually classify the root causes for detector differences. As before, you may need to activate the virtual environment first.
10. Use `Ansible smells.ipynb` to perform the analysis for RQs 2 through 4. You will need to copy `data/results/scansible/reports.csv` to the appropriate file location.

As an alternative to step 9, you may run each individual script manually. See the shell scripts for usage examples.
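
Before starting the container, it can help to confirm that the working directory matches the layout the instructions describe. The following is a minimal sketch and is not part of the replication package: the function names `check_layout` and `compose_run_command` are hypothetical, and the sketch assumes that `docker-compose.yml`, `source_data/`, and `data/` all sit in the same directory. It only checks paths and assembles the command from step 7; it does not invoke Docker.

```python
import os
import shlex
from pathlib import Path


def check_layout(root: Path) -> list[str]:
    """Return a list of problems with the directory layout (empty if none).

    Hypothetical helper: the checks mirror the paths named in the
    replication instructions, assuming everything lives under `root`.
    """
    problems = []

    # Step 2: the Andromeda dataset must provide GalaxyMetadata/*.yaml.
    metadata_dir = root / "source_data" / "Andromeda" / "GalaxyMetadata"
    if not metadata_dir.is_dir() or not any(metadata_dir.glob("*.yaml")):
        problems.append(f"no *.yaml files found in {metadata_dir}")

    # Step 4: a `data` directory must exist to receive the results.
    if not (root / "data").is_dir():
        problems.append(f"missing results directory {root / 'data'}")

    # Step 1: the Docker Compose file must have been downloaded.
    if not (root / "docker-compose.yml").is_file():
        problems.append(f"missing {root / 'docker-compose.yml'}")

    return problems


def compose_run_command(token: str) -> list[str]:
    """Build the argument list for the command in step 7 with a token filled in."""
    if not token:
        raise ValueError("a GitHub authentication token is required")
    return [
        "docker", "compose", "run",
        "-e", f"GITHUB_AUTH_TOKEN={token}",
        "-d", "--rm", "--name", "iac-security", "scripts",
    ]


if __name__ == "__main__":
    for problem in check_layout(Path.cwd()):
        print("layout problem:", problem)

    # Assumption: the token is exported in the calling shell's environment.
    token = os.environ.get("GITHUB_AUTH_TOKEN", "")
    if token:
        print(shlex.join(compose_run_command(token)))
    else:
        print("GITHUB_AUTH_TOKEN is not set; export it before running step 7")
```

Run from the directory that holds the downloaded files, the script prints any layout problems followed by the fully assembled `docker compose run` command, which can then be pasted into the shell.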

Licence

Data contained within this replication package is licensed under CC BY-NC 4.0. The source code contained within this package is dual-licensed. You are free to use it for non-commercial purposes under the conditions of the GPL 3.0 licence. If you wish to use this product or any of its code commercially, contact us at: coen.de.roover@vub.be
Provider:
figshare
Created:
2023-03-15