Replication package for "Control and Data Flow in Security Smell Detection for Infrastructure as Code: Is It Worth the Effort?"
Source: Figshare. Updated: 2023-03-15. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Replication_package_for_Control_and_Data_Flow_in_Security_Smell_Detection_for_Infrastructure_as_Code_Is_It_Worth_the_Effort_/21929856
Resource description:
This replication package contains the source code of the prototype implementation of GASEL, a security smell detector for Ansible code, together with analysis scripts, notebooks, and derived data. Note that throughout this package the prototype is named "scansible", its work-in-progress title, whereas the paper refers to it as GASEL. The raw git repositories used in the empirical analysis are not included in this package due to privacy and licensing concerns, but we provide the scripts and data necessary to clone them for replication.

Contents

* `scansible.tar.gz`: Source code of the GASEL prototype implementation.
* `scripts.tar.gz`: Scripts used in the empirical study to collect and prepare the dataset, run the analyses on the dataset, and post-process the results for RQ1.
* `Ansible smells.ipynb`: Notebook containing the statistical analyses for RQ2 through RQ4.
* `data.tar.gz`: Data generated by the analyses on the dataset, excluding raw repositories.
* `source_data.tar.gz`: Source data to start the analyses, containing pointers to the analysed repositories.
* `iac-security.tar.gz` and `docker-compose.yml`: Docker image and Docker Compose file, respectively, for isolated replication.

Replication Instructions

1. Download `source_data.tar.gz`, `iac-security.tar.gz`, `docker-compose.yml`, and `Ansible smells.ipynb`.
2. Download the Andromeda dataset, available at https://doi.org/10.6084/m9.figshare.13664519, and unzip its contents into `source_data/Andromeda`. The expected file structure is `source_data/Andromeda/GalaxyMetadata/*.yaml`.
4. Extract `source_data.tar.gz` and create an additional empty `data` directory. The latter is used to store results.
5. Decompress the `iac-security.tar.gz` file: `gunzip iac-security.tar.gz`
6. Load the Docker image with `docker load`, using the archive decompressed in step 5 as input.
7. Run the Docker image: `docker compose run -e GITHUB_AUTH_TOKEN= -d --rm --name iac-security scripts`
   * The docker-compose file will automatically pull and run an accompanying RedisGraph image, which is necessary to perform GASEL's detection.
   * Be sure to supply a GitHub authentication token as the value of `GITHUB_AUTH_TOKEN`. This token requires no special scopes and is used only to query the number of stars of the considered repositories.
8. Attach to the running Docker container: `docker attach iac-security`
9. Your terminal is now attached to a shell in the Docker container, from which you can run the analysis scripts:
   1. Run `./prepare_dataset.sh` to prepare the dataset: merging the datasets, applying additional filtering criteria, cloning the repositories, and enumerating their Ansible files. Note that this may take a while to complete because of the large number of repositories that need to be cloned.
   2. Run `./scan_dataset.sh` to run GASEL and the two baseline tools on the dataset. Note that this may take multiple hours to finish.
   3. Run `./postprocess_results.sh` to create a sample of reports for manual validation.
   4. The resulting data will be stored in the `data/` directory.
   5. After manual validation, run `python postprocess_results/calculate_precision_recall.py` to summarise each detector's precision and recall for each smell. Note that you may need to activate the appropriate virtual environment first: `. .venv/bin/activate`.
   6. After manual validation, `python postprocess_results/prepare_reasons.py` and `python postprocess_results/summarise_reasons.py` can be used to manually classify the root causes of differences between detectors. As before, you may need to activate the virtual environment first.
10. Use `Ansible smells.ipynb` to perform the analysis for RQs 2 through 4. You will need to copy `data/results/scansible/reports.csv` to the appropriate file location.

As an alternative to step 9, you may run each script individually. See the shell scripts for usage examples.
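
Before starting the long-running container steps, it can help to confirm that the files and directories from steps 1 to 5 are in place. The following is a minimal Python sketch and is not part of the replication package: the function name `check_layout` is hypothetical, and it assumes that `docker-compose.yml`, the decompressed `iac-security.tar`, `source_data/`, and `data/` all sit in the same working directory, which the instructions do not state explicitly.

```python
from pathlib import Path


def check_layout(root: Path) -> list[str]:
    """Return the problems found under `root`; an empty list means none were found.

    Hypothetical helper: the checked names come from the replication
    instructions, but placing them all under one directory is an assumption.
    """
    problems = []

    # Step 1 (download) and step 5 (gunzip yields iac-security.tar).
    for name in ("docker-compose.yml", "iac-security.tar"):
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")

    # Step 2: expected structure is source_data/Andromeda/GalaxyMetadata/*.yaml.
    metadata_dir = root / "source_data" / "Andromeda" / "GalaxyMetadata"
    if not any(metadata_dir.glob("*.yaml")):
        problems.append(f"no .yaml files in {metadata_dir.relative_to(root)}")

    # Step 4: an additional, empty data directory that will hold the results.
    data_dir = root / "data"
    if not data_dir.is_dir():
        problems.append("missing directory: data")
    elif any(data_dir.iterdir()):
        problems.append("data directory is not empty")

    return problems


if __name__ == "__main__":
    issues = check_layout(Path.cwd())
    for issue in issues:
        print(issue)
    print("layout looks ready" if not issues else f"{len(issues)} problem(s) found")
```

Run it from the directory that holds the downloaded files. An empty result suggests the layout matches what the instructions describe, although it does not verify the contents of any file.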

Licence

Data contained within this replication package is licensed under CC BY-NC 4.0. The source code contained within this package is dual-licensed. You are free to use it for non-commercial purposes under the conditions of the GPL 3.0 licence. If you wish to use this product or any of its code commercially, contact us at: coen.de.roover@vub.be

Created:
2023-03-15



