Replication Package for Causes and Effects of Fitness Landscapes in System Test Generation: A Replication Study

Indexed by NIAID Data Ecosystem on 2026-05-02

Download link:

https://zenodo.org/record/14764980

Description:

This is the replication package for the paper "Causes and Effects of Fitness Landscapes in System Test Generation: A Replication Study". It contains all the data and scripts used in the paper. Because the raw data is very large and recalculating the results can take a long time, the calculated results are also saved separately in compressed form. The package provides the raw data, the processed data, and the scripts that calculate the results.

The package contains the following files and directories:

data/: contains all the raw and processed data.

    compressedData.zip: contains compressed statistics data generated by EvoMaster.

    covered_target.zip: contains a list of the covered targets. Not used in the experiments; shared to provide more insight.

    mergedCoverageMetrics.zip: contains the number of covered targets, the success rates, and the number of discrete fitness values for each branch and algorithm. This file is used to generate the figures and tables in the paper.

    mergedSearchMetrics.zip: contains the fitness landscape measurements for each branch and algorithm. This file is used to generate the figures and tables in the paper.

    rawData.zip: contains the raw data obtained after postprocessing. Once extracted, it is ~65GB in size.

    rpc-thrift-ncs_4.csv: contains the fitness values of each branch at each step, for a detailed analysis of the DBI metric (used in dbi_analysis.R).

    snapshotCompressedData.zip: contains statistics data generated by EvoMaster for each snapshot. Not used in the experiments; shared to provide more insight.

    suts.csv: contains the list of SUTs used in the experiments.

    targets.zip: contains the raw data produced by EvoMaster. Missing fitness values in this data must be filled in with "0", which can be done with data_fix.py: first merge the csv files by SUT name (with the merge_data function), then fill in the missing values (with the fix_all_suts function).
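The fill step described for targets.zip can be illustrated with a minimal sketch. This is not the package's data_fix.py: the helper name fill_missing_with_zero, the sample column names, and the assumption that a missing fitness value appears as an empty CSV cell are all hypothetical, and the real merge_data and fix_all_suts functions may work differently.

```python
import csv
import io


def fill_missing_with_zero(csv_text: str) -> str:
    """Return csv_text with every empty cell replaced by "0".

    Hypothetical helper (not part of the replication package). It assumes
    missing fitness values show up as empty cells, and that rows shorter
    than the header should be padded to the header width.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header, body = rows[0], rows[1:]
    width = len(header)

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in body:
        if not row:
            # Skip completely blank lines instead of turning them into zeros.
            continue
        # Pad short rows so every column of the header has a cell.
        padded = row + [""] * (width - len(row))
        writer.writerow([cell if cell.strip() else "0" for cell in padded])
    return out.getvalue()


if __name__ == "__main__":
    # Column names below are invented for illustration only.
    sample = "step,branch_1,branch_2\n1,0.5,\n2,,0.25\n"
    print(fill_missing_with_zero(sample))
```

Working on text rather than on files keeps the sketch independent of the actual layout of targets.zip, which this description does not specify.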
Analyze.R: contains all the metric calculations and the generation of the tables. For table generation, it uses the data/mergedCoverageMetrics.zip and data/mergedSearchMetrics.zip files.

data_fix.py: contains the code to fill in the missing fitness values and merge the csv files.

dbi_analysis.R: contains the detailed analysis of the DBI metric given in the paper.

exp.py: contains the experiment settings for EvoMaster.

generateFigures.R: contains the code to generate the figures in the paper. It uses the data/mergedCoverageMetrics.zip and data/mergedSearchMetrics.zip files.

helper.R: contains helper functions for the Analyze.R and generateFigures.R files.

metricCalculation.R: contains the calculation of the metrics used in the paper. First, you need the post-processed data produced by data_fix.py. Then you need to compress all the data using the init function in this file, which generates compressed data grouped by SUT name. The calculation can then be run with this terminal command:

    Rscript --vanilla metricCalculation.R {FIRST_BRANCH} {LAST_BRANCH} {SUT}

For example, for pay-publicapi:

    Rscript --vanilla metricCalculation.R 1 14 pay-publicapi

This approach makes it possible to calculate the metrics for each branch separately. Since the calculation is time-consuming, we suggest doing so: the per-branch calculations can then be run in parallel, for example on a cluster. After the calculation, you can merge all the branch calculations with the getAndMergeZipFiles function in the metricCalculation.R file. This produces the final data (mergedCoverageMetrics.zip and mergedSearchMetrics.zip) for the analysis, which you can use to generate the figures and tables.

settings.py: contains the settings of the analysis.
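The per-branch parallelization suggested for metricCalculation.R could be scripted roughly as follows. This is a sketch under stated assumptions, not part of the package: the helper names branch_commands and run_all are hypothetical, and it assumes that passing the same number as FIRST_BRANCH and LAST_BRANCH makes the script process exactly that one branch.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def branch_commands(first_branch: int, last_branch: int, sut: str) -> list[list[str]]:
    """Build one Rscript command per branch in [first_branch, last_branch].

    Assumption: FIRST_BRANCH == LAST_BRANCH selects a single branch.
    """
    return [
        ["Rscript", "--vanilla", "metricCalculation.R", str(b), str(b), sut]
        for b in range(first_branch, last_branch + 1)
    ]


def run_all(commands: list[list[str]], max_workers: int = 4) -> list[int]:
    """Run the commands concurrently and return their exit codes in order."""

    def run(cmd: list[str]) -> int:
        # check=False: collect the exit code instead of raising on failure.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    cmds = branch_commands(1, 14, "pay-publicapi")
    for cmd in cmds:
        print(" ".join(cmd))
    # Uncomment to launch the jobs (requires R and the prepared data):
    # print(run_all(cmds))
```

Threads are enough here because each worker only waits on an external Rscript process. On a cluster, the same command list could instead be handed to the scheduler's job-array mechanism.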
mergedCoverageMetrics.zip contains the following columns:

    problem, algorithm, branch, numberOfCoverage, successRate, numberOfDiscreteFitness

mergedSearchMetrics.zip contains the following columns:

    problem, algorithm, branch, seed, autoCorrelation, str_data, neutral_distance, neutrality_volume, information_content, partial_information_content, density_basin

For any questions, please contact the authors of the paper.

Created:

2025-01-30
