A mixed-method in-depth study of test-specific refactorings: dataset
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14346265
Description:
This submission is the replication package and dataset of the paper "A mixed-method in-depth study of test-specific refactorings."
The files are organized in folders according to the corresponding method.
Method 1 (Mining datasets) has seven files: two compressed Excel sheets and five archive files.
Excel sheets:
Victor Guerra Veloso - Review Matrix.xlsx.zstd: this spreadsheet contains the data from related work, as exported by the scripts in Gerrit2CommitMapper and MethodOneRelatedFiles.tar.zstd. It is also the input file for the association rules experiment.
Validation Sample ItemsetsSupport AllLevels FullyMerged.xlsx.zstd: as the name suggests, this is a sample of the fully merged output of the association rules experiment.
Archives:
Gerrit2CommitMapper.tar.zstd contains the Gerrit2CommitMapper script that we applied to Gerrit issues to extract associated commits in the projects' git repositories.
MethodOneRelatedFiles.tar.zstd contains scripts (as Python notebooks) that were used to conduct Method 1.
related-work-datasets.tar.zstd contains all the raw datasets from related work.
MiningDatasetRMinerOutput.tar.zstd includes the output of RefactoringMiner for each commit in our dataset.
TestRefactoringExistingDataMining.tar.zstd comprises the Python project we developed to assist the replication of Method 1's association rules experiment. It expects "Victor Guerra Veloso - Review Matrix.xlsx" as input and generates all the association rules for different settings: algorithms (FPGrowth, FPMax, and Apriori), hyper-parameters (minimum support ranging from 0.01 to 0.15), and transaction granularities (commit-level, file-level, and method-level).
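To illustrate what the association rules experiment computes, here is a minimal sketch in plain Python of Apriori-style frequent itemset mining followed by rule extraction. It is not the code shipped in TestRefactoringExistingDataMining.tar.zstd. The function names, the toy transactions, and the confidence threshold are assumptions made for illustration, and the minimum support of 0.5 is chosen only so that the four-transaction example stays readable (the package sweeps 0.01 to 0.15). Each transaction stands for the set of refactoring types observed in one unit of the chosen granularity, for example one commit.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    # Level 1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for t in tx for item in t}
    result = {}
    k = 1
    while candidates:
        # Keep only the candidates that are frequent enough.
        level = {}
        for cand in candidates:
            support = sum(1 for t in tx if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)
        # Join frequent k-itemsets into (k+1)-itemset candidates.
        k += 1
        candidates = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / itemsets[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules


# Hypothetical commit-level transactions (not taken from the dataset).
toy_transactions = [
    {"Extract Method", "Rename Method"},
    {"Extract Method", "Rename Method", "Move Class"},
    {"Rename Method"},
    {"Extract Method", "Move Class"},
]

itemsets = frequent_itemsets(toy_transactions, min_support=0.5)
rules = association_rules(itemsets, min_confidence=0.9)
for antecedent, consequent, support, confidence in rules:
    print(set(antecedent), "->", set(consequent), support, confidence)
```

Switching the transaction granularity only changes how the transactions are built (one set per commit, per file, or per method); the mining step stays the same.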
Method 2 (Monitoring) has two compressed text files, one Excel sheet, and an archived version of the RefactoringMonitor tool source code. Note that RefactoringMonitor includes all components (WebApp, API, and worker), configuration files (docker-compose.yml), and a README.md with instructions for execution.
Text files:
InitialSet: list of repositories initially monitored (based on method 1)
ExtendedSet: list of repositories monitored after collecting popular repositories (based on GitHub search)
Excel sheet: the Excel sheet consolidates the results of the Monitoring and Survey, as well as the multiple validation runs performed by the second author.
Method 3 (StackOverflow) has one Excel sheet and one archive file.
The Excel sheet contains the results of the experiment and associates tags with the collected and studied StackOverflow questions.
The archive file comprises all the scripts used to collect the StackOverflow questions and assist their analysis. It also includes both input and output data files. These scripts use webbrowser, the Python standard library module that opens a browser at a specific URL, to iterate over all StackOverflow questions, allowing the researcher to assign existing tags or create new ones.
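The tagging workflow described above can be sketched as follows. This is an illustrative outline rather than the script included in the archive: the function name tag_questions, its parameters, and the prompt text are assumptions, and the question IDs a caller passes in would come from the package's input files. The only library call used is webbrowser.open from the Python standard library.

```python
import webbrowser


def tag_questions(question_ids, known_tags, open_url=webbrowser.open, ask=input):
    """Open each question in the browser and record the tags the researcher types.

    known_tags is a set that grows as new tags are created.
    open_url and ask are injectable so the loop can be exercised without a
    real browser or keyboard (an assumption of this sketch).
    """
    assignments = {}
    for qid in question_ids:
        # Short question URLs redirect to the full question page.
        open_url(f"https://stackoverflow.com/questions/{qid}")
        answer = ask(
            f"Tags for question {qid} "
            f"(comma-separated, known: {sorted(known_tags)}): "
        )
        tags = [t.strip() for t in answer.split(",") if t.strip()]
        # Newly created tags become available for later questions.
        known_tags.update(tags)
        assignments[qid] = tags
    return assignments
```

Calling tag_questions(ids, set()) with the default arguments opens one browser tab per question and waits for input on the terminal; the returned dictionary can then be written to a spreadsheet or CSV file.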
Created:
2024-12-10



