
A mixed-method in-depth study of test-specific refactorings: dataset

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14346265
Description:
This submission is the replication package and dataset for the paper "A mixed-method in-depth study of test-specific refactorings." The files are organized in folders, one per method.

Method 1 (Mining datasets) has seven files: two compressed Excel sheets and five archive files.

Excel sheets:
Victor Guerra Veloso - Review Matrix.xlsx.zstd: the data from the related work, as exported by the scripts in Gerrit2CommitMapper and MethodOneRelatedFiles.tar.zstd. It is also the input file for the association rules experiment.
Validation Sample ItemsetsSupport AllLevels FullyMerged.xlsx.zstd: as the name suggests, a sample of the fully merged output of the association rules experiment.

Archives:
Gerrit2CommitMapper.tar.zstd: the Gerrit2CommitMapper script, which we applied to Gerrit issues to extract the associated commits in the projects' git repositories.
MethodOneRelatedFiles.tar.zstd: scripts (as Python notebooks) used to conduct Method 1.
related-work-datasets.tar.zstd: all the raw datasets from related works.
MiningDatasetRMinerOutput.tar.zstd: the output of RefactoringMiner for each commit in our dataset.
TestRefactoringExistingDataMining.tar.zstd: the Python project we developed to assist the replication of Method 1's association rules experiment. It expects "Victor Guerra Veloso - Review Matrix.xlsx" as input and generates all the association rules for different settings: algorithms (FPGrowth, FPMax, and Apriori), hyper-parameters (minsupport ranging from 0.01 to 0.15), and transaction granularities (commit-level, file-level, and method-level).

Method 2 (Monitoring) has two compressed text files, one Excel sheet, and an archived version of the RefactoringMonitor tool source code. Note that RefactoringMonitor includes all components (WebApp, API, and worker), configuration files (docker-compose.yml), and a README.md with instructions for execution.

Text files:
InitialSet: list of repositories initially monitored (based on Method 1).
ExtendedSet: list of repositories monitored after collecting popular repositories (based on GitHub search).

Excel sheet:
The Excel sheet gathers the results of the Monitoring and Survey, as well as the multiple runs of the validation by the second author.

Method 3 (StackOverflow) has one Excel sheet and one archive file. The Excel sheet contains the results of the experiment and associates tags with the collected and studied StackOverflow questions. The archive file comprises all the scripts used to collect the StackOverflow questions and assist their analysis. It also includes both input and output data files. These scripts leverage webbrowser, the Python standard library module for opening a browser at a specific URL, to iterate over all StackOverflow questions, allowing the researcher to assign existing tags or create new ones.
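For readers unfamiliar with the association rules experiment in Method 1, the idea can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the code shipped in TestRefactoringExistingDataMining.tar.zstd: the function names, the toy transactions, and the refactoring labels are all hypothetical, and the search is a plain level-wise (Apriori-style) enumeration rather than FPGrowth or FPMax. Each transaction stands for one commit (commit-level granularity) and holds the set of refactoring types observed in it.

```python
from itertools import combinations

# Hypothetical toy data: one set of refactoring labels per commit.
TRANSACTIONS = [
    {"Extract Method", "Rename Method"},
    {"Extract Method", "Rename Method", "Move Class"},
    {"Extract Method"},
    {"Rename Method", "Move Class"},
]


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        level = {}
        for cand in candidates:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose k-subsets are all frequent (downward closure).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so this lookup cannot fail.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append(
                        (antecedent, itemset - antecedent, support, confidence)
                    )
    return rules


if __name__ == "__main__":
    # Sweep a few thresholds, loosely mirroring the 0.01 to 0.15 range
    # described above (the exact step used by the authors is not stated).
    for min_support in (0.01, 0.05, 0.15):
        itemsets = frequent_itemsets(TRANSACTIONS, min_support)
        rules = association_rules(itemsets, min_confidence=0.9)
        print(min_support, len(itemsets), len(rules))
```

Switching the granularity to file-level or method-level would only change how the transactions are built (one set per file or per method instead of per commit); the mining step stays the same.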
Created:
2024-12-10