
Towards Supporting Open Source Library Maintainers with Community-Based Analytics

Source: NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/15029877
Resource description:
Overview
This replication package contains the data and scripts used in our study. Please refer to the latest version of the data set. The package is structured into four main components.

Folder Structure
1. Data
   ProcessedData: Contains refined datasets that guide our research questions.
   RawData: Contains raw scraped data about dependents from GitHub, selected for analysis.
2. RepoClonerDataAnalyser
   The starting repository for the study. It selects the top 10 libraries and their dependents, clones the repositories, and analyzes all research questions. Implemented in Python.
3. methodTypeResolutionJavaParser
   A Java project used for method resolution. After the repositories have been cloned and the potential Java files filtered using the RepoClonerDataAnalyzer project, this tool parses the files and resolves method types.
4. JacocoCoverageReporter
   Converts raw JaCoCo HTML coverage reports into CSV format. Implemented in Python.

Usage Instructions
Each project within this package has its own README file with detailed setup and execution instructions. Below is a high-level guide:
Data Collection: Use RepoClonerDataAnalyser to select, clone, and filter dependents.
Method Resolution: Run methodTypeResolutionJavaParser on the filtered Java files.
Coverage Analysis: Use JacocoCoverageReporter to convert JaCoCo HTML reports into CSV format, then use RepoClonerDataAnalyser for further analysis. Please refer to our paper for more details.
Data Analysis: Use the processed data in the Data folder for research insights.

Requirements
Python 3.x
Java 8+
Required dependencies (listed in the individual project README files)

Note: In version 1, the individual README files include a random GitHub repository URL. It is intended solely for context and clarity; it does not lead to an accessible resource and results in a 404 error. We removed it in version 2 to avoid confusion. We recommend using the latest version of the dataset.
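
To illustrate the coverage-analysis step (converting a JaCoCo HTML report into CSV), here is a minimal sketch using only the Python standard library. It is not the package's JacocoCoverageReporter code. The names CoverageTableParser and html_report_to_csv are hypothetical, and the sketch assumes the report's summary table carries the id "coveragetable"; adjust table_id if your reports differ. Only cell text is collected, so cells that contain nothing but images come out empty.

```python
import csv
import io
from html.parser import HTMLParser


class CoverageTableParser(HTMLParser):
    """Collects the text of every cell in the table with a given id."""

    def __init__(self, table_id="coveragetable"):  # id is an assumption
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # > 0 while inside the target table
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read
        self._cell = None   # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth > 0:
                self.depth += 1  # nested table inside the target
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1
        elif self.depth > 0 and tag == "tr":
            self._row = []
        elif self.depth > 0 and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        if tag == "table":
            self.depth -= 1
        elif tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_report_to_csv(html_text, table_id="coveragetable"):
    """Return the target table's header and body rows as CSV text."""
    parser = CoverageTableParser(table_id)
    parser.feed(html_text)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

A typical call would read a report file (for example an index.html produced by JaCoCo), pass its text to html_report_to_csv, and write the returned string to a .csv file for later analysis.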
Created:
2025-03-22