
Gitome: A curated dataset for GitHub README-related tasks

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10295080
Description:
About

This repository contains the source code used to replicate the experimental results of the paper "Gitome: A curated dataset for GitHub README-related tasks", submitted to the 21st International Conference on Mining Software Repositories (MSR 2024).

Authors: Claudio Di Sipio, Juri Di Rocco, Riccardo Rubei, Phuong Than Nguyen, and Davide Di Ruscio, Università degli Studi dell'Aquila, Italy

Data description

The dataset is structured as follows:

emf_metamodel.zip: the Ecore project with the Gitome data model
existing_dumps.zip: the existing datasets used to build Gitome
lang_aggr_stats.csv: the language data used to compute the statistics presented in the paper
langs.csv: all the languages and their frequencies
output_dataset.zip: the benchmarking dataset obtained by parsing the README files
repository_lists.zip: the list of repositories for each considered dataset (with possible duplicates)
topics.csv: all the topics and their frequencies
topics_aggr_stats.csv: the topic data used to compute the statistics presented in the paper
gitome_repo.txt: the list of URLs of the considered GitHub repositories

How to collect Gitome

To collect all the data stored in this archive, please refer to the supporting GitHub repository: https://github.com/MDEGroup/Gitome-MSR2024
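As an illustration of how the repository list might be consumed, the following is a minimal sketch that extracts unique owner/repository pairs from a list of GitHub URLs such as gitome_repo.txt. It assumes the file holds one full URL per line (including the https:// scheme); that layout is not confirmed by the description above, and the function name parse_repo_urls is hypothetical rather than part of the Gitome tooling.

```python
from urllib.parse import urlparse


def parse_repo_urls(lines):
    """Return unique (owner, repo) pairs from an iterable of GitHub URLs.

    Assumption: each line holds one full URL with a scheme, e.g.
    https://github.com/<owner>/<repo>. Lines without a scheme, or
    pointing at another host, are skipped.
    """
    seen = set()
    repos = []
    for raw in lines:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        parsed = urlparse(url)
        if parsed.netloc.lower() not in ("github.com", "www.github.com"):
            continue  # not a GitHub URL
        parts = [p for p in parsed.path.split("/") if p]
        if len(parts) < 2:
            continue  # no owner/repo pair in the path
        owner, repo = parts[0], parts[1]
        if repo.endswith(".git"):
            repo = repo[:-4]  # drop a trailing clone suffix
        # GitHub owner and repository names are case-insensitive,
        # so compare in lowercase and keep the first spelling seen.
        key = (owner.lower(), repo.lower())
        if key not in seen:
            seen.add(key)
            repos.append((owner, repo))
    return repos
```

A file object can be passed directly, for example `with open("gitome_repo.txt") as f: repos = parse_repo_urls(f)`. Deduplicating this way may also be useful for the lists in repository_lists.zip, which are described as containing possible duplicates, although their internal format would need to be checked first.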
Created:
2023-12-11