
GitHub Profiles (users/organisations) and Repositories (research/non-research) of Potsdam Researchers and Research Organisations: An annotated dataset with howfairis and software quality variables.

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/12607762
Description:
This dataset accompanies the paper "Software FAIRness, Documentation and Development Practices in Potsdam Researchers' GitHub Repositories". It includes three CSV files containing data on the GitHub profiles of users and organisations, their repositories annotated as research or non-research, and FAIRness and other software quality variables for the research repositories. The data were collected using the SWORDS-template-UP (v1.0.0) methods (collect_users, collect_repositories, collect_variables), an extended version of SWORDS-template that was adapted to our needs and is detailed in the paper.

GitHub (research) user/organisation profiles (github_profiles.csv)

Column name       Description
user_id           GitHub username
html_url          URL of the GitHub profile
type              Type of profile (user or organization)
organisation      Acronym or name of the organisation

GitHub repositories (github_repositories.csv)

This file contains the repositories scraped from the GitHub profiles of research users and organisations.

Column name       Description
html_url          URL of the repository
description       GitHub project description
project           Specifies whether the project is research or non-research
language          Programming language used in the project
organisation      Acronym or name of the university, institution, or research organisation
research_group    Acronym or name of the research group the repository belongs to

Research repositories, filtered and annotated (github_research_repositories_filtered_annotated.csv)

This file contains filtered and annotated information about research repositories. Each column is listed with its description and, where given, its collection method.

html_url
    Description: Repository URL

howfairis_repository
    Description: Indicates if the repository is public or private (True/False)
    Collection method: Script (howfairis_variable.py), a wrapper for the howfairis PyPI library, which checks the five FAIR recommendations

howfairis_license
    Description: Indicates if the repository has a license (True/False)
    Collection method: Script (howfairis_variable.py), as for howfairis_repository

howfairis_registry
    Description: Indicates if the repository is listed in a community registry (True/False)
    Collection method: Script (howfairis_variable.py), as for howfairis_repository

howfairis_citation
    Description: Indicates if the repository has a .cff file (True/False)
    Collection method: Script (howfairis_variable.py), as for howfairis_repository

howfairis_checklist
    Description: Indicates if the repository displays an OpenSSF best practices badge (True/False)
    Collection method: Script (howfairis_variable.py), as for howfairis_repository

fair_score
    Description: Score based on the howfairis variables (0-5)

dlr_soft_class
    Description: Application class assigned to the repository according to the DLR software engineering guidelines
    Collection method: Manual. Repositories were annotated based on the DLR software engineering guidelines. The guidelines give no specific metrics for categorising GitHub repositories into application classes, which were needed for a comparative analysis.

installation_instruction
    Description: Presence of installation instructions (True/False)
    Collection method: Manual. Checked for installation instructions in the README or in the project wiki pages.

project_information
    Description: Presence of basic project information in the README (True/False)
    Collection method: Manual. Checked whether the README has basic information about the project.

usage_guide
    Description: Presence of a usage guide (True/False)
    Collection method: Manual. Checked for a usage guide in the README or in the project wiki pages. For command-line tools, checked whether they have a help command that explains how to use the tool.

test_folder
    Description: Presence of a folder named test/tests in the root directory (True/False)
    Collection method: Script (test_folder.py). Checks for a folder named test or tests in the root directory of the repository.

requirements_explicit
    Description: Explicit requirements for Python, R, and C++ repositories (True/False)
    Collection method: Script (requirement_explicit.py). Checks for the files requirements.txt, DESCRIPTION, and CMakeLists.txt in the root directory.

continuous_integration
    Description: Indicates if the repository uses continuous integration (True/False)
    Collection method: Script (continious_integration.py). Checks for the .github folder (GitHub Actions) and, likewise, for other continuous integration services (Travis CI, CircleCI, Jenkins, Azure Pipelines).

ci_tool
    Description: Name of the continuous integration tool used
    Collection method: Script (continious_integration.py), as for continuous_integration

add_lint_rule
    Description: Indicates if additional linting rules are present (True/False)
    Collection method: Script (add_ci_rules.py). Scans the YAML files in the .github/workflows directory to detect linters for Python, R, and C++.

add_test_rule
    Description: Indicates if additional testing rules are present (True/False)
    Collection method: Script (add_ci_rules.py). Scans the YAML files in the .github/workflows directory to detect testing libraries for Python, R, and C++.

comment_at_start
    Description: Indicates the level of comments at the start of the program (most, more, some, less)
    Collection method: Script (comment_at_start.py). Checks for brief comments at the start of source code files in GitHub repositories.

language
    Description: Programming language used in the repository

type
    Description: Specifies if the profile is a user or organization
    Collection method: GitHub organisation or user profiles

organisation
    Description: Name of the university, company, research institute, or research organisation
    Collection method: Organisation name (from where the user was found)

research_group
    Description: Name or acronym of the research group

Data for publication: https://github.com/Software-Engineering-Group-UP/potsdam-research-repos
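
The relationship between fair_score and the five howfairis_* columns can be illustrated with a minimal Python sketch. The description only says the score is "based on" those variables and ranges from 0 to 5; the sketch assumes it is the count of checks that are True, and that cells are stored as the strings True/False. The helper names (parse_bool, fair_score, load_scores) and the two sample rows are hypothetical and invented for illustration; they are not part of the dataset or of SWORDS-template-UP.

```python
import csv
import io

# Column names are taken from the dataset description.
HOWFAIRIS_COLUMNS = [
    "howfairis_repository",
    "howfairis_license",
    "howfairis_registry",
    "howfairis_citation",
    "howfairis_checklist",
]


def parse_bool(value):
    """Interpret a CSV cell as a boolean (assumes 'True'/'False' strings)."""
    return value.strip().lower() == "true"


def fair_score(row):
    """Count the howfairis checks that are True (assumed scoring rule, 0-5)."""
    return sum(parse_bool(row[col]) for col in HOWFAIRIS_COLUMNS)


def load_scores(csv_file):
    """Map each repository URL to its recomputed score."""
    reader = csv.DictReader(csv_file)
    return {row["html_url"]: fair_score(row) for row in reader}


# Hypothetical sample rows, invented for illustration only (not real data).
SAMPLE = """html_url,howfairis_repository,howfairis_license,howfairis_registry,howfairis_citation,howfairis_checklist
https://github.com/example-org/repo-a,True,True,False,False,False
https://github.com/example-org/repo-b,True,True,True,True,True
"""

scores = load_scores(io.StringIO(SAMPLE))
for url, score in scores.items():
    print(url, score)
```

To try this on the published file, one could pass an open handle to github_research_repositories_filtered_annotated.csv (opened with newline="" and an explicit encoding) to load_scores and compare the result with the fair_score column; a mismatch would indicate that the counting assumption does not hold.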
Created:
2024-08-07