A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects

Source: DataCite Commons. Updated: 2021-10-04. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/dataset/A_large-scale_comparative_analysis_of_Coding_Standard_conformance_in_Open-Source_Data_Science_projects/12377237/3
Description:
This study investigates the extent to which data science projects follow coding standards. In particular: which standards are followed, which are ignored, and how does this differ from traditional software projects? We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity.

results.tar.gz: Extracted data for each project, including raw logs of all detected code violations.
notebooks_out.tar.gz: Tables and figures generated by notebooks.
source_code_anonymized.tar.gz: Anonymized source code (at time of publication) to identify, clone, and analyse the projects. Also includes Jupyter notebooks used to produce figures in the paper.
The latest source code can be found at: https://github.com/a2i2/mining-data-science-repositories

Published in ESEM 2020: https://doi.org/10.1145/3382494.3410680
Preprint: https://arxiv.org/abs/2007.08978
Provider:
figshare
Created:
2021-10-04