A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects
Source: DataCite Commons. Updated 2021-10-04; indexed 2024-07-28.
Download link:
https://figshare.com/articles/dataset/A_large-scale_comparative_analysis_of_Coding_Standard_conformance_in_Open-Source_Data_Science_projects/12377237
Description:
This study investigates the extent to which data science projects follow coding standards. In particular: which standards are followed, which are ignored, and how does this differ from traditional software projects? We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity.

results.tar.gz: Extracted data for each project, including raw logs of all detected code violations.
notebooks_out.tar.gz: Tables and figures generated by notebooks.
source_code_anonymized.tar.gz: Anonymized source code (at time of publication) to identify, clone, and analyse the projects. Also includes Jupyter notebooks used to produce figures in the paper.
The latest source code can be found at: https://github.com/a2i2/mining-data-science-repositories

Published in ESEM 2020: https://doi.org/10.1145/3382494.3410680
Preprint: https://arxiv.org/abs/2007.08978
Provider:
figshare
Created:
2020-07-14



