A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects
Source: Figshare. Updated: 2020-07-14. Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/A_large-scale_comparative_analysis_of_Coding_Standard_conformance_in_Open-Source_Data_Science_projects/12377237
Resource description:
This study investigates the extent to which data science projects follow coding standards: which standards are followed, which are ignored, and how this differs from traditional software projects. We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity.

results.tar.gz: Extracted data for each project, including raw logs of all detected code violations.
notebooks_out.tar.gz: Tables and figures generated by notebooks.
source_code_anonymized.tar.gz: Anonymized source code (at time of publication) to identify, clone, and analyse the projects. Also includes Jupyter notebooks used to produce figures in the paper.

The latest source code can be found at: https://github.com/a2i2/mining-data-science-repositories
Published in ESEM 2020: https://doi.org/10.1145/3382494.3410680
Preprint: https://arxiv.org/abs/2007.08978
Created:
2020-07-14



