A large-scale comparative analysis of Coding Standard conformance in Open-Source Data Science projects
Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-30.
Download link:
https://figshare.com/articles/dataset/A_large-scale_comparative_analysis_of_Coding_Standard_conformance_in_Open-Source_Data_Science_projects/12377237/3
Description:
This study investigates the extent to which data science projects follow coding standards. In particular, which standards are followed, which are ignored, and how does this differ from traditional software projects? We compare a corpus of 1048 Open-Source Data Science projects to a reference group of 1099 non-Data Science projects with a similar level of quality and maturity.

results.tar.gz: Extracted data for each project, including raw logs of all detected code violations.
notebooks_out.tar.gz: Tables and figures generated by the notebooks.
source_code_anonymized.tar.gz: Anonymized source code (as of the time of publication) used to identify, clone, and analyse the projects. Also includes the Jupyter notebooks used to produce the figures in the paper.

The latest source code can be found at: https://github.com/a2i2/mining-data-science-repositories
Published in ESEM 2020: https://doi.org/10.1145/3382494.3410680
Preprint: https://arxiv.org/abs/2007.08978
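The three archives named in the description can be inspected before unpacking them in full. The following is a minimal sketch using Python's standard `tarfile` module. It assumes the archives have already been downloaded into the current directory under the file names given above. The helper `list_archive_members` is a hypothetical name introduced for illustration, and nothing here reflects the internal layout of the archives, which the description does not specify.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path, limit=10):
    """Return up to `limit` member names from a gzip-compressed tar archive."""
    names = []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Iterating the archive reads member headers lazily, so a large
        # archive does not have to be scanned in full.
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


if __name__ == "__main__":
    # Assumption: the files sit in the current directory under the names
    # used in the dataset description.
    for filename in (
        "results.tar.gz",
        "notebooks_out.tar.gz",
        "source_code_anonymized.tar.gz",
    ):
        path = Path(filename)
        if path.exists():
            print(filename, list_archive_members(path))
        else:
            print(filename, "not found locally")
```

Listing a few member names first is a cheap way to confirm that a download is intact and to see the directory layout before extracting anything.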
Created:
2023-06-28



