xszheng2020/the_stack_dedup_python_hits_1_qsc_codepython_frac_lines_pass
Source: Hugging Face. Updated: 2025-09-20. Indexed: 2025-10-25.
Download link:
https://hf-mirror.com/datasets/xszheng2020/the_stack_dedup_python_hits_1_qsc_codepython_frac_lines_pass
Description:
The dataset contains features describing code files and the repositories they come from: file hash, size, extension, programming language, and the repository's star count, number of issues, and number of forks, among others. It also includes code quality metrics such as average line length, maximum line length, character composition ratios, and code quality signals. The dataset has a single training split with approximately 296,213 examples.
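As an illustration of the line-length and character-composition metrics mentioned in the description, here is a minimal sketch of how such values could be computed for one source file. The function name `line_stats`, the dictionary keys, and the exact definitions (for example, counting alphanumeric characters over all characters including newlines) are assumptions made for this example; the dataset's own column names and formulas may differ.

```python
def line_stats(source: str) -> dict:
    """Compute simple per-file statistics (illustrative definitions only)."""
    lines = source.splitlines()
    lengths = [len(line) for line in lines]
    n_chars = len(source)
    return {
        # Mean number of characters per line, excluding newline characters.
        "avg_line_length": sum(lengths) / len(lengths) if lengths else 0.0,
        # Length of the longest line.
        "max_line_length": max(lengths, default=0),
        # Share of alphanumeric characters among all characters in the file.
        "alphanum_fraction": (
            sum(ch.isalnum() for ch in source) / n_chars if n_chars else 0.0
        ),
    }


# Hypothetical two-line Python file used as input.
sample = "import os\nprint(os.getcwd())\n"
stats = line_stats(sample)
print(stats)
```

Values like these are commonly used as filters, for example dropping files whose longest line is extremely long, though the thresholds applied to this dataset are not stated in the description.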
Provided by:
xszheng2020



