yushengsu/stack-v2-python-with-content-chunk1-modified
Source: Hugging Face. Updated 2024-07-22. Indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/yushengsu/stack-v2-python-with-content-chunk1-modified
Description:
This dataset is a Python-only version extracted from the original dataset. Each record describes a GitHub repository: repository name, URL, snapshot ID, revision ID, branch name, visit date, revision date, committer date, GitHub ID, star events count, fork events count, GitHub activity license ID, created time, updated time, pushed time, primary programming language, and number of files.
Each record also carries a list of files. Every file entry includes the alpha fraction, alphanumeric fraction, average line length, maximum line length, number of lines, byte length, content ID, detected licenses, license type, is-generated and is-vendor flags, language, path, source encoding, and the file text. The dataset consists of a single training split.
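The per-file attributes listed above (alpha fraction, alphanumeric fraction, line lengths, and so on) can be illustrated with a minimal sketch. The function name `file_stats`, the dictionary keys, and the exact definitions used here (fractions taken over all characters including newlines, lines split with `str.splitlines`) are assumptions for illustration; the listing does not say how the dataset actually computes these values.

```python
def file_stats(text: str) -> dict:
    """Compute simple per-file statistics under assumed definitions.

    This is a hypothetical helper, not the dataset's own pipeline.
    """
    lines = text.splitlines()
    n_chars = len(text)
    return {
        # Share of characters that are letters (assumed definition).
        "alpha_fraction": sum(c.isalpha() for c in text) / n_chars if n_chars else 0.0,
        # Share of characters that are letters or digits (assumed definition).
        "alphanum_fraction": sum(c.isalnum() for c in text) / n_chars if n_chars else 0.0,
        # Mean and maximum line length, excluding line terminators.
        "avg_line_length": sum(len(line) for line in lines) / len(lines) if lines else 0.0,
        "max_line_length": max((len(line) for line in lines), default=0),
        "num_lines": len(lines),
        # Size of the text when encoded as UTF-8.
        "length_bytes": len(text.encode("utf-8")),
    }


if __name__ == "__main__":
    sample = "import os\nprint(os.getcwd())\n"
    for key, value in file_stats(sample).items():
        print(key, value)
```

Statistics of this kind are commonly used to filter out files that are unlikely to be hand-written source code, such as minified or data-heavy files with very long lines or a low share of letters.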
Provider:
yushengsu



