tyoc213/split-avelina-python-edu
Source: Hugging Face. Updated: 2025-04-03. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/tyoc213/split-avelina-python-edu
Description:
This dataset contains code snippets stored as text and is published in five configurations that differ only in the number of examples. Every configuration has the same features: blob_id, repository name (repo_name), file path (path), file size in bytes (length_bytes), score (score), integer score (int_score), and text content (text). The configurations are 100k, 10k, 1M, 1k, and full, with 90000, 9000, 900000, 900, and 6910602 training examples and 10000, 1000, 100000, 100, and 767845 test examples respectively, so the test split is about 10% of each configuration.
Provider:
tyoc213



