codeparrot/codeparrot-clean-train
Source: Hugging Face. Updated: 2022-10-10. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/codeparrot/codeparrot-clean-train
Resource description:
# CodeParrot 🦜 Dataset Cleaned (train)
Train split of [CodeParrot 🦜 Dataset Cleaned](https://huggingface.co/datasets/lvwerra/codeparrot-clean).
## Dataset structure
```python
DatasetDict({
    train: Dataset({
        features: ['repo_name', 'path', 'copies', 'size', 'content', 'license', 'hash', 'line_mean', 'line_max', 'alpha_frac', 'autogenerated'],
        num_rows: 5300000
    })
})
```
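The structure above can be inspected without downloading the whole split. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and that the repository id matches the page title (`codeparrot/codeparrot-clean-train`). The helper names `missing_features` and `peek_train_split` are hypothetical and used only for illustration.

```python
from itertools import islice

# Column names copied from the DatasetDict repr above.
FEATURES = ['repo_name', 'path', 'copies', 'size', 'content', 'license',
            'hash', 'line_mean', 'line_max', 'alpha_frac', 'autogenerated']


def missing_features(record):
    """Return the expected columns that are absent from one record (a dict)."""
    return [name for name in FEATURES if name not in record]


def peek_train_split(n=3):
    """Stream the first n records instead of downloading the full split.

    Assumes `pip install datasets` and network access to the Hub.
    """
    # Imported lazily so missing_features() works without the package.
    from datasets import load_dataset

    stream = load_dataset("codeparrot/codeparrot-clean-train",
                          split="train", streaming=True)
    return list(islice(stream, n))


if __name__ == "__main__":
    for record in peek_train_split():
        # An empty list means the record carries every listed column.
        print(record["repo_name"], record["path"], missing_features(record))
```

Streaming is chosen here because the split holds about 5.3 million rows; passing `streaming=True` yields records lazily, so a quick schema check does not require the full download.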
Provided by:
codeparrot
Summary of original information
CodeParrot 🦜 Dataset Cleaned (train)
Dataset structure
- Dataset type: training split
- Dataset size: 5,300,000 records
- Data features:
  - repo_name
  - path
  - copies
  - size
  - content
  - license
  - hash
  - line_mean
  - line_max
  - alpha_frac
  - autogenerated



