Harshkmr/sampledFineweb
Source: Hugging Face. Updated 2024-06-30. Indexed 2024-07-06.
Download link:
https://hf-mirror.com/datasets/Harshkmr/sampledFineweb
Description:
This dataset includes fields such as text, ID, dump, URL, date, file path, language, language score, and token count. It has a single training split of 10,000 samples, with a file size of 34,046,261 bytes.
Provider:
Harshkmr
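
As a rough illustration of the figures in the description, the sketch below computes the average stored size per sample and shows one way the training split might be loaded with the Hugging Face `datasets` library. The column names in `EXPECTED_COLUMNS` are an assumption modeled on the upstream FineWeb schema; the listing names the fields only in prose. The helper names `average_bytes_per_sample` and `load_train_split` are hypothetical. The listing also does not say whether the file size refers to the download or to the decoded data, so the average is only indicative.

```python
# Figures quoted in the dataset description.
NUM_SAMPLES = 10_000
FILE_SIZE_BYTES = 34_046_261

# Assumed column names (modeled on the FineWeb schema; not confirmed by this listing).
EXPECTED_COLUMNS = [
    "text", "id", "dump", "url", "date",
    "file_path", "language", "language_score", "token_count",
]


def average_bytes_per_sample(total_bytes: int, num_samples: int) -> float:
    """Return the mean stored size of one sample, in bytes."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return total_bytes / num_samples


def load_train_split():
    """Load the training split; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the rest runs offline
    return load_dataset("Harshkmr/sampledFineweb", split="train")


if __name__ == "__main__":
    avg = average_bytes_per_sample(FILE_SIZE_BYTES, NUM_SAMPLES)
    print(f"about {avg:.1f} bytes per sample")
```

Calling `load_train_split()` and comparing its `column_names` against `EXPECTED_COLUMNS` would be a quick way to check the assumed schema before relying on it.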



