tommyp111/fineweb-2m
Source: Hugging Face | Updated: 2024-11-06 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/tommyp111/fineweb-2m
Summary:
This dataset provides the following fields for each record: text content, a unique identifier, the source data dump, URL, date, file path, language, language score, and token count. It is intended primarily for training and contains 2 million examples, with a total size of 6,650,534,433 bytes (about 6.65 GB) and a download size of 3,912,191,977 bytes (about 3.91 GB).
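To make the size figures above easier to reason about, here is a minimal Python sketch that derives the average record size and the download-to-dataset ratio from the quoted numbers. The helper names `size_summary` and `peek` are hypothetical, and `peek` assumes the third-party `datasets` package is installed, that the repository id matches the title of this listing, and that the split is named "train" (the listing only says the data is used for training).

```python
# Figures quoted in the summary above.
NUM_EXAMPLES = 2_000_000
DATASET_BYTES = 6_650_534_433
DOWNLOAD_BYTES = 3_912_191_977


def size_summary(num_examples=NUM_EXAMPLES,
                 dataset_bytes=DATASET_BYTES,
                 download_bytes=DOWNLOAD_BYTES):
    """Derive a few convenience numbers from the quoted sizes."""
    return {
        "dataset_gb": dataset_bytes / 1e9,            # decimal gigabytes
        "download_gb": download_bytes / 1e9,
        "avg_bytes_per_example": dataset_bytes / num_examples,
        "download_ratio": download_bytes / dataset_bytes,
    }


def peek(n=3):
    """Stream the first n records without downloading the full dataset.

    Assumptions: the `datasets` package is installed, network access is
    available, and the split is named "train".
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    stream = load_dataset("tommyp111/fineweb-2m", split="train", streaming=True)
    return list(stream.take(n))


if __name__ == "__main__":
    for key, value in size_summary().items():
        print(f"{key}: {value:.4f}")
```

Streaming is used in `peek` so that a quick look at a few records does not require fetching the full download of roughly 3.9 GB.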
Provider:
tommyp111



