mjkmain/fineweb-edu-1M
Source: Hugging Face. Updated 2025-02-02; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/mjkmain/fineweb-edu-1M
Resource description:
A text dataset in which each record carries the text content, a unique identifier, the source URL, a file path, the text's language and language score, a word count, and a score. It has a single training split of 1 million examples, approximately 4.7 GB in total. A default configuration is provided, along with the file path for the training data.
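As a minimal sketch of how the training split described above might be read, the following Python uses the Hugging Face `datasets` library in streaming mode, so the roughly 4.7 GB of data is not downloaded up front. The repository ID is taken from the page title. The column names `text` and `score` are assumptions based on the field descriptions, not confirmed names, and the helper functions are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator

# Repository ID as shown in the page title.
DATASET_ID = "mjkmain/fineweb-edu-1M"


def load_train_stream():
    """Return the training split as a streaming (lazily read) dataset."""
    # Imported lazily so the filtering helper below works without the library.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split="train", streaming=True)


def filter_by_score(
    records: Iterable[Dict[str, Any]],
    min_score: float,
    score_key: str = "score",  # assumed column name
) -> Iterator[Dict[str, Any]]:
    """Yield only the records whose score is at least min_score."""
    for record in records:
        value = record.get(score_key)
        if value is not None and value >= min_score:
            yield record
```

A possible use is `for record in filter_by_score(load_train_stream(), 3.0): print(record["text"][:80])`, which requires network access and the `datasets` package. If the actual column names differ, pass the correct one through `score_key`.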
Provided by:
mjkmain



