EleutherAI/SmolLM2-1.7B-stage-4-20B
Source: Hugging Face. Updated 2025-04-17; indexed 2025-05-31.
Download link:
https://hf-mirror.com/datasets/EleutherAI/SmolLM2-1.7B-stage-4-20B
Overview:
The dataset has two fields, text and source, both strings. It currently provides only a train split, containing 17,590,288 examples and totaling about 74.38 GB. The README gives no detailed description; judging by the schema and size, this appears to be a large text corpus.
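Because the train split is about 74 GB, it may be more practical to stream a sample than to download everything first. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` package is installed and that the Hub repository ID matches the one in the download link. The helper names `check_record`, `count_sources`, and `stream_train_split` are hypothetical, introduced only for illustration.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Mapping


def check_record(record: Mapping[str, object]) -> bool:
    """Return True if the record has string `text` and `source` fields."""
    return isinstance(record.get("text"), str) and isinstance(record.get("source"), str)


def count_sources(records: Iterable[Mapping[str, object]], limit: int = 1000) -> Counter:
    """Tally the `source` field over at most `limit` records."""
    counts: Counter = Counter()
    for record in islice(records, limit):
        if not check_record(record):
            raise ValueError(f"unexpected record layout: {sorted(record)}")
        counts[record["source"]] += 1
    return counts


def stream_train_split():
    """Stream the train split instead of downloading it in full.

    Assumption: the repository ID below is taken from the download link
    and is reachable through the Hugging Face Hub.
    """
    from datasets import load_dataset  # third-party; imported lazily

    return load_dataset(
        "EleutherAI/SmolLM2-1.7B-stage-4-20B", split="train", streaming=True
    )


if __name__ == "__main__":
    # Inspect which sources appear among the first 1,000 streamed examples.
    print(count_sources(stream_train_split(), limit=1000).most_common(5))
```

Counting the `source` values over a small streamed sample is one quick way to see what the corpus is made of, given that the README does not describe it.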
Provider:
EleutherAI



