semran1/cosmopedia_4B

Name: semran1/cosmopedia_4B
Creator: semran1
Published: 2025-01-10 11:21:25
License: No description provided

Source: Hugging Face. Updated 2025-01-10; indexed 2025-02-15.

Download link:

https://hf-mirror.com/datasets/semran1/cosmopedia_4B

Resource overview:

Each sample in the dataset contains a text and its corresponding token count, which makes the dataset suitable for natural language processing tasks. It consists of a single training split with 4,293,040 samples and is approximately 16 GB in size.
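Because the full training split is about 16 GB, it can be convenient to stream it and inspect token counts on a small sample first. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` library is installed, and the column names `text` and `token_count` are assumptions, since the listing only says that each sample carries text and a token count. The helper names are hypothetical.

```python
# Assumed column names; check the dataset's actual features before relying on them.
TEXT_COLUMN = "text"
TOKEN_COLUMN = "token_count"


def summarize_token_counts(rows, limit=1000):
    """Return (rows_seen, total_tokens, mean_tokens) over at most `limit` rows."""
    seen = 0
    total = 0
    for row in rows:
        if seen >= limit:
            break
        total += int(row[TOKEN_COLUMN])
        seen += 1
    mean = total / seen if seen else 0.0
    return seen, total, mean


def stream_train_split(repo_id="semran1/cosmopedia_4B"):
    """Open the training split in streaming mode (requires network access)."""
    # Imported lazily so summarize_token_counts works without the library.
    from datasets import load_dataset

    # streaming=True iterates over samples without downloading everything first.
    return load_dataset(repo_id, split="train", streaming=True)


# Example usage (not run here, since it needs network access):
#   seen, total, mean = summarize_token_counts(stream_train_split(), limit=1000)
#   print(f"{seen} rows, {total} tokens, {mean:.1f} tokens per row")
```

Streaming keeps the first look cheap. If the token count column turns out to have a different name, only `TOKEN_COLUMN` needs to change.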

Provided by:

semran1
