BEE-spoke-data/cosmopedia-v2-mincols
Source: Hugging Face. Updated: 2025-12-29. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/BEE-spoke-data/cosmopedia-v2-mincols
Description:
cosmopedia-v2-mincols is a text dataset derived from cosmopedia-v2, with the extra columns removed to reduce its size and make it easier to use. It has two features, the text content and its format, and is suitable for text generation tasks. The training split contains approximately 39.13 million examples, with a total size of 147550004237 bytes (about 147.55 GB). The dataset is released under the ODC-By license.
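As a rough illustration of the figures above, the sketch below converts the stated byte count into gigabytes and an average size per example, and shows one way the training split might be streamed rather than downloaded in full. The helper names `size_summary` and `stream_first_examples` are hypothetical, the column names `text` and `format` are an assumption based on the description, and the repository id is taken from the download link. The streaming helper assumes the Hugging Face `datasets` library is installed and is not called by default, since it needs network access.

```python
REPO_ID = "BEE-spoke-data/cosmopedia-v2-mincols"  # repository id from the listing
TOTAL_BYTES = 147_550_004_237                     # total size stated in the listing
NUM_EXAMPLES = 39_130_000                         # approximate count stated in the listing


def size_summary(total_bytes: int, num_examples: int) -> dict:
    """Derive human-readable size figures from the stated totals."""
    return {
        "gigabytes": total_bytes / 1e9,        # decimal GB
        "gibibytes": total_bytes / 2**30,      # binary GiB
        "avg_bytes_per_example": total_bytes / num_examples,
    }


def stream_first_examples(n: int = 3) -> list:
    """Stream the first n training examples without a full download.

    Assumes the `datasets` library is available and that the split is
    named "train", as the listing suggests.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    stream = load_dataset(REPO_ID, split="train", streaming=True)
    return list(stream.take(n))


if __name__ == "__main__":
    for key, value in size_summary(TOTAL_BYTES, NUM_EXAMPLES).items():
        print(f"{key}: {value:,.2f}")
    # Uncomment to fetch a few rows (requires network access):
    # for row in stream_first_examples():
    #     print(row.get("format"), str(row.get("text"))[:80])
```

Given the size of the full split, streaming is likely the more practical choice for a quick inspection.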
Provider:
BEE-spoke-data



