schuler/cosmopedia-v2-textbook-and-howto-4.5m
Source: Hugging Face. Updated: 2024-11-17. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/schuler/cosmopedia-v2-textbook-and-howto-4.5m
Resource description:
The Cosmopedia V2 Textbook and WikiHow Dataset 4.5M is extracted from the HuggingFaceTB Smollm-Corpus, specifically its Cosmopedia V2 subset, and contains only entries categorized as textbook, textbook_unconditionned_topic, or WikiHow. It is intended for researchers and developers who need educational content: the material consists exclusively of educational text and step-by-step guides, which suits tasks such as training small language models, content generation, and analysis. The dataset is distributed as plain text, making it suitable for resource-constrained environments.
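The category filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to build the dataset: the field names `format` and `text` are assumed, the helper names `keep_entry` and `extract_texts` are hypothetical, and labels are matched case-insensitively because the exact spelling stored in the corpus is not given here.

```python
# Category labels named in the description. The exact spelling stored in
# the corpus is an assumption, so labels are compared case-insensitively.
KEPT_FORMATS = {"textbook", "textbook_unconditionned_topic", "wikihow"}


def keep_entry(record, field="format"):
    """Return True if the record's category label is one of the kept formats.

    `field` is an assumed column name; adjust it to the actual schema.
    """
    label = record.get(field)
    return isinstance(label, str) and label.strip().lower() in KEPT_FORMATS


def extract_texts(records, field="format", text_field="text"):
    """Yield the plain text of every record that passes the category filter."""
    for record in records:
        if keep_entry(record, field):
            yield record[text_field]


if __name__ == "__main__":
    # Tiny in-memory sample standing in for rows of the source corpus.
    sample = [
        {"format": "textbook", "text": "Chapter 1 ..."},
        {"format": "blogpost", "text": "My weekend ..."},
        {"format": "WikiHow", "text": "Step 1 ..."},
    ]
    for text in extract_texts(sample):
        print(text)
```

Because `keep_entry` takes a single record and returns a boolean, it could also be passed to the `filter` method of a Hugging Face `datasets` dataset, provided the assumed column name matches the real schema.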
Provider:
schuler



