SmallDoge/smallcorpus
Source: Hugging Face. Updated: 2025-05-19. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/SmallDoge/smallcorpus
Description:
SmallCorpus is a dataset with multiple configurations, intended primarily for text generation. The configurations cover code, math, reflection texts in English and Chinese, and textbook and web texts in English and Chinese. Each configuration provides a training split, and split sizes range from a few thousand to hundreds of millions of examples.
Provider:
SmallDoge
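
Because the description mentions several configurations, each with a training split that may be very large, a streaming preview is one reasonable way to inspect the data. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable from your network. The helper names `list_configs`, `peek`, and `preview_config` are hypothetical and chosen for illustration; configuration names are not listed on this page, so the sketch queries them instead of hard-coding any.

```python
from itertools import islice

REPO_ID = "SmallDoge/smallcorpus"  # repository name as given in this listing


def list_configs(repo_id=REPO_ID):
    """Return the configuration names published for the dataset (needs network access)."""
    # Imported lazily so the pure helper below works without the library installed.
    from datasets import get_dataset_config_names
    return get_dataset_config_names(repo_id)


def peek(examples, n=3):
    """Return the first n items of any iterable as a list."""
    return list(islice(examples, n))


def preview_config(config_name, repo_id=REPO_ID, n=3):
    """Stream the train split of one configuration and return its first n examples."""
    from datasets import load_dataset
    # streaming=True iterates over the remote data instead of downloading the
    # whole split, which matters if a split holds hundreds of millions of rows.
    stream = load_dataset(repo_id, config_name, split="train", streaming=True)
    return peek(stream, n)
```

A typical session would call `list_configs()` first and then pass one of the returned names to `preview_config`. Field names inside each example depend on the configuration and are not assumed here.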



