Curation Corpus
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/curationcorp/curation-corpus
Description:
This dataset, named "Curation Corpus", contains carefully selected texts and is used to evaluate language models, including pretrained models. The associated task is language modeling.



