OpenWebText (OWT) Corpus
Source: arXiv (indexed 2025-09-30)
Download link:
https://openwebtext2.readthedocs.io/en/latest/
Description:
This dataset contains 100,000 naturally occurring prompts for sentiment control experiments. Each prompt is paired with 25 continuations sampled from the GPT-2 large language model.
Provider:
OpenWebText
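
The layout in the description (one prompt, 25 continuations) could be represented roughly as below. This is a minimal sketch only: the listing does not specify a file format or field names, so `PromptRecord`, `prompt`, `continuations`, and `is_complete` are hypothetical, and the example strings are placeholders rather than real dataset content.

```python
from dataclasses import dataclass
from typing import List

# Number of continuations per prompt, as stated in the description.
CONTINUATIONS_PER_PROMPT = 25


@dataclass
class PromptRecord:
    """Hypothetical in-memory record; field names are assumptions."""
    prompt: str
    continuations: List[str]


def is_complete(record: PromptRecord) -> bool:
    """Return True if the record carries exactly 25 continuations."""
    return len(record.continuations) == CONTINUATIONS_PER_PROMPT


# Placeholder data, not taken from the dataset.
example = PromptRecord(
    prompt="The movie was",
    continuations=[f"placeholder continuation {i}" for i in range(25)],
)
```

A check like `is_complete` is a cheap sanity test to run over every record after download, before any sentiment scoring.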



