Pride and Prejudice Seed Sets
Source: arXiv. Indexed: 2025-09-30
Download link:
https://www.gutenberg.org/files/1342/1342-0.txt
Description:
This dataset consists of 2000 four-seed sets (sets of four seeds each) built from the text of *Pride and Prejudice*, which runs to roughly 125,000 words. It is intended for evaluating sentence generation: given a seed set, the task is to construct a sentence from it. Because the source is a novel, it shows how models perform on a different genre, in contrast to the Europarl dataset.
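As a rough illustration of what building such seed sets could look like, the sketch below draws sets of four words from individual sentences of a text. The listing does not say how the seeds were actually selected, so everything here is an assumption made for illustration: the function name `build_seed_sets`, the naive sentence splitter, treating seeds as lowercase words, and uniform random sampling.

```python
import random
import re


def build_seed_sets(text, n_sets=2000, seeds_per_set=4, rng_seed=0):
    """Hypothetical helper: each seed set is `seeds_per_set` distinct words
    drawn from a single sentence of `text` (an assumed selection scheme)."""
    rng = random.Random(rng_seed)
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    candidates = []
    for sentence in sentences:
        # Lowercase alphabetic tokens, deduplicated and sorted for reproducibility.
        words = sorted(set(re.findall(r"[a-z]+", sentence.lower())))
        if len(words) >= seeds_per_set:
            candidates.append(words)
    if not candidates:
        raise ValueError("no sentence has enough distinct words")
    # Pick a sentence at random, then sample the seeds from it without replacement.
    return [
        tuple(rng.sample(rng.choice(candidates), seeds_per_set))
        for _ in range(n_sets)
    ]


sample_text = (
    "It is a truth universally acknowledged, that a single man in possession "
    "of a good fortune, must be in want of a wife. However little known the "
    "feelings or views of such a man may be, this truth is well fixed."
)
seed_sets = build_seed_sets(sample_text, n_sets=3)
for seeds in seed_sets:
    print(seeds)
```

To approximate the scale described above, the same function could be called with the full text from the download link and `n_sets=2000`. The Project Gutenberg file also contains a license header and footer, which would need to be stripped first.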
Provided by:
Project Gutenberg



