aieng-lab/genter
Source: Hugging Face. Updated: 2025-10-14. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/aieng-lab/genter
Description:
GENTER is a collection of template sentences that associate first names (`[NAME]`) with third-person singular pronouns (`[PRONOUN]`), e.g., `[NAME] asked, not sounding as if [PRONOUN] cared about the answer`. The sentences are filtered from BookCorpus and include only those in which a first name is followed by its correct third-person singular pronoun (he/she). Template sentences (`masked`) use two template keys, `[NAME]` and `[PRONOUN]`, so new sentences can be generated by substituting different names (e.g., names from `aieng-lab/namexact`) and inserting the correct pronoun for each name.
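The substitution described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `fill_template` helper, the `PRONOUNS` mapping, the gender labels, and the example names are all hypothetical, and loading the actual data is left out. Only the two template keys and the example sentence come from the description.

```python
# Hypothetical mapping from a gender label to a third-person singular pronoun.
# The labels "female"/"male" are an assumption, not taken from the dataset.
PRONOUNS = {"female": "she", "male": "he"}


def fill_template(template: str, name: str, gender: str) -> str:
    """Replace the [NAME] and [PRONOUN] keys in a template sentence."""
    pronoun = PRONOUNS[gender]
    return template.replace("[NAME]", name).replace("[PRONOUN]", pronoun)


# Example template quoted in the dataset description.
template = "[NAME] asked, not sounding as if [PRONOUN] cared about the answer."

# Illustrative (name, gender) pairs; a real run might draw these from a name list.
names = [("Alice", "female"), ("Bob", "male")]

sentences = [fill_template(template, name, gender) for name, gender in names]
for sentence in sentences:
    print(sentence)
```

The sketch always inserts a lowercase pronoun, so a template that begins with `[PRONOUN]` would need extra capitalization handling.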
Provided by:
aieng-lab



