
aieng-lab/genter

Source: Hugging Face. Updated 2025-10-14. Indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/aieng-lab/genter
Description:
This dataset consists of template sentences that associate first names (`[NAME]`) with third-person singular pronouns (`[PRONOUN]`), e.g., `[NAME] asked, not sounding as if [PRONOUN] cared about the answer`. It is a filtered version of BookCorpus that contains only sentences in which a first name is followed by its correct third-person singular pronoun (he/she). The template sentences (`masked`) use two template keys, `[NAME]` and `[PRONOUN]`, so that many sentences can be generated by substituting different names (e.g., names from `aieng-lab/namexact`) and inserting the correct pronoun for each name.
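The template-filling step described above can be sketched in a few lines of Python. This is a minimal illustration and not the dataset authors' code: the helper `fill_template`, the `"M"`/`"F"` gender labels, and the two sample names are assumptions made for the example. In practice the names and their genders could come from `aieng-lab/namexact`.

```python
# Pronoun inserted for each gender label. The "M"/"F" labels are an
# assumption of this sketch, not a documented field of the dataset.
PRONOUNS = {"M": "he", "F": "she"}


def fill_template(template: str, name: str, gender: str) -> str:
    """Replace the [NAME] and [PRONOUN] keys in a masked template sentence."""
    pronoun = PRONOUNS[gender]
    return template.replace("[NAME]", name).replace("[PRONOUN]", pronoun)


# Example template quoted in the dataset description.
template = "[NAME] asked, not sounding as if [PRONOUN] cared about the answer"

# Hypothetical (name, gender) pairs used only for illustration.
names = [("Alice", "F"), ("Bob", "M")]

sentences = [fill_template(template, name, gender) for name, gender in names]
for sentence in sentences:
    print(sentence)
```

The sketch does not capitalize a pronoun that opens a sentence; templates beginning with `[PRONOUN]` would need that extra step.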
Provider:
aieng-lab