gENder-IT

Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/vnmssnhv/gender-it
Description:
gENder-IT is a word-level annotated challenge set designed to address natural gender phenomena in translation. It focuses on English sentences whose gender is ambiguous and provides possible Italian translation options for them. Sentences carry the tags <F> (female), <M> (male), or <A> (ambiguous). The dataset is a subset of the MuST-SHE corpus, and its target task is machine translation with attention to gender phenomena.
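
The tagging scheme described above can be illustrated with a minimal Python sketch. It assumes the tags appear inline as the literal tokens <F>, <M>, and <A>; the actual file layout in the repository may differ, and the helper names and the example sentence are hypothetical, not taken from the dataset.

```python
import re
from collections import Counter

# Assumption: tags are written inline as the literal tokens <F>, <M>, <A>.
TAG_PATTERN = re.compile(r"<([FMA])>")

# Tag meanings as given in the dataset description.
TAG_NAMES = {"F": "female", "M": "male", "A": "ambiguous"}


def count_gender_tags(sentence: str) -> Counter:
    """Count how often each gender tag occurs in one annotated sentence."""
    return Counter(TAG_NAMES[tag] for tag in TAG_PATTERN.findall(sentence))


def strip_gender_tags(sentence: str) -> str:
    """Remove the tags and collapse the whitespace they leave behind."""
    return " ".join(TAG_PATTERN.sub(" ", sentence).split())


# Invented example sentence for illustration only.
example = "<A> The doctor said <F> she would call <A> the student ."

if __name__ == "__main__":
    print(count_gender_tags(example))
    print(strip_gender_tags(example))
```

Counting tags per sentence is one simple way to check how much of a sample is ambiguous before using it to evaluate a translation system.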