gENder-IT
Source: arXiv, indexed 2025-09-30
Download link:
https://github.com/vnmssnhv/gender-it
Description:
gENder-IT is a word-level annotated challenge set designed to address natural gender phenomena in translation. It focuses on English sentences whose gender is ambiguous and provides possible Italian translations for them. Annotations use the tags <F> (female), <M> (male), and <A> (ambiguous). The dataset is a subset of the MuST-SHE corpus, and its target task is machine translation involving gender phenomena.



