Culture-Related Parallel Corpus
arXiv | Updated 2024-03-23 | Indexed 2024-06-21
Download link:
https://github.com/BigBinnie/Benchmarking-LLM-basedMachine-Translation-on-Cultural-Awareness
Resource description:
This study builds a parallel corpus centered on culture-specific entities in order to evaluate the cultural awareness of machine translation systems. The dataset contains 7,253 culture-specific entities spanning 18 concept categories and more than 140 countries and regions. Fine-grained annotation and classification are used to keep the culture-specific content accurate and varied. The dataset is mainly intended for improving the effectiveness of machine translation in cross-cultural communication, particularly when handling culture-specific terms and expressions.
Provided by:
University of Wisconsin-Madison
Created:
2023-05-24



