MultiMUC
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/xinyadu/gtt/
Resource description:
MultiMUC is a multilingual dataset built from MUC-4 documents. It provides automatic translations of the documents into five target languages, together with entity mention alignments. It also includes annotator corrections to the automatic translations and entity alignments, made with tools such as Awesome-align and TASA. The dataset covers 1,700 MUC-4 documents and is intended for fine-grained template filling.
Provider:
IARPA BETTER program



