XGLUE
Source: arXiv. Updated: 2020-05-22. Indexed: 2024-06-21.
Download link:
https://microsoft.github.io/XGLUE/
Overview:
XGLUE is a cross-lingual benchmark dataset created by Microsoft Research for training and evaluating large-scale cross-lingual pre-trained models. It comprises 11 diverse tasks covering both natural language understanding and natural language generation, and provides annotated data in multiple languages. The authors extend the Unicoder model to handle both understanding and generation tasks, which gives XGLUE a strong baseline, and they also evaluate several other cross-lingual pre-trained models on the benchmark. The dataset is intended to help address the imbalance in resources across languages, particularly between English and other languages.
Provider:
Microsoft Research
Created:
2020-04-03



