xP3
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/bigscience-workshop/xmtf
Description:
xP3 is a composite of supervised datasets spanning 46 languages, with both English prompts and machine-translated prompts. It is designed to support research on cross-lingual multitask prompted finetuning. xP3 extends the P3 dataset collection with tasks such as translation, simplification, and program synthesis, along with other code datasets, so it covers both natural-language and code-related tasks. The goal is to improve performance on English and non-English tasks alike.
Provided by:
BigScience Workshop



