five

efederici/mc-translation

收藏
Hugging Face2024-11-05 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/efederici/mc-translation
下载链接
链接失效反馈
官方服务:
资源简介:
该数据集包含来自OpenAI的MMMLU数据集的专业人工翻译,用于训练翻译模型,以帮助翻译未来的评估数据集。翻译评估基准是一个关键但具有挑战性的任务。虽然自动翻译可能会引入错误或偏见,但专业人工翻译既昂贵又耗时。该数据集利用现有的专业翻译(MMMLU)来训练专门的翻译模型,以协助翻译未来的评估集。

This dataset contains professional human translations from OpenAIs MMMLU dataset, repurposed to train translation models that can help translate future evaluation datasets. The dataset supports multiple languages including English, Swahili, Spanish, German, Chinese, Bengali, Italian, Hindi, Japanese, Korean, Portuguese, Arabic, and Indonesian. The dataset features include prompt, output, input language, output language, and conversations. The dataset is divided into a training set containing 209,115 samples. The goal of the dataset is to address the challenges of translating evaluation benchmarks, leveraging professional human translations to train specialized translation models to assist in future translation tasks.
提供机构:
efederici
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作