ahmedsamirio/oasst2-9k-translation

Source: Hugging Face
Updated: 2024-07-13
Indexed: 2024-07-13
Download link:
https://hf-mirror.com/datasets/ahmedsamirio/oasst2-9k-translation
Description:
This dataset is a random sample of 9.45k messages from the OpenAssistant/oasst2 dataset, translated with GPT-4o into Modern Standard Arabic and Egyptian Arabic. It has 9.45k rows and 3 columns: en (the original English message), ar (the Modern Standard Arabic translation), and eg (the Egyptian Arabic translation). Its main use is translation between English, Modern Standard Arabic, and Egyptian Arabic. One limitation is that it samples individual messages rather than entire conversations, so it cannot be used for instruction fine-tuning in the translated languages; this limitation will be addressed in the extended version of the dataset.
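To illustrate the stated use for translation between the three languages, here is a minimal sketch that expands one row into all ordered source/target pairs. It assumes the three columns are named en, ar, and eg, as the description lists them; the helper name `to_translation_pairs` and the sample row are hypothetical placeholders, not real dataset content.

```python
from itertools import permutations

# Assumed column names: en (English), ar (Modern Standard Arabic), eg (Egyptian Arabic).
LANGS = ("en", "ar", "eg")


def to_translation_pairs(row):
    """Yield (src_lang, tgt_lang, src_text, tgt_text) for every ordered language pair."""
    for src, tgt in permutations(LANGS, 2):
        yield (src, tgt, row[src], row[tgt])


# Hypothetical placeholder row; real rows would come from the dataset itself,
# for example via the Hugging Face `datasets` library (not used here so the
# sketch stays offline and self-contained).
sample_row = {
    "en": "<English message>",
    "ar": "<Modern Standard Arabic translation>",
    "eg": "<Egyptian Arabic translation>",
}

pairs = list(to_translation_pairs(sample_row))
for src, tgt, src_text, tgt_text in pairs:
    print(f"{src} -> {tgt}: {src_text} => {tgt_text}")
```

Each row yields six directed pairs, so a single three-column row can feed training or evaluation in any direction, including between Modern Standard Arabic and Egyptian Arabic.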
Provider:
ahmedsamirio