ahmedsamirio/oasst2-9k-translation
Source: Hugging Face. Updated: 2024-07-13. Indexed: 2024-07-13.
Download link:
https://hf-mirror.com/datasets/ahmedsamirio/oasst2-9k-translation
Description:
This dataset is a random sample of 9.45k messages from the OpenAssistant/oasst2 dataset, translated with GPT-4o into Modern Standard Arabic and Egyptian Arabic. It has 9.45k rows and 3 columns: en (the original English message), ar (the Modern Standard Arabic translation), and eg (the Egyptian Arabic translation). Its main use is translation between English, Modern Standard Arabic, and Egyptian Arabic. One limitation is that individual messages were sampled rather than entire conversations, so the translated text cannot be used for instruction fine-tuning in those languages; this limitation is to be addressed in an extended version of the dataset.
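Since each row carries the same message in three columns (en, ar, eg), building translation pairs amounts to picking two of those columns. The following is a minimal sketch, not the author's code: `make_pairs` and `load_rows` are hypothetical helper names, and the split name "train" is an assumption that should be checked against the dataset page.

```python
from typing import Dict, Iterable, List, Tuple

# Column names taken from the description:
# en = English, ar = Modern Standard Arabic, eg = Egyptian Arabic.
COLUMNS = ("en", "ar", "eg")


def make_pairs(
    rows: Iterable[Dict[str, str]], src: str = "en", tgt: str = "eg"
) -> List[Tuple[str, str]]:
    """Return (source, target) text pairs, skipping rows where either side is empty."""
    if src not in COLUMNS or tgt not in COLUMNS or src == tgt:
        raise ValueError(f"src and tgt must be two different columns from {COLUMNS}")
    pairs = []
    for row in rows:
        source = (row.get(src) or "").strip()
        target = (row.get(tgt) or "").strip()
        if source and target:
            pairs.append((source, target))
    return pairs


def load_rows(split: str = "train"):
    """Load the dataset from the Hugging Face Hub.

    Requires `pip install datasets` and network access.
    The split name "train" is an assumption, not confirmed by the listing.
    """
    from datasets import load_dataset  # imported lazily so make_pairs works offline

    return load_dataset("ahmedsamirio/oasst2-9k-translation", split=split)
```

With the `datasets` library installed, `make_pairs(load_rows(), "en", "ar")` would give English to Modern Standard Arabic pairs, and swapping the column arguments gives any other direction. Because the rows are isolated messages rather than conversations, the pairs suit translation training or evaluation, not multi-turn instruction tuning.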
Provider:
ahmedsamirio



