VISAI-AI/slimorca-th-translated
收藏Hugging Face2025-12-29 更新2026-01-03 收录
下载链接:
https://hf-mirror.com/datasets/VISAI-AI/slimorca-th-translated
下载链接
链接失效反馈官方服务:
资源简介:
Slimorca TH Translated数据集是SlimOrca数据集的一个子集,使用Qwen3-30BA3B-Instruct-2507模型进行了英文到泰文的翻译。数据集包含对话内容,每个对话有四个字段:from表示对话来源,value_en表示英文内容,value_th表示泰文翻译内容,weight表示权重。数据集仅包含训练集,共有9865个样本。需要注意的是,翻译过程中有时模型会回答问题而不是进行翻译,因此在使用前需要对数据进行过滤和清洗。
Slimorca TH Translated is a subset of the SlimOrca dataset translated from English to Thai using the Qwen3-30BA3B-Instruct-2507 model. The dataset contains conversations with four fields: from indicating the source of the conversation, value_en for the English content, value_th for the Thai translation, and weight for the weight. The dataset includes only the training split with 9,865 examples. Note that the translation process sometimes results in the model answering the question instead of translating, so filtering and cleaning the data is necessary before use.
提供机构:
VISAI-AI



