pythainlp/thai-local-instruction-v2
Source: Hugging Face. Updated 2025-08-01; indexed 2025-08-09.
Download link:
https://hf-mirror.com/datasets/pythainlp/thai-local-instruction-v2
Overview:
Thai Local Instruction v2 is an instruction dataset covering four regional languages of Thailand: Korat (โคราช), Southern Thai (ปักษ์ใต้ or ภาษาใต้), Northern Thai, also called Kham Mueang (เหนือ or ภาษาคำเมือง), and Isan (อีสาน). Each sample consists of an input string and a target string, which makes the dataset suitable for text generation tasks. It has a single training split containing 39,829 samples. The data was collected from several websites, including Thai Wiktionary, the Isan Language Club, and pythainlp's Thai local language translation dataset.
Provider:
pythainlp
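
Since the overview describes each sample as an input string paired with a target string, a minimal sketch of preparing such records for text generation might look like the following. The column names `input` and `target` are assumptions inferred from that description and are not confirmed against the dataset schema. The helper names `to_text_pair` and `iter_pairs` are hypothetical, and the sample record is a placeholder, not real data. Loading uses `load_dataset` from the Hugging Face `datasets` library and needs network access.

```python
from typing import Dict, Iterator

# Dataset identifier as given in the listing above.
DATASET_ID = "pythainlp/thai-local-instruction-v2"


def to_text_pair(record: Dict[str, str],
                 input_key: str = "input",
                 target_key: str = "target") -> Dict[str, str]:
    """Map one record to a prompt/completion pair.

    The default keys "input" and "target" are assumed column names;
    pass different keys if the actual schema differs.
    """
    return {
        "prompt": record[input_key].strip(),
        "completion": record[target_key].strip(),
    }


def iter_pairs(split: str = "train") -> Iterator[Dict[str, str]]:
    """Yield prompt/completion pairs from the Hub (needs network and `datasets`)."""
    # Imported lazily so to_text_pair stays usable without the library installed.
    from datasets import load_dataset

    for record in load_dataset(DATASET_ID, split=split):
        yield to_text_pair(record)


# Placeholder record for illustration only; not taken from the dataset.
example = {"input": "  Translate to Isan: hello ", "target": " sample answer "}
pair = to_text_pair(example)
```

Calling `iter_pairs()` streams through the training split one record at a time. If the real column names differ, only the two key arguments of `to_text_pair` need to change.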



