airesearch/WangchanThaiInstruct_7.24
Source: Hugging Face. Updated: 2024-09-02. Indexed: 2024-07-06.
Download link:
https://hf-mirror.com/datasets/airesearch/WangchanThaiInstruct_7.24
Description:
This is a 100% human-annotated Thai instruction dataset (first batch release) covering four domains: medical, finance, retail, and legal. It spans seven tasks: summarization, open QA, closed QA, classification, creative writing, brainstorming, and multiple-choice QA. Each row carries its own license depending on its source, and users must comply with the terms of the applicable license. The dataset features are ID, domain, instruction, input, output, Thai-specific information, tags, task type, and license. The training set contains 5014 examples, with a total size of 40496223 bytes and a download size of 12484646 bytes.
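A minimal sketch of how the training split might be loaded and sliced by domain, task type, or license. It assumes the Hugging Face `datasets` library is installed and that the repository id matches the title above. The column names `domain`, `task_type`, and `license` used in the sample rows are hypothetical placeholders, since the listing only describes the features and does not give the exact column names; check the real names on the loaded dataset before filtering.

```python
from collections import Counter


def load_train_split():
    """Fetch the train split from the Hugging Face Hub (requires network access)."""
    # Imported lazily so the helpers below also work without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("airesearch/WangchanThaiInstruct_7.24", split="train")


def count_by(rows, column):
    """Count how many rows carry each distinct value of `column`."""
    return Counter(row[column] for row in rows)


def select(rows, **conditions):
    """Keep only the rows whose columns equal every given value."""
    return [
        row for row in rows
        if all(row.get(name) == value for name, value in conditions.items())
    ]


# Hypothetical rows for illustration only; real column names and values may differ.
sample_rows = [
    {"domain": "Medical", "task_type": "Summarization", "license": "license-a"},
    {"domain": "Legal", "task_type": "Closed QA", "license": "license-b"},
    {"domain": "Medical", "task_type": "Closed QA", "license": "license-a"},
]

domain_counts = count_by(sample_rows, "domain")
medical_closed_qa = select(sample_rows, domain="Medical", task_type="Closed QA")

# With network access, the same helpers can be applied to the real data:
#   train = load_train_split()
#   print(train.column_names)  # confirm the actual column names first
```

Because each row has its own license, grouping by the license column before reuse (for example with `count_by`) is one way to see which terms apply to the subset you intend to use.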
Provided by:
airesearch



