ThaiSyntheticQA/WangchanThaiInstruct_Multi-turn_Conversation_Dataset
Source: Hugging Face. Updated 2024-07-30; indexed 2025-04-26.
Download link:
https://hf-mirror.com/datasets/ThaiSyntheticQA/WangchanThaiInstruct_Multi-turn_Conversation_Dataset
Resource description:
---
license: cc-by-sa-4.0
task_categories:
- text-generation
language:
- th
tags:
- synthetic
- instruction-finetuning
size_categories:
- 1K<n<10K
---
# WangchanThaiInstruct Multi-turn Conversation Dataset
We created a Thai multi-turn conversation dataset from [airesearch/WangchanThaiInstruct (Batch 1)](https://huggingface.co/datasets/airesearch/WangchanThaiInstruct) using an LLM. The conversations are synthetic: they were generated in Thai with an open-source LLM.
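A minimal sketch of loading the dataset with the Hugging Face `datasets` library, assuming the library is installed and the Hub is reachable. The helper name `load_conversations` is hypothetical and used only for illustration. This card does not list the split or column names, so the example prints the loaded object instead of assuming any.

```python
DATASET_ID = "ThaiSyntheticQA/WangchanThaiInstruct_Multi-turn_Conversation_Dataset"


def load_conversations(dataset_id: str = DATASET_ID):
    """Download the dataset from the Hugging Face Hub and return it.

    Hypothetical helper for illustration; it only wraps `load_dataset`.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # With no split given, this returns a DatasetDict keyed by split name.
    # Split and column names are not stated in this card: inspect them first.
    return load_dataset(dataset_id)


if __name__ == "__main__":
    dataset = load_conversations()
    print(dataset)  # shows the available splits, columns, and row counts
```

Inspecting the printed structure first is the safer choice here, since the layout of the conversation turns is not documented in this card.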
## Citation
> Thammaleelakul, S., & Phatthiyaphaibun, W. (2024). WangchanThaiInstruct Multi-turn Conversation Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13132633
Or, in BibTeX:
```
@dataset{thammaleelakul_2024_13132633,
author = {Thammaleelakul, Sirapatch and
Phatthiyaphaibun, Wannaphong},
title = {{WangchanThaiInstruct Multi-turn Conversation
Dataset}},
month = jul,
year = 2024,
publisher = {Zenodo},
doi = {10.5281/zenodo.13132633},
url = {https://doi.org/10.5281/zenodo.13132633}
}
```
Provided by:
ThaiSyntheticQA



