ThaiSyntheticQA/WangchanThaiInstruct_Multi-turn_Conversation_Dataset

Source: Hugging Face. Updated: 2024-07-30. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/ThaiSyntheticQA/WangchanThaiInstruct_Multi-turn_Conversation_Dataset
Description:
---
license: cc-by-sa-4.0
task_categories:
- text-generation
language:
- th
tags:
- synthetic
- instruction-finetuning
size_categories:
- 1K<n<10K
---

# WangchanThaiInstruct Multi-turn Conversation Dataset

This is a Thai multi-turn conversation dataset built from [airesearch/WangchanThaiInstruct (Batch 1)](https://huggingface.co/datasets/airesearch/WangchanThaiInstruct) by an LLM. It was generated synthetically using an open-source LLM in the Thai language.

## Citation

> Thammaleelakul, S., & Phatthiyaphaibun, W. (2024). WangchanThaiInstruct Multi-turn Conversation Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.13132633

or BibTeX

```
@dataset{thammaleelakul_2024_13132633,
  author    = {Thammaleelakul, Sirapatch and Phatthiyaphaibun, Wannaphong},
  title     = {{WangchanThaiInstruct Multi-turn Conversation Dataset}},
  month     = jul,
  year      = 2024,
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.13132633},
  url       = {https://doi.org/10.5281/zenodo.13132633}
}
```

Provider:
ThaiSyntheticQA