turkish-nlp-suite/InstrucTurca
Hugging Face | Updated: 2024-08-12 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/turkish-nlp-suite/InstrucTurca
Resource summary:
InstrucTurca is a diverse Turkish instruction-tuning dataset intended mainly for instruction training of Turkish large language models (LLMs). It spans many fields, including tasks, code, poems, math, essays, medical texts, and more. This diversity makes it suitable for NLP tasks such as summarization, question answering, text generation, translation, and classification. The dataset was compiled from several English datasets and sources and translated into Turkish with Snowflake Arctic Instruct, then post-processed to remove hallucinated translations. It may be used commercially, following the Apache 2.0 license of Snowflake Arctic.
Provider:
turkish-nlp-suite



