xmanii/mauxi-mix-persian
收藏Hugging Face2024-11-30 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/xmanii/mauxi-mix-persian
下载链接
链接失效反馈官方服务:
资源简介:
MauxiMix是一个包含1000个高质量波斯语对话的数据集,专门用于训练和微调大型语言模型(LLMs)。该数据集通过高级语言模型从SmolTalk数据集中翻译而来,并经过精心筛选和分类。每个对话都包含用户和助手的角色信息,并按照特定主题和难度级别进行分类。数据集的结构和特征在README中也有详细描述,包括对话的格式、类别和难度级别。此外,README还提到了数据集的使用案例、技术细节、引用信息、目标和致谢。
MauxiMix is a carefully curated dataset of 1,000 high-quality Persian conversations, translated from the SmolTalk dataset using advanced language models. This dataset is specifically designed for training and fine-tuning Large Language Models (LLMs) with Supervised Fine-Tuning (SFT) techniques, contributing to the development of open-source Persian language models. The dataset contains 1,000 professionally translated Persian conversations, categorized by specific topics and difficulty levels, in a user/assistant role format, with high-quality translations suitable for LLM training and fine-tuning. The dataset is expanding to 10,000 conversations.
提供机构:
xmanii



