five

xmanii/mauxi-mix-persian

收藏
Hugging Face2024-11-30 更新2024-12-14 收录
下载链接:
https://hf-mirror.com/datasets/xmanii/mauxi-mix-persian
下载链接
链接失效反馈
官方服务:
资源简介:
MauxiMix是一个包含1000个高质量波斯语对话的数据集,专门用于训练和微调大型语言模型(LLMs)。该数据集通过高级语言模型从SmolTalk数据集中翻译而来,并经过精心筛选和分类。每个对话都包含用户和助手的角色信息,并按照特定主题和难度级别进行分类。数据集的结构和特征在README中也有详细描述,包括对话的格式、类别和难度级别。此外,README还提到了数据集的使用案例、技术细节、引用信息、目标和致谢。

MauxiMix is a carefully curated dataset of 1,000 high-quality Persian conversations, translated from the SmolTalk dataset using advanced language models. This dataset is specifically designed for training and fine-tuning Large Language Models (LLMs) with Supervised Fine-Tuning (SFT) techniques, contributing to the development of open-source Persian language models. The dataset contains 1,000 professionally translated Persian conversations, categorized by specific topics and difficulty levels, in a user/assistant role format, with high-quality translations suitable for LLM training and fine-tuning. The dataset is expanding to 10,000 conversations.
提供机构:
xmanii
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作