ZeroAgency/ru-instruct-conversation-v1
Source: Hugging Face. Updated: 2025-03-18. Indexed: 2025-05-31.
Download link:
https://hf-mirror.com/datasets/ZeroAgency/ru-instruct-conversation-v1
Resource summary:
A combined dataset of mostly Russian dialogs in conversation format, suitable for LLM fine-tuning scenarios. Total samples: 82208. The data was deduplicated using simhash with a Hamming distance threshold of 3. Source datasets: IlyaGusev/saiga_scored (min_score: 8, no samples flagged as bad by regexp), IlyaGusev/oasst2_ru_main_branch, attn-signs/kolmogorov-3, and attn-signs/russian-easy-instructions.
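The simhash deduplication mentioned in the summary can be illustrated with a minimal sketch. This is not the dataset authors' pipeline: the word-level tokenization, the MD5-based 64-bit token hash, the greedy keep-first strategy, and the function names `simhash`, `hamming_distance`, and `deduplicate` are all assumptions made for illustration. It also assumes that a Hamming distance of 3 or less counts as a duplicate, which the listing does not specify.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Compute a simhash fingerprint from word tokens (assumed feature choice)."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        # Take the first bits/8 bytes of the MD5 digest as the token hash.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set where the summed weight is positive.
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def deduplicate(texts, threshold=3):
    """Keep a text only if it is farther than `threshold` bits from every kept text.

    Assumption: distance <= threshold means "near-duplicate".
    This is a quadratic scan, fine for a demo but not for 80k+ samples.
    """
    kept, fingerprints = [], []
    for text in texts:
        fp = simhash(text)
        if all(hamming_distance(fp, seen) > threshold for seen in fingerprints):
            kept.append(text)
            fingerprints.append(fp)
    return kept
```

A call such as `deduplicate(list_of_dialog_texts)` returns the retained texts in their original order. For a corpus of this size, a real pipeline would typically index fingerprints by bit blocks instead of comparing every pair.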
Provider:
ZeroAgency



