kristaller486/Nebo-T1-Russian

Source: Hugging Face. Updated: 2025-02-02. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/kristaller486/Nebo-T1-Russian
Description:
Nebo-T1-Russian is (probably) the first longCoT (long chain-of-thought) dataset for the Russian language, created with DeepSeek-R1. The prompts are taken from the Sky-T1 dataset and were translated with Llama3.3-70B. The answers and reasoning were generated by DeepSeek-R1 (685B). The dataset contains 16.4K samples in total, of which approximately 12.4K are Russian-only. The languages of the answers and reasoning are labeled using fastText. Some samples (≈400) are missing reasoning markup tags; these samples are flagged in the `correct_format` column.
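
Since the description says badly formatted samples are flagged in `correct_format`, a typical first step is to drop them before training. The following is a minimal sketch, not the author's own tooling. It assumes that `correct_format` is a boolean that is true for well-formed rows; the helper name `keep_well_formatted` and the `id` field in the toy rows are hypothetical and only serve the illustration.

```python
def keep_well_formatted(rows):
    """Keep only rows whose `correct_format` flag is truthy.

    Assumption: `correct_format` is True when the reasoning markup
    tags are present and False when they are missing.
    """
    return [row for row in rows if row.get("correct_format")]


# Toy rows standing in for real samples; `id` is a hypothetical field.
sample = [
    {"id": 0, "correct_format": True},
    {"id": 1, "correct_format": False},
    {"id": 2, "correct_format": True},
]
filtered = keep_well_formatted(sample)

# With the Hugging Face `datasets` library the same filter could look
# like this (split name "train" is an assumption, and it needs network
# access, so it is left as a comment):
#
#   from datasets import load_dataset
#   ds = load_dataset("kristaller486/Nebo-T1-Russian", split="train")
#   ds = ds.filter(lambda row: row["correct_format"])
```

If the flag turns out to use a different encoding (for example a string), the predicate inside the helper is the only part that would need to change.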
Provider:
kristaller486