Ruozhiba (弱智吧) R1 Output Set
ModelScope community | Updated: 2026-06-05 | Indexed: 2025-02-08
Download link:
https://modelscope.cn/datasets/nullskymc/ruozhiba_R1
Resource description:
## Dataset Description
### Dataset Overview
We randomly selected 2000 samples from the ruozhiba dataset and had Deepseek-R1 answer each one, with the aim of supporting later model fine-tuning experiments. (Only 2000 samples were used because more could not be obtained for free, and concurrent requests were slow.)
This dataset follows the Alpaca format, and all thinking content is enclosed in `<think></think>` tags.
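Since the thinking content is wrapped in `<think></think>` tags, a consumer of the data will often want to separate it from the final answer. The following is a minimal sketch, not part of the dataset tooling. It assumes each `output` string holds at most one think block, and the helper name `split_think` is hypothetical.

```python
import re

# Assumption: the reasoning appears at most once per output string,
# wrapped in <think>...</think>; everything else is the final answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(output: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from one output string."""
    match = THINK_RE.search(output)
    if match is None:
        # No think block found: treat the whole string as the answer.
        return "", output.strip()
    reasoning = match.group(1).strip()
    # Keep whatever lies outside the first think block as the answer.
    answer = (output[:match.start()] + output[match.end():]).strip()
    return reasoning, answer


# Hypothetical sample, not taken from the dataset.
sample = "<think>\nThe question is a pun.\n</think>\n\nIt is a joke."
reasoning, answer = split_think(sample)
```

Keeping the two parts separate makes it possible to fine-tune on the full output or on the answer alone.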
## Dataset Format and Structure
### Data Format
This dataset adopts the Alpaca format. Each entry contains three fields: `instruction` (the core task instruction), `input` (optional auxiliary input), and `output` (the corresponding model response). An example entry:
```json
{
"instruction": "Please introduce the Alpaca data format.",
"input": "",
"output": "The Alpaca data format consists of three core fields: 'instruction', 'input', and 'output'. 'instruction' represents the core task instruction, 'input' is optional auxiliary input content, and 'output' is the corresponding response result."
}
```
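To illustrate how entries in this format might be consumed, here is a rough sketch that checks the three fields and joins `instruction` with the optional `input` into a single prompt. It assumes the records are stored as a JSON array, which the card does not state. The helper names `validate_record` and `build_prompt` and the inline sample are hypothetical.

```python
import json


def validate_record(record: dict) -> None:
    """Raise ValueError if a record is not a usable Alpaca-style entry."""
    for field in ("instruction", "output"):
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{field}' must be a non-empty string")
    # 'input' is optional, so a missing key is treated as an empty string.
    if not isinstance(record.get("input", ""), str):
        raise ValueError("'input' must be a string when present")


def build_prompt(record: dict) -> str:
    """Join the instruction and the optional input into one prompt."""
    extra = record.get("input", "").strip()
    if extra:
        return f"{record['instruction']}\n\n{extra}"
    return record["instruction"]


# Assumption: records arrive as a JSON array; this sample is illustrative.
raw = (
    '[{"instruction": "Please introduce the Alpaca data format.", '
    '"input": "", "output": "It has three fields."}]'
)
records = json.loads(raw)
for record in records:
    validate_record(record)
prompts = [build_prompt(record) for record in records]
```

Validating before building prompts surfaces malformed entries early, before they reach a training run.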
## Dataset Copyright Information
This dataset includes 2000 randomly selected samples sourced from [AI-ModelScope/ruozhiba](https://www.modelscope.cn/datasets/AI-ModelScope/ruozhiba).
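The card does not say how the 2000 samples were drawn. One reproducible way to take such a subset is sketched below, assuming the source records are already loaded into a list. The function name `sample_records`, the fixed seed, and the placeholder records are all hypothetical.

```python
import random


def sample_records(records: list, k: int = 2000, seed: int = 0) -> list:
    """Draw k distinct records without replacement, reproducibly."""
    if k > len(records):
        raise ValueError("cannot sample more records than are available")
    # A dedicated Random instance keeps the draw independent of global state.
    rng = random.Random(seed)
    return rng.sample(records, k)


# Hypothetical stand-in for the source data.
source = [{"id": i, "instruction": f"question {i}"} for i in range(5000)]
subset = sample_records(source, k=2000, seed=0)
```

Fixing the seed lets the same subset be regenerated later, which helps when comparing fine-tuning runs.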
Provider:
maas
Created:
2025-02-04
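Background overview
This dataset consists of 2000 records randomly selected from the ruozhiba dataset and answered by Deepseek-R1 in question-answering sessions. It uses the Alpaca format with three fields (instruction, input, and output) and is suited to model fine-tuning tasks.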
The summary above was collected and generated by 遇见数据集.



