Delicate Medical R1 Data
Source: ModelScope community. Updated 2026-06-06; indexed 2025-04-26.
Download link:
https://modelscope.cn/datasets/krisfu/delicate_medical_r1_data
Resource description:
# Data
Based on Huatuo's open-source high-quality corpus
# Technology Pipeline
Multi-Agent + Data Evolution + Reasoning Process Generation + Reasoning Process Validation and Filtering
# Multi-Agent
A data generation workflow built with MetaGPT
# Data Evolution
Modified self-instruct for data evolution
# Reasoning Process Generation
The QwQ model generates 10 reasoning (think) traces for each query.
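As a rough illustration of sampling several traces per query, the sketch below collects 10 outputs from a generation function. The `generate` callable and the `sample_traces` helper are hypothetical names introduced here; the dataset card does not describe the actual client or sampling settings used with QwQ.

```python
from typing import Callable, List


def sample_traces(query: str, generate: Callable[[str], str], n: int = 10) -> List[str]:
    """Call `generate` n times on the same query and collect the outputs.

    `generate` is a placeholder for whatever function sends the query to the
    reasoning model and returns one reasoning trace as text.
    """
    return [generate(query) for _ in range(n)]


# Stub generator used only to make the sketch runnable without a model.
def fake_generate(query: str) -> str:
    return f"<think>reasoning about: {query}</think>"


traces = sample_traces("What causes iron-deficiency anemia?", fake_generate)
print(len(traces))
```

In practice the generator would need non-zero sampling temperature (or similar) for the 10 traces to differ; that setting is not stated in the source.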
# Metrics for Reasoning Process Validation and Filtering
## Quality Assessment
###### 2.1.1. Recall Rate
After the `step_partition` step, the chain of thought is split into N reasoning steps. An LLM identifies the key steps among them and then judges, one by one, whether the problem each key step solves or the fact it states appears in the ground-truth answer. Recall is the number of key steps that appear in the ground-truth answer divided by the total number of key steps.
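The arithmetic can be sketched as follows, assuming the LLM judgments are already available as one boolean per key step. The function name `recall_score` and the choice to return 0 when there are no key steps are assumptions made for this example, not details from the dataset card.

```python
from typing import Sequence


def recall_score(key_step_found: Sequence[bool]) -> float:
    """Fraction of key steps whose content appears in the ground-truth answer."""
    if not key_step_found:
        return 0.0  # assumption: a trace with no key steps scores 0
    return sum(key_step_found) / len(key_step_found)


# Hypothetical judgments for 4 key steps: 3 of them appear in the ground truth.
judgments = [True, True, False, True]
print(recall_score(judgments))  # 0.75
```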
###### 2.1.2. Precision
After the `step_partition` step, the chain of thought is split into N reasoning steps. An LLM evaluates the correctness of each step against the ground-truth answer. Precision is the number of correct steps divided by the total number of steps.
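A minimal sketch of this ratio, assuming one boolean correctness judgment per reasoning step and taking the denominator to be all evaluated steps. The name `precision_score` and the zero-step fallback are assumptions for illustration.

```python
from typing import Sequence


def precision_score(step_correct: Sequence[bool]) -> float:
    """Fraction of reasoning steps judged correct against the ground truth."""
    if not step_correct:
        return 0.0  # assumption: a trace with no steps scores 0
    return sum(step_correct) / len(step_correct)


# Hypothetical judgments for a 5-step chain of thought: 4 steps are correct.
steps = [True, False, True, True, True]
print(precision_score(steps))  # 0.8
```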
###### 2.1.3. F1 Score
The F1 score can be computed using the recall and precision obtained from the preceding steps, following the formula:
F1 = 2 * Precision * Recall / (Precision + Recall)
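The formula can be written as a small function. This is a sketch: the name `f1_score` and the convention of returning 0 when precision and recall are both 0 (to avoid division by zero) are assumptions, since the dataset card does not say how that case is handled.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumption: avoid division by zero when both are 0
    return 2 * precision * recall / (precision + recall)


# Worked example with precision 0.8 and recall 0.75:
# 2 * 0.8 * 0.75 / (0.8 + 0.75) = 1.2 / 1.55
print(round(f1_score(0.8, 0.75), 3))  # 0.774
```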
# Model Deployment
The agents use Qwen-72B-AWQ; reasoning process generation uses QwQ-32B; reasoning process validation and filtering uses Qwen-72B-AWQ.
Provider:
maas
Created:
2025-04-23
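Dataset Introduction
Background Overview
This dataset is built on Huatuo's open-source high-quality corpus using multi-agent collaboration and data evolution, and it focuses on the medical domain. Reasoning processes are generated and then validated and filtered to keep the data high-quality and reliable.
The summary above was collected and generated by 遇见数据集.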



