Khyatimirani/endometriosis-clinical-knowledge-base
Source: Hugging Face. Updated 2026-03-27; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Khyatimirani/endometriosis-clinical-knowledge-base
---
license: apache-2.0
task_categories:
- question-answering
language:
- en
tags:
- medical
- women-health
- endo-guide
- hormonal-health
size_categories:
- n<1K
---
# Endometriosis Patient–Assistant Conversations Dataset
## 📌 Overview
This dataset contains structured conversational data simulating interactions between patients and a clinical AI assistant focused on endometriosis.
The goal of this dataset is to support the development of AI systems that provide accurate, empathetic, and guideline-aligned responses to women navigating endometriosis symptoms, diagnosis, and treatment.
The conversations are designed to reflect real-world patient concerns, including symptom confusion, delayed diagnosis, fertility anxiety, and treatment decisions.
---
## 📚 Data Source
The dataset is generated based on the following clinical resource:
Original source: https://www.draliabadi.com/gynecology/endometriosis/
This resource provides a comprehensive overview of:
- Symptoms and clinical presentation
- Risk factors
- Diagnostic process
- Underlying mechanisms
- Fertility implications
- Treatment approaches
---
## 🧠 Dataset Creation
The dataset was synthetically generated using a large language model, grounded strictly in the above clinical guideline.
### Methodology
- The guideline was used as the **single source of truth**
- Patient questions were reverse-engineered based on realistic scenarios
- The assistant responses were constrained to:
  - Use only information present in the source
  - Avoid hallucinations or unsupported claims
  - Maintain clinical accuracy with simple explanations
### Conversation Design
Each conversation represents:
- A patient with no medical background
- Natural language queries (symptoms, confusion, emotional concerns)
- A clinical assistant providing:
  - Evidence-based responses
  - Simplified explanations of medical concepts
  - Contextual reassurance without overpromising
---
## 💬 Data Format
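The dataset is in JSONL format and follows a chat template structure, with each message carrying a `role` and a `content` field.
Each conversation consists of a single Q&A pair: one `user` message followed by one `assistant` message.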
---
## 🧾 Example
`{"role": "user", "content": "I’ve been having really painful periods and bloating every month. Is this normal?"}
{"role": "assistant", "content": "Severe pain during periods is not considered normal. In conditions like endometriosis, common symptoms include pelvic pain, heavy bleeding, and bloating. In fact, many women report bloating as a primary symptom. If this is affecting your daily life, it may be worth getting evaluated."}`
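
The example above can be read programmatically. The following is a minimal sketch, not the author's loader: it assumes the file stores one message object per line, alternating `user` and `assistant`, as the example is printed. The card does not confirm the on-disk layout, and the function name `pair_messages` and the abbreviated sample strings are illustrative only.

```python
import json


def pair_messages(lines):
    """Group alternating user/assistant message lines into (question, answer) tuples.

    Assumption: each non-empty line is a JSON object with "role" and "content"
    keys, mirroring the example shown in this card.
    """
    pairs = []
    pending_question = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines between records
        message = json.loads(raw)
        role, content = message["role"], message["content"]
        if role == "user":
            if pending_question is not None:
                raise ValueError("two user messages in a row")
            pending_question = content
        elif role == "assistant":
            if pending_question is None:
                raise ValueError("assistant message without a preceding user message")
            pairs.append((pending_question, content))
            pending_question = None
        else:
            raise ValueError(f"unexpected role: {role!r}")
    if pending_question is not None:
        raise ValueError("trailing user message without an answer")
    return pairs


# Illustrative sample, abbreviated from the card's example.
sample = [
    '{"role": "user", "content": "Is severe period pain normal?"}',
    '{"role": "assistant", "content": "Severe pain during periods is not considered normal."}',
]
pairs = pair_messages(sample)
print(pairs[0][0])
```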
---
## 🎯 Intended Use
This dataset can be used for:
- Training conversational AI assistants in women's health
- Fine-tuning LLMs for clinical Q&A
- Building AI companions for symptom awareness
- Evaluating model performance in healthcare conversations
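
For the fine-tuning use case, a Q&A pair is often flattened into a single training string. The sketch below shows one way this might look. The `### Patient:` and `### Assistant:` markers and the function name `to_training_text` are hypothetical choices for illustration, not a template this card prescribes. In practice the chat template of the target model would normally be used instead.

```python
def to_training_text(question: str, answer: str) -> str:
    """Render one Q&A pair as a single prompt/response string.

    The section markers are a hypothetical template, chosen only for illustration.
    """
    question = question.strip()
    answer = answer.strip()
    if not question or not answer:
        raise ValueError("question and answer must both be non-empty")
    return f"### Patient:\n{question}\n\n### Assistant:\n{answer}"


# Illustrative strings, abbreviated from the card's example.
text = to_training_text(
    "Is severe period pain normal?",
    "Severe pain during periods is not considered normal.",
)
print(text)
```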
---
## ⚠️ Limitations
- The dataset is based on a single clinical source and may not cover all perspectives
- It is synthetically generated and does not contain real patient data
- It is not intended for direct medical diagnosis or treatment recommendations
- It should be used as a supportive informational tool, not a replacement for professional care
---
## 🤝 Ethical Considerations
- No personal or identifiable data is included
- Designed to promote awareness, not self-diagnosis
- Encourages seeking professional medical advice when needed
---
## 📬 Contact
For more information or collaboration inquiries, please reach out at khyatimirani77@gmail.com.
---
Provider: Khyatimirani



