syntech-ai/doctor-patient-conversations-3000
Source: Hugging Face. Updated 2025-12-07; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/syntech-ai/doctor-patient-conversations-3000
Resource description:
💼 Commercial Use License
This dataset is free for research use (CC-BY-NC-4.0).
For commercial use, AI model training inside products, or enterprise usage:
👉 License fee: $49
📩 Contact: syntech.ai.official@gmail.com
---
license: cc-by-nc-4.0
task_categories:
- text-classification
language:
- en
tags:
- medical
- synthetic-data
- healthcare
- doctor-patient
- conversations
- ai-dataset
- llm-training
- jsonl
- csv
---
📘 Doctor–Patient Synthetic Conversation Dataset (3,000 Samples)
A premium-quality synthetic dataset containing 3,000 doctor–patient conversations, generated using structured rules and diverse communication styles for healthcare AI research, chatbot training, and triage modeling.
⭐ Overview
This dataset simulates realistic medical conversations between a patient and an AI doctor.
Each conversation includes:
Patient’s initial concern
Doctor’s follow-up questions
Patient’s responses
Final triage-style risk assessment
Mixed doctor communication styles: triage, friendly, and formal/clinical
All content is 100% synthetic and safe for model training.
⭐ Dataset Features
| Feature | Description |
| --- | --- |
| Total samples | 3,000 |
| Format | JSONL + CSV |
| Dialogue depth | 4–6 messages |
| Doctor styles | triage / friendly / formal |
| Fields included | complaint, severity, duration, onset, red flags, risk level, urgency |
| Risk scoring | low / medium / high |
| Urgency | routine / urgent / immediate |
| Safety note | Non-diagnostic synthetic data |
⭐ Data Structure
JSONL Entry Structure
Each line contains:
{
  "conversation_id": "DOC-0007",
  "patient_profile": {
    "age": 46,
    "gender": "female"
  },
  "chief_complaint": "chest pain",
  "severity": "moderate",
  "duration": "for the last 2 hours",
  "onset": "suddenly",
  "red_flag_mentioned": true,
  "red_flag_detail": "shortness of breath",
  "risk_level": "high",
  "urgency": "immediate",
  "doctor_style": "friendly",
  "conversation": [...],
  "safety_note": "Synthetic, non-diagnostic conversation."
}
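As a rough illustration of how such a file might be consumed, the sketch below reads JSONL line by line and checks the three categorical fields against the values listed in the feature table. The file name `conversations.jsonl` and the second record (`DOC-XXXX`, with a deliberately invalid risk level) are hypothetical and used only for demonstration; the helper names are not part of the dataset.

```python
import io
import json

# Allowed values, taken from the feature table in this card.
ALLOWED = {
    "risk_level": {"low", "medium", "high"},
    "urgency": {"routine", "urgent", "immediate"},
    "doctor_style": {"triage", "friendly", "formal"},
}


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON ({exc})") from exc


def find_problems(record):
    """Return a list of human-readable problems for one record."""
    problems = []
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{field}={value!r} not in {sorted(allowed)}")
    return problems


# In-memory stand-in for a real file. With a downloaded copy one would use
# open("conversations.jsonl", encoding="utf-8") instead (hypothetical name).
sample = io.StringIO(
    '{"conversation_id": "DOC-0007", "risk_level": "high", '
    '"urgency": "immediate", "doctor_style": "friendly"}\n'
    # Hypothetical bad record, included only to show a failed check.
    '{"conversation_id": "DOC-XXXX", "risk_level": "critical", '
    '"urgency": "urgent", "doctor_style": "triage"}\n'
)

records = list(read_jsonl(sample))
report = {r["conversation_id"]: find_problems(r) for r in records}
```

Here `report` maps each conversation ID to a list of problems, which is empty for records that pass all three checks.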
CSV Format
Flattened for spreadsheet / ML use, with columns:
conversation_id
patient age & gender
complaint
severity
duration
onset
risk_level
urgency
doctor_style
conversation_text
safety_note
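The relationship between the two formats can be sketched as a flattening step. The card describes the CSV columns only loosely, so the exact column names below (`patient_age`, `patient_gender`, `complaint`) are assumptions, and `conversation_text` is left out because the card elides the structure of the `conversation` array.

```python
import csv
import io

# Assumed column names; the card does not spell out the exact CSV header.
COLUMNS = [
    "conversation_id", "patient_age", "patient_gender", "complaint",
    "severity", "duration", "onset", "risk_level", "urgency",
    "doctor_style", "safety_note",
]


def flatten(record):
    """Map one nested JSONL record to a flat dict keyed by COLUMNS."""
    profile = record.get("patient_profile", {})
    row = {name: record.get(name, "") for name in COLUMNS}
    # Nested and renamed fields are filled in explicitly.
    row["patient_age"] = profile.get("age", "")
    row["patient_gender"] = profile.get("gender", "")
    row["complaint"] = record.get("chief_complaint", "")
    return row


# The example record from this card, minus the elided conversation array.
record = {
    "conversation_id": "DOC-0007",
    "patient_profile": {"age": 46, "gender": "female"},
    "chief_complaint": "chest pain",
    "severity": "moderate",
    "duration": "for the last 2 hours",
    "onset": "suddenly",
    "risk_level": "high",
    "urgency": "immediate",
    "doctor_style": "friendly",
    "safety_note": "Synthetic, non-diagnostic conversation.",
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerow(flatten(record))
csv_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; a real export would pass a file opened with `newline=""` to `csv.DictWriter`.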
⭐ 3 Sample Conversations
Sample 1
Patient: I am a 52-year-old male and I have severe chest pain since this morning. It started suddenly, and I also feel shortness of breath.
Doctor: Thank you for sharing. I will ask a few questions to understand how urgent this might be.
Patient: It hasn't gone away and it feels tight.
Doctor: First, I need to know more about the chest pain. Does anything make it better or worse?
Patient: The pain increases when I breathe deeply.
Doctor: For triage purposes only: This could represent a higher-risk situation. I recommend seeking medical care immediately. Synthetic example only.
Sample 2
Patient: I’m a 33-year-old female with mild headache for the last 2 days. It started gradually.
Doctor: Thanks for reaching out. Let me ask you a few questions so I can guide you better.
Patient: It's not too bad but it's annoying.
Doctor: To understand this properly, I'd like to know if you have nausea, fever, or stiff neck.
Patient: No, none of those.
Doctor: From the details provided, this appears low-risk. Monitor symptoms. Synthetic example only.
Sample 3
Patient: I'm a 40-year-old male with abdominal pain for 6 hours. Started suddenly.
Doctor: Your primary concern is abdominal pain. I will now ask structured questions to assess risk.
Patient: The pain comes and goes.
Doctor: Does anything worsen it, or do you feel vomiting or bleeding?
Patient: No bleeding, mild nausea.
Doctor: This falls into medium risk. Urgent check recommended. Synthetic example only.
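Transcripts laid out like the samples above, one `Patient:` or `Doctor:` turn per line, can be split into speaker and text pairs with a small parser. This is a minimal sketch; whether the CSV `conversation_text` column uses exactly this layout is an assumption, and `parse_transcript` is a hypothetical helper name.

```python
import re

# One turn per line, prefixed with the speaker label used in the samples.
TURN = re.compile(r"^(Patient|Doctor):\s*(.*)$")


def parse_transcript(text):
    """Split a transcript into (speaker, utterance) tuples."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN.match(line)
        if match:
            turns.append((match.group(1).lower(), match.group(2)))
        elif turns:
            # A line without a speaker label continues the previous turn.
            speaker, previous = turns[-1]
            turns[-1] = (speaker, f"{previous} {line}")
    return turns


# Sample 2 from this card, copied verbatim.
transcript = """\
Patient: I’m a 33-year-old female with mild headache for the last 2 days. It started gradually.
Doctor: Thanks for reaching out. Let me ask you a few questions so I can guide you better.
Patient: It's not too bad but it's annoying.
Doctor: To understand this properly, I'd like to know if you have nausea, fever, or stiff neck.
Patient: No, none of those.
Doctor: From the details provided, this appears low-risk. Monitor symptoms. Synthetic example only.
"""

turns = parse_transcript(transcript)
```

For Sample 2 this yields six turns that alternate between `patient` and `doctor`, consistent with the stated dialogue depth of 4–6 messages.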
⭐ Intended Use
AI doctor & chatbot training
LLM fine-tuning
Risk classification research
Healthcare automation
RAG pipelines
Educational use
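For the risk classification use case, one possible preparation step is to turn each record into a text and integer label pair. The sketch below is only one way to do this: the integer mapping, the text template, and the helper name `to_example` are assumptions, not something the dataset prescribes.

```python
# Assumed ordinal encoding of the three risk levels listed in this card.
RISK_TO_LABEL = {"low": 0, "medium": 1, "high": 2}


def to_example(record):
    """Build a (text, label) pair from the structured fields of one record."""
    parts = [
        f"{record['severity']} {record['chief_complaint']}",
        record["duration"],
        f"onset: {record['onset']}",
    ]
    if record.get("red_flag_mentioned"):
        parts.append(f"red flag: {record.get('red_flag_detail', 'unspecified')}")
    return "; ".join(parts), RISK_TO_LABEL[record["risk_level"]]


# Structured fields of the example record shown in this card.
record = {
    "chief_complaint": "chest pain",
    "severity": "moderate",
    "duration": "for the last 2 hours",
    "onset": "suddenly",
    "red_flag_mentioned": True,
    "red_flag_detail": "shortness of breath",
    "risk_level": "high",
}

text, label = to_example(record)
```

The resulting pairs could then be fed to any text classifier; using the full conversation text instead of the structured fields would be an equally reasonable choice.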
🏥 Doctor–Patient Synthetic Conversation Dataset (3,000 samples)
This dataset includes high-quality, multi-turn conversations between synthetic patients and synthetic AI doctors.
The conversations simulate real-world triage scenarios with different doctor communication styles.
Key Highlights
3,000 conversations
JSONL and CSV format
Mixed doctor tones
Realistic symptom presentations
Red flags and urgency levels
Educational & AI-training friendly
🛒 Commercial License
This dataset is free for research and non-commercial use under CC-BY-NC-4.0.
For commercial use, redistribution, integration into paid products, or use inside commercial AI systems:
👉 A commercial license is required.
Contact Syntech AI for pricing and licensing.
Provider:
syntech-ai



