
trivitaai/Grouth_Truth_Conversation

Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/trivitaai/Grouth_Truth_Conversation
Resource overview:
---
language:
- en
- vi
license: cc-by-4.0
task_categories:
- text-generation
- question-answering
tags:
- medical
- healthcare
- multi-turn
- doctor-patient
- ground-truth
- evaluation
- vietnamese
pretty_name: Ground Truth Healthcare Conversations
size_categories:
- n<1K
configs:
- config_name: curated_30
  data_files:
  - split: train
    path: data/curated_30_v2.jsonl
- config_name: full_473
  data_files:
  - split: train
    path: data/curated_all.jsonl
- config_name: tier1_with_clinical_notes
  data_files:
  - split: train
    path: data/curated_30_ground_truth.jsonl
- config_name: vietnamese_human_ai
  data_files:
  - split: train
    path: data/hellobacsi_vi_50.jsonl
---

# Ground Truth Healthcare Conversations

Real-life multi-turn healthcare conversations for model evaluation and development reference. Used as ground truth for the **SOCRATES** medical AI project by Trivita AI.

## Dataset Viewer Subsets

| Subset | File | Count | Language | Description |
|--------|------|-------|----------|-------------|
| `curated_30` | `curated_30_v2.jsonl` | **30** | EN | Best 30 conversations across 5 sources (primary eval set) |
| `full_473` | `curated_all.jsonl` | **473** | EN | Full English collection |
| `tier1_with_clinical_notes` | `curated_30_ground_truth.jsonl` | **30** | EN | MTS-Dialog + ACI-Bench + PriMock57 with clinical note ground truth |
| `vietnamese_human_ai` | `hellobacsi_vi_50.jsonl` | **50** | **VI** | Vietnamese patient questions answered by Hello Bacsi AI |

## Sources

| Source | Type | Language | Link |
|--------|------|----------|------|
| MTS-Dialog | Real doctor–patient + clinical note | EN | https://github.com/abachaa/MTS-Dialog |
| ACI-Bench | Real clinical encounter + SOAP note | EN | https://github.com/wyim/aci-bench |
| PriMock57 | Mock GP consultation (audio+transcript) | EN | https://github.com/babylonhealth/priMock57 |
| r/AskDocs | Patient Q&A with verified doctor replies | EN | https://reddit.com/r/AskDocs |
| LMSYS-Chat-1M | Human–LLM, health filtered | EN | https://huggingface.co/datasets/lmsys/lmsys-chat-1m |
| hellobacsi.com | Vietnamese patient Q&A, AI replies | VI | https://hellobacsi.com/community |

## Format

```json
{
  "id": "mts_0042",
  "source": "mts_dialog | aci_bench | primock57 | reddit_askdocs | lmsys_chat_1m | hellobacsi_community",
  "type": "doctor_patient | patient_verified_doctor | human_ai | human_ai_vi",
  "language": "en | vi",
  "turns": [
    {"role": "doctor", "content": "What brings you in today?"},
    {"role": "patient", "content": "I have had chest pain for 3 days..."}
  ],
  "ground_truth": {
    "clinical_note": "Patient presents with...",
    "section_header": "GENHX"
  },
  "metadata": {"num_turns": 14}
}
```

## Usage

```python
from datasets import load_dataset

ds = load_dataset("trivitaai/Grouth_Truth_Conversation", name="curated_30")
ds_vi = load_dataset("trivitaai/Grouth_Truth_Conversation", name="vietnamese_human_ai")
ds_all = load_dataset("trivitaai/Grouth_Truth_Conversation", name="full_473")
ds_notes = load_dataset("trivitaai/Grouth_Truth_Conversation", name="tier1_with_clinical_notes")
```
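Once a record is loaded, it may help to flatten its `turns` into a readable transcript and pull out the clinical note where one exists. The following is a minimal sketch that works on a single record shaped like the Format example. The helper names `render_transcript` and `clinical_note` are hypothetical and not part of the dataset or any library. It also assumes `ground_truth` may be missing in some subsets, which the card does not state explicitly.

```python
import json

def render_transcript(record):
    """Join a record's turns into 'role: content' lines (hypothetical helper)."""
    return "\n".join(f"{turn['role']}: {turn['content']}" for turn in record["turns"])

def clinical_note(record):
    """Return the clinical note if the record has one, else None.

    Assumption: ground_truth may be absent or empty for some subsets.
    """
    ground_truth = record.get("ground_truth") or {}
    return ground_truth.get("clinical_note")

# One JSONL line mirroring the Format example (values copied from the card).
line = json.dumps({
    "id": "mts_0042",
    "source": "mts_dialog",
    "type": "doctor_patient",
    "language": "en",
    "turns": [
        {"role": "doctor", "content": "What brings you in today?"},
        {"role": "patient", "content": "I have had chest pain for 3 days..."},
    ],
    "ground_truth": {
        "clinical_note": "Patient presents with...",
        "section_header": "GENHX",
    },
    "metadata": {"num_turns": 14},
})

record = json.loads(line)
transcript = render_transcript(record)
note = clinical_note(record)
print(transcript)
print("note:", note)
```

The same helpers could be applied to rows from `load_dataset(...)["train"]`, assuming each row exposes the fields shown in the Format section.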
Provider:
trivitaai