synthetic-healthcare-admissions

ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/syncora/synthetic-healthcare-admissions
Description:
# Synthetic Healthcare Admissions Dataset

### A fully synthetic **healthcare dataset** for building AI solutions in healthcare, developed using Syncora.ai.

---

## ✅ What's in This Repo?

- ✅ **Healthcare Dataset (CSV)** → [Download Here](https://huggingface.co/datasets/syncora/synthetic-healthcare-admissions/blob/main/Healthcare_Syncora_Synthetic%201.csv)
- ✅ **Example Jupyter Notebook** → [Open Notebook](https://huggingface.co/datasets/syncora/synthetic-healthcare-admissions/blob/main/Healthcare_Syncora_Synthetic_1%20(1).ipynb)
- ✅ **Use cases**

---

## 📘 About This Dataset

This **synthetic healthcare dataset** simulates hospital admission records, including demographics, billing, medications, and lab results.
It is **100% synthetic**, which helps developers, healthcare institutions, and teams training LLMs meet privacy and regulatory requirements.

**Why use this dataset?**

- Explore **predictive modeling in healthcare**
- Build a **dataset for LLM training** on clinical conversations
- Safely **generate synthetic data** without exposing real patient information

---

## 🔍 Dataset Snapshot

| Column                | Description                             |
|-----------------------|-----------------------------------------|
| **Age**               | Patient age in years                    |
| **Gender**            | 0 = Female, 1 = Male                    |
| **Blood Type**        | Encoded blood group category (0–7)      |
| **Medical Condition** | Encoded diagnosis category              |
| **Billing Amount**    | Hospital billing in USD                 |
| **Admission Type**    | 0 = Emergency, 1 = Urgent, 2 = Elective |
| **Medication**        | Encoded medication type                 |
| **Test Results**      | Encoded lab test result category        |

**Example row:**
`80, 1, 7, 0, 37303.07, 0, 0, 0`

---

## ✅ Use Cases

This **healthcare dataset** is ideal for:

- 🏥 **Predictive Healthcare Analytics**: Predict billing amount, admission type, or risk scores
- 💊 **Medication Optimization Models**: Analyze treatment outcomes
- 🗣 **Healthcare Chatbots**: Train conversational LLMs on realistic medical workflows
- 📊 **Cost Forecasting**: Estimate hospital expenses
- 🧠 **Dataset for LLM Training**: Fine-tune models for clinical Q&A or triage

---

## 🚀 Generate Your Own Synthetic Data

Need custom scenarios? Use our tool to generate synthetic data tailored to your requirements:
👉 [**Generate your own Synthetic Data Now**](https://huggingface.co/spaces/syncora/synthetic-generation)

---

## ⚡ Quick Start

```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("syncora/synthetic-healthcare-admissions")

# Convert the "train" split to a pandas DataFrame and preview it
df = dataset["train"].to_pandas()
print(df.head())
```
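
Because most columns are integer-encoded, it can help to map the documented codes back to labels before exploring the data. The following is a minimal sketch, not part of the original repository: it assumes the CSV headers match the column names in the Dataset Snapshot table, and `decode_admissions` is a hypothetical helper name. Only the two encodings the table spells out (Gender and Admission Type) are decoded; the other encoded columns are left untouched because their code-to-label mappings are not documented here.

```python
import pandas as pd

# Column names as listed in the Dataset Snapshot table; the actual CSV
# headers are assumed (not verified) to match.
COLUMNS = [
    "Age", "Gender", "Blood Type", "Medical Condition",
    "Billing Amount", "Admission Type", "Medication", "Test Results",
]

# Code-to-label mappings documented in the Dataset Snapshot table.
GENDER_LABELS = {0: "Female", 1: "Male"}
ADMISSION_LABELS = {0: "Emergency", 1: "Urgent", 2: "Elective"}


def decode_admissions(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with readable labels for the two documented encodings."""
    out = df.copy()
    out["Gender"] = out["Gender"].map(GENDER_LABELS)
    out["Admission Type"] = out["Admission Type"].map(ADMISSION_LABELS)
    return out


# The README's example row, used as a stand-in for the full dataset.
example = pd.DataFrame([[80, 1, 7, 0, 37303.07, 0, 0, 0]], columns=COLUMNS)
decoded = decode_admissions(example)
print(decoded[["Age", "Gender", "Admission Type", "Billing Amount"]])
```

The same helper could be applied to the DataFrame produced in the Quick Start, provided the column names match.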

Provider: maas
Created: 2025-08-31
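Background overview
This is a fully synthetic healthcare dataset that simulates hospital admission records, with fields such as age, gender, medical condition, and billing amount. It is intended to support healthcare AI development, predictive analytics, and LLM training while preserving data privacy and compliance. It suits applications such as healthcare chatbots, cost forecasting, and medication optimization.
The summary above was collected and generated by 遇见数据集.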
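
As a starting point for the cost-forecasting use case mentioned above, one simple baseline is the mean billing amount per admission type. The sketch below is illustrative only: the column names are assumed to match the dataset card, `mean_billing_by_admission` is a hypothetical helper, and apart from the first row (the card's example row) the values are invented for the demonstration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical rows for illustration. Only the first row comes from the
# dataset card's example; the other two are invented.
rows = [
    {"Admission Type": 0, "Billing Amount": 37303.07},
    {"Admission Type": 0, "Billing Amount": 20000.00},
    {"Admission Type": 2, "Billing Amount": 10000.00},
]
df = pd.DataFrame(rows)


def mean_billing_by_admission(frame: pd.DataFrame) -> pd.Series:
    """Mean billing amount (USD) for each encoded admission type."""
    return frame.groupby("Admission Type")["Billing Amount"].mean()


baseline = mean_billing_by_admission(df)
print(baseline)
```

A per-group mean like this gives a reference error level that any trained regression model should beat before it is considered useful.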