Kabatubare/medical-guanaco-3000

Source: Hugging Face · Updated: 2023-10-30 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/Kabatubare/medical-guanaco-3000
Resource description:
---
title: Reduced Medical Q&A Dataset
language: en
license: unknown
tags:
- healthcare
- Q&A
- NLP
- dialogues
pretty_name: Medical Q&A Dataset
---

# Dataset Card for Reduced Medical Q&A Dataset

This dataset card provides comprehensive details about the Reduced Medical Q&A Dataset, which is a curated and balanced subset aimed for healthcare dialogues and medical NLP research.

## Dataset Details

### Dataset Description

The Reduced Medical Q&A Dataset is derived from a specialized subset of the larger MedDialog collection. It focuses on healthcare dialogues between doctors and patients from sources like WebMD, Icliniq, HealthcareMagic, and HealthTap. The dataset contains approximately 3,000 rows and is intended for a variety of applications such as NLP research, healthcare chatbot development, and medical information retrieval.

- **Curated by:** Unknown (originally from MedDialog)
- **Funded by [optional]:** N/A
- **Shared by [optional]:** N/A
- **Language(s) (NLP):** English
- **License:** Unknown (assumed for educational/research use)

### Dataset Sources [optional]

- **Repository:** N/A
- **Paper [optional]:** N/A
- **Demo [optional]:** N/A

## Uses

### Direct Use

- NLP research in healthcare dialogues
- Development of healthcare question-answering systems
- Medical information retrieval

### Out-of-Scope Use

- Not a substitute for certified medical advice
- Exercise caution in critical healthcare applications

## Dataset Structure

Each entry in the dataset follows the structure: "### Human:\n[Human's text]\n\n### Assistant: [Assistant's text]"

## Dataset Creation

### Curation Rationale

The dataset was curated to create a balanced set of medical Q&A pairs using keyword-based sampling to cover a wide range of medical topics.

### Source Data

#### Data Collection and Processing

The data is text-based, primarily in English, and was curated from the larger "Medical" dataset featuring dialogues from Icliniq, HealthcareMagic, and HealthTap.

#### Who are the source data producers?

The original data was produced by healthcare professionals and patients engaging in medical dialogues on platforms like Icliniq, HealthcareMagic, and HealthTap.

### Annotations [optional]

No additional annotations; the dataset is text-based.

## Bias, Risks, and Limitations

- The dataset is not a substitute for professional medical advice.
- It is designed for research and educational purposes only.

### Recommendations

Users should exercise caution and understand the limitations when using the dataset for critical healthcare applications.

## Citation [optional]

N/A

## Glossary [optional]

N/A

## More Information [optional]

N/A

## Dataset Card Authors [optional]

N/A

## Dataset Card Contact

N/A
Provider:
Kabatubare
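Summary of original information

Reduced Medical Q&A Dataset overview

Dataset description

The Reduced Medical Q&A Dataset is derived from a specialized subset of the MedDialog collection and focuses on healthcare dialogues between doctors and patients. It contains approximately 3,000 records and is intended for applications such as NLP research, healthcare chatbot development, and medical information retrieval.

  • Language: English
  • License: Unknown (assumed for educational/research use)

Data sources

The dataset was selected mainly from medical dialogues on platforms such as Icliniq, HealthcareMagic, and HealthTap.

Use cases

Direct use

  • NLP research on healthcare dialogues
  • Development of healthcare question-answering systems
  • Medical information retrieval

Out-of-scope use

  • Not a substitute for professional medical advice
  • Use with caution in critical healthcare applications

Dataset structure

Each record follows the structure: "### Human:\n[Human's text]\n\n### Assistant: [Assistant's text]"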
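
The record layout described by the dataset card ("### Human:" followed by the question, then "### Assistant:" followed by the answer) can be split into its two parts with a regular expression. The following is a minimal sketch under that assumption; the function name `parse_entry` and the sample text are hypothetical and not taken from the dataset, and real records may deviate from the documented layout.

```python
import re
from typing import Dict, Optional

# Pattern for the layout quoted in the dataset card:
# "### Human:\n[Human's text]\n\n### Assistant: [Assistant's text]"
ENTRY_PATTERN = re.compile(
    r"###\s*Human:\s*(?P<human>.*?)\s*###\s*Assistant:\s*(?P<assistant>.*)",
    re.DOTALL,
)


def parse_entry(entry: str) -> Optional[Dict[str, str]]:
    """Split one raw record into its human and assistant parts.

    Returns None when the record does not match the expected layout.
    """
    match = ENTRY_PATTERN.search(entry)
    if match is None:
        return None
    return {
        "human": match.group("human").strip(),
        "assistant": match.group("assistant").strip(),
    }


# Made-up example record for illustration only.
sample = (
    "### Human:\nWhat can cause a mild headache?\n\n"
    "### Assistant: Common causes include dehydration and lack of sleep."
)
parsed = parse_entry(sample)
print(parsed)
```

Returning `None` for non-matching records, rather than raising, makes it easy to count and inspect malformed rows before using the data.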

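Dataset creation

Curation rationale

The dataset was built with keyword-based sampling to produce a balanced set of medical Q&A pairs covering a wide range of medical topics.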
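
The card does not say how the keyword-based sampling was carried out, which keywords were used, or how many records were drawn per keyword. Purely as an illustration of the general idea, the sketch below draws up to a fixed number of records per keyword from a pool of texts. The function `sample_balanced`, the keyword list, and the quota are all assumptions and do not describe the authors' actual procedure.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def sample_balanced(
    entries: Sequence[str],
    keywords: Sequence[str],
    per_keyword: int,
    seed: int = 0,
) -> List[str]:
    """Pick up to `per_keyword` entries for each keyword, without duplicates."""
    rng = random.Random(seed)

    # Group entry indices by every keyword they mention (case-insensitive).
    buckets: Dict[str, List[int]] = defaultdict(list)
    for index, text in enumerate(entries):
        lowered = text.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:
                buckets[keyword].append(index)

    # Draw from each bucket, skipping entries already chosen for another keyword.
    chosen = set()
    for keyword in keywords:
        candidates = [i for i in buckets[keyword] if i not in chosen]
        rng.shuffle(candidates)
        chosen.update(candidates[:per_keyword])

    # Return the selection in the original order.
    return [entries[i] for i in sorted(chosen)]


# Made-up pool and keywords for illustration only.
pool = ["fever and cough", "fever again", "fever for three days", "skin rash", "unrelated"]
subset = sample_balanced(pool, keywords=["fever", "rash"], per_keyword=2)
print(subset)
```

Capping each keyword at the same quota keeps frequent topics from dominating the subset, which is one way to read "balanced" in the rationale above.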

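Source data

The data is primarily English text drawn from medical dialogues on platforms such as Icliniq, HealthcareMagic, and HealthTap.

Source data producers

The original data was produced by healthcare professionals and patients engaging in medical dialogues on platforms such as Icliniq, HealthcareMagic, and HealthTap.

Bias, risks, and limitations

  • The dataset is not a substitute for professional medical advice.
  • It is designed for research and educational purposes only.

Recommendations

Users should exercise caution and understand the dataset's limitations when using it for critical healthcare applications.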

The above content was collected and summarized by 遇见数据集.