
aisc-team-d2/healthsearchqa

Hugging Face, updated 2024-03-05, indexed 2024-06-22
Download link:
https://hf-mirror.com/datasets/aisc-team-d2/healthsearchqa
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: float64
  - name: question
    dtype: string
  splits:
  - name: train
    num_bytes: 170966
    num_examples: 4436
  download_size: 79303
  dataset_size: 170966
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: unknown
task_categories:
- question-answering
language:
- en
tags:
- medical
size_categories:
- 1K<n<10K
---

# HealthSearchQA

Dataset of consumer health questions released by Google for the Med-PaLM paper ([arXiv preprint](https://arxiv.org/abs/2212.13138)).

From the [paper](https://www.nature.com/articles/s41586-023-06291-2):

We curated our own additional dataset consisting of 3,173 commonly searched consumer questions, referred to as HealthSearchQA. The dataset was curated using seed medical conditions and their associated symptoms. We used the seed data to retrieve publicly-available commonly searched questions generated by a search engine, which were displayed to all users entering the seed terms. We publish the dataset as an open benchmark for answering medical questions from consumers and hope this will be a useful resource for the community, as a dataset reflecting real-world consumer concerns.

**Format:** Question only, free text response, open domain.

**Size:** 3,173.

**Example question:** How serious is atrial fibrillation?

**Example question:** What kind of cough comes with Covid?

**Example question:** Is blood in phlegm serious?
Provided by:
aisc-team-d2
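Summary of the original information

HealthSearchQA dataset overview

Dataset information

  • Features:
    • id: data type float64
    • question: data type string
  • Splits:
    • train: 170966 bytes, 4436 examples
  • Download size: 79303 bytes
  • Dataset size: 170966 bytes
  • Configs:
    • The default config maps the train split to the data files at data/train-*
  • License: unknown
  • Task category: question answering
  • Language: English
  • Tags: medical
  • Size category: 1K<n<10K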
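The schema above (a float64 `id` and a string `question`) can be consumed roughly as follows. This is a minimal sketch, not the publisher's loading code: `normalize_row`, `clean_rows`, and `load_train_rows` are hypothetical helper names, and the handling of missing ids or empty questions is an assumption, since the source does not say whether such rows exist. `load_train_rows` relies on the Hugging Face `datasets` package and network access, so it is defined but not called here.

```python
import math
from typing import Iterable, Optional


def normalize_row(row: dict) -> Optional[dict]:
    """Turn a raw row (float id, string question) into a clean record.

    Returns None when the id is missing/NaN or the question is empty.
    Whether such rows occur in the real data is an assumption.
    """
    raw_id = row.get("id")
    question = row.get("question")
    if raw_id is None or (isinstance(raw_id, float) and math.isnan(raw_id)):
        return None
    if not isinstance(question, str) or not question.strip():
        return None
    # The id column is stored as float64; cast it to int for convenience.
    return {"id": int(raw_id), "question": question.strip()}


def clean_rows(rows: Iterable[dict]) -> list:
    """Normalize every row and drop the ones that could not be normalized."""
    normalized = (normalize_row(row) for row in rows)
    return [record for record in normalized if record is not None]


def load_train_rows():
    """Fetch the train split (needs the `datasets` package and network access)."""
    from datasets import load_dataset

    return load_dataset("aisc-team-d2/healthsearchqa", split="train")


# Offline demonstration on hand-written rows that mimic the schema.
sample = [
    {"id": 1.0, "question": "How serious is atrial fibrillation?"},
    {"id": float("nan"), "question": None},
    {"id": 3.0, "question": "  Is blood in phlegm serious? "},
]
cleaned = clean_rows(sample)
```

With the package installed, `clean_rows(load_train_rows())` would apply the same normalization to the real split.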

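Dataset description

  • Source: a dataset of consumer health questions released by Google for the Med-PaLM paper (arXiv preprint).
  • Description: The dataset contains 3,173 commonly searched consumer questions, curated using seed medical conditions and their associated symptoms. The seed data was used to retrieve publicly available, commonly searched questions generated by a search engine, and the dataset is published as an open benchmark for answering medical questions from consumers.
  • Format: question only, free-text response, open domain.
  • Size: 3,173 questions.
  • Example questions: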
    • How serious is atrial fibrillation?
    • What kind of cough comes with Covid?
    • Is blood in phlegm serious?
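The description reports 3,173 questions, while the train split metadata lists 4436 examples; the source does not explain the difference. One way to investigate is to count distinct, non-empty questions after light normalization. The sketch below is only an illustration: `count_distinct_questions` is a hypothetical helper, and collapsing whitespace and ignoring case are assumptions about what should count as a duplicate.

```python
from typing import Iterable, Optional


def count_distinct_questions(questions: Iterable[Optional[str]]) -> int:
    """Count distinct non-empty questions, ignoring case and extra whitespace."""
    seen = set()
    for question in questions:
        if not isinstance(question, str):
            continue  # skip missing values
        # Collapse runs of whitespace and compare case-insensitively.
        key = " ".join(question.split()).casefold()
        if key:
            seen.add(key)
    return len(seen)


# Demonstration on the example questions plus a near-duplicate and empty rows.
examples = [
    "How serious is atrial fibrillation?",
    "What kind of cough comes with Covid?",
    "Is blood in phlegm serious?",
    "is blood in  phlegm serious?",
    None,
    "",
]
distinct = count_distinct_questions(examples)
```

Running the same function over the `question` column of the downloaded split would show how many distinct questions it actually holds.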