
xhluca/publichealth-qa

Source: Hugging Face. Updated 2024-05-17. Indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/xhluca/publichealth-qa
Resource description:
---
license: cc-by-nc-sa-3.0
task_categories:
- question-answering
language:
- ar
- en
- es
- fr
- ko
- ru
- vi
- zh
size_categories:
- n<1K
# https://huggingface.co/docs/hub/en/datasets-manual-configuration
configs:
- config_name: english
  default: true
  data_files:
  - split: test
    path: data/english.csv
- config_name: arabic
  data_files:
  - split: test
    path: data/arabic.csv
- config_name: chinese
  data_files:
  - split: test
    path: data/chinese.csv
- config_name: french
  data_files:
  - split: test
    path: data/french.csv
- config_name: korean
  data_files:
  - split: test
    path: data/korean.csv
- config_name: russian
  data_files:
  - split: test
    path: data/russian.csv
- config_name: spanish
  data_files:
  - split: test
    path: data/spanish.csv
- config_name: vietnamese
  data_files:
  - split: test
    path: data/vietnamese.csv
---

# Usage

```python
import datasets

# Each language is a separate config; every config has a single 'test' split.
langs = ['arabic', 'chinese', 'english', 'french', 'korean', 'russian', 'spanish', 'vietnamese']
data = datasets.load_dataset('xhluca/publichealth-qa', split='test', name=langs[0])
```

# About

This dataset contains question-and-answer pairs sourced from COVID-19 Q&A pages and FAQs published by the CDC and WHO. They were produced and collected between 2019-12 and 2020-04. They were originally published as an [aggregated Kaggle dataset](https://www.kaggle.com/xhlulu/covidqa).

# License

CDC data is licensed under [CC-BY 3.0](https://web.archive.org/web/20201017141031/https://www2a.cdc.gov/cdcup/library/other/policy.htm) and WHO data is licensed under [CC-BY-NC-SA 3.0](https://web.archive.org/web/20210701063743/https://www.who.int/about/policies/publishing/copyright).

# Source

This data was originally included in the [COVID-QA dataset](https://www.kaggle.com/datasets/xhlulu/covidqa), where it was known as the multilingual split. The files in this updated repository were generated using the [publichealth-qa repository](https://github.com/xhluca/publichealth-qa).
Provided by:
xhluca
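Summary of original information

Dataset overview

Basic dataset information

  • License: CC-BY-NC-SA-3.0
  • Task category: question answering
  • Supported languages: Arabic (ar), English (en), Spanish (es), French (fr), Korean (ko), Russian (ru), Vietnamese (vi), Chinese (zh)
  • Dataset size: fewer than 1K examples

Dataset configuration

  • Default configuration: English
  • Configuration details:
    • English:
      • File path: data/english.csv
    • Arabic:
      • File path: data/arabic.csv
    • Chinese:
      • File path: data/chinese.csv
    • French:
      • File path: data/french.csv
    • Korean:
      • File path: data/korean.csv
    • Russian:
      • File path: data/russian.csv
    • Spanish:
      • File path: data/spanish.csv
    • Vietnamese:
      • File path: data/vietnamese.csv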
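
The configuration list above can be mirrored in code so that a caller may pass either a config name or a two-letter language code. The following is a minimal sketch, not part of the dataset repository: the names `CONFIG_FILES`, `LANG_CODE_TO_CONFIG`, `resolve_config`, and `load_language` are hypothetical helpers introduced for illustration, and the code-to-name pairing is inferred from the card's language list and config names. Only `datasets.load_dataset` with `name` and `split` comes from the card's own usage example.

```python
# Config names and CSV paths as listed in the dataset card.
CONFIG_FILES = {
    "english": "data/english.csv",
    "arabic": "data/arabic.csv",
    "chinese": "data/chinese.csv",
    "french": "data/french.csv",
    "korean": "data/korean.csv",
    "russian": "data/russian.csv",
    "spanish": "data/spanish.csv",
    "vietnamese": "data/vietnamese.csv",
}

# Two-letter codes from the card's language list, paired with config names
# (assumption: the pairing is inferred, the card does not state it explicitly).
LANG_CODE_TO_CONFIG = {
    "ar": "arabic",
    "en": "english",
    "es": "spanish",
    "fr": "french",
    "ko": "korean",
    "ru": "russian",
    "vi": "vietnamese",
    "zh": "chinese",
}


def resolve_config(lang: str = "english") -> str:
    """Return the config name for a config name or two-letter language code."""
    key = lang.strip().lower()
    name = LANG_CODE_TO_CONFIG.get(key, key)
    if name not in CONFIG_FILES:
        raise ValueError(f"Unknown language or config: {lang!r}")
    return name


def load_language(lang: str = "english"):
    """Load the 'test' split for one language (requires network access)."""
    import datasets  # imported lazily so the helpers above work without it

    return datasets.load_dataset(
        "xhluca/publichealth-qa", name=resolve_config(lang), split="test"
    )
```

For example, `load_language("ko")` would request the `korean` config, which the card maps to `data/korean.csv`. Validating the name locally first gives a clearer error for a mistyped language than waiting for the download to fail.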