
ISHEERO/data_merge_2

Hugging Face, updated 2024-03-26, indexed 2024-06-11
Download link:
https://hf-mirror.com/datasets/ISHEERO/data_merge_2
Resource description:
# Card of the augmented and modified dataset for the LLM challenge on public health systems in Malawi

## Dataset Description

This dataset was created within the scope of the Malawi Public Health Systems LLM Challenge organized by the AI Lab of the Malawi University of Business and Applied Sciences. The challenge aimed to develop an AI assistant capable of extracting information from the Malawi Technical Guidelines for Integrated Disease Surveillance and Response (TGs for IDSR). The dataset was derived from the provided "Train.csv" dataset, with the goal of augmenting the data to include additional relevant information.

---
license: mit
task_categories:
- question-answering
- text-generation
language:
- en
tags:
- medical
size_categories:
- 1K<n<10K
---

## Uses

- **Retrieval-Augmented Generation (RAG) Models:** The structured nature of the data, where answers to questions are provided within one of two contexts, is conducive to training models optimized for retrieving relevant information from the correct context.

## Dataset Structure

### dataset.csv

- Columns: "content", "source", "question", "answer", "new_content"
- Description: Contains questions and answers contextualized in Malawi's Technical Guidelines for Integrated Disease Surveillance and Response (TGs for IDSR) or other relevant sources. Questions vary in type, including "what", "why", "who", "where", and those seeking comparisons between concepts. Each question is accompanied by two contexts and an answer extracted from one of the two contexts. Each context includes reference (source) information.

**Note:** For contexts within the `content` column, source information is provided in the `source` column. For contexts within the `new_content` column, source information is embedded at the beginning of the text.
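
The column layout described above can be illustrated with a minimal sketch. It assumes `dataset.csv` is a plain comma-separated file with a header row holding the five listed columns; the in-memory sample row, the helper names `load_rows` and `build_prompt`, and the prompt wording are hypothetical and not part of the dataset.

```python
import csv
import io

# Column names as listed in the dataset card.
EXPECTED_COLUMNS = ["content", "source", "question", "answer", "new_content"]

# Hypothetical one-row stand-in for dataset.csv; all cell text is placeholder
# text invented for illustration, not real dataset content.
SAMPLE_CSV = '''content,source,question,answer,new_content
"First context text.","placeholder source","Placeholder question?","Placeholder answer.","Placeholder source note. Second context text."
'''


def load_rows(handle):
    """Read all rows and fail early if an expected column is missing."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def build_prompt(row):
    """Assemble a prompt showing both contexts, then the question.

    The first context takes its source from the `source` column; the second
    context is passed through unchanged because its source is embedded in
    the text itself.
    """
    return (
        f"Context 1 (source: {row['source']}):\n{row['content']}\n\n"
        f"Context 2:\n{row['new_content']}\n\n"
        f"Question: {row['question']}\nAnswer:"
    )


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(build_prompt(rows[0]))
```

To try this on the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open("dataset.csv", newline="", encoding="utf-8")`; the encoding is an assumption.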
## Source Data

- Challenge Link: [Malawi Public Health Systems LLM Challenge](https://zindi.africa/competitions/malawi-public-health-systems-llm-challenge)
- Dataset Source: [Zindi - Malawi Public Health Systems LLM Challenge Data](https://zindi.africa/competitions/malawi-public-health-systems-llm-challenge/data)
- Data Source Producers: AI Lab of the Malawi University of Business and Applied Sciences

## Dataset Card Contact

- Email: Arnauld.adjovi@isheero.com
Provider:
ISHEERO
Summary of original information

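Dataset overview

Dataset description

This dataset was created for the Malawi Public Health Systems LLM Challenge, organized by the AI Lab of the Malawi University of Business and Applied Sciences. The challenge aimed to develop an AI assistant that can extract information from the Malawi Technical Guidelines for Integrated Disease Surveillance and Response (TGs for IDSR). The dataset is derived from the provided "Train.csv" dataset and augments it with additional relevant information.

Dataset attributes

  • License: MIT
  • Task categories:
    • Question answering
    • Text generation
  • Language: English
  • Tags: medical
  • Size category: 1K<n<10K

Use cases

  • Retrieval-augmented generation (RAG) models: The data is structured so that each question comes with two contexts and one answer, which suits training models to retrieve relevant information from the correct context.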
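
Because each answer is described as extracted from one of the two contexts, a retrieval model needs a label saying which context that is. The sketch below derives such a label under a strong assumption: that the answer appears verbatim (up to case and whitespace) in its context. If answers were paraphrased, this check returns `None` and a fuzzier match would be needed. The function names and the sample row are hypothetical.

```python
def normalize(text):
    """Lowercase and collapse whitespace so matching ignores formatting."""
    return " ".join(text.lower().split())


def answer_context(row):
    """Return the name of the column whose text contains the answer.

    Checks `content` first, then `new_content`; returns None when the
    answer is empty or is found in neither context. If both contexts
    contain the answer, `content` wins.
    """
    answer = normalize(row["answer"])
    if not answer:
        return None
    for column in ("content", "new_content"):
        if answer in normalize(row[column]):
            return column
    return None


# Hypothetical row with placeholder text, for illustration only.
example_row = {
    "content": "Alpha beta gamma delta.",
    "new_content": "Placeholder source note. Epsilon zeta   eta theta.",
    "answer": "zeta eta",
}
print(answer_context(example_row))
```

The returned column name can then serve as the positive label when training or evaluating a retriever, with the other context as the negative.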

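Dataset structure

  • File: dataset.csv
  • Columns: "content", "source", "question", "answer", "new_content"
  • Description: Contains questions and answers grounded in Malawi's Technical Guidelines for Integrated Disease Surveillance and Response (TGs for IDSR) or other relevant sources. Question types vary, including "what", "why", "who", "where", and questions that ask for comparisons between concepts. Each question comes with two contexts and an answer extracted from one of them. Each context includes reference (source) information.

Source data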
