MWIDSRQA1.0 Malawi Integrated Disease Surveillance and Response Questions and Answers dataset
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10565936
Resource description:
The dataset contains questions and answers over a text containing technical guidelines for disease surveillance in Malawi. For each question, the dataset gives the location in the text that answers it. The dataset focuses on disease surveillance and can be used for tasks such as hierarchical text classification, machine learning, information retrieval, QA over texts and structured data, and multi-document summarization. It can also be used to develop training materials and tests for public health and community surveillance courses and degrees.
Methodology: The dataset has two parts: one that was obtained via an automatic process of extracting questions and answers from text and a gold standard containing questions and answers that have been curated by academics and public health experts.
Data Source: The source data for the dataset are six booklets containing the Technical Guidelines for Integrated Disease Surveillance in Malawi. The six booklets are specific to Malawi and have been adapted from the WHO Technical Guidelines for IDSR. The booklets are organised into sections covering different areas of disease surveillance and response. The citation for these booklets is: The Malawi Technical Guidelines for Integrated Disease Surveillance and Response Technical Guidelines. Lilongwe, December 2020. Licence: CC BY-NC-SA 3.0 IGO. Some rights reserved. This work is available under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 IGO licence (CC BY-NC-SA 3.0 IGO; https://creativecommons.org/licenses/by-nc-sa/3.0/igo).
Files Structure:
CSV containing questions, answers, booklet, paragraph.
CSV containing the keywords that are contained in answers.
Each question and answer corresponds to paragraphs taken from specific booklets. Hence, for each question, the CSV specifies which booklet and which paragraphs within that booklet were used to answer the question.
Excel files containing the content of the six booklets with all paragraphs numbered.
Images do not appear in the Excel files but are numbered.
Here is an example of the data contained in the dataset:
| NoQ | Question Text | Question Answer | Reference Document | Paragraph(s) Number |
|---|---|---|---|---|
| 1 | What is public health surveillance? | Public Health Surveillance involves systematically identifying, collecting, collating, analyzing, and interpreting disease occurrence and public health event data to take timely and robust action. | TG Booklet 1 | 72 |
| 2 | What constitutes an alert in the context of disease surveillance? | An indirect early warning signs of a potential public health event occurring in a community under surveillance. | TG Booklet 1 | 67 |
| 3 | What kind of events do alerts capture? | Alerts may capture a wide variety of unusual events emerging at the community level and information from these alerts may be incomplete and unconfirmed and as such they all need to be triaged and verified. | TG Booklet 1 | 86 |
| 4 | What is the definition of "unusual event"? | The definition of an "unusual event" may vary across different communities and needs to be defined in each context. It can be characterized as either a single event or a cluster of events that may indicate that something might be wrong in a community. | TG Booklet 1 | 434 |
| 5 | Can rumours be the basis of unusual events? | Yes, for instance, an unusual event could be identified as "A cluster of deaths from an unknown cause in the same household or adjacent households". | TG Booklet 1 | 434 |
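The link from each question to its booklet and paragraph can be turned into a lookup table. The following is a minimal sketch, not the dataset authors' tooling: it assumes the CSV header matches the column names shown in the example above, and that a multi-paragraph cell separates numbers with commas, semicolons, or spaces (the actual separator is not documented here). The inline sample reuses three of the example rows; `parse_paragraphs` and `index_by_location` are hypothetical helper names.

```python
import csv
import io
import re
from collections import defaultdict

# Inline sample built from the example rows above; with the real dataset,
# the text of the downloaded questions-and-answers CSV would replace it.
# Assumption: the real header uses these exact column names.
SAMPLE_CSV = '''NoQ,Question Text,Question Answer,Reference Document,Paragraph(s) Number
2,What constitutes an alert in the context of disease surveillance?,An indirect early warning signs of a potential public health event occurring in a community under surveillance.,TG Booklet 1,67
4,"What is the definition of ""unusual event""?","The definition of an ""unusual event"" may vary across different communities and needs to be defined in each context. It can be characterized as either a single event or a cluster of events that may indicate that something might be wrong in a community.",TG Booklet 1,434
5,Can rumours be the basis of unusual events?,"Yes, for instance, an unusual event could be identified as ""A cluster of deaths from an unknown cause in the same household or adjacent households"".",TG Booklet 1,434
'''


def parse_paragraphs(field):
    """Split a paragraph cell such as '434' or '12, 14' into a list of ints.

    Assumption: paragraph numbers are separated by commas, semicolons,
    or whitespace.
    """
    return [int(tok) for tok in re.split(r"[,;\s]+", field.strip()) if tok]


def index_by_location(csv_text):
    """Map (booklet, paragraph number) to the question numbers that cite it."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        for para in parse_paragraphs(row["Paragraph(s) Number"]):
            index[(row["Reference Document"], para)].append(int(row["NoQ"]))
    return dict(index)


index = index_by_location(SAMPLE_CSV)
print(index[("TG Booklet 1", 434)])  # [4, 5]
```

Grouping by location shows which paragraphs support more than one question, as paragraph 434 of TG Booklet 1 does for questions 4 and 5.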
Format of the file containing keywords:
| No | Question |
|---|---|
| 1 | Public Health Surveillance, identification, collection, analysis, interpretation, disease occurrence, public health events, timely action, dissemination, planning, monitoring, evaluating. |
| 2 | Alert, early warning signs, potential public health event, community surveillance. |
| 3 | Alert, early warning signs, potential public health event, community surveillance, triaging, verification. |
| 4 | Unusual event, disease surveillance, occurrence, pattern of cases, deviation, baseline level, cluster, community-specific, temporal nature. |
| 5 | Community-Based Surveillance (CBS), detection, reporting, public health significance, community members, Indicator-based surveillance, Event-based surveillance, focal persons, Health Surveillance Assistants (HSAs). |
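Because each keyword cell is a comma-separated list ending in a period, it can be split into a set and compared across questions. The sketch below is illustrative only: the helper names `split_keywords` and `jaccard` are hypothetical, the keyword strings are copied from rows 2 and 3 above, and Jaccard similarity is simply one convenient overlap measure, not something the dataset prescribes.

```python
def split_keywords(cell):
    """Turn a keyword cell into a set of lowercase keywords.

    Assumption: keywords are comma-separated and the cell may end with
    a period, as in the example rows.
    """
    parts = cell.rstrip(". ").split(",")
    return {part.strip().lower() for part in parts if part.strip()}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Keyword cells copied from rows 2 and 3 of the example above.
KEYWORDS = {
    2: "Alert, early warning signs, potential public health event, community surveillance.",
    3: "Alert, early warning signs, potential public health event, community surveillance, triaging, verification.",
}

kw2 = split_keywords(KEYWORDS[2])
kw3 = split_keywords(KEYWORDS[3])
print(sorted(kw3 - kw2))             # ['triaging', 'verification']
print(round(jaccard(kw2, kw3), 2))   # 0.67
```

Questions 2 and 3 share four of six distinct keywords, which matches their closely related subject matter (alerts).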
Acknowledgements and Copyright statements
The dataset is available under the CC BY-NC-ND 4.0 licence (Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International).
Created:
2024-01-25



