Annotated datasets from PADI-web for event-based surveillance of Avian Influenza, African Swine Fever, and West-Nile Virus Disease
DataCite Commons. Updated 2025-05-16. Indexed 2025-04-16.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/99SNOZ
Resource description:
These datasets contain unstructured data (articles) from news items detected by an event-based surveillance system, PADI-web, between 2022 and 2023. The collected articles were manually annotated by relevance for epidemic intelligence purposes with the help of two epidemiologists. The extracted data include relevant articles (with two possible labels: epidemiological events or general information) and irrelevant information regarding three different diseases: Avian Influenza (AI), African Swine Fever (ASF) and West Nile Virus disease (WNV). This database is extensive, as it covers different types of disease (zoonotic, cross-border and vector-borne), and it can be used to train or evaluate classification approaches that automatically identify written text on these disease events and classify it by relevance.

The structure of the dataset is as follows:
Alert_id: Article identifier. Each article has a unique ID; if an article reports multiple events, its row is duplicated and each line represents one event.
Title: Article's title as given by the news outlet.
hsource: URL of the news outlet reporting the article.
Source: Name of the news outlet reporting the article.
url: URL of the article reporting the considered event. Note that multiple articles can report the same event.
Issue_date: Publication date of the article.
Country: Name of the country where the event happened.
Place_name: Name of the administration, city or district where the event happened. If none of these is mentioned in the text, the country's name is reported.
Administrative_division: The administrative level at which the information is reported (country, department, city...).
Disease_name: Name of the disease reported in the article.
Species_name: Name of the affected host reported in the article.
Manualclass: Manual classification (Relevant or Irrelevant).
Lat: Latitude of Place_name.
Lon: Longitude of Place_name.
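As a minimal sketch of how the structure above might be handled, the Python snippet below counts how many events each article reports (rows sharing an Alert_id) and splits rows by their Manualclass label. It assumes the data is available as comma-separated text whose header uses the column names listed above; the actual file format and delimiter are not stated in this description. The sample rows are invented placeholders, not records from the dataset, and the helper names (`load_rows`, `events_per_article`, `split_by_relevance`) are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict

# Placeholder rows invented for illustration only; they are NOT taken from the dataset.
# Assumption: comma-separated text with the column names given in the description.
SAMPLE_CSV = """Alert_id,Title,Disease_name,Country,Manualclass
A1,Placeholder title one,Avian Influenza,Country X,Relevant
A1,Placeholder title one,Avian Influenza,Country Y,Relevant
A2,Placeholder title two,African Swine Fever,Country Z,Irrelevant
"""


def load_rows(handle):
    """Read all rows as dictionaries keyed by column name."""
    return list(csv.DictReader(handle))


def events_per_article(rows):
    """Count rows per Alert_id; an article reporting several events appears once per event."""
    return Counter(row["Alert_id"] for row in rows)


def split_by_relevance(rows):
    """Group rows by their Manualclass label."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Manualclass"].strip()].append(row)
    return dict(groups)


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = events_per_article(rows)
by_label = split_by_relevance(rows)

print(counts["A1"], len(by_label["Relevant"]), len(by_label["Irrelevant"]))
```

To run this against a downloaded file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` and the encoding are assumptions to adjust to the real file. Because duplicated rows describe distinct events from the same article, deduplicating on Alert_id before training a relevance classifier may be worth considering so that one article's text is not counted several times.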
Provider:
Recherche Data Gouv
Created:
2023-07-28



