
Padiweb collection of automatically and manually verified articles on highly pathogenic avian influenza (HPAI) outbreaks in Russia, Northern Europe, and Eastern Europe

Source: DataCite Commons. Updated 2026-04-13. Indexed 2026-04-25.
Download link:
https://dataverse.cirad.fr/citation?persistentId=doi:10.18167/DVN1/SOQHUX
Resource description:
Padiweb is a multilingual text-mining software that collects and classifies Google News articles related to animal health events.

We built a dataset of articles on highly pathogenic avian influenza (HPAI) outbreaks in Russia, Northern Europe, and Eastern Europe in 2020 and 2021. These articles, collected by Padiweb in English, Russian, Norwegian, Finnish, and Swedish, have been normalized and manually labeled according to the HPAI status of the event. For each relevant article, location names have been extracted into a separate dataset.

This repository includes:
- A Read.me file containing the full description of the databases and data processing, including the annotation criteria and a data dictionary
- A dataset of 656 Padiweb articles manually labeled according to HPAI status
- A dataset of 3,100 locations extracted from relevant Padiweb articles

Provider:
CIRAD Dataverse
Created:
2026-01-29