
Training and development dataset for information extraction in plant epidemiomonitoring

Source: DataCite Commons. Updated: 2025-05-16. Indexed: 2025-04-16.
Download link:
https://entrepot.recherche.data.gouv.fr/citation?persistentId=doi:10.57745/ZDNOGF
Description:
The “Training and development dataset for information extraction in plant epidemiomonitoring” is the annotation set of the “Corpus for the epidemiomonitoring of plant”. The annotations cover seven entity types (e.g. species, locations, diseases), their normalisation against the NCBI Taxonomy and GeoNames, seven types of binary relationships, and ternary relationships. Each annotation refers to character positions within the documents of the corpus. The annotation guidelines define each annotation type and give representative examples. Both datasets are intended for the training and validation of information extraction methods.

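Because the annotations point at character positions rather than embedding the text, a consumer has to resolve each annotation against the corresponding corpus document. The sketch below illustrates that idea only. The `EntityAnnotation` class, its field names, the half-open offset convention, and the sample sentence are all assumptions made for illustration; the dataset's actual file format and offset convention are defined by its own documentation and annotation guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityAnnotation:
    """Hypothetical in-memory form of one character-offset annotation."""
    entity_type: str  # illustrative label such as "species" or "location"
    start: int        # offset of the first character (inclusive, assumed)
    end: int          # offset just past the last character (exclusive, assumed)


def resolve(text: str, annotation: EntityAnnotation) -> str:
    """Return the span of `text` that the annotation points at."""
    if not 0 <= annotation.start < annotation.end <= len(text):
        raise ValueError(f"offsets out of range: {annotation}")
    return text[annotation.start:annotation.end]


# Made-up document text and offsets, not taken from the corpus.
document = "Xylella fastidiosa was detected on olive trees in Apulia."
annotations = [
    EntityAnnotation("species", 0, 18),
    EntityAnnotation("location", 50, 56),
]

# Pair each annotation type with the text span it refers to.
resolved = [(a.entity_type, resolve(document, a)) for a in annotations]
```

Validating the offsets before slicing is a deliberate choice here: Python slicing silently truncates out-of-range indices, which would hide a mismatch between an annotation file and the document version it was created against.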
Provider:
Recherche Data Gouv
Created:
2025-01-27