Benchmark data and models for fine-grained geographic NER with few-shot learning

Source: DataCite Commons | Updated: 2025-05-07 | Indexed: 2025-09-08
Download link:
https://figshare.com/articles/dataset/Benchmark_data_and_models_for_fine-grained_geographic_NER_with_few-shot_learning/25559193
Resource description:
Geographic Named Entity Recognition (GNER) focuses on extracting geographic entity names from text and classifying them into pre-defined categories. Previous methods have not paid much attention to identifying fine-grained categories of geographic entities in sparse data situations, thus remaining limited in serving various geographic applications. To address this limitation, this paper presents a fine-grained GNER task and proposes a fine-grained GNER model, LH-FGNER, which incorporates a prototype network and hierarchical contrastive learning to improve fine-grained GNER. Specifically, the model designs label-guided sentence-level prototypes to capture the contextual semantics of geographic entities. It introduces a hierarchy tree to guide the construction of prototypes in vector space, which utilizes the hierarchy as a priori knowledge to improve the discrimination of fine-grained categories. In addition, two datasets are constructed to support the study of the fine-grained GNER task. Experimental results show that the proposed model is superior to the baseline and is robust. This work provides a methodological reference for few-shot GNER, which can be used to facilitate various geographic applications with text.

The materials include the following:
data: Data from the preparation process.
pre_model: Model storage files from the experimental process of the paper, which allow for direct testing without the need for retraining the model.
LH-FGNER: Code for the method proposed in the paper (LH-FGNER), along with code for related experimental discussions, non-open-source baselines, etc. It also includes environment files for reproduction and a README file for guidance.
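For readers unfamiliar with the prototype network mentioned in the description, the basic idea can be sketched as follows. This is a generic nearest-prototype classifier, not the LH-FGNER implementation: it leaves out the label-guided sentence-level prototypes and the hierarchical contrastive learning, and the function names, labels, and toy 2-D embeddings are all assumptions made up for illustration.

```python
import math
from collections import defaultdict


def build_prototypes(support):
    """Average each class's support embeddings into a single prototype vector."""
    sums = {}
    counts = defaultdict(int)
    for label, vec in support:
        if label not in sums:
            sums[label] = list(vec)
        else:
            sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {
        label: [x / counts[label] for x in total]
        for label, total in sums.items()
    }


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Hypothetical few-shot support set: two examples per fine-grained class.
support = [
    ("river", [0.9, 0.1]),
    ("river", [1.1, -0.1]),
    ("mountain", [-1.0, 1.0]),
    ("mountain", [-0.8, 1.2]),
]

prototypes = build_prototypes(support)
predicted = classify([0.7, 0.2], prototypes)
print(predicted)
```

In a real few-shot setting the vectors would come from a text encoder rather than being written by hand; the code in the LH-FGNER directory and its README are the reference for how the paper's model actually builds and uses prototypes.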

Provider:
figshare
Created:
2025-05-07