Collection of Commonly Used NER Datasets

Name: Collection of Commonly Used NER Datasets (NER常用数据集汇总)
Creator: Alibaba Cloud Tianchi (阿里云天池)
Published: 2026-05-16 02:45:03
License: No description provided

Alibaba Cloud Tianchi. Updated: 2026-05-16. Indexed: 2024-03-07.

Download link:

https://tianchi.aliyun.com/dataset/145108

Resource overview:

Named entity recognition (NER) is a fundamental NLP task that has long received wide attention from academia and industry. This resource compiles common Chinese and English NER datasets and, for each one, lists the language, scale, number of entity categories, associated paper, download link, GitHub code repository, and (optionally) public evaluation task. Where the licenses permit, some of these resources are also hosted on the Tianchi site for the convenience of algorithm practitioners.

Provider:

Alibaba Cloud Tianchi

Created:

2023-02-01

Collected summary

Dataset introduction

Background and challenges

Background overview

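This dataset is a curated collection of commonly used named entity recognition (NER) datasets. It covers Chinese and English datasets across multiple domains (such as news, e-commerce, medical, and Weibo) and gives key information for each one: language, scale, number of entity categories, paper, and download link. Its distinguishing feature is broad coverage, including fine-grained, multimodal, and cross-lingual datasets. For convenience, some resources are hosted on the Tianchi site, which makes the collection suitable for algorithm practitioners studying and researching NER tasks.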

The content above was collected and summarized by 遇见数据集.