Universal NER (UNER)

arXiv; updated 2024-03-26; indexed 2024-06-21

Download link:

https://doi.org/10.7910/DVN/GQ8HDL


Resource description:

Universal NER (UNER) is an open, community-driven project that builds multilingual named entity recognition (NER) benchmarks with high-quality, cross-lingually consistent annotations, with the aim of facilitating and standardizing multilingual NER research. UNER v1 contains 19 datasets covering 13 languages; each dataset was annotated, primarily by native speakers, on text from existing Universal Dependencies treebanks. The project provides a shared definition, tag set, and annotation schema designed to apply across languages. UNER v1 addresses the multilingual NLP community's need for standardized, cross-lingual, human-annotated NER data. Following the v1 release, the project plans to expand to new languages and datasets and welcomes new annotators who want to contribute.
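As a rough illustration of how token-level NER annotations of this kind are typically consumed, the sketch below collapses IOB2 tags into entity spans. The sample sentence, the three-column tab-separated layout (index, token, tag), the PER/ORG/LOC tag names, and the helper names `read_sentences` and `extract_entities` are all assumptions made for this example; check the released files for the actual format before relying on it.

```python
from typing import Iterator, List, Tuple

# Hypothetical sample: one token per line as "index<TAB>token<TAB>IOB2 tag".
# The layout and the tag names are assumptions, not taken from the dataset.
SAMPLE = """\
# text = Maria Silva joined Duolingo in Pittsburgh .
1\tMaria\tB-PER
2\tSilva\tI-PER
3\tjoined\tO
4\tDuolingo\tB-ORG
5\tin\tO
6\tPittsburgh\tB-LOC
7\t.\tO
"""


def read_sentences(text: str) -> Iterator[List[Tuple[str, str]]]:
    """Yield each sentence as a list of (token, tag) pairs."""
    sentence: List[Tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Skip comment lines such as "# text = ...".
            continue
        cols = line.split("\t")
        sentence.append((cols[1], cols[2]))
    if sentence:
        yield sentence


def extract_entities(sentence: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Collapse IOB2 tags into (entity text, label) pairs."""
    entities: List[Tuple[str, str]] = []
    tokens: List[str] = []
    label = None
    for token, tag in sentence:
        starts_span = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        )
        if starts_span:
            # Close any open span; a stray I- tag is treated like B-.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [token], tag[2:]
        elif tag.startswith("I-"):
            tokens.append(token)
        else:
            # An "O" tag closes any open span.
            if tokens:
                entities.append((" ".join(tokens), label))
            tokens, label = [], None
    if tokens:
        entities.append((" ".join(tokens), label))
    return entities
```

Calling `extract_entities` on the single sentence produced by `read_sentences(SAMPLE)` returns `[("Maria Silva", "PER"), ("Duolingo", "ORG"), ("Pittsburgh", "LOC")]`. Entity text is joined with spaces for simplicity, which would need adjusting for languages written without spaces between words.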

Provider:

Duolingo

Created:

2023-11-16
