WNUT 2017 (WNUT 2017 Emerging and Rare entity recognition)
OpenDataLab | Updated: 2026-05-24 | Indexed: 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/WNUT_2017
Resource description:
This shared task focuses on identifying unusual, previously unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks, such as event clustering and summarization, but recall on them is a real problem in noisy text, even among annotators. This drop in recall tends to be due to novel entities and surface forms. Take the tweet "so.. kktny in 30 mins?": even human experts find the entity "kktny" hard to detect and resolve. The task evaluates the ability to detect and classify novel, emerging, singleton named entities in noisy text. Its goal is to provide a definition of emerging and rare entities and, based on that definition, datasets for detecting them.
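To make the notion of a "previously unseen" entity concrete, here is a minimal Python sketch. It assumes token-level BIO tags (for example `B-location`, `I-location`, `O`), which is a common convention for named entity data but is not stated in this listing. The function names, the toy sentences, and the `creative-work` label given to "kktny" are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, BIO tag); the tag scheme is an assumption


def extract_entities(sentence: List[Token]) -> List[Tuple[str, str]]:
    """Collect (surface form, entity type) pairs from one BIO-tagged sentence."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    etype = None
    for word, tag in sentence:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and etype != tag[2:]
        )
        if starts_new:
            # Close any open entity, then open a new one.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [word], tag[2:]
        elif tag.startswith("I-"):
            # Continue the currently open entity.
            words.append(word)
        else:
            # An "O" tag closes any open entity.
            if words:
                entities.append((" ".join(words), etype))
            words, etype = [], None
    if words:
        entities.append((" ".join(words), etype))
    return entities


def unseen_entities(
    train: List[List[Token]], test: List[List[Token]]
) -> List[Tuple[str, str]]:
    """Return test entities whose lowercased surface form never occurs in train."""
    seen = {
        surface.lower()
        for sent in train
        for surface, _ in extract_entities(sent)
    }
    return [
        (surface, etype)
        for sent in test
        for surface, etype in extract_entities(sent)
        if surface.lower() not in seen
    ]


# Toy data, invented for this example (not taken from the dataset).
train = [[("I", "O"), ("love", "O"), ("London", "B-location")]]
test = [
    [("so..", "O"), ("kktny", "B-creative-work"), ("in", "O"),
     ("30", "O"), ("mins", "O"), ("?", "O")],
    [("back", "O"), ("in", "O"), ("london", "B-location")],
]

print(unseen_entities(train, test))
```

On this toy data only "kktny" is reported, because "london" already appears in the training sentences once case is ignored. Matching on lowercased surface forms is one simple design choice; the task's own definition of emerging and rare entities may differ.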
Provider:
OpenDataLab
Created:
2022-08-16
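Summary
Dataset introduction

Background and challenges
Background overview
The WNUT 2017 dataset focuses on recognizing emerging and rare named entities in noisy text. It aims to define such novel, previously unseen entities and to evaluate how well systems detect them, and it serves as a benchmark for related tasks.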
The content above was collected and summarized by 遇见数据集.



