Datasets for NLPCC2022.SharedTask5.Track2

Name: Datasets for NLPCC2022.SharedTask5.Track2
Creator: www.doi.org
License: No description available

Indexed from www.doi.org on 2025-03-24

Download link:

https://www.doi.org/10.11922/sciencedb.j00104.00101

Description:

Domain knowledge graphs have been widely adopted in various domains, e.g., medicine, agriculture, and the service industry, because they support functions such as intelligent search and personalized recommendation. A knowledge graph is normally composed of a huge number of entities and the relations that connect them, and its utility largely depends on the richness of these entities and relations. To construct high-value knowledge graphs (e.g., informative domain knowledge graphs), researchers aim to automatically extract entities from massive heterogeneous sources, which is often impossible to achieve with purely manual labor. With the growth of natural language processing (NLP), researchers have proposed Named Entity Recognition (NER) techniques to automatically extract entities from raw text. NER is mostly regarded as a supervised sequence labeling/tagging task; that is, recognizing entities in unseen text according to patterns learned from labeled text. As an essential step in knowledge graph construction, as well as in some other NLP tasks, NER has been one of the main focuses of both academia and industry in recent years. Against this background, this competition aims to explore novel and insightful NER methods that better capture entities, especially for the construction of domain knowledge graphs.
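To make the sequence labeling view of NER concrete, here is a minimal sketch of how per-token labels can be turned back into entities. It assumes the common BIO tagging scheme (B- opens an entity, I- continues it, O is outside any entity); the description above does not state which scheme this dataset uses, and the function name `bio_to_spans`, the example sentence, and the entity types are hypothetical.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from BIO tags.

    Assumption: tags follow the BIO scheme; this is illustrative and not
    confirmed to be the annotation format of the dataset.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Emit the entity collected so far, if any, and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens = []
        current_type = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity, closes any open entity.
            flush()
    flush()
    return spans


# Hypothetical example sentence and entity types.
tokens = ["Aspirin", "treats", "heart", "disease"]
tags = ["B-DRUG", "O", "B-DISEASE", "I-DISEASE"]
print(bio_to_spans(tokens, tags))
```

A supervised NER model predicts the `tags` sequence for unseen text; a decoding step like this one then recovers the entities that would populate a knowledge graph.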

Provider:

www.doi.org
