German Fine-Grained Named Entity Recognition and Relation Extraction Corpus

Source: arXiv | Updated: 2020-04-07 | Indexed: 2024-06-21
Download link:
https://dfki-lt-re-group.bitbucket.io/smartdata-corpus
Description:
The German fine-grained named entity recognition and relation extraction corpus was created by the Berlin branch of the German Research Center for Artificial Intelligence (DFKI) to support decision-making in domains such as personal travel planning and supply chain management. It contains 2598 documents drawn from news articles, Twitter posts, and traffic reports. The annotations cover fine-grained geographic entities such as streets, stations, and routes, along with 15 relation and event types related to transport and industry. The documents were randomly sampled from large text streams and then annotated so that the corpus can be used to train and evaluate named entity recognition algorithms and relation extraction systems. The corpus is particularly suited to the problem of extracting events related to specific companies, transport routes, and locations from heterogeneous, high-volume text streams.
Provider:
German Research Center for Artificial Intelligence (DFKI), Berlin branch
Created:
2020-04-07