Nerwip Corpus
Source: DataCite Commons. Updated: 2020-09-04. Indexed: 2024-07-27.
Download link:
https://figshare.com/articles/dataset/Nerwip_Corpus/1289791/10
Description:
This corpus contains 409 Wikipedia articles. They are biographies, manually annotated to highlight entities of the following types: Dates, Locations, Organizations and Persons. The corpus was designed for use with our tool Nerwip (https://github.com/CompNet/nerwip), in order to evaluate and compare existing NER tools on biographical data. It was first assembled by Burcu Küpelioglu during her end-of-studies project, then cleaned and corrected by Samet Atdag during his MSc, yielding a total of 250 articles (v3). Vincent Labatut then extended it to 409 articles (v4). The dataset is shared under a Creative Commons Zero (CC0) license. If you use it, please cite the following article: A Comparison of Named Entity Recognition Tools Applied to Biographical Texts, S. Atdag & V. Labatut, 2013. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6632052&tag=1 The other files contain data related to the NER tools (models, dictionaries, etc.), which Nerwip needs in order to detect entities. If you want to use the tool, you need to unzip these files as explained in the README file associated with Nerwip on GitHub.
Provider:
figshare
Created:
2016-01-19



