
CrossNER

ModelScope community · Updated 2026-05-07 · Indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/yingxi/cross_ner
Resource summary:
# ontonotes4 Named Entity Recognition Dataset

## Dataset Overview

OntoNotes Release 4.0, Linguistic Data Consortium (LDC) catalog number LDC2011T03 and ISBN 1-58563-574-X, was developed as part of the OntoNotes project, a collaborative effort between BBN Technologies, the University of Colorado, the University of Pennsylvania, and the University of Southern California's Information Sciences Institute. The goal of the project is to annotate a large corpus comprising various genres of text (news, conversational telephone speech, weblogs, usenet newsgroups, broadcast, talk shows) in three languages (English, Chinese, and Arabic) with structural information (syntax and predicate argument structure) and shallow semantics (word sense linked to an ontology and coreference). OntoNotes Release 4.0 is supported by the Defense Advanced Research Projects Agency, GALE Program Contract No. HR0011-06-C-0022.

### Dataset Format and Structure

The data follows the CoNLL format and has two columns: the first column holds the tokens of the input sentence (individual characters in the example below), and the second column holds the named entity type label for each token. A concrete example:

```
军 O
方 O
已 O
经 O
营 O
救 O
出 O
了 O
1 O
1 O
名 O
菲 B-GPE
律 I-GPE
宾 I-GPE
人 O
质 O
。 O
获 O
救 O
的 O
人 O
质 O
为 O
以 O
前 O
电 O
视 O
布 O
道 O
家 O
阿 B-PER
美 I-PER
达 I-PER
```

## Dataset Copyright Information

© 2006 Al Arabiya, © 2006 Al Hayat,

For the dataset file metadata and the data files, see the "Dataset Files" page.

This dataset card uses the default template, and the contributor has not provided a more detailed introduction. The dataset can be downloaded with a Git clone command or with the ModelScope SDK.
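The two-column CoNLL layout described above can be read with a few lines of Python. The following is a minimal sketch, not the dataset's official loader: the function name `parse_conll` is hypothetical, and it assumes that columns are separated by whitespace and that sentences are separated by blank lines (a common CoNLL convention that the card itself does not state).

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]


def parse_conll(text: str) -> List[Sentence]:
    """Parse two-column CoNLL text into sentences of (token, tag) pairs.

    Assumption: a blank line marks a sentence boundary.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the sentence collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        # Split on the last run of whitespace so the tag is the final column.
        token, tag = line.rsplit(None, 1)
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


# Small sample built from tokens in the example above.
sample = "菲 B-GPE\n律 I-GPE\n宾 I-GPE\n人 O\n质 O\n\n阿 B-PER\n美 I-PER\n达 I-PER\n"
parsed = parse_conll(sample)
print(len(parsed), parsed[0][:3])
```

A line with only one column raises a `ValueError` in this sketch, which makes malformed rows visible instead of silently skipping them.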

# CrossNER Named Entity Recognition Dataset

## Dataset Overview

CrossNER is an English named entity recognition dataset covering multiple domains, including literature, politics, music, science, and artificial intelligence.

### Dataset Introduction

This release contains all of the test sets of CrossNER. For the specific entity types and data sizes, see the [original CrossNER repository](https://github.com/zliucr/CrossNER).

### Dataset Format and Structure

The data follows the standard CoNLL format and has two columns: the first column holds the tokens of the input sentence, and the second column holds the named entity type label for each token. A concrete example:

    Typical O
    generative O
    model O
    approaches O
    include O
    naive B-algorithm
    Bayes I-algorithm
    classifier I-algorithm
    s O
    , O
    Gaussian B-algorithm
    mixture I-algorithm
    model I-algorithm
    s O
    , O
    variational B-algorithm
    autoencoders I-algorithm
    and O
    others O
    . O

## Dataset Copyright Information

Creative Commons Attribution 4.0 International

## Citation

    @inproceedings{DBLP:conf/aaai/Liu0YDJCMF21,
      author    = {Zihan Liu and Yan Xu and Tiezheng Yu and Wenliang Dai and Ziwei Ji and Samuel Cahyawijaya and Andrea Madotto and Pascale Fung},
      title     = {CrossNER: Evaluating Cross-Domain Named Entity Recognition},
      booktitle = {Thirty-Fifth {AAAI} Conference on Artificial Intelligence, {AAAI} 2021, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, {IAAI} 2021, The Eleventh Symposium on Educational Advances in Artificial Intelligence, {EAAI} 2021, Virtual Event, February 2-9, 2021},
      pages     = {13452--13460},
      publisher = {{AAAI} Press},
      year      = {2021},
      url       = {https://ojs.aaai.org/index.php/AAAI/article/view/17587},
      timestamp = {Mon, 07 Jun 2021 11:46:04 +0200},
      biburl    = {https://dblp.org/rec/conf/aaai/Liu0YDJCMF21.bib},
      bibsource = {dblp computer science bibliography, https://dblp.org}
    }
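The `B-`/`I-`/`O` labels in the CrossNER example above mark where each entity begins and continues. As an illustration, the sketch below groups such labels into entity spans. The helper `bio_to_spans` is a hypothetical name and is not part of the CrossNER repository or the ModelScope SDK; it assumes plain BIO tagging and treats a stray `I-` tag as the start of a new span.

```python
from typing import List, Sequence, Tuple


def bio_to_spans(tokens: Sequence[str], tags: Sequence[str]) -> List[Tuple[str, str]]:
    """Group BIO tags into (entity text, entity type) pairs."""
    spans: List[Tuple[str, str]] = []
    start = None  # index of the first token of the open span, if any
    label = None  # entity type of the open span, if any
    # The trailing "O" is a sentinel that flushes a span ending at the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((" ".join(tokens[start:i]), label))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


tokens = ("Typical generative model approaches include naive Bayes classifier s , "
          "Gaussian mixture model s , variational autoencoders and others .").split()
tags = (["O"] * 5
        + ["B-algorithm", "I-algorithm", "I-algorithm", "O", "O"]
        + ["B-algorithm", "I-algorithm", "I-algorithm", "O", "O"]
        + ["B-algorithm", "I-algorithm", "O", "O", "O"])

entities = bio_to_spans(tokens, tags)
print(entities)
```

Tokens are joined with a single space here, which suits the English CrossNER text; character-level data would be joined without a separator.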
Provider:
maas
Created:
2023-02-09