bigbio/pubtator_central
Source: Hugging Face. Updated 2022-12-22; indexed 2024-03-04.
Download link: https://hf-mirror.com/datasets/bigbio/pubtator_central
Resource description:
---
language:
- en
bigbio_language:
- English
license: other
multilinguality: monolingual
bigbio_license_shortname: NCBI_LICENSE
pretty_name: PubTator Central
homepage: https://www.ncbi.nlm.nih.gov/research/pubtator/
bigbio_pubmed: True
bigbio_public: True
bigbio_tasks:
- NAMED_ENTITY_RECOGNITION
- NAMED_ENTITY_DISAMBIGUATION
---
# Dataset Card for PubTator Central
## Dataset Description
- **Homepage:** https://www.ncbi.nlm.nih.gov/research/pubtator/
- **Pubmed:** True
- **Public:** True
- **Tasks:** NER, NED

PubTator Central (PTC, https://www.ncbi.nlm.nih.gov/research/pubtator/) is a web service for
exploring and retrieving bioconcept annotations in full-text biomedical articles. PTC provides
automated annotations from state-of-the-art text mining systems for genes/proteins, genetic
variants, diseases, chemicals, species and cell lines, all available for immediate download. PTC
annotates PubMed (30 million abstracts), the PMC Open Access Subset and the Author Manuscript
Collection (3 million full-text articles). Updated entity identification methods and a
disambiguation module based on cutting-edge deep learning techniques provide increased accuracy.
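
To make the two tasks concrete, here is a minimal sketch of reading annotations in the tab-delimited PubTator text format, one of the formats PTC offers for download. Each annotation line carries a character span and an entity type (the NER output) plus a concept identifier (the NED output). The sample document, its PMID `12345`, and the names `Annotation` and `parse_pubtator` are invented for illustration and are not part of this dataset's loader. The sketch also assumes that offsets index into the title and abstract joined by a single character.

```python
from collections import Counter
from typing import NamedTuple


class Annotation(NamedTuple):
    pmid: str
    start: int         # character offset where the mention begins
    end: int           # character offset just past the mention
    mention: str
    entity_type: str   # NER label, e.g. "Disease"
    concept_id: str    # NED result, e.g. a MeSH ID; empty if absent


def parse_pubtator(text: str) -> tuple[dict[str, str], list[Annotation]]:
    """Parse one PubTator-format document into passages and annotations."""
    passages: dict[str, str] = {}
    annotations: list[Annotation] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Passage lines look like "PMID|t|title" or "PMID|a|abstract".
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[1] in ("t", "a") and "\t" not in parts[0]:
            passages[parts[1]] = parts[2]
            continue
        # Annotation lines: PMID, start, end, mention, type, optional ID.
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip lines that match neither layout
        pmid, start, end, mention, entity_type = fields[:5]
        concept_id = fields[5] if len(fields) > 5 else ""
        annotations.append(
            Annotation(pmid, int(start), int(end), mention, entity_type, concept_id)
        )
    return passages, annotations


# Made-up sample document (hypothetical PMID), used only for illustration.
SAMPLE = (
    "12345|t|Aspirin for headache in mice\n"
    "12345|a|Aspirin reduced headache.\n"
    "12345\t0\t7\tAspirin\tChemical\tMESH:D001241\n"
    "12345\t12\t20\theadache\tDisease\tMESH:D006261\n"
    "12345\t24\t28\tmice\tSpecies\t10090\n"
    "12345\t29\t36\tAspirin\tChemical\tMESH:D001241\n"
    "12345\t45\t53\theadache\tDisease\tMESH:D006261\n"
)

passages, annotations = parse_pubtator(SAMPLE)
# Assumption: offsets refer to title + one separator character + abstract.
full_text = passages["t"] + " " + passages["a"]
type_counts = Counter(a.entity_type for a in annotations)
print(type_counts)
```

Checking that `full_text[a.start:a.end] == a.mention` for every annotation is a quick way to confirm the offset convention before training or evaluating an NER model on real records.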
## Citation Information
```
@article{10.1093/nar/gkz389,
title = {{PubTator central: automated concept annotation for biomedical full text articles}},
author = {Wei, Chih-Hsuan and Allot, Alexis and Leaman, Robert and Lu, Zhiyong},
year = 2019,
month = {05},
journal = {Nucleic Acids Research},
volume = 47,
number = {W1},
pages = {W587-W593},
doi = {10.1093/nar/gkz389},
issn = {0305-1048},
url = {https://doi.org/10.1093/nar/gkz389},
eprint = {https://academic.oup.com/nar/article-pdf/47/W1/W587/28880193/gkz389.pdf}
}
```
Provider: bigbio
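Summary of original information
Dataset overview: PubTator Central
Basic information
- Language: English
- License: NCBI_LICENSE
- Multilinguality: monolingual
- Dataset name: PubTator Central
- Homepage: https://www.ncbi.nlm.nih.gov/research/pubtator/
Dataset description
- PubMed integration: yes
- Publicly available: yes
- Task types:
  - Named entity recognition (NER)
  - Named entity disambiguation (NED)
PubTator Central is a web service for exploring and retrieving bioconcept annotations in full-text biomedical articles. It provides automated annotations from state-of-the-art text mining systems for genes/proteins, genetic variants, diseases, chemicals, species and cell lines, all available for immediate download. PubTator Central annotates PubMed (30 million abstracts), the PMC Open Access Subset and the Author Manuscript Collection (3 million full-text articles). Updated entity identification methods and a disambiguation module based on deep learning techniques provide increased accuracy.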



