
iCite Database Snapshot 2021-02

Source: Mendeley Data. Updated: 2024-01-31. Indexed: 2024-06-30.
Download link:
https://nih.figshare.com/articles/dataset/iCite_Database_Snapshot_2021-02/14174006
Description:
This is a database snapshot of the iCite web service, provided here as a single zipped CSV file or as compressed, tarred JSON files. In addition, citation links in the NIH Open Citation Collection are provided as a two-column CSV table in open_citation_collection.zip.

iCite provides bibliometrics and metadata on publications indexed in PubMed, organized into three modules:

Influence: Delivers metrics of scientific influence, field-adjusted and benchmarked to NIH publications as the baseline.
Translation: Measures how Human, Animal, or Molecular/Cellular Biology-oriented each paper is; tracks and predicts citation by clinical articles.
Open Cites: Disseminates link-level, public-domain citation data from the NIH Open Citation Collection.

Definitions for individual data fields:

pmid: PubMed Identifier, an article ID as assigned in PubMed by the National Library of Medicine.
doi: Digital Object Identifier, if available.
year: Year the article was published.
title: Title of the article.
authors: List of author names.
journal: Journal name (ISO abbreviation).
is_research_article: Flag indicating whether the Publication Type tags for this article are consistent with those of a primary research article.
relative_citation_ratio: Relative Citation Ratio (RCR), OPA's metric of scientific influence. Field-adjusted, time-adjusted, and benchmarked against NIH-funded papers. The median RCR for NIH-funded papers in any field is 1.0. An RCR of 2.0 means a paper is receiving twice as many citations per year as the median NIH-funded paper in its field and year, while an RCR of 0.5 means that it is receiving half as many citations per year. Calculation details are documented in Hutchins et al., PLoS Biol. 2016;14(9):e1002541.
provisional: RCRs for papers published in the previous two years are flagged as "provisional", to reflect that citation metrics for newer articles are not necessarily as stable as they are for older articles. Provisional RCRs are provided for papers published in the previous year if they have received 5 citations or more, despite being, in many cases, less than a year old. All papers published the year before the previous year receive provisional RCRs. The current year is taken to be the NIH Fiscal Year, which starts in October. For example, in July 2019 (NIH Fiscal Year 2019), papers from 2018 receive provisional RCRs if they have 5 citations or more, and all papers from 2017 receive provisional RCRs. In October 2019, at the start of NIH Fiscal Year 2020, papers from 2019 receive provisional RCRs if they have 5 citations or more, and all papers from 2018 receive provisional RCRs.
citation_count: Number of unique articles that have cited this one.
citations_per_year: Citations per year that this article has received since its publication. If this appeared as a preprint and a published article, the year from the published version is used as the primary publication date. This is the numerator for the Relative Citation Ratio.
field_citation_rate: Measure of the intrinsic citation rate of this paper's field, estimated using its co-citation network.
expected_citations_per_year: Citations per year that NIH-funded articles, with the same Field Citation Rate and published in the same year as this paper, receive. This is the denominator for the Relative Citation Ratio.
nih_percentile: Percentile rank of this paper's RCR compared to all NIH publications. For example, 95% indicates that this paper's RCR is higher than 95% of all NIH-funded publications.
human: Fraction of MeSH terms that are in the Human category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).
animal: Fraction of MeSH terms that are in the Animal category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).
molecular_cellular: Fraction of MeSH terms that are in the Molecular/Cellular Biology category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).
x_coord: X coordinate of the article on the Triangle of Biomedicine.
y_coord: Y coordinate of the article on the Triangle of Biomedicine.
is_clinical: Flag indicating that this paper meets the definition of a clinical article.
cited_by_clin: PMIDs of clinical articles that this article has been cited by.
apt: Approximate Potential to Translate, a machine learning-based estimate of the likelihood that this publication will be cited in later clinical trials or guidelines. Calculation details are documented in Hutchins et al., PLoS Biol. 2019;17(10):e3000416.
cited_by: PMIDs of articles that have cited this one.
references: PMIDs of articles in this article's reference list.

Large CSV files are zipped using zip version 4.5, which is newer than the version supported by the default unzip command-line utility in some common Linux distributions. These files can be unzipped with tools that support version 4.5 or later, such as 7zip.

Comments and questions can be addressed to iCite@mail.nih.gov
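
For readers without a compatible unzip tool, the zipped CSV can also be read in place. The following is a minimal Python sketch, not an official iCite tool. It assumes the archive uses standard deflate compression and that the CSV starts with a header row; the function name `iter_csv_rows` is hypothetical. Python's standard `zipfile` module reads ZIP64 archives, which is the format extension that zip version 4.5 corresponds to.

```python
import csv
import io
import zipfile
from typing import Dict, Iterator, Optional


def iter_csv_rows(zip_path: str, member: Optional[str] = None) -> Iterator[Dict[str, str]]:
    """Stream rows from a CSV stored inside a zip archive without extracting it."""
    # cited_by and references hold lists of PMIDs, so a single field can be
    # longer than the csv module's default per-field limit; raise the limit.
    csv.field_size_limit(2**31 - 1)
    with zipfile.ZipFile(zip_path) as archive:
        if member is None:
            # Assumption: the archive holds one CSV; take the first .csv entry.
            csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
            if not csv_names:
                raise ValueError(f"no CSV member found in {zip_path}")
            member = csv_names[0]
        with archive.open(member) as raw:
            # Assumption: the file is UTF-8 encoded with a header row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)
```

Because the function is a generator, rows are decompressed and parsed one at a time, so the full snapshot never has to fit in memory. Every value arrives as a string; converting fields such as year or relative_citation_ratio to numbers is left to the caller.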

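The RCR ratio and the provisional rule in the field definitions can be made concrete with a short Python sketch. This is an illustration of the rules as worded in the description, not iCite's actual implementation. The function names and the three status labels are hypothetical, and two points are assumptions the description does not settle: that previous-year papers with fewer than 5 citations receive no RCR, and that papers from the current fiscal year or later receive none.

```python
from datetime import date
from typing import Optional


def nih_fiscal_year(on: date) -> int:
    """NIH fiscal year containing the given date (the fiscal year starts in October)."""
    return on.year + 1 if on.month >= 10 else on.year


def rcr_status(pub_year: int, citation_count: int, as_of: date) -> str:
    """Classify a paper's RCR status on a given date, following the provisional rule."""
    current = nih_fiscal_year(as_of)
    if pub_year == current - 1:
        # Previous year: provisional only with 5 or more citations.
        # Assumption: below that threshold no RCR is reported.
        return "provisional" if citation_count >= 5 else "no_rcr"
    if pub_year == current - 2:
        # Year before the previous year: always provisional.
        return "provisional"
    if pub_year < current - 2:
        return "not_provisional"
    # Assumption: papers dated in the current fiscal year or later get no RCR yet.
    return "no_rcr"


def relative_citation_ratio(citations_per_year: float,
                            expected_citations_per_year: float) -> Optional[float]:
    """RCR as numerator over denominator; None when the denominator is not positive."""
    if expected_citations_per_year <= 0:
        return None
    return citations_per_year / expected_citations_per_year
```

With these definitions, `rcr_status(2018, 5, date(2019, 7, 15))` and `rcr_status(2019, 5, date(2019, 10, 1))` both return "provisional", matching the July 2019 and October 2019 examples in the description, and `relative_citation_ratio(6.0, 3.0)` gives 2.0, a paper cited twice as often per year as the median NIH-funded paper in its field and year.
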
Created:
2024-01-31