iCite Database Snapshot 2023-10 | Scientific Publications Dataset | Impact Analysis Dataset

DataCite Commons | Updated 2023-11-14 | Indexed 2024-07-13

Scientific publications

Impact analysis

Download link:

https://nih.figshare.com/articles/dataset/iCite_Database_Snapshot_2023-10/24534028/1


Description:

This is a database snapshot of the iCite web service, provided here as a single zipped CSV file or as compressed, tarred JSON files. In addition, citation links in the NIH Open Citation Collection are provided as a two-column CSV table in open_citation_collection.zip.

iCite provides bibliometrics and metadata on publications indexed in PubMed, organized into three modules:

Influence: Delivers metrics of scientific influence, field-adjusted and benchmarked to NIH publications as the baseline.

Translation: Measures how Human, Animal, or Molecular/Cellular Biology-oriented each paper is; tracks and predicts citation by clinical articles.

Open Cites: Disseminates link-level, public-domain citation data from the NIH Open Citation Collection.

Definitions for individual data fields:

pmid: PubMed Identifier, an article ID as assigned in PubMed by the National Library of Medicine.

doi: Digital Object Identifier, if available.

year: Year the article was published.

title: Title of the article.

authors: List of author names.

journal: Journal name (ISO abbreviation).

is_research_article: Flag indicating whether the Publication Type tags for this article are consistent with those of a primary research article.

relative_citation_ratio: Relative Citation Ratio (RCR), OPA's metric of scientific influence. Field-adjusted, time-adjusted and benchmarked against NIH-funded papers. The median RCR for NIH-funded papers in any field is 1.0. An RCR of 2.0 means a paper is receiving twice as many citations per year as the median NIH-funded paper in its field and year, while an RCR of 0.5 means that it is receiving half as many citations per year. Calculation details are documented in Hutchins et al., PLoS Biol. 2016;14(9):e1002541.

provisional: RCRs for papers published in the previous two years are flagged as "provisional", to reflect that citation metrics for newer articles are not necessarily as stable as they are for older articles.
Provisional RCRs are provided for papers published in the previous year if they have received 5 or more citations, despite being, in many cases, less than a year old. All papers published in the year before the previous year receive provisional RCRs. The current year is considered to be the NIH Fiscal Year, which starts in October. For example, in July 2019 (NIH Fiscal Year 2019), papers from 2018 receive provisional RCRs if they have 5 or more citations, and all papers from 2017 receive provisional RCRs. In October 2019, at the start of NIH Fiscal Year 2020, papers from 2019 receive provisional RCRs if they have 5 or more citations, and all papers from 2018 receive provisional RCRs.

citation_count: Number of unique articles that have cited this one.

citations_per_year: Citations per year that this article has received since its publication. If the article appeared as both a preprint and a published article, the year of the published version is used as the primary publication date. This is the numerator for the Relative Citation Ratio.

field_citation_rate: Measure of the intrinsic citation rate of this paper's field, estimated using its co-citation network.

expected_citations_per_year: Citations per year that NIH-funded articles with the same Field Citation Rate, published in the same year as this paper, receive. This is the denominator for the Relative Citation Ratio.

nih_percentile: Percentile rank of this paper's RCR compared to all NIH publications. For example, 95% indicates that this paper's RCR is higher than that of 95% of all NIH-funded publications.
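The fiscal-year rule for provisional RCRs and the RCR ratio described above can be sketched in Python. This is a minimal illustration, not iCite's implementation: the function names and status labels are hypothetical, and the handling of papers published in or after the current fiscal year is an assumption, since the description does not cover that case.

```python
from datetime import date


def nih_fiscal_year(on: date) -> int:
    """NIH Fiscal Year for a calendar date; the fiscal year starts in October."""
    return on.year + 1 if on.month >= 10 else on.year


def rcr_status(pub_year: int, citation_count: int, on: date) -> str:
    """Classify a paper's RCR status on a given date (labels are hypothetical).

    "provisional"     : RCR is flagged as provisional
    "not_provisional" : paper is older than the two-year provisional window
    "none"            : no RCR is provided under the stated rule
    """
    fiscal_year = nih_fiscal_year(on)
    if pub_year <= fiscal_year - 3:
        return "not_provisional"
    if pub_year == fiscal_year - 2:
        # All papers from the year before the previous year are provisional.
        return "provisional"
    if pub_year == fiscal_year - 1:
        # Previous-year papers need 5 or more citations.
        return "provisional" if citation_count >= 5 else "none"
    # Assumption: papers from the current fiscal year or later get no RCR.
    return "none"


def relative_citation_ratio(citations_per_year: float,
                            expected_citations_per_year: float) -> float:
    """RCR as numerator over denominator, per the field definitions above."""
    return citations_per_year / expected_citations_per_year
```

For this snapshot, dated October 2023, the rule places the snapshot in NIH Fiscal Year 2024, so papers from 2023 would need 5 or more citations and all papers from 2022 would be provisional.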
human: Fraction of MeSH terms that are in the Human category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).

animal: Fraction of MeSH terms that are in the Animal category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).

molecular_cellular: Fraction of MeSH terms that are in the Molecular/Cellular Biology category (out of this article's MeSH terms that fall into the Human, Animal, or Molecular/Cellular Biology categories).

x_coord: X coordinate of the article on the Triangle of Biomedicine.

y_coord: Y coordinate of the article on the Triangle of Biomedicine.

is_clinical: Flag indicating that this paper meets the definition of a clinical article.

cited_by_clin: PMIDs of clinical articles that this article has been cited by.

apt: Approximate Potential to Translate is a machine learning-based estimate of the likelihood that this publication will be cited in later clinical trials or guidelines. Calculation details are documented in Hutchins et al., PLoS Biol. 2019;17(10):e3000416.

cited_by: PMIDs of articles that have cited this one.

references: PMIDs of articles in this article's reference list.

Large CSV files are zipped using zip version 4.5, which is more recent than the default unzip command line utility in some common Linux distributions. These files can be unzipped with tools that support version 4.5 or later, such as 7zip.

Comments and questions can be addressed to iCite@mail.nih.gov
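Python's standard zipfile module reads ZIP64 archives (the extension introduced in version 4.5 of the ZIP format), so it is one option when the system unzip utility cannot open the large zipped CSV files. The following is a minimal sketch, assuming the archive holds at least one UTF-8 CSV file with a header row; the function name iter_csv_rows is hypothetical and not part of iCite.

```python
import csv
import io
import zipfile
from typing import Dict, Iterator, Optional


def iter_csv_rows(zip_path: str,
                  member: Optional[str] = None) -> Iterator[Dict[str, str]]:
    """Stream rows of a CSV file inside a zip archive without extracting it.

    If `member` is not given, the first entry ending in ".csv" is used.
    Assumes UTF-8 text and a header row (both are assumptions).
    """
    with zipfile.ZipFile(zip_path) as archive:
        if member is None:
            csv_members = [name for name in archive.namelist()
                           if name.lower().endswith(".csv")]
            if not csv_members:
                raise ValueError("no CSV file found in " + zip_path)
            member = csv_members[0]
        with archive.open(member) as raw:
            # Wrap the binary stream so csv can read it as text, row by row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)
```

Because rows are yielded one at a time, the file never has to fit in memory. A caller would iterate with something like `for row in iter_csv_rows(path_to_zip):` and read fields such as `row["pmid"]`, assuming the CSV header uses the field names listed above.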

Provider:

The NIH Figshare Archive

Created:

2023-11-14
