
NIH NCBI PubMed Central (PMC) Article Datasets - Full-Text Biomedical and Life Sciences Journal Articles on AWS

Source: www.ncbi.nlm.nih.gov (indexed 2025-03-24)
Download link:
https://www.ncbi.nlm.nih.gov/pmc/about/copyright/
Description:
PubMed Central® (PMC) is a free full-text archive of biomedical and life sciences journal articles at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM). The PubMed Central (PMC) Article Datasets include full-text articles archived in PMC and made available under license terms that allow for text mining and other types of secondary analysis and reuse. The articles are organized on AWS based on general license type:

The PMC Open Access (OA) Subset, which includes all articles in PMC with a machine-readable Creative Commons license

The Author Manuscript Dataset, which includes all articles collected under a funder policy in PMC and made available in machine-readable formats for text mining

These datasets collectively span more than half of PMC’s total collection of full-text articles. PMC enables access to these datasets to expand the impact of open access and publicly funded research; enable greater machine learning across the spectrum of scientific research; reach new audiences; and open new doors for discovery.

The bucket in this registry contains individual articles in NISO Z39.96-2015 JATS XML format, as well as in plain text extracted from the XML. The bucket is updated daily with new and updated articles. Also included are file lists that contain metadata for the articles in each dataset.
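
Since the articles are distributed as JATS XML alongside plain text derived from that XML, a rough sketch of pulling a title and body paragraphs out of a JATS document may help readers get started. The example below uses only the Python standard library and a tiny, hypothetical inline sample; the sample content and the helper names (`SAMPLE_JATS`, `normalize`, `extract_article`) are illustrative assumptions, not part of the dataset. It is not the procedure PMC uses to produce its plain-text files, and real articles carry far more structure (DOCTYPE declarations, references, tables, formulas).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified JATS-like sample (not a real PMC article).
# The element names (article, front, article-meta, title-group, article-title,
# body, sec, p, italic) are standard JATS tags.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A hypothetical example article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph with <italic>inline</italic> markup.</p>
      <p>Second paragraph.</p>
    </sec>
  </body>
</article>"""


def normalize(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def extract_article(xml_string):
    """Return the article title and a list of body paragraphs as plain text."""
    root = ET.fromstring(xml_string)

    # itertext() flattens inline markup such as <italic> into plain text.
    title_el = root.find(".//article-title")
    title = normalize("".join(title_el.itertext())) if title_el is not None else ""

    paragraphs = []
    body = root.find("body")
    if body is not None:
        for p in body.iter("p"):
            paragraphs.append(normalize("".join(p.itertext())))

    return {"title": title, "paragraphs": paragraphs}


if __name__ == "__main__":
    article = extract_article(SAMPLE_JATS)
    print(article["title"])
    for paragraph in article["paragraphs"]:
        print(paragraph)
```

For bulk processing, the same function could be applied to each XML file after it has been downloaded from the bucket; the file lists mentioned above would be the natural place to look up which files belong to which dataset.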

Provider:
National Library of Medicine (NLM)