PubMed Research Data Product: Biomedical Citations

Snowflake | Updated 2025-07-10 | Indexed 2025-07-11
Download link:
https://app.snowflake.com/marketplace/listing/GZ2FWZ5FDC1
Description:
The PubMed Research Database by The Data Developers provides the PubMed corpus as a powerful, query-ready relational database in your data environment. This data product eliminates the complexities of API management and XML parsing, allowing your team to move beyond simple searches and perform large-scale, structured analysis of biomedical literature.

This product is more than a list of articles; it is a deeply structured database that connects literature to its critical context. You can immediately query and join tables to access:

- **Core Article Data:** Comprehensive citation information including the `article_title`, `abstract` text, `journal_title`, and publication dates from the `articles` table.
- **Detailed Author & Affiliation Data:** Granular author details, including `last_name`, `fore_name`, `orcid`, and their `affiliation` text for tracking institutional research output, available in the `authors` table.
- **Funding and Grant Information:** Track research funding with detailed `grant_id`, `agency`, and `country` data from the `grants` table.
- **Systematic Topic Indexing:** Leverage the NLM's controlled vocabulary with `descriptor_name` and `qualifier_name` from the `mesh_headings` table for powerful, systematic searches.
- **Linked Chemical & Substance Data:** Connect articles directly to chemical substances via `registry_number` and `substance_name` from the `chemicals` table.
- **Complete Citation Networks:** Map the flow of scientific discovery and identify influential papers by analyzing the citation links within the `references` table.

**Key Value & Features:**

- **Perform Systematic Literature Reviews:** Go beyond simple keyword searching by using the structured `mesh_headings` table to perform precise, repeatable, and comprehensive literature analysis.
- **Analyze Research Funding Landscapes:** Join `articles` and `grants` data to understand which agencies are funding specific research areas, track competitor funding, and identify key opinion leaders.
- **Map Institutional & Competitive Output:** Use author affiliation data to benchmark research output from academic institutions and commercial competitors, find collaborators, and track scientific trends.
- **Build Powerful Knowledge Graphs:** Utilize the rich relational links between `articles`, `authors`, `chemicals`, and citation `references` to model the entire landscape of biomedical research.

This product is essential for teams in Medical Affairs, Competitive Intelligence, Health Economics and Outcomes Research (HEOR), R&D Strategy, and academic institutions that need to derive deep insights from the world's biomedical literature.
Provider:
The Data Developers LLC
Created:
2025-07-10
Summary of Original Information

PubMed Research Data Product: Biomedical Citations

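Overview

  • Provides the PubMed corpus as a relational database, supporting large-scale NLP and AI-driven biomedical discovery.
  • Eliminates the complexity of API management and XML parsing, enabling structured analysis of biomedical literature.

Core Data Tables

  • articles table: article title (article_title), abstract (abstract), journal title (journal_title), publication dates, and more.
  • authors table: author last name (last_name), first name (fore_name), ORCID (orcid), and affiliation (affiliation).
  • grants table: funding information (grant_id, agency, country).
  • mesh_headings table: NLM controlled vocabulary (descriptor_name, qualifier_name).
  • chemicals table: chemical substance information (registry_number, substance_name).
  • references table: citation links between articles.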
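
As a rough sketch of how these tables relate, the Python snippet below builds toy stand-ins for `articles` and `grants` in an in-memory SQLite database and counts funded articles per agency. It assumes every table carries a `pubmed_id` key, as the usage examples on this page suggest. The function name `funding_by_agency`, the reduced column lists, and all sample rows are invented for illustration, and SQLite is only a local substitute for Snowflake.

```python
import sqlite3


def funding_by_agency(conn):
    """Count distinct funded articles per agency, highest first."""
    query = """
        SELECT g.agency, COUNT(DISTINCT a.pubmed_id) AS funded_articles
        FROM articles a
        JOIN grants g ON a.pubmed_id = g.pubmed_id
        GROUP BY g.agency
        ORDER BY funded_articles DESC, g.agency
    """
    return conn.execute(query).fetchall()


# Toy schema: only columns named on this page, plus the assumed pubmed_id key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (pubmed_id INTEGER, article_title TEXT, journal_title TEXT);
    CREATE TABLE grants (pubmed_id INTEGER, grant_id TEXT, agency TEXT, country TEXT);
""")

# Invented sample rows; article 3 has no grant and drops out of the join.
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", [
    (1, "Toy article A", "Toy Journal"),
    (2, "Toy article B", "Toy Journal"),
    (3, "Toy article C", "Another Journal"),
])
conn.executemany("INSERT INTO grants VALUES (?, ?, ?, ?)", [
    (1, "G-001", "Agency X", "United States"),
    (1, "G-002", "Agency X", "United States"),  # second grant on the same article
    (2, "G-003", "Agency X", "United States"),
    (2, "G-004", "Agency Y", "United Kingdom"),
])

funding_rows = funding_by_agency(conn)
print(funding_rows)
```

`COUNT(DISTINCT ...)` matters here because one article can carry several grants from the same agency; without it, article 1 would be counted twice for Agency X.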

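Key Value and Features

  • Perform systematic literature reviews
  • Analyze research funding landscapes
  • Map institutional and competitive research output
  • Build powerful knowledge graphs

Applicable Fields

  • Medical Affairs
  • Competitive Intelligence
  • Health Economics and Outcomes Research (HEOR)
  • R&D Strategy
  • Academic institutions

Business Needs

  • Life sciences commercialization: identify key opinion leaders (KOLs)
  • Real-world data (RWD): provide scientific context
  • Market access and HEOR: accelerate value dossier creation
  • Competitive intelligence and R&D strategy: monitor competitor activity
  • Data-driven grant applications: strengthen proposals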
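Data Dictionary (Partial)

ARTICLES table

  • ABSTRACT: Summary of the article's content
  • ARTICLE_TITLE: Full title of the article
  • JOURNAL_TITLE: Name of the journal
  • DOI: Digital Object Identifier
  • PUBLICATION_DAY: Day of publication

Usage Examples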

  1. Identify key opinion leaders in schizophrenia research

```sql
-- Rank authors by the number of distinct articles indexed under the MeSH descriptor.
SELECT a.last_name, a.fore_name, a.affiliation,
       COUNT(DISTINCT a.pubmed_id) AS publication_count
FROM authors a
JOIN mesh_headings mh ON a.pubmed_id = mh.pubmed_id
WHERE mh.descriptor_name = 'Schizophrenia'
GROUP BY a.last_name, a.fore_name, a.affiliation
ORDER BY publication_count DESC
LIMIT 20;
```

  2. Identify the top funded research areas

```sql
-- Count distinct NIH-funded articles per MeSH descriptor.
SELECT mh.descriptor_name,
       COUNT(DISTINCT g.pubmed_id) AS funded_publication_count
FROM grants g
JOIN mesh_headings mh ON g.pubmed_id = mh.pubmed_id
WHERE g.agency ILIKE '%National Institutes of Health%'
   OR g.agency ILIKE '%NIH%'
GROUP BY mh.descriptor_name
ORDER BY funded_publication_count DESC
LIMIT 20;
```

  3. Discover emerging keywords in high-impact journals

```sql
-- The source listing elides parts of this query; the ellipses are kept as-is.
SELECT k.keyword_text, COUNT(*) AS keyword_frequency
FROM keywords k
JOIN articles a ON k.pubmed_id = a.pubmed_id
WHERE (a.journal_title ILIKE '%Nature%' OR ...)
  AND NOT EXISTS (...)
GROUP BY k.keyword_text
```
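
To try the first example without Snowflake access, here is a minimal sketch that runs the same query shape against an in-memory SQLite database. The helper name `top_authors`, the bind parameters, the extra tie-breaking sort on `last_name`, and every sample row are assumptions made for illustration; only the table and column names come from this page.

```python
import sqlite3


def top_authors(conn, descriptor, limit=20):
    """Rank authors by distinct articles indexed under a MeSH descriptor."""
    query = """
        SELECT a.last_name, a.fore_name, a.affiliation,
               COUNT(DISTINCT a.pubmed_id) AS publication_count
        FROM authors a
        JOIN mesh_headings mh ON a.pubmed_id = mh.pubmed_id
        WHERE mh.descriptor_name = ?
        GROUP BY a.last_name, a.fore_name, a.affiliation
        ORDER BY publication_count DESC, a.last_name
        LIMIT ?
    """
    return conn.execute(query, (descriptor, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (pubmed_id INTEGER, last_name TEXT, fore_name TEXT,
                          orcid TEXT, affiliation TEXT);
    CREATE TABLE mesh_headings (pubmed_id INTEGER, descriptor_name TEXT,
                                qualifier_name TEXT);
""")

# Invented sample rows: one row per (article, author) pair.
conn.executemany("INSERT INTO authors VALUES (?, ?, ?, ?, ?)", [
    (1, "Doe", "Jane", None, "Example University"),
    (2, "Doe", "Jane", None, "Example University"),
    (2, "Roe", "Sam", None, "Sample Institute"),
    (3, "Roe", "Sam", None, "Sample Institute"),
])
conn.executemany("INSERT INTO mesh_headings VALUES (?, ?, ?)", [
    (1, "Schizophrenia", None),
    (1, "Schizophrenia", "genetics"),  # same article, second qualifier
    (2, "Schizophrenia", None),
    (3, "Depression", None),
])

kol_rows = top_authors(conn, "Schizophrenia")
print(kol_rows)
```

Passing the descriptor as a bind parameter avoids the quoting mistakes that are easy to make when a string literal is pasted into the query text. Examples 2 and 3 rely on `ILIKE`, which SQLite does not provide; its `LIKE` is already case-insensitive for ASCII text, so a local test would need that substitution.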

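Technical Information

  • Update frequency: monthly
  • Time coverage: most recent 1 month
  • Geographic coverage: global
  • Cloud region availability: AWS US East (N. Virginia)

Provider Information

  • Provider: The Data Developers LLC
  • Sales contact: marketplace.snowflake@thedatadevelopers.com
  • Support contact: support.marketplace@thedatadevelopers.com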
Collected Summary
Dataset Introduction
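Background and Challenges
Background Overview
This dataset turns PubMed biomedical literature into a directly queryable relational database containing structured data on articles, authors, funding, and topic indexing. It supports in-depth applications such as systematic literature reviews and research funding analysis, and suits fields such as medical affairs and academic research.
The above content was collected and summarized by 遇见数据集.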