TrendyGenes, a computational pipeline for the detection of literature trends in academia and drug discovery
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/records/8362679
Resource description:
TrendyGenes Literature Mining
This repository contains the files and code to build the TrendyGenes pipeline described in the paper "TrendyGenes, a computational pipeline for the detection of literature trends in academia and drug discovery" (Serrano Nájera et al. 2021).
Contents
The folder contains the following files:
PubMed_*.csv.gz: CSV files containing PubMed metadata (titles, abstracts etc.) split into multiple files
CoCitations*.csv.gz: CSV files containing co-citation networks computed from PubMed
MeSH2PMID.csv.gz: Map of MeSH terms to PMIDs
Authorship_Neo4J_complete.csv.gz: Authorship information for PubMed papers
Disease2PMID_Neo4J_complete.csv.gz: Map of disease terms to PMIDs after disambiguation
Genes_Neo4J_complete_CCPU.csv.gz: Map of genes to PMIDs after disambiguation
genes.csv.gz: List of human genes
diseases.csv.gz: List of MeSH disease terms
import_command*.txt: Commands to import data into Neo4j graph database
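Before importing anything, it can help to look at the header and first few rows of each compressed CSV. The following is a minimal sketch using only the Python standard library. It assumes the files are UTF-8, comma-delimited, and start with a header row, none of which is stated above. The function name peek_csv_gz is hypothetical and not part of the repository.

```python
import csv
import gzip
from itertools import islice


def peek_csv_gz(path, n_rows=3):
    """Return (header, first n_rows data rows) of a gzip-compressed CSV.

    Assumptions (not confirmed by the repository description):
    UTF-8 encoding, comma delimiter, first row is a header.
    """
    # "rt" opens the gzip stream in text mode so csv.reader gets strings.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # reads only the rows requested
    return header, rows
```

For example, peek_csv_gz("genes.csv.gz") would return the column names and the first three rows without decompressing the whole file, which matters for the larger PubMed_*.csv.gz files.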
Building the Knowledge Graph
The various CSV files can be imported into a Neo4j graph database to build the knowledge graph containing publications, authors, genes, diseases etc. and their connections as described in the paper.
The import_command*.txt files contain the Neo4j bulk-import commands needed to load the CSV files into the database. For background on CSV import, see the Neo4j guide:
https://neo4j.com/developer/guide-import-csv/
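Because the bulk import expects every CSV to be present, a quick pre-flight check of the download folder can save a failed run. The sketch below only tests that each file pattern listed under Contents matches at least one file. It does not run or parse the import commands themselves. The name missing_patterns is hypothetical.

```python
from pathlib import Path

# File patterns taken from the Contents list of this record.
EXPECTED_PATTERNS = [
    "PubMed_*.csv.gz",
    "CoCitations*.csv.gz",
    "MeSH2PMID.csv.gz",
    "Authorship_Neo4J_complete.csv.gz",
    "Disease2PMID_Neo4J_complete.csv.gz",
    "Genes_Neo4J_complete_CCPU.csv.gz",
    "genes.csv.gz",
    "diseases.csv.gz",
    "import_command*.txt",
]


def missing_patterns(folder, patterns=EXPECTED_PATTERNS):
    """Return the patterns that match no file directly inside `folder`."""
    root = Path(folder)
    # any() stops at the first match, so large folders are not fully listed.
    return [p for p in patterns if not any(root.glob(p))]
```

Calling missing_patterns(".") in the download folder returns an empty list when every expected file is present; any pattern it returns points to a file that still needs to be downloaded.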
Citation
Serrano Nájera G, Narganes Carlón D, Crowther DJ. TrendyGenes, a computational pipeline for the detection of literature trends in academia and drug discovery. Scientific Reports. 2021 Aug 3;11(1):15747.
License
[MIT]
Created:
2023-09-20



