
Parsed Open Citations and PubMed Data

Source: doi.org (indexed 2025-01-16)
Download link:
https://doi.org/10.13012/B2IDB-5216575_V1
Resource description:
This dataset contains five files. (i) open_citations_jan2024_pub_ids.csv.gz, open_citations_jan2024_iid_el.csv.gz, open_citations_jan2024_el.csv.gz, and open_citation_jan2024_pubs.csv.gz represent a conversion of Open Citations to an edge list that uses integer ids assigned by us. The integer ids can be mapped to omids, pmids, and dois using the open_citation_jan2024_pubs.csv and open_citations_jan2024_pub_ids.csv files. The network consists of 121,052,490 nodes and 1,962,840,983 edges. Code for generating these data can be found at https://github.com/chackoge/ERNIE_Plus/tree/master/OpenCitations. (ii) The fifth file, baseline2024.csv.gz, provides information about the metadata of PubMed papers. A 2024 version of PubMed was downloaded using Entrez and parsed into a table restricted to records that contain a pmid, a doi, a title, and an abstract. A value of 1 in a column indicates that the corresponding information exists in the metadata, and a value of 0 indicates that it does not. Code for generating these data can be found at https://github.com/illinois-or-research-analytics/pubmed_etl. If you use these data or code in your work, please cite https://doi.org/10.13012/B2IDB-5216575_V1.
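
Since the edge list has nearly two billion rows, it is usually more practical to stream the compressed file than to load it whole. The sketch below shows one way to do that and to map a few integer ids back to external identifiers. It is a minimal illustration and not the authors' code: the function names are hypothetical, and the column layout (citing id first, cited id second, a header row, integer id in column 0 and an external id in column 1 of the mapping file) is an assumption that should be checked against the actual files.

```python
import csv
import gzip
from collections import Counter
from typing import Dict, Iterable, Iterator, Set, Tuple


def iter_edges(path: str, has_header: bool = True) -> Iterator[Tuple[int, int]]:
    """Stream (citing_id, cited_id) pairs from a gzipped CSV edge list.

    Assumption: the first two columns hold the integer ids of the citing
    and cited publication, in that order.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            yield int(row[0]), int(row[1])


def in_degree(edges: Iterable[Tuple[int, int]]) -> Counter:
    """Count how often each integer id appears as the cited node."""
    counts: Counter = Counter()
    for _, cited in edges:
        counts[cited] += 1
    return counts


def load_id_map(
    path: str,
    wanted: Set[int],
    id_col: int = 0,
    ext_col: int = 1,
    has_header: bool = True,
) -> Dict[int, str]:
    """Return {integer_id: external_id} for the ids in `wanted` only.

    Assumption: `id_col` holds the integer id and `ext_col` holds one
    external identifier (for example a doi); both positions are guesses.
    """
    mapping: Dict[int, str] = {}
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)
        for row in reader:
            if len(row) <= max(id_col, ext_col):
                continue
            iid = int(row[id_col])
            if iid in wanted:
                mapping[iid] = row[ext_col]
    return mapping
```

A call might look like `counts = in_degree(iter_edges("open_citations_jan2024_iid_el.csv.gz"))` followed by `load_id_map("open_citations_jan2024_pub_ids.csv.gz", {i for i, _ in counts.most_common(10)})`. Which of the listed files holds the integer edge list is itself an assumption here. A full in-degree count over 121,052,490 nodes needs several gigabytes of memory, so sampling the first rows with `itertools.islice` may be a sensible first step.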

Provider:
Illinois Data Bank