Pubmed Journal Recommendation System dataset
收藏Mendeley Data2024-05-10 更新2024-06-28 收录
下载链接:
https://zenodo.org/records/8386011
下载链接
链接失效反馈官方服务:
资源简介:
Dataset for Journal recommendation, includes title, abstract, keywords, and journal. We extracted the journals and more information of: Jiasheng Sheng. (2022). PubMed-OA-Extraction-dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6330817. Dataset Components: data_pubmed_all: This dataset encompasses all articles, each containing the following columns: 'pubmed_id', 'title', 'keywords', 'journal', 'abstract', 'conclusions', 'methods', 'results', 'copyrights', 'doi', 'publication_date', 'authors', 'AKE_pubmed_id', 'AKE_pubmed_title', 'AKE_abstract', 'AKE_keywords', 'File_Name'. data_pubmed: To focus on recent and relevant publications, we have filtered this dataset to include articles published within the last five years, from January 1, 2018, to December 13, 2022—the latest date in the dataset. Additionally, we have exclusively retained journals with more than 200 published articles, resulting in 262,870 articles from 469 different journals. data_pubmed_train, data_pubmed_val, and data_pubmed_test: For machine learning and model development purposes, we have partitioned the 'data_pubmed' dataset into three subsets—training, validation, and test—using a random 60/20/20 split ratio. Notably, this division was performed on a per-journal basis, ensuring that each journal's articles are proportionally represented in the training (60%), validation (20%), and test (20%) sets. The resulting partitions consist of 157,540 articles in the training set, 52,571 articles in the validation set, and 52,759 articles in the test set.
创建时间:
2023-10-05



