Output datasets from ML–assisted bibliometric workflow in African phytochemical metabolomics research
Source: DataCite Commons | Updated: 2025-10-19 | Indexed: 2026-02-09
Download link:
https://figshare.com/articles/dataset/Output_datasets_from_ML_assisted_bibliometric_workflow_in_African_phytochemical_metabolomics_research/30396481/1
Description:
This collection contains supplementary datasets generated during the machine learning–assisted bibliometric workflow for metabolomics and phytochemical research. The datasets are sequential outputs derived from integrating and harmonising bibliographic metadata from Scopus, Web of Science (WoS), and Dimensions, processed in R and Python environments.

The datasets were produced at distinct workflow stages:

Dataset 1A (merged_dataset2.xlsx): Consolidated metadata produced in R from the merged raw bibliographic exports of Scopus, WoS, and Dimensions.
Dataset 1B (sampled_data.xlsx): A stratified random sample generated in Python for pretraining and manual annotation.
Dataset 1C (sample_data_pretrained.xlsx): Annotated sample dataset manually screened according to inclusion and exclusion criteria.
Dataset 1D (highlighted_full_data_with_predictions.xlsx): The complete harmonised dataset automatically classified using the trained XGBoost model.
Dataset 1E (absolute_metabolomics_data.xlsx): Final curated dataset of relevant records extracted from the ML-filtered corpus.

Importantly, the file names of the datasets presented here were renamed from their original Google Drive file paths (referenced in the Python Google Colab scripts) to ensure sequential, descriptive, and logically ordered naming. This adjustment enhances clarity, reproducibility, and cross-reference consistency across all linked repositories.
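The stratified random sampling step that yields Dataset 1B could look roughly like the sketch below. This is an illustration only, not the authors' script: the description does not state the stratification variable, the sampling fraction, or the column names, so the `source` field, the 10% fraction, the seed, and the `stratified_sample` helper are all assumptions. The example uses only the Python standard library and toy records in place of the actual spreadsheet.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from each stratum defined by `key`.

    Hypothetical helper for illustration; not taken from the dataset's scripts.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)

    sample = []
    for label in sorted(strata):  # sorted for a deterministic stratum order
        group = strata[label]
        k = max(1, round(len(group) * fraction))  # keep at least one per stratum
        sample.extend(rng.sample(group, k))  # sampling without replacement
    return sample


# Toy records standing in for rows of the merged metadata (assumed fields).
records = (
    [{"source": "Scopus", "title": f"S{i}"} for i in range(50)]
    + [{"source": "WoS", "title": f"W{i}"} for i in range(30)]
    + [{"source": "Dimensions", "title": f"D{i}"} for i in range(20)]
)

sample = stratified_sample(records, key="source", fraction=0.1)
print(len(sample))
```

Sampling within each stratum, rather than across the pooled records, keeps each source database represented in the annotation set in proportion to its share of the merged corpus.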
Provider:
figshare
Created:
2025-10-19



