Output datasets from ML–assisted bibliometric workflow in African phytochemical metabolomics research

Source: DataCite Commons. Updated: 2025-10-19. Indexed: 2026-02-09.
Download link:
https://figshare.com/articles/dataset/Output_datasets_from_ML_assisted_bibliometric_workflow_in_African_phytochemical_metabolomics_research/30396481/1
Description:
This collection contains supplementary datasets generated during the machine learning–assisted bibliometric workflow for metabolomics and phytochemical research. The datasets are sequential outputs derived from integrating and harmonising bibliographic metadata from Scopus, Web of Science (WoS), and Dimensions, processed in R and Python environments.

The datasets were produced at distinct workflow stages:

Dataset 1A (merged_dataset2.xlsx): Consolidated metadata produced in R from the merged raw bibliographic exports of Scopus, WoS, and Dimensions.
Dataset 1B (sampled_data.xlsx): A stratified random sample generated in Python for pretraining and manual annotation.
Dataset 1C (sample_data_pretrained.xlsx): Annotated sample dataset, manually screened according to inclusion and exclusion criteria.
Dataset 1D (highlighted_full_data_with_predictions.xlsx): The complete harmonised dataset, automatically classified using the trained XGBoost model.
Dataset 1E (absolute_metabolomics_data.xlsx): Final curated dataset of relevant records extracted from the ML-filtered corpus.

Note that the file names presented here differ from the original Google Drive file paths referenced in the Python Google Colab scripts. The files were renamed to give them sequential, descriptive, and logically ordered names. This renaming improves clarity, reproducibility, and cross-reference consistency across all linked repositories.
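As an illustration of the stratified random sampling step behind Dataset 1B, the following is a minimal sketch only and not the authors' script. The stratification column (`source`), the sampling fraction, the seed, and the toy records are all assumptions made for the example; the description does not state which variable was used for stratification or how large the sample was.

```python
import pandas as pd


def stratified_sample(df: pd.DataFrame, strata_col: str, frac: float, seed: int = 42) -> pd.DataFrame:
    """Draw the same fraction of rows from every stratum of `strata_col`."""
    return df.groupby(strata_col).sample(frac=frac, random_state=seed)


# Toy stand-in for the merged metadata. The real file would be loaded with
# pd.read_excel("merged_dataset2.xlsx"); its actual column names are not
# given in the description, so "title" and "source" are hypothetical.
records = pd.DataFrame({
    "title": [f"record {i}" for i in range(20)],
    "source": ["Scopus"] * 10 + ["WoS"] * 6 + ["Dimensions"] * 4,
})

# Sample half of each stratum so the source proportions are preserved.
sample = stratified_sample(records, "source", frac=0.5)
counts = sample["source"].value_counts().to_dict()
```

Sampling within each group keeps the proportions of the strata in the annotated subset close to those of the full corpus, which is the usual reason for preferring a stratified sample over a simple random one.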
Provider: figshare
Created: 2025-10-19