Output datasets from ML–assisted bibliometric workflow in African phytochemical metabolomics research

Name: Output datasets from ML–assisted bibliometric workflow in African phytochemical metabolomics research
Creator: figshare
Published: 2025-10-19 22:25:34
License: No description available

DataCite Commons | Updated 2025-10-19 | Indexed 2026-02-09

Download link:

https://figshare.com/articles/dataset/Output_datasets_from_ML_assisted_bibliometric_workflow_in_African_phytochemical_metabolomics_research/30396481/1

Description:

This collection contains supplementary datasets generated during the machine learning–assisted bibliometric workflow for metabolomics and phytochemical research. The datasets represent sequential outputs derived from the integration and harmonisation of bibliographic metadata from Scopus, Web of Science (WoS), and Dimensions, processed via R and Python environments.

The datasets were produced through distinct workflow stages:

Dataset 1A (merged_dataset2.xlsx): Consolidated metadata produced in R from the merged raw bibliographic exports of Scopus, WoS, and Dimensions.
Dataset 1B (sampled_data.xlsx): A stratified random sample generated in Python for pretraining and manual annotation.
Dataset 1C (sample_data_pretrained.xlsx): Annotated sample dataset manually screened according to inclusion and exclusion criteria.
Dataset 1D (highlighted_full_data_with_predictions.xlsx): The complete harmonised dataset automatically classified using the trained XGBoost model.
Dataset 1E (absolute_metabolomics_data.xlsx): Final curated dataset of relevant records extracted from the ML-filtered corpus.

Importantly, the file names of each dataset presented here were renamed from their original Google Drive file paths (referenced in the Python Google Colab scripts) to ensure sequential, descriptive, and logically ordered naming. This adjustment enhances clarity, reproducibility, and cross-reference consistency across all linked repositories.
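The stratified sampling step that yields Dataset 1B can be illustrated with a minimal sketch. This is not the authors' Colab script: the description does not state the stratification variable, the sampling fraction, or the random seed, so the `source` key, the 10% fraction, the seed, and the helper name `stratified_sample` are all assumptions chosen for illustration, and the records are toy stand-ins for rows of the merged metadata.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_key, fraction, seed=42):
    """Draw roughly the same fraction of records from every stratum.

    Hypothetical helper: the stratum key, fraction, and seed are
    illustrative assumptions, not values taken from the dataset record.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable

    # Group records by the value of the stratification key.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[stratum_key]].append(rec)

    sample = []
    for key in sorted(groups):  # sorted for a deterministic stratum order
        members = groups[key]
        # Take at least one record per stratum so none is left out.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Toy records standing in for rows of the merged metadata (illustrative only).
records = (
    [{"id": f"S{i}", "source": "Scopus"} for i in range(50)]
    + [{"id": f"W{i}", "source": "WoS"} for i in range(30)]
    + [{"id": f"D{i}", "source": "Dimensions"} for i in range(20)]
)

sample = stratified_sample(records, "source", fraction=0.1)
```

Sampling within each stratum, rather than from the pooled records, keeps the proportions of the strata in the annotation subset close to those of the full corpus, which matters when the subset is later used to train a classifier.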

Provider:

figshare

Created:

2025-10-19
