Dates related to the research results presented in the article: A Novel Multi-Criteria AI Approach for Selecting Word Embedding Models to Personalize Rankings of Scholar Publications

Indexed by NIAID Data Ecosystem on 2026-05-02

Download link:

https://data.mendeley.com/datasets/7ysnrrwn3f

Resource description:

Dataset description for the article "A Novel Multi-Criteria AI Approach for Selecting Word Embedding Models to Personalize Rankings of Scholar Publications".

This dataset contains the results of the experiments described in the article, which focus on personalizing scholarly publication rankings using text embedding models.

Contents of the CSV files (32 in total): Each file provides a ranked list of scholarly publications retrieved from the Scopus database, enriched with natural language processing (NLP) features.

Data structure:

Standard metadata columns (from the Scopus export): typical bibliographic information such as Title, Authors, Year, DOI, Source title, Abstract, Keywords, Affiliations, Document Type, Cited by, and others.

Computed columns (for experimentation):

combined_embeddings: A list of numeric values representing the semantic embedding vector generated from the concatenated title and abstract of the publication.

distance_cosine: The cosine distance between the publication’s embedding and a reference embedding (e.g., based on a user query). Values range from 0 to 1, where lower values indicate higher semantic similarity.

Purpose and use: These data support the analysis and comparison of embedding models in the context of personalized scholarly recommendation. They also serve as reproducible material for the experiments presented in the paper.
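
As an illustration of how the two computed columns relate, the following is a minimal Python sketch, not the authors' code. It assumes that combined_embeddings is stored in each CSV cell as a bracketed list of numbers; the listing does not state the exact serialization. The sample rows, the two-dimensional vectors, and the helper names cosine_distance and rank_publications are invented for the example. The sketch parses each embedding, computes its cosine distance to a reference vector, and sorts so that the most similar publication comes first.

```python
import ast
import csv
import io
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_publications(csv_file, reference):
    """Read rows, parse embeddings, and sort by distance to `reference`."""
    rows = []
    for row in csv.DictReader(csv_file):
        # Assumption: the embedding cell is a bracketed list such as "[0.1, 0.2]".
        vector = ast.literal_eval(row["combined_embeddings"])
        row["distance_cosine"] = cosine_distance(vector, reference)
        rows.append(row)
    # Lower distance means higher semantic similarity, so sort ascending.
    return sorted(rows, key=lambda r: r["distance_cosine"])


# Tiny made-up sample standing in for one of the 32 files.
SAMPLE = io.StringIO(
    "Title,combined_embeddings\n"
    'Paper A,"[1.0, 0.0]"\n'
    'Paper B,"[0.0, 1.0]"\n'
    'Paper C,"[1.0, 1.0]"\n'
)

ranked = rank_publications(SAMPLE, reference=[1.0, 0.0])
for row in ranked:
    print(row["Title"], round(row["distance_cosine"], 3))
```

To try this on a downloaded file, the in-memory sample could be replaced with a file handle opened via open(path, newline="", encoding="utf-8"). The reference vector would need to come from the same embedding model as the file's combined_embeddings column for the distances to be meaningful.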

Created:

2025-06-03