biblio
Mendeley Data · Indexed 2026-04-18
Download link:
https://data.mendeley.com/datasets/bd8wfgv39n
Description:
The dataset “2190 scopus.csv” is a structured bibliographic database exported from Scopus, containing metadata for 2,190 indexed scholarly publications. Each row represents one publication, and the dataset includes 46 variables capturing detailed information related to authorship, journal source, publication characteristics, indexing identifiers, and accessibility status.
The unit of analysis is the individual publication. The dataset includes various document types such as research articles, reviews, conference papers, editorials, and letters, with “Article” being the predominant category. Most records are marked as “Final” under publication stage, indicating completed and officially published works.
Authorship information is represented through three main fields: abbreviated author names, full author names with Scopus Author IDs, and numeric Author IDs. The unique Scopus Author IDs aid disambiguation, which makes it possible to track author productivity and collaboration patterns accurately and to build co-authorship networks. Multiple authors on a paper are separated by semicolons within each field.
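As a minimal sketch of how the semicolon-separated author fields could feed a collaboration analysis, the snippet below counts how often each pair of author IDs appears on the same paper. The column name "Author(s) ID" is an assumption based on typical Scopus CSV exports and should be checked against the actual header; the sample rows are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def split_ids(cell):
    """Split a semicolon-separated cell into trimmed, non-empty parts."""
    return [part.strip() for part in cell.split(";") if part.strip()]


def coauthor_pairs(rows, id_column="Author(s) ID"):
    """Count how often each unordered pair of author IDs shares a paper.

    `id_column` is an assumed column name, not confirmed by the dataset page.
    """
    pairs = Counter()
    for row in rows:
        # Sort and de-duplicate so (a, b) and (b, a) count as the same pair.
        ids = sorted(set(split_ids(row.get(id_column) or "")))
        pairs.update(combinations(ids, 2))
    return pairs


# Made-up rows for illustration only; real rows would come from the CSV.
sample_rows = [
    {"Author(s) ID": "111; 222; 333"},
    {"Author(s) ID": "222; 111"},
    {"Author(s) ID": "444"},
]
pair_counts = coauthor_pairs(sample_rows)
```

With the real file, the rows could be read with `csv.DictReader`, and the resulting pair counts used as edge weights in a co-authorship network.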
The “Title” field provides the full publication title, enabling thematic analysis, keyword extraction, and topic modeling. The “Year” variable supports temporal trend analysis, allowing examination of annual research output and growth patterns. The dataset includes recent publications up to 2025, indicating contemporary coverage.
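The temporal trend analysis mentioned above might start from a simple count of publications per year. The sketch below assumes the year is stored as a text value in a column named "Year" (as the description suggests) and skips rows where it is blank or non-numeric; the sample rows are illustrative, not taken from the dataset.

```python
from collections import Counter


def annual_output(rows, year_column="Year"):
    """Return {year: number of publications}, ordered by year."""
    counts = Counter()
    for row in rows:
        value = (row.get(year_column) or "").strip()
        if value.isdigit():  # ignore blank or malformed year cells
            counts[int(value)] += 1
    return dict(sorted(counts.items()))


# Made-up rows for illustration only.
sample_rows = [{"Year": "2023"}, {"Year": "2024"}, {"Year": "2024"}, {"Year": ""}]
per_year = annual_output(sample_rows)
```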
Journal-related metadata includes Source Title, Abbreviated Source Title, Volume, Issue, Article Number, and pagination details. These variables allow journal frequency analysis, Bradford’s Law applications, and disciplinary mapping. Some journals use article numbering instead of traditional page ranges, leading to minor missing values in pagination fields.
Several standardized identifiers are included, such as the Scopus Electronic Identifier (EID), PubMed ID (for biomedical articles), ISBN (for books or proceedings), and CODEN. The EID serves as a unique document-level identifier, supporting deduplication and database merging. PubMed IDs enable cross-referencing with biomedical databases.
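One way the EID could support deduplication is sketched below: keep the first record seen for each EID and drop later repeats. The column name "EID" is assumed from typical Scopus exports, the EID strings in the sample are placeholders, and keeping rows that lack an EID is a design choice of this sketch rather than anything the dataset prescribes.

```python
def deduplicate_by_eid(rows, eid_column="EID"):
    """Keep the first row for each EID; rows without an EID are kept as-is."""
    seen = set()
    unique = []
    for row in rows:
        eid = (row.get(eid_column) or "").strip()
        if not eid:
            # No identifier to compare on, so the row cannot be judged a duplicate.
            unique.append(row)
            continue
        if eid in seen:
            continue
        seen.add(eid)
        unique.append(row)
    return unique


# Placeholder EIDs for illustration only.
sample_rows = [
    {"EID": "2-s2.0-0000000001", "Title": "A"},
    {"EID": "2-s2.0-0000000002", "Title": "B"},
    {"EID": "2-s2.0-0000000001", "Title": "A (repeat)"},
]
unique_rows = deduplicate_by_eid(sample_rows)
```

The same key could be used to merge this export with another Scopus export, since both would carry the EID for each document.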
The dataset also contains an “Open Access” variable, specifying whether publications are Gold Open Access, Hybrid, or Closed Access. This allows analysis of accessibility trends and open science adoption. The language field indicates the original publication language, predominantly English.
Some fields contain missing values, particularly ISBN, CODEN, PubMed ID (for non-biomedical articles), and pagination where article numbering is used. Missingness appears systematic rather than random and reflects publication format differences.
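The claim that missingness is systematic can be checked directly. The sketch below computes the share of empty cells per column and then asks what fraction of records without a start page carry an article number instead. The column names "Page start", "Art. No." and "ISBN" are assumptions based on typical Scopus exports, and the sample rows are invented.

```python
def is_blank(row, column):
    """True when the cell is absent, empty, or whitespace only."""
    return not (row.get(column) or "").strip()


def missing_share(rows, columns):
    """Return {column: fraction of rows where that column is blank}."""
    total = len(rows)
    if total == 0:
        return {}
    return {c: sum(is_blank(r, c) for r in rows) / total for c in columns}


def share_explained_by(rows, missing_column, explaining_column):
    """Among rows missing `missing_column`, the fraction that have `explaining_column`."""
    missing = [r for r in rows if is_blank(r, missing_column)]
    if not missing:
        return None
    return sum(not is_blank(r, explaining_column) for r in missing) / len(missing)


# Invented rows for illustration only; column names are assumed.
sample_rows = [
    {"Page start": "", "Art. No.": "e123", "ISBN": ""},
    {"Page start": "10", "Art. No.": "", "ISBN": ""},
    {"Page start": "", "Art. No.": "", "ISBN": "placeholder"},
    {"Page start": "5", "Art. No.": "", "ISBN": ""},
]
shares = missing_share(sample_rows, ["Page start", "ISBN"])
explained = share_explained_by(sample_rows, "Page start", "Art. No.")
```

On the real data, a value of `explained` close to 1 would be consistent with pagination gaps arising from article numbering rather than from random omission.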
Created:
2026-02-26



