
Indexing status of journals using Open Journal Systems and related properties

DataCite Commons | Updated 2025-05-12 | Indexed 2025-04-15
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/QBJE3V
Description:
This dataset is a comprehensive, journal-level collection of metadata for 47,625 active journals that publish using the Open Journal Systems (OJS) platform. It covers the period from 2020 to 2023 and aggregates information from multiple sources, including the PKP Beacon, ISSN.org, DOI resolution services, and bibliographic indices such as OpenAlex, DOAJ, and Scopus. The dataset captures not only basic journal identifiers and descriptive metadata but also a rich set of indicators that support multifaceted analysis of scholarly publishing practices.

Key characteristics of the dataset include:

Identifiers and Journal Metadata:
- Primary and secondary ISSNs validated against the official registry.
- Journal titles as registered within OJS, with standardization measures applied (e.g., transliteration and phonetic comparisons).
- A consolidated country of publication determined through multiple sources such as ISSN records, DOAJ listings, and IP-address-based geolocation.

Publication Activity:
- Annual record counts for the years 2020 through 2023, along with cumulative document counts.
- Detailed measures of scholarly output per journal that allow evaluation of publication volume, which serves as a proxy for journal activity and editorial engagement.

Indexing and DOI Usage:
- Indicators showing whether a journal is indexed in key bibliographic databases such as OpenAlex, Scopus, and DOAJ.
- Variables indicating whether the journal assigns Digital Object Identifiers (DOIs) through registration agencies (with specific fields for Crossref, DataCite, Medra, JALC, Airiti, etc.).
- Matched counts of DOIs verified against external resolvers, highlighting the reliability and completeness of a journal's metadata.

Economic and Regional Context:
- Data on the country's income group and GDP per capita, which serve as proxies for the resource environment and infrastructural capacity available to each journal.
- The total number of journals using OJS (JUOJS) identified per country, providing a measure of the national landscape of scholarly publishing.

Digital Presence and Repository Characteristics:
- Web visibility metrics provided by Open PageRank scores for both the individual journal's webpage and its hosting repository's endpoint.
- The size of the OJS repository (i.e., the number of journals hosted on the same installation), offering insight into shared infrastructure and editorial scale.

Linguistic and Disciplinary Classification:
- Automated language detection results and aggregated language proportions, highlighting the degree to which journals publish in English versus non-English languages.
- A machine-learning-derived subject classification assigning each journal to a main scholarly discipline, which enables discipline-specific analysis.

Designed for bibliometric and scientometric research, the dataset enables users to explore the relationships between a journal's editorial practices, its digital identifier usage, national and economic contexts, and its likelihood of being indexed in inclusive scholarly databases. The extensive metadata and derived metrics support complex analyses, such as classification modeling to identify determinants of indexing in OpenAlex and factors associated with the adoption of Crossref DOIs. The dataset contributes to exploring trends in global scholarly communication and to assessing structural disparities in the digital dissemination of knowledge.
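As a rough illustration of the kind of analysis described above, the following sketch computes the share of journals indexed in OpenAlex within each country income group. The column names (`issn`, `income_group`, `in_openalex`, `uses_crossref_doi`) and the sample rows are hypothetical placeholders, not the dataset's actual schema or values; the real field names should be checked against the files on Harvard Dataverse.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names and made-up rows, for illustration only.
# They do not come from the real dataset.
SAMPLE_CSV = """issn,income_group,in_openalex,uses_crossref_doi
0000-0001,High income,1,1
0000-0002,High income,0,1
0000-0003,Lower middle income,1,0
0000-0004,Lower middle income,0,0
0000-0005,Lower middle income,0,1
"""


def indexing_rate_by_group(rows, group_col, flag_col):
    """Return {group: share of rows whose flag_col equals '1'}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        group = row[group_col]
        totals[group] += 1
        if row[flag_col] == "1":
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Parse the placeholder CSV into dictionaries keyed by column name.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates = indexing_rate_by_group(rows, "income_group", "in_openalex")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

To run this against the actual data, the `io.StringIO(SAMPLE_CSV)` source would be swapped for an opened file handle, and the two column arguments replaced with whatever names the published files use. A grouped rate like this is only a descriptive first step; the classification modeling mentioned in the description would need a proper statistical model.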

Provider:
Harvard Dataverse
Created:
2025-01-31