Combined Retracted Paper IDs from Retraction Watch Data and OpenAlex (December 31, 2024 Version)
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14921711
Description:
What is this data?
This dataset was constructed for the analysis of retracted papers by combining "Retraction Watch Data" with OpenAlex data.
What is the format of this data?
The schema is simple, with two fields: the OpenAlex works_id and the DOI. Other information, such as a paper's title, authors, and retraction reasons, can be retrieved with these IDs from sources like OpenAlex, Crossref, and "Retraction Watch Data."
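As a rough illustration of how the two fields can be used to fetch further metadata, the sketch below builds lookup URLs for the public OpenAlex and Crossref REST APIs. The record values are made-up placeholders, the helper names `normalize_doi` and `lookup_urls` are hypothetical, and the exact form in which this dataset stores each field (bare ID or full URL) is an assumption, so the code accepts both.

```python
from urllib.parse import quote


def normalize_doi(doi: str) -> str:
    """Return a bare, lowercase DOI without a resolver prefix."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def lookup_urls(works_id: str, doi: str) -> dict:
    """Build metadata lookup URLs for one record (no network access here)."""
    # Accept either a bare ID ("W...") or a full OpenAlex URL.
    short_id = works_id.rstrip("/").rsplit("/", 1)[-1]
    bare = normalize_doi(doi)
    return {
        "openalex": f"https://api.openalex.org/works/{short_id}",
        "crossref": f"https://api.crossref.org/works/{quote(bare, safe='/')}",
    }


# Placeholder record, not a real paper.
record = {
    "works_id": "https://openalex.org/W0000000000",
    "doi": "https://doi.org/10.1234/Example.5678",
}
urls = lookup_urls(record["works_id"], record["doi"])
print(urls["openalex"])
print(urls["crossref"])
```

Retraction reasons are not served by either API; they would come from the "Retraction Watch Data" file itself, matched on the same normalized DOI.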
What is the source data?
"Retraction Watch Data" is the December 31, 2024 version, acquired on January 1, 2025 from Crossref’s GitLab repository (https://gitlab.com/crossref/retraction-watch-data). The OpenAlex data is a snapshot obtained from the official S3 repository, also dated December 31, 2024.
Why is this necessary?
OpenAlex already incorporates "Retraction Watch Data", so one might assume that analyzing the retraction flags in its works objects would suffice. However, as pointed out in https://doi.org/10.48550/arXiv.2403.13339, these flags contain many errors. Moreover, retraction reasons are available only in "Retraction Watch Data." Merging the two datasets independently therefore improves both the accuracy and the scope of the analysis.
How were they combined?
In merging "Retraction Watch Data" with OpenAlex, the paper’s DOI (Digital Object Identifier) is used as the primary key.
Note that "Retraction Watch Data" contains two types of DOIs:
RetractionDOI: The DOI of the retraction notice (i.e., the article announcing the retraction).
OriginalPaperDOI: The DOI of the retracted paper (i.e., the regular paper that is being retracted).
In some records the two DOIs are identical, and in others only one of them is available. The merge uses the OriginalPaperDOI. There are 50,831 unique OriginalPaperDOIs, which is the upper bound on the number of records that can be merged. For comparison, there are 51,050 unique RetractionDOIs; excluding those identical to the OriginalPaperDOI leaves 38,314, and restricting further to records that have no OriginalPaperDOI leaves 2,605 unique RetractionDOIs.
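The four counts above can be reproduced along the following lines. This is only a sketch on toy rows: the column names RetractionDOI and OriginalPaperDOI come from the text, but the treatment of missing values (empty string or the placeholder "unavailable"), the lowercasing, and the exact rule for "identical" DOIs are assumptions and may differ from how the published figures were computed.

```python
def clean(doi):
    """Normalize a DOI cell; return None when it is empty or a placeholder."""
    doi = (doi or "").strip().lower()
    # Assumption: missing DOIs appear as "" or "unavailable".
    return None if doi in ("", "unavailable") else doi


def doi_counts(rows):
    """Count unique DOIs under the four definitions described in the text."""
    original = {clean(r["OriginalPaperDOI"]) for r in rows} - {None}
    retraction = {clean(r["RetractionDOI"]) for r in rows} - {None}
    # Skip records whose notice shares the DOI of the retracted paper.
    distinct = {
        clean(r["RetractionDOI"])
        for r in rows
        if clean(r["RetractionDOI"]) != clean(r["OriginalPaperDOI"])
    } - {None}
    # Keep only records that have no OriginalPaperDOI at all.
    retraction_only = {
        clean(r["RetractionDOI"])
        for r in rows
        if clean(r["OriginalPaperDOI"]) is None
    } - {None}
    return {
        "unique_original": len(original),
        "unique_retraction": len(retraction),
        "retraction_not_identical": len(distinct),
        "retraction_without_original": len(retraction_only),
    }


# Toy rows with made-up DOIs, one per case described above.
rows = [
    {"RetractionDOI": "10.1/n1", "OriginalPaperDOI": "10.1/p1"},
    {"RetractionDOI": "10.1/p2", "OriginalPaperDOI": "10.1/p2"},      # identical
    {"RetractionDOI": "10.1/n3", "OriginalPaperDOI": ""},             # notice only
    {"RetractionDOI": "unavailable", "OriginalPaperDOI": "10.1/p4"},  # paper only
    {"RetractionDOI": "10.1/N1", "OriginalPaperDOI": "10.1/p5"},      # shared notice
]
counts = doi_counts(rows)
print(counts)
```

With the real file, `rows` could come from `csv.DictReader` over the Retraction Watch CSV.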
Some papers with DOIs are not included in OpenAlex, so 50,073 records were linked. Several of these linked retracted papers were published in the 1960s. The analysis was therefore limited to the 10-year period from 2013 to 2022, which leaves 32,368 papers in the study.
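A minimal sketch of the linking and year filter, under stated assumptions: OpenAlex works are represented as dictionaries with the `id`, `doi`, and `publication_year` fields of the OpenAlex Work object, OpenAlex DOIs carry the `https://doi.org/` prefix while Retraction Watch DOIs are bare, and the 2013 to 2022 window is applied to the OpenAlex publication year. The function names and toy records are hypothetical, and the original pipeline may differ in its details.

```python
def bare_doi(doi):
    """Lowercase a DOI and strip the resolver prefix; None if empty."""
    doi = (doi or "").strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def link_records(rw_rows, openalex_works, first_year=2013, last_year=2022):
    """Join Retraction Watch rows to OpenAlex works on OriginalPaperDOI."""
    # Index OpenAlex works by bare DOI; keep the first work seen per DOI.
    by_doi = {}
    for work in openalex_works:
        key = bare_doi(work.get("doi"))
        if key:
            by_doi.setdefault(key, work)

    linked = {}
    for row in rw_rows:
        key = bare_doi(row.get("OriginalPaperDOI"))
        work = by_doi.get(key) if key else None
        if work is None:
            continue  # no DOI, or DOI not present in OpenAlex
        year = work.get("publication_year")
        if year is not None and first_year <= year <= last_year:
            # Keyed by DOI, so repeated rows collapse to one record.
            linked[key] = {"works_id": work["id"], "doi": key}
    return list(linked.values())


# Toy inputs with made-up identifiers.
rw_rows = [
    {"OriginalPaperDOI": "10.1234/A"},  # linked, inside the window
    {"OriginalPaperDOI": "10.1234/b"},  # linked, but published in 1965
    {"OriginalPaperDOI": "10.1234/c"},  # not present in OpenAlex
    {"OriginalPaperDOI": ""},           # no DOI to merge on
    {"OriginalPaperDOI": "10.1234/a"},  # duplicate of the first row
]
openalex_works = [
    {"id": "https://openalex.org/W1", "doi": "https://doi.org/10.1234/a", "publication_year": 2015},
    {"id": "https://openalex.org/W2", "doi": "https://doi.org/10.1234/b", "publication_year": 1965},
    {"id": "https://openalex.org/W3", "doi": None, "publication_year": 2020},
]
result = link_records(rw_rows, openalex_works)
print(result)
```

The output rows mirror the two-field schema of the dataset (works_id and DOI); whether the published file stores the DOI in bare or URL form is not stated in the description.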
Created:
2025-02-25



