False Authorship: Methods and materials package

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://zenodo.org/record/13745075


Description:

This package contains Python, shell, awk scripts, and data used to obtain the curated table and excerpt associated with the above named article.

Data Contents

The following data files are included.

* README.md: This file
* article-details.xlsx: Curated table with details of published articles in Microsoft Excel file format
* index.html: HTML document with
  * links to GIJIR materials saved in the Internet Archive
  * a list of all the GIJIR articles’ citation data according to Crossref and links to each article’s locally available landing page, full-text PDF, plus links to Crossref metadata and the article via DOI and original journal URL. (Note that non-local, non-archived links may rot over time.)
* ybs-works.json: Results of Crossref query to obtain all the publisher’s works made on 2024-09-22
* ChatGPT: Prompts and responses associated with the generation of a fake article in one of the journal’s topics.
* global-us/metadata/: Article metadata as HTML files collected on 2024-09-10
* global-us/global-us.mellbaou.com/index.php/global/article/download/: A copy of the journal’s article PDFs as crawled on 2024-09-10
* spinellis business - Google Scholar.pdf: Printout of a Google Search query for the terms spinellis business made on 2025-02-06.

Executable Contents

The following programs and scripts are used to obtain the above contents.

* Makefile: Commands that orchestrate the articles’ analysis
* get-metadata.sh: Obtain article metadata pages from the journal’s web site
* apply-to-pdfs.sh: Apply the specified Python script to all article PDFs
* extract-citations-emails.py: Extract number of probable in-text citations and corresponding author email from article PDF
* extract-doi-affiliations.py: Extract article DOI and affiliations from an article’s metadata
* extract-all-doi-affiliations.sh: Extract article DOI and affiliations from all articles’ metadata
* emails-to-csv.awk: Convert emails and article numbers to CSV with URL for sending emails
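As a rough illustration of the kind of processing that extract-citations-emails.py and emails-to-csv.awk are described as performing, the following Python sketch counts probable in-text citations, finds an email address, and writes a CSV row with a mailto: URL. It is not the package's actual code. The function names (`summarize_text`, `to_csv`), the regular expressions, the CSV column names, and the use of mailto: as the "URL for sending emails" are all assumptions made for illustration, and the sketch works on text that has already been extracted from a PDF.

```python
import csv
import io
import re
from urllib.parse import quote

# Hypothetical patterns; the package's own scripts may use different rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Author-year citations such as "(Smith, 2020)" or "(Lee et al., 2019)".
AUTHOR_YEAR_RE = re.compile(r"\([^()]*\b(?:19|20)\d{2}[a-z]?\)")
# Numeric citations such as "[3]" or "[3, 7-9]".
NUMERIC_RE = re.compile(r"\[\d+(?:\s*[,\-]\s*\d+)*\]")


def summarize_text(text):
    """Return (probable in-text citation count, first email found or None)."""
    citations = len(AUTHOR_YEAR_RE.findall(text)) + len(NUMERIC_RE.findall(text))
    match = EMAIL_RE.search(text)
    return citations, match.group(0) if match else None


def to_csv(rows):
    """Render (article_number, email) pairs as CSV with a mailto: URL column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["article", "email", "mailto_url"])
    for article, email in rows:
        # Assumption: the "URL for sending emails" is a plain mailto: link.
        writer.writerow([article, email, "mailto:" + quote(email, safe="@")])
    return out.getvalue()


if __name__ == "__main__":
    sample = (
        "Prior work (Smith, 2020) and later studies [3, 7-9] disagree. "
        "Corresponding author: jane.doe@example.org"
    )
    count, email = summarize_text(sample)
    print(count, email)
    print(to_csv([(42, email)]), end="")
```

The counts are only "probable" because a parenthetical such as "(in 2020)" would also match the author-year pattern; a real analysis would likely need stricter rules or manual checks.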


Created:

2025-03-18
