
Publication dates for ArXiv publication versions

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/11088186
Description:
Lookup tables in plain JSON, mapping ArXiv publication version identifiers to their respective publication dates. The JSON files are archived in arxiv-publication-dates-by-identifier-prefix.tar.gz.

The archive contains files named after the date prefix of the ArXiv publication version identifiers they contain. For example, the file 1908.json contains the data for identifiers 1908.12345v1, 1908.12345v2, 1908.23456v1, etc. Publication dates are given in the format YYYY-MM-DD.

Reproducibility

The Snakemake workflow that produced this dataset has been archived and is available in arxiv-publication-dates-workflow.tar.gz.

Changes in version 1.2

Version 1.2 includes a JSON file, file_names.json, that contains a JSON array with all file names included in the dataset.

Changes in version 1.1

For version 1.1, the dataset was extended manually to include a single missing date for arXiv:0906.3421v3: 2010-02-02. As of 2024-05-13, the date for this version had not been provided in the arXivRaw OAI-PMH data (http://export.arxiv.org/oai2?verb=GetRecord&identifier=oai:arXiv.org:0906.3421&metadataPrefix=arXivRaw).

Running the workflow

To reproduce the dataset on a Linux machine, you need a version of the conda package manager installed on your system. Run the following:

    # Extract the archived workflow
    tar -xf my-workflow.tar.gz
    # Create conda environment from lock file
    conda env create -n arxiv-metadata --file conda-environment.lock.yaml
    # Activate the environment
    conda activate arxiv-metadata
    # Optionally, dry-run the workflow
    snakemake -n
    # Produce the output files
    snakemake --keep-storage-local-copies --software-deployment-method conda -c

Then add a new key 0906.3421v3 with value 2010-02-02 to the file 0906.json (included in the tar.gz output).

Workflow

To adapt or change the workflow, clone it from https://github.com/sdruskat/arxiv-publication-metadata. The workflow version used to produce this dataset is available at https://doi.org/10.5281/zenodo.11507183.
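
Example: looking up a date

The following is a minimal Python sketch of how the lookup tables described above might be queried, and how the manual addition for 0906.3421v3 could be applied after a fresh workflow run. It rests on assumptions the description does not confirm: that each JSON file is a flat object mapping version identifiers to date strings, and that the archive has been extracted into a single directory (data_dir). The function names are hypothetical, and only new-style identifiers of the form YYMM.number followed by a version suffix are handled.

```python
import json
import re
from pathlib import Path
from typing import Optional

# New-style arXiv version identifiers, e.g. 1908.12345v1 or 0906.3421v3.
# Older identifier schemes are not handled in this sketch.
_NEW_STYLE = re.compile(r"^(\d{4})\.\d{4,5}v\d+$")


def file_name_for(identifier: str) -> str:
    """Return the JSON file name (date prefix + .json) for an identifier."""
    match = _NEW_STYLE.match(identifier)
    if match is None:
        raise ValueError(f"unsupported identifier format: {identifier!r}")
    return f"{match.group(1)}.json"


def lookup_date(data_dir: Path, identifier: str) -> Optional[str]:
    """Look up the publication date (YYYY-MM-DD) of a version identifier.

    Assumes each file is a flat JSON object {identifier: date}.
    Returns None if the identifier is not in the table.
    """
    path = Path(data_dir) / file_name_for(identifier)
    with path.open(encoding="utf-8") as handle:
        table = json.load(handle)
    return table.get(identifier)


def add_missing_date(data_dir: Path, identifier: str, date: str) -> None:
    """Add or overwrite one identifier/date pair in the matching file.

    The file is rewritten with json.dump defaults, so its original
    whitespace and layout are not preserved.
    """
    path = Path(data_dir) / file_name_for(identifier)
    with path.open(encoding="utf-8") as handle:
        table = json.load(handle)
    table[identifier] = date
    with path.open("w", encoding="utf-8") as handle:
        json.dump(table, handle)
```

Under these assumptions, add_missing_date(data_dir, "0906.3421v3", "2010-02-02") would reproduce the manual step from version 1.1, after which lookup_date(data_dir, "0906.3421v3") would return that date.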
Created:
2024-06-06