assafvayner/arxiv-papers-by-subject
Source: Hugging Face · Updated: 2026-04-21 · Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/assafvayner/arxiv-papers-by-subject
Description:
This dataset contains metadata for over 2.5 million arXiv papers, reorganized from the original [nick007x/arxiv-papers](https://huggingface.co/datasets/nick007x/arxiv-papers) dataset into a hierarchical directory structure. The data is partitioned into small, focused parquet files by subject code (e.g., `cs.AI`, `astro-ph.CO`, `math.NA`), year (1989–2025), and month (01–12), so users can download only the subjects and time periods they need rather than the entire dataset. This layout supports efficient downloads by research domain and time range, as well as incremental updates.
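As a rough illustration of selective access, the sketch below builds glob patterns for chosen subjects, years, and months and passes them to `snapshot_download` from `huggingface_hub`. The path layout `{subject}/{year}/{month}/*.parquet` is an assumption made for this example and is not confirmed by the description above; check the repository's file tree and adjust the `layout` argument before relying on it. The helper names `build_patterns` and `download_subset` and the `arxiv_subset` output directory are likewise hypothetical.

```python
from itertools import product


def build_patterns(subjects, years, months=range(1, 13),
                   layout="{subject}/{year}/{month:02d}/*.parquet"):
    """Return one glob pattern per (subject, year, month) combination.

    NOTE: the default `layout` is an assumed directory scheme, not one
    documented by the dataset; override it to match the real file tree.
    """
    return [
        layout.format(subject=s, year=y, month=m)
        for s, y, m in product(subjects, years, months)
    ]


def download_subset(patterns,
                    repo_id="assafvayner/arxiv-papers-by-subject",
                    local_dir="arxiv_subset"):
    """Download only the files matching `patterns` from the dataset repo."""
    # Imported here so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",
        allow_patterns=patterns,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Example: cs.AI papers from the first quarter of 2024 (requires network access).
    wanted = build_patterns(["cs.AI"], [2024], months=[1, 2, 3])
    print(download_subset(wanted))
```

Because only matching files are fetched, re-running the same call later with an additional month or subject should add the new partitions without re-downloading the whole dataset, which is the incremental-update use case the description mentions.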
Provided by:
assafvayner



