emmah7/AI-arXiv
Source: Hugging Face. Updated 2026-03-31; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/emmah7/AI-arXiv
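
As a minimal sketch of how one yearly split might be fetched, the helper below builds a split name such as `2015_2016` and streams it with the `datasets` library. The names `split_name`, `load_split`, and `use_mirror` are hypothetical, and routing through the mirror by setting `HF_ENDPOINT` is an assumption about how hf-mirror.com is normally used, not something this page states.

```python
import os

REPO_ID = "emmah7/AI-arXiv"
FIRST_YEAR, LAST_YEAR = 2015, 2025  # start years of the first and last splits


def split_name(start_year: int) -> str:
    """Return the split name for a start year, e.g. 2015 -> '2015_2016'."""
    if not FIRST_YEAR <= start_year <= LAST_YEAR:
        raise ValueError(f"no split starts in {start_year}")
    return f"{start_year}_{start_year + 1}"


def load_split(start_year: int, use_mirror: bool = False):
    """Stream one yearly split (hypothetical helper; needs the `datasets` package)."""
    if use_mirror:
        # Assumption: the mirror is selected through HF_ENDPOINT, which has to
        # be set before the Hugging Face libraries are imported.
        os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
    # Imported lazily so split_name() stays usable without the dependency.
    from datasets import load_dataset

    # streaming=True iterates over records without downloading the full split.
    return load_dataset(REPO_ID, split=split_name(start_year), streaming=True)
```

Calling `load_split(2024)` should return an iterable over the `2024_2025` records. Streaming is used here because the full download is roughly 560 MB.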
Resource description:
---
dataset_info:
  features:
  - name: corpus_id
    dtype: string
  - name: arxiv_id
    dtype: string
  - name: date
    dtype: string
  - name: title
    dtype: string
  - name: abstract
    dtype: string
  - name: categories
    list: string
  - name: roles
    list: string
  - name: key_references
    list:
    - name: corpus_id
      dtype: string
    - name: num_citations
      dtype: float64
  - name: authors
    list:
    - name: author_id
      dtype: string
    - name: h_index
      dtype: float64
    - name: name
      dtype: string
    - name: num_citations
      dtype: float64
    - name: num_papers
      dtype: float64
    - name: publication_history
      list: string
  - name: citation_trajectory
    list: int64
  - name: specter2_embed
    list: float64
  - name: primary_cluster
    dtype: float64
  - name: soft_membership
    list: float64
  - name: novelty
    dtype: float64
  - name: citation_percentile
    dtype: float64
  splits:
  - name: '2015_2016'
    num_bytes: 18432868
    num_examples: 8384
  - name: '2016_2017'
    num_bytes: 27934504
    num_examples: 13082
  - name: '2017_2018'
    num_bytes: 41077893
    num_examples: 19929
  - name: '2018_2019'
    num_bytes: 60834015
    num_examples: 30380
  - name: '2019_2020'
    num_bytes: 78424275
    num_examples: 40382
  - name: '2020_2021'
    num_bytes: 91059981
    num_examples: 48642
  - name: '2021_2022'
    num_bytes: 93072065
    num_examples: 51389
  - name: '2022_2023'
    num_bytes: 109511855
    num_examples: 63165
  - name: '2023_2024'
    num_bytes: 138019982
    num_examples: 82616
  - name: '2024_2025'
    num_bytes: 167148834
    num_examples: 103955
  - name: '2025_2026'
    num_bytes: 76283920
    num_examples: 48596
  download_size: 559696172
  dataset_size: 901800192
configs:
- config_name: default
  data_files:
  - split: '2015_2016'
    path: data/2015_2016-*
  - split: '2016_2017'
    path: data/2016_2017-*
  - split: '2017_2018'
    path: data/2017_2018-*
  - split: '2018_2019'
    path: data/2018_2019-*
  - split: '2019_2020'
    path: data/2019_2020-*
  - split: '2020_2021'
    path: data/2020_2021-*
  - split: '2021_2022'
    path: data/2021_2022-*
  - split: '2022_2023'
    path: data/2022_2023-*
  - split: '2023_2024'
    path: data/2023_2024-*
  - split: '2024_2025'
    path: data/2024_2025-*
  - split: '2025_2026'
    path: data/2025_2026-*
---
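
The split sizes in the metadata above can be cross-checked with a little arithmetic. The sketch below copies the `num_bytes` and `num_examples` values from the listing as written and compares their sums with `dataset_size`; the variable names are illustrative only.

```python
# (split name, num_bytes, num_examples), copied from the `splits` metadata.
SPLITS = [
    ("2015_2016", 18432868, 8384),
    ("2016_2017", 27934504, 13082),
    ("2017_2018", 41077893, 19929),
    ("2018_2019", 60834015, 30380),
    ("2019_2020", 78424275, 40382),
    ("2020_2021", 91059981, 48642),
    ("2021_2022", 93072065, 51389),
    ("2022_2023", 109511855, 63165),
    ("2023_2024", 138019982, 82616),
    ("2024_2025", 167148834, 103955),
    ("2025_2026", 76283920, 48596),
]
DOWNLOAD_SIZE = 559696172
DATASET_SIZE = 901800192

total_bytes = sum(num_bytes for _, num_bytes, _ in SPLITS)
total_examples = sum(num_examples for _, _, num_examples in SPLITS)
largest_split = max(SPLITS, key=lambda s: s[2])[0]
# Ratio of the compressed download to the in-memory dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(total_bytes == DATASET_SIZE)  # True
print(total_examples)               # 510520
print(largest_split)                # 2024_2025
print(round(download_ratio, 3))     # 0.621
```

The per-split byte counts sum exactly to `dataset_size`, and the eleven splits together hold 510,520 records. The last split, `2025_2026`, is smaller than the two before it.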
Provider:
emmah7



