emmah7/AI-arXiv-enriched
Source: Hugging Face. Updated 2026-04-16; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/emmah7/AI-arXiv-enriched
Description:
---
dataset_info:
  features:
  - name: corpus_id
    dtype: string
  - name: arxiv_id
    dtype: string
  - name: date
    dtype: string
  - name: title
    dtype: string
  - name: abstract
    dtype: string
  - name: categories
    list: string
  - name: roles
    list: string
  - name: key_references
    list:
    - name: corpus_id
      dtype: string
    - name: num_citations
      dtype: float64
  - name: authors
    list:
    - name: author_id
      dtype: string
    - name: h_index
      dtype: float64
    - name: name
      dtype: string
    - name: num_citations
      dtype: float64
    - name: num_papers
      dtype: float64
    - name: publication_history
      list: string
  - name: citation_trajectory
    list: int64
  - name: specter2_embed
    list: float64
  - name: primary_cluster
    dtype: float64
  - name: soft_membership
    list: float64
  - name: novelty
    dtype: float64
  - name: citation_percentile
    dtype: float64
  - name: authors_openalex
    list:
    - name: name
      dtype: string
    - name: openalex_id
      dtype: string
    - name: orcid
      dtype: string
    - name: position
      dtype: string
  - name: citation_count
    dtype: int64
  splits:
  - name: '2015_2016'
    num_bytes: 20602680
    num_examples: 8384
  - name: '2016_2017'
    num_bytes: 31377512
    num_examples: 13082
  - name: '2017_2018'
    num_bytes: 46405775
    num_examples: 19929
  - name: '2018_2019'
    num_bytes: 69160596
    num_examples: 30380
  - name: '2019_2020'
    num_bytes: 89738889
    num_examples: 40382
  - name: '2020_2021'
    num_bytes: 106358157
    num_examples: 48642
  - name: '2021_2022'
    num_bytes: 113164742
    num_examples: 51389
  - name: '2022_2023'
    num_bytes: 136451126
    num_examples: 63165
  - name: '2023_2024'
    num_bytes: 174532941
    num_examples: 82616
  - name: '2024_2025'
    num_bytes: 206865223
    num_examples: 103955
  - name: '2025_2026'
    num_bytes: 94666191
    num_examples: 48596
  download_size: 740844751
  dataset_size: 1089323832
configs:
- config_name: default
  data_files:
  - split: '2015_2016'
    path: data/2015_2016-*
  - split: '2016_2017'
    path: data/2016_2017-*
  - split: '2017_2018'
    path: data/2017_2018-*
  - split: '2018_2019'
    path: data/2018_2019-*
  - split: '2019_2020'
    path: data/2019_2020-*
  - split: '2020_2021'
    path: data/2020_2021-*
  - split: '2021_2022'
    path: data/2021_2022-*
  - split: '2022_2023'
    path: data/2022_2023-*
  - split: '2023_2024'
    path: data/2023_2024-*
  - split: '2024_2025'
    path: data/2024_2025-*
  - split: '2025_2026'
    path: data/2025_2026-*
---
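As a quick sanity check on the split metadata, the per-split figures can be totalled and compared with the reported `dataset_size`. The sketch below copies the numbers from the `splits` entries by hand; the names `SPLITS` and `DATASET_SIZE` are illustrative and not part of the dataset.

```python
# Split sizes copied from the `splits` metadata: name -> (num_bytes, num_examples).
SPLITS = {
    "2015_2016": (20602680, 8384),
    "2016_2017": (31377512, 13082),
    "2017_2018": (46405775, 19929),
    "2018_2019": (69160596, 30380),
    "2019_2020": (89738889, 40382),
    "2020_2021": (106358157, 48642),
    "2021_2022": (113164742, 51389),
    "2022_2023": (136451126, 63165),
    "2023_2024": (174532941, 82616),
    "2024_2025": (206865223, 103955),
    "2025_2026": (94666191, 48596),
}

# `dataset_size` as reported in the metadata.
DATASET_SIZE = 1089323832

# Sum the byte counts and example counts across all splits.
total_bytes = sum(num_bytes for num_bytes, _ in SPLITS.values())
total_examples = sum(num_examples for _, num_examples in SPLITS.values())

# Split with the most examples.
largest_split = max(SPLITS, key=lambda name: SPLITS[name][1])

print(total_bytes == DATASET_SIZE)  # True: the split sizes add up to dataset_size
print(total_examples)               # 510520
print(largest_split)                # 2024_2025
```

The per-split byte counts sum exactly to `dataset_size`, and the eleven splits together hold 510,520 examples.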
Provider:
emmah7
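
A minimal sketch of loading one yearly split with the Hugging Face `datasets` library, assuming the repository is reachable under the ID `emmah7/AI-arXiv-enriched` shown in the title. The helpers `split_name` and `load_split` are hypothetical names introduced here for illustration; only `load_dataset` comes from the library.

```python
REPO_ID = "emmah7/AI-arXiv-enriched"  # repository ID taken from the page title

FIRST_YEAR = 2015  # starting year of the earliest split listed in the metadata
LAST_YEAR = 2025   # starting year of the latest split listed in the metadata


def split_name(start_year: int) -> str:
    """Build a split name such as '2015_2016' from its starting year."""
    if not FIRST_YEAR <= start_year <= LAST_YEAR:
        raise ValueError(
            f"no split starts in {start_year}; expected {FIRST_YEAR}..{LAST_YEAR}"
        )
    return f"{start_year}_{start_year + 1}"


def load_split(start_year: int, streaming: bool = True):
    """Load one split; streaming avoids downloading the whole split up front."""
    # Imported lazily so split_name() works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split_name(start_year), streaming=streaming)
```

With network access, `next(iter(load_split(2015)))` should return a single record as a dict whose keys follow the feature list above, such as `title`, `abstract`, and `authors`. To fetch through the mirror in the download link rather than the main hub, setting the `HF_ENDPOINT` environment variable to `https://hf-mirror.com` before importing `datasets` is the usual approach, though that has not been verified against this dataset.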



