ulab-ai/ResearchArcade-arxiv-paragraphs
Source: Hugging Face. Updated: 2025-11-05. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/ulab-ai/ResearchArcade-arxiv-paragraphs
Description:
A structured dataset of paragraphs from academic papers. Each record includes a paragraph ID, the paragraph content, the paper's arXiv ID, the paper section, a section ID, and the paragraph's ID within the paper. The dataset has a single training split with over 5.1 million examples, and its total size is approximately 3.7 GB.
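As a rough illustration of the record layout described above, the following Python sketch models one paragraph record and regroups records by paper. The column names (`id`, `content`, `paper_arxiv_id`, `paper_section`, `section_id`, `paragraph_in_paper_id`) are assumptions inferred from the field descriptions, not confirmed names from the dataset, so check the dataset card before relying on them. The helper `stream_train_split` is likewise hypothetical and assumes the Hugging Face `datasets` library is installed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, TypedDict


class Paragraph(TypedDict):
    # Hypothetical column names, inferred from the description;
    # the real dataset may name or type these fields differently.
    id: int
    content: str
    paper_arxiv_id: str
    paper_section: str
    section_id: int
    paragraph_in_paper_id: int


def group_by_paper(rows: Iterable[Paragraph]) -> Dict[str, List[Paragraph]]:
    """Group paragraph records by arXiv ID, ordered by position in the paper."""
    papers: Dict[str, List[Paragraph]] = defaultdict(list)
    for row in rows:
        papers[row["paper_arxiv_id"]].append(row)
    for paragraphs in papers.values():
        paragraphs.sort(key=lambda r: r["paragraph_in_paper_id"])
    return dict(papers)


def stream_train_split():
    """Stream the train split instead of downloading all ~3.7 GB up front.

    Hypothetical helper: requires the `datasets` package and network access.
    """
    from datasets import load_dataset  # imported lazily so the sketch runs without it

    return load_dataset(
        "ulab-ai/ResearchArcade-arxiv-paragraphs",
        split="train",
        streaming=True,
    )
```

Streaming is suggested here only because of the dataset's size; with enough disk space a regular download works as well. `group_by_paper` accepts any iterable of records, so it can be tried on a handful of rows before being applied to the full split.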
Provider:
ulab-ai



