KaiserML/ArxivTrainTest
Source: Hugging Face | Updated: 2025-05-13 | Indexed: 2025-05-31
Download link:
https://hf-mirror.com/datasets/KaiserML/ArxivTrainTest
Description:
The dataset contains paper metadata (submitter, authors, title, and so on) together with text fields such as comments and abstract. It is split into a training set (train) with 1,833,645 examples and a test set (test) with 458,412 examples. The README does not state the dataset's origin or intended use.
Provider:
KaiserML
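
Example:
The split sizes given in the description can be checked with a little arithmetic, and the two splits can be fetched programmatically. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` library is installed and that the repository id `KaiserML/ArxivTrainTest` (taken from the download link) is reachable from your network. The helper names `split_fractions` and `load_splits` are hypothetical and introduced only for illustration.

```python
from typing import Tuple

# Example counts as stated in the description above.
TRAIN_EXAMPLES = 1_833_645
TEST_EXAMPLES = 458_412


def split_fractions(train: int, test: int) -> Tuple[float, float]:
    """Return the share of all examples that falls in each split."""
    total = train + test
    return train / total, test / total


def load_splits(repo_id: str = "KaiserML/ArxivTrainTest"):
    """Download the dataset and return its splits.

    Assumption: the `datasets` package is installed and the repository
    is accessible. The import is kept local so the arithmetic above
    works without the package.
    """
    from datasets import load_dataset

    return load_dataset(repo_id)


if __name__ == "__main__":
    train_frac, test_frac = split_fractions(TRAIN_EXAMPLES, TEST_EXAMPLES)
    print(f"train: {train_frac:.4f}, test: {test_frac:.4f}")
```

With the stated counts, the split works out to roughly 80% train and 20% test. If `load_splits()` succeeds, the returned object should expose `train` and `test` splits whose records carry the fields named in the description (submitter, authors, title, comments, abstract); any further columns are not documented in this listing.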



