qingy2024/arxiv-abstracts-filtered
Hugging Face | Updated 2025-07-01 | Indexed 2025-10-25
Download link:
https://hf-mirror.com/datasets/qingy2024/arxiv-abstracts-filtered
Description:
This is a text dataset. Each record includes a text ID, the text content, the text source, a creation time, an addition time, metadata (license, full-text license, authors, submitter, URL), and a cleaned version of the text. The dataset consists of a single training split with over 1.1 million samples, and its total size is approximately 2.55 GB.
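The record layout described above can be sketched in Python. This is only an illustration: the listing gives field descriptions, not column names, so every key in `SAMPLE_RECORD` (`id`, `text`, `source`, `created`, `added`, `metadata`, `cleaned_text`) and all of its values are hypothetical placeholders. The helper names `flatten_record` and `stream_train_split` are likewise made up for this example. Loading assumes the Hugging Face `datasets` package is installed and uses streaming so the roughly 2.55 GB of data is not downloaded up front.

```python
from typing import Any, Dict

REPO_ID = "qingy2024/arxiv-abstracts-filtered"

# Hypothetical record: key names and values are placeholders guessed from the
# field descriptions in the listing, not the dataset's confirmed column names.
SAMPLE_RECORD: Dict[str, Any] = {
    "id": "example-id",
    "text": "Raw abstract text.",
    "source": "example-source",
    "created": "2020-01-01",
    "added": "2024-01-01",
    "metadata": {
        "license": "example-license",
        "fulltext_license": "example-license",
        "authors": "A. Author",
        "submitter": "A. Author",
        "url": "https://example.org/abs/0000.00000",
    },
    "cleaned_text": "Cleaned abstract text.",
}


def flatten_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Lift nested metadata keys to the top level as 'metadata.<key>'."""
    flat = {key: value for key, value in record.items() if key != "metadata"}
    for key, value in (record.get("metadata") or {}).items():
        flat[f"metadata.{key}"] = value
    return flat


def stream_train_split():
    """Return an iterable over the training split without a full download.

    Requires network access and `pip install datasets`.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset(REPO_ID, split="train", streaming=True)


if __name__ == "__main__":
    for key, value in flatten_record(SAMPLE_RECORD).items():
        print(f"{key}: {value}")
```

Before relying on any of these keys, inspect the first real record (for example `next(iter(stream_train_split()))`) and adjust the names to match the actual columns.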
Provider:
qingy2024



