bioRxiv
Source: arXiv. Indexed: 2025-09-30
Download link:
https://huggingface.co/datasets/hazylavender/biorxiv-abstract
Resource description:
This dataset contains about 57,000 abstracts scraped from biology papers. Each abstract is treated as one client dataset, stored as a string, and each sample is a 64-token span of an abstract. The dataset is updated every six months and is intended for federated learning tasks.
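To make the client and sample structure concrete, here is a minimal sketch of how one abstract per client could be cut into 64-token samples. It is an illustration under stated assumptions, not the dataset's actual preprocessing: the tokenizer is not specified in the description, so whitespace splitting stands in for it, spans are assumed to be non-overlapping, and a trailing remainder shorter than 64 tokens is dropped. The function names `split_into_spans` and `build_client_datasets` are hypothetical.

```python
from typing import Dict, List


def split_into_spans(abstract: str, span_len: int = 64) -> List[List[str]]:
    """Split one abstract into consecutive spans of `span_len` tokens.

    Assumptions (not confirmed by the dataset description): whitespace
    tokenization, non-overlapping spans, short trailing remainder dropped.
    """
    tokens = abstract.split()
    return [
        tokens[i:i + span_len]
        for i in range(0, len(tokens) - span_len + 1, span_len)
    ]


def build_client_datasets(
    abstracts: List[str], span_len: int = 64
) -> Dict[int, List[List[str]]]:
    """Map a client id (here, the abstract's index) to its list of spans."""
    return {
        client_id: split_into_spans(text, span_len)
        for client_id, text in enumerate(abstracts)
    }


# Toy strings standing in for real abstracts.
toy_abstracts = [
    " ".join(f"w{i}" for i in range(130)),  # 130 tokens: two full spans
    " ".join(f"v{i}" for i in range(64)),   # 64 tokens: one full span
]
clients = build_client_datasets(toy_abstracts)
```

In a federated learning simulation, each entry of `clients` would play the role of one participant's local data. A real pipeline would replace the whitespace split with the tokenizer of the model being trained.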
Provider:
Hugging Face



