
laion/medrXiv-pdf

Source: Hugging Face. Updated: 2024-10-17. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/laion/medrXiv-pdf
Description:
The MedrXiv Pdf dataset is intended to provide access to all PDFs published up to September 15, 2024, to support artificial intelligence research and the training of domain-specific scientific models. It compiles knowledge from the scientific domain; most papers carry non-restrictive, open-access licenses, though some PDFs may have additional restrictions. Of the 72,282 PDFs in the collection, 57,646 are currently available, totaling 82 GB. Each PDF's filename is its preprint DOI. The rest were invalid or missing during the download process; the maintainers plan to resolve this and upload the remaining PDFs in the coming days.
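As a quick sanity check on the figures quoted in the description, the coverage and a rough per-file size can be worked out directly. This is only an illustrative sketch: it assumes that the 82 GB total refers to the 57,646 available PDFs and that "GB" means decimal gigabytes, neither of which the description states explicitly.

```python
# Figures quoted in the dataset description.
TOTAL_PDFS = 72_282
AVAILABLE_PDFS = 57_646
TOTAL_SIZE_GB = 82  # assumption: decimal GB, covering only the available PDFs

# PDFs that were invalid or missing during the download process.
missing_pdfs = TOTAL_PDFS - AVAILABLE_PDFS

# Share of the collection that is currently available.
available_fraction = AVAILABLE_PDFS / TOTAL_PDFS

# Rough average size of one available PDF, in megabytes.
avg_size_mb = TOTAL_SIZE_GB * 1000 / AVAILABLE_PDFS

print(f"Missing PDFs: {missing_pdfs}")
print(f"Available: {available_fraction:.1%}")
print(f"Average size per available PDF: {avg_size_mb:.2f} MB")
```

Under those assumptions, roughly four in five PDFs are present and a typical file is on the order of a megabyte or two, which is useful when budgeting disk space for a full download.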
Provider:
laion