jablonkagroup/chempile-paper

Source: Hugging Face. Updated 2025-08-13; indexed 2025-07-05.
Download link:
https://hf-mirror.com/datasets/jablonkagroup/chempile-paper
Description:
ChemPile-Paper is a collection of scientific literature on chemistry and related fields. It gathers curated academic papers and preprints from ArXiv, bioRxiv, medRxiv, ChemRxiv, and EuroPMC, and is published as several subsets, each with its own configuration. The text is primarily in English. The dataset is licensed under CC BY-NC-ND 4.0, which permits non-commercial use and redistribution with attribution but does not permit derivative works. Intended applications include language model training, research intelligence, information retrieval, content generation, and domain adaptation in the chemical sciences.
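
Because the dataset is split into subsets with their own configurations, it can help to list those configurations before loading anything. The following is a minimal sketch, not the maintainers' documented workflow. It assumes the Hugging Face `datasets` library is installed, that network access is available, and that a split named `train` exists (an assumption; this listing does not name the splits or the configurations). `pick_config` and `stream_records` are hypothetical helper names introduced for illustration.

```python
from typing import Iterator, Optional

REPO_ID = "jablonkagroup/chempile-paper"  # dataset ID taken from this listing


def pick_config(available: list[str], preferred: Optional[str] = None) -> str:
    """Return `preferred` if it is offered, otherwise the first available config."""
    if not available:
        raise ValueError("the dataset reports no configurations")
    if preferred is not None and preferred in available:
        return preferred
    return available[0]


def stream_records(
    preferred: Optional[str] = None,
    split: str = "train",  # assumption: the split name is not stated in the listing
    limit: int = 3,
) -> Iterator[dict]:
    """Yield up to `limit` records from one subset without a full download."""
    # Imported lazily so pick_config stays usable without the library installed.
    from datasets import get_dataset_config_names, load_dataset

    # Ask the Hub which configurations exist instead of hard-coding a name.
    config = pick_config(get_dataset_config_names(REPO_ID), preferred)
    dataset = load_dataset(REPO_ID, config, split=split, streaming=True)
    for index, record in enumerate(dataset):
        if index >= limit:
            break
        yield record
```

Calling `for record in stream_records(): print(sorted(record))` would print the field names of the first few records, which is a cheap way to inspect a subset's schema. Streaming avoids downloading a whole subset just to look at it. Any use of the data remains subject to the CC BY-NC-ND 4.0 terms described above.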
Provider:
jablonkagroup