Dataset - Papyrus - A large scale curated dataset aimed at bioactivity predictions
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/7373213
Resource description:
Fixed version of additional_files:
- In the previous version of 05.6_additional_files the data type of some descriptors was assigned incorrectly
- In this fixed version data types are correct
This repository contains version 05.6 of the Papyrus dataset, an aggregated dataset of small molecule bioactivities, as described in the article "Papyrus - A large scale curated dataset aimed at bioactivity predictions" doi.org/10.1186/s13321-022-00672-x.
Changes compared to version 05.5:
- Applied a small-molecule filter that removes compounds with a molecular weight (MW) below 200 or above 800, compounds containing heavy metals, and mixtures
- Added a TID column containing information on the original protein identifier
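The molecular-weight window and the mixture exclusion described above can be sketched as follows. This is a minimal illustration, not the Papyrus pipeline itself: the field names (`compound_id`, `mw`, `smiles`), the records, and their values are invented, and treating a `.` in a SMILES string as a sign of a mixture is an assumed, crude proxy. The heavy-metal check is left out because the description does not say which elements are excluded.

```python
# Bounds taken from the release notes: compounds with MW < 200 or > 800 are removed,
# so values exactly at 200 or 800 are kept.
MW_MIN = 200.0
MW_MAX = 800.0

# Hypothetical records. The SMILES strings are placeholders and the MW values
# are invented for illustration; they do not describe real Papyrus entries.
records = [
    {"compound_id": "cmpd_A", "mw": 150.0, "smiles": "C"},      # below the window
    {"compound_id": "cmpd_B", "mw": 200.0, "smiles": "C"},      # on the lower bound
    {"compound_id": "cmpd_C", "mw": 800.5, "smiles": "C"},      # above the window
    {"compound_id": "cmpd_D", "mw": 350.0, "smiles": "C.Cl"},   # two components
    {"compound_id": "cmpd_E", "mw": 450.2, "smiles": "C"},      # inside the window
]


def passes_small_molecule_filter(record):
    """Return True if the record is inside the MW window and has one component."""
    mw_ok = MW_MIN <= record["mw"] <= MW_MAX
    # Assumption: a '.' separates disconnected components in SMILES,
    # which is used here as a rough mixture indicator.
    single_component = "." not in record["smiles"]
    return mw_ok and single_component


kept = [r["compound_id"] for r in records if passes_small_molecule_filter(r)]
print(kept)
```

Only `cmpd_B` and `cmpd_E` pass in this toy data. A real workflow would more likely compute the weight and inspect the elements with a cheminformatics toolkit than rely on precomputed fields.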
Created:
2024-09-20



