katielink/moleculenet-benchmark
Source: Hugging Face. Updated 2023-08-28; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/katielink/moleculenet-benchmark
Description:
---
license: apache-2.0
tags:
- biology
- chemistry
configs:
- config_name: bace
  data_files:
  - split: train
    path: bace/train.csv
  - split: test
    path: bace/test.csv
  - split: val
    path: bace/valid.csv
- config_name: bbbp
  data_files:
  - split: train
    path: bbbp/train.csv
  - split: test
    path: bbbp/test.csv
  - split: val
    path: bbbp/valid.csv
- config_name: clintox
  data_files:
  - split: train
    path: clintox/train.csv
  - split: test
    path: clintox/test.csv
  - split: val
    path: clintox/valid.csv
- config_name: esol
  data_files:
  - split: train
    path: esol/train.csv
  - split: test
    path: esol/test.csv
  - split: val
    path: esol/valid.csv
- config_name: freesolv
  data_files:
  - split: train
    path: freesolv/train.csv
  - split: test
    path: freesolv/test.csv
  - split: val
    path: freesolv/valid.csv
- config_name: hiv
  data_files:
  - split: train
    path: hiv/train.csv
  - split: test
    path: hiv/test.csv
  - split: val
    path: hiv/valid.csv
- config_name: lipo
  data_files:
  - split: train
    path: lipo/train.csv
  - split: test
    path: lipo/test.csv
  - split: val
    path: lipo/valid.csv
- config_name: qm9
  data_files:
  - split: train
    path: qm9/train.csv
  - split: test
    path: qm9/test.csv
  - split: val
    path: qm9/valid.csv
- config_name: sider
  data_files:
  - split: train
    path: sider/train.csv
  - split: test
    path: sider/test.csv
  - split: val
    path: sider/valid.csv
- config_name: tox21
  data_files:
  - split: train
    path: tox21/train.csv
  - split: test
    path: tox21/test.csv
  - split: val
    path: tox21/valid.csv
---
# MoleculeNet Benchmark ([website](https://moleculenet.org/))
MoleculeNet is a benchmark designed for testing machine learning methods on molecular properties. To facilitate the development of molecular machine learning methods, this work curates a number of dataset collections and creates a suite of software that implements many known featurizations and previously proposed algorithms. All methods and datasets are integrated as parts of the open-source DeepChem package (MIT license).
MoleculeNet is built upon multiple public databases. The full collection currently includes over 700,000 compounds tested on a range of different properties. We test the performance of various machine learning models with different featurizations on the datasets (see the website for detailed descriptions), with all results reported as AUC-ROC, AUC-PRC, RMSE, and MAE scores.
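To make the reported metrics concrete, here is a minimal pure-Python sketch of RMSE, MAE, and AUC-ROC. The function names and the toy inputs are illustrative assumptions, not the benchmark's own implementation, and AUC-PRC is omitted for brevity.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for illustration but slow for large datasets.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values, made up for this example (not taken from any MoleculeNet dataset).
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

RMSE and MAE apply to regression targets, while AUC-ROC applies to binary classification labels; which metric suits which config is not stated in this card.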
If you use MoleculeNet, please cite:
Zhenqin Wu, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S. Pappu, Karl Leswing, Vijay Pande. MoleculeNet: A Benchmark for Molecular Machine Learning. arXiv preprint arXiv:1703.00564, 2017.
Provided by:
katielink
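Summary of original information
MoleculeNet Benchmark
Overview
MoleculeNet is a benchmark designed for testing machine learning methods on molecular properties. It draws on multiple public databases and currently includes over 700,000 compounds tested on a range of properties. Models with different featurizations are evaluated on the datasets, and all results are reported as AUC-ROC, AUC-PRC, RMSE, and MAE scores.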
Dataset configurations
- bace
  - Training set: bace/train.csv
  - Test set: bace/test.csv
  - Validation set: bace/valid.csv
- bbbp
  - Training set: bbbp/train.csv
  - Test set: bbbp/test.csv
  - Validation set: bbbp/valid.csv
- clintox
  - Training set: clintox/train.csv
  - Test set: clintox/test.csv
  - Validation set: clintox/valid.csv
- esol
  - Training set: esol/train.csv
  - Test set: esol/test.csv
  - Validation set: esol/valid.csv
- freesolv
  - Training set: freesolv/train.csv
  - Test set: freesolv/test.csv
  - Validation set: freesolv/valid.csv
- hiv
  - Training set: hiv/train.csv
  - Test set: hiv/test.csv
  - Validation set: hiv/valid.csv
- lipo
  - Training set: lipo/train.csv
  - Test set: lipo/test.csv
  - Validation set: lipo/valid.csv
- qm9
  - Training set: qm9/train.csv
  - Test set: qm9/test.csv
  - Validation set: qm9/valid.csv
- sider
  - Training set: sider/train.csv
  - Test set: sider/test.csv
  - Validation set: sider/valid.csv
- tox21
  - Training set: tox21/train.csv
  - Test set: tox21/test.csv
  - Validation set: tox21/valid.csv
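Every configuration follows the same layout, so the split-to-file mapping can be generated rather than typed out. The sketch below is a hypothetical helper (`split_paths` is not part of any library) that rebuilds the paths listed in the card's metadata. The commented lines show how a configuration might be loaded with the Hugging Face `datasets` package, assuming it is installed and the repository is reachable.

```python
# Config names copied from the card's YAML metadata.
CONFIGS = [
    "bace", "bbbp", "clintox", "esol", "freesolv",
    "hiv", "lipo", "qm9", "sider", "tox21",
]

# The validation split is named "val" but its file is "valid.csv".
SPLIT_FILES = {"train": "train.csv", "test": "test.csv", "val": "valid.csv"}


def split_paths(config_name):
    """Return {split name: relative CSV path} for one config (hypothetical helper)."""
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config: {config_name!r}")
    return {split: f"{config_name}/{fname}" for split, fname in SPLIT_FILES.items()}


print(split_paths("bace"))

# Optional: loading one config from the Hub (requires network access).
# from datasets import load_dataset
# ds = load_dataset("katielink/moleculenet-benchmark", "bace")
# validation = ds["val"]
```

Because the split key is `val`, code that expects a `validation` key would need to be adjusted when working with this dataset.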



