A Consensus of In-silico Sequence-based Modeling Techniques for Compound-Viral Protein Activity Prediction for SARS-COV-2

NIAID Data Ecosystem, indexed 2026-03-12
Download link:
https://data.mendeley.com/datasets/8rrwnbcgmx
Description:
This dataset contains the data used to train and test the end-to-end supervised deep learning models, together with the data encoded as vector representations of compounds and proteins and passed to supervised state-of-the-art machine learning models (XGBoost, RF, SVM). It also includes the full list of viral proteins, with their sequences, used for the protein autoencoder, and the list of SMILES representations of compounds used for the compound autoencoder. In addition, it provides: pickle files of data obtained from NCBI assay and of compound-viral protein interactions downloaded through ChEMBL; the compound-viral protein interactions that remain after filtering the NCBI and ChEMBL data; the list of compounds tested against the three main proteases of coronavirus, and the three main proteases of SARS-COV-2 as a FASTA file; and all test files associated with SARS-COV-2 viral proteins, for both the end-to-end deep learning models and the vector-representation-based supervised machine learning models.
Created:
2020-11-03
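
The description above mentions pickle files and a FASTA file of protease sequences. As a minimal sketch of how such files might be read with the Python standard library: the helper names `read_fasta` and `load_pickle`, the record headers, and the sequences below are all hypothetical, since the listing gives neither the file names nor the layout of the pickled objects.

```python
import io
import pickle


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them and join later.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def load_pickle(path):
    """Load one pickled object. Only unpickle files from a source you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Hypothetical two-record input; the sequences are made up for illustration.
example = io.StringIO(">protease_A\nMKT\nAYI\n>protease_B\nGSH\n")
proteases = read_fasta(example)
print(proteases)
```

For a file from the download, the same parser would be called as `read_fasta(open(path))`, where `path` points to wherever the FASTA file was saved. The type of object returned by `load_pickle` depends on how the authors serialized the data, so inspect it with `type(...)` before use.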