Dataset supporting the article 'Transformer-decoder GPT models for generating virtual screening libraries of HMGCR inhibitors: effects of temperature, prompt-length and transfer-learning strategies'

Source: DataCite Commons. Updated 2024-07-24. Indexed 2025-04-17.
Download link:
https://researchdata.reading.ac.uk/id/eprint/1340
Description:
Raw data for virtual screening libraries generated by a generative pre-trained transformer-decoder model. Models were pre-trained on a general drug database from ZINC15 and fine-tuned on inhibitors of HMGCR from ChEMBL. Libraries were generated with different transfer-learning strategies, prompt lengths and temperatures. The resulting libraries were evaluated using IC50 values predicted by a deep neural network trained on experimental HMGCR IC50 values, docking scores from AutoDock Vina, quantitative estimate of drug-likeness (QED), Tanimoto similarity to known statin drugs, and other properties. This dataset contains tables of properties as well as CSV files with the generated libraries, a Tkinter-based GUI for interacting with the library, and docking poses for selected molecules.
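
Two of the quantities named in the description, sampling temperature and Tanimoto similarity, can be illustrated with a short sketch. This is not the code shipped with the dataset: the function names, the toy logits and the representation of fingerprints as sets of on-bit indices are all assumptions made for illustration, and a real workflow would more likely rely on a cheminformatics toolkit for fingerprints.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by a temperature.

    Hypothetical helper: lower temperatures sharpen the distribution,
    higher temperatures flatten it (more diverse generated tokens).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Defined as |A and B| / |A or B|; returns 0.0 when both sets are empty
    (an assumption, since the ratio is undefined in that case).
    """
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three tokens
    rng = random.Random(0)
    for t in (0.5, 1.0, 2.0):
        print(t, softmax_with_temperature(toy_logits, t), sample_token(toy_logits, t, rng))
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

In this sketch the probability of the top-scoring token falls as the temperature rises, which is the mechanism by which temperature trades fidelity to the training distribution for diversity in a generated library.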
Provider:
University of Reading
Created:
2024-07-23