
datasets for QKAR study

DataCite Commons · Updated 2025-08-18 · Indexed 2025-09-08
Download link:
https://figshare.com/articles/dataset/datasets_for_QKAR_study/29936189/1
Description:
Computational toxicology plays an important role in risk assessment and drug safety. The field has traditionally been dominated by Quantitative Structure-Activity Relationships (QSARs), which predict toxicological effects based solely on chemical structure. Although QSARs have achieved successes, their reliance on structure limits drug toxicity predictions, where small structural modifications may cause major toxicity changes. Advances in artificial intelligence (AI), especially text embedding and generative AI, provide an opportunity to enhance toxicity predictions by leveraging broader chemical knowledge and its integration with structural data. In this study, we propose a novel framework, Quantitative Knowledge-Activity Relationships (QKARs), which predict toxicity using domain-specific knowledge. We developed QKAR models for two drug toxicity endpoints, drug-induced liver injury (DILI) and drug-induced cardiotoxicity (DICT), using three different knowledge representations with varying levels of knowledge. The representations based on comprehensive knowledge of the drugs yielded better predictions than those based on simpler knowledge. Five machine learning algorithms of distinct complexity were applied in QKAR models, and we observed little association between model complexity and performance. Further, we evaluated QKARs against QSARs on the same endpoints using identical datasets. We found that QKARs consistently outperformed QSARs for DILI and DICT. Notably, QKARs demonstrated better capability than QSARs in differentiating drugs with similar structures but different liver toxicity profiles. We also investigated integrating knowledge-based and structure-based representations, Q(K+S)ARs, for further enhanced prediction accuracy. Our findings demonstrate the potential of QKARs as a robust alternative to QSARs, offering additional opportunities in drug toxicity assessments by leveraging both domain-specific knowledge and structural data.

The files DILI.xlsx and DICT.xlsx hold the data used in this study.
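The idea of combining knowledge-based and structure-based representations can be illustrated with a small sketch. This is not the authors' implementation: the feature values, the labels, the helper names (`fuse`, `fit_nearest_centroid`, `predict`), and the nearest-centroid classifier are all assumptions chosen for brevity, and the description does not state the column layout of DILI.xlsx or DICT.xlsx. The toy data is constructed so that drugs with identical structural bits carry different labels, which a structure-only model cannot separate.

```python
from math import sqrt

def fuse(knowledge_vec, structure_vec):
    """Concatenate a knowledge embedding with a structural fingerprint
    (a hypothetical stand-in for a Q(K+S)AR-style representation)."""
    return list(knowledge_vec) + list(structure_vec)

def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_nearest_centroid(features, labels):
    """Return a dict mapping each label to the centroid of its vectors."""
    by_label = {}
    for x, y in zip(features, labels):
        by_label.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_label.items()}

def predict(model, x):
    """Predict the label whose centroid is closest in Euclidean distance."""
    def dist(a, b):
        return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return min(model, key=lambda y: dist(model[y], x))

# Made-up toy data, NOT taken from DILI.xlsx or DICT.xlsx.
# Drugs 0 and 1 share the same structural bits but differ in label,
# as do drugs 2 and 3.
knowledge = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
structure = [[1, 0, 1], [1, 0, 1], [0, 1, 1], [0, 1, 1]]
labels = [1, 0, 1, 0]  # 1 = toxic, 0 = non-toxic (hypothetical)

fused = [fuse(k, s) for k, s in zip(knowledge, structure)]
fused_model = fit_nearest_centroid(fused, labels)
structure_model = fit_nearest_centroid(structure, labels)

query = fuse([0.85, 0.15], [1, 0, 1])
print(predict(fused_model, query))
```

In this toy setup the structure-only centroids for the two classes coincide, so only the fused representation separates them. Real experiments would use the study's actual features and a proper evaluation protocol.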

Provider:
figshare
Created:
2025-08-18