
Application of Life Cycle Assessment and Machine Learning for High-Throughput Screening of Green Chemical Substitutes

Indexed by NIAID Data Ecosystem on 2026-03-11
Download link:
https://figshare.com/articles/dataset/Application_of_Life_Cycle_Assessment_and_Machine_Learning_for_High-Throughput_Screening_of_Green_Chemical_Substitutes/12698333
Description:
The production of many active pharmaceutical ingredients, such as sitagliptin, can cause severe environmental problems because of the toxic chemical materials used, the production infrastructure, energy consumption, and waste treatment. The environmental impacts of the sitagliptin production process were estimated with a life cycle assessment (LCA), which indicated that the use of chemical materials is the main contributor. The Eco-indicator 99 and ReCiPe endpoint methods attributed 83% and 70% of the life-cycle impact, respectively, to chemical feedstock. Among all the chemical materials used in the sitagliptin production process, trifluoroacetic anhydride was identified as the most influential factor in most impact categories according to the ReCiPe midpoint method. High-throughput screening was therefore performed to find greener substitutes for the target chemical (i.e., trifluoroacetic anhydride) in three steps. First, the 30 most similar chemicals were selected from 2 million candidate alternatives in the PubChem database on the basis of their molecular descriptors. Second, deep learning neural network models were developed to predict life-cycle impact, trained on chemicals in the Ecoinvent v3.5 database that have known LCA values and corresponding molecular descriptors. Finally, the LCA data of the similar chemicals were predicted with the trained machine learning models, and 1,2-ethanediyl ester was identified as one of the potential greener substitutes. The case study demonstrates the applicability of the framework for screening green chemical substitutes and optimizing the pharmaceutical manufacturing process.
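The first screening step, ranking candidates by similarity of molecular descriptors, can be illustrated with a minimal sketch. The description does not state which similarity measure or descriptors were used, so this example assumes Euclidean distance on z-scored descriptors. The function names (`standardize`, `most_similar`), the candidate labels, and all descriptor values are hypothetical and are not taken from the dataset.

```python
import math


def standardize(rows):
    """Z-score each descriptor column so no single descriptor dominates the distance."""
    n_cols = len(rows[0])
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / len(rows)
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant columns
    return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]


def most_similar(target, candidates, k):
    """Return the names of the k candidates closest to the target in descriptor space."""
    names = list(candidates)
    scaled = standardize([target] + [candidates[n] for n in names])
    target_scaled, rest = scaled[0], scaled[1:]
    ranked = sorted(zip(names, rest), key=lambda item: math.dist(target_scaled, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical descriptor vectors (made-up numbers, not real chemistry data).
target = [210.0, 1.5, 3.0]
candidates = {
    "cand_A": [208.0, 1.4, 3.0],
    "cand_B": [90.0, -0.5, 1.0],
    "cand_C": [215.0, 1.7, 3.0],
    "cand_D": [400.0, 4.0, 6.0],
}
print(most_similar(target, candidates, k=2))
```

In the study the same idea would apply with `k=30` over the PubChem candidates. A brute-force sort is shown only for clarity; at the scale of 2 million candidates an indexed nearest-neighbour search would likely be preferable.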

Created: 2020-07-23