Application of Life Cycle Assessment and Machine Learning for High-Throughput Screening of Green Chemical Substitutes
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/Application_of_Life_Cycle_Assessment_and_Machine_Learning_for_High-Throughput_Screening_of_Green_Chemical_Substitutes/12698333
Resource description:
The production of many active pharmaceutical ingredients, such as
sitagliptin, can cause severe environmental problems because of the
use of toxic chemical materials, production infrastructure,
energy consumption, and waste treatment. The environmental impacts
of the sitagliptin production process were estimated with a life cycle
assessment (LCA) method, which suggested that chemical materials
contribute most of the environmental impact. The Eco-indicator 99 and
ReCiPe endpoint methods attributed 83% and 70% of the life-cycle
impact, respectively, to chemical feedstock. Among
all the chemical materials used in the sitagliptin production process,
trifluoroacetic anhydride was identified as the most influential
factor in most impact categories according to the ReCiPe midpoint
method. Therefore, high-throughput screening was performed to find
greener chemical substitutes for the target chemical (i.e.,
trifluoroacetic anhydride) in the following three steps.
First, the 30 most similar chemicals were obtained from
2 million candidate alternatives in the PubChem database on the basis
of their molecular descriptors. Second, deep learning neural network
models were trained to predict life-cycle impact from the molecular
descriptors of chemicals in the Ecoinvent v3.5 database with known
LCA values. Finally, the LCA data of the similar chemicals were
predicted using the trained models, and 1,2-ethanediyl ester was
identified as one of the potential greener substitutes.
The case study demonstrated the applicability of
the novel framework to screen green chemical substitutes and optimize
the pharmaceutical manufacturing process.
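
The three screening steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the descriptor vectors, impact scores, and chemical names are made-up toy values, cosine similarity is an assumed similarity measure, and a k-nearest-neighbour average stands in for the deep learning models, which the description does not specify in detail. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two descriptor vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(target, candidates, k):
    """Step 1: names of the k candidates most similar to the target."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine_similarity(target, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


def knn_predict(query, training, k=2):
    """Step 2 stand-in: average impact of the k nearest training chemicals.

    The source uses deep learning neural networks; this simple regressor
    only keeps the sketch dependency-free.
    """
    nearest = sorted(training, key=lambda row: math.dist(query, row[0]))[:k]
    return sum(impact for _, impact in nearest) / len(nearest)


def screen(target_desc, target_impact, candidates, training, k_similar):
    """Step 3: predict impacts for the shortlist, keep those below the target."""
    shortlist = top_k_similar(target_desc, candidates, k_similar)
    predicted = {name: knn_predict(candidates[name], training) for name in shortlist}
    greener = [(name, score) for name, score in predicted.items() if score < target_impact]
    return sorted(greener, key=lambda pair: pair[1])


# Toy data (hypothetical): 2-D descriptors and a single impact score.
target_descriptor = [1.0, 0.0]
target_impact = 10.0
candidates = {"A": [0.9, 0.1], "B": [0.8, 0.3], "C": [0.0, 1.0]}
training = [
    ([1.0, 0.0], 10.0),
    ([0.9, 0.2], 6.0),
    ([0.7, 0.4], 4.0),
    ([0.0, 1.0], 1.0),
]

ranking = screen(target_descriptor, target_impact, candidates, training, k_similar=2)
for name, score in ranking:
    print(name, score)
```

In the study itself the shortlist holds 30 chemicals drawn from about 2 million PubChem candidates, and the training set comes from Ecoinvent v3.5; the toy inputs here only show how the steps connect.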
Created:
2020-07-23



