Predicting PFAS Diffusion Coefficients with Active Learning and Molecular Dynamics

Source: Figshare. Updated: 2025-11-11. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Predicting_PFAS_Diffusion_Coefficients_with_Active_Learning_and_Molecular_Dynamics/30593864
Description:
Per- and polyfluoroalkyl substances (PFAS) comprise over 14,000 synthetic compounds with exceptional environmental persistence. Used extensively in industrial and consumer applications, PFAS resist degradation and accumulate in environmental media and living organisms, posing serious health risks, including cancer, liver damage, and immune dysfunction. This persistence and toxicity create an urgent need for environmental fate assessment. Predicting PFAS environmental transport remains challenging because reliable diffusion coefficient data, which are critical for modeling contaminant mobility and designing remediation strategies, are lacking. Experimental measurements are time-consuming and expensive, while fully computational approaches are infeasible given the scale of the chemical space. We developed an integrated machine learning and molecular dynamics (MD) framework that uses active learning to predict diffusion coefficients across the PFAS chemical space. Starting with measured diffusion coefficients, we train models using chemical graph-based representations and physicochemical descriptors. The approach iteratively identifies the molecules with the highest prediction uncertainty, performs targeted MD simulations on them, and retrains the models, exploring chemical space efficiently while minimizing computational cost. The framework reduced mean relative error by 88% and increased R² from 0.095 to 0.907. Uncertainty-based sampling consistently outperformed random selection at optimal batch sizes of 50–100 compounds. This data-efficient approach enables transport property prediction across thousands of PFAS molecules, supporting environmental risk assessment and remediation planning.
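The iterative loop described above (predict, pick the most uncertain molecules, label them, retrain) can be illustrated with a minimal sketch. This is not the authors' implementation: the bootstrap ensemble of ridge regressors, the `oracle` callable standing in for a targeted MD simulation, and all function names and parameters are assumptions made for illustration. The description does not specify the model class or how uncertainty is estimated; ensemble spread is used here as one common choice. The two metric helpers show how the reported quantities, R² and mean relative error, are conventionally computed.

```python
import numpy as np

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with an appended intercept column."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)

def predict(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

def ensemble_predict(X_train, y_train, X, n_models, rng):
    """Bootstrap ensemble: mean prediction and per-sample spread (uncertainty)."""
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X_train), len(X_train))  # resample with replacement
        preds.append(predict(fit_ridge(X_train[idx], y_train[idx]), X))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_relative_error(y_true, y_pred):
    return np.mean(np.abs(y_pred - y_true) / np.abs(y_true))

def active_learning(X_pool, oracle, X_init, y_init, n_rounds, batch_size, seed=0):
    """Uncertainty-based sampling over a pool of unlabeled descriptor vectors.

    `oracle` is a hypothetical placeholder for running MD on the selected
    molecules and returning their diffusion coefficients.
    """
    rng = np.random.default_rng(seed)
    X_train, y_train = X_init.copy(), y_init.copy()
    pool = np.arange(len(X_pool))  # indices of molecules not yet labeled
    for _ in range(n_rounds):
        if len(pool) == 0:
            break
        _, std = ensemble_predict(X_train, y_train, X_pool[pool], 10, rng)
        picked_pos = np.argsort(std)[-batch_size:]  # highest-uncertainty candidates
        picked = pool[picked_pos]
        y_new = oracle(X_pool[picked])              # stand-in for targeted MD runs
        X_train = np.vstack([X_train, X_pool[picked]])
        y_train = np.concatenate([y_train, y_new])
        pool = np.delete(pool, picked_pos)          # remove labeled molecules
    return X_train, y_train, pool

# Usage on synthetic data (random descriptors, linear toy oracle; not PFAS data).
demo_rng = np.random.default_rng(42)
demo_pool = demo_rng.normal(size=(200, 4))
toy_oracle = lambda X: X @ np.array([0.5, -1.0, 2.0, 0.1]) + 5.0
demo_init = demo_rng.normal(size=(8, 4))
X_lab, y_lab, remaining = active_learning(
    demo_pool, toy_oracle, demo_init, toy_oracle(demo_init),
    n_rounds=3, batch_size=50,
)
```

Replacing the `np.argsort(std)` selection with a random draw from `pool` gives the random-selection baseline against which uncertainty-based sampling is compared in the description.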
Created:
2025-11-11