Machine learning-based discovery of molecular descriptors that control polymer gas permeation

Source: DataONE. Updated 2024-02-29; indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:e918006f44e83535ce5f76caa5595b1c0a73e5712aab0e1f4f10de3fef868761
Resource description:
While machine learning has found increasing use in predicting the properties of polymeric materials with only a knowledge of chain architecture, determining the molecular factors underpinning properties ("interpretable AI") has remained less well explored. We show that encoding chain chemistry in commonly employed formats, e.g., binary-valued fingerprints, leads to uniqueness issues during the hashing process used to save storage space. This is because the hashing algorithm can map several chemical moieties into the same bit. These issues carry over into the ML algorithms, especially for “inverse” design and interpretable AI, and cannot be avoided by changing the length of the fingerprint. Using MACCS key featurizations of monomer repeats resolves some of these issues, and we show that a few substructures consistently appear in top features for maximizing permeability across several gases and ML models. These are carbon-carbon double bonds (as in polyacetylenes) especially when they are asso...

# Machine learning-based discovery of molecular descriptors that control polymer gas permeation

[https://doi.org/10.5061/dryad.5x69p8dbm](https://doi.org/10.5061/dryad.5x69p8dbm)

## Description of the data and file structure

Machine learning-based discovery of molecular descriptors that control polymer gas permeation Dataset, Shastry et al. Journal of Membrane Science (2024)

The dataset within Perm_Data.csv contains pure gas permeability values for the polymer membranes used to train machine learning models in the paper. Perm_Data_refs.csv contains citation information for any publications used in the analysis, with specific lines left blank due to removal of redundant data without offsetting the indexing within the main database file.

Columns 1 and 2 contain the polymer name and Simplified Molecular Input Line Entry System (SMILES) strings. Columns 3-8 contain permeability values corresponding to each of the 6 gases under study. Columns 9 and 10 contain experimental temperature ...
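
The column layout described above can be sketched as a small positional reader. This is a minimal illustration, not the authors' loading code: the helper names (`split_row`, `read_perm_data`), the presence of a header row, and the sample values are all assumptions. The listing truncates the description of columns 9 and 10, so they are kept as an uninterpreted pair.

```python
import csv
import io


def split_row(row):
    """Split one data row by the column positions given in the description."""
    return {
        "name": row[0],              # column 1: polymer name
        "smiles": row[1],            # column 2: SMILES string
        "permeabilities": row[2:8],  # columns 3-8: one value per gas (6 gases)
        "conditions": row[8:10],     # columns 9-10: description truncated in the listing
    }


def read_perm_data(handle, has_header=True):
    """Read rows from an open CSV handle; a header row is an assumption."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # skip the assumed header line
    return [split_row(row) for row in reader if row]


# Hypothetical one-row sample standing in for Perm_Data.csv.
sample = io.StringIO(
    "name,smiles,g1,g2,g3,g4,g5,g6,c9,c10\n"
    "polymer A,*CC*,1.0,2.0,3.0,4.0,5.0,6.0,x,y\n"
)
records = read_perm_data(sample)
```

Values are left as strings here because the listing does not say how missing entries are encoded; converting to floats would be a later, dataset-specific step.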
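
The abstract's point that hashing can map several chemical moieties into the same bit can be illustrated with a toy fold. This sketch is not the fingerprinting algorithm used in the paper or in any cheminformatics library: SHA-256 is a stand-in hash, the fragment strings are hypothetical labels, and the bit count is deliberately smaller than the number of fragments so that a collision is guaranteed by counting alone.

```python
import hashlib


def bit_index(fragment, n_bits):
    """Map a fragment label to a bit position (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(fragment.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def fold(fragments, n_bits):
    """Group fragment labels by the bit they are hashed into."""
    buckets = {}
    for frag in fragments:
        buckets.setdefault(bit_index(frag, n_bits), []).append(frag)
    return buckets


def collisions(fragments, n_bits):
    """Return only the bits that are shared by more than one fragment."""
    return {
        bit: frags
        for bit, frags in fold(fragments, n_bits).items()
        if len(frags) > 1
    }


# Hypothetical fragment labels: ten distinct moieties, eight available bits.
fragments = [
    "C=C", "C#C", "c1ccccc1", "C(F)(F)F", "C[Si](C)(C)C",
    "C(=O)O", "C(=O)N", "COC", "S(=O)(=O)", "C#N",
]
shared = collisions(fragments, n_bits=8)
```

Once a bit is shared, a model that ranks that bit as important cannot say which of the colliding moieties is responsible. A fixed dictionary of named substructures, as in the MACCS keys mentioned above, avoids this particular ambiguity because each position corresponds to one predefined pattern.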
Created: 2025-07-28