Machine learning-based discovery of molecular descriptors that control polymer gas permeation
Source: DataONE. Updated 2024-02-29; indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:e918006f44e83535ce5f76caa5595b1c0a73e5712aab0e1f4f10de3fef868761
Description:
While machine learning has found increasing use in predicting the properties of polymeric materials with only a knowledge of chain architecture, determining the molecular factors underpinning properties ("interpretable AI") has remained less well explored. We show that encoding chain chemistry in commonly employed formats, e.g., binary-valued fingerprints, leads to uniqueness issues during the hashing process to save storage space. This is because the hashing algorithm can map several chemical moieties into the same bit. These issues carry over into the ML algorithms, especially for "inverse" design and interpretable AI, and cannot be avoided by changing the length of the fingerprint. Using MACCS key featurizations of monomer repeats resolves some of these issues, and we show that a few substructures consistently appear in top features for maximizing permeability across several gases and ML models. These are carbon-carbon double bonds (as in polyacetylenes) especially when they are asso...

# Machine learning-based discovery of molecular descriptors that control polymer gas permeation
[https://doi.org/10.5061/dryad.5x69p8dbm](https://doi.org/10.5061/dryad.5x69p8dbm)
## Description of the data and file structure
Machine learning-based discovery of molecular descriptors that control polymer gas permeation
Dataset, Shastry et al. Journal of Membrane Science (2024)
Perm_Data.csv contains the pure gas permeability values for the polymer membranes used to train the machine learning models in the paper. Perm_Data_refs.csv contains citation information for the publications used in the analysis; some of its lines are intentionally left blank because redundant data were removed without shifting the indexing used in the main database file.
Columns 1 and 2 contain the polymer name and Simplified Molecular Input Line Entry System (SMILES) strings. Columns 3-8 contain permeability values corresponding to each of the 6 gases under study. Columns 9 and 10 contain experimental temperature ...
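The column layout described above can be read by position. The following is a minimal sketch, not the authors' code: it assumes the file is comma-separated with one header row, it relies only on the stated positions (columns 1-2 for name and SMILES, columns 3-8 for the six gases), and it ignores columns 9 and later. The function name `read_permeability_rows` and the `has_header` flag are hypothetical.

```python
import csv


def read_permeability_rows(handle, has_header=True):
    """Yield (name, smiles, permeabilities) from a Perm_Data.csv-like file.

    Positions follow the README: columns 1-2 hold the polymer name and
    SMILES string, columns 3-8 hold the six gas permeabilities.
    Columns 9 and beyond are ignored in this sketch.
    """
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)  # assumption: the first row holds column labels
    for row in reader:
        # Skip rows that are empty or too short to hold all eight columns.
        if len(row) < 8 or not any(cell.strip() for cell in row):
            continue
        name, smiles = row[0], row[1]
        # An empty cell becomes None instead of a guessed value.
        perms = [float(v) if v.strip() else None for v in row[2:8]]
        yield name, smiles, perms
```

A typical call would be `with open("Perm_Data.csv", newline="") as f: rows = list(read_permeability_rows(f))`. Reading by position avoids depending on header names, which the README excerpt does not list.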
Created: 2025-07-28



