five

XGBoost odor prediction model: finding the structure-odor relationship of odorant molecules using the extreme gradient boosting algorithm

收藏
DataCite Commons2024-11-08 更新2025-01-06 收录
下载链接:
https://tandf.figshare.com/articles/dataset/XGBoost_odor_prediction_model_finding_the_structure-odor_relationship_of_odorant_molecules_using_the_extreme_gradient_boosting_algorithm/27637933
下载链接
链接失效反馈
官方服务:
资源简介:
Determining the structure-odor relationship has always been a very challenging task. The main challenge in investigating the correlation between the molecular structure and its associated odor is the ambiguous and obscure nature of verbally defined odor descriptors, particularly when the odorant molecules are from different sources. With the recent developments in machine learning (ML) technology, ML and data analytic techniques are significantly being used for quantitative structure-activity relationship (QSAR) in the chemistry domain toward knowledge discovery where the traditional Edisonian methods have not been useful. The smell perception of odorant molecules is one of the aforementioned tasks, as olfaction is one of the least understood senses as compared to other senses. In this study, the XGBoost odor prediction model was generated to classify smells of odorant molecules from their SMILES strings. We first collected the dataset of 1278 odorant molecules with seven basic odor descriptors, and then 1875 physicochemical properties of odorant molecules were calculated. To obtain relevant physicochemical features, a feature reduction algorithm called PCA was also employed. The ML model developed in this study was able to predict all seven basic smells with high precision (>99%) and high sensitivity (>99%) when tested on an independent test dataset. The results of the proposed study were also compared with three recently conducted studies. The results indicate that the XGBoost–PCA model performed better than the other models for predicting common odor descriptors. The methodology and ML model developed in this study may be helpful in understanding the structure-odor relationship. Communicated by Ramaswamy H. Sarma
提供机构:
Taylor & Francis
创建时间:
2024-11-08
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作