
Using Autonomous Outlier Detection Methods for Thermophysical Property Data

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://figshare.com/articles/dataset/Using_Autonomous_Outlier_Detection_Methods_for_Thermophysical_Property_Data/24992835
Description:
The reliability and accuracy of thermophysical property data are of central importance for developing models that describe these properties. In this work, we compare different autonomous algorithms for identifying outliers in an existing database. For this purpose, the comprehensive database of thermophysical property data for the Lennard-Jones fluid [J. Chem. Inf. Model. 2019, 59, 4248–4265] is used. We focus on homogeneous state property data at given temperature and density for the pressure p, thermal expansion coefficient α, isothermal compressibility β, thermal pressure coefficient γ, internal energy u, isochoric heat capacity c_v, isobaric heat capacity c_p, Grüneisen coefficient Γ, Joule–Thomson coefficient μ_JT, speed of sound w, chemical potential μ, (reduced) Helmholtz energy ã = a/T, and its derivatives ã_nm. A comprehensive comparison of 19 outlier detection methods is carried out, which provides insights into the applicability of generic outlier detection algorithms to thermophysical property data. The study covers six classes of outlier detection algorithms: machine learning, distance-based, density-based, statistical, ensemble, and model-informed. Two approaches are used to evaluate the methods: in approach (a), the original database (which contains real outliers) is used; in approach (b), synthetic outliers are introduced. The results and findings from both approaches are consistent. In some cases, machine learning methods outperform the distance-based, density-based, ensemble, and statistical methods. The best performance is obtained with the model-informed method (called MoDOD). The results also provide insights into the nature of the outliers in the Lennard-Jones database.
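The idea behind approach (b), introducing synthetic outliers at known positions and checking whether a detector recovers them, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are random stand-in values rather than entries of the Lennard-Jones database, the function names are hypothetical, and the median/MAD rule is a generic statistical detector that the description above does not confirm as one of the 19 methods studied. It is not MoDOD and not the authors' code.

```python
import random
import statistics


def inject_synthetic_outliers(values, fraction, shift, seed=0):
    """Return a perturbed copy of `values` and the set of perturbed indices.

    Hypothetical helper: shifts a random subset of points by +/- `shift`.
    """
    rng = random.Random(seed)
    n_outliers = max(1, round(fraction * len(values)))
    outlier_idx = set(rng.sample(range(len(values)), n_outliers))
    perturbed = [
        v + shift * rng.choice((-1.0, 1.0)) if i in outlier_idx else v
        for i, v in enumerate(values)
    ]
    return perturbed, outlier_idx


def mad_outliers(values, threshold=3.5):
    """Flag indices whose modified z-score (median/MAD based) exceeds `threshold`."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0.0:
        return set()  # no spread: nothing can be ranked as an outlier
    return {
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    }


def precision_recall(flagged, truth):
    """Score flagged indices against the known synthetic outlier indices."""
    true_positives = len(flagged & truth)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Stand-in data (assumption): 200 noisy values around 1.0, not real property data.
data_rng = random.Random(42)
clean = [1.0 + data_rng.gauss(0.0, 0.01) for _ in range(200)]

noisy, truth = inject_synthetic_outliers(clean, fraction=0.05, shift=0.5)
flagged = mad_outliers(noisy)
precision, recall = precision_recall(flagged, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the positions of the synthetic outliers are known, precision and recall can be computed directly. Approach (a) has no such ground truth for the real outliers in the database.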
Created:
2024-01-12