
Model performance results of different models.

Source: Figshare. Updated 2025-05-19. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Model_performance_results_of_different_models_/29101833
Description:
Automated valuation models (AVMs) are widely used by real estate stakeholders to estimate property values automatically. Traditional valuation models are subjective and inaccurate, and previous studies have shown that machine learning (ML) approaches perform better in real estate valuation. However, these ML valuation models are typically based on structured tabular data, and few integrate multi-source unstructured data such as images. Most previous studies also train models on a fixed feature space without considering how model performance varies with feature configuration parameters. To fill these gaps, this study uses Hong Kong as a case study and proposes an enhanced ML-based real estate valuation framework with feature configuration and multi-source image data fusion, covering exterior housing photos, street view images, and remote sensing images. Eight ML regressors, namely Random Forest, Extra Tree, XGBoost, Light Gradient Boosting Machine (LightGBM), K-Nearest Neighbors (KNN), Support Vector Regression (SVR), Multilayer Perceptron (MLP), and Multiple Linear Regression (MLR), are used to formulate ML pipelines for training. The SHapley Additive exPlanations (SHAP) method is used to examine the effects of images on housing prices. The experimental results show that model performance differs significantly across feature configuration parameters, indicating that feature configuration is necessary to obtain more accurate and reliable predictions. Extra Tree performs significantly better than the other models. Half of the top 10 significant features are image features, and incorporating multi-source image features can improve property valuation accuracy. Nonlinear associations exist between image features and housing prices, and the spatial distribution patterns of image feature values and their corresponding SHAP main effects vary significantly from the city centre to the suburbs.
These findings give public authorities, urban planners, and real estate developers a better understanding of AVM development with image fusion and of the nonlinear associations between image features and housing prices.
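
The idea of comparing model performance across feature configurations can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the column meanings (floor area, building age, an image-derived greenery score) and the configuration names are hypothetical, and only a small hand-written KNN regressor stands in for the eight regressors named above. RMSE is used here as an assumed metric.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training rows nearest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(y for _, y in nearest) / k


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def evaluate(rows, targets, columns, k=3, n_train=300):
    """Hold-out RMSE of KNN when only the given feature columns are used."""
    X = [[row[c] for c in columns] for row in rows]
    train_X, test_X = X[:n_train], X[n_train:]
    train_y, test_y = targets[:n_train], targets[n_train:]
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return rmse(test_y, preds)


# Synthetic data (assumption): column 0 = floor area, column 1 = building age,
# column 2 = an image-derived greenery score, all scaled to [0, 1].
rng = random.Random(0)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(400)]

# Synthetic price: linear in the tabular columns, nonlinear in the image feature.
targets = [
    3.0 * r[0] - 1.0 * r[1] + 2.0 * math.sin(3.0 * r[2]) + rng.gauss(0, 0.05)
    for r in rows
]

# Two hypothetical feature configurations to compare.
configs = {
    "tabular_only": [0, 1],
    "tabular_plus_image": [0, 1, 2],
}
results = {name: evaluate(rows, targets, cols) for name, cols in configs.items()}

for name, score in results.items():
    print(f"{name}: RMSE = {score:.3f}")
```

In this toy setup the configuration that includes the image-derived column yields the lower hold-out error, because the synthetic price depends on that column by construction. Real results depend on the data, the regressor, and the tuning.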

Created: 2025-05-19