
S1 Data. Analytical data set for the Housing_price and Machine learning study

Source: DataCite Commons. Updated 2025-06-01. Indexed 2025-01-06.
Download link:
https://figshare.com/articles/dataset/S1_Data_Analytical_data_set_for_the_Housing_price_and_Machine_learning_study/26965252/1
Description:
The hedonic price model used in existing house price modeling may not capture the relationship between house prices and streetscapes as perceived at human eye level. In this study, we therefore analyzed the relationship between streetscapes perceived at eye level and single-family home prices in Seoul, Korea, using computer vision and machine learning algorithms. We used transaction data for 13,776 single-family housing sales between 2017 and 2019. To measure visually perceived streetscapes, we applied the Deeplab V3+ deep-learning model to 233,106 Google Street View panoramic images. The best machine-learning model was then selected by comparing the explanatory power of the hedonic price model and the alternative machine-learning models. According to the results, the Gradient Boost model, a representative ensemble machine learning model, performed better than the XGBoost, Random Forest, and Linear Regression models in predicting single-family house prices. In addition, this study used SHAP, an interpretable machine learning method, to identify the key features that affect single-family home price prediction, which addresses the "black box" problem of machine learning models. Finally, by analyzing the nonlinear relationships and interaction effects between perceived streetscape characteristics and house prices, we were able to identify, easily and quickly, relationships between variables that the hedonic price model only partially considers.
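
The description does not say how the Deeplab V3+ segmentation output was turned into streetscape variables. One common approach, assumed here purely for illustration, is to take the share of pixels assigned to each class in an image (for example, a vegetation share). The class IDs, class names, and the `streetscape_shares` helper below are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical class IDs and names; the study's actual label set is not
# given in the description.
CLASS_NAMES = {0: "road", 1: "building", 2: "vegetation", 3: "sky"}


def streetscape_shares(label_map):
    """Return the fraction of pixels assigned to each class in one image.

    label_map is a list of rows, each row a list of integer class IDs,
    standing in for the per-pixel output of a segmentation model.
    """
    counts = Counter(pixel for row in label_map for pixel in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label_map contains no pixels")
    return {name: counts.get(cid, 0) / total for cid, name in CLASS_NAMES.items()}


# Toy 2x4 label map standing in for a real segmented street view image.
toy_mask = [
    [3, 3, 2, 2],
    [0, 0, 1, 2],
]
shares = streetscape_shares(toy_mask)
print(shares)
```

Under this assumption, each image yields one row of class shares that could be aggregated per property and used as features alongside the transaction data.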

Provider:
figshare
Created:
2024-09-08