Scaled Dataset.xlsx

Name: Scaled Dataset.xlsx
Creator: figshare
Published: 2020-09-02 21:09:04
License: No description available

DataCite Commons | Updated: 2020-09-02 | Indexed: 2024-07-25

Download link:

https://figshare.com/articles/dataset/Scaled_Dataset_xlsx/4491101

Description:

The partner company's historical data can be used to develop a data-driven prediction model that takes project division details as its inputs and project division labor-hours as its output. The BIM models contain 42 design features and 1559 records, each record denoting a division of fabrication. The BIM design features are listed in Table 1. Labor-hours spent on each division were extracted from job costing databases and serve as the output parameter in the regression model. Although the variables in Table 1 are all considered related, they are inter-correlated, and some variables can be explained by others. For instance, material length and weight are highly correlated; knowing one, the other can be deduced. A variable selection technique is therefore instrumental in removing these inter-correlations analytically. Note that the dataset was linearly scaled before any analysis was performed, so that sensitive information of the partner company is not revealed while the patterns and relationships inherent in the data remain undistorted.
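The two technical claims in the description (linear scaling leaves relationships between variables intact, and highly correlated variables such as length and weight are redundant) can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the data are synthetic, the column names `length`, `weight`, and `bolts` are hypothetical stand-ins for the features in Table 1, the scaling factor 0.37 is arbitrary, and the greedy correlation filter with a 0.95 threshold is just one simple way to remove inter-correlated inputs. The description does not say which variable selection technique or which scaling factor the authors used.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def linear_scale(values, factor, offset=0.0):
    """Apply the same linear map to every value in a column."""
    return [factor * v + offset for v in values]


def drop_correlated(columns, threshold=0.95):
    """Greedy filter: keep a column only if its |r| with every
    already-kept column is below the threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Synthetic stand-in data (NOT the real dataset).
rng = random.Random(0)
length = [rng.uniform(1.0, 12.0) for _ in range(200)]
weight = [7.85 * l + rng.gauss(0.0, 0.5) for l in length]  # nearly proportional to length
bolts = [rng.randint(0, 40) for _ in range(200)]            # generated independently of length

raw = {"length": length, "weight": weight, "bolts": bolts}

# Hypothetical confidentiality scaling: one arbitrary positive factor for all columns.
scaled = {name: linear_scale(col, factor=0.37) for name, col in raw.items()}

r_raw = pearson(raw["length"], raw["weight"])
r_scaled = pearson(scaled["length"], scaled["weight"])
selected = drop_correlated(scaled)

print(f"r(length, weight) before scaling: {r_raw:.6f}")
print(f"r(length, weight) after scaling:  {r_scaled:.6f}")
print("columns kept by the filter:", selected)
```

Because the Pearson coefficient is invariant under a linear map with a positive factor, the correlation between `length` and `weight` is the same before and after scaling, and the filter drops `weight` as redundant while keeping the independently generated `bolts` column.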

Provider: figshare
Created: 2016-12-22
