
"Calibrating Prediction Timeliness Through Multi-Objective Hyperparameter Optimization for Remaining Useful Life Prediction"

Source: DataCite Commons. Updated 2026-03-12; indexed 2026-05-03.
Download link:
https://ieee-dataport.org/documents/calibrating-prediction-timeliness-through-multi-objective-hyperparameter-optimization
Resource overview:

Dataset Summary

This study uses two publicly available prognostics benchmarks from distinct industrial domains: the NASA C-MAPSS turbofan engine simulation and the BackBlaze hard disk drive operational dataset. Together, these benchmarks span simulated and real-world degradation regimes, enabling assessment of cross-domain generalizability.

1. NASA C-MAPSS Turbofan Engine Dataset

Source: NASA Prognostics Center of Excellence (Saxena et al., 2008a). Publicly available at the NASA Prognostics Data Repository.

Description: Run-to-failure degradation trajectories of simulated turbofan engines under varying operating conditions and fault modes. Each engine unit records multivariate time-series readings from 14 sensors at each operational cycle until failure. The dataset comprises four subsets of increasing complexity: FD001 (1 operating condition, 1 fault mode), FD002 (6 operating conditions, 1 fault mode), FD003 (1 operating condition, 2 fault modes), and FD004 (6 operating conditions, 2 fault modes).

Preprocessing: A piecewise linear degradation model caps the maximum RUL at 125 cycles, following the standard assumption that engines in early operational stages are functionally healthy (Heimes, 2008; Li et al., 2018). Training partitions are further split into training and validation subsets at the engine level for hyperparameter selection.

Splits: FD001 and FD003: 85 training / 15 validation / 100 test engines. FD002: 221 training / 39 validation / 259 test engines. FD004: 212 training / 37 validation / 248 test engines.

2. BackBlaze Hard Disk Drive Dataset

Source: BackBlaze, Inc. (BackBlaze, 2023). Publicly available at https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data.

Description: Daily SMART (Self-Monitoring, Analysis and Reporting Technology) attribute readings from operational data center hard drives. Data spans three consecutive quarters (Q1–Q3 2024), merged with cross-quarter deduplication. Unlike C-MAPSS, BackBlaze presents noisier sensor signals, less predictable degradation patterns, and more heterogeneous failure modes.

Preprocessing: Only drives with a recorded failure event are retained; censored (non-failed) drives are excluded. Drives with fewer than 30 days of operational history or fewer than 30 observations are filtered out. From the available SMART attributes, 14 features with established failure-predictive relevance are selected, including Reallocated Sectors Count, Reported Uncorrectable Errors, Current Pending Sector Count, Power-On Hours, and Temperature. Each daily record is treated as one operational cycle. RUL is computed as days remaining until the recorded failure date, clipped at a maximum of 60 days.

Splits: 2,444 failed drives total (140,279 observations across 14 features), split at the drive level into training (1,956 drives; 111,971 observations) and test (488 drives; 28,308 observations) with a fixed random seed.

3. Dataset Characteristics Summary

| | FD001 | FD002 | FD003 | FD004 | BackBlaze |
|---|---|---|---|---|---|
| Train / Val / Test units | 85 / 15 / 100 | 221 / 39 / 259 | 85 / 15 / 100 | 212 / 37 / 248 | 1,956 / n/a / 488 |
| Train observations | 20,631 | 53,759 | 24,720 | 61,249 | 111,971 |
| Test observations | 13,096 | 33,991 | 16,596 | 41,214 | 28,308 |
| Features | 14 sensors | 14 sensors | 14 sensors | 14 sensors | 14 SMART attr. |
| Mean cycles/unit (train) | 206.3 | 206.8 | 247.2 | 246.0 | 57.6 |
| Mean train RUL | 86.8 | 86.9 | 93.1 | 93.0 | 30.0 |
| % at RUL cap (train) | 39.4% | 39.5% | 49.4% | 49.2% | 10.9% |
| RUL cap | 125 cycles | 125 cycles | 125 cycles | 125 cycles | 60 days |
| Operating conditions | 1 | 6 | 1 | 6 | Real-world |
| Fault modes | 1 | 1 | 2 | 2 | Heterogeneous |

4. Data Availability

Both datasets are publicly available. The C-MAPSS dataset is accessible through the NASA Prognostics Data Repository. The BackBlaze hard drive dataset is available at https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data. The preprocessed datasets and code used for feature selection, RUL labeling, and train/test splitting are provided alongside this submission to ensure full reproducibility.
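
The capped RUL labeling and the unit-level splitting described in the preprocessing notes can be illustrated with a minimal Python sketch. This is not the submission's code: the function names `piecewise_linear_rul` and `split_units`, the seed value, and the convention that the last recorded cycle has RUL 0 are all assumptions made for illustration.

```python
import random


def piecewise_linear_rul(n_cycles, cap):
    """Return capped RUL labels for one run-to-failure unit.

    Assumption: cycles are indexed 1..n_cycles and the last recorded
    cycle is labeled RUL 0. Early cycles are clipped at `cap`
    (125 cycles for C-MAPSS, 60 days for BackBlaze in the description).
    """
    return [min(cap, n_cycles - t) for t in range(1, n_cycles + 1)]


def split_units(unit_ids, n_holdout, seed):
    """Split unit identifiers into (kept, held-out) lists.

    Splitting by unit (engine or drive) rather than by row keeps every
    observation of a unit in exactly one subset. The seed is a
    hypothetical value; the description only states that it is fixed.
    """
    ids = sorted(set(unit_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[n_holdout:], ids[:n_holdout]


if __name__ == "__main__":
    # One hypothetical engine that runs for 200 cycles before failure.
    labels = piecewise_linear_rul(n_cycles=200, cap=125)
    print(labels[0], labels[-1])

    # 100 engines split 85 / 15, as reported for FD001 and FD003.
    train_ids, val_ids = split_units(range(1, 101), n_holdout=15, seed=42)
    print(len(train_ids), len(val_ids))
```

Passing the held-out count as an integer, rather than a fraction, avoids rounding ambiguity when trying to match the reported split sizes.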
Provider:
IEEE DataPort
Created:
2026-03-12