
Forgis/Mechanical-Components

Hugging Face | Updated 2026-04-02 | Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/Forgis/Mechanical-Components
Resource description:
---
license: mit
configs:
- config_name: source_metadata
  data_files:
  - split: train
    path: source_metadata/*.parquet
- config_name: bearings
  data_files:
  - split: train
    path: bearings/*.parquet
- config_name: gearboxes
  data_files:
  - split: train
    path: gearboxes/*.parquet
- config_name: v2_train
  data_files:
  - split: train
    path: v2_train/*.parquet
---

# Mechanical Components Vibration Dataset

Comprehensive multi-source mechanical vibration dataset for training cross-component fault diagnosis and prognostics models. Designed for the Mechanical-JEPA project.

**Total: ~12,000+ samples | 9.5 GB | 16 sources | 5 component types**

## Quick Start

```python
from datasets import load_dataset

bearings = load_dataset("Forgis/Mechanical-Components", "bearings", split="train")
gearboxes = load_dataset("Forgis/Mechanical-Components", "gearboxes", split="train")
sources = load_dataset("Forgis/Mechanical-Components", "source_metadata", split="train")
```

## Two-Level Schema

**source_metadata** (16 entries): One row per source dataset with constant properties.

**bearings/gearboxes** configs: Per-sample data linked via `source_id` foreign key.
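The `source_id` link described above can be resolved with an ordinary dictionary lookup. The following is a minimal sketch, not the project's own tooling: it assumes that `source_metadata` rows carry a `source_id` column matching the per-sample field, and the `full_name` column and the stand-in rows are hypothetical, used only so the snippet runs without downloading anything.

```python
def index_sources(source_rows):
    """Map source_id -> source_metadata row for constant-time lookup."""
    return {row["source_id"]: row for row in source_rows}


def attach_source(sample, source_index):
    """Return a copy of the sample with its source row under a 'source' key."""
    merged = dict(sample)  # shallow copy so the original sample is untouched
    merged["source"] = source_index[sample["source_id"]]
    return merged


# Hypothetical stand-in rows; real rows would come from the
# "source_metadata" and "bearings" configs. "full_name" is an assumed column.
source_rows = [
    {"source_id": "cwru", "full_name": "CWRU Bearing Data Center"},
    {"source_id": "mfpt", "full_name": "MFPT Fault Data Sets"},
]
sample = {"source_id": "cwru", "sample_id": "cwru_105", "rpm": 1750}

source_index = index_sources(source_rows)
joined = attach_source(sample, source_index)
print(joined["sample_id"], "->", joined["source"]["full_name"])
```

Because a loaded `datasets` split yields plain dictionaries when iterated, the same two helpers should apply to real rows, provided the key column is in fact named `source_id` on both sides.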
## Dataset Sources

### Bearings (~10,000 samples from 10 sources)

| Source | Samples | Component | Sensors | Unique Value |
|--------|---------|-----------|---------|--------------|
| [CWRU](https://engineering.case.edu/bearingdatacenter) | 40 | Ball bearing | Vibration | Standard benchmark |
| [MFPT](https://www.mfpt.org/fault-data-sets/) | 20 | Ball bearing | Vibration | Variable load |
| [FEMTO](https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository) | 3,569 | Ball bearing | Vibration, temperature | Run-to-failure (RUL) |
| [Mendeley](https://data.mendeley.com/datasets/vxkj334rzv/7) | 280 | Ball bearing | Vibration | **Speed transitions** (action-conditioning) |
| [XJTU-SY](https://github.com/WangBiaoXJTU/xjtu-sy-bearing-datasets) | 1,370 | Ball bearing | Vibration | Run-to-failure (RUL) |
| [IMS/NASA](https://data.nasa.gov/dataset/ims-bearings) | 1,256 | Ball bearing | Vibration | Run-to-failure (RUL) |
| [Paderborn](https://mb.uni-paderborn.de/kat/forschung/bearing-datacenter/) | 384 | Ball bearing | Vibration, **current** | Real + artificial faults |
| [MAFAULDA](https://www02.smt.ufrj.br/~offshore/mfs/page_01.html) | 800 | **Shaft+bearing** | Vibration, **acoustic**, tachometer | **Imbalance, misalignment** (shaft faults!) |
| [Ottawa](https://data.mendeley.com/datasets/y2px5tg92h/1) | 180 | Ball bearing | Vibration, **acoustic** | **Cage faults**, 3 health stages |
| [SCA Pulp Mill](https://data.mendeley.com/datasets/tdn96mkkpt/2) | 2,663 | Industrial bearing | Vibration | **Real industrial data** |
| [VBL-VA001](https://zenodo.org/records/7006575) | 800 | Shaft+bearing | Vibration (triaxial) | Misalignment, unbalance |
| [SEU](https://github.com/cathysiyu/Mechanical-datasets) | 140 | Drivetrain bearing | 8-ch (motor+gearbox) | Cross-component rig |

### Gearboxes (~1,225 samples from 4 sources)

| Source | Samples | Component | Sensors | Unique Value |
|--------|---------|-----------|---------|--------------|
| [OEDI](https://data.openei.org/submissions/623) | 20 | Spur gear | Vibration (4-ch) | Healthy vs gear crack |
| [PHM 2009](https://phmsociety.org/public-data-sets/) | 109 | Spur gear | Vibration, tachometer | Challenge data |
| [MCC5-THU](https://github.com/liuzy0708/MCC5-THU-Gearbox-Benchmark-Datasets) | 956 | Spur gear | Vibration | **Speed/load transitions** |
| [SEU](https://github.com/cathysiyu/Mechanical-datasets) | 140 | Planetary+parallel | 8-ch (motor+gearbox) | Cross-component rig |

## Component Types Covered

| Component | Sources | Fault Types |
|-----------|---------|-------------|
| **Bearings** | CWRU, MFPT, FEMTO, Mendeley, XJTU-SY, IMS, Paderborn, Ottawa, SCA | inner_race, outer_race, ball, cage, compound, degrading |
| **Gears** | OEDI, PHM2009, MCC5-THU, SEU | gear_crack, gear_wear, missing_tooth, tooth_break |
| **Shafts** | MAFAULDA, VBL-VA001 | imbalance, misalignment_horizontal, misalignment_vertical |
| **Drivetrains** | SEU | Combined motor+gearbox+bearing from single rig |
| **Industrial** | SCA Pulp Mill | Naturally occurring faults in real machinery |

## Sensor Modalities

| Modality | Sources | Channels |
|----------|---------|----------|
| **Vibration** (accelerometer) | All 16 | 1-8 channels per sample |
| **Motor current** | Paderborn, Mendeley (partial) | 2-3 phase current |
| **Acoustic** (microphone) | MAFAULDA, Ottawa | 1 channel |
| **Tachometer** | MAFAULDA, PHM2009, OEDI | 1 channel |
| **Temperature** | FEMTO | Scalar in slow_signals |
| **Torque** | SEU, MCC5-THU | 1 channel |

## Key Features

- **412 transition samples** for action-conditioning (Mendeley speed ramps + MCC5-THU speed/load transitions)
- **Episode/RUL fields** for prognostics (FEMTO, XJTU-SY, IMS, SCA)
- **Real industrial data** from SCA pulp mill (not lab)
- **Shaft faults** (imbalance, misalignment) from MAFAULDA and VBL-VA001
- **Acoustic data** from MAFAULDA and Ottawa (microphone alongside accelerometer)
- **Cross-component** drivetrain data from SEU (motor+gearbox+bearing on single rig)

## Per-Sample Schema

```python
{
    "source_id": "cwru",                 # FK to source_metadata
    "sample_id": "cwru_105",
    "signal": [[0.1, 0.2, ...]],         # (n_channels, signal_length)
    "n_channels": 2,
    "channel_names": ["DE_accel", "FE_accel"],
    "channel_modalities": ["vibration", "vibration"],
    "health_state": "faulty",            # healthy | faulty | degrading
    "fault_type": "inner_race",
    "fault_severity": None,
    "rpm": 1750,
    "load": 2.0,
    "load_unit": "hp",
    "episode_id": None,                  # For run-to-failure
    "episode_position": None,            # 0.0 to 1.0
    "rul_percent": None,                 # Remaining useful life
    "is_transition": False,              # Speed/load change
    "transition_type": None,             # ramp_speed | ramp_load
}
```

## v2 Training-Ready Config (Planned)

A standardized config for direct model training:

- Fixed sampling rate: 12,800 Hz
- Fixed window: 16,384 samples (1.28 seconds)
- Vibration-only (single modality)
- Per-sample instance normalization
- Source-disjoint train/val/test splits

## v2 Training-Ready Config (LIVE)

Standardized for direct model training. All sources resampled to common format.

```python
from datasets import load_dataset

# Load training-ready data (all splits in one, filter by 'split' column)
v2 = load_dataset("Forgis/Mechanical-Components", "v2_train", split="train")
v2_train = v2.filter(lambda x: x["split"] == "train")  # 20,143 samples
v2_val = v2.filter(lambda x: x["split"] == "val")      # 1,332 samples
v2_test = v2.filter(lambda x: x["split"] == "test")    # 6,363 samples
```

| Parameter | Value |
|-----------|-------|
| Sampling rate | 12,800 Hz |
| Window length | 16,384 samples (1.28 seconds) |
| Channels | 1 (primary vibration) |
| Normalization | Per-sample instance norm |
| Splits | Source-disjoint (train/val/test) |

**Train** (12 sources): CWRU, MFPT, FEMTO, XJTU-SY, IMS, OEDI, PHM2009, MCC5-THU, SEU, MAFAULDA, VBL, SCA-train

**Val** (2 sources): Paderborn, Ottawa

**Test** (2 sources): Mendeley, SCA-test

## Citations

Please cite the original datasets. See the `source_metadata` config for full citations per source.
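To make the v2 parameters concrete (12,800 Hz, 16,384-sample windows, per-sample instance normalization), here is a rough standard-library sketch of windowing and normalizing a single-channel signal. It is an illustration under assumptions, not the dataset's actual preprocessing: the README does not say whether windows overlap or how "instance norm" is defined, so non-overlapping windows and zero-mean, unit-variance scaling are assumed here, and resampling is left out entirely. All function names are hypothetical.

```python
import math
from statistics import fmean, pstdev

SAMPLING_RATE_HZ = 12_800   # v2 sampling rate from the table above
WINDOW_LENGTH = 16_384      # v2 window length in samples


def window_duration_seconds(n_samples=WINDOW_LENGTH, rate_hz=SAMPLING_RATE_HZ):
    """Duration of one window: samples divided by samples per second."""
    return n_samples / rate_hz


def split_into_windows(signal, window_length=WINDOW_LENGTH):
    """Cut a 1-D signal into non-overlapping windows, dropping a short tail.

    Non-overlapping windows are an assumption made for this sketch.
    """
    n_windows = len(signal) // window_length
    return [
        signal[i * window_length:(i + 1) * window_length]
        for i in range(n_windows)
    ]


def instance_normalize(window, eps=1e-8):
    """Scale one window to zero mean and (approximately) unit variance.

    This definition of instance normalization is assumed, not confirmed.
    """
    mean = fmean(window)
    std = pstdev(window, mu=mean)
    return [(x - mean) / (std + eps) for x in window]


# Toy signal with a tiny window so the example runs instantly.
toy_signal = [math.sin(0.3 * i) + 2.0 for i in range(40)]
toy_windows = split_into_windows(toy_signal, window_length=16)
normalized = instance_normalize(toy_windows[0])

print(window_duration_seconds())  # 16,384 / 12,800 = 1.28 seconds
print(len(toy_windows))           # 40 // 16 = 2 full windows
```

Normalizing each window independently removes amplitude offsets and scale differences between rigs, which is presumably why it is paired with source-disjoint splits; the trade-off is that absolute vibration level, which can itself indicate severity, is discarded.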

Provider:
Forgis