
ByteCameraDepth

ModelScope community · Updated 2026-01-06 · Indexed 2025-09-06
Download link:
https://modelscope.cn/datasets/ByteDance-Seed/ByteCameraDepth
Resource description:
# ByteCameraDepth Dataset

[Paper](https://huggingface.co/papers/2509.02530) | [Project Page](https://manipulation-as-in-simulation.github.io/) | [Code](https://github.com/ByteDance-Seed/manip-as-in-sim-suite)

ByteCameraDepth is a multi-camera depth estimation dataset containing synchronized depth, color, and auxiliary data captured from various 3D cameras. It covers a range of indoor scenarios recorded with multiple cameras, which makes it suitable for developing and evaluating depth estimation algorithms.

## Dataset Overview

- **Purpose**: Multi-camera depth estimation research and benchmarking
- **Total Sessions**: 39 recording sessions
- **Unpacked Size**: ~2.7TB
- **Data Collection System**: [Multi-Camera Depth Recording System](https://github.com/Ericonaldo/depth_recording)
- **License**: CC-BY-4.0

## Quick Start

### Data Extraction

The dataset is provided as split archive files. To extract the complete dataset:

```bash
cat recorded_data.tar.part.* | tar -xvf -
```

This creates a `recorded_data` folder containing all 39 recording sessions.

## Dataset Structure

### Archive Organization

```
recorded_data_packed/
├── recorded_data.tar.part.000
├── recorded_data.tar.part.001
├── ...
└── recorded_data.tar.part.136
```

### Extracted Data Structure

After extraction, the data is organized as follows:

```
recorded_data/
└── YYYYMMDD_HHMM/                  # Timestamp-based session folder (39 sessions total)
    ├── camera_realsense_455/       # Intel RealSense D455
    │   ├── depth_000.png           # 16-bit depth images
    │   ├── color_000.png           # 8-bit color images
    │   └── ...
    ├── camera_realsense_d405/      # Intel RealSense D405
    │   ├── depth_000.png
    │   ├── color_000.png
    │   └── ...
    ├── camera_realsense_d415/      # Intel RealSense D415
    │   ├── depth_000.png
    │   ├── color_000.png
    │   └── ...
    ├── camera_realsense_d435/      # Intel RealSense D435
    │   ├── depth_000.png
    │   ├── color_000.png
    │   └── ...
    ├── camera_realsense_l515/      # Intel RealSense L515
    │   ├── depth_000.png
    │   ├── color_000.png
    │   └── ...
    ├── camera_kinect/              # Microsoft Azure Kinect
    │   ├── depth_000.png           # 16-bit depth images
    │   ├── color_000.png           # 8-bit color images
    │   ├── ir_000.png              # Infrared images
    │   └── ...
    ├── camera_zed2i_neural/        # Stereolabs ZED2i (Neural mode)
    │   ├── raw_depth_000.npy       # 32-bit float depth arrays
    │   ├── depth_000.png           # 16-bit depth images
    │   ├── color_000.png           # Color images
    │   ├── pcd_000.npy             # Point cloud data (X,Y,Z)
    │   ├── normal_000.npy          # Surface normal vectors
    │   └── ...
    ├── camera_zed2i_performance/
    ├── camera_zed2i_quality/
    ├── camera_zed2i_ultra/
    └── ...
```

## Camera Systems and Specifications

The data was collected with our [depth recording toolkit](https://github.com/Ericonaldo/depth_recording):

### Intel RealSense Cameras

- **Models**: D405, D415, D435, D455, L515
- **Output**: `depth_xxx.png` (16-bit), `color_xxx.png` (8-bit)

### Microsoft Azure Kinect

- **Depth Resolution**: Wide FOV unbinned
- **Output**: `depth_xxx.png` (16-bit), `color_xxx.png` (8-bit), `ir_xxx.png` (infrared)

### Stereolabs ZED2i

- **Depth Resolution**: 1280×720
- **Depth Modes**: 4 different modes (neural, performance, quality, ultra)
- **Output**:
  - `raw_depth_xxx.npy` (32-bit float depth arrays)
  - `depth_xxx.png` (16-bit depth images)
  - `color_xxx.png` (8-bit color images)
  - `pcd_xxx.npy` (point cloud data)
  - `normal_xxx.npy` (surface normal vectors)

## Data Formats

### File Types and Specifications

| Data Type | Format | Bit Depth | Description |
|-----------|--------|-----------|-------------|
| Depth Images | PNG | 16-bit | Standard depth maps |
| Color Images | PNG | 8-bit RGB | Color/texture images |
| Raw Depth | NPY | 32-bit float | High-precision depth (ZED2i only) |
| Point Clouds | NPY | 32-bit float | 3D point coordinates (X,Y,Z) |
| Surface Normals | NPY | 32-bit float | Surface normal vectors |
| Infrared | PNG | 8-bit | IR images (Kinect only) |

### Depth Data

For most cameras the depth unit is millimeters, so dividing the raw depth by 1000 gives depth in meters. The RealSense D405 and L515 use different scales, 2500 and 10000 respectively: divide their raw depth by 2500 (D405) or 10000 (L515) to obtain depth in meters.

### File Naming Convention

- Sequential numbering: `xxx` represents the frame index (000, 001, 002, ...)
- Synchronized capture: The same frame number across cameras represents simultaneous capture
- Camera identification: Folder names identify the camera type and model

## 📄 Citation

If you use this dataset in your research, please cite:

```bibtex
@article{liu2025manipulation,
  title={Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots},
  author={Liu, Minghuan and Zhu, Zhengbang and Han, Xiaoshen and Hu, Peng and Lin, Haotong and Li, Xinyao and Chen, Jingxiao and Xu, Jiafeng and Yang, Yichu and Lin, Yunfeng and Li, Xinghang and Yu, Yong and Zhang, Weinan and Kong, Tao and Kang, Bingyi},
  journal={arXiv preprint},
  year={2025},
  url={https://huggingface.co/papers/2509.02530}
}
```

## License

This dataset is released under the CC BY 4.0 License.
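
The depth-unit rule and the file naming convention described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the official toolkit: the helper names (`depth_divisor`, `raw_to_meters`, `frame_path`) and the session folder used in the example are hypothetical, the divisors are copied from the Depth Data section, and the three-digit zero padding is inferred from the `000, 001, 002` pattern. The dataset card does not state the unit of the ZED2i `raw_depth_xxx.npy` arrays, so the sketch is meant for the 16-bit `depth_xxx.png` values.

```python
from pathlib import Path

# Divisors that convert raw depth values to meters, as stated in the
# "Depth Data" section. Keys are the camera folder names from the
# extracted data structure.
DEFAULT_DIVISOR = 1000.0  # most cameras store depth in millimeters
SPECIAL_DIVISORS = {
    "camera_realsense_d405": 2500.0,
    "camera_realsense_l515": 10000.0,
}


def depth_divisor(camera_folder: str) -> float:
    """Return the divisor for a camera folder (hypothetical helper)."""
    return SPECIAL_DIVISORS.get(camera_folder, DEFAULT_DIVISOR)


def raw_to_meters(raw, camera_folder: str):
    """Convert raw depth to meters.

    Works for plain numbers and for anything that supports true
    division, such as a NumPy array holding a decoded depth image.
    """
    return raw / depth_divisor(camera_folder)


def frame_path(session_dir, camera_folder: str, kind: str, index: int,
               ext: str = "png") -> Path:
    """Build e.g. <session>/camera_kinect/depth_007.png.

    Assumes three-digit zero padding, inferred from the naming convention.
    """
    return Path(session_dir) / camera_folder / f"{kind}_{index:03d}.{ext}"


if __name__ == "__main__":
    # "20250101_1200" is a made-up session name in YYYYMMDD_HHMM form.
    session = Path("recorded_data") / "20250101_1200"
    print(frame_path(session, "camera_kinect", "depth", 7))
    print(raw_to_meters(1500, "camera_kinect"))          # 1.5
    print(raw_to_meters(2500, "camera_realsense_d405"))  # 1.0
```

When decoding the depth PNGs, use a reader that preserves the 16-bit values (for example OpenCV's `cv2.imread(path, cv2.IMREAD_UNCHANGED)`) before passing the array to `raw_to_meters`.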

Provider: maas
Created: 2025-09-03