ZhengGuangze/Kubric_vlbm
Hugging Face | Updated 2026-03-27 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/ZhengGuangze/Kubric_vlbm
Resource description:
---
license: apache-2.0
---
# Kubric (converted to VLBM format)
This dataset contains 11,000 sequences from the Kubric (TAPVid3D) dataset converted to the VLBM-compatible format using `preprocess_kubric.py`. The sequences have been compressed into `.tar.gz` archives in chunks of 50 sequences per archive.
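Unpacking the archives can be sketched as follows. This is a minimal example, not part of the dataset tooling: the function name `extract_archives`, the `*.tar.gz` glob pattern, and the directory names in the usage comment are assumptions, since the archive file names are not listed here.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_dir, out_dir):
    """Extract every .tar.gz archive found in archive_dir into out_dir.

    Returns the list of archives that were extracted, in sorted order.
    Hypothetical helper: the archive naming scheme is an assumption.
    """
    archive_dir, out_dir = Path(archive_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version has it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted = []
    for archive in sorted(archive_dir.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out_dir, **kwargs)
        extracted.append(archive)
    return extracted


# Example call (paths are placeholders):
# extract_archives("downloads/Kubric_vlbm", "data/kubric_vlbm")
```

Extracting all archives into one output directory should yield one `<seq_id>/` folder per sequence, assuming each archive stores the sequence folders at its top level.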
## Dataset Description
- **Source**: [Kubric / TAPVid3D](https://github.com/google-deepmind/kubric) (converted)
- **Format**: VLBM-compatible per-sequence layout
- **Contents**: RGB images, dense depth maps, 2D/3D trajectories, camera intrinsics and extrinsics, visibilities, and scene metadata
### Scale
| Metric | Value |
|---|---|
| Total sequences | 11,000 |
| Frames per sequence | 24 |
| Image resolution | 512 x 512 px |
| Depth type | Dense (ground truth) |
## Dataset Structure
Each sequence directory follows this layout:
```
<seq_id>/
├── rgbs/
│ ├── rgb_00000.jpg
│ ├── rgb_00001.jpg
│ └── ...
├── depths/
│ ├── depth_00000.npz
│ ├── depth_00001.npz
│ └── ...
├── intrinsics.npy
├── extrinsics.npy
├── trajs_2d.npy
├── trajs_3d.npy
├── visibilities.npy
└── scene_info.json
```
### File Descriptions
- `rgbs/`: RGB frames saved as JPEG (`rgb_XXXXX.jpg`). Resolution is 512x512 pixels (converted from source PNG).
- `depths/`: Dense depth maps saved as compressed NumPy archives (`depth_XXXXX.npz`). Each archive stores a float16 array of shape `(H, W)`, in meters, under the key `depth`. Converted from Kubric 16-bit depth PNGs by a linear mapping over `depth_range` (near, far).
- `intrinsics.npy`: Per-frame camera intrinsic matrices, `(T, 3, 3)` float16. Converted from Kubric NDC intrinsics to pixel space.
- `extrinsics.npy`: Per-frame world-to-camera extrinsic matrices (W2C, OpenCV convention), `(T, 4, 4)` float16. Converted from Kubric OpenGL camera-to-world matrices.
- `trajs_2d.npy`: 2D trajectories `(T, N, 2)` float16 -- pixel coordinates (x, y).
- `trajs_3d.npy`: 3D trajectories `(T, N, 3)` float16 -- world-space coordinates (x, y, z).
- `visibilities.npy`: Visibility flags `(T, N)` float16 (1.0 visible, 0.0 not visible). Inverted from Kubric's `occluded` flag.
- `scene_info.json`: JSON file with per-sequence metadata including `num_frames`, `image_size`, `num_trajectories`, `source`, and `depth_range`.
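The shapes listed above can be checked against each other after loading. The following is a minimal sketch; `check_sequence_arrays` is a hypothetical helper, not part of the dataset tooling.

```python
import numpy as np


def check_sequence_arrays(trajs_2d, trajs_3d, vis, intrinsics, extrinsics):
    """Verify that the arrays follow the documented (T, N, ...) layout.

    T and N are taken from the visibility array. Raises ValueError on the
    first mismatch and returns (T, N) otherwise.
    """
    if vis.ndim != 2:
        raise ValueError(f"visibilities: expected 2 dims, got {vis.ndim}")
    T, N = vis.shape
    expected = {
        "trajs_2d": (trajs_2d, (T, N, 2)),
        "trajs_3d": (trajs_3d, (T, N, 3)),
        "intrinsics": (intrinsics, (T, 3, 3)),
        "extrinsics": (extrinsics, (T, 4, 4)),
    }
    for name, (arr, shape) in expected.items():
        if arr.shape != shape:
            raise ValueError(f"{name}: expected {shape}, got {arr.shape}")
    return T, N
```

Because the arrays are stored as float16, it is usually worth casting them to float32 (`arr.astype(np.float32)`) before doing arithmetic. float16 values between 256 and 512 are spaced 0.25 apart, so pixel coordinates in that range are quantized to a quarter pixel.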
## Conversion Details
Key conversions from the original kubric_tapip3d format:
1. **Coordinate system**: OpenGL → OpenCV (Y and Z axes flipped)
2. **Intrinsics**: NDC → pixel-space (focal length scaled by image dimensions, principal point at image center)
3. **Extrinsics**: Camera-to-world (OpenGL) → World-to-camera (OpenCV) via inversion + GL2CV transform
4. **Depth**: uint16 PNG → float16 meters using `depth_m = uint16 / 65535 * (far - near) + near`
5. **Visibility**: `occluded` (True=occluded) → `visibility` (1.0=visible)
6. **Array layout**: `(N, T, *)` → `(T, N, *)`
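Items 3 to 6 can be illustrated with a short NumPy sketch. It is not the code of `preprocess_kubric.py`, which is not reproduced here. The function names are hypothetical, and the input layouts (a stacked `(T, 4, 4)` camera-to-world array and an `(N, T)` boolean occlusion array) are assumptions. The intrinsics conversion (item 2) is left out because the exact NDC convention is not specified above.

```python
import numpy as np

# OpenGL camera axes (x right, y up, z backward) to
# OpenCV camera axes (x right, y down, z forward): flip Y and Z.
GL2CV = np.diag([1.0, -1.0, -1.0, 1.0])


def depth_uint16_to_meters(depth_u16, near, far):
    """Item 4: map uint16 values in [0, 65535] linearly onto [near, far] meters."""
    return depth_u16.astype(np.float32) / 65535.0 * (far - near) + near


def gl_c2w_to_cv_w2c(c2w_gl):
    """Item 3: OpenGL camera-to-world (T, 4, 4) to OpenCV world-to-camera (T, 4, 4).

    Inverting gives world-to-camera in OpenGL axes; left-multiplying by GL2CV
    re-expresses the camera-space result in OpenCV axes.
    """
    return GL2CV @ np.linalg.inv(c2w_gl)


def occluded_to_visibility(occluded_nt):
    """Items 5 and 6: invert an (N, T) boolean occlusion array, return (T, N) floats."""
    return (~occluded_nt).T.astype(np.float32)
```

The sketch returns float32 arrays. Storing them in the format described here would need a final cast to float16. The `near` and `far` values correspond to the `depth_range` entry in `scene_info.json`.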
## Data Specifications
- **Image format**: JPEG (RGB), 512x512 px
- **Depth format**: NPZ (float16), dense (ground truth from Kubric)
- **Annotation format**: Individual `.npy` files (float16)
- **Coordinate system**: x=right, y=down, z=forward (OpenCV camera space)
- **Extrinsics**: World-to-camera (W2C) 4x4 matrices (OpenCV convention)
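Under these conventions, a world-space point can be mapped to pixel coordinates with the standard pinhole model. The sketch below assumes no lens distortion. `project_world_to_pixels` is a hypothetical helper, not part of the dataset tooling.

```python
import numpy as np


def project_world_to_pixels(points_world, w2c, K):
    """Project (N, 3) world points using a 4x4 W2C matrix and 3x3 intrinsics.

    Returns (N, 2) pixel coordinates (x, y) and the (N,) camera-space depth z.
    Points with z <= 0 lie on or behind the camera plane; their pixel
    coordinates are not meaningful.
    """
    points_world = np.asarray(points_world, dtype=np.float32)
    w2c = np.asarray(w2c, dtype=np.float32)
    K = np.asarray(K, dtype=np.float32)
    # Rigid transform from world space into OpenCV camera space.
    points_cam = points_world @ w2c[:3, :3].T + w2c[:3, 3]
    z = points_cam[:, 2]
    # Pinhole projection followed by the perspective divide.
    uvw = points_cam @ K.T
    pixels = uvw[:, :2] / uvw[:, 2:3]
    return pixels, z
```

For frame `t`, calling `project_world_to_pixels(trajs_3d[t], extrinsics[t], intrinsics[t])` and comparing the result with `trajs_2d[t]` on visible points is a quick consistency check. Agreement is expected only up to float16 rounding, and only if the 2D trajectories were produced by the same pinhole model, which is an assumption here.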
## Usage Example (Python)
```python
import numpy as np
from PIL import Image
from pathlib import Path
import json
seq_dir = Path("data/kubric_vlbm/000035")
# Load annotations
trajs_2d = np.load(seq_dir / "trajs_2d.npy") # (T, N, 2)
trajs_3d = np.load(seq_dir / "trajs_3d.npy") # (T, N, 3)
vis = np.load(seq_dir / "visibilities.npy") # (T, N)
intrinsics = np.load(seq_dir / "intrinsics.npy") # (T, 3, 3)
extrinsics = np.load(seq_dir / "extrinsics.npy") # (T, 4, 4)
# Load an image and depth map
frame_idx = 0
rgb = Image.open(seq_dir / "rgbs" / f"rgb_{frame_idx:05d}.jpg")
depth_npz = np.load(seq_dir / "depths" / f"depth_{frame_idx:05d}.npz")
depth = depth_npz['depth'] # float16 array (H, W)
# Load scene info
with open(seq_dir / "scene_info.json", 'r') as f:
    scene_info = json.load(f)
print(scene_info)
```
## Citation
Please cite the original Kubric dataset when using the converted data:
```bibtex
@inproceedings{greff2022kubric,
title={Kubric: A scalable dataset generator},
author={Greff, Klaus and Belletti, Francois and Beyer, Lucas and Doersch, Carl and Du, Yilun and Duckworth, Daniel and Fleet, David J. and Gnanapragasam, Dan and Golemo, Florian and Herrmann, Charles and others},
booktitle={CVPR},
year={2022}
}
```
Provider:
ZhengGuangze



