Voxel51/reflect3er

Name: Voxel51/reflect3er
Creator: Voxel51
Published: 2026-03-17 18:24:28
License: MIT

Source: Hugging Face; updated 2026-03-17; indexed 2026-04-05

Download link:

https://hf-mirror.com/datasets/Voxel51/reflect3er


Description:

---
annotations_creators: []
language: en
size_categories:
- n<1K
task_categories:
- image-segmentation
task_ids: []
pretty_name: reflect3r
tags:
- fiftyone
- group
- image-segmentation
- 3d
- threed
dataset_summary: >
  This is a [FiftyOne](https://github.com/voxel51/fiftyone) dataset with 16 samples.

  ## Installation

  If you haven't already, install FiftyOne:

  ```bash
  pip install -U fiftyone
  ```

  ## Usage

  ```python
  import fiftyone as fo
  from fiftyone.utils.huggingface import load_from_hub

  # Load the dataset
  # Note: other available arguments include 'max_samples', etc
  dataset = load_from_hub("Voxel51/reflect3er")

  # Launch the App
  session = fo.launch_app(dataset)
  ```
license: mit
---

# Dataset Card for reflect3r

![image/png](relfect3r.gif)

This is a [FiftyOne](https://github.com/voxel51/fiftyone) grouped dataset containing the synthetic evaluation benchmark from [Reflect3r: Single-View 3D Stereo Reconstruction Aided by Mirror Reflections](https://arxiv.org/abs/2509.20607) (3DV 2026). It contains 16 synthetic Blender interior scenes, each with a mirror, rendered from both a real camera and a geometrically derived virtual mirror camera, along with ground-truth point clouds.

## Installation

```bash
pip install -U fiftyone openexr
```

## Usage

```python
import fiftyone as fo
from huggingface_hub import snapshot_download

# Download the dataset snapshot to the current working directory
snapshot_download(
    repo_id="Voxel51/reflect3er",
    local_dir=".",
    repo_type="dataset",
)

# Load dataset from current directory using FiftyOne's native format
dataset = fo.Dataset.from_dir(
    dataset_dir=".",  # Current directory contains the dataset files
    dataset_type=fo.types.FiftyOneDataset,  # Specify FiftyOne dataset format
    name="reflect3er",  # Assign a name to the dataset for identification
)
```

## Dataset Details

### Dataset Description

Reflect3r is a synthetic evaluation dataset constructed to benchmark single-view 3D reconstruction methods in the presence of mirror reflections.
The core insight of the accompanying paper is that a mirror in a scene provides a second, geometrically consistent viewpoint for free: a virtual camera whose pose is fully determined by reflecting the real camera pose across the mirror plane. This transforms an ostensibly single-view problem into a stereo reconstruction problem.

The dataset consists of 16 photorealistic interior Blender scenes (bedrooms, living rooms, gyms, bathrooms, etc.), each manually augmented with a mirror surface positioned in a plausible location. Each scene is rendered from two cameras: the real physical camera (`Cam_Main`) and the virtual mirror camera (`Cam_Mirror`), whose extrinsics are derived via the Householder reflection matrix. Ground-truth XYZRGB point clouds are provided for quantitative evaluation.

- **Created by:** Jing Wu, Zirui Wang, Iro Laina, Victor Adrian Prisacariu (University of Oxford)
- **Funded by:** Zirui Wang is supported by an ARIA research gift grant from Meta Reality Lab
- **License:** MIT
- **Paper:** [arXiv:2509.20607](https://arxiv.org/abs/2509.20607)
- **Project page:** [https://jingwu2121.github.io/reflect3r/](https://jingwu2121.github.io/reflect3r/)
- **GitHub:** [https://github.com/jingwu2121/reflect3r](https://github.com/jingwu2121/reflect3r)

---

## FiftyOne Dataset Structure

This dataset is loaded as a **FiftyOne grouped dataset** with 16 groups (one per scene) and 3 slices per group. The default slice is `cam_main`.

### Group Slices

| Slice | Media type | Primary file | Fields |
|---|---|---|---|
| `cam_main` | Image | `Cam_Main/rgb_0001.png` | `depth`, `inside_mask`, `outside_mask`, `intrinsics`, `extrinsics`, `clip_params`, `scene_name` |
| `cam_mirror` | Image | `Cam_Mirror/rgb_0001.png` | `depth`, `flipped_inside_mask`, `intrinsics`, `extrinsics`, `clip_params`, `scene_name` |
| `cam_mirror_3d` | 3D | `point_cloud_gt.fo3d` | `scene_name` |

### Field Descriptions

**`depth`** (`fo.Heatmap`): Per-pixel metric depth rendered from the Blender scene.
Stored as a normalized uint8 grayscale PNG derived from the original EXR files. See [Depth Normalization](#depth-normalization) below.

**`inside_mask` / `outside_mask`** (`fo.Segmentation`, on `cam_main`): Binary segmentation masks separating mirror interior (`inside_mask`, class 1) from the surrounding real scene (`outside_mask`, class 1). These are in the coordinate frame of `Cam_Main`.

**`flipped_inside_mask`** (`fo.Segmentation`, on `cam_mirror`): The mirror region mask horizontally flipped to align with the virtual camera's coordinate frame. This is the mask used by the Reflect3r pipeline to isolate the reflection region as seen from the virtual camera's perspective.

**`intrinsics`** (`list[list[float]]`): 3×3 camera intrinsic matrix K stored as a nested Python list. Both cameras share the same intrinsics per scene, though focal length varies across scenes (e.g. `gym` uses fx=1600 while most others use fx≈2667).

**`extrinsics`** (`list[list[float]]`): 4×4 camera-to-world transform stored as a nested Python list. The `Cam_Mirror` extrinsics are the reflection of `Cam_Main` extrinsics across the mirror plane, derived via the Householder matrix: `C_vir = diag(-1,1,1,1) · (I - 2nn⊤) · C_real`.

**`clip_params`** (`list[float]`): `[near, far]` clipping distances in metres used during Blender rendering.

**`scene_name`** (`str`): The scene identifier (e.g. `archiviz`, `gym`, `terrazzo`).

---

## Parsing Decisions

Several non-trivial choices were made when converting the raw rendered data into FiftyOne format.

### What Was Ignored

The `imgs/` subdirectory in each scene contains pre-composited and masked variants of the main image (`image.png`, `image_outside_masked.png`, `flipped_image_inside_masked.png`, `outside.png`). These are fully derivable from `Cam_Main/rgb_0001.png` combined with the masks and were excluded to avoid redundancy. The `blender_source_files/` directory containing raw `.blend` files and texture assets was also excluded.
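
To make the `extrinsics` relation from the Field Descriptions more concrete, the following is a minimal sketch of the Householder reflection step applied to a 4×4 camera-to-world matrix stored as a nested list. It is not the toolkit's implementation (that lives in `add_mirrored_cam.py`). The helper names `householder_4x4`, `matmul`, and `reflect_pose`, the plane parameterisation `n · x = offset`, and the example pose are all assumptions made for illustration. The `diag(-1,1,1,1)` factor from the card's formula is deliberately left out, since its exact convention should be taken from the toolkit.

```python
import math


def householder_4x4(normal, offset=0.0):
    """Homogeneous reflection across the plane n . x = offset.

    `normal` is normalised here, so `offset` is the signed distance of the
    plane from the origin along the unit normal (an assumed convention).
    """
    length = math.sqrt(sum(c * c for c in normal))
    n = [c / length for c in normal]
    rows = []
    for i in range(3):
        # Upper-left 3x3 block: I - 2 n n^T
        row = [(1.0 if i == j else 0.0) - 2.0 * n[i] * n[j] for j in range(3)]
        # Translation column, needed when the plane does not pass through the origin
        row.append(2.0 * offset * n[i])
        rows.append(row)
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def reflect_pose(c_real, normal, offset=0.0):
    """Reflect a camera-to-world matrix across the mirror plane (no axis flip)."""
    return matmul(householder_4x4(normal, offset), c_real)


# Hypothetical example: mirror plane x = 1, camera at (3, 2, 0), identity rotation
c_real = [
    [1.0, 0.0, 0.0, 3.0],
    [0.0, 1.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
mirror_normal = [1.0, 0.0, 0.0]
mirror_offset = 1.0

c_reflected = reflect_pose(c_real, mirror_normal, mirror_offset)
camera_centre = [row[3] for row in c_reflected[:3]]
print(camera_centre)
```

In this example the reflected camera centre is the mirror image of (3, 2, 0) across the plane x = 1. The rotation block of the reflected pose has determinant -1, which is presumably what the extra `diag(-1,1,1,1)` factor in the card's formula compensates for.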
### Depth Normalization

The Blender-rendered `depth_png_0001.png` files are unusable: they are all-white because Blender normalizes depth over the full `[near, far]` clip range (typically 0.1 m to 1000 m), which collapses all real scene depth variation into a tiny portion of the value range.

Instead, the raw `depth_exr_0001.exr` files are read directly. Blender stores metric depth identically in the R, G, B channels of the EXR. Some scenes (e.g. `gym`, `terrazzo`, `livingroom`) contain pixels with a sentinel value of ~1×10¹⁰ m assigned to background geometry, transparent surfaces, and the mirror plane itself (which has no real depth). These pixels are excluded from the normalization range and mapped to 255 (farthest depth) in the output. Valid pixels are min-max normalized per-image to uint8 and saved as `depth_norm_0001.png`.

### Mask Binarization

The source mask PNGs (`inside_mask.png`, `outside_mask.png`, `flipped_inside_mask.png`) use pixel values `{0, 255}`. FiftyOne's `fo.Segmentation` treats pixel values as integer class indices, and class 255 has no guaranteed color in the viewer's default palette, causing masks to render as invisible. The masks are remapped to `{0, 1}` and saved as `*_bin.png` files, ensuring class 1 is reliably rendered.

### 3D Slice Placement

Each scene's ground-truth point cloud (`point_cloud_gt.ply`) is associated with the `cam_mirror_3d` slice rather than `cam_main`. This is a deliberate semantic choice: the GT point cloud is the reconstruction target for the virtual mirror camera, which is the central contribution of the paper. FiftyOne allows only one 3D slice per group, so this placement best reflects the paper's intent. The `.fo3d` scene is written with `up="Z"` to match Blender's coordinate convention.

---

## Dataset Creation

### Source Scenes

The 16 Blender scenes were sourced from [Blender Demo](https://download.blender.org/demo/), [BlenderKit](https://www.blenderkit.com/), and [CGTrader](https://www.cgtrader.com/).
Each was manually augmented by the authors with a mirror surface. In some scenes, additional geometry was modelled to ensure consistent scene complexity. A full list of original source URLs is provided in the dataset README.

### Rendering

Scenes were rendered using Blender Cycles. The Blender toolkit provided with the dataset ([`render_depth.py`](https://github.com/jingwu2121/reflect3r/blob/main/data_toolkit/render_depth.py)) renders RGB, depth (EXR and PNG), and camera parameters for both cameras. The virtual camera pose is computed via the reflection transformation in [`add_mirrored_cam.py`](https://github.com/jingwu2121/reflect3r/blob/main/data_toolkit/add_mirrored_cam.py). All images are 1920×1080.

### Ground-Truth Point Clouds

GT point clouds were generated using [`syn_gt_point_cloud_gen.py`](https://github.com/jingwu2121/reflect3r/blob/main/data_toolkit/syn_gt_point_cloud_gen.py) and saved as binary little-endian PLY files with XYZRGB vertex attributes (Open3D format). Point counts range from hundreds of thousands to several million points per scene.

---

## Evaluation

The dataset is used to evaluate 3D reconstruction quality using four metrics: **accuracy**, **completeness**, **F1 score**, and **Chamfer distance**. Accuracy and completeness measure the percentage of predicted-to-GT and GT-to-predicted nearest-neighbour distances, respectively, that fall below a 1 cm threshold. Chamfer distance measures average nearest-neighbour distance between the two point sets.

The paper benchmarks Reflect3r against DUSt3R, MASt3R, VGGT, and MoGe. All baselines fail to correctly handle mirror regions, either hallucinating false geometry or producing degenerate flat reconstructions, while Reflect3r recovers correct geometry for both the real and reflected portions of the scene.
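
The four metrics above can be sketched in a few lines. This is an illustrative brute-force version, not the paper's evaluation code. The function names are hypothetical, the F1 score is assumed to be the harmonic mean of accuracy and completeness, and the Chamfer distance is assumed to be the mean of the two directional average distances. Coordinates are taken to be in metres, so the 1 cm threshold is `0.01`.

```python
import math


def nn_distances(src, dst):
    """Distance from each point in `src` to its nearest neighbour in `dst` (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def reconstruction_metrics(pred, gt, threshold=0.01):
    """Accuracy, completeness, F1 and Chamfer distance for two point sets in metres."""
    d_pred_to_gt = nn_distances(pred, gt)
    d_gt_to_pred = nn_distances(gt, pred)

    # Fraction of nearest-neighbour distances below the threshold, in each direction
    accuracy = sum(d < threshold for d in d_pred_to_gt) / len(d_pred_to_gt)
    completeness = sum(d < threshold for d in d_gt_to_pred) / len(d_gt_to_pred)

    # Assumed definition: harmonic mean of accuracy and completeness
    total = accuracy + completeness
    f1 = 2.0 * accuracy * completeness / total if total > 0 else 0.0

    # Assumed definition: mean of the two directional average distances
    chamfer = 0.5 * (
        sum(d_pred_to_gt) / len(d_pred_to_gt) + sum(d_gt_to_pred) / len(d_gt_to_pred)
    )
    return {
        "accuracy": accuracy,
        "completeness": completeness,
        "f1": f1,
        "chamfer": chamfer,
    }


# Toy point sets (made up for illustration): two predictions lie 5 mm from GT, one is far off
gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pred_points = [(0.0, 0.0, 0.005), (1.0, 0.0, 0.005), (5.0, 5.0, 5.0)]

metrics = reconstruction_metrics(pred_points, gt_points)
print(metrics)
```

The brute-force search is quadratic in the number of points. For the real ground-truth clouds, which the card says hold hundreds of thousands to millions of points, a spatial index such as a k-d tree would be the practical choice.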
---

## Citation

```bibtex
@article{wu2026reflect3r,
  author  = {Wu, Jing and Wang, Zirui and Laina, Iro and Prisacariu, Victor},
  title   = {{Reflect3r: Single-View 3D Stereo Reconstruction Aided by Mirror Reflections}},
  journal = {3DV},
  year    = {2026},
}
```

