taegyoun88/egoxtreme
Hugging Face · Updated 2026-04-06 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/taegyoun88/egoxtreme
Dataset description:
---
language:
- en
license: cc-by-nc-4.0
size_categories:
- 1M<n<10M
pretty_name: EgoXtreme
task_categories:
- object-detection
tags:
- 6d-pose-estimation
- egocentric-vision
- bop-format
---
# EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions
[Project Page](https://taegyoun88.github.io/EgoXtreme/)
[Code](https://github.com/taegyoun88/EgoXtreme)
[arXiv](https://arxiv.org/abs/2603.25135)
[Test Set](https://huggingface.co/datasets/taegyoun88/egoxtreme-test)
## 📖 Dataset Information
EgoXtreme is a novel large-scale dataset designed for robust egocentric 6D object pose estimation under extreme environmental conditions. The dataset comprises approximately 1.3 million frames with a total duration of 775.5 minutes (~12.9 hours). It was captured at 30 fps using Aria glasses, providing high-resolution 1408 x 1408 raw fisheye RGB images along with their undistorted versions.
The dataset features 15 participants performing diverse interactions with 13 different objects (including sports equipment, assembly blocks, and emergency supplies). It is divided into training (518.8 min), validation (80.7 min), and test (176 min) sets across three challenging scenarios: Industrial Maintenance, Sports, and Emergency Rescue.
> **Note on Test Set:** For fair evaluation, the GT annotations for the test set are withheld. The test images can be downloaded from our separate repository: [taegyoun88/egoxtreme-test](https://huggingface.co/datasets/taegyoun88/egoxtreme-test).
## 🛠️ Sample Usage
The [official repository](https://github.com/taegyoun88/EgoXtreme) provides tools to process and visualize the data.
### Undistortion
Because of their large file size, the undistorted versions of the data are generated locally with a script. To generate undistorted RGB images and masks, run:
```bash
# Process a specific scene
python tools/undistortion.py --data_dir ./data/train --scene_id 000000
# Process all scenes in train/test set
python tools/undistortion.py --data_dir ./data/train --all
```
### Visualization
To visualize the Ground Truth 6D pose on the images:
```bash
# Visualize specific scene (Add --undist for undistorted images, --im_id for single frame)
python tools/visualization.py --data_dir ./data/test --scene_id 000000 --models_dir ./models [--undist] [--im_id 0]
```
## 🎛️ Scenario Configurations
The detailed configurations of illumination and environmental conditions for each scenario are summarized below:
<table>
<thead>
<tr>
<th rowspan="2">Scenario</th>
<th rowspan="2">Standard<br><span style="font-size: 0.8em; font-weight: normal;">(normal, middle, high)</span></th>
<th colspan="5">Extreme</th>
<th rowspan="2">Smoke</th>
<th rowspan="2">Objects</th>
</tr>
<tr style="font-size: 0.85em; font-weight: normal;">
<th> low </th>
<th> head </th>
<th> flash </th>
<th> warning </th>
<th style="border-right: 1px solid rgba(128, 128, 128, 0.2);"> green </th>
</tr>
</thead>
<tbody style="text-align: center;">
<tr>
<td style="text-align: left;"><strong>Maintenance</strong></td>
<td>✔️</td>
<td>✔️</td>
<td>✔️</td>
<td>✔️</td>
<td></td>
<td></td>
<td>✔️</td>
<td>5</td>
</tr>
<tr>
<td style="text-align: left;"><strong>Sports</strong></td>
<td>✔️</td>
<td>✔️</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>5</td>
</tr>
<tr>
<td style="text-align: left;"><strong>Emergency</strong></td>
<td>✔️</td>
<td>✔️</td>
<td></td>
<td></td>
<td>✔️</td>
<td>✔️</td>
<td>✔️</td>
<td>3</td>
</tr>
</tbody>
</table>
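The table above can also be written down as a small lookup structure, which is convenient when filtering experiments by condition. The sketch below is only a transcription of the table: the dictionary layout and the key names (`lights`, `smoke`, `objects`) are illustrative assumptions, not part of the dataset or its tools, and the last column is assumed to be the number of objects per scenario.

```python
# Transcription of the scenario table. Key names are illustrative,
# not field names used by the dataset's own metadata files.
SCENARIOS = {
    "Maintenance": {
        "lights": ["standard", "low", "head", "flash"],
        "smoke": True,
        "objects": 5,
    },
    "Sports": {
        "lights": ["standard", "low"],
        "smoke": False,
        "objects": 5,
    },
    "Emergency": {
        "lights": ["standard", "low", "warning", "green"],
        "smoke": True,
        "objects": 3,
    },
}


def scenarios_with(light):
    """Return the scenarios that include the given lighting condition."""
    return [name for name, cfg in SCENARIOS.items() if light in cfg["lights"]]


# The per-scenario object counts add up to the 13 objects in the dataset.
total_objects = sum(cfg["objects"] for cfg in SCENARIOS.values())
print(total_objects)
print(scenarios_with("flash"))
```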
Below is the mapping of Scene IDs to their corresponding scenarios across the dataset splits:
| Split | Scenario | Scene IDs |
| :--- | :--- | :--- |
| **Train** | Maintenance | `000000` - `000211` |
| | Sports | `000212` - `000417` |
| | Emergency | `000418` - `000573` |
| **Validation** | Maintenance | `000000` - `000039` |
| | Sports | `000040` - `000067` |
| | Emergency | `000068` - `000079` |
For further fine-grained environmental attributes (e.g., specific light conditions and the presence of smoke) of each sequence, please refer to the sequence-level metadata JSON files.
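As a convenience, the scene ID ranges in the table above can be turned into a small lookup helper. This is a minimal sketch: `SCENE_RANGES` and `scenario_of` are hypothetical names introduced here for illustration, not functions from the official repository, and the sequence-level metadata files remain the authoritative source.

```python
# Inclusive scene ID ranges per split, copied from the table above.
SCENE_RANGES = {
    "train": [("Maintenance", 0, 211), ("Sports", 212, 417), ("Emergency", 418, 573)],
    "val": [("Maintenance", 0, 39), ("Sports", 40, 67), ("Emergency", 68, 79)],
}


def scenario_of(split, scene_id):
    """Map a scene ID (e.g. "000212" or 212) to its scenario name."""
    sid = int(scene_id)  # zero-padded strings such as "000212" parse as 212
    for name, first, last in SCENE_RANGES[split]:
        if first <= sid <= last:
            return name
    raise ValueError(f"scene {scene_id!r} is outside the known range for split {split!r}")


print(scenario_of("train", "000212"))
print(scenario_of("val", 79))
```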
## 📁 Dataset Structure & Format
All annotation files (`*.json`) and the 3D model information follow the **BOP format**.
The structure of the data hosted here is organized as follows:
```text
EgoXtreme
├── models/ # 3D CAD models (.ply) and info
├── train/
│ ├── 000000/ # Scene ID
│ │ ├── rgb/ # Raw fisheye RGB images
│ │ ├── mask/ # Full object masks
│ │ ├── scene_camera.json
│ │ ├── scene_gt.json
│ │ ├── scene_gt_info.json
│ │ └── scene_camera_undist.json
│ └── ...
├── val/
│ ├── 000000/
│ └── ...
├── camera.json
├── metadata_train.json # Sequence-level metadata (light, smoke, scenario)
└── metadata_val.json # Sequence-level metadata (light, smoke, scenario)
```
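Since the annotations follow the BOP format, a `scene_gt.json` entry can be turned into a 4x4 model-to-camera transform with a few lines of code. The sketch below assumes the standard BOP field names (`obj_id`, `cam_R_m2c` as a row-major 3x3 rotation, `cam_t_m2c` as a translation in millimetres). The sample values are made up for illustration, and the actual EgoXtreme files may carry additional fields.

```python
import json

# Made-up stand-in for the contents of a scene_gt.json file:
# image ID -> list of per-object annotations.
SAMPLE_SCENE_GT = json.loads("""
{"0": [{"obj_id": 3,
        "cam_R_m2c": [1, 0, 0, 0, 1, 0, 0, 0, 1],
        "cam_t_m2c": [10.0, -20.0, 500.0]}]}
""")


def pose_matrix(ann):
    """Build a 4x4 model-to-camera transform from one BOP annotation."""
    R = ann["cam_R_m2c"]  # 9 values, row-major 3x3 rotation
    t = ann["cam_t_m2c"]  # 3 values, translation
    rows = [list(R[3 * i:3 * i + 3]) + [t[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def transform_point(T, p):
    """Apply a 4x4 transform to a 3D point given in model coordinates."""
    x, y, z = p
    return [T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3)]


for im_id, annotations in SAMPLE_SCENE_GT.items():
    for ann in annotations:
        T = pose_matrix(ann)
        # The model origin lands at the translation vector in camera space.
        origin_cam = transform_point(T, (0.0, 0.0, 0.0))
        print(im_id, ann["obj_id"], origin_cam)
```

To use this on real data, replace `SAMPLE_SCENE_GT` with the result of `json.load` on a scene's `scene_gt.json`.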
## Citation
```bibtex
@inproceedings{egoxtreme2026,
title={EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions},
author={Yoon, Taegyoon and Han, Yegyu and Ji, Seojin and Park, Jaewoo and Kim, Sojeong and Kwon, Taein and Kim, Hyung-Sin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}
```
Provider:
taegyoun88



