EPFL-VILAB/TST-Scannet-pp
Source: Hugging Face. Updated 2026-04-08; indexed 2026-05-10.
Download link:
https://hf-mirror.com/datasets/EPFL-VILAB/TST-Scannet-pp
Resource description:
---
language:
- en
tags:
- multimodal-learning
---
# Dataset Card for TST-Scannet++
## Dataset Description
- **Homepage:** [https://tst-vision.epfl.ch](https://tst-vision.epfl.ch)
- **Repository:** [TST official repository](https://github.com)
- **Paper:** [Arxiv](https://arxiv.org)
### Dataset Summary
This custom TST-Scannet++ dataset is used in the research work "Multimodality as Supervision: Self-Supervised Specialization to the Test Environment via Multimodality".
- `pretrain/` is a multimodal pretraining dataset modified from the original ScanNet++ 3D dataset. It contains the 9 annotated tokenized modalities.
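
Because `pretrain/` is split into per-capture and per-modality folders, it may be convenient to fetch only part of it. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository id matches this card's title. The helper names `build_patterns` and `download_subset` are hypothetical and are not part of the TST codebase.

```python
from typing import List, Optional

REPO_ID = "EPFL-VILAB/TST-Scannet-pp"  # repository id taken from this card's title


def build_patterns(capture: str, modalities: Optional[List[str]] = None) -> List[str]:
    """Build glob patterns that select part of the pretrain/ tree.

    capture: a folder directly under pretrain/ (for example "dslr").
    modalities: folder names such as "tok_semseg@224"; None selects every folder.
    """
    if modalities is None:
        return [f"pretrain/{capture}/*"]
    return [f"pretrain/{capture}/{name}/*" for name in modalities]


def download_subset(capture: str,
                    modalities: Optional[List[str]] = None,
                    local_dir: str = "TST-Scannet-pp") -> str:
    """Download only the files matching the patterns; return the local folder path."""
    # Imported lazily so build_patterns works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=build_patterns(capture, modalities),
        local_dir=local_dir,
    )
```

For example, `download_subset("dslr", ["det", "tok_semseg@224"])` would request only those two DSLR folders, which avoids pulling every shard when a single modality is needed.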
## Dataset Structure
### Pretraining Data
```
TST-Scannet-pp/
├── pretrain/
│   ├── dslr/
│   │   ├── crop_settings/          # Contains .tar shards
│   │   ├── det/                    # Contains .tar shards
│   │   ├── tok_canny_edge@224/     # Contains .tar shards
│   │   ├── ...                     # More tokenized feature directories
│   │   └── tok_semseg@224/         # Contains .tar shards
│   ├── iphone/
│   │   ├── crop_settings/          # Contains .tar shards
│   │   ├── det/                    # Contains .tar shards
│   │   ├── tok_canny_edge@224/     # Contains .tar shards
│   │   ├── ...                     # More tokenized feature directories
│   │   └── tok_semseg@224/         # Contains .tar shards
│   └── transfer/
│       ├── crop_settings/          # Contains .tar shards
│       ├── det/                    # Contains .tar shards
│       ├── tok_canny_edge@224/     # Contains .tar shards
│       ├── ...                     # More tokenized feature directories
│       └── tok_semseg@224/         # Contains .tar shards
└── README.md
```
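
Once the files are available locally, the layout above can be walked with the standard library alone. The sketch below assumes only what the tree shows: each modality folder holds `.tar` shards. The card does not describe how files are named or grouped inside a shard, so `peek_shard` just lists member names for inspection. Both function names are hypothetical.

```python
import tarfile
from pathlib import Path
from typing import Dict, List


def list_shards(root: str, capture: str) -> Dict[str, List[Path]]:
    """Map each modality folder under pretrain/<capture>/ to its sorted .tar shards."""
    base = Path(root) / "pretrain" / capture
    shards: Dict[str, List[Path]] = {}
    for modality_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        shards[modality_dir.name] = sorted(modality_dir.glob("*.tar"))
    return shards


def peek_shard(shard: Path, limit: int = 5) -> List[str]:
    """Return the names of the first `limit` regular files stored in a shard."""
    names: List[str] = []
    with tarfile.open(shard, "r") as tar:
        for member in tar:
            if member.isfile():
                names.append(member.name)
            if len(names) >= limit:
                break
    return names
```

Calling `list_shards("TST-Scannet-pp", "dslr")` returns one entry per modality folder, and passing any of the returned paths to `peek_shard` is a quick way to check what a shard contains before writing a full data loader.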
## Dataset Creation
We use [ScanNet++](https://kaldir.vc.in.tum.de/scannetpp/), a large-scale dataset of real-world indoor spaces that contains sub-millimeter resolution scans paired with DSLR and iPhone RGB images. We use 8 scenes as the test space and pretrain on a mix of iPhone and DSLR images from these scenes. To generate the pretraining RGB images and the transfer/test sets, follow the instructions in the project repository.
### Source Data
The original samples come from the ScanNet++ dataset.
### Citation Information
```
@inproceedings{singh2026tst,
title={Multimodality as Supervision: Self-Supervised Specialization to the Test Environment via Multimodality},
author={Kunal Pratap Singh and Ali Garjani and Rishubh Singh and Muhammad Uzair Khattak and Efe Tarhan and Jason Toskov and Andrei Atanov and O{\u{g}}uzhan Fatih Kar and Amir Zamir},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}
```
Provider:
EPFL-VILAB



