
3D IQ Test Task (3D-IQTT) - A Dataset for Quantitative Evaluation of 3D Reconstruction from 2D Images

Indexed from NIAID Data Ecosystem on 2026-03-11
Download link:
https://zenodo.org/record/2573264
Resource description:
3D reconstruction is mostly evaluated qualitatively. With this dataset, we are introducing a new difficult quantitative task, the 3D IQ test task (3D-IQTT). It is designed to be similar to mental rotation questions found in some IQ tests. Each element in the dataset consists of 4 images: reference object and answers 1-3. One of the answers is the reference object but randomly rotated. For every question, dataset users have to use their model to pick the rotated model out of the 3 possible answers.

The dataset encourages semi-supervised or unsupervised 3D reconstruction because it contains a large corpus of unlabeled data and only a small set of labeled data where the correct answer is known. All the images are of blocky 3D shapes floating in space in front of a black background.

Demo scripts for loading/processing the dataset can be found at https://github.com/fgolemo/3D-IQTT.

The dataset consists of:

3diqtt-v2-train.h5 (XZ-compressed) (Training Dataset)
    /labeled
        /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
        /answers format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]
    /unlabeled
        /questions format: [100,000 x 4 x 128 x 128 x 3], corresponding to (100k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]

3diqtt-v2-test.h5 (Test Dataset)
    /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
    Important! This is what you have to evaluate yourself on. We have the correct answers but they are not public.

3diqtt-v2-val.h5 (Validation Dataset)
    /questions format: [10,000 x 4 x 128 x 128 x 3], corresponding to (10k items) x (reference + 3 answers) x (img width) x (img height) x (RGB), np.float32 in range [0,1]
    /answers format: [10,000], corresponding to (10k answers), np.uint8, one of the following three items: [0,1,2]

Important: Before use, the main training dataset (3diqtt-v2-train.h5.xz) needs to be decompressed. This can take up to 24h depending on your hardware. We apologize for any inconvenience caused by this. The uncompressed file has a size of ~74GB. The reason for this compression was a restriction on the size of individual files. The command for decompression is "unxz 3diqtt-v2-train.h5.xz" on Unix machines.

If you use this dataset, please cite it.
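
As a rough illustration of the file layout described above, the following Python sketch reads a slice of a file and scores predictions against the known answers. It is not the official demo code from the GitHub repository. The helper names (load_batch, pick_answers, accuracy) and the embed callable are hypothetical, and the sketch assumes that index 0 along the second axis is the reference image and that answer label k refers to index k + 1. The h5py package is a third-party dependency and is imported only inside the loader.

```python
import numpy as np


def load_batch(path, questions_key="labeled/questions",
               answers_key="labeled/answers", start=0, stop=32):
    """Read a slice of questions (and answers, if present) from one file.

    The default keys follow the training-file layout in the description;
    for the validation file pass "questions" and "answers" instead, and
    for the test file pass answers_key=None (its answers are not public).
    """
    import h5py  # third-party; imported here so the helpers below run without it

    with h5py.File(path, "r") as f:
        # Slicing reads only the requested rows, not the whole ~74GB file.
        questions = f[questions_key][start:stop]  # (n, 4, 128, 128, 3) float32
        answers = f[answers_key][start:stop] if answers_key else None
    return questions, answers


def pick_answers(questions, embed):
    """Pick, per question, the candidate whose features are closest to the reference.

    `embed` is a hypothetical stand-in for your model: it maps a batch of
    images (m, H, W, 3) to feature vectors (m, d).
    """
    n = questions.shape[0]
    flat = questions.reshape(n * 4, *questions.shape[2:])
    feats = embed(flat).reshape(n, 4, -1)
    ref, cands = feats[:, :1], feats[:, 1:]        # assumed: index 0 is the reference
    dists = np.linalg.norm(cands - ref, axis=-1)   # (n, 3) distances to the reference
    return dists.argmin(axis=1).astype(np.uint8)   # labels in {0, 1, 2}


def accuracy(predictions, answers):
    """Fraction of questions answered correctly."""
    return float(np.mean(np.asarray(predictions) == np.asarray(answers)))


# Tiny synthetic check (random data, not the real dataset): copy the reference
# into the second candidate slot so the correct label is 1 for every question.
rng = np.random.default_rng(0)
fake = rng.random((5, 4, 8, 8, 3), dtype=np.float32)
fake[:, 2] = fake[:, 0]
preds = pick_answers(fake, lambda x: x.reshape(len(x), -1))
print(accuracy(preds, np.ones(5, dtype=np.uint8)))  # prints 1.0
```

The raw-pixel embedding used in the synthetic check is only a placeholder; since the correct answer in the real data is a rotated view of the reference, a pixel-distance baseline should not be expected to perform well. With three candidates per question, uniform random guessing has an expected accuracy of 1/3.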
Created:
2020-01-24