turhancan97/SpaRRTa
Source: Hugging Face. Updated 2026-03-10; indexed 2026-03-29.
Download link: https://hf-mirror.com/datasets/turhancan97/SpaRRTa
---
license: mit
task_categories:
- image-classification
language:
- en
size_categories:
- 100K<n<1M
dataset_info:
  features:
  - name: sample_id
    dtype: string
  - name: scene
    dtype: string
  - name: variant
    dtype: int32
  - name: scene_variant
    dtype: string
  - name: frame_id
    dtype: int32
  - name: image
    dtype: image
  - name: image_relpath
    dtype: string
  - name: params_relpath
    dtype: string
  - name: raw_params_json
    dtype: string
  - name: camera_json
    dtype: string
  - name: actors_json
    dtype: string
  - name: source_json
    dtype: string
  - name: actor_labels
    sequence: string
  - name: has_label_mapping
    dtype: bool
  - name: label_mapping_json
    dtype: string
  - name: label_mapping_keys
    sequence: string
  - name: label_mapping_values
    sequence: string
  - name: original_params_name
    dtype: string
  - name: upload_batch_utc
    dtype: string
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train/*.parquet
pretty_name: SpaRRTa
tags:
- spatial_intelligence
---
<h1 style="display:flex; align-items:center; gap:10px;">
<img src="assets/logo.png" alt="Logo" width="55">
<span style="color:#FF7096;">SpaRRTa</span>: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models
</h1>
## Summary
- Format: parquet shards with one row per sample.
- Split: `train` only.
- Images are embedded in parquet under the `image` struct column (`bytes`, `path`).
- Full raw metadata is preserved in `raw_params_json`.
## Run Stats
- Discovered paired samples: **149145**
- Skipped broken/missing pairs or parse errors: **0**
## Scene Coverage
| scene_variant | scene | variant | paired_samples | new_rows | skipped_existing | missing_pairs |
|---|---:|---:|---:|---:|---:|---:|
| `bridge` | `bridge` | `1` | 9834 | 0 | 9834 | 0 |
| `bridge_2` | `bridge` | `2` | 9834 | 0 | 9834 | 0 |
| `bridge_3` | `bridge` | `3` | 9834 | 0 | 9834 | 0 |
| `city` | `city` | `1` | 10000 | 0 | 10000 | 0 |
| `city_2` | `city` | `2` | 10000 | 0 | 10000 | 0 |
| `city_3` | `city` | `3` | 10000 | 0 | 10000 | 0 |
| `desert` | `desert` | `1` | 10000 | 0 | 10000 | 0 |
| `desert_2` | `desert` | `2` | 10000 | 0 | 10000 | 0 |
| `desert_3` | `desert` | `3` | 10000 | 0 | 10000 | 0 |
| `forest` | `forest` | `1` | 10000 | 0 | 10000 | 0 |
| `forest_2` | `forest` | `2` | 10000 | 0 | 10000 | 0 |
| `forest_3` | `forest` | `3` | 10000 | 0 | 10000 | 0 |
| `winter_town` | `winter_town` | `1` | 9881 | 0 | 9881 | 0 |
| `winter_town_2` | `winter_town` | `2` | 9881 | 0 | 9881 | 0 |
| `winter_town_3` | `winter_town` | `3` | 9881 | 0 | 9881 | 0 |
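The per-variant counts above can be recomputed from the `scene_variant` column. A minimal sketch, assuming the column has been read into a plain iterable of strings; the helper name `count_by_scene_variant` is hypothetical and not part of the dataset tooling:

```python
from collections import Counter
from typing import Iterable


def count_by_scene_variant(scene_variants: Iterable[str]) -> dict[str, int]:
    """Count rows per scene_variant, sorted by folder name."""
    counts = Counter(scene_variants)
    return dict(sorted(counts.items()))


# Tiny stand-in for ds["scene_variant"]; the real column has 149145 entries.
example = ["bridge", "bridge_2", "bridge", "city_3"]
print(count_by_scene_variant(example))
```

With the dataset loaded, `count_by_scene_variant(ds["scene_variant"])` should reproduce the `paired_samples` column. The table's counts sum to the reported total: 3 × 9834 + 9 × 10000 + 3 × 9881 = 149145.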
## Columns
- `sample_id` (string): stable unique id (`scene_variant:frame_id`)
- `scene` (string): base scene name (e.g. `bridge`)
- `variant` (int): numeric variant from folder suffix (`bridge_3` -> `3`, base folder -> `1`)
- `scene_variant` (string): source folder name
- `frame_id` (int): numeric frame id from filename
- `image` (struct): embedded image bytes + relative path
- `image_relpath` (string): relative source image path
- `params_relpath` (string): relative source JSON path
- `raw_params_json` (string): full original JSON text
- `camera_json` (string): `camera` section
- `actors_json` (string): `actors` section
- `source_json` (string): `source` section
- `actor_labels` (list[string]): unique labels found in `actors`
- `has_label_mapping` (bool): whether `source.label_mapping` exists
- `label_mapping_json` (string): full mapping JSON
- `label_mapping_keys` (list[string]): mapping keys
- `label_mapping_values` (list[string]): mapping values
- `original_params_name` (string): `source.original_params` when present
- `upload_batch_utc` (string): UTC timestamp of upload run
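The `sample_id`, `scene`, and `variant` conventions described above can be illustrated with a small parser. This is a sketch under stated assumptions: the helper names are hypothetical, the id is assumed to use a single `:` separator as in `scene_variant:frame_id`, and a trailing `_<digits>` on a folder name is assumed to always be a variant suffix.

```python
import re


def split_scene_variant(scene_variant: str) -> tuple[str, int]:
    """Split a folder name into (scene, variant); base folders map to variant 1."""
    match = re.fullmatch(r"(.+)_(\d+)", scene_variant)
    if match:
        return match.group(1), int(match.group(2))
    return scene_variant, 1


def parse_sample_id(sample_id: str) -> tuple[str, int]:
    """Split 'scene_variant:frame_id' into (scene_variant, frame_id)."""
    scene_variant, frame_id = sample_id.rsplit(":", 1)
    return scene_variant, int(frame_id)
```

For example, `split_scene_variant("bridge_3")` yields `("bridge", 3)`, while `split_scene_variant("winter_town")` yields `("winter_town", 1)` because `town` is not a numeric suffix.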
## Loading
```python
from datasets import load_dataset, Image
ds = load_dataset("turhancan97/SpaRRTa", split="train")
# Convert struct column to datasets Image feature if needed:
ds = ds.cast_column("image", Image())
```
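Several columns (`raw_params_json`, `camera_json`, `actors_json`, `source_json`, `label_mapping_json`) hold JSON as text, so they need decoding before use. A minimal sketch, assuming each of these columns contains either valid JSON text or an empty or missing value; the helper `decode_json_columns` is hypothetical, and the row below is made up for illustration rather than taken from the dataset:

```python
import json
from typing import Any

JSON_COLUMNS = (
    "raw_params_json",
    "camera_json",
    "actors_json",
    "source_json",
    "label_mapping_json",
)


def decode_json_columns(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of row with the *_json text columns parsed into Python objects."""
    decoded = dict(row)
    for name in JSON_COLUMNS:
        text = row.get(name)
        # Assumption: a missing section is stored as None or an empty string.
        decoded[name] = json.loads(text) if text else None
    return decoded


# Made-up row; real rows come from iterating over `ds`.
fake_row = {"sample_id": "x:1", "camera_json": '{"key": "value"}', "actors_json": ""}
print(decode_json_columns(fake_row)["camera_json"])
```

The same helper can be applied to a real row, for instance `decode_json_columns(ds[0])`. The structure inside each decoded object depends on the original parameter files and is not assumed here.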
## Download to Local Machine
Download the dataset repo files (parquet shards, README, manifest):
```bash
huggingface-cli download turhancan97/SpaRRTa --repo-type dataset --local-dir ./hf_SpaRRTa
```
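After the download, the parquet shards should sit under `data/train/` inside the local directory, matching the `data/train/*.parquet` pattern from the dataset config. A small sketch for locating them; `list_train_shards` is a hypothetical helper, and the layout inside `--local-dir` is assumed to mirror the repo:

```python
from pathlib import Path


def list_train_shards(local_dir: str) -> list[Path]:
    """Return the train-split parquet shards under local_dir, sorted by name."""
    return sorted(Path(local_dir).glob("data/train/*.parquet"))


shards = list_train_shards("./hf_SpaRRTa")
print(f"found {len(shards)} shard(s)")
```

The resulting paths can then be passed to `load_dataset("parquet", data_files=[str(p) for p in shards], split="train")` to load the split from disk.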
## Rebuild Original Folder Structure
The script below reconstructs the original layout, writing one image and one parameter file per row:
`<output>/<scene_variant>/img_XXXX.jpg` and `<output>/<scene_variant>/params_XXXX.json`
```python
from pathlib import Path

from datasets import load_dataset, Image

repo_id = "turhancan97/SpaRRTa"
output_root = Path("position_between_objects")
output_root.mkdir(parents=True, exist_ok=True)

ds = load_dataset(repo_id, split="train")
# Keep raw bytes instead of decoding to PIL so files are written unchanged.
ds = ds.cast_column("image", Image(decode=False))

for row in ds:
    # One output folder per scene_variant.
    mid = output_root / row["scene_variant"]
    mid.mkdir(parents=True, exist_ok=True)
    image_name = Path(row["image_relpath"]).name
    params_name = Path(row["params_relpath"]).name
    image_bytes = row["image"]["bytes"]
    if image_bytes is None:
        raise RuntimeError(f"Missing embedded image bytes for {row['sample_id']}")
    (mid / image_name).write_bytes(image_bytes)
    (mid / params_name).write_text(row["raw_params_json"], encoding="utf-8")
```
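After rebuilding, it can be worth confirming that every image has a matching parameter file. The sketch below assumes the `img_XXXX.jpg` / `params_XXXX.json` naming from this section, with the same `XXXX` token on both files of a pair; `find_unpaired` is a hypothetical helper, not part of the dataset tooling.

```python
from pathlib import Path


def find_unpaired(scene_dir: Path) -> tuple[set[str], set[str]]:
    """Return (ids with an image but no params, ids with params but no image)."""
    image_ids = {p.stem.removeprefix("img_") for p in scene_dir.glob("img_*.jpg")}
    params_ids = {p.stem.removeprefix("params_") for p in scene_dir.glob("params_*.json")}
    return image_ids - params_ids, params_ids - image_ids


root = Path("position_between_objects")
if root.is_dir():
    for scene_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        missing_params, missing_images = find_unpaired(scene_dir)
        print(scene_dir.name, len(missing_params), len(missing_images))
```

Both counts should be zero for every folder if the rebuild completed, since the run stats report no missing pairs.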
## Citation
If you find this research useful, please consider citing:
```bibtex
@misc{kargin2026sparrta,
title={SpaRRTa: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models},
author={Turhan Can Kargin and Wojciech Jasiński and Adam Pardyl and Bartosz Zieliński and Marcin Przewięźlikowski},
year={2026},
eprint={2601.11729},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2601.11729}
}
```
Provided by: turhancan97



