turhancan97/SpaRRTa

Name: turhancan97/SpaRRTa
Creator: turhancan97
Published: 2026-03-10 08:35:52
License: MIT

Hugging Face | Updated: 2026-03-10 | Indexed: 2026-03-29

Download link:

https://hf-mirror.com/datasets/turhancan97/SpaRRTa


Description:

---
license: mit
task_categories:
- image-classification
language:
- en
size_categories:
- 100K<n<1M
dataset_info:
  features:
  - name: sample_id
    dtype: string
  - name: scene
    dtype: string
  - name: variant
    dtype: int32
  - name: scene_variant
    dtype: string
  - name: frame_id
    dtype: int32
  - name: image
    dtype: image
  - name: image_relpath
    dtype: string
  - name: params_relpath
    dtype: string
  - name: raw_params_json
    dtype: string
  - name: camera_json
    dtype: string
  - name: actors_json
    dtype: string
  - name: source_json
    dtype: string
  - name: actor_labels
    sequence: string
  - name: has_label_mapping
    dtype: bool
  - name: label_mapping_json
    dtype: string
  - name: label_mapping_keys
    sequence: string
  - name: label_mapping_values
    sequence: string
  - name: original_params_name
    dtype: string
  - name: upload_batch_utc
    dtype: string
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train/*.parquet
pretty_name: SpaRRTa
tags:
- spatial_intelligence
---

<h1 style="display:flex; align-items:center; gap:10px;">
  <img src="assets/logo.png" alt="Logo" width="55">
  <span style="color:#FF7096;">SpaRRTa</span>: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models
</h1>

## Summary

- Format: parquet shards with one row per sample.
- Split: `train` only.
- Images are embedded in parquet under the `image` struct column (`bytes`, `path`).
- Full raw metadata is preserved in `raw_params_json`.
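
Because the full metadata is stored as JSON text rather than as nested columns, each row has to be decoded before its fields can be used. The following is a minimal sketch of that step, not part of the original card. The `row` dict is a hypothetical stand-in for a real dataset row, and the `fov` key inside the payload is invented for illustration; only the `raw_params_json` column name and the `camera` and `actors` section names come from the card.

```python
import json

# Hypothetical row for illustration only. Real rows come from the parquet
# shards; the "fov" key below is an invented placeholder, not a documented field.
row = {
    "sample_id": "bridge:12",
    "raw_params_json": '{"camera": {"fov": 90}, "actors": []}',
}


def parse_raw_params(row):
    """Decode the JSON text stored in `raw_params_json` into a Python dict."""
    return json.loads(row["raw_params_json"])


params = parse_raw_params(row)
print(sorted(params))  # top-level sections present in this (hypothetical) payload
```

The same `json.loads` call should apply to the other JSON-text columns (`camera_json`, `actors_json`, `source_json`, `label_mapping_json`), assuming they hold valid JSON whenever they are non-empty.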
## Run Stats

- Discovered paired samples: **149145**
- Skipped broken/missing pairs or parse errors: **0**

## Scene Coverage

| scene_variant | scene | variant | paired_samples | new_rows | skipped_existing | missing_pairs |
|---|---:|---:|---:|---:|---:|---:|
| `bridge` | `bridge` | `1` | 9834 | 0 | 9834 | 0 |
| `bridge_2` | `bridge` | `2` | 9834 | 0 | 9834 | 0 |
| `bridge_3` | `bridge` | `3` | 9834 | 0 | 9834 | 0 |
| `city` | `city` | `1` | 10000 | 0 | 10000 | 0 |
| `city_2` | `city` | `2` | 10000 | 0 | 10000 | 0 |
| `city_3` | `city` | `3` | 10000 | 0 | 10000 | 0 |
| `desert` | `desert` | `1` | 10000 | 0 | 10000 | 0 |
| `desert_2` | `desert` | `2` | 10000 | 0 | 10000 | 0 |
| `desert_3` | `desert` | `3` | 10000 | 0 | 10000 | 0 |
| `forest` | `forest` | `1` | 10000 | 0 | 10000 | 0 |
| `forest_2` | `forest` | `2` | 10000 | 0 | 10000 | 0 |
| `forest_3` | `forest` | `3` | 10000 | 0 | 10000 | 0 |
| `winter_town` | `winter_town` | `1` | 9881 | 0 | 9881 | 0 |
| `winter_town_2` | `winter_town` | `2` | 9881 | 0 | 9881 | 0 |
| `winter_town_3` | `winter_town` | `3` | 9881 | 0 | 9881 | 0 |

## Columns

- `sample_id` (string): stable unique id (`scene_variant:frame_id`)
- `scene` (string): base scene name (e.g. `bridge`)
- `variant` (int): numeric variant from folder suffix (`bridge_3` -> `3`, base folder -> `1`)
- `scene_variant` (string): source folder name
- `frame_id` (int): numeric frame id from filename
- `image` (struct): embedded image bytes + relative path
- `image_relpath` (string): relative source image path
- `params_relpath` (string): relative source JSON path
- `raw_params_json` (string): full original JSON text
- `camera_json` (string): `camera` section
- `actors_json` (string): `actors` section
- `source_json` (string): `source` section
- `actor_labels` (list[string]): unique labels found in `actors`
- `has_label_mapping` (bool): whether `source.label_mapping` exists
- `label_mapping_json` (string): full mapping JSON
- `label_mapping_keys` (list[string]): mapping keys
- `label_mapping_values` (list[string]): mapping values
- `original_params_name` (string): `source.original_params` when present
- `upload_batch_utc` (string): UTC timestamp of upload run

## Loading

```python
from datasets import load_dataset, Image

ds = load_dataset("turhancan97/SpaRRTa", split="train")

# Convert struct column to datasets Image feature if needed:
ds = ds.cast_column("image", Image())
```

## Download to Local Machine

Download the dataset repo files (parquet shards, README, manifest):

```bash
huggingface-cli download turhancan97/SpaRRTa --repo-type dataset --local-dir ./hf_SpaRRTa
```

## Rebuild Original Folder Structure

The script below reconstructs: `<output>/<scene_variant>/img_XXXX.jpg` and `params_XXXX.json`

```python
from pathlib import Path

from datasets import load_dataset, Image

repo_id = "turhancan97/SpaRRTa"
output_root = Path("position_between_objects")
output_root.mkdir(parents=True, exist_ok=True)

ds = load_dataset(repo_id, split="train")
# Keep the raw image bytes instead of decoding them into PIL images.
ds = ds.cast_column("image", Image(decode=False))

for row in ds:
    # One output folder per scene variant, e.g. position_between_objects/bridge_3
    mid = output_root / row["scene_variant"]
    mid.mkdir(parents=True, exist_ok=True)

    # Reuse the original file names recorded in the relative paths.
    image_name = Path(row["image_relpath"]).name
    params_name = Path(row["params_relpath"]).name

    image_bytes = row["image"]["bytes"]
    if image_bytes is None:
        raise RuntimeError(f"Missing embedded image bytes for {row['sample_id']}")

    (mid / image_name).write_bytes(image_bytes)
    (mid / params_name).write_text(row["raw_params_json"], encoding="utf-8")
```

## Citation

If you find this research useful, please consider citing:

```bibtex
@misc{kargin2026sparrta,
  title={SpaRRTa: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models},
  author={Turhan Can Kargin and Wojciech Jasiński and Adam Pardyl and Bartosz Zieliński and Marcin Przewięźlikowski},
  year={2026},
  eprint={2601.11729},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.11729}
}
```
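
The naming conventions in the column descriptions (`sample_id` as `scene_variant:frame_id`, and `variant` taken from the folder suffix with the base folder counting as `1`) can be sketched in plain Python. This is an illustrative reading of those descriptions, not the script used to build the dataset; the helper names `split_scene_variant` and `parse_sample_id` are hypothetical. The per-scene counts are copied from the Scene Coverage table, and the final check confirms they add up to the reported total.

```python
import re


def split_scene_variant(scene_variant: str) -> tuple[str, int]:
    """Split a folder name into (scene, variant).

    A trailing `_<digits>` suffix is the variant; without one the variant is 1.
    Scene names may themselves contain underscores (e.g. `winter_town`).
    """
    match = re.fullmatch(r"(.+)_(\d+)", scene_variant)
    if match:
        return match.group(1), int(match.group(2))
    return scene_variant, 1


def parse_sample_id(sample_id: str) -> tuple[str, int]:
    """Split `scene_variant:frame_id` into its two parts."""
    scene_variant, frame = sample_id.rsplit(":", 1)
    return scene_variant, int(frame)


# Paired samples per variant, copied from the Scene Coverage table.
# Every scene has three variants with the same count.
samples_per_variant = {
    "bridge": 9834,
    "city": 10000,
    "desert": 10000,
    "forest": 10000,
    "winter_town": 9881,
}
total = sum(3 * count for count in samples_per_variant.values())

print(split_scene_variant("winter_town_2"))
print(parse_sample_id("city_2:17"))
print(total)
```

Anchoring the regular expression on a trailing numeric suffix, rather than splitting on the first underscore, keeps multi-word scene names such as `winter_town` intact.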


Provider:

turhancan97
