
turhancan97/SpaRRTa

Source: Hugging Face. Updated 2026-03-10; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/turhancan97/SpaRRTa
Resource description:
---
license: mit
task_categories:
- image-classification
language:
- en
size_categories:
- 100K<n<1M
dataset_info:
  features:
  - name: sample_id
    dtype: string
  - name: scene
    dtype: string
  - name: variant
    dtype: int32
  - name: scene_variant
    dtype: string
  - name: frame_id
    dtype: int32
  - name: image
    dtype: image
  - name: image_relpath
    dtype: string
  - name: params_relpath
    dtype: string
  - name: raw_params_json
    dtype: string
  - name: camera_json
    dtype: string
  - name: actors_json
    dtype: string
  - name: source_json
    dtype: string
  - name: actor_labels
    sequence: string
  - name: has_label_mapping
    dtype: bool
  - name: label_mapping_json
    dtype: string
  - name: label_mapping_keys
    sequence: string
  - name: label_mapping_values
    sequence: string
  - name: original_params_name
    dtype: string
  - name: upload_batch_utc
    dtype: string
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train/*.parquet
pretty_name: SpaRRTa
tags:
- spatial_intelligence
---

<h1 style="display:flex; align-items:center; gap:10px;">
  <img src="assets/logo.png" alt="Logo" width="55">
  <span style="color:#FF7096;">SpaRRTa</span>: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models
</h1>

## Summary

- Format: parquet shards with one row per sample.
- Split: `train` only.
- Images are embedded in parquet under the `image` struct column (`bytes`, `path`).
- Full raw metadata is preserved in `raw_params_json`.
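
As a rough illustration of the layout summarized above, the sketch below unpacks one row into its embedded image bytes, image path, and parsed metadata. It assumes the row is read with the `image` column left undecoded (for example via `Image(decode=False)`), so that `row["image"]` is a plain dict with `bytes` and `path`. The helper name `unpack_row` and the contents of `example_row` are hypothetical placeholders, not real dataset values.

```python
import json


def unpack_row(row):
    """Return (image_bytes, image_path, params) for one undecoded row."""
    image = row["image"]  # struct column: {"bytes": ..., "path": ...}
    image_bytes = image["bytes"]
    if image_bytes is None:
        raise ValueError("row has no embedded image bytes")
    # raw_params_json holds the original JSON text as a string.
    params = json.loads(row["raw_params_json"])
    return image_bytes, image["path"], params


# Hypothetical row for illustration only; real rows come from the parquet shards.
example_row = {
    "image": {"bytes": b"\xff\xd8\xff placeholder", "path": "img_0001.jpg"},
    "raw_params_json": '{"camera": {}, "actors": [], "source": {}}',
}

image_bytes, image_path, params = unpack_row(example_row)
print(image_path, sorted(params))
```

Keeping the bytes undecoded avoids an image decode and re-encode when the goal is only to copy files or inspect metadata.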

## Run Stats

- Discovered paired samples: **149145**
- Skipped broken/missing pairs or parse errors: **0**

## Scene Coverage

| scene_variant | scene | variant | paired_samples | new_rows | skipped_existing | missing_pairs |
|---|---:|---:|---:|---:|---:|---:|
| `bridge` | `bridge` | `1` | 9834 | 0 | 9834 | 0 |
| `bridge_2` | `bridge` | `2` | 9834 | 0 | 9834 | 0 |
| `bridge_3` | `bridge` | `3` | 9834 | 0 | 9834 | 0 |
| `city` | `city` | `1` | 10000 | 0 | 10000 | 0 |
| `city_2` | `city` | `2` | 10000 | 0 | 10000 | 0 |
| `city_3` | `city` | `3` | 10000 | 0 | 10000 | 0 |
| `desert` | `desert` | `1` | 10000 | 0 | 10000 | 0 |
| `desert_2` | `desert` | `2` | 10000 | 0 | 10000 | 0 |
| `desert_3` | `desert` | `3` | 10000 | 0 | 10000 | 0 |
| `forest` | `forest` | `1` | 10000 | 0 | 10000 | 0 |
| `forest_2` | `forest` | `2` | 10000 | 0 | 10000 | 0 |
| `forest_3` | `forest` | `3` | 10000 | 0 | 10000 | 0 |
| `winter_town` | `winter_town` | `1` | 9881 | 0 | 9881 | 0 |
| `winter_town_2` | `winter_town` | `2` | 9881 | 0 | 9881 | 0 |
| `winter_town_3` | `winter_town` | `3` | 9881 | 0 | 9881 | 0 |

## Columns

- `sample_id` (string): stable unique id (`scene_variant:frame_id`)
- `scene` (string): base scene name (e.g. `bridge`)
- `variant` (int): numeric variant from folder suffix (`bridge_3` -> `3`, base folder -> `1`)
- `scene_variant` (string): source folder name
- `frame_id` (int): numeric frame id from filename
- `image` (struct): embedded image bytes + relative path
- `image_relpath` (string): relative source image path
- `params_relpath` (string): relative source JSON path
- `raw_params_json` (string): full original JSON text
- `camera_json` (string): `camera` section
- `actors_json` (string): `actors` section
- `source_json` (string): `source` section
- `actor_labels` (list[string]): unique labels found in `actors`
- `has_label_mapping` (bool): whether `source.label_mapping` exists
- `label_mapping_json` (string): full mapping JSON
- `label_mapping_keys` (list[string]): mapping keys
- `label_mapping_values` (list[string]): mapping values
- `original_params_name` (string): `source.original_params` when present
- `upload_batch_utc` (string): UTC timestamp of upload run

## Loading

```python
from datasets import load_dataset, Image

ds = load_dataset("turhancan97/SpaRRTa", split="train")
# Convert struct column to datasets Image feature if needed:
ds = ds.cast_column("image", Image())
```

## Download to Local Machine

Download the dataset repo files (parquet shards, README, manifest):

```bash
huggingface-cli download turhancan97/SpaRRTa --repo-type dataset --local-dir ./hf_SpaRRTa
```

## Rebuild Original Folder Structure

The script below reconstructs:
`<output>/<scene_variant>/img_XXXX.jpg` and `params_XXXX.json`

```python
from pathlib import Path
from datasets import load_dataset, Image

repo_id = "turhancan97/SpaRRTa"
output_root = Path("position_between_objects")
output_root.mkdir(parents=True, exist_ok=True)

ds = load_dataset(repo_id, split="train")
ds = ds.cast_column("image", Image(decode=False))

for row in ds:
    mid = output_root / row["scene_variant"]
    mid.mkdir(parents=True, exist_ok=True)

    image_name = Path(row["image_relpath"]).name
    params_name = Path(row["params_relpath"]).name

    image_bytes = row["image"]["bytes"]
    if image_bytes is None:
        raise RuntimeError(f"Missing embedded image bytes for {row['sample_id']}")

    (mid / image_name).write_bytes(image_bytes)
    (mid / params_name).write_text(row["raw_params_json"], encoding="utf-8")
```

## Citation

If you find this research useful, please consider citing:

```bibtex
@misc{kargin2026sparrta,
  title={SpaRRTa: A Synthetic Benchmark for Evaluating Spatial Intelligence in Visual Foundation Models},
  author={Turhan Can Kargin and Wojciech Jasiński and Adam Pardyl and Bartosz Zieliński and Marcin Przewięźlikowski},
  year={2026},
  eprint={2601.11729},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2601.11729}
}
```
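
The `scene`, `variant`, and `sample_id` conventions listed under Columns can be sketched in a few lines. This is a minimal illustration, not the script used to build the dataset: the helper names `split_scene_variant` and `make_sample_id` are hypothetical, and whether `frame_id` is zero-padded inside `sample_id` is not stated in the card, so the plain integer form used here is an assumption.

```python
import re
from typing import Tuple


def split_scene_variant(scene_variant: str) -> Tuple[str, int]:
    """Split a folder name into (scene, variant).

    A trailing `_<digits>` suffix gives the variant; a base folder
    without such a suffix is treated as variant 1.
    """
    match = re.fullmatch(r"(.+)_(\d+)", scene_variant)
    if match:
        return match.group(1), int(match.group(2))
    return scene_variant, 1


def make_sample_id(scene_variant: str, frame_id: int) -> str:
    """Join folder name and frame id as `scene_variant:frame_id`.

    Assumption: frame_id is written without zero padding.
    """
    return f"{scene_variant}:{frame_id}"


print(split_scene_variant("bridge_3"))     # ('bridge', 3)
print(split_scene_variant("winter_town"))  # ('winter_town', 1)
print(make_sample_id("city_2", 17))        # city_2:17
```

Matching only a numeric suffix keeps multi-word scene names such as `winter_town` intact, because `town` is not a variant number.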

Provider:
turhancan97