lifuguan/SAT

Name: lifuguan/SAT
Creator: lifuguan
Published: 2025-11-28 03:16:06
License: MIT (per the dataset card metadata)

Source: Hugging Face. Updated 2025-11-28; indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/lifuguan/SAT

Resource description:

---
license: mit
configs:
- config_name: default
  data_files:
  - split: train
    path: "SAT_train.parquet"
  - split: static
    path: "SAT_static.parquet"
  - split: val
    path: "SAT_val.parquet"
  - split: test
    path: "SAT_test.parquet"
dataset_info:
  features:
  - name: image_bytes
    list:
      dtype: image
  - name: question
    dtype: string
  - name: answers
    list:
      dtype: string
  - name: question_type
    dtype: string
  - name: correct_answer
    dtype: string
task_categories:
- question-answering
size_categories:
- 100K<n<1M
---

# SAT: Spatial Aptitude Training for Multimodal Language Models

[Project Page](https://arijitray1993.github.io/SAT/)

![SAT Data](https://arijitray1993.github.io/SAT/SAT_webpage/static/images/sat_teaser.png)

To use the dataset, first make sure you have Python 3.10 and Hugging Face `datasets` version 3.0.2 (`pip install datasets==3.0.2`):

```python
from datasets import load_dataset
from PIL import Image  # required for Image.open below
import io

split = "val"
dataset = load_dataset("array/SAT", batch_size=128)

example = dataset[split][10]  # the item at index 10

# image_bytes is a list of images: some questions use one image, some use two.
images = [Image.open(io.BytesIO(im_bytes)) for im_bytes in example['image_bytes']]

question = example['question']
answer_choices = example['answers']
correct_answer = example['correct_answer']
```

The available `split` choices are:

- `train`: (175K image QA pairs) Train split of SAT data that includes both static relationships and dynamic spatial QAs involving object and scene motion. Motion-based questions have two images.
- `static`: (127K image QA pairs) Train split of SAT data that includes _only_ static QAs. Each question has exactly one image.
- `val`: (4K image QA pairs) Synthetic validation split.
- `test`: (150 image QA pairs) Real-image dynamic test set.
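
Since each record carries `question`, `answers`, and `correct_answer`, one way to evaluate a model is to render the choices as a lettered multiple-choice prompt and compare the predicted letter with the gold one. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `format_prompt`, `correct_letter`, and `accuracy` are hypothetical, the prompt wording is arbitrary, and it assumes that `correct_answer` appears verbatim in `answers`.

```python
from string import ascii_uppercase


def format_prompt(question, answers):
    """Render a question and its choices as a lettered multiple-choice prompt."""
    lines = [question]
    for letter, choice in zip(ascii_uppercase, answers):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with the letter of the correct choice.")
    return "\n".join(lines)


def correct_letter(answers, correct_answer):
    """Return the letter of the choice equal to correct_answer.

    Assumption: correct_answer is one of the strings in answers.
    list.index raises ValueError if it is not.
    """
    return ascii_uppercase[answers.index(correct_answer)]


def accuracy(predicted_letters, examples):
    """Fraction of examples whose predicted letter matches the gold letter."""
    if not examples:
        return 0.0
    hits = sum(
        pred.strip().upper() == correct_letter(ex["answers"], ex["correct_answer"])
        for pred, ex in zip(predicted_letters, examples)
    )
    return hits / len(examples)


if __name__ == "__main__":
    # A made-up record with the same text fields as a SAT example.
    example = {
        "question": "Is the chair to the left of the table?",
        "answers": ["yes", "no"],
        "correct_answer": "yes",
    }
    print(format_prompt(example["question"], example["answers"]))
    print(correct_letter(example["answers"], example["correct_answer"]))
```

The images from `image_bytes` would be passed to the model alongside this prompt; how that is done depends on the model's interface, so it is left out here.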

If you find this data useful, please consider citing:

```
@misc{ray2025satdynamicspatialaptitude,
  title={SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models},
  author={Arijit Ray and Jiafei Duan and Ellis Brown and Reuben Tan and Dina Bashkirova and Rose Hendrix and Kiana Ehsani and Aniruddha Kembhavi and Bryan A. Plummer and Ranjay Krishna and Kuo-Hao Zeng and Kate Saenko},
  year={2025},
  eprint={2412.07755},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2412.07755},
}
```

Provider:

lifuguan
