user66666/IS_Bench_dataset

Name: user66666/IS_Bench_dataset
Creator: user66666
Published: 2026-04-09 12:22:51
License: No description provided

Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/user66666/IS_Bench_dataset

Description:

---
license: apache-2.0
---

# Dataset Card for IS-Bench

This dataset accompanies the paper "IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks".

⭐ See our [Paper](https://www.arxiv.org/abs/2506.16402), [Github](https://github.com/AI45Lab/IS-Bench), and [Project Page](https://ursulalujun.github.io/isbench.github.io/) for more information.

## Usage

- Online Evaluation: Download scenes.tar.gz and load the scene files in the Omnigibson simulator.
- Offline Evaluation: Download scene_images.tar.gz and use the scene images in it directly as input.

Please see the [IS-Bench code](https://github.com/AI45Lab/IS-Bench) for more details.

## Dataset Details

Our dataset statistics are listed below:

<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/statistics.png?raw=true"/>

Here are examples from our dataset:

<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/example.png?raw=true"/>

Evaluation results on leading VLMs. (SR: Success Rate, SSR: Safe and Success Rate, Srec: Safety Recall; L1: implicit safety reminder and L2: safety CoT reminder configurations.)

<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/results2.png?raw=true"/>

## Citation

```bibtex
@article{lu2025bench,
  title={IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks},
  author={Lu, Xiaoya and Chen, Zeren and Hu, Xuhao and Zhou, Yijin and Zhang, Weichen and Liu, Dongrui and Sheng, Lu and Shao, Jing},
  journal={arXiv preprint arXiv:2506.16402},
  year={2025}
}
```
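The offline-evaluation step in the Usage section (download scene_images.tar.gz, then use the images) could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: it assumes the `huggingface_hub` package is installed and that the archive sits at the top level of the dataset repository. The helper names `fetch_archive` and `extract_archive` and the output directory `scene_images` are hypothetical, chosen here for illustration.

```python
import tarfile
from pathlib import Path

REPO_ID = "user66666/IS_Bench_dataset"  # repo id as shown in this listing
ARCHIVE = "scene_images.tar.gz"         # archive named in the Usage section


def fetch_archive(filename: str = ARCHIVE, repo_id: str = REPO_ID) -> Path:
    """Download one file from the dataset repo and return its local cache path.

    Assumption: the file lives at the repository root under this exact name.
    """
    # Imported lazily so extract_archive works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=repo_id, filename=filename, repo_type="dataset"))


def extract_archive(archive_path, dest_dir) -> list[str]:
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return [m.name for m in members]


if __name__ == "__main__":
    archive = fetch_archive()
    names = extract_archive(archive, "scene_images")  # hypothetical output directory
    print(f"extracted {len(names)} entries")
```

The same two helpers should work for scenes.tar.gz by passing `filename="scenes.tar.gz"`; loading the extracted scenes into the simulator is covered by the IS-Bench code repository and is not shown here.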

Provided by:

user66666
