user66666/IS_Bench_dataset
Source: Hugging Face | Updated: 2026-04-09 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/user66666/IS_Bench_dataset
Resource description:
---
license: apache-2.0
---
# Dataset Card for IS-Bench
This dataset accompanies the paper *IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks*.
⭐ See our [Paper](https://www.arxiv.org/abs/2506.16402), [GitHub](https://github.com/AI45Lab/IS-Bench), and [Project Page](https://ursulalujun.github.io/isbench.github.io/) for more information.
## Usage
- Online Evaluation: Download `scenes.tar.gz` and load the scene files in the OmniGibson simulator.
- Offline Evaluation: Download `scene_images.tar.gz` and use the scene images it contains directly as input.
See the [IS-Bench code](https://github.com/AI45Lab/IS-Bench) for more details.
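As a minimal sketch of the download step, the snippet below fetches one of the archives with `huggingface_hub` and unpacks it. The repository ID is the one shown on this page. The helper names `download_archive` and `extract_archive`, the target directory, and the assumption that the archives sit at the repository root are illustrative and not part of the IS-Bench code.

```python
from __future__ import annotations

import tarfile
from pathlib import Path

REPO_ID = "user66666/IS_Bench_dataset"  # repository name as listed on this page


def download_archive(filename: str, cache_dir: str | None = None) -> str:
    """Fetch one archive from the dataset repo and return its local path.

    Assumption: the archive is stored at the repository root.
    """
    # Imported lazily so extract_archive works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        repo_type="dataset",
        cache_dir=cache_dir,
    )


def extract_archive(archive_path: str, dest_dir: str) -> list[str]:
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest)  # only unpack archives from a source you trust
    return names


# Example (requires network access and `pip install huggingface_hub`):
#   path = download_archive("scene_images.tar.gz")   # offline evaluation
#   members = extract_archive(path, "data/scene_images")
#   print(f"extracted {len(members)} entries")
```

For online evaluation, the same two calls should work with `scenes.tar.gz`. To download through the mirror listed above, setting the `HF_ENDPOINT` environment variable to `https://hf-mirror.com` before importing `huggingface_hub` is the commonly used approach.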
## Dataset Details
The dataset statistics are shown below:
<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/statistics.png?raw=true"/>
Examples from the dataset:
<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/example.png?raw=true"/>
Evaluation results on leading VLMs are shown below. (SR: Success Rate; SSR: Safe and Success Rate; Srec: Safety Recall; L1: implicit safety reminder; L2: safety CoT reminder configurations.)
<img src="https://github.com/AI45Lab/IS-Bench/blob/main/assets/results2.png?raw=true"/>
## Citation
```bibtex
@article{lu2025bench,
title={IS-Bench: Evaluating Interactive Safety of VLM-Driven Embodied Agents in Daily Household Tasks},
author={Lu, Xiaoya and Chen, Zeren and Hu, Xuhao and Zhou, Yijin and Zhang, Weichen and Liu, Dongrui and Sheng, Lu and Shao, Jing},
journal={arXiv preprint arXiv:2506.16402},
year={2025}
}
```
Provider:
user66666



