VisGym/visgym_data
Source: Hugging Face | Updated: 2026-02-05 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/VisGym/visgym_data
Resource description:
---
language:
- en
license: apache-2.0
size_categories:
- 1M<n<10M
task_categories:
- image-text-to-text
tags:
- agent
- vision-language-models
- reinforcement-learning
- vlm
- multimodal
- sft
---
# VisGym Dataset
[**Project Page**](https://visgym.github.io/) | [**Paper**](https://huggingface.co/papers/2601.16973) | [**GitHub**](https://github.com/visgym/VIsGym)
**VisGym** consists of 17 diverse, long-horizon environments designed to systematically evaluate, diagnose, and train Vision-Language Models (VLMs) on visually interactive tasks. In these environments, agents must select actions conditioned on both their past actions and observation history, challenging their ability to handle complex, multimodal sequences.
## Dataset Summary
This dataset contains trajectories and interaction data generated from the VisGym environment suite, intended for training and benchmarking multimodal agents. The environments are designed to be:
* **Diverse:** Covering 17 distinct task categories.
* **Customizable:** Allowing for various configurations of task difficulty and visual settings.
* **Scalable:** Suitable for large-scale training of VLMs and Reinforcement Learning agents.
## Usage
You can download the dataset assets and metadata using the `huggingface-cli`:
```bash
# Install the Hugging Face CLI
pip install -U "huggingface_hub[cli]"
# Download the dataset into a local directory.
# This places the 'assets/' and 'metadata/' folders under ./training_dataset
mkdir -p training_dataset
huggingface-cli download VisGym/visgym_data --repo-type dataset --local-dir ./training_dataset
```
See the training README for more usage details: https://github.com/visgym/VisGym/blob/main/visgym_training/README.md
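After the download finishes, a quick sanity check of the local copy can be useful. The following is a minimal sketch using only the Python standard library; it assumes the `./training_dataset` path from the download command above and makes no assumption about the file formats inside `assets/` and `metadata/`. The helper name `summarize_download` is hypothetical and is not part of the VisGym codebase.

```python
from collections import Counter
from pathlib import Path


def summarize_download(root):
    """Count files per extension under each top-level folder of `root`.

    Hypothetical helper for inspecting a local copy of the dataset;
    it only walks the directory tree and does not parse any file.
    """
    root = Path(root)
    summary = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        # Skip hidden folders (for example, a cache directory that the
        # download tool may create inside the target directory).
        if folder.name.startswith("."):
            continue
        counts = Counter(
            f.suffix.lower() or "<no extension>"
            for f in folder.rglob("*")
            if f.is_file()
        )
        summary[folder.name] = dict(counts)
    return summary


if __name__ == "__main__":
    # Assumed location: the --local-dir used in the download command.
    local_dir = Path("./training_dataset")
    if local_dir.is_dir():
        for name, counts in summarize_download(local_dir).items():
            print(name, counts)
    else:
        print(f"{local_dir} not found; run the download command first")
```

If the download completed, the printed summary should contain entries for `assets` and `metadata`; a missing folder suggests an interrupted or partial download.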
## Citation
If you use this dataset, please cite:
```bibtex
@article{wang2026visgym,
title = {VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents},
author = {Wang, Zirui and Zhang, Junyi and Ge, Jiaxin and Lian, Long and Fu, Letian and Dunlap, Lisa and Goldberg, Ken and Wang, Xudong and Stoica, Ion and Chan, David M. and Min, Sewon and Gonzalez, Joseph E.},
journal = {arXiv preprint arXiv:2601.16973},
year = {2026},
url = {https://arxiv.org/abs/2601.16973}
}
```
Provided by:
VisGym



