VisGym/visgym_data

Name: VisGym/visgym_data
Creator: VisGym
Published: 2026-02-05 20:08:57
License: apache-2.0

Source: Hugging Face | Updated: 2026-02-05 | Indexed: 2026-03-29

Download link:

https://hf-mirror.com/datasets/VisGym/visgym_data

Resource description:

---
language:
- en
license: apache-2.0
size_categories:
- 1M<n<10M
task_categories:
- image-text-to-text
tags:
- agent
- vision-language-models
- reinforcement-learning
- vlm
- multimodal
- sft
---

# VisGym Dataset

[**Project Page**](https://visgym.github.io/) | [**Paper**](https://huggingface.co/papers/2601.16973) | [**GitHub**](https://github.com/visgym/VIsGym)

**VisGym** consists of 17 diverse, long-horizon environments designed to systematically evaluate, diagnose, and train Vision-Language Models (VLMs) on visually interactive tasks. In these environments, agents must select actions conditioned on both their past actions and observation history, challenging their ability to handle complex, multimodal sequences.

## Dataset Summary

This dataset contains trajectories and interaction data generated from the VisGym suites, intended for training and benchmarking multimodal agents. The environments are designed to be:

* **Diverse:** Covering 17 distinct task categories.
* **Customizable:** Allowing for various configurations of task difficulty and visual settings.
* **Scalable:** Suitable for large-scale training of VLMs and Reinforcement Learning agents.
## Usage

You can download the dataset assets and metadata using `huggingface-cli`:

```bash
# Install huggingface-cli
pip install -U "huggingface_hub[cli]"

# Download the dataset to a local directory
# This downloads the 'assets/' and 'metadata/' folders into the local dir
mkdir -p training_dataset
huggingface-cli download VisGym/visgym_data --repo-type dataset --local-dir ./training_dataset
```

For more usage details, see: https://github.com/visgym/VisGym/blob/main/visgym_training/README.md

## Citation

If you use this dataset, please cite:

```bibtex
@article{wang2026visgym,
  title   = {VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents},
  author  = {Wang, Zirui and Zhang, Junyi and Ge, Jiaxin and Lian, Long and Fu, Letian and Dunlap, Lisa and Goldberg, Ken and Wang, Xudong and Stoica, Ion and Chan, David M. and Min, Sewon and Gonzalez, Joseph E.},
  journal = {arXiv preprint arXiv:2601.16973},
  year    = {2026},
  url     = {https://arxiv.org/abs/2601.16973}
}
```
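Once the download command from the Usage section has finished, it can be useful to confirm that the `assets/` and `metadata/` folders actually arrived. The card does not describe the internal layout of either folder, so the following is only a minimal sketch that counts files per top-level folder. The helper name `summarize_download` is hypothetical and not part of VisGym or `huggingface_hub`, and skipping dot-prefixed directories (such as a possible download cache) is an assumption.

```python
from collections import Counter
from pathlib import Path


def summarize_download(root):
    """Count files under each top-level folder of a downloaded dataset directory.

    Hypothetical helper for illustration; it assumes nothing about file formats.
    """
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(root)
        # Assumption: dot-prefixed files and directories are bookkeeping, not data.
        if any(part.startswith(".") for part in relative.parts):
            continue
        # Files sitting directly in the root are grouped under ".".
        top = relative.parts[0] if len(relative.parts) > 1 else "."
        counts[top] += 1
    return dict(counts)


if __name__ == "__main__":
    # "./training_dataset" matches the --local-dir used in the Usage section.
    summary = summarize_download("./training_dataset")
    for folder, n_files in sorted(summary.items()):
        print(f"{folder}: {n_files} files")
```

If either `assets` or `metadata` is missing from the printed summary, the download was likely interrupted and the `huggingface-cli download` command can be run again.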

Provided by:

VisGym
