
VisGym/visgym_data

Source: Hugging Face | Updated: 2026-02-05 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/VisGym/visgym_data
Description:
---
language:
- en
license: apache-2.0
size_categories:
- 1M<n<10M
task_categories:
- image-text-to-text
tags:
- agent
- vision-language-models
- reinforcement-learning
- vlm
- multimodal
- sft
---

# VisGym Dataset

[**Project Page**](https://visgym.github.io/) | [**Paper**](https://huggingface.co/papers/2601.16973) | [**GitHub**](https://github.com/visgym/VIsGym)

**VisGym** consists of 17 diverse, long-horizon environments designed to systematically evaluate, diagnose, and train Vision-Language Models (VLMs) on visually interactive tasks. In these environments, agents must select actions conditioned on both their past actions and observation history, challenging their ability to handle complex, multimodal sequences.

## Dataset Summary

This dataset contains trajectories and interaction data generated from the VisGym suites, intended for training and benchmarking multimodal agents. The environments are designed to be:

* **Diverse:** Covering 17 distinct task categories.
* **Customizable:** Allowing for various configurations of task difficulty and visual settings.
* **Scalable:** Suitable for large-scale training of VLMs and Reinforcement Learning agents.

## Usage

You can download the dataset assets and metadata using the `huggingface-cli`:

```bash
# Install huggingface-cli
pip install -U "huggingface_hub[cli]"

# Download the dataset to a local directory
# This downloads the 'assets/' and 'metadata/' folders into the local directory
mkdir -p training_dataset
huggingface-cli download VisGym/visgym_data --repo-type dataset --local-dir ./training_dataset
```

For more usage details, see: https://github.com/visgym/VisGym/blob/main/visgym_training/README.md

## Citation

If you use this dataset, please cite:

```bibtex
@article{wang2026visgym,
  title   = {VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents},
  author  = {Wang, Zirui and Zhang, Junyi and Ge, Jiaxin and Lian, Long and Fu, Letian and Dunlap, Lisa and Goldberg, Ken and Wang, Xudong and Stoica, Ion and Chan, David M. and Min, Sewon and Gonzalez, Joseph E.},
  journal = {arXiv preprint arXiv:2601.16973},
  year    = {2026},
  url     = {https://arxiv.org/abs/2601.16973}
}
```
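After the download described under Usage finishes, a quick sanity check can confirm that the two folders mentioned there (`assets/` and `metadata/`) are present. The sketch below is illustrative only: the function name `check_visgym_layout` is hypothetical and not part of VisGym or `huggingface-cli`, and it makes no assumption about the files inside those folders.

```bash
# Hypothetical helper (not part of VisGym): checks that the download
# directory contains the 'assets/' and 'metadata/' folders.
# Usage: check_visgym_layout [download_dir]   (defaults to ./training_dataset)
check_visgym_layout() {
  local root="${1:-./training_dataset}"
  local missing=0
  local sub
  for sub in assets metadata; do
    if [ -d "$root/$sub" ]; then
      echo "ok: $root/$sub"
    else
      echo "missing: $root/$sub" >&2
      missing=1
    fi
  done
  # Returns 0 when both folders exist, 1 otherwise.
  return "$missing"
}
```

Calling `check_visgym_layout ./training_dataset` prints one line per folder and returns a non-zero status if either is absent, so it can gate a later training step in a script.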
Provider:
VisGym