GUIMid
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/hkust-nlp/GUIMid
Description:
GUIMid is a mid-training dataset assembled from high-performing data domains, designed to strengthen the foundational agent capabilities of vision-language models (VLMs). A model is first mid-trained on this dataset and then fine-tuned on graphical user interface (GUI) trajectory data. The dataset combines non-GUI-specific and GUI-specific data. It contains 300,000 examples in total: 150,000 from MathInstruct, 20,000 from Code I/O, 50,000 from Olympiad Mathematics, and 80,000 from Multimodal Mathematics. Its intended use is training VLMs to handle GUI agent tasks.
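
As a quick sanity check on the composition stated in the description, the following minimal Python sketch sums the per-source counts and reports each source's share of the mixture. The dictionary and function names are illustrative assumptions and are not part of the GUIMid repository; only the counts come from the description.

```python
# Per-source example counts as stated in the dataset description.
# SOURCE_COUNTS and mixture_shares are hypothetical names used for illustration.
SOURCE_COUNTS = {
    "MathInstruct": 150_000,
    "Code I/O": 20_000,
    "Olympiad Mathematics": 50_000,
    "Multimodal Mathematics": 80_000,
}


def mixture_shares(counts):
    """Return the total example count and each source's fraction of it."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = mixture_shares(SOURCE_COUNTS)
print(f"total examples: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The four counts add up to the stated 300,000, with MathInstruct making up half of the mixture.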



