
ZhuOnR/ScreenSpot-v2

Source: Hugging Face. Updated 2026-04-26; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ZhuOnR/ScreenSpot-v2
Overview:
ScreenSpot-V2 is a benchmark dataset for evaluating single-step GUI grounding across three platforms (mobile, desktop, and web). It contains 1,272 samples: 502 mobile, 334 desktop, and 436 web. Each sample pairs a screenshot with a natural-language instruction that refers to either a text element or an icon/widget element, along with the ground-truth coordinates or bounding box of that element. The dataset corrects annotation errors in the original ScreenSpot benchmark, where approximately 11.32% of samples had issues, by removing problematic samples, rewriting instructions, and fixing mislabeled bounding boxes. Its structure includes an instruction field, a data-source classification field, and an action-detection field that holds the element type and bounding-box coordinates.
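
The description above implies a simple scoring loop: a model predicts a point for each instruction, and the prediction is compared against the ground-truth bounding box. The following is a minimal sketch of one common way to score grounding (a prediction counts as correct when the point falls inside the box), not the dataset's official evaluation script. The record keys (`platform`, `element_type`, `bbox`), the `(x_min, y_min, x_max, y_max)` box layout, and the toy records are all assumptions made for illustration; the actual field names and coordinate format should be checked against the dataset itself. Only the per-platform sample counts are taken from the description.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max). The real dataset may
# use a different convention such as (x, y, width, height).
BBox = Tuple[float, float, float, float]
Point = Tuple[float, float]

# Per-platform sample counts as stated in the dataset description.
EXPECTED_COUNTS = {"web": 436, "desktop": 334, "mobile": 502}


def point_in_bbox(point: Point, bbox: BBox) -> bool:
    """Return True if the point lies inside the box (edges inclusive)."""
    x, y = point
    x_min, y_min, x_max, y_max = bbox
    return x_min <= x <= x_max and y_min <= y <= y_max


def grounding_accuracy(
    samples: Iterable[dict], predictions: Iterable[Point]
) -> Dict[str, float]:
    """Compute the fraction of correct predictions per platform.

    The keys "platform" and "bbox" are hypothetical names chosen for
    this sketch.
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for sample, pred in zip(samples, predictions):
        platform = sample["platform"]
        totals[platform] += 1
        hits[platform] += int(point_in_bbox(pred, sample["bbox"]))
    return {p: hits[p] / totals[p] for p in totals}


# Toy records invented for illustration; they are not real dataset rows.
samples = [
    {"instruction": "Open settings", "platform": "mobile",
     "element_type": "icon", "bbox": (10, 10, 50, 50)},
    {"instruction": "Click Submit", "platform": "web",
     "element_type": "text", "bbox": (100, 200, 180, 230)},
    {"instruction": "Close the window", "platform": "web",
     "element_type": "icon", "bbox": (0, 0, 20, 20)},
]
predictions = [(30, 30), (150, 215), (40, 40)]

scores = grounding_accuracy(samples, predictions)
total_samples = sum(EXPECTED_COUNTS.values())
```

Reporting accuracy per platform, rather than as a single number, mirrors the way the dataset is organized and keeps the larger mobile split from dominating the result.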
Provided by:
ZhuOnR