ShowUI-desktop

Name: ShowUI-desktop
Creator: maas
Published: 2025-12-05 16:36:48
License: not specified

Source: ModelScope community. Updated 2025-12-05; indexed 2025-05-31.

Download link:

https://modelscope.cn/datasets/showlab/ShowUI-desktop

Description:

[GitHub](https://github.com/showlab/ShowUI/tree/main) | [arXiv](https://arxiv.org/abs/2411.17465) | [HF Paper](https://huggingface.co/papers/2411.17465) | [Spaces](https://huggingface.co/spaces/showlab/ShowUI) | [Datasets](https://huggingface.co/datasets/showlab/ShowUI-desktop-8K) | [Quick Start](https://huggingface.co/showlab/ShowUI-2B)

**ShowUI-desktop-8K** is a UI-grounding dataset focused on PC (desktop) interfaces, with screenshots and annotations originally sourced from [OmniAct](https://huggingface.co/datasets/Writer/omniact).

We use GPT-4o to augment the original annotations, enriching them with diverse attributes such as appearance, spatial relationships, and intended functionality. You can use our [rewrite strategy code](https://github.com/showlab/ShowUI/blob/main/recaption.ipynb) to augment your own data.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64440be5af034cdfd69ca3a7/t6dZzpBdiDTHDxlke4Eva.png)

If you find our work helpful, please consider citing our paper.

```
@misc{lin2024showui,
      title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
      author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
      year={2024},
      eprint={2411.17465},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17465},
}
```
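For readers new to UI grounding, the sketch below illustrates the general idea of pairing a text instruction with a target region on a screenshot and turning that region into a click point. It is only an illustration: the `GroundingAnnotation` record, its field names, and the normalized `(x_min, y_min, x_max, y_max)` box convention are assumptions made for this example, not the confirmed schema of ShowUI-desktop-8K. Check the dataset viewer for the actual fields before relying on it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GroundingAnnotation:
    """Hypothetical record: one instruction paired with one target element."""

    instruction: str
    # Assumed convention: (x_min, y_min, x_max, y_max), each normalized to [0, 1].
    bbox: Tuple[float, float, float, float]


def bbox_center_in_pixels(
    ann: GroundingAnnotation, width: int, height: int
) -> Tuple[int, int]:
    """Return the pixel coordinates of the center of the annotated box."""
    x_min, y_min, x_max, y_max = ann.bbox
    # Reject boxes that are inverted or fall outside the normalized range.
    if not (0.0 <= x_min <= x_max <= 1.0 and 0.0 <= y_min <= y_max <= 1.0):
        raise ValueError(f"bbox is not a valid normalized box: {ann.bbox}")
    center_x = (x_min + x_max) / 2 * width
    center_y = (y_min + y_max) / 2 * height
    return round(center_x), round(center_y)


if __name__ == "__main__":
    example = GroundingAnnotation(
        instruction="Click the Save button",  # made-up instruction
        bbox=(0.25, 0.5, 0.75, 1.0),  # made-up box
    )
    print(bbox_center_in_pixels(example, width=1920, height=1080))
```

Keeping coordinates normalized, as assumed here, lets the same annotation apply to a screenshot at any resolution; the conversion to pixels happens only when a click position is needed.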

Provider:

maas

Created:

2025-05-29
