
ShowUI-desktop

ModelScope community | Updated 2025-12-05 | Indexed 2025-05-31
Download link:
https://modelscope.cn/datasets/showlab/ShowUI-desktop
Resource overview:
[GitHub](https://github.com/showlab/ShowUI/tree/main) | [arXiv](https://arxiv.org/abs/2411.17465) | [HF Paper](https://huggingface.co/papers/2411.17465) | [Spaces](https://huggingface.co/spaces/showlab/ShowUI) | [Datasets](https://huggingface.co/datasets/showlab/ShowUI-desktop-8K) | [Quick Start](https://huggingface.co/showlab/ShowUI-2B)

**ShowUI-desktop-8K** is a UI-grounding dataset focused on PC-based grounding, with screenshots and annotations originally sourced from [OmniAct](https://huggingface.co/datasets/Writer/omniact).

We use GPT-4o to augment the original annotations, enriching them with diverse attributes such as appearance, spatial relationships, and intended functionality.

You can use our [rewrite strategy code](https://github.com/showlab/ShowUI/blob/main/recaption.ipynb) to augment your own data.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64440be5af034cdfd69ca3a7/t6dZzpBdiDTHDxlke4Eva.png)

If you find our work helpful, please consider citing our paper.

```
@misc{lin2024showui,
  title={ShowUI: One Vision-Language-Action Model for GUI Visual Agent},
  author={Kevin Qinghong Lin and Linjie Li and Difei Gao and Zhengyuan Yang and Shiwei Wu and Zechen Bai and Weixian Lei and Lijuan Wang and Mike Zheng Shou},
  year={2024},
  eprint={2411.17465},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2411.17465},
}
```
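The annotation augmentation described above (rewriting each original label along the lines of appearance, spatial relationship, and intended functionality) might be prepared roughly as in the sketch below. This is not the authors' `recaption.ipynb`: the `Element` schema, the `build_recaption_prompts` helper, and the prompt wording are all hypothetical, chosen only to illustrate generating one query per attribute. Sending the prompts and the screenshot to a vision-language model is left out.

```python
from dataclasses import dataclass

# Attribute types named in the dataset description.
ATTRIBUTES = ("appearance", "spatial relationship", "intended functionality")


@dataclass
class Element:
    """One annotated UI element (hypothetical schema, not the dataset's own)."""
    instruction: str                          # original annotation text
    bbox: tuple[float, float, float, float]   # x1, y1, x2, y2 in pixels


def build_recaption_prompts(element: Element) -> dict[str, str]:
    """Return one text prompt per attribute for a vision-language model.

    The wording is an illustrative assumption, not the authors' prompt.
    """
    x1, y1, x2, y2 = element.bbox
    prompts = {}
    for attribute in ATTRIBUTES:
        prompts[attribute] = (
            f"The screenshot contains a UI element labelled "
            f"'{element.instruction}' inside the box "
            f"({x1:g}, {y1:g}, {x2:g}, {y2:g}). "
            f"Describe this element by its {attribute} in one short phrase, "
            f"without repeating the original label."
        )
    return prompts


if __name__ == "__main__":
    sample = Element(instruction="Save button", bbox=(10, 20, 110, 60))
    for attribute, prompt in build_recaption_prompts(sample).items():
        print(f"[{attribute}] {prompt}")
```

Keeping one prompt per attribute means each element can yield several distinct grounding queries, which matches the description's goal of diversifying the annotations; the authors' notebook may structure this differently.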

Provider:
maas
Created:
2025-05-29