
Jedi

ModelScope community · Updated 2025-11-27 · Indexed 2025-08-30
Download link: https://modelscope.cn/datasets/xlangai/Jedi
Description:
# JEDI

**NOTE**: Before you use this dataset, make sure you understand the logic of absolute coordinates and the [image processor](https://github.com/QwenLM/Qwen2.5-VL/blob/d2240f11656bfe404b9ba56db4e51cd09f522ff1/qwen-vl-utils/src/qwen_vl_utils/vision_process.py#L60) for [Qwen2.5-VL](https://arxiv.org/abs/2502.13923). This dataset was built with the image processor's maximum token count set to 2700, i.e. `max_pixels=2700x14x14x2x2`. The coordinates were scaled down accordingly, so you must also resize each image to fit within `max_pixels=2700x14x14x2x2` via the image processor for the two to align. Follow the same setting in your training procedure; otherwise performance will not be as expected.

The JEDI dataset consists of four carefully designed categories:

- Icon
- Component
- Layout
- Refusal

This repository includes all the textures and images for these components.

Additionally, JEDI processes and improves the data from [AGUVIS](https://github.com/xlang-ai/aguvis) and the [Desktop domain of OS-ATLAS](https://huggingface.co/datasets/agnet/osatlas), calling the result `AGUVIS++`. This repository contains the texture portion of `AGUVIS++`. For the images, download them from the [aguvis original repository](https://huggingface.co/collections/ranpox/aguvis-unified-pure-vision-gui-agents-6764e2bc343c62af95c209d8) and the [os-atlas original repository](https://huggingface.co/datasets/OS-Copilot/OS-Atlas-data/tree/main/desktop_domain), then align them with the image paths used here.

Project Page: https://osworld-grounding.github.io

GitHub Repository: https://github.com/xlang-ai/OSWorld-G

**Fun Fact**: The name JEDI has no meaning in itself. It was chosen because it is pronounced very close to the Chinese '接地', which is the translation of the English 'Grounding'.
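The `max_pixels` requirement in the note above can be illustrated with a minimal sketch. It is a simplified approximation of the resize logic in the linked `vision_process.py`, not the library code itself: the function names `resized_size` and `scale_point` are hypothetical, the factor of 28 (14x2) is read off the `14x14x2x2` term, and the minimum-pixel and aspect-ratio checks are omitted. For real training, use the Qwen2.5-VL image processor directly.

```python
import math

# Pixel budget stated in the dataset note: 2700 tokens, 14x14 patches, 2x2 merge.
MAX_PIXELS = 2700 * 14 * 14 * 2 * 2
# Assumption: output sides are multiples of 28 (patch size 14 times merge size 2).
FACTOR = 28


def resized_size(height: int, width: int,
                 max_pixels: int = MAX_PIXELS, factor: int = FACTOR) -> tuple[int, int]:
    """Return (height, width) after a simplified max_pixels-constrained resize."""
    # Snap each side to the nearest multiple of `factor`.
    h_bar = max(factor, round(height / factor) * factor)
    w_bar = max(factor, round(width / factor) * factor)
    if h_bar * w_bar > max_pixels:
        # Shrink both sides by the same ratio, rounding down to a multiple of `factor`.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = math.floor(height / beta / factor) * factor
        w_bar = math.floor(width / beta / factor) * factor
    return h_bar, w_bar


def scale_point(x: float, y: float, orig_size: tuple[int, int],
                new_size: tuple[int, int]) -> tuple[float, float]:
    """Map an absolute (x, y) from the original image to the resized image.

    Sizes are (height, width) tuples.
    """
    orig_h, orig_w = orig_size
    new_h, new_w = new_size
    return x * new_w / orig_w, y * new_h / orig_h


if __name__ == "__main__":
    original = (2160, 3840)  # hypothetical 4K screenshot, (height, width)
    resized = resized_size(*original)
    print("resized (h, w):", resized)
    print("screen centre maps to:", scale_point(1920, 1080, original, resized))
```

Because the dataset's coordinates are already expressed in the resized frame, only the images need this resize at training time; `scale_point` matters when mapping your own annotations or converting model predictions back to screen coordinates.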
## 📄 Citation

If you find this work useful, please consider citing our paper:

```bibtex
@misc{xie2025scalingcomputerusegroundinguser,
      title={Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis},
      author={Tianbao Xie and Jiaqi Deng and Xiaochuan Li and Junlin Yang and Haoyuan Wu and Jixuan Chen and Wenjing Hu and Xinyuan Wang and Yuhui Xu and Zekun Wang and Yiheng Xu and Junli Wang and Doyen Sahoo and Tao Yu and Caiming Xiong},
      year={2025},
      eprint={2505.13227},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2505.13227},
}
```

Provider: maas

Created: 2025-08-19