Magic-RICH
Source: ModelScope community (魔搭社区). Updated 2026-01-02; indexed 2025-09-06.
Download link: https://modelscope.cn/datasets/FudanNLP/Magic-RICH
# Dataset Summary
Magic-RICH is a Chinese benchmark dataset for evaluating mobile GUI agents in realistic smartphone environments. It contains 4,000 step-level samples across four subsets, covering 17 categories and over 150 popular apps. Unlike many previous GUI datasets, Magic-RICH also includes special actions such as screenshot and long screenshot to better reflect real-world interactions. This dataset is designed for evaluation only (no train/dev split) and was used in the development of MagicGUI, an open-source GUI agent.
# Subsets
Magic-RICH is composed of four balanced subsets (1,000 samples each):
- Routine: High-frequency, single-step actions (e.g., tap, scroll, text input).
- Instruction: Direct user commands (e.g., "Open...", "Check membership"), testing instruction-to-action mapping.
- Complex: Harder tasks requiring reasoning (logical conditions, visual analysis, multi-step navigation).
- Handling Exception: Special cases, including:
  - *Non-interactive* (page cannot be acted on),
  - *Completed* (task already finished),
  - *Loading* (page still in transition).
# Evaluation Protocol
Three metrics are recommended for evaluation:
- Type – action type accuracy (e.g., Tap vs. Scroll)
- Grd – grounding accuracy (tap/scroll location falls inside ground-truth element box)
- SR – full correctness at the step level (all parameters correct)
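
The three metrics above can be sketched as follows. This is a minimal illustration, not the official evaluation script: the record layout (`type`, `point`, `box`, `text`), the `(x_min, y_min, x_max, y_max)` box convention, and the choice to compute Grd only over steps that have a ground-truth box are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed layout: (x_min, y_min, x_max, y_max)


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def score_steps(preds: List[dict], golds: List[dict]) -> Dict[str, Optional[float]]:
    """Compute Type, Grd and SR over paired step-level records.

    Hypothetical record fields (not taken from the dataset schema):
      type  - action type string, e.g. "tap" or "scroll"
      point - predicted (x, y) location (predictions only)
      box   - ground-truth element box (gold records only)
      text  - any remaining parameter, e.g. typed text
    """
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    if not golds:
        raise ValueError("at least one step is required")

    type_hits = grd_hits = grd_total = sr_hits = 0
    for pred, gold in zip(preds, golds):
        # Type: the predicted action type matches the ground truth.
        type_ok = pred.get("type") == gold.get("type")
        type_hits += type_ok

        # Grd: the predicted location falls inside the ground-truth box.
        grounded = True  # steps without a box have nothing to ground
        if gold.get("box") is not None:
            grd_total += 1
            point = pred.get("point")
            grounded = point is not None and point_in_box(point, gold["box"])
            grd_hits += grounded

        # SR: type, location and remaining parameters are all correct.
        params_ok = pred.get("text") == gold.get("text")
        sr_hits += type_ok and grounded and params_ok

    n = len(golds)
    return {
        "Type": type_hits / n,
        "Grd": grd_hits / grd_total if grd_total else None,
        "SR": sr_hits / n,
    }
```

Calling `score_steps(preds, golds)` on two equal-length lists of such records returns a dictionary with one value per metric; `Grd` is `None` when no gold record carries a box.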
# License
This project is licensed under the [Apache-2.0](./LICENSE) license. The model weights are fully open for academic research; for commercial use, apply for a license by contacting magicgui@honor.com. This project is initialized from the pre-trained Qwen2VL-7B-Instruct, which is also licensed under the Apache-2.0 License.
# Citation
If you use Magic-RICH in your research, please cite:
```bibtex
@misc{tang2025magicguifoundationalmobilegui,
title={MagicGUI: A Foundational Mobile GUI Agent with Scalable Data Pipeline and Reinforcement Fine-tuning},
author={Liujian Tang and Shaokang Dong and Yijia Huang and Minqi Xiang and Hongtao Ruan and Bin Wang and Shuo Li and Zhiheng Xi and Zhihui Cao and Hailiang Pang and Heng Kong and He Yang and Mingxu Chai and Zhilin Gao and Xingyu Liu and Yingnan Fu and Jiaming Liu and Xuanjing Huang and Yu-Gang Jiang and Tao Gui and Qi Zhang and Kang Wang and Yunke Zhang and Yuran Wang},
year={2025},
eprint={2508.03700},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2508.03700},
}
```
Provider: maas
Created: 2025-09-03



