Act2Intention
Source: ModelScope community (魔搭社区) | Updated: 2025-09-15 | Indexed: 2025-09-20
Download link:
https://modelscope.cn/datasets/AriaDeLuca/Act2Intention
Description:
# Act2Intention Bench
## 📝 Overview
Mobile GUI agents powered by multimodal large language models (MLLMs) show promise in human-computer intelligence. However, current research focuses primarily on reactive task execution and lacks the ability to proactively infer user intentions, which is a core requirement of active agents. In this paper, we propose the Act2Intention framework, which builds an active mobile agent by integrating intention understanding, intention prediction, and decision execution.
To realize this paradigm, we introduce Act2Intention Bench, the first comprehensive dataset designed for studying agents based on GUI action trajectories. It contains 450 personas, 72,511 intentions, and over 70,000 actions across 52 applications. We further develop Act2Intention Agent, a multi-agent framework (based on Qwen2.5-7B/Llama3.1-8B) that combines three specialized agents:
1. Intent Understanding, which infers the user intention underlying a sequence of GUI actions;
2. Intent Prediction, which infers potential user intentions by integrating historical intentions and their characteristics;
3. Intention Execution, which leverages historical experience to guide task execution.
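
The three-agent flow described above can be sketched roughly as follows. This is only an illustration of how the stages might be chained, not the authors' implementation: the `Action` type, the `ActiveAgentPipeline` class, and the three stub functions are hypothetical names introduced here, and in the real framework each stage would presumably be backed by a model such as Qwen2.5-7B or Llama3.1-8B.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical GUI action record; the dataset card does not give a schema."""
    app: str
    description: str


@dataclass
class ActiveAgentPipeline:
    """Chains the three stages: understand -> predict -> execute (assumed order)."""
    understand: Callable[[List[Action]], str]
    predict: Callable[[List[str]], str]
    execute: Callable[[str, List[str]], List[Action]]
    history: List[str] = field(default_factory=list)

    def step(self, trajectory: List[Action]) -> List[Action]:
        if not trajectory:
            raise ValueError("trajectory must contain at least one action")
        # 1) Intent Understanding: summarize the observed actions as an intention.
        intent = self.understand(trajectory)
        self.history.append(intent)
        # 2) Intent Prediction: infer a likely next intention from the history.
        next_intent = self.predict(self.history)
        # 3) Intention Execution: plan actions, guided by past intentions.
        return self.execute(next_intent, self.history)


# Stub stages standing in for model calls (assumptions, for illustration only).
def stub_understand(trajectory: List[Action]) -> str:
    return f"use {trajectory[-1].app}"


def stub_predict(history: List[str]) -> str:
    return history[-1]


def stub_execute(intent: str, history: List[str]) -> List[Action]:
    return [Action(app="stub", description=f"carry out: {intent}")]


pipeline = ActiveAgentPipeline(stub_understand, stub_predict, stub_execute)
plan = pipeline.step([Action(app="Maps", description="open app")])
```

Passing the stages in as plain callables keeps the sketch independent of any particular model backend; each stub could be swapped for a function that prompts an MLLM.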
Provider:
maas
Created:
2025-09-15



