easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct

Name: easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct
Creator: maas
Published: 2025-10-22 15:23:56
License: No description provided

ModelScope community. Updated: 2025-10-22. Indexed: 2025-11-03.

Download link:

https://modelscope.cn/datasets/mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct


Resource overview:

# easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct

This dataset is derived from [mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP](https://huggingface.co/datasets/mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP) with synthetic prompts generated using Qwen/Qwen2.5-VL-7B-Instruct.

## Generation Details

- **Generated on**: 2025-08-21 14:56:03 UTC
- **Source dataset**: mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP
- **Split processed**: train
- **Model used**: Qwen/Qwen2.5-VL-7B-Instruct
- **Total entries**: 10000

## Synthetic Prompt Generation

Each entry in the original dataset has been processed to generate synthetic prompts that represent possible user tasks. The synthetic prompts are generated based on:

1. The UI element that was clicked
2. The context visible in the screenshot
3. Common user intentions for similar UI interactions

## New Fields Added

- `synthetic_prompts`: List of generated synthetic prompts (typically 3)
- `original_prompt`: The original user instruction before synthesis
- `selected_synthetic_prompt`: The synthetic prompt selected for the updated messages

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct")

# Access the data
sample = dataset['train'][0]

# Original prompt
print(sample['original_prompt'])

# All generated synthetic prompts
print(sample['synthetic_prompts'])

# The selected synthetic prompt used in messages
print(sample['selected_synthetic_prompt'])

# Updated messages with synthetic prompt
print(sample['messages'])
```

## License

Inherits the license from the original dataset: mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP
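
The three new fields can be checked against each other with a small helper. The following is a minimal sketch, not part of the dataset's own tooling: the function name `summarize_prompts` and the example record are hypothetical, and the prompt strings are invented for illustration rather than taken from the dataset. It relies only on the field names listed in the dataset card and makes no assumption about the structure of `messages`.

```python
from typing import Any, Dict


def summarize_prompts(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the prompt-related fields of one record.

    Hypothetical helper: it reads only the three fields named in the
    dataset card and tolerates records where a field is missing.
    """
    candidates = list(sample.get("synthetic_prompts") or [])
    selected = sample.get("selected_synthetic_prompt")
    return {
        "original": sample.get("original_prompt"),
        "selected": selected,
        "num_candidates": len(candidates),
        # True when the selected prompt is one of the generated candidates.
        "selected_is_candidate": selected in candidates,
    }


# Hypothetical record with invented prompt text (not real dataset content).
example = {
    "original_prompt": "Click the Save button",
    "synthetic_prompts": [
        "Save the current document",
        "Store my changes to this file",
        "Keep the edits I just made",
    ],
    "selected_synthetic_prompt": "Save the current document",
}

summary = summarize_prompts(example)
print(summary)
```

With the real data, the same helper could be applied to a loaded record, for example `summarize_prompts(dataset['train'][0])`, to check how many candidates an entry has and whether the selected prompt was drawn from them.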


Provider:

maas

Created:

2025-10-03

Collected summary

Dataset introduction