
easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct

ModelScope (魔搭社区). Updated: 2025-10-22. Indexed: 2025-11-03.
Download link:
https://modelscope.cn/datasets/mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct
Resource description:
# easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct

This dataset is derived from [mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP](https://huggingface.co/datasets/mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP) with synthetic prompts generated using Qwen/Qwen2.5-VL-7B-Instruct.

## Generation Details

- **Generated on**: 2025-08-21 14:56:03 UTC
- **Source dataset**: mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP
- **Split processed**: train
- **Model used**: Qwen/Qwen2.5-VL-7B-Instruct
- **Total entries**: 10000

## Synthetic Prompt Generation

Each entry in the original dataset has been processed to generate synthetic prompts that represent possible user tasks. The synthetic prompts are generated based on:

1. The UI element that was clicked
2. The context visible in the screenshot
3. Common user intentions for similar UI interactions

## New Fields Added

- `synthetic_prompts`: List of generated synthetic prompts (typically 3)
- `original_prompt`: The original user instruction before synthesis
- `selected_synthetic_prompt`: The synthetic prompt selected for the updated messages

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP-synthetic-prompts-qwen25vl7binstruct")

# Access the data
sample = dataset['train'][0]

# Original prompt
print(sample['original_prompt'])

# All generated synthetic prompts
print(sample['synthetic_prompts'])

# The selected synthetic prompt used in messages
print(sample['selected_synthetic_prompt'])

# Updated messages with synthetic prompt
print(sample['messages'])
```

## License

Inherits the license from the original dataset: mlfoundations-cua-dev/easyr1-10k-hard-qwen7b-easy-gta1-4MP
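As a complement to the usage snippet, the following is a minimal sketch of how the three added fields might be sanity-checked on a single record. The helper `summarize_prompts` and the example record are hypothetical and are not part of the dataset or its tooling. The sketch also assumes, without confirmation from the dataset card, that `selected_synthetic_prompt` is drawn from `synthetic_prompts`.

```python
from typing import Any, Dict


def summarize_prompts(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the prompt fields of one record (hypothetical helper)."""
    candidates = list(sample.get("synthetic_prompts") or [])
    selected = sample.get("selected_synthetic_prompt")
    original = sample.get("original_prompt")
    return {
        "num_candidates": len(candidates),
        # Assumption: the selected prompt is one of the generated candidates.
        "selected_in_candidates": selected in candidates,
        "selected_differs_from_original": selected != original,
    }


# Made-up record for illustration only; real rows also carry fields such as `messages`.
example = {
    "original_prompt": "Click the Save button",
    "synthetic_prompts": [
        "Save my changes to the document",
        "Store the current file",
        "Keep the edits I just made",
    ],
    "selected_synthetic_prompt": "Store the current file",
}

summary = summarize_prompts(example)
print(summary)
```

The same function could be applied to `dataset['train'][0]` from the usage snippet. Rows where `selected_in_candidates` is false would indicate that the assumption above does not hold for this dataset.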

Provider: maas
Created: 2025-10-03
Dataset introduction
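Background and challenges
Background overview
This dataset is derived from easyr1-10k-hard-qwen7b-easy-gta1-4MP, with synthetic prompts generated by the Qwen2.5-VL-7B-Instruct model to simulate user tasks. It contains 10,000 entries and adds fields for the list of synthetic prompts, the original prompt, and the selected prompt. It is intended for standard data loading and processing workflows.
The summary above was collected and generated by 遇见数据集.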