osunlp/Multimodal-Mind2Web
Source: Hugging Face | Updated: 2024-06-05 | Indexed: 2024-04-19
Download link:
https://hf-mirror.com/datasets/osunlp/Multimodal-Mind2Web
Overview:
Multimodal-Mind2Web is the multimodal version of the Mind2Web dataset, built for developing and evaluating generalist web agents that follow natural language instructions to complete complex tasks on any website. It aligns each HTML document with its corresponding webpage screenshot, which removes the inconvenience of loading images from the original Mind2Web dataset. The dataset consists of a training set and three test sets; each entry contains a screenshot image, HTML text, and other fields used for action prediction. Some screenshots in the training set may be incorrectly rendered, whereas the three test sets have been manually verified for element visibility and correct rendering.
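Because each entry bundles a screenshot with HTML text and action-prediction fields, it can help to inspect one entry before writing any training code. The following is a minimal sketch, not an official loader. It assumes the Hugging Face `datasets` package is installed and that the training split is named "train", which this listing does not state. `stream_split` and `describe_entry` are hypothetical helper names introduced here for illustration.

```python
from typing import Any, Dict, Iterable

DATASET_ID = "osunlp/Multimodal-Mind2Web"  # repository name from the listing


def stream_split(split: str = "train") -> Iterable[Dict[str, Any]]:
    """Stream one split instead of downloading it in full.

    Assumption: a split called `split` exists; "train" is a guess based on
    the listing's mention of a training set.
    """
    # Imported lazily so the helper below works without the package or network.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split, streaming=True)


def describe_entry(entry: Dict[str, Any]) -> Dict[str, str]:
    """Map each field of one entry to a short description of its value."""
    summary = {}
    for name, value in entry.items():
        if isinstance(value, str):
            # Text fields such as HTML: report the length, not the content.
            summary[name] = f"str, {len(value)} chars"
        else:
            # Images, lists, and other objects: report the Python type name.
            summary[name] = type(value).__name__
    return summary
```

Calling `describe_entry(next(iter(stream_split())))` would list the fields of the first streamed entry. The actual field names are not given in this listing, so check them on a real entry before relying on any of them.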
Provider:
osunlp
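Summary of original information
Dataset overview
- Name: Multimodal-Mind2Web
- Type: multimodal dataset
- Purpose: developing and evaluating generalist web agents that can follow language instructions to complete complex tasks on any website.
- Features:
  - Each HTML document is aligned with its corresponding webpage screenshot.
  - Removes the inconvenience of loading images from the original Mind2Web raw dump, which is about 300 GB.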

The content above was collected and summarized by 遇见数据集.



