OpenGVLab/V2PE-Data

Name: OpenGVLab/V2PE-Data
Creator: OpenGVLab
Published: 2024-12-14 06:55:49
License: No description available

Source: Hugging Face. Updated 2024-12-14. Indexed 2024-12-14.

Download link:

https://hf-mirror.com/datasets/OpenGVLab/V2PE-Data

Description:

The V2PE-Data dataset includes two augmented long-context multimodal datasets: Long Visual Question Answering (Long-VQA) and Long Multimodal Retrieval (Long-MR). Both are intended to strengthen long-context training of vision-language models (VLMs) and to establish a systematic evaluation framework for long-context understanding challenges that fall outside the scope of existing training data.

Long-VQA extends 17 widely adopted datasets, containing 533K samples with sequences of up to 32K to 64K tokens. It covers tasks such as commonsense reasoning, factual knowledge, and the interpretation of visual information, and evaluates the capabilities of VLMs in understanding and reasoning over long multimodal sequences.

Long-MR inserts target images or textual segments into sequences of interleaved images and texts, assessing a model's ability to retrieve specific targets from ultra-long multimodal sequences. It has two subsets: Long-MR-32K and Long-MR-256K.
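
The Long-MR idea of inserting a target into an interleaved image-text sequence can be illustrated with a small sketch. This is not the dataset authors' construction pipeline: the `Segment` type, the `insert_target` helper, the placeholder item names, and the example question are all hypothetical and only show the general shape of a retrieval sample.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    """One element of an interleaved sequence (hypothetical structure)."""
    kind: str      # "image" or "text"
    content: str   # an image path for images, the raw string for text


def insert_target(haystack, target, rng):
    """Insert `target` at a random position of `haystack`.

    Returns the new sequence and the index where the target landed.
    The input list is left unmodified.
    """
    position = rng.randint(0, len(haystack))  # both endpoints are valid slots
    sequence = haystack[:position] + [target] + haystack[position:]
    return sequence, position


# Build a toy interleaved context: images and texts alternate.
rng = random.Random(0)
haystack = [
    Segment("image" if i % 2 == 0 else "text", f"item_{i}")
    for i in range(10)
]

# The target the model is later asked to retrieve (made-up content).
target = Segment("text", "The secret code is 1234.")
sequence, position = insert_target(haystack, target, rng)
question = "What is the secret code?"
```

In a real long-context setting the surrounding sequence would be far longer (for example, up to the 32K or 256K token budgets named by the two subsets), and the recorded `position` would let an evaluation report retrieval accuracy as a function of where the target was placed.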

Provider:

OpenGVLab
