SEED-Bench-2

Name: SEED-Bench-2
Creator: Tencent AI Lab
Published: 2023-11-28 13:53:55
License: No description available

Source: arXiv. Updated: 2023-11-28. Indexed: 2024-06-21.

Download link:

https://github.com/AILab-CVC/SEED-Bench

Description:

SEED-Bench-2 is a comprehensive benchmark for evaluating multimodal large language models (MLLMs), developed by Tencent AI Lab. It contains 24,371 multiple-choice questions spanning 27 evaluation dimensions, covering comprehension as well as the generation of text and images. The questions are manually annotated to ensure accuracy. The benchmark is suited to evaluating a range of open-source MLLMs, as well as systems such as the combination of GPT-4V and DALL-E 3. Beyond comprehension, it tests how models handle inputs ranging from a single image with text to multimodal inputs, and whether they can generate both images and text, making it a useful tool for advancing work toward AGI.

Provider:

Tencent AI Lab

Created:

2023-11-28
