SEED-Bench-2-plus
Source: ModelScope community. Updated 2026-01-02. Indexed 2024-08-31.
Download link:
https://modelscope.cn/datasets/TencentARC/SEED-Bench-2-plus
Resource description:
# SEED-Bench-2-Plus Card
## Benchmark details
**Benchmark type:**
SEED-Bench-2-Plus is a large-scale benchmark to evaluate Multimodal Large Language Models (MLLMs).
It consists of 2.3K multiple-choice questions with precise human annotations, spanning three broad categories: Charts, Maps,
and Webs, each of which covers a wide spectrum of text-rich scenarios in the real world.
**Benchmark date:**
SEED-Bench-2-Plus was collected in April 2024.
**Paper or resources for more information:**
https://github.com/AILab-CVC/SEED-Bench
**License:**
Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). Use of the benchmark should also abide by the OpenAI terms of use: https://openai.com/policies/terms-of-use.
The images in SEED-Bench-2-Plus are taken from the internet under CC-BY licenses.
Please contact us if you believe any data infringes upon your rights, and we will remove it.
**Where to send questions or comments about the benchmark:**
https://github.com/AILab-CVC/SEED-Bench/issues
## Intended use
**Primary intended uses:**
The primary use of SEED-Bench-2-Plus is to evaluate Multimodal Large Language Models on text-rich visual understanding.
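Since the benchmark is made of multiple-choice questions grouped into Charts, Maps, and Webs, an evaluation typically reduces to per-category accuracy. The following is a minimal sketch of that scoring step, not the official evaluation script. The record keys `category`, `answer`, and `prediction` are hypothetical names chosen for illustration; the actual field names in the released files may differ.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Compute per-category and overall accuracy for multiple-choice answers.

    Each record is a dict with the hypothetical keys 'category', 'answer'
    (the ground-truth option letter) and 'prediction' (the model's letter).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        category = record["category"]
        total[category] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if record["prediction"].strip().upper() == record["answer"].strip().upper():
            correct[category] += 1

    scores = {category: correct[category] / total[category] for category in total}
    n_questions = sum(total.values())
    scores["overall"] = sum(correct.values()) / n_questions if n_questions else 0.0
    return scores


if __name__ == "__main__":
    # Made-up records for illustration only; these are not benchmark data.
    sample = [
        {"category": "Charts", "answer": "A", "prediction": "A"},
        {"category": "Charts", "answer": "B", "prediction": "C"},
        {"category": "Maps", "answer": "D", "prediction": " d "},
        {"category": "Webs", "answer": "A", "prediction": "B"},
    ]
    print(accuracy_by_category(sample))
```

Note that the overall score here is averaged over questions rather than over categories, so a category with more questions weighs more. Whether that matches the reporting convention of the benchmark authors is an assumption to check against the linked repository.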
**Primary intended users:**
The primary intended users of the Benchmark are researchers and hobbyists in computer vision, natural language processing, machine learning, and artificial intelligence.
Provider:
maas
Created:
2024-07-09



