HaochenWang/Grasp-Any-Region-Dataset

Source: Hugging Face. Updated 2025-10-30; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/HaochenWang/Grasp-Any-Region-Dataset
Description:
The Grasp Any Region (GAR) dataset is designed to give Multimodal Large Language Models (MLLMs) comprehensive region-level visual understanding. Through precise perception, use of crucial global context, and modeling of interactions between multiple prompts, it enables advanced compositional reasoning, so that a model can answer specific free-form questions about any region of an image or video. The dataset also underpins GAR-Bench, a new benchmark that evaluates single-region comprehension as well as interactions and complex reasoning across multiple regions in images and videos.
Provided by:
HaochenWang