THRONE

Source: arXiv | Updated: 2024-05-09 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2405.05256v1
Description:
THRONE is an object-based automatic framework for quantitatively evaluating Type I hallucinations in the free-form outputs of large vision-language models (LVLMs). It is built on the COCO 2017 validation set, which contains 5,000 images and 80 object categories. THRONE uses language models to judge whether an object mentioned in an LVLM response is implied to be present in the image or is only mentioned abstractly, without implying its presence. Its main use is evaluating and reducing hallucinations in the free-form descriptions that LVLMs generate, with the aim of making these models more reliable and accurate in safety-critical settings.
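
As a rough illustration of object-level hallucination scoring, the sketch below compares the objects a response is judged to assert against the ground-truth objects for each image. It assumes the language-model judging step has already produced one set of asserted object classes per image. The function name `object_precision_recall`, the toy image IDs, and the use of pooled precision and recall are assumptions made for this example, not necessarily the metric THRONE reports.

```python
from typing import Dict, Set, Tuple


def object_precision_recall(
    predicted: Dict[str, Set[str]],
    ground_truth: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Pool object mentions over all images and return (precision, recall).

    predicted: image ID -> classes the response was judged to assert (hypothetical input).
    ground_truth: image ID -> classes annotated as present in the image.
    """
    tp = fp = fn = 0
    for image_id, truth in ground_truth.items():
        pred = predicted.get(image_id, set())
        tp += len(pred & truth)   # asserted and actually present
        fp += len(pred - truth)   # asserted but absent: a hallucination
        fn += len(truth - pred)   # present but never mentioned
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for illustration only.
ground_truth = {"img1": {"dog", "frisbee"}, "img2": {"car"}}
predicted = {"img1": {"dog", "person"}, "img2": {"car"}}

precision, recall = object_precision_recall(predicted, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case "person" is counted as a hallucinated object and "frisbee" as a missed one, so lower precision signals more hallucination while recall tracks coverage of what is actually in the image.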
Provided by:
VGG, University of Oxford
Created:
2024-05-09