HALLUCINOGEN
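HALLUCINOGEN Dataset Overview
Dataset Introduction
HALLUCINOGEN is a new benchmark for evaluating hallucination in large vision-language models (LVLMs) in object detection and medical applications. The benchmark introduces diverse, complex contextual reasoning prompts, called object hallucination attacks, designed specifically to query LVLMs about visual objects that may not be present in a target image. The dataset contains 60,000 image-prompt combinations covering 3,000 visual-object pairs, and spans four basic vision-language tasks of increasing difficulty: identification, localization, visual contextual reasoning, and counterfactual reasoning.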
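MED-HALLUCINOGEN Extension
HALLUCINOGEN extends to high-stakes medical applications through MED-HALLUCINOGEN, which evaluates how accurately LVLMs diagnose diseases from biomedical images such as chest X-rays. MED-HALLUCINOGEN contains 3,000 unique X-ray and disease pairs, each paired with ten hallucination attack prompts: five for disease identification and five for localization.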
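The stated sizes can be cross-checked with a little arithmetic. The sketch below assumes five prompts per pair per task for HALLUCINOGEN, which is inferred from the five `query_N` fields in the JSON examples rather than stated outright; the MED-HALLUCINOGEN total is simply the product of the two figures given above.

```python
# Figures stated in the dataset description.
PAIRS = 3_000
TASKS = 4
MED_PAIRS = 3_000
MED_PROMPTS_PER_PAIR = 10

# Assumption: five prompts per pair per task (query_1..query_5 in the JSON examples).
PROMPTS_PER_TASK = 5

hallucinogen_total = PAIRS * TASKS * PROMPTS_PER_TASK
med_total = MED_PAIRS * MED_PROMPTS_PER_PAIR

print(hallucinogen_total)  # 60000
print(med_total)           # 30000
```

Under that assumption the product matches the 60,000 image-prompt combinations reported for HALLUCINOGEN.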
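Dataset Sources
- The image-object pairs in HALLUCINOGEN are extracted from all splits of the POPE benchmark.
- The chest X-rays and corresponding diagnosed diseases in MED-HALLUCINOGEN come from NIH Chest X-rays.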
Directory Structure
HALLUCINOGEN Directory Structure
└── hallucinogen
    ├── data
    │   ├── task_name (e.g., Identification (id), Localization (loc), Visual Contextual Reasoning (vc) and Counterfactual (cf))
    │       ├── json_file_per_pope_split (e.g., adversarial (_adv), random (_rand) and popular (_pop))
    ...
MED-HALLUCINOGEN Directory Structure
└── med_hallucinogen
    ├── data
    │   ├── task_name (e.g., Identification (id), Localization (loc))
    │       ├── json_file
    ...
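To locate the prompt files for one task, something like the following may work. It is only a sketch: it assumes each task is a subdirectory of `data` named by its abbreviation (`id`, `loc`, `vc`, `cf`) and that the prompt files end in `.json`. The trees above do not give exact file names, so the helper `list_split_files` (a hypothetical name) globs rather than guessing them.

```python
from pathlib import Path

# Hypothetical task directory names, taken from the abbreviations in the trees.
TASKS = ("id", "loc", "vc", "cf")

def list_split_files(root, task):
    """Return the sorted JSON files found under <root>/data/<task>."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    return sorted((Path(root) / "data" / task).glob("*.json"))
```

For example, `list_split_files("hallucinogen", "id")` would return whatever JSON files exist for the identification task, if the layout matches these assumptions.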
JSON File Structure
HALLUCINOGEN JSON File Structure
{
  "question_id": 1,
  "image": "COCO_val2014_000000310196.jpg",
  "text": "Is there a snowboard in the image?",
  "label": "yes",
  "query_1": "Is the snowboard present in the image?",
  "query_2": "Determine whether a snowboard is visible in the image.",
  "query_3": "Assess if a snowboard can be seen in the image.",
  "query_4": "Find out if a snowboard appears in the photo.",
  "query_5": "Verify if there is a snowboard in the image."
}
...
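Each record carries one label and several paraphrased prompts, so an evaluation loop will typically expand a record into one example per prompt. The sketch below does this for the sample record above. The helper name `expand_queries` is hypothetical, and how records are stored on disk (a JSON array or one object per line) is not specified here, so the example parses a single inline record only.

```python
import json

# Sample record copied from the example above.
record = json.loads("""
{
  "question_id": 1,
  "image": "COCO_val2014_000000310196.jpg",
  "text": "Is there a snowboard in the image?",
  "label": "yes",
  "query_1": "Is the snowboard present in the image?",
  "query_2": "Determine whether a snowboard is visible in the image.",
  "query_3": "Assess if a snowboard can be seen in the image.",
  "query_4": "Find out if a snowboard appears in the photo.",
  "query_5": "Verify if there is a snowboard in the image."
}
""")

def expand_queries(rec):
    """Return (image, prompt, label) triples for every query_N field, ordered by N."""
    keys = sorted(
        (k for k in rec if k.startswith("query_") and k[6:].isdigit()),
        key=lambda k: int(k[6:]),
    )
    return [(rec["image"], rec[k], rec["label"]) for k in keys]

triples = expand_queries(record)
for image, prompt, label in triples:
    print(image, "|", prompt, "|", label)
```

Sorting on the numeric suffix keeps the prompts in order even if a file were to contain ten or more `query_N` fields.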
MED-HALLUCINOGEN JSON File Structure
{
  "image": "images_001/images/00000003_003.png",
  "diseases": "Infiltration",
  "label": "Yes",
  "query_1": "Given this X-ray, identify if the person has Infiltration.",
  "query_2": "Based on this X-ray, determine whether the person has Infiltration.",
  "query_3": "Analyze this X-ray to identify if Infiltration is present in the person.",
  "query_4": "Examine this X-ray and conclude if the person has Infiltration.",
  "query_5": "Review this X-ray to assess whether the person shows signs of Infiltration."
}
...
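Note that the label casing differs between the two examples (`"yes"` in HALLUCINOGEN, `"Yes"` in MED-HALLUCINOGEN), so comparisons are safer when case-insensitive. The following is a minimal sketch of one possible scoring rule, which reads the first word of a model's answer as yes or no. This rule and the function names are assumptions for illustration, not the benchmark's official metric.

```python
def normalize_answer(text):
    """Map a free-form answer to 'yes' or 'no', or None if neither leads the text."""
    words = text.strip().lower().replace(",", " ").replace(".", " ").split()
    if not words:
        return None
    return words[0] if words[0] in ("yes", "no") else None

def is_correct(model_output, label):
    """True when the model's leading yes/no matches the label, ignoring case."""
    predicted = normalize_answer(model_output)
    return predicted is not None and predicted == normalize_answer(label)

print(is_correct("Yes, the X-ray shows Infiltration.", "Yes"))  # True
print(is_correct("No.", "Yes"))                                 # False
```

Answers that do not begin with yes or no are counted as incorrect under this rule; a stricter or more lenient parser may suit other models better.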
Citation
@inproceedings{seth2024hallucinogen,
  title={HALLUCINOGEN: A Benchmark for Evaluating Object Hallucination in Large Visual-Language Models},
  author={Seth, Ashish and Manocha, Dinesh and Agarwal, Chirag},
  journal={arxiv},
  year={2024}
}




