ISG-Bench
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment
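Dataset Overview
- Name: Interleaved Scene Graph (ISG)
- Description: This dataset is used to evaluate interleaved text-and-image generation at four levels: structure, block, image, and holistic.
- Applicable domain: multimodal understanding and generation tasks, for example evaluating models such as Show-o and Anole.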
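Dataset Structure
- Files:
  - ISG-Bench.jsonl: the benchmark data compiled by ISG. Each sample contains a query and a human-annotated golden answer.
  - images: the images referenced by the queries and golden answers. They must be downloaded from huggingface and placed under the ISG_eval directory.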
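As a minimal sketch of reading these files, the snippet below parses a JSONL file one object per line and attaches a resolved path to every image block. The function name `load_isg_bench` and the added `path` key are hypothetical, introduced only for illustration; it also assumes that the `content` of an image block is a path relative to the directory holding `images`, which matches the sample but is not confirmed by the dataset card.

```python
import json
from pathlib import Path


def load_isg_bench(jsonl_path, image_root):
    """Read one JSON object per line and resolve image paths against image_root.

    Hypothetical helper: the "path" key is added for convenience and is not
    part of the original data format.
    """
    samples = []
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            sample = json.loads(line)
            for field in ("Query", "Golden"):
                for block in sample.get(field, []):
                    if block.get("type") == "image":
                        # Assumption: "content" is relative to image_root
                        block["path"] = str(Path(image_root) / block["content"])
            samples.append(sample)
    return samples
```

A call such as `load_isg_bench("ISG_eval/ISG-Bench.jsonl", "ISG_eval")` would then return a list of dictionaries; the exact location of the JSONL file is an assumption here.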
Data Sample Example
    {
      "id": "0000",
      "Category": "Prediction",
      "Query": [
        {
          "type": "text",
          "content": "I will give you a picture of a person washing their hands. Please use a combination of 4 images and text to show what will happen next. Please generate an overall description first, then directly generate adjacent image blocks. For example, [whole description] <object1 image> <object2 image> <object3 image> <object4 image>."
        },
        { "type": "image", "content": "images/0000_q1.jpg" }
      ],
      "Golden": [
        {
          "type": "text",
          "content": "The person continues to scrub their hands thoroughly, with the soap lathering up. The hands are cleaned under running water, and the lather is rinsed away."
        },
        { "type": "image", "content": "images/0000_g1.jpg" },
        { "type": "image", "content": "images/0000_g2.jpg" },
        { "type": "image", "content": "images/0000_g3.jpg" },
        { "type": "image", "content": "images/0000_g4.jpg" }
      ],
      "predict": {
        "structural": {
          "Query": [ "<query_text1>", "<query_img1>" ],
          "Answer": [ "<gen_text1>", "<gen_img1>", "<gen_img2>", "<gen_img3>", "<gen_img4>" ]
        },
        "block_tuple": {
          "relation": [
            [ "<gen_text1>", "<query_img1>", "is an overall description of" ],
            ...
          ]
        },
        "block_qa": {
          "questions": [
            {
              "subject": "<gen_text1>",
              "object": "<query_img1>",
              "relation": "is an overall description of",
              "Question": "Does <gen_text1> describe this image?"
            },
            ...
          ]
        },
        "image_tuple": [
          [ "entity", "hands", "<gen_img1>" ],
          ...
        ],
        "image_qa": {
          "questions": [
            {
              "image": "<gen_img1>",
              "Question": "Are there hands in this image?",
              "id": 0,
              "Preliminary": []
            },
            ...
          ]
        }
      }
    }
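The `predict.structural.Answer` list in the sample reads as the expected sequence of text and image blocks in a model's answer. The sketch below shows one way such a check could work: it turns a list of generated blocks into tags and compares them with the expected list. The helpers `blocks_to_tags` and `matches_structure` are hypothetical, and the tag-numbering rule (a separate counter per block type, starting at 1) is inferred from the sample only; the actual ISG-eval.py may apply a different check.

```python
def blocks_to_tags(blocks, prefix):
    """Map blocks like {"type": "text"} to tags such as <gen_text1>, <gen_img1>.

    Assumption: text and image blocks are numbered independently from 1,
    as the sample's "structural" field suggests.
    """
    short = {"text": "text", "image": "img"}
    counters = {"text": 0, "image": 0}
    tags = []
    for block in blocks:
        kind = block["type"]
        counters[kind] += 1
        tags.append(f"<{prefix}_{short[kind]}{counters[kind]}>")
    return tags


def matches_structure(generated_blocks, sample):
    """Return True if the generated blocks follow predict.structural.Answer."""
    expected = sample["predict"]["structural"]["Answer"]
    return blocks_to_tags(generated_blocks, "gen") == expected
```

Under these assumptions, the golden answer of sample "0000" (one text block followed by four image blocks) passes the check, while an answer with only three images does not.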
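Evaluation Method
- Environment setup: GPT-4o is used for VQA, and MLLM-as-a-Judge is used for the holistic evaluation.
- Model evaluation: evaluate a custom model's outputs by running the ISG-eval.py and summarize_performance.py scripts.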
Citation
@article{chen2024interleaved,
  title={Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment},
  author={Dongping Chen and Ruoxi Chen and Shu Pu and Zhaoyi Liu and Yanru Wu and Caixi Chen and Benlin Liu and Yue Huang and Yao Wan and Pan Zhou and Ranjay Krishna},
  journal={arXiv preprint arXiv:2411.17188},
  year={2024},
}




