
secret13/JAMMEval

Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/secret13/JAMMEval
Resource description:
---
dataset_info:
- config_name: CC-OCR-JA-Refined
  features:
  - name: original_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: ocr
    dtype: string
  - name: image_name
    dtype: string
  splits:
  - name: test
    num_bytes: 71840923
    num_examples: 145
  download_size: 71822359
  dataset_size: 71840923
- config_name: CVQA-JA-Refined
  features:
  - name: image
    dtype: image
  - name: original_id
    dtype: string
  - name: question
    dtype: string
  - name: options
    list: string
  - name: answer
    dtype: int64
  - name: Category
    dtype: string
  - name: Image Type
    dtype: string
  - name: Image Source
    dtype: string
  - name: License
    dtype: string
  splits:
  - name: test
    num_bytes: 62737078
    num_examples: 200
  download_size: 62713215
  dataset_size: 62737078
- config_name: Heron-Bench-Refined
  features:
  - name: original_id
    dtype: int64
  - name: image
    dtype: image
  - name: image_category
    dtype: string
  - name: context
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  splits:
  - name: test
    num_bytes: 38067838
    num_examples: 88
  download_size: 38046258
  dataset_size: 38067838
- config_name: JA-Multi-Image-VQA-Refined
  features:
  - name: original_id
    dtype: int64
  - name: images
    list: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: image_urls
    list: string
  - name: page_urls
    list: string
  splits:
  - name: test
    num_bytes: 99461072
    num_examples: 53
  download_size: 99466318
  dataset_size: 99461072
- config_name: JA-VLM-Bench-Refined
  features:
  - name: original_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: image_url
    dtype: string
  - name: page_url
    dtype: string
  splits:
  - name: test
    num_bytes: 6925318
    num_examples: 49
  download_size: 6917035
  dataset_size: 6925318
- config_name: JDocQA-Refined
  features:
  - name: original_id
    dtype: int64
  - name: question
    dtype: string
  - name: answer
    dtype: string
  splits:
  - name: test
    num_bytes: 103278
    num_examples: 861
  download_size: 64671
  dataset_size: 103278
- config_name: JGraphQA-Refined
  features:
  - name: original_id
    dtype: int64
  - name: type
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: image
    dtype: image
  splits:
  - name: test
    num_bytes: 33427252
    num_examples: 196
  download_size: 33423226
  dataset_size: 33427252
configs:
- config_name: CC-OCR-JA-Refined
  data_files:
  - split: test
    path: CC-OCR-JA-Refined/test-*
- config_name: CVQA-JA-Refined
  data_files:
  - split: test
    path: CVQA-JA-Refined/test-*
- config_name: Heron-Bench-Refined
  data_files:
  - split: test
    path: Heron-Bench-Refined/test-*
- config_name: JA-Multi-Image-VQA-Refined
  data_files:
  - split: test
    path: JA-Multi-Image-VQA-Refined/test-*
- config_name: JA-VLM-Bench-Refined
  data_files:
  - split: test
    path: JA-VLM-Bench-Refined/test-*
- config_name: JDocQA-Refined
  data_files:
  - split: test
    path: JDocQA-Refined/test-*
- config_name: JGraphQA-Refined
  data_files:
  - split: test
    path: JGraphQA-Refined/test-*
  default: true
license: other
language:
- ja
size_categories:
- 1K<n<10K
---

## Overview

JAMMEval is a curated benchmark collection for evaluating Vision-Language Models (VLMs) on Japanese Visual Question Answering (VQA) tasks. It is constructed by refining seven existing Japanese VQA evaluation datasets through two rounds of human annotation, with the goal of improving evaluation reliability and quality.

## Included Datasets

JAMMEval consists of the following seven refined datasets:

- CC-OCR-JA-Refined
- CVQA-JA-Refined
- Heron-Bench-Refined
- JA-Multi-Image-VQA-Refined
- JA-VLM-Bench-Refined
- JDocQA-Refined
- JGraphQA-Refined

Each dataset is derived from its original version (without the `-Refined` suffix) through a systematic refinement process.
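
As a rough sanity check, the seven configs can be compared against the `num_examples` values in the card metadata. The sketch below is illustrative only: `EXPECTED_ROWS`, `check_row_counts`, and `hub_loader` are hypothetical names introduced here, and the loader is passed in as an argument so the comparison logic can be exercised without network access.

```python
from typing import Callable, Dict, Sized

REPO_ID = "secret13/JAMMEval"

# Test-split sizes copied from the `num_examples` fields in the card metadata.
EXPECTED_ROWS: Dict[str, int] = {
    "CC-OCR-JA-Refined": 145,
    "CVQA-JA-Refined": 200,
    "Heron-Bench-Refined": 88,
    "JA-Multi-Image-VQA-Refined": 53,
    "JA-VLM-Bench-Refined": 49,
    "JDocQA-Refined": 861,
    "JGraphQA-Refined": 196,
}


def check_row_counts(load_config: Callable[[str], Sized]) -> Dict[str, bool]:
    """Return, per config, whether the loaded split has the expected length.

    `load_config` is any callable mapping a config name to a sized object.
    """
    return {
        name: len(load_config(name)) == expected
        for name, expected in EXPECTED_ROWS.items()
    }


def hub_loader(name: str) -> Sized:
    """Load one config's test split from the Hub (requires `datasets` and network)."""
    from datasets import load_dataset  # imported lazily so the helpers above run without it

    return load_dataset(REPO_ID, name, split="test")
```

Calling `check_row_counts(hub_loader)` downloads every config (roughly 300 MB in total according to the `download_size` fields), so it may be preferable to check one config at a time.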

## Refinement Process

The refinement focuses on addressing key issues found in the original datasets:

- Ambiguity in questions or answers
- Incorrect annotated answers
- Questions solvable without visual input (i.e., not requiring the image)

Through manual inspection and correction, JAMMEval improves the reliability of VLM evaluation, ensuring that tasks genuinely require multimodal understanding.

## Dataset Usage

```python
from datasets import load_dataset

ds = load_dataset("secret13/JAMMEval", "Heron-Bench-Refined", split="test")
print(ds)
```

Example output:

```
Dataset({
    features: ['original_id', 'image', 'image_category', 'context', 'question', 'answer'],
    num_rows: 88
})
```

## License

Each dataset is derived from its original source dataset and is subject to the license terms of the original dataset.

- [CC-OCR](https://arxiv.org/abs/2412.02210)
  - MIT
- [CVQA](https://arxiv.org/abs/2406.05967)
  - > Note that each question has its own license. All data here is free to use for research purposes, but not every entry is permissible for commercial use.
- [Heron-Bench](https://arxiv.org/abs/2404.07824)
  - > We have collected images that are either in the public domain or licensed under Creative Commons Attribution 1.0 (CC BY 1.0) or Creative Commons Attribution 2.0 (CC BY 2.0). Please refer to the LICENSE.md file for details on the licenses.
- [JA-Multi-Image-VQA](https://huggingface.co/datasets/SakanaAI/JA-Multi-Image-VQA)
  - The images in this dataset are sourced from Unsplash and are free to use under the Unsplash License. They cannot be sold without significant modification and cannot be used to replicate similar or competing services. All other parts of this dataset, excluding the images, are licensed under the Apache 2.0 License.
- [JA-VLM-Bench](https://huggingface.co/datasets/SakanaAI/JA-VLM-Bench-In-the-Wild)
  - > The images in this dataset are sourced from Unsplash and are free to use under the Unsplash License. They cannot be sold without significant modification and cannot be used to replicate similar or competing services.
- [JDocQA](https://arxiv.org/abs/2403.19454)
  - > JDocQA dataset annotations are distributed under CC BY-SA 4.0. We are delighted to see many derivations from JDocQA! When you create any derivations, e.g., datasets, papers, etc, from JDocQA, please cite our paper accordingly. If your derivations are web-based projects, please cite our paper and include the link to this github page.
- [JGraphQA](https://huggingface.co/datasets/r-g2-2024/JGraphQA)
  - License information is not clearly specified. Users should verify the original source before use.

⚠️ Since JAMMEval is a collection of datasets with different licenses, users must check the license of each individual dataset and each data entry (if applicable) before use. In particular, some datasets (e.g., CVQA) may include data that is restricted to non-commercial use.

**Note on JDocQA-Refined Images**

The images included in JDocQA-Refined must be used in compliance with Japanese copyright law:

> "Use is permitted only within the scope defined by Article 30-4 of the Japanese Copyright Act."

Users are responsible for ensuring that their use of these images complies with applicable regulations.
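
Because CVQA entries carry their own licenses, one way to act on the warning above is to tally the per-entry `License` column of CVQA-JA-Refined and keep only rows whose license is acceptable for the intended use. This is a minimal sketch, not an official tool: `license_counts` and `keep_allowed` are hypothetical helpers, and the license strings in the example are placeholders, since the actual values stored in the column are not documented in this card and should be inspected first.

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping, Set


def license_counts(rows: Iterable[Mapping[str, object]]) -> Dict[str, int]:
    """Count how many rows carry each value of the `License` field."""
    return dict(Counter(str(row["License"]) for row in rows))


def keep_allowed(
    rows: Iterable[Mapping[str, object]], allowed: Set[str]
) -> List[Mapping[str, object]]:
    """Return only the rows whose `License` value is in `allowed`."""
    return [row for row in rows if row["License"] in allowed]


# Toy rows with placeholder license strings (assumed, not taken from the dataset).
example_rows = [
    {"original_id": "a", "License": "CC BY 4.0"},
    {"original_id": "b", "License": "CC BY-NC 4.0"},
    {"original_id": "c", "License": "CC BY 4.0"},
]
print(license_counts(example_rows))
print(len(keep_allowed(example_rows, {"CC BY 4.0"})))
```

With a split loaded via `load_dataset`, iterating over `ds.select_columns(["License"])` instead of `ds` avoids decoding every image, and `ds.filter(lambda row: row["License"] in allowed)` applies the same selection while keeping the result as a `Dataset`.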
Provider: secret13