
llm-jp/WAON-Bench

Source: Hugging Face | Updated: 2026-04-13 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/llm-jp/WAON-Bench
Description:
---
dataset_info:
  features:
  - name: class
    dtype: string
  - name: url
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 218995
    num_examples: 1870
  download_size: 177502
  dataset_size: 218995
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---

# WAON-Bench: Japanese Cultural Image Classification Dataset

<div align="center" style="line-height: 1;">
  |
  <a href="https://huggingface.co/collections/llm-jp/waon" target="_blank">🤗 HuggingFace</a>
  &nbsp;|
  <a href="https://arxiv.org/abs/2510.22276" target="_blank">📄 Paper</a>
  &nbsp;|
  <a href="https://github.com/llm-jp/WAON" target="_blank">🧑‍💻 Code</a>
  &nbsp;|
  <br/>
</div>

<img src="WAON-Bench.jpg" alt="Overview of WAON-Bench" width="100%"/>

WAON-Bench is a manually curated image classification dataset designed to benchmark vision-language models on Japanese culture.
The dataset contains 374 classes across 8 categories (animals, buildings, events, everyday life, food, nature, scenery, and traditions), with 5 images per class, totaling 1,870 examples.

## How to Use

⚠️ This repository is a mirror of the original dataset hosted at https://gitlab.llm-jp.nii.ac.jp/datasets/WAON-Bench

Due to copyright restrictions, the version of WAON-Bench that includes images is hosted only on a domestic server; the images are not included in this mirror.

To use WAON-Bench, first download the dataset from the original repository:

```bash
# Clone the original repository and move its data directory into the working directory.
git clone https://gitlab.llm-jp.nii.ac.jp/datasets/WAON-Bench
mv WAON-Bench/data .
```

After placing the dataset directory locally, you can load the dataset using the 🤗 datasets library:

```python
from datasets import load_dataset

# "data" is the local directory moved into place in the previous step.
ds = load_dataset("data")
```

## Data Collection Pipeline

We followed the pipeline below to construct the dataset:

1. **Class Definition**: A total of 374 class names were manually defined and grouped into eight top-level categories: animal, building, event, everyday, food, nature, scenery, and tradition.
2. **Image Selection**: For each class, 5 images were manually retrieved using Google Image Search. Images were selected based on the following criteria:
    - The image should clearly represent the intended class.
    - It should not contain elements that could be easily confused with other classes.

## Dataset Format

Each sample includes:

- `class`: Class name
- `url`: Image URL
- `category`: Class category

Example:

```
{'class': '柴犬', 'url': 'https://img.wanqol.com/2020/11/6e489894-main.jpg?auto=format', 'category': 'animal'}
```

## Dataset Statistics

- **Total classes**: 374
- **Total images**: 1,870
- **Number of classes per category**

| **category** | animal | building | event | everyday | food | nature | scenery | tradition | total |
|-------------:|-------:|---------:|------:|---------:|-----:|-------:|--------:|----------:|------:|
| **count**    | 41     | 40       | 29    | 45       | 55   | 27     | 75      | 62        | 374   |

- **Example Class Names per Category**

| category  | class names |
|:----------|:------------|
| animal    | '柴犬', 'エゾシカ', 'ニホンカモシカ', 'イノシシ', ... |
| building  | '鳥居', '茶室', '合掌造り', '町家', '縁側', ... |
| event     | '花見', '花火大会', '盆踊り', '運動会', '卒業式', '成人式', ... |
| everyday  | 'カラオケ', '温泉', '屋台', '洗濯物', '敷布団', ... |
| food      | '茄子', 'しらす', 'ラーメン', '焼き鳥', '焼肉', ... |
| nature    | '桜', '梅', '藤', '松', '噴火', ... |
| scenery   | '茶畑', '雪国の街並み', '漁港', '砂防ダム', '石垣', ... |
| tradition | '華道', '剣道', '柔道', '弓道', ... |

- **t-SNE Visualization of SigLIP2 Embeddings**

The figure below shows a 2D t-SNE projection of image embeddings generated using [google/siglip2-base-patch16-256](https://huggingface.co/google/siglip2-base-patch16-256). Each point represents one image in the dataset.

<img src="siglip_tsne_visualization.png" alt="t-SNE Visualization" width="50%"/>

## LICENSE

This dataset is licensed under the Apache License 2.0.

## Citation

```bibtex
@misc{sugiura2025waonlargescalehighqualityjapanese,
  title={WAON: Large-Scale and High-Quality Japanese Image-Text Pair Dataset for Vision-Language Models},
  author={Issa Sugiura and Shuhei Kurita and Yusuke Oda and Daisuke Kawahara and Yasuo Okabe and Naoaki Okazaki},
  year={2025},
  eprint={2510.22276},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2510.22276},
}
```
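
As a quick sanity check after loading, the per-category class counts and the 5-images-per-class figure reported under Dataset Statistics can be recomputed from the `class` and `category` fields alone. The following is a minimal sketch, assuming records shaped like the example under Dataset Format; the helper name `summarize` and the sample records (including their placeholder URLs) are hypothetical and not part of the released code.

```python
from collections import defaultdict


def summarize(records):
    """Return (distinct classes per category, images per (category, class))."""
    classes_by_category = defaultdict(set)
    images_per_class = defaultdict(int)
    for rec in records:
        classes_by_category[rec["category"]].add(rec["class"])
        images_per_class[(rec["category"], rec["class"])] += 1
    class_counts = {cat: len(names) for cat, names in classes_by_category.items()}
    return class_counts, dict(images_per_class)


# Hypothetical records for illustration only; the URLs are placeholders.
sample = [
    {"class": "柴犬", "url": "https://example.com/a.jpg", "category": "animal"},
    {"class": "柴犬", "url": "https://example.com/b.jpg", "category": "animal"},
    {"class": "エゾシカ", "url": "https://example.com/c.jpg", "category": "animal"},
    {"class": "鳥居", "url": "https://example.com/d.jpg", "category": "building"},
]

class_counts, images_per_class = summarize(sample)
print(class_counts)
print(images_per_class[("animal", "柴犬")])
```

Passing the loaded split instead of `sample` (for instance `ds["train"]`, assuming the local copy exposes a `train` split as the metadata suggests) should yield counts that sum to 374 classes with 5 images each, if the data matches the statistics above.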
Provider:
llm-jp