JA-VG-VQA-500
Collected from the ModelScope community. Updated 2025-12-26; indexed 2025-01-18.
Download link: https://modelscope.cn/datasets/SakanaAI/JA-VG-VQA-500
Resource description:
# JA-VG-VQA-500
## Dataset Description
**JA-VG-VQA-500** is a 500-sample subset of the [Japanese Visual Genome VQA dataset](https://github.com/yahoojapan/ja-vg-vqa).
This dataset was used in the evaluation of [EvoVLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoVLM-JP-v1-7B).
Please refer to our [report](https://arxiv.org/abs/2403.13187) and [blog](https://sakana.ai/evolutionary-model-merge/) for more details.
We are grateful to the developers of the following source datasets for making them available under the [Creative Commons Attribution 4.0 License](https://creativecommons.org/licenses/by/4.0/legalcode):
- [Visual Genome](https://homes.cs.washington.edu/~ranjay/visualgenome/index.html)
- [Japanese Visual Genome VQA dataset](https://github.com/yahoojapan/ja-vg-vqa)
## Usage
Use the code below to get started with the dataset.
```python
from datasets import load_dataset
dataset = load_dataset("SakanaAI/JA-VG-VQA-500", split="test")
```
See [our GitHub repository](https://github.com/SakanaAI/evolutionary-model-merge) to evaluate Japanese VLMs.
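
As a quick sanity check after loading, it can help to confirm the split's size and column names before writing any evaluation code. The sketch below is only an illustration, not part of the official tooling: `describe_split` and `load_test_split` are hypothetical helper names, and the code assumes only that the loaded object supports `len()` and exposes `column_names`, as `datasets.Dataset` does. This card does not list the column names, so the helper reports them rather than assuming them.

```python
from typing import Any, Dict


def describe_split(dataset: Any) -> Dict[str, Any]:
    """Return the row count and column names of a loaded split.

    Works with any object that supports len() and has a `column_names`
    attribute (for example, a `datasets.Dataset`).
    """
    return {
        "num_rows": len(dataset),
        "columns": list(dataset.column_names),
    }


def load_test_split(repo_id: str = "SakanaAI/JA-VG-VQA-500") -> Any:
    """Load the test split; hypothetical convenience wrapper around load_dataset."""
    # Imported lazily so describe_split can be used without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split="test")
```

Calling `describe_split(load_test_split())` requires network access to download the data; if the split matches the description above, `num_rows` should come out as 500.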
## Acknowledgement
We would like to thank the developers of the source datasets for their contributions and for making their work available.
## Citation
```bibtex
@article{Krishna2016VisualGC,
  title   = {Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations},
  author  = {Ranjay Krishna and Yuke Zhu and Oliver Groth and Justin Johnson and Kenji Hata and Joshua Kravitz and Stephanie Chen and Yannis Kalantidis and Li-Jia Li and David A. Shamma and Michael S. Bernstein and Li Fei-Fei},
  journal = {International Journal of Computer Vision},
  year    = {2017},
  volume  = {123},
  pages   = {32--73},
  url     = {https://doi.org/10.1007/s11263-016-0981-7},
  doi     = {10.1007/s11263-016-0981-7}
}
```
```bibtex
@InProceedings{C18-1163,
author = "Shimizu, Nobuyuki and Rong, Na and Miyazaki, Takashi",
title = "Visual Question Answering Dataset for Bilingual Image Understanding: A Study of Cross-Lingual Transfer Using Attention Maps",
booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
year = "2018",
publisher = "Association for Computational Linguistics",
pages = "1918--1928",
location = "Santa Fe, New Mexico, USA",
url = "http://aclweb.org/anthology/C18-1163"
}
```
Provider: maas
Created: 2025-01-17



