five

HiTZ/pixmo-ask-model-anything_eu

收藏
Hugging Face2026-03-02 更新2026-04-05 收录
下载链接:
https://hf-mirror.com/datasets/HiTZ/pixmo-ask-model-anything_eu
下载链接
链接失效反馈
官方服务:
资源简介:
# Pixmo-ask-model-anything-eu (Basque Translation) ## 📚 Overview *pixmo-ask-model-anything-eu* is a **Basque** version of the original *pixmo-ask-model-anything* multimodal question-answering dataset. Every English QA pair was translated into Basque with **HiTZ/Latxa-Llama-3.1-70B-Instruct**. **Important:** This is **not the official dataset**. It is an independent community translation intended to broaden access for Basque-speaking researchers and practitioners. ## ✍️ Authors & Acknowledgements - **Original dataset:** *pixmo-ask-model-anything* — © 2024 AllenAI - **Basque translation & curation:** <Lukas Arana / HiTZ>, 2025 • Automatic translation with Latxa-Llama-70B If you use this Basque split, please cite both the original dataset and this translation. The JSONL schema is changed from the original dataset. 1. image : Name of the image in the original dataset. 2. Id: Unique Id for each sample 3. Conversations: Question and Ground Truth result in conversation format. ## 🔧 How We Built It 1. **MT** – Each English question and answer translated with HiTZ/Latxa-Llama-3.1-70B-Instruct No images were added or removed. ## 🚦 Limitations & Ethical Considerations - **Non-official**: Pixmo AI has not reviewed or endorsed this edition; subtle meaning shifts may remain. - **Model biases**: HiTZ/Latxa-Llama-3.1-70B-Instruct may amplify biases from either the source data or the MT model. ## 💻 Quick Start ``` from datasets import load_dataset ds = load_dataset("lukasArana/pixmo-ask-model-anything", split="eu") ``` All fields mirror the English original; only textual content is translated. ## 📜 License The translated files inherit the **same license** as the upstream dataset (CC-BY-SA-4.0). By downloading or using this repository, you agree to comply with that license, including proper attribution to both Pixmo AI and the Basque translation authors. ## 🏷️ Citation @misc{arana2025multimodallargelanguagemodels, title={Multimodal Large Language Models for Low-Resource Languages: A Case Study for Basque}, author={Lukas Arana and Julen Etxaniz and Ander Salaberria and Gorka Azkune}, year={2025}, eprint={2511.09396}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2511.09396}, }
提供机构:
HiTZ
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作