Qhuy204/VQA_VN_Destination

Name: Qhuy204/VQA_VN_Destination
Creator: Qhuy204
Published: 2025-12-09 09:27:54
License: No description provided

Hugging Face · Updated 2025-12-09 · Indexed 2025-12-20

Download link:

https://hf-mirror.com/datasets/Qhuy204/VQA_VN_Destination

Resource description:

---
language:
- vi
task_categories:
- visual-question-answering
tags:
- vietnam
- tourism
- travel
- vqa
size_categories:
- about 860 destination
---

# Vietnamese Travel Destination VQA Dataset

## Dataset Description

This dataset contains Visual Question Answering (VQA) pairs for Vietnamese travel destinations.

### Dataset Statistics

- **Total samples**: 29759
- **Number of shards**: 6
- **Total size**: 9799.20 MB
- **Total QA Pairs**: ~1,160,600

### Files

- `train-00000-of-00006.parquet` (1486.67 MB, 5000 rows)
- `train-00001-of-00006.parquet` (1635.00 MB, 5000 rows)
- `train-00002-of-00006.parquet` (1587.41 MB, 5000 rows)
- `train-00003-of-00006.parquet` (1784.27 MB, 5000 rows)
- `train-00004-of-00006.parquet` (1641.47 MB, 5000 rows)
- `train-00005-of-00006.parquet` (1664.38 MB, 4759 rows)

### Data Fields

- `id` (string): Unique identifier for each sample
- `image` (Image): Image of a Vietnamese travel destination (embedded as bytes)
- `conversations` (string): JSON string containing question-answer pairs in Vietnamese

### Data Format

Each `conversations` field is a JSON array in `role`/`content` format:

```json
[
  {"role": "user", "content": "Nội dung chính của bức ảnh là gì?"},
  {"role": "assistant", "content": "Bức ảnh chụp cảnh đẹp Việt Nam..."}
]
```

This format is compatible with:

- OpenAI Chat format
- LLaVA training
- LLaMA-style instruction tuning
- Most modern VLM frameworks

## Usage

### Load with the datasets library

```python
from datasets import load_dataset
import json

# Load full dataset
dataset = load_dataset("Qhuy204/VQA_VN_Destination")

# Access and parse conversations
for item in dataset['train']:
    print(item['id'])
    item['image'].show()

    # Parse conversations JSON
    conversations = json.loads(item['conversations'])
    for turn in conversations:
        print(f"{turn['role']}: {turn['content']}")

    break  # Inspect one sample; remove to iterate over the whole split
```

### Use with LLaVA/VLM training

```python
from datasets import load_dataset
import json

dataset = load_dataset("Qhuy204/VQA_VN_Destination")

def format_for_llava(example):
    '''Convert to LLaVA training format'''
    convs = json.loads(example['conversations'])
    return {
        'image': example['image'],
        'conversations': convs  # Already in correct format!
    }

# Ready for training
dataset = dataset.map(format_for_llava)
```

### Stream without downloading

```python
from datasets import load_dataset

# Stream mode - memory efficient
dataset = load_dataset("Qhuy204/VQA_VN_Destination", streaming=True)

for item in dataset['train']:
    # Process each item without loading entire dataset
    pass
```

### Load specific shard

```python
from datasets import load_dataset

# Load only first shard
dataset = load_dataset(
    "Qhuy204/VQA_VN_Destination",
    data_files="data/train-00000-of-00006.parquet"
)
```

## Technical Details

- **Image Storage**: Images are embedded as binary data in Parquet files
- **Compression**: ZSTD level 3 for optimal size/speed tradeoff
- **Sharding**: Multiple shards enable parallel downloading and streaming
- **Format**: Apache Parquet with Hugging Face metadata
- **Row Group Size**: 100 rows per group for optimal Dataset Viewer performance

## Citation

If you use this dataset, please cite:

```bibtex
@dataset{vietnamese_travel_vqa,
  title={Vietnamese Travel Destination VQA Dataset},
  author={Qhuy204},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/Qhuy204/VQA_VN_Destination},
  license={CC BY-SA 4.0},
  source_notes={Images collected from multiple public domain travel blogs and sites, processed by the author.}
}
```

## License

CC BY-SA 4.0

## Contact

For questions or issues, please open an issue on the dataset repository.
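Because `conversations` is stored as a JSON string of `role`/`content` turns, it can help to unpack each sample into explicit question-answer pairs before evaluation or filtering. The following is a minimal sketch, not part of the dataset's own tooling: the helper name `qa_pairs` is hypothetical, and it assumes that turns alternate user then assistant, as in the example given in the card. Turns that do not fit that pattern are skipped.

```python
import json


def qa_pairs(conversations_json):
    """Return (question, answer) tuples from one `conversations` JSON string.

    Assumption: each user turn is answered by the next assistant turn.
    Assistant turns without a preceding user turn are ignored.
    """
    turns = json.loads(conversations_json)
    pairs = []
    pending_question = None
    for turn in turns:
        if turn["role"] == "user":
            pending_question = turn["content"]
        elif turn["role"] == "assistant" and pending_question is not None:
            pairs.append((pending_question, turn["content"]))
            pending_question = None
    return pairs


# Stand-in for one row's `conversations` value, copied from the card's example
sample = json.dumps(
    [
        {"role": "user", "content": "Nội dung chính của bức ảnh là gì?"},
        {"role": "assistant", "content": "Bức ảnh chụp cảnh đẹp Việt Nam..."},
    ],
    ensure_ascii=False,
)

for question, answer in qa_pairs(sample):
    print("Q:", question)
    print("A:", answer)
```

With a loaded dataset, the same helper could be applied as `qa_pairs(item['conversations'])`. Going by the card's own statistics (about 1,160,600 pairs over 29759 samples), a row should yield roughly 39 pairs on average.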

Provider:

Qhuy204
