Qhuy204/VQA_VN_Destination
Source: Hugging Face · Updated 2025-12-09 · Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/Qhuy204/VQA_VN_Destination
Description:
---
language:
- vi
task_categories:
- visual-question-answering
tags:
- vietnam
- tourism
- travel
- vqa
size_categories:
- about 860 destination
---
# Vietnamese Travel Destination VQA Dataset
## Dataset Description
This dataset contains Vietnamese-language Visual Question Answering (VQA) pairs about images of travel destinations in Vietnam, covering about 860 destinations.
### Dataset Statistics
- **Total samples**: 29759
- **Number of shards**: 6
- **Total size**: 9799.20 MB
- **Total QA Pairs**: ~1,160,600
### Files
- `train-00000-of-00006.parquet` (1486.67 MB, 5000 rows)
- `train-00001-of-00006.parquet` (1635.00 MB, 5000 rows)
- `train-00002-of-00006.parquet` (1587.41 MB, 5000 rows)
- `train-00003-of-00006.parquet` (1784.27 MB, 5000 rows)
- `train-00004-of-00006.parquet` (1641.47 MB, 5000 rows)
- `train-00005-of-00006.parquet` (1664.38 MB, 4759 rows)
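As a quick consistency check, the per-shard figures listed above can be summed and compared with the totals under Dataset Statistics. This is a minimal sketch that only re-uses the numbers printed on this card; the variable names are illustrative.

```python
# Shard sizes (MB) and row counts copied from the file list on this card.
shards = {
    "train-00000-of-00006.parquet": (1486.67, 5000),
    "train-00001-of-00006.parquet": (1635.00, 5000),
    "train-00002-of-00006.parquet": (1587.41, 5000),
    "train-00003-of-00006.parquet": (1784.27, 5000),
    "train-00004-of-00006.parquet": (1641.47, 5000),
    "train-00005-of-00006.parquet": (1664.38, 4759),
}

total_mb = round(sum(size for size, _ in shards.values()), 2)
total_rows = sum(rows for _, rows in shards.values())

# Approximate QA pairs per sample, using the ~1,160,600 figure from the card.
qa_per_sample = 1_160_600 / total_rows

print(f"{total_mb:.2f} MB across {total_rows} rows")
print(f"about {qa_per_sample:.1f} QA pairs per sample")
```

The sums agree with the stated totals (9799.20 MB and 29759 samples), which works out to roughly 39 question-answer pairs per image.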
### Data Fields
- `id` (string): Unique identifier for each sample
- `image` (Image): Image of a Vietnamese travel destination (embedded as bytes)
- `conversations` (string): JSON string containing question-answer pairs in Vietnamese
### Data Format
Each `conversations` value is a JSON-encoded array of turns, each with a `role` and a `content` key:
```json
[
  {"role": "user", "content": "Nội dung chính của bức ảnh là gì?"},
  {"role": "assistant", "content": "Bức ảnh chụp cảnh đẹp Việt Nam..."}
]
```
This format is compatible with:
- OpenAI Chat format
- LLaVA training (the original LLaVA data files use `from`/`value` keys, so a key rename may be needed)
- LLaMA-style instruction tuning
- Most modern VLM frameworks
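Some training pipelines, including the original LLaVA data files, expect turns keyed by `from`/`value` with `human`/`gpt` speakers rather than `role`/`content`. The sketch below shows one way to map between the two. The helper name `to_llava_turns`, the role mapping, and the choice to prepend an `<image>` placeholder to the first user turn are assumptions for illustration, not part of this dataset.

```python
import json

# Assumed mapping from chat roles to LLaVA-style speaker tags.
ROLE_TO_FROM = {"user": "human", "assistant": "gpt"}


def to_llava_turns(conversations_json):
    """Convert a role/content JSON string into a list of from/value turns."""
    turns = json.loads(conversations_json)
    converted = []
    for i, turn in enumerate(turns):
        value = turn["content"]
        # Mark the image position in the first user turn if it is not marked yet.
        if i == 0 and turn["role"] == "user" and "<image>" not in value:
            value = "<image>\n" + value
        converted.append({"from": ROLE_TO_FROM[turn["role"]], "value": value})
    return converted


sample = json.dumps([
    {"role": "user", "content": "Nội dung chính của bức ảnh là gì?"},
    {"role": "assistant", "content": "Bức ảnh chụp cảnh đẹp Việt Nam..."},
], ensure_ascii=False)

for turn in to_llava_turns(sample):
    print(turn)
```

Frameworks that consume OpenAI-style messages can use the parsed list unchanged; check the expected schema of the target trainer before converting.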
## Usage
### Load with the `datasets` library
```python
from datasets import load_dataset
import json
# Load full dataset
dataset = load_dataset("Qhuy204/VQA_VN_Destination")
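# Access and parse conversations.
# Preview only the first 3 samples: calling .show() on every row
# would open one viewer window per image.
for item in dataset['train'].select(range(3)):
    print(item['id'])
    item['image'].show()
    # Parse conversations JSON
    conversations = json.loads(item['conversations'])
    for turn in conversations:
        print(f"{turn['role']}: {turn['content']}")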
```
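Since each sample carries many turns, it is often convenient to work with explicit question-answer pairs. The following is a minimal sketch, assuming turns alternate between `user` and `assistant`; the helper name `qa_pairs` is hypothetical, and unmatched turns are simply skipped.

```python
import json


def qa_pairs(conversations_json):
    """Return (question, answer) tuples from a role/content JSON string."""
    turns = json.loads(conversations_json)
    pairs = []
    question = None
    for turn in turns:
        if turn["role"] == "user":
            # Remember the latest question until an answer arrives.
            question = turn["content"]
        elif turn["role"] == "assistant" and question is not None:
            pairs.append((question, turn["content"]))
            question = None
    return pairs


example = json.dumps([
    {"role": "user", "content": "Nội dung chính của bức ảnh là gì?"},
    {"role": "assistant", "content": "Bức ảnh chụp cảnh đẹp Việt Nam..."},
], ensure_ascii=False)

for question, answer in qa_pairs(example):
    print("Q:", question)
    print("A:", answer)
```

With a loaded sample, the same helper can be called as `qa_pairs(item['conversations'])`.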
### Use with LLaVA/VLM training
```python
from datasets import load_dataset
import json
dataset = load_dataset("Qhuy204/VQA_VN_Destination")
def format_for_llava(example):
    '''Convert to LLaVA training format'''
    convs = json.loads(example['conversations'])
    return {
        'image': example['image'],
        'conversations': convs  # Already in correct format!
    }
# Ready for training
dataset = dataset.map(format_for_llava)
```
### Stream without downloading
```python
from datasets import load_dataset
# Stream mode - memory efficient
dataset = load_dataset("Qhuy204/VQA_VN_Destination", streaming=True)
for item in dataset['train']:
    # Process each item without loading entire dataset
    pass
```
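When streaming, it helps to cap how many rows are pulled while exploring. The sketch below counts user questions in the first few rows of any iterable of samples. The helper name `count_user_turns` and its `limit` parameter are illustrative, and the function uses only the standard library so it can be tried on plain dictionaries first.

```python
import json
from itertools import islice


def count_user_turns(samples, limit=100):
    """Count user questions in the first `limit` rows of an iterable of samples."""
    total = 0
    for row in islice(samples, limit):
        turns = json.loads(row["conversations"])
        total += sum(1 for turn in turns if turn["role"] == "user")
    return total


# Offline demonstration with hand-made rows shaped like the dataset's records.
fake_row = {"conversations": json.dumps([
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
])}
print(count_user_turns([fake_row, fake_row, fake_row], limit=2))

# With the streamed split (requires network access):
# from datasets import load_dataset
# stream = load_dataset("Qhuy204/VQA_VN_Destination", streaming=True)["train"]
# print(count_user_turns(stream, limit=100))
```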
### Load specific shard
```python
from datasets import load_dataset
# Load only first shard
dataset = load_dataset(
    "Qhuy204/VQA_VN_Destination",
    data_files="data/train-00000-of-00006.parquet"
)
```
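To load several shards without typing each path, the file names can be generated from the naming pattern shown under Files. This is a small sketch; the helper name `shard_path` is hypothetical, and the `data/` folder is taken from the example above rather than verified against the repository layout.

```python
def shard_path(index, num_shards=6, split="train", folder="data"):
    """Build a shard path such as data/train-00000-of-00006.parquet."""
    if not 0 <= index < num_shards:
        raise ValueError(f"index must be in [0, {num_shards}), got {index}")
    return f"{folder}/{split}-{index:05d}-of-{num_shards:05d}.parquet"


# Paths for the first two shards.
first_two = [shard_path(i) for i in range(2)]
print(first_two)

# These can be passed as a list (requires network access):
# from datasets import load_dataset
# dataset = load_dataset("Qhuy204/VQA_VN_Destination", data_files=first_two)
```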
## Technical Details
- **Image Storage**: Images are embedded as binary data in Parquet files
- **Compression**: ZSTD level 3, chosen to balance file size against read/write speed
- **Sharding**: Multiple shards enable parallel downloading and streaming
- **Format**: Apache Parquet with Hugging Face metadata
- **Row Group Size**: 100 rows per group for optimal Dataset Viewer performance
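The row group size implies how many row groups each shard holds. A rough calculation, assuming every row group contains exactly 100 rows except possibly the last one in a shard (an assumption, since the actual Parquet metadata is not shown here):

```python
import math

ROWS_PER_GROUP = 100  # from the Technical Details list

# Row counts per shard, copied from the Files section.
shard_rows = [5000, 5000, 5000, 5000, 5000, 4759]

# The last group in a shard may be partially filled, hence the ceiling.
groups_per_shard = [math.ceil(rows / ROWS_PER_GROUP) for rows in shard_rows]
total_groups = sum(groups_per_shard)

print(groups_per_shard)
print(total_groups)
```

Under that assumption a reader that fetches one row group at a time handles at most 100 embedded images per request.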
## Citation
If you use this dataset, please cite:
```bibtex
@dataset{vietnamese_travel_vqa,
  title={Vietnamese Travel Destination VQA Dataset},
  author={Qhuy204},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/Qhuy204/VQA_VN_Destination},
  license={CC BY-SA 4.0},
  source_notes={Images collected from multiple public domain travel blogs and sites, processed by the author.}
}
```
## License
CC BY-SA 4.0
## Contact
For questions or issues, please open an issue on the dataset repository.
Provider: Qhuy204



