typhoon-vision-preview-data
收藏魔搭社区2025-11-12 更新2025-05-24 收录
下载链接:
https://modelscope.cn/datasets/scb10x/typhoon-vision-preview-data
下载链接
链接失效反馈官方服务:
资源简介:
# Typhoon Vision Preview Data
## Dataset Overview
This dataset is designed for visual question-answering and image-to-text tasks, supporting both Thai (th) and English (en) languages.
## Data Source
The dataset is based on the Bunny image dataset. You can download the original images from [here](https://huggingface.co/datasets/BoyaWu10/Bunny-v1_0-data).
## Dataset Splits
The dataset is organized into multiple splits, available as Hugging Face datasets:
1. **Pretrain**: Used for pretraining the adapter in LLaVA format.
2. **Finetune**: Used for finetuning in LLaVA format.
3. **Finetune_translated_stats**: Contains original texts, their Thai translations, and COMET scores (translation quality estimation).
### Pretraining Set
- Comprises the original Bunny data.
- Includes an additional 10% of translated data appended to the original set.
- COMET QE scores were not computed for this set.
### Finetuning Set
- Based on the same structure as the pretraining set.
- The appended 10% data consists of top-performing translations, as determined by COMET scores.
## File Descriptions
- `pretrain.json`: Dataset for pretraining the adapter in LLaVA format.
- `finetune.json`: Dataset for finetuning in LLaVA format.
- `finetune_translated_stats.json`: Contains original texts, Thai translations, and COMET scores.
## Usage Notes
- The dataset is designed for use with the LLaVA (Large Language and Vision Assistant) format.
- When using the finetuning set, be aware that it includes high-quality translations based on COMET scores.
- The pretraining set can be used for initial model adaptation, while the finetuning set is optimized for final model tuning.
## Language Support
This dataset supports bilingual tasks:
- Thai (th)
- English (en)
Researchers and developers can use this dataset for tasks involving both languages, especially for cross-lingual visual question-answering and image-to-text generation.
# 台风视觉预览数据集
## 数据集概览
本数据集面向视觉问答与图像到文本任务,支持泰语(th)与英语(en)两种语言。
## 数据来源
本数据集以Bunny图像数据集为基础构建,原始图像可从[此处](https://huggingface.co/datasets/BoyaWu10/Bunny-v1_0-data)下载。
## 数据集划分
本数据集组织为多个子集,可作为Hugging Face数据集获取:
1. **预训练(Pretrain)**:用于大语言视觉助手(Large Language and Vision Assistant,LLaVA)格式适配器的预训练。
2. **微调(Finetune)**:用于LLaVA格式的模型微调。
3. **Finetune_translated_stats**:包含原始文本、泰语译文以及COMET(翻译质量评估)分数。
### 预训练子集
- 包含原始Bunny数据集的数据。
- 向原始数据集追加10%的译文数据。
- 该子集未计算COMET QE评估分数。
### 微调子集
- 结构与预训练子集保持一致。
- 追加的10%数据为经COMET分数筛选的高质量译文。
## 文件说明
- `pretrain.json`:用于LLaVA格式适配器预训练的数据集。
- `finetune.json`:用于LLaVA格式模型微调的数据集。
- `finetune_translated_stats.json`:包含原始文本、泰语译文与COMET分数的文件。
## 使用须知
- 本数据集专为适配LLaVA(大语言视觉助手)格式设计。
- 使用微调子集时,请注意其包含经COMET分数筛选的高质量译文。
- 预训练子集可用于模型的初始适配,微调子集则专为最终模型调优优化。
## 语言支持
本数据集支持双语任务:
- 泰语(th)
- 英语(en)
研究人员与开发者可将其用于涉及两种语言的相关任务,尤其适用于跨语言视觉问答及图像到文本生成任务。
提供机构:
maas
创建时间:
2025-05-23



