iiCEMAN/BanglaMedVQA
收藏Hugging Face2026-04-08 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/iiCEMAN/BanglaMedVQA
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: image
dtype: image
- name: image_id
dtype: string
- name: sub_dataset
dtype: string
- name: category
dtype: string
- name: category_bn
dtype: string
- name: modality
dtype: string
- name: organ
dtype: string
- name: question_en
dtype: string
- name: answer_en
dtype: string
- name: question_bn
dtype: string
- name: answer_bn
dtype: string
splits:
- name: chest_xray_train
num_bytes: 1422736473.5
num_examples: 3500
- name: chest_xray_test
num_bytes: 80214865.0
num_examples: 200
- name: medicat_train
num_bytes: 1389275722.5
num_examples: 3500
- name: medicat_test
num_bytes: 82104479.0
num_examples: 200
download_size: 3001794660
dataset_size: 2974331540.0
configs:
- config_name: default
data_files:
- split: chest_xray_train
path: data/chest_xray_train-*
- split: chest_xray_test
path: data/chest_xray_test-*
- split: medicat_train
path: data/medicat_train-*
- split: medicat_test
path: data/medicat_test-*
---
# Medical VQA LoRA Training
Fine-tuning `google/medgemma-4b-it` vision-language model using FP32 LoRA with LLaMA Factory for Bengali medical question answering.
## 📁 Repository Structure
```
med-vqa-lora/
├── configs/
│ ├── qwen2.5-vl-7b.yaml # Training configuration
│ ├── qwen2.5-vl-7b-test.yaml # Testing configuration
│ ├── med-gemma-4b.yaml # Legacy MedGemma training config
│ └── med-gemma-4b-test.yaml # Legacy MedGemma testing config
├── data/
│ ├── dataset_info.json # Master dataset registry
│ ├── chest_x-ray/
│ │ ├── train/
│ │ │ ├── images/ # Training images
│ │ │ ├── chest_x-ray.csv # Training data
│ │ │ └── chest_x-ray_dataset.json
│ │ └── test/
│ │ ├── images/ # Test images
│ │ ├── chest_x-ray.csv # Test data
│ │ └── chest_x-ray_dataset.json
│ └── medicat/
│ └── train/
│ ├── images/ # Medicat training images
│ ├── medicat.csv # Medicat training data
│ └── medicat_dataset.json
├── output/ # Model outputs and predictions
├── prepare_data.py # Data preparation script
├── train_model.py # Main training/testing script
├── requirements.txt # Python dependencies
├── Dockerfile # Container setup
└── README.md # This file
```
## 🚀 Usage
### 1. Build Docker Container
```bash
docker build -t med-vqa:latest .
```
### 2. Prepare Data
```bash
# Prepare all datasets
docker run --rm -v $(pwd):/app med-vqa:latest python prepare_data.py --all
# Prepare specific dataset
docker run --rm -v $(pwd):/app med-vqa:latest python prepare_data.py --dataset chest_x-ray --split train
```
### 3. Run Training
```bash
docker run --gpus all --ipc=host --rm --env-file .env -v $(pwd):/app med-vqa:latest python train_model.py --config configs/med-gemma-4b.yaml
```
### 4. Run Testing
```bash
docker run --gpus all --ipc=host --rm --env-file .env -v $(pwd):/app med-vqa:latest python train_model.py --config configs/med-gemma-4b-test.yaml
```
### 5. Complete Pipeline
```bash
# Train and test in sequence
docker run --gpus all --ipc=host --rm --env-file .env -v $(pwd):/app med-vqa:latest python train_model.py --config configs/med-gemma-4b.yaml
docker run --gpus all --ipc=host --rm --env-file .env -v $(pwd):/app med-vqa:latest python train_model.py --config configs/med-gemma-4b-test.yaml
```
## ⚙️ Configuration
### Training (`configs/med-gemma-4b.yaml`)
- Model: `google/medgemma-4b-it`
- LoRA Rank: 16, Alpha: 32
- Batch Size: 2 per device, 2 gradient accumulation
- Template: `gemma3`
### Testing (`configs/med-gemma-4b-test.yaml`)
- Loads trained adapter from `./output`
- Generates predictions for test dataset
## 📝 Data Preparation
```bash
# Prepare all datasets
python prepare_data.py --all
# Prepare specific dataset
python prepare_data.py --dataset chest_x-ray --split train
# Custom paths
python prepare_data.py --dataset chest_x-ray --split test --csv data/chest_x-ray/test/chest_x-ray.csv --images data/chest_x-ray/test/images --output data/chest_x-ray/test/chest_x-ray_dataset.json
```
## 🎯 Output Files
### Training Outputs
- `output/adapter_model.safetensors` - LoRA weights (119MB)
- `output/adapter_config.json` - LoRA configuration
- `output/checkpoint-32/` - Training checkpoint
- `output/train_results.json` - Training metrics
### Test Predictions
- `output/chest_x-ray_test_predictions.csv` - Test predictions with columns:
- `image_id`, `image_path`, `category`, `category_bn`
- `question`, `question_bn`, `llm_answer`, `llm_answer_bn`
- `predicted_answer_bn` - Model's predictions
## 🔍 Example Usage
### Sample Predictions
```csv
image_id,image_path,question_bn,llm_answer_bn,predicted_answer_bn
test_001,data/chest_x-ray/test/images/00015953_015.png,এখানে কোন নির্দিষ্ট অবস্থা চিহ্নিত করা হয়েছে?,কোন নির্দিষ্ট অবস্থা চিহ্নিত হয়নি।,কোন নির্দিষ্ট অবস্থা চিহ্নিত করা হয়নি।
test_002,data/chest_x-ray/test/images/00011237_094.png,অনুপ্রবেশটি কোথায় অবস্থিত?,মধ্য ডান,মধ্য বাম
```
### Model Loading
```python
from transformers import AutoModelForCausalLM, AutoProcessor
from peft import PeftModel
model = AutoModelForCausalLM.from_pretrained("google/medgemma-4b-it")
processor = AutoProcessor.from_pretrained("google/medgemma-4b-it")
model = PeftModel.from_pretrained(model, "./output")
```
提供机构:
iiCEMAN



