VQA² Dataset
VQA²-Visual-Question-Answering-for-Video-Quality-Assessment
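Dataset Overview
- Dataset name: VQA²
- Purpose: visual question answering (VQA) models and datasets for video quality assessment.
Dataset Construction Pipeline
- Pipeline diagram: pipeline_00.png
Model Architecture
- Architecture diagram: model_00.png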
Quick Start
- Install dependencies:
    cd VQA_main
    conda create -n VQA python=3.10 -y
    conda activate VQA
    pip install --upgrade pip
    pip install -e ".[train]"
    pip install pytorchvideo
    pip install transformers==4.44.0
- Note: replace VQA/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py with VQA_main/modeling_qwen2.py.
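The file replacement in the note above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `locate_qwen2_module`, `patch_file`, and `main` are hypothetical, and the version check assumes that the replacement file is meant for the pinned `transformers==4.44.0` release. It locates the installed package instead of hard-coding the site-packages path, and keeps a backup of the original file.

```python
import shutil
from importlib import metadata, util
from pathlib import Path

PINNED_TRANSFORMERS = "4.44.0"  # version pinned in the install step


def locate_qwen2_module() -> Path:
    """Return the path of modeling_qwen2.py inside the installed transformers package."""
    spec = util.find_spec("transformers")
    if spec is None or spec.origin is None:
        raise RuntimeError("transformers is not installed in this environment")
    # spec.origin points at transformers/__init__.py
    return Path(spec.origin).parent / "models" / "qwen2" / "modeling_qwen2.py"


def patch_file(replacement: Path, target: Path) -> Path:
    """Back up `target` once, then overwrite it with `replacement`."""
    if not replacement.is_file():
        raise FileNotFoundError(replacement)
    if not target.is_file():
        raise FileNotFoundError(target)
    backup = target.with_name(target.name + ".orig")
    if not backup.exists():  # keep the pristine copy from the first run
        shutil.copy2(target, backup)
    shutil.copy2(replacement, target)
    return backup


def main() -> None:
    """Hypothetical entry point; assumes the working directory is VQA_main."""
    installed = metadata.version("transformers")
    if installed != PINNED_TRANSFORMERS:
        raise SystemExit(
            f"expected transformers {PINNED_TRANSFORMERS}, found {installed}"
        )
    backup = patch_file(Path("modeling_qwen2.py"), locate_qwen2_module())
    print(f"patched; original saved to {backup}")
```

Calling `main()` from inside `VQA_main` with the `VQA` environment active would perform the swap. Restoring the `.orig` backup undoes it.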
VQA² Scorers
- Score UGC videos:
    python ./llava/eval/model_score_UGC.py
- Score streaming videos:
    python ./llava/eval/model_score_streaming.py
VQA² Assistant
- Evaluate on Q-bench-video:
    python ./llava/eval/model_vqa_q_bench_video.py
- Simple question answering:
    python ./llava/eval/model_conv.py
- Gradio demo:
    python ./app.py
Training
- Training script:
    cd VQA_main
    chmod +x ./scripts/train/finetune_VQA².sh
- Note: training is supported only with per_device_train_batch_size=1.
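With the per-device batch size fixed at 1, a larger effective batch can only come from more devices or from gradient accumulation. The sketch below shows the arithmetic. It assumes the training script passes a Hugging Face Trainer-style `gradient_accumulation_steps` argument, which the source does not confirm. The function name `accumulation_steps` is hypothetical.

```python
def accumulation_steps(target_batch: int, num_gpus: int, per_device: int = 1) -> int:
    """Gradient-accumulation steps needed to reach `target_batch` samples per optimizer update.

    Effective batch = per_device * num_gpus * accumulation steps.
    """
    if target_batch <= 0 or num_gpus <= 0 or per_device <= 0:
        raise ValueError("all arguments must be positive")
    per_step = per_device * num_gpus  # samples processed in one forward/backward pass
    if target_batch % per_step != 0:
        raise ValueError(
            f"target batch {target_batch} is not a multiple of {per_step}"
        )
    return target_batch // per_step


# Example: 8 GPUs at per-device batch size 1, aiming for 64 samples per update.
print(accumulation_steps(64, 8))  # → 8
```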
Model Zoo
- VQA²-UGC-Scorer(7B): q-future/VQA-UGC-Scorer
- VQA²-Streaming-Scorer(7B): q-future/VQA-Streaming-Scorer
- VQA²-Assistant(7B): q-future/VQA-Assistant
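The checkpoints listed above could be fetched programmatically. This is a sketch under the assumption that the three identifiers are Hugging Face Hub repository IDs, which the source does not state. The short keys in `MODEL_ZOO` and the helpers `resolve_repo` and `download` are hypothetical names introduced here. `snapshot_download` is the `huggingface_hub` function for downloading a whole repository.

```python
from typing import Optional

# Short aliases (hypothetical) mapped to the identifiers listed in the model zoo.
MODEL_ZOO = {
    "ugc-scorer": "q-future/VQA-UGC-Scorer",
    "streaming-scorer": "q-future/VQA-Streaming-Scorer",
    "assistant": "q-future/VQA-Assistant",
}


def resolve_repo(name: str) -> str:
    """Map a short alias to its repository ID, with a readable error for typos."""
    try:
        return MODEL_ZOO[name]
    except KeyError:
        choices = ", ".join(sorted(MODEL_ZOO))
        raise ValueError(f"unknown model '{name}'; choose one of: {choices}") from None


def download(name: str, cache_dir: Optional[str] = None) -> str:
    """Download all files of the chosen checkpoint and return the local directory."""
    # Imported lazily so the alias lookup works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=resolve_repo(name), cache_dir=cache_dir)
```

For example, `download("ugc-scorer")` would return a local path that could then be supplied to the scoring script, assuming the script accepts a local model directory.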
Citation
- VQA²:
    @article{jia2024vqa,
      title={VQA$^2$: Visual Question Answering for Video Quality Assessment},
      author={Jia, Ziheng and Zhang, Zicheng and Qian, Jiaying and Wu, Haoning and Sun, Wei and Li, Chunyi and Liu, Xiaohong and Lin, Weisi and Zhai, Guangtao and Min, Xiongkuo},
      journal={arXiv preprint arXiv:2411.03795},
      year={2024}
    }
- Q-Bench-Video:
    @article{zhang2024q,
      title={Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs},
      author={Zhang, Zicheng and Jia, Ziheng and Wu, Haoning and Li, Chunyi and Chen, Zijian and Zhou, Yingjie and Sun, Wei and Liu, Xiaohong and Min, Xiongkuo and Lin, Weisi and others},
      journal={arXiv preprint arXiv:2409.20063},
      year={2024}
    }




