HIT-TMG/VideoVista-CoTs
Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/HIT-TMG/VideoVista-CoTs
Resource description:
---
license: apache-2.0
task_categories:
- video-text-to-text
language:
- en
---
# VideoVista-CoTs
This repository contains VideoVista-CoTs, used in [Uni-MoE-2.0](https://huggingface.co/HIT-TMG/Uni-MoE-2.0-Omni) training.
The dataset samples a subset of [LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K), [SEED-Bench-R1](https://huggingface.co/datasets/TencentARC/SEED-Bench-R1), [SR-91K](https://huggingface.co/datasets/RUBBISHLIKE/SpaceR-151k), and [STAR](https://bobbywu.com/STAR/). Complex questions are filtered from this subset and annotated with multi-step reasoning by our automatic video QA generation framework.
The automatic video QA generation code and the rest of the VideoVista series are available in the [VideoVista Family](https://github.com/HITsz-TMG/VideoVista) repository.
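The card does not describe how to fetch the data, so the following is only a minimal sketch of one way to download the repository. It assumes the `huggingface_hub` package is installed. The helper names `dataset_page_url` and `download_dataset` and the default `local_dir` are hypothetical, introduced here for illustration. The file layout inside the repository is not documented on this page, so inspect the downloaded directory before writing any loading code.

```python
import os
from pathlib import Path
from typing import Optional

REPO_ID = "HIT-TMG/VideoVista-CoTs"  # dataset repository named in this card


def dataset_page_url(repo_id: str, endpoint: str = "https://huggingface.co") -> str:
    """Build the web URL of a dataset repository on a given Hub endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_dataset(local_dir: str = "VideoVista-CoTs",
                     endpoint: Optional[str] = None) -> Path:
    """Download a snapshot of the dataset repository and return its local path.

    Assumption: `huggingface_hub` is installed and the network is reachable.
    """
    if endpoint is not None:
        # HF_ENDPOINT is read when huggingface_hub is first imported, so it
        # has no effect if the library was already imported in this process.
        os.environ["HF_ENDPOINT"] = endpoint

    # Imported lazily so that the endpoint override above can take effect.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # this is a dataset repository, not a model
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_dataset()` fetches from the default Hub endpoint, while `download_dataset(endpoint="https://hf-mirror.com")` would route the request through the mirror listed above, assuming that mirror is reachable from your network.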
# Citation
If you find VideoVista-CoTs useful for your research and applications, please cite the following BibTeX entries:
```bibtex
@article{chen2025videovista,
title={VideoVista-CulturalLingo: 360$^\circ$ Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension},
author={Chen, Xinyu and Li, Yunxin and Shi, Haoyuan and Hu, Baotian and Luo, Wenhan and Wang, Yaowei and Zhang, Min},
journal={arXiv preprint arXiv:2504.17821},
year={2025}
}
@article{li2024videovista,
title={Videovista: A versatile benchmark for video understanding and reasoning},
author={Li, Yunxin and Chen, Xinyu and Hu, Baotian and Wang, Longyue and Shi, Haoyuan and Zhang, Min},
journal={arXiv preprint arXiv:2406.11303},
year={2024}
}
```
Provided by:
HIT-TMG



