
pbwpbw/tiny_llavavideo

Hugging Face · Updated 2026-03-19 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/pbwpbw/tiny_llavavideo
Resource description:
---
license: apache-2.0
task_categories:
- video-text-to-text
---

**<center><span style="font-size:2em;">TinyLLaVA-Video</span></center>**

[![arXiv](https://img.shields.io/badge/Arxiv-2501.15513-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2501.15513)[![Github](https://img.shields.io/badge/Github-Github-blue.svg)](https://github.com/ZhangXJ199/TinyLLaVA-Video)[![Hugging Face Paper](https://img.shields.io/badge/Hugging%20Face-Paper-blue)](https://huggingface.co/papers/2501.15513)

This dataset combines data from multiple sources for pre-training and fine-tuning.

**Pretrain Data:** Four subsets of LLaVA-Video-178K (`0_30_s_academic_v0_1`, `30_60_s_academic_v0_1`, `0_30_s_youtube_v0_1`, `30_60_s_youtube_v0_1`), supplemented with filtered Video-LLaVA data ([https://huggingface.co/datasets/LanguageBind/Video-LLaVA](https://huggingface.co/datasets/LanguageBind/Video-LLaVA)) and data from Valley ([https://github.com/RupertLuo/Valley](https://github.com/RupertLuo/Valley)). The video data can be downloaded from the linked datasets, and cleaned annotations are provided within this dataset.

**Finetune Data:** Four subsets of LLaVA-Video-178K (`0_30_s_academic_v0_1`, `30_60_s_academic_v0_1`, `0_30_s_youtube_v0_1`, `30_60_s_youtube_v0_1`). Cleaned annotations are provided; video data is available via the LLaVA-Video-178K dataset ([https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K](https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K)).

The data is organized as follows:

```Shell
dataset
├── academic_source
├── liwei_youtube_videos
├── valley
├── text_files
│   ├── cleaned_video_caption.json
│   ├── cleaned_video_openqa.json
```

**Note:** If there is any infringement, please contact us for removal.

Please refer to the GitHub repository for detailed instructions on data usage and training.
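Because the videos and the cleaned annotations come from different sources, it can help to confirm that a local copy matches the directory layout described in the card before starting training. The following is a minimal sketch, not part of the official tooling: the names `EXPECTED_DIRS`, `EXPECTED_FILES`, and `missing_entries` are illustrative, and it assumes the downloaded videos have been placed under the directory names listed in the card.

```python
from pathlib import Path

# Paths taken from the directory tree in the dataset card.
# The constant and function names below are hypothetical helpers.
EXPECTED_DIRS = ["academic_source", "liwei_youtube_videos", "valley", "text_files"]
EXPECTED_FILES = [
    "text_files/cleaned_video_caption.json",
    "text_files/cleaned_video_openqa.json",
]


def missing_entries(root):
    """Return the expected paths (relative to root) that are absent."""
    root = Path(root)
    # Directories first, then annotation files, in the order listed above.
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries("dataset")` on a local copy returns an empty list when every listed directory and annotation file is present; otherwise it names what still needs to be downloaded or moved into place.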
Provider:
pbwpbw