
VCG-plus_112K

ModelScope Community (魔搭社区) · Updated 2025-11-02 · Indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/MBZUAI/VCG-plus_112K
Resource description:
# 👁️ VCG+ 112K Dataset

---

## 📝 Description

Video-ChatGPT introduces the VideoInstruct100K dataset, which employs a semi-automatic annotation pipeline to generate 75K instruction-tuning QA pairs. To address the limitations of this annotation process, we present the VCG+112K dataset, developed through an improved annotation pipeline. Our approach improves the accuracy and quality of the instruction-tuning pairs by improving keyframe extraction, leveraging SoTA large multimodal models (LMMs) for detailed descriptions, and refining the instruction generation strategy.

<p align="center">
  <img src="vcg-plus112k.png" alt="Contributions">
</p>

## 💻 Download

To get started, follow these steps:

```
git lfs install
git clone https://huggingface.co/MBZUAI/VCG-plus_112K
```

## 💻 Download Videos

The videos can be downloaded from [this link](https://huggingface.co/datasets/MBZUAI/video_annotation_pipeline/blob/main/activitynet_videos.tgz).

## 📚 Dataset Annotation Pipeline

We have also released our semi-automatic dataset annotation pipeline, which is available at [Dataset Annotation Pipeline](https://huggingface.co/datasets/MBZUAI/video_annotation_pipeline).

## 📚 Additional Resources

- **Paper:** [ArXiv](https://arxiv.org/abs/2406.09418).
- **GitHub Repository:** For training and updates: [GitHub - VideoGPT+](https://github.com/mbzuai-oryx/VideoGPT-plus).
- **HuggingFace Collection:** For downloading the pretrained checkpoints, VCGBench-Diverse Benchmarks and Training data, visit [HuggingFace Collection - VideoGPT+](https://huggingface.co/collections/MBZUAI/videogpt-665c8643221dda4987a67d8d).

## 📜 Citations and Acknowledgments

```bibtex
@article{Maaz2024VideoGPT+,
    title={VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding},
    author={Maaz, Muhammad and Rasheed, Hanoona and Khan, Salman and Khan, Fahad Shahbaz},
    journal={arxiv},
    year={2024},
    url={https://arxiv.org/abs/2406.09418}
}
```
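The "Download Videos" step in the README above points to a single archive, `activitynet_videos.tgz`, which has to be unpacked after downloading. The following is a minimal sketch of that step and is not part of the original dataset card. It assumes the archive is a gzip-compressed tar file (as the `.tgz` extension suggests) and that it has already been saved locally. The function name `extract_videos` and any paths passed to it are hypothetical.

```python
import shutil
import tarfile
from pathlib import Path


def extract_videos(archive_path: str, dest_dir: str) -> list[str]:
    """Unpack regular files from a .tgz archive into dest_dir.

    Returns the sorted member names that were written. Assumes the
    archive is gzip-compressed tar; this is inferred from the file
    extension, not confirmed by the dataset card.
    """
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(archive_path, mode="r:gz") as tar:
        for member in tar.getmembers():
            # Only regular files are copied; links and devices are skipped.
            if not member.isfile():
                continue
            target = (dest / member.name).resolve()
            # Skip entries that would be written outside dest_dir.
            if dest not in target.parents:
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(member.name)
    return sorted(written)
```

A call such as `extract_videos("activitynet_videos.tgz", "videos")` would unpack the archive into a local `videos` directory. Copying each member by hand, rather than calling `extractall`, keeps the path check explicit and behaves the same across Python versions.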

Provider:
maas
Created:
2025-03-17
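Summary
Dataset introduction
Background and challenges
Background overview
VCG-plus_112K is a dataset of 112K instruction-tuning QA pairs, built with an improved annotation pipeline that raises their accuracy and quality, and intended for video understanding tasks.
The content above was collected and summarized by 遇见数据集.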