ShareGPT4Video

Name: ShareGPT4Video
Creator: maas
Published: 2026-01-08 01:24:18
License: No description provided

ModelScope Community. Updated 2026-01-08; indexed 2024-06-29.

Download link:

https://modelscope.cn/datasets/AI-ModelScope/ShareGPT4Video


Resource description:

# ShareGPT4Video 4.8M Dataset Card

## Dataset details

**Dataset type:** ShareGPT4Video Captions 4.8M is a set of GPT4-Vision-powered multi-modal video caption data. It is constructed to enhance modality alignment and fine-grained visual concept perception in Large Video-Language Models (LVLMs) and Text-to-Video Models (T2VMs), with the aim of bringing LVLMs and T2VMs closer to the capabilities of GPT4V and Sora.

* `sharegpt4video_40k.jsonl` is generated by GPT4-Vision (ShareGPT4Video).
* `share-captioner-video_mixkit-pexels-pixabay_4814k_0417.json` is generated by our ShareCaptioner-Video, which is trained on GPT4-Vision-generated video-caption pairs.
* `sharegpt4video_mix181k_vqa-153k_share-cap-28k.json` is curated from `sharegpt4video_instruct_gpt4-vision_cap40k.json` for the supervised fine-tuning stage of LVLMs.
* `llava_v1_5_mix665k_with_video_chatgpt72k_share4video28k.json` replaces the 28K detailed-caption-related samples in VideoChatGPT with 28K high-quality captions from ShareGPT4Video. This file is used to validate the effectiveness of high-quality captions with the VideoLLaVA and LLaMA-VID models.

**Dataset date:** ShareGPT4Video Captions 4.8M was collected on April 17, 2024.

**Paper or resources for more information:** [[Project](https://ShareGPT4Video.github.io/)] [[Paper](https://arxiv.org/abs/2406.04325v1)] [[Code](https://github.com/ShareGPT4Omni/ShareGPT4Video)] [[ShareGPT4Video-8B](https://huggingface.co/Lin-Chen/sharegpt4video-8b)]

**License:** Attribution-NonCommercial 4.0 International. Use of the data should also abide by the policy of OpenAI: https://openai.com/policies/terms-of-use

## Intended use

**Primary intended uses:** The primary use of ShareGPT4Video Captions 4.8M is research on large multimodal models and text-to-video models.

**Primary intended users:** The primary intended users of this dataset are researchers and hobbyists in computer vision, natural language processing, machine learning, AIGC, and artificial intelligence.

## Paper

arxiv.org/abs/2406.04325
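The files listed in this card come in two shapes: one `.jsonl` file and several `.json` files. As a rough illustration of how they might be read once downloaded, here is a minimal Python sketch. It assumes each `.json` file holds a top-level list of records, which the card does not state, and it makes no assumption about field names. `iter_records` is a hypothetical helper written for this example, not part of the ShareGPT4Video code.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_records(path: str) -> Iterator[Any]:
    """Yield records from a .jsonl file (one JSON value per line) or a .json file."""
    p = Path(path)
    if p.suffix == ".jsonl":
        with p.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)
    else:
        with p.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: a .json file holds a top-level list of records.
        # Anything else is yielded as a single record.
        if isinstance(data, list):
            yield from data
        else:
            yield data


if __name__ == "__main__":
    # The local path is an assumption; point it at wherever the file was saved.
    local_file = Path("sharegpt4video_40k.jsonl")
    if local_file.exists():
        first = next(iter_records(str(local_file)), None)
        # Inspect the keys rather than assuming a schema.
        if isinstance(first, dict):
            print(sorted(first.keys()))
```

Printing the keys of the first record is a cheap way to discover the actual schema before writing any field-specific code. The largest caption file is read whole by `json.load` in this sketch, so a streaming parser may be preferable if memory is tight.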


Provider:

maas

Created:

2024-06-21
