
TIGER-Lab/VISTA-400K

Source: Hugging Face | Updated 2024-12-19 | Indexed 2024-12-21
Download link:
https://hf-mirror.com/datasets/TIGER-Lab/VISTA-400K
Overview:
VISTA-400K is a video instruction-following dataset generated with VISTA, a video spatiotemporal augmentation method that produces long-duration and high-resolution video data. It is intended to improve the video understanding of video large multimodal models (LMMs), particularly on long and high-resolution videos. The synthesis pipeline draws on existing public video-caption datasets, which keeps the data fully open and the approach scalable.

VISTA borrows data augmentation techniques from image and video classification, such as CutMix, MixUp, and VideoMix, to create augmented video samples, and then synthesizes instruction data based on these new videos.
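To make the CutMix-style idea mentioned above concrete, here is a minimal sketch of pasting one clip into a region of another, frame by frame. It is not the VISTA pipeline: the function name `cutmix_paste`, the grayscale list-of-lists video representation, and the fixed paste position are all assumptions chosen so that the example has no dependencies.

```python
from typing import List

Frame = List[List[int]]  # one grayscale frame: rows of pixel values (assumed layout)
Video = List[Frame]      # a clip: a list of frames


def cutmix_paste(base: Video, patch: Video, top: int, left: int) -> Video:
    """Return a copy of `base` with `patch` pasted into every frame at (top, left).

    Hypothetical helper for illustration only; not taken from the VISTA code.
    """
    if len(base) != len(patch):
        raise ValueError("clips must have the same number of frames")
    patch_h, patch_w = len(patch[0]), len(patch[0][0])
    base_h, base_w = len(base[0]), len(base[0][0])
    if top < 0 or left < 0 or top + patch_h > base_h or left + patch_w > base_w:
        raise ValueError("patch does not fit inside the base frame")

    # Copy every row so the input clip is left unmodified.
    mixed = [[row[:] for row in frame] for frame in base]
    for t, patch_frame in enumerate(patch):
        for r, patch_row in enumerate(patch_frame):
            # Overwrite the target window of this row with the patch pixels.
            mixed[t][top + r][left:left + patch_w] = patch_row
    return mixed


# A two-frame 4x4 background clip of zeros and a two-frame 2x2 clip of ones.
background = [[[0] * 4 for _ in range(4)] for _ in range(2)]
foreground = [[[1] * 2 for _ in range(2)] for _ in range(2)]
mixed = cutmix_paste(background, foreground, top=1, left=2)
```

Because the smaller clip's caption is already known from its source dataset, a mixed clip like this can in principle be paired with a question about the pasted region. How VISTA actually forms its instruction data is not described in this listing.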
Provider:
TIGER-Lab