InternVid
arXiv | Updated: 2024-01-04 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2307.06942v2
Description:
InternVid is a large-scale, video-centric multimodal dataset. It contains over 7 million videos with a total duration of nearly 760,000 hours, which yield 234 million video clips accompanied by detailed descriptions totaling 4.1 billion words. The dataset is intended to support learning robust, transferable video-text representations for multimodal understanding and generation.
Created:
2023-07-14



