LV-VIS

arXiv, indexed 2025-09-30

Download link:

https://github.com/haochenheheda/LVVIS

Resource overview:

LV-VIS is a dataset for evaluating open-vocabulary video instance segmentation. It contains 4,832 real-world videos with 656,130 pixel-level annotated segmentation masks spanning 1,212 categories. The categories are split into 659 base categories and 553 novel categories, and 93.5% of the categories do not overlap with those in MS-COCO, so models can be evaluated on novel categories that are absent from existing datasets.
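The base/novel split described above is typically used to report scores separately for each group. The following is a minimal sketch of that bookkeeping, assuming per-category average precision (AP) values are already available as a dictionary. The function name `split_mean_ap`, the category names, and the AP values are hypothetical illustrations, not part of the LV-VIS tooling or results.

```python
from statistics import mean


def split_mean_ap(per_category_ap, novel_categories):
    """Average per-category AP separately over base and novel categories.

    per_category_ap: dict mapping category name -> AP (hypothetical input).
    novel_categories: set of category names treated as novel.
    """
    novel = [ap for cat, ap in per_category_ap.items() if cat in novel_categories]
    base = [ap for cat, ap in per_category_ap.items() if cat not in novel_categories]
    all_values = list(per_category_ap.values())
    return {
        "base": mean(base) if base else None,
        "novel": mean(novel) if novel else None,
        "all": mean(all_values) if all_values else None,
    }


# Toy values for illustration only; these are not results from any model.
toy_ap = {"dog": 0.5, "cat": 0.25, "axolotl": 0.125}
toy_novel = {"axolotl"}
result = split_mean_ap(toy_ap, toy_novel)
print(result)
```

Reporting the novel-category mean on its own keeps a weak score on unseen categories from being hidden by strong results on base categories.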

Background overview

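LV-VIS is an open-vocabulary video instance segmentation dataset containing 4,828 videos with pixel-level segmentation masks for 26,099 objects, covering 1,196 unique categories. The dataset provides training, validation, and test sets, along with the corresponding annotation files and baseline code, and is available for non-commercial research use.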

The above content was collected and summarized by 遇见数据集.
