Open Images V6

Name: Open Images V6
Creator: OpenDataLab
Published: 2026-05-17 03:30:01
License: Not specified

OpenDataLab, updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/OpenImagesV6

Resource overview:

Open Images is a dataset of approximately 9 million images annotated with image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives. It contains a total of 16 million bounding boxes for 600 object categories on 1.9 million images, making it the largest existing dataset with object location annotations. The boxes were largely drawn manually by professional annotators to ensure accuracy and consistency. The images are very diverse and often contain complex scenes with several objects (8.3 per image on average). Open Images also provides visual relationship annotations, which indicate pairs of objects in particular relations (e.g., "woman playing guitar", "beer on table"), object properties (e.g., "table is wooden"), and human actions (e.g., "woman is jumping"). In total, it has 3.3 million annotations from 1,466 distinct relationship triplets. In V5, segmentation masks were added for 2.8 million object instances in 350 categories. Segmentation masks mark the outline of objects, which characterizes their spatial extent at a much finer level of detail than a bounding box. In V6, 675k localized narratives were added: multimodal descriptions of images consisting of synchronized voice, text, and mouse traces over the objects being described. (Localized narratives were initially released only for the V6 training split; since July 2020 they are also available for the validation and test splits.) Finally, the dataset is annotated with 59.9 million image-level labels spanning 19,957 categories.
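
The description distinguishes between relationship annotations (individual instances) and distinct relationship triplets (unique subject/relation/object combinations). The following is a minimal Python sketch of that distinction. The `RelationshipTriplet` type and its field names are hypothetical and are not the dataset's actual file format, and the rows are toy examples borrowed from the description, not real dataset records.

```python
from collections import Counter
from typing import NamedTuple


class RelationshipTriplet(NamedTuple):
    """Hypothetical in-memory form of one visual relationship annotation."""
    subject: str
    relation: str
    target: str  # second object, attribute, or action


# Toy annotations based on the examples in the description (not real data).
annotations = [
    RelationshipTriplet("woman", "playing", "guitar"),
    RelationshipTriplet("beer", "on", "table"),
    RelationshipTriplet("table", "is", "wooden"),
    RelationshipTriplet("woman", "is", "jumping"),
    RelationshipTriplet("woman", "playing", "guitar"),  # same triplet, another image
]

# Count how many annotation instances fall under each distinct triplet.
triplet_counts = Counter(annotations)

print(len(annotations))     # number of annotation instances
print(len(triplet_counts))  # number of distinct triplets
```

In this toy list there are more annotation instances than distinct triplets, which mirrors how the dataset's millions of relationship annotations are drawn from a much smaller set of triplets.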

Provider:

OpenDataLab

Created:

2022-04-13

Collected summary

Dataset introduction

Background and challenges

Background overview

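Open Images V6 is a large-scale image dataset with multiple annotation types. It contains about 9 million images and provides image-level labels, object bounding boxes, segmentation masks, visual relationships, and localized narratives. Its 16 million bounding boxes, covering 600 object categories across 1.9 million images, make it one of the largest datasets with object location annotations. The scenes are complex and diverse, with an average of 8.3 objects per image, which makes the dataset suitable for computer vision tasks such as image classification, object detection, and visual relationship detection.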

The content above was collected and summarized by 遇见数据集.