RefCOCOg

Name: RefCOCOg
Creator: OpenDataLab
Published: 2026-05-17 09:30:41
License: No description available

OpenDataLab: updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/RefCOCOg

Resource description:

RefCOCOg was introduced by Mao et al. in 2016. Because of differences in the annotation process, its object descriptions are richer than those in RefCOCO: RefCOCO was collected in an interactive, game-based setting, whereas RefCOCOg was collected in a non-interactive setting. On average, a RefCOCOg expression has 8.4 words, compared with 3.5 words for RefCOCO. Each dataset has its own split allocations, which are typically all reported in papers. The "testA" and "testB" sets of RefCOCO and RefCOCO+ contain only people and only non-people, respectively. Images are partitioned into several splits. In the "google" split, objects (not images) are partitioned between the train and non-train splits, so the same image can appear in both the train and validation splits, but the objects referred to in that image differ between the two sets. In contrast, the "unc" and "umd" splits partition images across the train, validation, and test splits. In RefCOCOg, the "google" split has no canonical test set, and its validation set is typically reported in papers as "val*".
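The difference between object-level and image-level partitioning can be illustrated with a minimal sketch. The records, the function names (`split_by_object`, `split_by_image`, `shared_images`), and the assignment rules below are hypothetical and chosen only for illustration; they are not the procedure used to build the actual "google", "unc", or "umd" splits.

```python
from collections import defaultdict

# Hypothetical toy records of the form (image_id, object_id).
# These are made up for illustration and are not real RefCOCOg data.
REFS = [
    ("img1", "obj1"), ("img1", "obj2"),
    ("img2", "obj3"), ("img2", "obj4"),
    ("img3", "obj5"), ("img3", "obj6"),
]


def split_by_object(refs, val_every=2):
    """Object-level partition: every `val_every`-th object goes to val.

    Objects from one image may land in different subsets.
    """
    splits = defaultdict(list)
    for i, ref in enumerate(refs):
        name = "val" if i % val_every == val_every - 1 else "train"
        splits[name].append(ref)
    return dict(splits)


def split_by_image(refs, val_images):
    """Image-level partition: all objects of an image share one subset."""
    splits = defaultdict(list)
    for image_id, object_id in refs:
        name = "val" if image_id in val_images else "train"
        splits[name].append((image_id, object_id))
    return dict(splits)


def shared_images(splits):
    """Return the image ids that occur in both the train and val subsets."""
    train = {image_id for image_id, _ in splits.get("train", [])}
    val = {image_id for image_id, _ in splits.get("val", [])}
    return train & val


object_level = split_by_object(REFS)
image_level = split_by_image(REFS, val_images={"img3"})

print("object-level shared images:", sorted(shared_images(object_level)))
print("image-level shared images:", sorted(shared_images(image_level)))
```

In the object-level case, images recur across subsets while the referred objects stay disjoint. In the image-level case, no image is shared. This is why results on a "google"-style validation set are not directly comparable to results on splits that hold out whole images.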

Provider:

OpenDataLab

Created:

2023-03-22

Collected summary

Dataset introduction

Background and challenges

Background overview

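RefCOCOg is an image-text dataset released in 2016 that focuses on referring expressions for objects; it was published by the University of North Carolina at Chapel Hill. It was collected in a non-interactive setting, which yields richer object descriptions: each expression averages 8.4 words, compared with 3.5 words for RefCOCO. The dataset comes with several splits: the 'google' split partitions by object, whereas the 'unc' and 'umd' splits partition by image. It is commonly used for research and evaluation on vision-language tasks.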

The content above was collected and summarized by 遇见数据集.
