WebVision
OpenDataLab. Updated 2026-05-17. Indexed 2024-05-09.
Download link:
https://opendatalab.org.cn/OpenDataLab/WebVision
Resource description:
The WebVision dataset aims to promote research on learning visual representations from noisy web data. Our goal is to decouple deep learning from the massive manual labor required to annotate large-scale visual datasets. We release this large-scale web image dataset as a benchmark to advance research on learning from web data, including weakly supervised visual representation learning, visual transfer learning, and joint text-and-vision research. Please refer to the recommended settings for the WebVision dataset.
The WebVision dataset contains more than 2.4 million images crawled from Flickr and Google Image Search. The same 1,000 concepts as in the ILSVRC 2012 dataset are used to query the images, which makes it possible to evaluate a range of existing methods directly, to compare them against models trained on the human-annotated ILSVRC 2012 dataset, and to study dataset bias at large scale. The text that accompanies the images (titles, user tags, or descriptions) is also provided as additional meta-information. A validation set of 50,000 images (50 per category) is provided to facilitate algorithm development. Preliminary results with simple baselines suggest that models trained on WebVision learn robust representations, with performance on several vision tasks comparable to that of models trained on the human-annotated ILSVRC 2012 dataset.
Provided by:
OpenDataLab
Created:
2022-11-02
Collected summary
Dataset introduction

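Background and challenges
Background overview
WebVision is a large-scale image classification dataset intended to advance research on learning visual representations from noisy web data. It contains more than 2.4 million images crawled from Flickr and Google Image Search using the 1,000 concepts of ILSVRC 2012, together with text metadata, and provides a validation set of 50,000 images to support algorithm development. The dataset was released in 2017 by Google and ETH Zurich. Preliminary studies suggest that models trained on it learn robust representations with performance comparable to models trained on human-annotated datasets.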
The content above was collected and summarized by 遇见数据集.



