
myownskyW7/V3Det

Source: Hugging Face. Updated 2023-10-19; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/myownskyW7/V3Det
Resource description:
---
license: cc-by-4.0
task_categories:
- object-detection
size_categories:
- 1K<n<10K
---

<p align="center">
    <img src="images/v3det_icon.jpg" width="100"/>
</p>

<p align="center">
    <b><font size="6">V3Det: Vast Vocabulary Visual Detection Dataset</font></b>
</p>

<p>
<div align="center">
    <div>
        <a href='https://myownskyw7.github.io/' target='_blank'>Jiaqi Wang</a>*,
        <a href='https://panzhang0212.github.io/' target='_blank'>Pan Zhang</a>*,
        Tao Chu*,
        Yuhang Cao*, </br>
        Yujie Zhou,
        <a href='https://wutong16.github.io/' target='_blank'>Tong Wu</a>,
        Bin Wang,
        Conghui He,
        <a href='http://dahua.site/' target='_blank'>Dahua Lin</a></br>
        (* equal contribution)</br>
        <strong>Accepted to ICCV 2023 (Oral)</strong>
    </div>
</p>
<p>
    <div>
        <strong>
            <a href='https://arxiv.org/pdf/2304.03752.pdf' target='_blank'>Paper</a>,
            <a href='https://v3det.openxlab.org.cn/' target='_blank'>Dataset</a></br>
        </strong>
    </div>
</div>
</p>

<p align="center">
    <img width=960 src="images/introduction.jpg"/>
</p>

## Codebase

### Object Detection

- mmdetection: https://github.com/V3Det/mmdetection-V3Det/tree/main/configs/v3det
- Detectron2: https://github.com/V3Det/Detectron2-V3Det

### Open Vocabulary Detection (OVD)

- Detectron2: https://github.com/V3Det/Detectron2-V3Det

## Data Format

The data includes a training set and a validation set, covering 13,204 categories. The training set consists of 183,354 images, while the validation set has 29,821 images. The data organization is:

```
V3Det/
    images/
        <category_node>/
            |────<image_name>.png
            ...
        ...
    annotations/
        |────v3det_2023_v1_category_tree.json       # Category tree
        |────category_name_13204_v3det_2023_v1.txt  # Category name
        |────v3det_2023_v1_train.json               # Train set
        |────v3det_2023_v1_val.json                 # Validation set
```

## Annotation Files

### Train/Val

The annotation files are provided in dictionary format and contain the keys "images", "categories", and "annotations".

- images : stores a list containing image information, where each element is a dictionary representing an image.

```
file_name         # The relative image path, e.g. images/n07745046/21_371_29405651261_633d076053_c.jpg.
height            # The height of the image.
width             # The width of the image.
id                # Unique identifier of the image.
```

- categories : stores a list containing category information, where each element is a dictionary representing a category.

```
name              # English name of the category.
name_zh           # Chinese name of the category.
cat_info          # Description of the category, stored as a list.
cat_info_gpt      # Description of the category generated by ChatGPT, stored as a list.
novel             # For open-vocabulary detection, indicates whether the category is a 'novel' category.
id                # Unique identifier of the category.
```

- annotations : stores a list containing annotation information, where each element is a dictionary representing a bounding box annotation.

```
image_id          # The unique identifier of the image where the bounding box is located.
category_id       # The unique identifier of the category corresponding to the bounding box.
bbox              # The coordinates of the bounding box, in the format [x, y, w, h]: the top-left corner coordinates, then the width and height of the box.
iscrowd           # Whether the bounding box is a crowd box.
area              # The area of the bounding box.
```

### Category Tree

- The category tree stores information about dataset category mappings and relationships in dictionary format.

```
categoryid2treeid      # Unique identifier of the node in the category tree corresponding to each category identifier in the dataset
id2name                # English name corresponding to each node in the category tree
id2name_zh             # Chinese name corresponding to each node in the category tree
id2desc                # English description corresponding to each node in the category tree
id2desc_zh             # Chinese description corresponding to each node in the category tree
id2synonym_list        # List of synonyms corresponding to each node in the category tree
id2center_synonym      # Center synonym corresponding to each node in the category tree
father2child           # All direct child categories corresponding to each node in the category tree
child2father           # All direct parent categories corresponding to each node in the category tree
ancestor2descendant    # All descendant nodes corresponding to each node in the category tree
descendant2ancestor    # All ancestor nodes corresponding to each node in the category tree
```

## Image Download

- Run the following command to crawl the images. By default, the images will be stored in the './V3Det/' directory.

```
python v3det_image_download.py
```

- To change the storage location, specify the desired folder with the '--output_folder' option when executing the script.

```
python v3det_image_download.py --output_folder our_folder
```

## Category Tree Visualization

- Run the following command, then select the dataset path `path/to/V3Det` to visualize the category tree.

```
python v3det_visualize_tree.py
```

Please refer to the [TreeUI Operation Guide](VisualTree.md) for more information.

## License

- **V3Det Images**: Around 90% of the images in V3Det were selected from the [Bamboo Dataset](https://github.com/ZhangYuanhan-AI/Bamboo), sourced from the Flickr website. The remaining 10% were crawled directly from Flickr. **We do not own the copyright of the images.** Use of the images must abide by the [Flickr Terms of Use](https://www.flickr.com/creativecommons/).
  We only provide lists of image URLs and do not redistribute the images.
- **V3Det Annotations**: The V3Det annotations, the category relationship tree, and related tools are licensed under a [Creative Commons Attribution 4.0 License](https://creativecommons.org/licenses/by/4.0/) (commercial use is allowed).

## Citation

```bibtex
@inproceedings{wang2023v3det,
      title = {V3Det: Vast Vocabulary Visual Detection Dataset},
      author = {Wang, Jiaqi and Zhang, Pan and Chu, Tao and Cao, Yuhang and Zhou, Yujie and Wu, Tong and Wang, Bin and He, Conghui and Lin, Dahua},
      booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
      month = {October},
      year = {2023}
}
```
Provider:
myownskyW7
Summary of original information

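Dataset overview

Name: V3Det: Vast Vocabulary Visual Detection Dataset

Task categories:

  • Object detection

Size category (dataset card metadata):

  • 1K<n<10K

Data format:

  • The data includes a training set and a validation set, covering 13,204 categories.
  • The training set contains 183,354 images; the validation set contains 29,821 images.

Data organization: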

V3Det/
    images/
        <category_node>/
            |────<image_name>.png
            ...
        ...
    annotations/
        |────v3det_2023_v1_category_tree.json       # Category tree
        |────category_name_13204_v3det_2023_v1.txt  # Category name
        |────v3det_2023_v1_train.json               # Train set
        |────v3det_2023_v1_val.json                 # Validation set
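
The directory layout above lends itself to a quick sanity check before training. The following is a minimal sketch, assuming the dataset root follows that layout exactly; the helper names `expected_annotation_files` and `missing_annotation_files` are illustrative and are not part of the official V3Det tooling.

```python
from pathlib import Path

# File names copied from the directory layout listed above.
ANNOTATION_FILES = {
    "category_tree": "v3det_2023_v1_category_tree.json",
    "category_names": "category_name_13204_v3det_2023_v1.txt",
    "train": "v3det_2023_v1_train.json",
    "val": "v3det_2023_v1_val.json",
}


def expected_annotation_files(root):
    """Map each annotation role to its expected path under <root>/annotations."""
    ann_dir = Path(root) / "annotations"
    return {role: ann_dir / name for role, name in ANNOTATION_FILES.items()}


def missing_annotation_files(root):
    """Return the sorted roles whose annotation file is absent under `root`."""
    return sorted(
        role
        for role, path in expected_annotation_files(root).items()
        if not path.is_file()
    )
```

Calling `missing_annotation_files("./V3Det")` would return an empty list when all four annotation files are in place, and the names of the missing roles otherwise.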

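Annotation files:

  • The train/val annotation files are provided in dictionary format and contain the keys "images", "categories", and "annotations".
    • images: a list of image information; each element is a dictionary representing an image.

      file_name     # Relative path of the image
      height        # Height of the image
      width         # Width of the image
      id            # Unique identifier of the image

    • categories: a list of category information; each element is a dictionary representing a category.

      name          # English name of the category
      name_zh       # Chinese name of the category
      cat_info      # Description of the category, stored as a list
      cat_info_gpt  # Description of the category generated by ChatGPT, stored as a list
      novel         # For open-vocabulary detection, whether the category is a 'novel' category
      id            # Unique identifier of the category

    • annotations: a list of annotation information; each element is a dictionary representing a bounding box annotation.

      image_id      # Unique identifier of the image containing the bounding box
      category_id   # Unique identifier of the category of the bounding box
      bbox          # Bounding box coordinates in the format [x, y, w, h]: top-left corner, then width and height
      iscrowd       # Whether the bounding box is a crowd box
      area          # Area of the bounding box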
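
Many detection toolkits expect corner-format boxes grouped per image, so the fields above are often reshaped before use. The following is a minimal sketch, assuming only the keys documented above; the function names and the toy record (including its values) are made up for illustration.

```python
from collections import defaultdict


def xywh_to_xyxy(bbox):
    """Convert [x, y, w, h] (top-left corner, width, height) to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def boxes_by_image(dataset):
    """Group annotations by image_id as (category_id, [x1, y1, x2, y2]) pairs."""
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        grouped[ann["image_id"]].append(
            (ann["category_id"], xywh_to_xyxy(ann["bbox"]))
        )
    return dict(grouped)


# Toy record using the documented keys; every value here is invented.
toy = {
    "images": [
        {"file_name": "images/n000/a.jpg", "height": 100, "width": 200, "id": 1}
    ],
    "categories": [{"name": "apple", "id": 7}],
    "annotations": [
        {"image_id": 1, "category_id": 7, "bbox": [10, 20, 30, 40],
         "iscrowd": 0, "area": 1200},
    ],
}

print(boxes_by_image(toy))
# {1: [(7, [10, 20, 40, 60])]}
```

For the real files, `dataset` would be the dictionary obtained by parsing `v3det_2023_v1_train.json` or `v3det_2023_v1_val.json` with `json.load`.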

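Category tree:

  • The category tree stores dataset category mappings and relationships in dictionary format.

    categoryid2treeid    # Unique identifier of the tree node corresponding to each category identifier in the dataset
    id2name              # English name of each node in the category tree
    id2name_zh           # Chinese name of each node in the category tree
    id2desc              # English description of each node in the category tree
    id2desc_zh           # Chinese description of each node in the category tree
    id2synonym_list      # List of synonyms of each node in the category tree
    id2center_synonym    # Center synonym of each node in the category tree
    father2child         # All direct child categories of each node in the category tree
    child2father         # All direct parent categories of each node in the category tree
    ancestor2descendant  # All descendant nodes of each node in the category tree
    descendant2ancestor  # All ancestor nodes of each node in the category tree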
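
To illustrate how the parent/child mappings relate to the descendant mapping, the sketch below walks `father2child` breadth-first to collect every descendant of a node. It assumes each `father2child` value is a list of direct child node ids, which is how the field is described above but has not been checked against the actual JSON; the node ids in the toy tree are invented.

```python
from collections import deque


def descendants(father2child, node):
    """Collect every node reachable from `node` by following father2child links."""
    found = []
    visited = {node}
    queue = deque(father2child.get(node, []))
    while queue:
        current = queue.popleft()
        if current in visited:
            continue  # skip nodes already reached through another parent
        visited.add(current)
        found.append(current)
        queue.extend(father2child.get(current, []))
    return found


# Toy tree with made-up node ids; real ids come from the category tree file.
toy_father2child = {"food": ["fruit", "bread"], "fruit": ["apple", "pear"]}

print(descendants(toy_father2child, "food"))
# ['fruit', 'bread', 'apple', 'pear']
```

Since the file already ships `ancestor2descendant`, a traversal like this is mainly useful as a cross-check or when only part of the tree is needed.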

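License:

  • V3Det Images: Around 90% of the images come from the Bamboo Dataset, sourced from the Flickr website; the remaining 10% were crawled directly from Flickr. We do not own the copyright of the images. Use of the images must abide by the Flickr Terms of Use.
  • V3Det Annotations: The V3Det annotations, the category relationship tree, and related tools are licensed under the Creative Commons Attribution 4.0 License (commercial use is allowed).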
Collected summary
Dataset introduction
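Background and challenges
Background overview
V3Det is a large-scale visual detection dataset with 13,204 categories and more than 210,000 images (183,354 training and 29,821 validation images), intended mainly for object detection and open-vocabulary detection. It provides a detailed category tree and rich annotations. The images come mainly from Flickr, and the annotations are licensed under CC BY 4.0.
The above content was collected and summarized by 遇见数据集.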