
pemujo/GLDv2_Top_51_Categories

Source: Hugging Face. Updated 2023-05-18. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/pemujo/GLDv2_Top_51_Categories
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: landmark_id
    dtype: int64
  - name: category
    dtype: string
  - name: image
    dtype: image
  - name: label
    dtype: int64
  splits:
  - name: train
    num_bytes: 2428986323.125
    num_examples: 36463
  - name: test
    num_bytes: 606874794.5
    num_examples: 9116
  download_size: 3034360629
  dataset_size: 3035861117.625
language:
- en
pretty_name: GLDv2 Top 51 Categories
size_categories:
- n<1K
---

# Dataset Card for Dataset Name

### Dataset Summary

This dataset is a subset of Kaggle's Google Landmark Recognition 2021 competition with only the categories with more than 500 images.
https://www.kaggle.com/competitions/landmark-recognition-2021/data

The dataset consists of a total of 45579 224x224 color images in 51 categories.

### Languages

English

## Dataset Structure

### Data Fields

- `landmark_id`: Int - Numeric identifier of the category
- `category`: String - Name of the category
- `id`: String - Image identifier
- `image`: Image - PIL image object
- `label`: Int - Numeric label from 0 to 50

### Data Splits

The dataset was randomly split with 80% of the images for the train set and 20% for the test set.

|         | train | test |
|---------|------:|-----:|
| Dataset | 36463 | 9116 |

### Source Data

The full dataset is from Kaggle Landmark Recognition 2021.

"Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874

https://www.kaggle.com/competitions/landmark-recognition-2021/data

### Citation Information

"Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20

"Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874
Provider:
pemujo
Summary of original information

Dataset overview

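  • Name: GLDv2 Top 51 Categories
  • Dataset size: 3035861117.625 bytes in total
  • Download size: 3034360629 bytes
  • Language: English
  • Number of categories: 51
  • Total images: 45579
  • Image size: 224x224 pixels, color
  • Data splits:
    • Training set: 36463 images (80%)
    • Test set: 9116 images (20%)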
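
As a quick sanity check on the figures above, the following minimal sketch recomputes the total image count and each split's share from the two reported split sizes. The names `SPLITS` and `split_fractions` are illustrative and not part of the dataset or any library.

```python
# Split sizes as reported in the dataset card.
SPLITS = {"train": 36463, "test": 9116}


def split_fractions(splits):
    """Return each split's share of the total number of examples."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


total_images = sum(SPLITS.values())
fractions = split_fractions(SPLITS)

print(f"total images: {total_images}")
for name, frac in fractions.items():
    print(f"{name}: {SPLITS[name]} images ({frac:.1%})")
```

The two splits sum to the stated 45579 images, and the shares round to 80% and 20%, consistent with the random 80/20 split described in the card.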

Data fields

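  • id: string, image identifier
  • landmark_id: integer, numeric identifier of the category
  • category: string, name of the category
  • image: image, PIL image object
  • label: integer, numeric label ranging from 0 to 50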
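
The field list above can be turned into a small schema check. This is a sketch under stated assumptions: `validate_record` and `load_split` are hypothetical helpers, not part of the dataset, and `load_split` assumes the third-party Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown on this page.

```python
# Field names and Python types as described in the field list.
# "image" is only checked for presence, since it is a PIL image object.
EXPECTED_TYPES = {
    "id": str,
    "landmark_id": int,
    "category": str,
    "label": int,
}
NUM_CLASSES = 51  # labels run from 0 to 50


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    if "image" not in record:
        problems.append("missing field: image")
    label = record.get("label")
    if isinstance(label, int) and not 0 <= label < NUM_CLASSES:
        problems.append(f"label {label} outside 0..{NUM_CLASSES - 1}")
    return problems


def load_split(split="train"):
    """Load one split from the Hugging Face Hub (needs `datasets` and network access)."""
    from datasets import load_dataset  # lazy import: third-party dependency

    return load_dataset("pemujo/GLDv2_Top_51_Categories", split=split)
```

A possible use is `validate_record(load_split("test")[0])`, which should return an empty list if the first test record matches the documented fields. Note that the full download is roughly 3 GB.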

Dataset source

  • Original dataset: Kaggle Landmark Recognition 2021
  • Related research:
    • "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20
    • "Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874