pemujo/GLDv2_Top_51_Categories

Name: pemujo/GLDv2_Top_51_Categories
Creator: pemujo
Published: 2023-05-18 08:30:54
License: no description available

Source: Hugging Face. Updated 2023-05-18; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/pemujo/GLDv2_Top_51_Categories


Resource description:

---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: landmark_id
    dtype: int64
  - name: category
    dtype: string
  - name: image
    dtype: image
  - name: label
    dtype: int64
  splits:
  - name: train
    num_bytes: 2428986323.125
    num_examples: 36463
  - name: test
    num_bytes: 606874794.5
    num_examples: 9116
  download_size: 3034360629
  dataset_size: 3035861117.625
language:
- en
pretty_name: GLDv2 Top 51 Categories
size_categories:
- n<1K
---

# Dataset Card for GLDv2 Top 51 Categories

### Dataset Summary

This dataset is a subset of the data from Kaggle's Google Landmark Recognition 2021 competition, restricted to the categories that have more than 500 images.

https://www.kaggle.com/competitions/landmark-recognition-2021/data

The dataset consists of 45579 color images of size 224x224 in 51 categories.

### Languages

English

## Dataset Structure

### Data Fields

- `landmark_id`: Int - Numeric identifier of the category
- `category`: String - Name of the category
- `id`: String - Image identifier
- `image`: Image - PIL image object
- `label`: Int - Numeric label from 0 to 50

### Data Splits

The dataset was randomly split, with 80% of the images in the train set and 20% in the test set.

|         | train | test |
|---------|------:|-----:|
| Dataset | 36463 |  9116 |

### Source Data

The full dataset is from the Kaggle Landmark Recognition 2021 competition:

"Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874

https://www.kaggle.com/competitions/landmark-recognition-2021/data

### Citation Information

"Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20

"Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874
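The card does not include a loading example. A minimal sketch follows, assuming the Hugging Face `datasets` library is installed and that the repository id from the download link above still resolves. The helper names `load_splits` and `check_split_sizes` are hypothetical and are not part of the dataset or the library.

```python
# Repository id and split sizes as stated in the dataset card.
REPO_ID = "pemujo/GLDv2_Top_51_Categories"
EXPECTED_SPLITS = {"train": 36463, "test": 9116}


def load_splits(repo_id=REPO_ID):
    """Download the dataset (roughly 3 GB) and return its splits.

    Assumption: the third-party `datasets` package is available. It is
    imported lazily so the rest of this sketch runs without it.
    """
    from datasets import load_dataset

    return load_dataset(repo_id)


def check_split_sizes(sizes, expected=EXPECTED_SPLITS):
    """Return the names of splits whose row count differs from the card."""
    return [name for name, count in expected.items() if sizes.get(name) != count]
```

After calling `load_splits()`, the observed sizes can be collected with `{name: split.num_rows for name, split in ds.items()}` and passed to `check_split_sizes`; an empty list means the download matches the counts in the card.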

Provider:

pemujo

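Summary of Original Information

Dataset Overview

Name: GLDv2 Top 51 Categories
Dataset size: 3035861117.625 bytes in total
Download size: 3034360629 bytes
Language: English
Number of categories: 51
Total number of images: 45579
Image size: 224x224 pixels, color
Data splits:
- Training set: 36463 images (80%)
- Test set: 9116 images (20%)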
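The split percentages can be checked against the stated counts with a little arithmetic. The sketch below uses only the numbers given above; the function name `split_fractions` is illustrative.

```python
# Counts taken from the overview above.
TRAIN_EXAMPLES = 36463
TEST_EXAMPLES = 9116


def split_fractions(train, test):
    """Return the total and the share of each split."""
    total = train + test
    return total, train / total, test / total


total, train_frac, test_frac = split_fractions(TRAIN_EXAMPLES, TEST_EXAMPLES)
print(total)                # 45579
print(f"{train_frac:.4f}")  # 0.8000
print(f"{test_frac:.4f}")   # 0.2000
```

The two counts sum to the stated 45579 images, and the shares round to the stated 80% and 20%.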

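Data Fields

id: string, image identifier
landmark_id: integer, numeric identifier of the category
category: string, name of the category
image: image, PIL image object
label: integer, numeric label ranging from 0 to 50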
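The field list above can be turned into a simple record check. This is a sketch under stated assumptions: records are plain dictionaries keyed by field name, the `image` field is only checked for presence, and `validate_record` together with the sample values is hypothetical rather than taken from the dataset.

```python
# Field names and types from the list above; `image` is handled separately
# because its concrete class (a PIL image) is not needed for this check.
EXPECTED_FIELDS = {"id": str, "landmark_id": int, "category": str, "label": int}
NUM_CLASSES = 51  # labels run from 0 to 50


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"wrong type for {name}: {type(record[name]).__name__}")
    if "image" not in record:
        problems.append("missing field: image")
    label = record.get("label")
    if isinstance(label, int) and not 0 <= label < NUM_CLASSES:
        problems.append(f"label out of range: {label}")
    return problems


# Made-up sample values, for illustration only.
sample = {
    "id": "example-image-id",
    "landmark_id": 12345,
    "category": "Example landmark",
    "image": object(),  # stands in for a PIL image
    "label": 7,
}
print(validate_record(sample))  # []
```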

Dataset Source

Original dataset: Kaggle Landmark Recognition 2021
Related research:
- "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20
- "Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arxiv:2108.08874
