pemujo/GLDv2_Top_51_Categories
Source: Hugging Face. Updated 2023-05-18. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/pemujo/GLDv2_Top_51_Categories
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: landmark_id
    dtype: int64
  - name: category
    dtype: string
  - name: image
    dtype: image
  - name: label
    dtype: int64
  splits:
  - name: train
    num_bytes: 2428986323.125
    num_examples: 36463
  - name: test
    num_bytes: 606874794.5
    num_examples: 9116
  download_size: 3034360629
  dataset_size: 3035861117.625
language:
- en
pretty_name: GLDv2 Top 51 Categories
size_categories:
- 10K<n<100K
---
# Dataset Card for GLDv2 Top 51 Categories
### Dataset Summary
This dataset is a subset of the data from Kaggle's Google Landmark Recognition 2021 competition, restricted to the categories that have more than 500 images.
https://www.kaggle.com/competitions/landmark-recognition-2021/data
The dataset contains 45579 color images of 224x224 pixels in 51 categories.
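The filtering rule above (keep only categories with more than 500 images) can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the dataset: the function name `select_frequent_categories` is hypothetical, and assigning labels in ascending `landmark_id` order is an assumption, since the card does not say how `label` values were assigned.

```python
from collections import Counter


def select_frequent_categories(landmark_ids, min_images=500):
    """Keep landmark ids that occur more than `min_images` times.

    Returns a dict mapping each kept landmark id to a contiguous label
    (0, 1, 2, ...). Labels are assigned in ascending landmark-id order,
    which is an assumption of this sketch, not something the card states.
    """
    counts = Counter(landmark_ids)
    kept = sorted(lid for lid, n in counts.items() if n > min_images)
    return {lid: label for label, lid in enumerate(kept)}


# Toy data with a threshold of 2 instead of 500, so the example stays small.
toy_ids = [7, 7, 7, 3, 3, 9, 9, 9, 9]
label_map = select_frequent_categories(toy_ids, min_images=2)
print(label_map)
```

In the toy data, landmark 3 appears only twice and is dropped, while landmarks 7 and 9 are kept and relabelled with contiguous integers.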
### Languages
English
## Dataset Structure
### Data Fields
- `landmark_id`: Int - Numeric identifier of the category
- `category`: String - Name of the category
- `id`: String - Image identifier
- `image`: Image - PIL image object
- `label`: Int - Numeric label from 0 to 50
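The field list above can be turned into a small sanity check. The sketch below is illustrative only: `check_record` and `load_split` are hypothetical helper names, the sample record uses made-up placeholder values, and `load_split` assumes the Hugging Face `datasets` package is installed and that the repository id shown at the top of this page is still available.

```python
# Expected Python types for the non-image fields, taken from the field list.
EXPECTED_FIELDS = {
    "id": str,
    "landmark_id": int,
    "category": str,
    "label": int,
}
NUM_CLASSES = 51  # labels run from 0 to 50


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"{name} should be {expected_type.__name__}")
    # The image is only checked for presence, to avoid depending on PIL here.
    if "image" not in record:
        problems.append("missing field: image")
    label = record.get("label")
    if isinstance(label, int) and not 0 <= label < NUM_CLASSES:
        problems.append("label out of range 0..50")
    return problems


def load_split(split="train"):
    """Load one split from the Hub (downloads about 3 GB on first use)."""
    from datasets import load_dataset  # imported lazily; needs network access
    return load_dataset("pemujo/GLDv2_Top_51_Categories", split=split)


# Placeholder record: the values are invented for illustration.
sample = {
    "id": "abc123",
    "landmark_id": 1234,
    "category": "Example landmark",
    "image": None,
    "label": 3,
}
print(check_record(sample))
```

With the real data, one could run `check_record(load_split("test")[0])` to inspect the first test record.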
### Data Splits
The dataset was randomly split, with 80% of the images assigned to the train set and 20% to the test set.
| | train | test |
|----------------------|------:|-----:|
| Dataset | 36463 | 9116 |
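A random 80/20 split of this kind can be sketched with the standard library. The seed and the rounding rule (test size rounded up) are assumptions of this sketch; the card only states that the split was random with an 80/20 ratio, and `random_split` is a hypothetical helper name.

```python
import math
import random


def random_split(n_items, test_fraction=0.2, seed=0):
    """Shuffle item indices and split them into (train, test) index lists.

    The test size is rounded up, and the seed is fixed for reproducibility.
    Both choices are assumptions, not details taken from the dataset card.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = random_split(45579)
print(len(train_idx), len(test_idx))
```

Under these assumptions the split sizes come out as 36463 and 9116, which matches the table, although the actual image assignment in the published dataset will differ from this sketch.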
### Source Data
The full dataset comes from the Kaggle Landmark Recognition 2021 competition, which is described in:
"Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arXiv:2108.08874
https://www.kaggle.com/competitions/landmark-recognition-2021/data
### Citation Information
- "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20
- "Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arXiv:2108.08874
Provider:
pemujo
Summary of original information
Dataset overview
- Name: GLDv2 Top 51 Categories
- Dataset size: 3035861117.625 bytes in total
- Download size: 3034360629 bytes
- Language: English
- Number of categories: 51
- Total number of images: 45579
- Image size: 224x224 pixels, color images
- Data splits:
  - Train set: 36463 images (80%)
  - Test set: 9116 images (20%)
Data fields
- id: string, image identifier
- landmark_id: integer, numeric identifier of the category
- category: string, name of the category
- image: image, PIL image object
- label: integer, numeric label from 0 to 50
Dataset source
- Original dataset: Kaggle Landmark Recognition 2021
- Related research:
  - "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", T. Weyand, A. Araujo, B. Cao and J. Sim, Proc. CVPR'20
  - "Towards A Fairer Landmark Recognition Dataset", Z. Kim, A. Araujo, B. Cao, C. Askew, J. Sim, M. Green, N. Yilla and T. Weyand, arXiv:2108.08874



