86Cao/google-landmark-v2-chinese-filtered

Name: 86Cao/google-landmark-v2-chinese-filtered
Creator: 86Cao
Published: 2025-12-19 03:36:58
License: No description available

Source: Hugging Face. Updated 2025-12-19. Indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/86Cao/google-landmark-v2-chinese-filtered

Description:

This dataset contains landmark images and metadata for training landmark retrieval models, with landmark names translated into Chinese to support Chinese multimodal retrieval tasks. It is derived from the Google Landmarks V2 dataset on Kaggle, filtered and processed for data quality. Key features: all landmark names are translated into Chinese, which suits training multimodal retrieval systems based on Vision-Language Models (VLMs); landmarks are organized by category, which enables category-based hard negative sampling during training. The dataset consists of metadata files and image folders, and provides detailed statistics for the training and validation sets.
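The category-based hard negative sampling mentioned in the description can be sketched roughly as follows. This is a minimal illustration only, not the dataset author's method: the field names `name_zh` and `category`, the helper functions, and the toy records are all hypothetical, since the listing does not document the actual metadata schema.

```python
import random
from collections import defaultdict


def build_category_index(records):
    """Map each category to its unique landmark names, in first-seen order."""
    index = defaultdict(list)
    for rec in records:
        names = index[rec["category"]]
        if rec["name_zh"] not in names:
            names.append(rec["name_zh"])
    return index


def sample_hard_negatives(anchor, index, k, rng):
    """Pick up to k other landmark names that share the anchor's category.

    Same-category landmarks tend to look alike, so they make harder
    negatives than names drawn from the whole dataset.
    """
    candidates = [
        name
        for name in index.get(anchor["category"], [])
        if name != anchor["name_zh"]
    ]
    return rng.sample(candidates, min(k, len(candidates)))


# Toy records with assumed field names; the real metadata may differ.
records = [
    {"name_zh": "埃菲尔铁塔", "category": "tower"},
    {"name_zh": "东方明珠", "category": "tower"},
    {"name_zh": "比萨斜塔", "category": "tower"},
    {"name_zh": "金门大桥", "category": "bridge"},
]

index = build_category_index(records)
rng = random.Random(0)  # fixed seed so the sampling is reproducible
negatives = sample_hard_negatives(records[0], index, k=2, rng=rng)
no_negatives = sample_hard_negatives(records[3], index, k=2, rng=rng)
```

Here the anchor is a tower, so its negatives are the two other towers and never the bridge. A category with only one landmark yields an empty list, in which case a training loop would likely fall back to random negatives.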

Provider:

86Cao
