
huggingface/label-files

Hugging Face · Updated 2023-03-15 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/huggingface/label-files
Description:
This repository contains the mapping from integer IDs to actual label names (in HuggingFace Transformers typically called `id2label`) for several datasets.

Current datasets include:

- ImageNet-1k
- ImageNet-22k (also called ImageNet-21k as there are 21,843 classes)
- COCO detection 2017
- COCO panoptic 2017
- ADE20k (actually, the [MIT Scene Parsing benchmark](http://sceneparsing.csail.mit.edu/), which is a subset of ADE20k)
- Cityscapes
- VQAv2
- Kinetics-700
- RVL-CDIP
- PASCAL VOC
- Kinetics-400
- ...

You can read in a label file as follows (using the `huggingface_hub` library):

```
from huggingface_hub import hf_hub_download
import json

repo_id = "huggingface/label-files"
filename = "imagenet-22k-id2label.json"
id2label = json.load(open(hf_hub_download(repo_id, filename, repo_type="dataset"), "r"))
id2label = {int(k): v for k, v in id2label.items()}
```

To add an `id2label` mapping for a new dataset, simply define a Python dictionary, and then save that dictionary as a JSON file, like so:

```
import json

# simple example
id2label = {0: 'cat', 1: 'dog'}

with open('cats-and-dogs-id2label.json', 'w') as fp:
    json.dump(id2label, fp)
```

You can then upload it to this repository (assuming you have write access).
Provider:
huggingface
Summary of Original Information

Dataset Overview

Included Datasets

  • ImageNet-1k
  • ImageNet-22k (also called ImageNet-21k; contains 21,843 classes)
  • COCO detection 2017
  • COCO panoptic 2017
  • ADE20k (actually the MIT Scene Parsing benchmark, which is a subset of ADE20k)
  • Cityscapes
  • VQAv2
  • Kinetics-700
  • RVL-CDIP
  • PASCAL VOC
  • Kinetics-400

Dataset File Format

  • Each dataset's mapping file is stored in JSON format, for example imagenet-22k-id2label.json
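JSON object keys are always strings, so integer IDs are written out as strings when a mapping is saved. The minimal sketch below shows that round trip and why the keys are converted back with `int` after loading. The two-class mapping is illustrative only and is not taken from any file in the repository.

```python
import json

# Illustrative two-class mapping (an assumption for this example).
id2label = {0: "cat", 1: "dog"}

# json.dumps turns integer keys into strings, because JSON object keys must be strings.
serialized = json.dumps(id2label)

# Loading the text back yields string keys ("0", "1"), not integers.
loaded = json.loads(serialized)

# Converting the keys restores the original integer-keyed mapping.
restored = {int(k): v for k, v in loaded.items()}
```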

How to Read the Dataset Files

Use the huggingface_hub library to read the JSON file with the following Python code:

    from huggingface_hub import hf_hub_download
    import json

    repo_id = "huggingface/label-files"
    filename = "imagenet-22k-id2label.json"
    id2label = json.load(open(hf_hub_download(repo_id, filename, repo_type="dataset"), "r"))
    id2label = {int(k): v for k, v in id2label.items()}
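Once a label file is available locally, the same key conversion can be wrapped in a small helper. This is a sketch that assumes the file is a flat JSON object mapping ID strings to label names. `load_id2label` and `invert_mapping` are hypothetical helper names, not part of `huggingface_hub` or Transformers.

```python
import json
from pathlib import Path


def load_id2label(path):
    """Read a label file from a local path and return a dict with integer keys."""
    # The context manager closes the file handle once parsing is done.
    with Path(path).open("r", encoding="utf-8") as fp:
        raw = json.load(fp)
    return {int(k): v for k, v in raw.items()}


def invert_mapping(id2label):
    """Build the reverse mapping (label name -> integer ID)."""
    # If two IDs share a label name, the ID seen last overwrites the earlier one.
    return {label: idx for idx, label in id2label.items()}
```

The path returned by `hf_hub_download(repo_id, filename, repo_type="dataset")` can be passed directly to `load_id2label`. The reverse mapping corresponds to what Transformers configurations typically call `label2id`.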

How to Add a New Dataset Mapping

  1. Define a Python dictionary.
  2. Save the dictionary as a JSON file.
  3. Upload it to the repository (this requires write access).

Example code (Python):

    import json

    id2label = {0: 'cat', 1: 'dog'}

    with open('cats-and-dogs-id2label.json', 'w') as fp:
        json.dump(id2label, fp)
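The upload step can also be scripted. The sketch below is one possible approach, assuming write access to the repository and an access token already configured locally. `check_id2label` and `upload_label_file` are hypothetical helper names. The upload itself uses `HfApi.upload_file` from `huggingface_hub`. The sanity check (non-negative integer IDs, non-empty string labels) is an assumption chosen for this example, not a rule stated by the repository.

```python
import json
import os


def check_id2label(id2label):
    """Raise ValueError unless keys are non-negative ints and values are non-empty strings."""
    for key, label in id2label.items():
        # bool is a subclass of int, so it is excluded explicitly.
        if isinstance(key, bool) or not isinstance(key, int) or key < 0:
            raise ValueError(f"invalid id: {key!r}")
        if not isinstance(label, str) or not label:
            raise ValueError(f"invalid label for id {key}: {label!r}")


def upload_label_file(local_path, repo_id="huggingface/label-files"):
    """Validate a saved label file, then upload it to a dataset repository."""
    # Imported lazily so check_id2label stays usable without huggingface_hub installed.
    from huggingface_hub import HfApi

    with open(local_path, "r", encoding="utf-8") as fp:
        id2label = {int(k): v for k, v in json.load(fp).items()}
    check_id2label(id2label)

    HfApi().upload_file(
        path_or_fileobj=local_path,
        path_in_repo=os.path.basename(local_path),  # store under the bare file name
        repo_id=repo_id,
        repo_type="dataset",
    )
```

Calling `upload_label_file("cats-and-dogs-id2label.json")` would attempt the upload; without write access the call is expected to fail on the Hub side.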
