huggingface/label-files
Source: Hugging Face. Updated: 2023-03-15. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/huggingface/label-files
Description:
This repository contains the mapping from integer IDs to actual label names (in HuggingFace Transformers typically called `id2label`) for several datasets.
Current datasets include:
- ImageNet-1k
- ImageNet-22k (also called ImageNet-21k as there are 21,843 classes)
- COCO detection 2017
- COCO panoptic 2017
- ADE20k (actually, the [MIT Scene Parsing benchmark](http://sceneparsing.csail.mit.edu/), which is a subset of ADE20k)
- Cityscapes
- VQAv2
- Kinetics-700
- RVL-CDIP
- PASCAL VOC
- Kinetics-400
- ...
You can read in a label file as follows (using the `huggingface_hub` library):
```
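import json
from huggingface_hub import hf_hub_download

repo_id = "huggingface/label-files"
filename = "imagenet-22k-id2label.json"
# Download the file from the dataset repo and parse it
with open(hf_hub_download(repo_id, filename, repo_type="dataset"), "r") as f:
    id2label = json.load(f)
# JSON object keys are strings, so convert them back to integer ids
id2label = {int(k): v for k, v in id2label.items()}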
```
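The `int(k)` conversion in the snippet above is needed because JSON object keys are always strings. The following is a minimal offline sketch of that round trip. It also builds the inverse `label2id` mapping, which Transformers configs typically pair with `id2label`. The toy labels are illustrative and are not taken from any file in this repository.

```python
import json

# Toy mapping for illustration; not taken from any file in this repository.
id2label = {0: "cat", 1: "dog"}

# Serializing to JSON turns integer keys into strings ("0", "1").
raw = json.loads(json.dumps(id2label))

# Restore integer ids, as in the loading snippet above.
restored = {int(k): v for k, v in raw.items()}

# Inverse mapping (label name -> integer id).
label2id = {v: k for k, v in restored.items()}
```

Inverting the dictionary this way assumes label names are unique. If a file contains duplicate names, later ids silently overwrite earlier ones in `label2id`.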
To add an `id2label` mapping for a new dataset, simply define a Python dictionary, and then save that dictionary as a JSON file, like so:
```
import json
# simple example
id2label = {0: 'cat', 1: 'dog'}
with open('cats-and-dogs-id2label.json', 'w') as fp:
    json.dump(id2label, fp)
```
You can then upload it to this repository (assuming you have write access).
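One possible way to script the save and upload steps is sketched below. The helper names `save_id2label` and `upload_label_file` are hypothetical, and the type check is an illustrative safeguard rather than a documented requirement of this repository. The upload relies on `HfApi.upload_file` from `huggingface_hub` and assumes you are already authenticated with write access.

```python
import json


def save_id2label(id2label, path):
    """Check a mapping and write it as JSON (hypothetical helper)."""
    for idx, name in id2label.items():
        # Illustrative check: integer ids mapped to string label names.
        if not isinstance(idx, int) or not isinstance(name, str):
            raise TypeError(f"expected int -> str, got {idx!r} -> {name!r}")
    with open(path, "w") as fp:
        json.dump(id2label, fp)


def upload_label_file(path, filename, repo_id="huggingface/label-files"):
    """Upload a local label file to a dataset repo (needs write access)."""
    # Imported lazily so save_id2label works without huggingface_hub installed.
    from huggingface_hub import HfApi

    HfApi().upload_file(
        path_or_fileobj=path,
        path_in_repo=filename,
        repo_id=repo_id,
        repo_type="dataset",
    )
```

For example, `save_id2label({0: "cat", 1: "dog"}, "cats-and-dogs-id2label.json")` writes the file locally, and `upload_label_file("cats-and-dogs-id2label.json", "cats-and-dogs-id2label.json")` would then push it under the same name.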
Provider:
huggingface
Summary of original information
Dataset overview
Included datasets
- ImageNet-1k
- ImageNet-22k (also called ImageNet-21k, as there are 21,843 classes)
- COCO detection 2017
- COCO panoptic 2017
- ADE20k (actually the MIT Scene Parsing benchmark, which is a subset of ADE20k)
- Cityscapes
- VQAv2
- Kinetics-700
- RVL-CDIP
- PASCAL VOC
- Kinetics-400
Dataset file format
- The mapping files are stored as JSON, for example `imagenet-22k-id2label.json`.
Reading the dataset files
Using the `huggingface_hub` library, read a JSON file with the following code:
```python
from huggingface_hub import hf_hub_download
import json
repo_id = "huggingface/label-files"
filename = "imagenet-22k-id2label.json"
id2label = json.load(open(hf_hub_download(repo_id, filename, repo_type="dataset"), "r"))
id2label = {int(k):v for k,v in id2label.items()}
```
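How to add a new dataset mapping
- Define a Python dictionary.
- Save the dictionary as a JSON file.
- Upload it to the repository.
Example code:
```python
import json
id2label = {0: 'cat', 1: 'dog'}
with open('cats-and-dogs-id2label.json', 'w') as fp:
    json.dump(id2label, fp)
```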



