renumics/cifar100-outlier
Hugging Face · Updated 2023-06-30 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/renumics/cifar100-outlier
Resource description:
---
annotations_creators:
- crowdsourced
language_creators:
- found
language:
- en
license:
- unknown
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- extended|other-80-Million-Tiny-Images
task_categories:
- image-classification
task_ids: []
paperswithcode_id: cifar-100
pretty_name: Cifar100
dataset_info:
  features:
  - name: img
    dtype: image
  - name: fine_label
    dtype:
      class_label:
        names:
          '0': apple
          '1': aquarium_fish
          '2': baby
          '3': bear
          '4': beaver
          '5': bed
          '6': bee
          '7': beetle
          '8': bicycle
          '9': bottle
          '10': bowl
          '11': boy
          '12': bridge
          '13': bus
          '14': butterfly
          '15': camel
          '16': can
          '17': castle
          '18': caterpillar
          '19': cattle
          '20': chair
          '21': chimpanzee
          '22': clock
          '23': cloud
          '24': cockroach
          '25': couch
          '26': cra
          '27': crocodile
          '28': cup
          '29': dinosaur
          '30': dolphin
          '31': elephant
          '32': flatfish
          '33': forest
          '34': fox
          '35': girl
          '36': hamster
          '37': house
          '38': kangaroo
          '39': keyboard
          '40': lamp
          '41': lawn_mower
          '42': leopard
          '43': lion
          '44': lizard
          '45': lobster
          '46': man
          '47': maple_tree
          '48': motorcycle
          '49': mountain
          '50': mouse
          '51': mushroom
          '52': oak_tree
          '53': orange
          '54': orchid
          '55': otter
          '56': palm_tree
          '57': pear
          '58': pickup_truck
          '59': pine_tree
          '60': plain
          '61': plate
          '62': poppy
          '63': porcupine
          '64': possum
          '65': rabbit
          '66': raccoon
          '67': ray
          '68': road
          '69': rocket
          '70': rose
          '71': sea
          '72': seal
          '73': shark
          '74': shrew
          '75': skunk
          '76': skyscraper
          '77': snail
          '78': snake
          '79': spider
          '80': squirrel
          '81': streetcar
          '82': sunflower
          '83': sweet_pepper
          '84': table
          '85': tank
          '86': telephone
          '87': television
          '88': tiger
          '89': tractor
          '90': train
          '91': trout
          '92': tulip
          '93': turtle
          '94': wardrobe
          '95': whale
          '96': willow_tree
          '97': wolf
          '98': woman
          '99': worm
  - name: coarse_label
    dtype:
      class_label:
        names:
          '0': aquatic_mammals
          '1': fish
          '2': flowers
          '3': food_containers
          '4': fruit_and_vegetables
          '5': household_electrical_devices
          '6': household_furniture
          '7': insects
          '8': large_carnivores
          '9': large_man-made_outdoor_things
          '10': large_natural_outdoor_scenes
          '11': large_omnivores_and_herbivores
          '12': medium_mammals
          '13': non-insect_invertebrates
          '14': people
          '15': reptiles
          '16': small_mammals
          '17': trees
          '18': vehicles_1
          '19': vehicles_2
  - name: embedding_foundation
    sequence: float32
  - name: embedding_ft
    sequence: float32
  - name: outlier_score_ft
    dtype: float64
  - name: outlier_score_foundation
    dtype: float64
  - name: nn_image
    struct:
    - name: bytes
      dtype: binary
    - name: path
      dtype: 'null'
  splits:
  - name: train
    num_bytes: 583557742.0
    num_examples: 50000
  download_size: 643988234
  dataset_size: 583557742.0
---
# Dataset Card for "cifar100-outlier"
📚 This dataset is an enriched version of the [CIFAR-100 Dataset](https://www.cs.toronto.edu/~kriz/cifar.html).
The workflow is described in the Medium article [Changes of Embeddings during Fine-Tuning of Transformers](https://medium.com/@markus.stoll/changes-of-embeddings-during-fine-tuning-c22aa1615921).
## Explore the Dataset
The open-source data curation tool [Renumics Spotlight](https://github.com/Renumics/spotlight) lets you explore this dataset. A Hugging Face Space running Spotlight with this dataset is available here: <https://huggingface.co/spaces/renumics/cifar100-outlier>.

Or you can explore it locally:
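```python
# Install the dependencies first (in a notebook cell, prefix the command with "!"):
#   pip install renumics-spotlight datasets
from renumics import spotlight
import datasets

# Load the enriched training split from the Hugging Face Hub.
ds = datasets.load_dataset("renumics/cifar100-outlier", split="train")

# Rename the columns and convert the split to a pandas DataFrame.
df = ds.rename_columns({"img": "image", "fine_label": "labels"}).to_pandas()
# Add the class name as a string next to the integer label.
df["label_str"] = df["labels"].apply(lambda x: ds.features["fine_label"].int2str(x))

# Tell Spotlight how to render the image and embedding columns.
dtypes = {
    "nn_image": spotlight.Image,
    "image": spotlight.Image,
    "embedding_ft": spotlight.Embedding,
    "embedding_foundation": spotlight.Embedding,
}
spotlight.show(
    df,
    dtype=dtypes,
    layout="https://spotlight.renumics.com/resources/layout_pre_post_ft.json",
)
```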
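
The two outlier-score columns can also be inspected without Spotlight. The following is a minimal sketch, not part of the original card: it assumes that a larger `outlier_score_*` value marks a more unusual sample (the card does not document the scale), and `top_outliers` is a hypothetical helper that accepts any list of dict-like rows.

```python
import heapq
from typing import Any, Iterable, List, Mapping


def top_outliers(
    rows: Iterable[Mapping[str, Any]], score_key: str, k: int = 10
) -> List[Mapping[str, Any]]:
    """Return the k rows with the largest value stored under score_key.

    Assumption: a higher score means a stronger outlier. Switch to
    heapq.nsmallest if the scores are oriented the other way.
    """
    return heapq.nlargest(k, rows, key=lambda row: row[score_key])


# Synthetic rows standing in for dataset records; the scores are made up.
rows = [
    {"label_str": "apple", "outlier_score_ft": 0.10, "outlier_score_foundation": 0.90},
    {"label_str": "baby", "outlier_score_ft": 0.70, "outlier_score_foundation": 0.20},
    {"label_str": "bear", "outlier_score_ft": 0.95, "outlier_score_foundation": 0.40},
]

# Rank the same rows once per model to see how the ordering differs.
by_ft = top_outliers(rows, "outlier_score_ft", k=2)
by_foundation = top_outliers(rows, "outlier_score_foundation", k=1)
print([r["label_str"] for r in by_ft])
print([r["label_str"] for r in by_foundation])
```

With the real data, the rows could come from something like `df[["label_str", "outlier_score_ft", "outlier_score_foundation"]].to_dict("records")`, which skips the image and embedding columns.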
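Provided by:
renumics
Summary of Original Information
Dataset Overview
Basic Information
- Dataset name: Cifar100
- Language: English
- Dataset size: 10K<n<100K
- Multilinguality: monolingual
- License: unknown
- Source dataset: extended from 80 Million Tiny Images
- Task category: image classification
- PapersWithCode ID: cifar-100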
Dataset Features
- Image:
  - Name: img
  - Data type: image
- Fine-grained label:
  - Name: fine_label
  - Data type: class_label
  - Label names:
    - 0: apple
    - 1: aquarium_fish
    - ...
    - 99: worm
- Coarse-grained label:
  - Name: coarse_label
  - Data type: class_label
  - Label names:
    - 0: aquatic_mammals
    - 1: fish
    - ...
    - 19: vehicles_2
- Foundation embedding:
  - Name: embedding_foundation
  - Data type: sequence of float32
- FT (fine-tuned) embedding:
  - Name: embedding_ft
  - Data type: sequence of float32
- Outlier score (FT):
  - Name: outlier_score_ft
  - Data type: float64
- Outlier score (foundation):
  - Name: outlier_score_foundation
  - Data type: float64
- Nearest-neighbor image:
  - Name: nn_image
  - Data type: struct
  - Fields:
    - Name: bytes
      - Data type: binary
    - Name: path
      - Data type: null
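
Because `nn_image` is stored as a struct with a binary `bytes` field and a null `path`, a raw record holds encoded image bytes rather than a decoded image. The sketch below shows one way to pull those bytes out and guess their format from the leading magic bytes. The helper names are hypothetical, and PNG or JPEG encoding is an assumption, since the card does not state the encoding.

```python
from typing import Any, Mapping, Optional


def nn_image_bytes(record: Mapping[str, Any]) -> Optional[bytes]:
    """Return the raw bytes of the nn_image struct, or None if it is missing."""
    struct = record.get("nn_image")
    if not struct:
        return None
    return struct.get("bytes")


def sniff_image_format(data: bytes) -> str:
    """Guess the image format from its magic bytes (PNG and JPEG only)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


# Synthetic record: a PNG signature followed by padding, not a real image.
fake_record = {"nn_image": {"bytes": b"\x89PNG\r\n\x1a\n" + b"\x00" * 4, "path": None}}

payload = nn_image_bytes(fake_record)
if payload is not None:
    print(sniff_image_format(payload))
```

An image library can then decode the bytes from an in-memory buffer such as `io.BytesIO(payload)`.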
Dataset Splits
- Training set:
  - Name: train
  - Number of bytes: 583557742.0
  - Number of examples: 50000
Dataset Size
- Download size: 643988234
- Dataset size: 583557742.0
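
The byte counts above are easier to read after a unit conversion. This short sketch converts them to MiB and computes the average storage per example. The constants are copied from the card; the helper `to_mib` is only illustrative.

```python
DATASET_BYTES = 583_557_742   # dataset_size from the card
DOWNLOAD_BYTES = 643_988_234  # download_size from the card
NUM_EXAMPLES = 50_000         # examples in the train split

MIB = 1024 ** 2  # bytes per mebibyte


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


# Average number of bytes stored per example (image, embeddings, scores, nn_image).
bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

print(f"dataset:  {to_mib(DATASET_BYTES):.1f} MiB")
print(f"download: {to_mib(DOWNLOAD_BYTES):.1f} MiB")
print(f"average:  {bytes_per_example:.0f} bytes per example")
```

This works out to roughly 556.5 MiB for the dataset, 614.2 MiB for the download, and about 11.7 kB per example.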



