
renumics/beans-outlier

Source: Hugging Face. Updated: 2023-06-30. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/renumics/beans-outlier
Resource description:
---
annotations_creators:
- expert-generated
language_creators:
- expert-generated
language:
- en
license:
- mit
multilinguality:
- monolingual
size_categories:
- 1K<n<10K
source_datasets:
- extended
task_categories:
- image-classification
task_ids:
- multi-class-image-classification
pretty_name: Beans
dataset_info:
  features:
  - name: image_file_path
    dtype: string
  - name: image
    dtype: image
  - name: labels
    dtype:
      class_label:
        names:
          '0': angular_leaf_spot
          '1': bean_rust
          '2': healthy
  - name: embedding_foundation
    sequence: float32
  - name: embedding_ft
    sequence: float32
  - name: outlier_score_ft
    dtype: float64
  - name: outlier_score_foundation
    dtype: float64
  - name: nn_image
    dtype: image
  splits:
  - name: train
    num_bytes: 293531811.754
    num_examples: 1034
  download_size: 0
  dataset_size: 293531811.754
---

# Dataset Card for "beans-outlier"

📚 This dataset is an enhanced version of the [ibean project of the AIR lab](https://github.com/AI-Lab-Makerere/ibean/).

The workflow is described in the Medium article [Changes of Embeddings during Fine-Tuning of Transformers](https://medium.com/@markus.stoll/changes-of-embeddings-during-fine-tuning-c22aa1615921).

## Explore the Dataset

The open source data curation tool [Renumics Spotlight](https://github.com/Renumics/spotlight) allows you to explore this dataset. You can find a Hugging Face Space running Spotlight with this dataset here: <https://huggingface.co/spaces/renumics/beans-outlier>

![Analyze with Spotlight](https://spotlight.renumics.com/resources/hf-beans-outlier.png)

Or you can explore it locally:

```python
# Install the dependencies first, e.g. in a notebook cell:
# !pip install renumics-spotlight datasets

from renumics import spotlight
import datasets

# Load the train split and convert it to a pandas DataFrame
ds = datasets.load_dataset("renumics/beans-outlier", split="train")
df = ds.to_pandas()

# Add a human-readable label column next to the integer class ids
df["label_str"] = df["labels"].apply(lambda x: ds.features["labels"].int2str(x))

# Tell Spotlight how to interpret the image and embedding columns
dtypes = {
    "nn_image": spotlight.Image,
    "image": spotlight.Image,
    "embedding_ft": spotlight.Embedding,
    "embedding_foundation": spotlight.Embedding,
}

# Open the Spotlight UI with the pre/post fine-tuning layout
spotlight.show(
    df,
    dtype=dtypes,
    layout="https://spotlight.renumics.com/resources/layout_pre_post_ft.json",
)
```
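The two `outlier_score_*` columns can also be inspected without the Spotlight UI. The following is a minimal plain-Python sketch of ranking samples by score. It rests on assumptions that the card does not confirm: the rows and file names are made up for illustration, `top_outliers` is a hypothetical helper, and a higher score is assumed to mean a more unusual sample.

```python
import heapq

# Hypothetical rows that mimic a few of the dataset's columns.
# File names and scores are invented for illustration only.
rows = [
    {"image_file_path": "a.jpg", "outlier_score_ft": 0.12, "outlier_score_foundation": 0.40},
    {"image_file_path": "b.jpg", "outlier_score_ft": 0.91, "outlier_score_foundation": 0.35},
    {"image_file_path": "c.jpg", "outlier_score_ft": 0.47, "outlier_score_foundation": 0.88},
    {"image_file_path": "d.jpg", "outlier_score_ft": 0.05, "outlier_score_foundation": 0.10},
]


def top_outliers(rows, score_key, k=2):
    """Return the k rows with the largest value under score_key.

    Assumption: a higher score marks a more unusual sample.
    """
    return heapq.nlargest(k, rows, key=lambda row: row[score_key])


# Compare which samples rank highest before and after fine-tuning.
top_ft = [row["image_file_path"] for row in top_outliers(rows, "outlier_score_ft")]
top_foundation = [
    row["image_file_path"] for row in top_outliers(rows, "outlier_score_foundation")
]
print("fine-tuned:", top_ft)
print("foundation:", top_foundation)
```

With the real data, the same ranking could be applied to the records of the loaded DataFrame, provided the score direction is checked first.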
Provider:
renumics
Summary of original information

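Dataset overview

Basic information

  • Name: Beans
  • Language: English (en)
  • License: MIT
  • Multilinguality: monolingual
  • Size category: 1K<n<10K
  • Source datasets: extended from another dataset

Task type

  • Category: image classification
  • Specific task: multi-class image classification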

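Dataset features

  • Image file path (image_file_path): string
  • Image (image): image
  • Labels (labels): class label, with the values:
    • 0: angular_leaf_spot
    • 1: bean_rust
    • 2: healthy
  • Foundation embedding (embedding_foundation): sequence of float32
  • Fine-tuned embedding (embedding_ft): sequence of float32
  • Fine-tuned outlier score (outlier_score_ft): float64
  • Foundation outlier score (outlier_score_foundation): float64
  • Nearest-neighbor image (nn_image): image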
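
The label ids listed above map to class names by position. As a minimal sketch, the mapping can be written in plain Python. `LABEL_NAMES`, `int2str`, and `str2int` here are illustrative stand-ins, not the `datasets` library API, although they follow the same idea as the `int2str` call in the card's snippet.

```python
# Class names in index order, copied from the dataset card's metadata.
LABEL_NAMES = ["angular_leaf_spot", "bean_rust", "healthy"]


def int2str(label_id: int) -> str:
    """Translate an integer class id (0, 1, or 2) into its class name."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"unknown label id: {label_id}")
    return LABEL_NAMES[label_id]


def str2int(name: str) -> int:
    """Translate a class name back into its integer id."""
    return LABEL_NAMES.index(name)


print(int2str(1))          # bean_rust
print(str2int("healthy"))  # 2
```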

Dataset splits

  • Training set (train):
    • Number of examples: 1034
    • Data size: 293531811.754 bytes
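
To put the reported size in perspective, the short calculation below converts the byte count to MiB and derives an average size per example. It uses only the two figures reported above. The average is a rough indicator, since each example bundles two images, two embeddings, and scalar fields.

```python
# Figures as reported in the dataset metadata.
NUM_BYTES = 293531811.754
NUM_EXAMPLES = 1034

# Total size in mebibytes (1 MiB = 1024 * 1024 bytes).
total_mib = NUM_BYTES / 1024**2

# Average size of a single example in bytes.
bytes_per_example = NUM_BYTES / NUM_EXAMPLES

print(f"total size: {total_mib:.1f} MiB")
print(f"average per example: {bytes_per_example / 1024:.1f} KiB")
```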