Seenka/banners-dict
Source: Hugging Face. Updated 2023-07-21; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Seenka/banners-dict
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': none
          '1': videograph
          '2': zocalo
  - name: cropped_image
    dtype: image
  - name: embeddings_cropped
    sequence: float32
  - name: embeddings
    sequence: float32
  - name: ocr_out
    list:
    - name: bbox
      sequence:
        sequence: float64
    - name: confidence
      dtype: float64
    - name: text
      dtype: string
  splits:
  - name: train
    num_bytes: 200666273.136
    num_examples: 1182
  - name: canal_12
    num_bytes: 26133565.0
    num_examples: 265
  download_size: 229251427
  dataset_size: 226799838.136
---
# Dataset Card for "banners-dict"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
Seenka
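Summary of original information
Dataset overview
Dataset features
- image: image data
- label: class label, one of:
  - 0: none
  - 1: videograph
  - 2: zocalo
- cropped_image: cropped image data
- embeddings_cropped: embedding vector of the cropped image, a sequence of float32
- embeddings: embedding vector of the image, a sequence of float32
- ocr_out: OCR output, a list of items with the following sub-features:
  - bbox: bounding box, a sequence of float64 sequences
  - confidence: confidence score, float64
  - text: recognized text, string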
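The feature list above can be mirrored in plain Python to show how one record's `label` and `ocr_out` fields might be consumed. This is a minimal sketch, not code from the dataset authors: the helper names (`label_name`, `confident_text`), the 0.5 threshold, the reading of `bbox` as four corner points, and the sample record are all assumptions made for illustration.

```python
from typing import List, TypedDict

# Class ids and names exactly as listed in the dataset card.
LABEL_NAMES = ["none", "videograph", "zocalo"]


class OcrItem(TypedDict):
    """One entry of `ocr_out`, following the card's sub-features."""
    bbox: List[List[float]]  # sequence of float64 sequences (corner points: assumed)
    confidence: float
    text: str


def label_name(label_id: int) -> str:
    """Map an integer class id to its name."""
    return LABEL_NAMES[label_id]


def confident_text(ocr_out: List[OcrItem], min_confidence: float = 0.5) -> str:
    """Join OCR fragments whose confidence meets the (assumed) threshold."""
    return " ".join(
        item["text"] for item in ocr_out if item["confidence"] >= min_confidence
    )


# Hypothetical OCR output, invented for illustration; not taken from the dataset.
sample_ocr: List[OcrItem] = [
    {
        "bbox": [[0.0, 0.0], [120.0, 0.0], [120.0, 30.0], [0.0, 30.0]],
        "confidence": 0.93,
        "text": "ULTIMO",
    },
    {
        "bbox": [[130.0, 0.0], [260.0, 0.0], [260.0, 30.0], [130.0, 30.0]],
        "confidence": 0.21,
        "text": "M0MENT0",
    },
]

print(label_name(2))               # zocalo
print(confident_text(sample_ocr))  # ULTIMO
```

Filtering on `confidence` before using `text` is one plausible way to drop noisy OCR fragments; the appropriate cutoff would depend on the OCR engine, which the card does not name.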
Dataset splits
- train: training split, 1182 examples, 200666273.136 bytes
- canal_12: additional split, 265 examples, 26133565.0 bytes
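A split listed above could be fetched with the Hugging Face `datasets` library roughly as follows. This is a sketch under assumptions: it presumes the repository `Seenka/banners-dict` is still publicly reachable, that `datasets` is installed, and that network access is available. `load_split` is a hypothetical wrapper, not part of any library.

```python
# Split names and example counts as stated in the dataset card.
SPLIT_EXAMPLES = {"train": 1182, "canal_12": 265}


def load_split(split: str):
    """Load one split from the Hub (needs network access and `pip install datasets`)."""
    if split not in SPLIT_EXAMPLES:
        raise ValueError(
            f"unknown split {split!r}; expected one of {sorted(SPLIT_EXAMPLES)}"
        )
    # Imported lazily so this module can be imported without the library installed.
    from datasets import load_dataset

    return load_dataset("Seenka/banners-dict", split=split)
```

Calling `load_split("canal_12")` would return a `Dataset` object; if the hosted data still matches the card, its length should agree with `SPLIT_EXAMPLES["canal_12"]`.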
Dataset size
- Download size: 229251427 bytes
- Dataset size: 226799838.136 bytes
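As a quick consistency check on the figures above, the two per-split byte counts should add up to the reported dataset size. The sketch below uses `Decimal` so the fractional byte values from the card compare exactly; `to_mib` is a hypothetical helper added only for readability.

```python
from decimal import Decimal

# Byte counts copied from the dataset card.
SPLIT_BYTES = {
    "train": Decimal("200666273.136"),
    "canal_12": Decimal("26133565.0"),
}
DATASET_SIZE = Decimal("226799838.136")
DOWNLOAD_SIZE = Decimal("229251427")


def to_mib(num_bytes: Decimal) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return float(num_bytes) / 2**20


total = sum(SPLIT_BYTES.values(), Decimal(0))
print(total == DATASET_SIZE)  # True: the split sizes sum to dataset_size
print(f"dataset:  {to_mib(DATASET_SIZE):.2f} MiB")
print(f"download: {to_mib(DOWNLOAD_SIZE):.2f} MiB")
```

By these numbers the download size is slightly larger than the dataset size.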



