kalivoda/dataset_easy_ocr_v0.3.0_clean
Source: Hugging Face. Updated: 2023-06-16. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/kalivoda/dataset_easy_ocr_v0.3.0_clean
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: words
    sequence: string
  - name: bboxes
    sequence:
      sequence: float32
  - name: image_path
    dtype: string
  - name: ner_tags
    sequence:
      class_label:
        names:
          '0': DIC
          '1': IBAN
          '2': ICO
          '3': O
          '4': account_number
          '5': bank_code
          '6': const_symbol
          '7': contr_address
          '8': contr_name
          '9': due_date
          '10': invoice_date
          '11': invoice_number
          '12': qr_code
          '13': spec_symbol
          '14': total_amount
          '15': var_symbol
  splits:
  - name: train
    num_bytes: 20705074
    num_examples: 2523
  - name: val
    num_bytes: 2370943
    num_examples: 280
  download_size: 7037725
  dataset_size: 23076017
---
# Dataset Card for "dataset_easy_ocr_v0.3.0_clean"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
kalivoda
Summary of original information
Dataset overview
Dataset name
- Name: dataset_easy_ocr_v0.3.0_clean
Dataset features
- id: string
- words: sequence of strings
- bboxes: sequence of sequences of float32 values
- image_path: string
- ner_tags: sequence of class labels, with the following classes:
  - DIC
  - IBAN
  - ICO
  - O
  - account_number
  - bank_code
  - const_symbol
  - contr_address
  - contr_name
  - due_date
  - invoice_date
  - invoice_number
  - qr_code
  - spec_symbol
  - total_amount
  - var_symbol
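The `ner_tags` feature stores integer class ids, one per entry in `words`, so a consumer typically needs to map ids back to label names. The following is a minimal sketch of that mapping in plain Python, assuming the id order given in the card's `class_label` names ('0' is DIC through '15' is var_symbol). The helper names `decode_tags` and `group_fields` and the sample record are hypothetical, invented for illustration, and not taken from the dataset.

```python
# Label names in the id order declared by the card's class_label mapping.
NER_LABELS = [
    "DIC", "IBAN", "ICO", "O", "account_number", "bank_code",
    "const_symbol", "contr_address", "contr_name", "due_date",
    "invoice_date", "invoice_number", "qr_code", "spec_symbol",
    "total_amount", "var_symbol",
]


def decode_tags(tag_ids):
    """Map integer ner_tags ids to their label names."""
    return [NER_LABELS[i] for i in tag_ids]


def group_fields(words, tag_ids):
    """Join the words of each label other than "O", preserving word order."""
    fields = {}
    for word, label in zip(words, decode_tags(tag_ids)):
        if label != "O":
            fields.setdefault(label, []).append(word)
    return {label: " ".join(parts) for label, parts in fields.items()}


# Hypothetical record, invented for illustration only.
example_words = ["Faktura", "2023001", "Celkem", "1210,00"]
example_tags = [3, 11, 3, 14]
print(group_fields(example_words, example_tags))
```

Grouping by label ignores the `bboxes` geometry, so words that share a label but belong to separate regions of the page would be merged; a real pipeline may need to split on position as well.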
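Dataset splits
- Training set:
  - Number of examples: 2523
  - Data size: 20705074 bytes
- Validation set:
  - Number of examples: 280
  - Data size: 2370943 bytes
Dataset size
- Download size: 7037725 bytes
- Total dataset size: 23076017 bytes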
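As a quick consistency check on the figures above, the short sketch below confirms that the two split sizes sum to the stated total dataset size, and derives the validation share and the ratio of download size to dataset size. The variable names are illustrative; all numbers come from the card.

```python
# Figures copied from the dataset card.
train_bytes, val_bytes = 20705074, 2370943
train_examples, val_examples = 2523, 280
download_bytes, dataset_bytes = 7037725, 23076017

# The per-split byte counts should add up to the reported dataset size.
assert train_bytes + val_bytes == dataset_bytes

total_examples = train_examples + val_examples
val_fraction = val_examples / total_examples
download_ratio = download_bytes / dataset_bytes

print(f"total examples: {total_examples}")
print(f"validation share: {val_fraction:.1%}")
print(f"download size / dataset size: {download_ratio:.1%}")
```

The validation split holds roughly one tenth of the examples, and the download is a little under a third of the dataset size.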



