Zombely/wikisource-green
Source: Hugging Face. Updated 2023-03-18; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Zombely/wikisource-green
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: ground_truth
    dtype: string
  splits:
  - name: train_1
    num_bytes: 15342818708.456
    num_examples: 9816
  - name: train_2
    num_bytes: 13234327199.457
    num_examples: 9997
  - name: train_3
    num_bytes: 8814747830.88
    num_examples: 9935
  - name: train_4
    num_bytes: 10839226390.145
    num_examples: 9995
  - name: train_5
    num_bytes: 12414635965.0
    num_examples: 10000
  - name: train_6
    num_bytes: 5911580759.0
    num_examples: 10000
  - name: train_7
    num_bytes: 11420080854.0
    num_examples: 10000
  - name: train_8
    num_bytes: 18080629271.0
    num_examples: 10000
  - name: train_9
    num_bytes: 11348011360.0
    num_examples: 10000
  - name: train_10
    num_bytes: 14141957301.0
    num_examples: 10000
  - name: train_11
    num_bytes: 9983910604.0
    num_examples: 10000
  - name: train_12
    num_bytes: 13105253749.0
    num_examples: 10000
  - name: train_13
    num_bytes: 15681320595.0
    num_examples: 10000
  - name: train_14
    num_bytes: 14896725472.0
    num_examples: 10000
  - name: train_15
    num_bytes: 11493364396.927
    num_examples: 9987
  - name: validation
    num_bytes: 4487934740.612
    num_examples: 4077
  download_size: 5330245163
  dataset_size: 191196525196.477
---
# Dataset Card for "wikisource-green"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
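
The card itself gives no usage instructions. As a minimal sketch, assuming the repository is still reachable on the Hugging Face Hub under `Zombely/wikisource-green` and that the third-party `datasets` library is installed, a single split could be streamed rather than downloaded in full. The helper name `iter_examples` and its `limit` parameter are hypothetical and exist only for illustration.

```python
from typing import Iterator

# Repository id and split names as listed in the card's metadata.
REPO_ID = "Zombely/wikisource-green"
TRAIN_SPLITS = [f"train_{i}" for i in range(1, 16)]  # train_1 .. train_15


def iter_examples(split: str = "validation", limit: int = 3) -> Iterator[dict]:
    """Yield up to `limit` examples from one split without a full download.

    Hypothetical helper: assumes network access and the `datasets` package.
    """
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches data on demand.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    for index, example in enumerate(stream):
        if index >= limit:
            break
        # Each example is expected to carry the two declared features:
        # "image" and "ground_truth" (a string).
        yield example
```

Calling `next(iter_examples("train_1", limit=1))` should then return a dictionary with the `image` and `ground_truth` keys declared in the metadata. Streaming is used here because the declared split sizes run to several gigabytes each.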
Provided by:
Zombely
Summary of original information
Dataset overview
Dataset features
- image: image data type
- ground_truth: string data type
Dataset splits
- train_1: 9816 examples, 15342818708.456 bytes
- train_2: 9997 examples, 13234327199.457 bytes
- train_3: 9935 examples, 8814747830.88 bytes
- train_4: 9995 examples, 10839226390.145 bytes
- train_5: 10000 examples, 12414635965.0 bytes
- train_6: 10000 examples, 5911580759.0 bytes
- train_7: 10000 examples, 11420080854.0 bytes
- train_8: 10000 examples, 18080629271.0 bytes
- train_9: 10000 examples, 11348011360.0 bytes
- train_10: 10000 examples, 14141957301.0 bytes
- train_11: 10000 examples, 9983910604.0 bytes
- train_12: 10000 examples, 13105253749.0 bytes
- train_13: 10000 examples, 15681320595.0 bytes
- train_14: 10000 examples, 14896725472.0 bytes
- train_15: 9987 examples, 11493364396.927 bytes
- validation: 4077 examples, 4487934740.612 bytes
Dataset size
- Download size: 5330245163 bytes
- Total dataset size: 191196525196.477 bytes
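
The per-split figures above can be cross-checked against the reported total. The sketch below is illustrative only: it copies the numbers from the listing into a Python dictionary (the names `SPLITS`, `total_bytes`, and `total_examples` are assumptions, not part of the dataset) and uses `Decimal` so that the fractional byte counts add up exactly.

```python
from decimal import Decimal

# (num_bytes, num_examples) per split, copied from the listing above.
SPLITS = {
    "train_1": ("15342818708.456", 9816),
    "train_2": ("13234327199.457", 9997),
    "train_3": ("8814747830.88", 9935),
    "train_4": ("10839226390.145", 9995),
    "train_5": ("12414635965.0", 10000),
    "train_6": ("5911580759.0", 10000),
    "train_7": ("11420080854.0", 10000),
    "train_8": ("18080629271.0", 10000),
    "train_9": ("11348011360.0", 10000),
    "train_10": ("14141957301.0", 10000),
    "train_11": ("9983910604.0", 10000),
    "train_12": ("13105253749.0", 10000),
    "train_13": ("15681320595.0", 10000),
    "train_14": ("14896725472.0", 10000),
    "train_15": ("11493364396.927", 9987),
    "validation": ("4487934740.612", 4077),
}

# Reported total from the metadata (dataset_size).
REPORTED_DATASET_SIZE = Decimal("191196525196.477")

# Sum byte counts exactly; Decimal avoids binary floating-point rounding.
total_bytes = sum(Decimal(num_bytes) for num_bytes, _ in SPLITS.values())

# Count examples across all splits, and across training splits only.
total_examples = sum(count for _, count in SPLITS.values())
train_examples = total_examples - SPLITS["validation"][1]

print(f"total bytes:    {total_bytes}")
print(f"matches report: {total_bytes == REPORTED_DATASET_SIZE}")
print(f"train examples: {train_examples}")
print(f"all examples:   {total_examples}")
```

The split sizes sum to the reported dataset size, and the 16 splits hold 153807 examples in total, 149730 of them in the training splits.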



