trina731/rendered-wikipedia-zh
Hugging Face. Updated 2023-12-26. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/trina731/rendered-wikipedia-zh
Resource overview:
---
dataset_info:
  features:
  - name: pixel_values
    dtype: image
  - name: num_patches
    dtype: int64
  splits:
  - name: train_0
    num_bytes: 727967904.0
    num_examples: 20000
  - name: train_1
    num_bytes: 726013967.0
    num_examples: 20000
  - name: train_2
    num_bytes: 727841468.0
    num_examples: 20000
  - name: train_3
    num_bytes: 723415340.0
    num_examples: 20000
  - name: train_4
    num_bytes: 720335007.0
    num_examples: 20000
  - name: train_5
    num_bytes: 719038885.0
    num_examples: 20000
  - name: train_6
    num_bytes: 723462396.0
    num_examples: 20000
  - name: train_7
    num_bytes: 716091354.0
    num_examples: 20000
  - name: train_8
    num_bytes: 707670123.0
    num_examples: 20000
  - name: train_9
    num_bytes: 716383829.0
    num_examples: 20000
  - name: train_10
    num_bytes: 717800196.0
    num_examples: 20000
  - name: train_11
    num_bytes: 711828635.0
    num_examples: 20000
  - name: train_12
    num_bytes: 706399554.0
    num_examples: 20000
  - name: train_13
    num_bytes: 725188256.0
    num_examples: 20000
  - name: train_14
    num_bytes: 718145370.0
    num_examples: 20000
  - name: train_15
    num_bytes: 712014398.0
    num_examples: 20000
  download_size: 11501396846
  dataset_size: 11499596682.0
configs:
- config_name: default
  data_files:
  - split: train_0
    path: data/train_0-*
  - split: train_1
    path: data/train_1-*
  - split: train_2
    path: data/train_2-*
  - split: train_3
    path: data/train_3-*
  - split: train_4
    path: data/train_4-*
  - split: train_5
    path: data/train_5-*
  - split: train_6
    path: data/train_6-*
  - split: train_7
    path: data/train_7-*
  - split: train_8
    path: data/train_8-*
  - split: train_9
    path: data/train_9-*
  - split: train_10
    path: data/train_10-*
  - split: train_11
    path: data/train_11-*
  - split: train_12
    path: data/train_12-*
  - split: train_13
    path: data/train_13-*
  - split: train_14
    path: data/train_14-*
  - split: train_15
    path: data/train_15-*
---
The dataset has two features: `pixel_values`, an image, and `num_patches`, an int64 giving the number of patches in that image. It is divided into 16 training splits (train_0 to train_15) of 20,000 examples each, 320,000 examples in total. The download size is 11,501,396,846 bytes (about 11.5 GB), and the dataset size is 11,499,596,682 bytes.
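As a minimal sketch of how the splits described above might be loaded, the following uses the Hugging Face `datasets` library. It assumes the repository is still reachable under the ID shown in the title and that the split names match the metadata. The helper names `split_names`, `load_split`, and `load_all` are hypothetical and introduced only for illustration. Streaming is the default here so that one split can be inspected without downloading the whole dataset.

```python
from typing import List

REPO_ID = "trina731/rendered-wikipedia-zh"  # repository ID taken from the page title
NUM_SPLITS = 16  # train_0 .. train_15, per the metadata


def split_names(n: int = NUM_SPLITS) -> List[str]:
    """Return the split names train_0 .. train_{n-1}."""
    return [f"train_{i}" for i in range(n)]


def load_split(name: str, streaming: bool = True):
    """Load a single split (needs `pip install datasets` and network access)."""
    # Imported lazily so that split_names() works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=name, streaming=streaming)


def load_all():
    """Download every split and concatenate them into one Dataset (about 11.5 GB)."""
    from datasets import concatenate_datasets

    return concatenate_datasets(
        [load_split(name, streaming=False) for name in split_names()]
    )
```

Under these assumptions, `next(iter(load_split("train_0")))` should yield a dict with the keys `pixel_values` and `num_patches`.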
Provider:
trina731
Summary of original information
Dataset overview
Data features
- pixel_values: image
- num_patches: int64
Data splits
- train_0: 727967904 bytes, 20000 examples
- train_1: 726013967 bytes, 20000 examples
- train_2: 727841468 bytes, 20000 examples
- train_3: 723415340 bytes, 20000 examples
- train_4: 720335007 bytes, 20000 examples
- train_5: 719038885 bytes, 20000 examples
- train_6: 723462396 bytes, 20000 examples
- train_7: 716091354 bytes, 20000 examples
- train_8: 707670123 bytes, 20000 examples
- train_9: 716383829 bytes, 20000 examples
- train_10: 717800196 bytes, 20000 examples
- train_11: 711828635 bytes, 20000 examples
- train_12: 706399554 bytes, 20000 examples
- train_13: 725188256 bytes, 20000 examples
- train_14: 718145370 bytes, 20000 examples
- train_15: 712014398 bytes, 20000 examples
Dataset size
- Download size: 11501396846 bytes
- Dataset size: 11499596682 bytes
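The size figures listed above can be cross-checked with simple arithmetic. The short script below sums the per-split byte counts and compares the result with the reported dataset size. All numbers are copied from the metadata; the variable names are illustrative only.

```python
# Per-split byte counts for train_0 .. train_15, copied from the metadata.
SPLIT_BYTES = [
    727967904, 726013967, 727841468, 723415340,
    720335007, 719038885, 723462396, 716091354,
    707670123, 716383829, 717800196, 711828635,
    706399554, 725188256, 718145370, 712014398,
]
EXAMPLES_PER_SPLIT = 20000
DATASET_SIZE = 11499596682   # reported dataset_size
DOWNLOAD_SIZE = 11501396846  # reported download_size

total_bytes = sum(SPLIT_BYTES)
total_examples = EXAMPLES_PER_SPLIT * len(SPLIT_BYTES)
avg_bytes_per_example = total_bytes / total_examples

# The per-split sizes add up exactly to the reported dataset size.
assert total_bytes == DATASET_SIZE

print(f"total examples: {total_examples}")
print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"download size minus dataset size: {DOWNLOAD_SIZE - total_bytes} bytes")
```

The average works out to roughly 36 KB per example, and the download is about 1.8 MB larger than the sum of the splits.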
Configuration
- Config name: default
- Data file paths:
  - train_0: data/train_0-*
  - train_1: data/train_1-*
  - train_2: data/train_2-*
  - train_3: data/train_3-*
  - train_4: data/train_4-*
  - train_5: data/train_5-*
  - train_6: data/train_6-*
  - train_7: data/train_7-*
  - train_8: data/train_8-*
  - train_9: data/train_9-*
  - train_10: data/train_10-*
  - train_11: data/train_11-*
  - train_12: data/train_12-*
  - train_13: data/train_13-*
  - train_14: data/train_14-*
  - train_15: data/train_15-*
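To illustrate how the glob patterns above could map files to splits, here is a rough sketch using Python's `fnmatch` module. It only approximates the matching done by the Hugging Face Hub, and the shard file names in `example_files` are hypothetical, since the actual file listing is not given here. Note that the hyphen after the index keeps `data/train_1-*` from also matching files that belong to `train_10`.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Split name -> glob pattern, as listed in the configuration.
PATTERNS: Dict[str, str] = {f"train_{i}": f"data/train_{i}-*" for i in range(16)}


def assign_to_splits(files: List[str]) -> Dict[str, List[str]]:
    """Group file paths by the split whose pattern they match."""
    assigned: Dict[str, List[str]] = {}
    for split, pattern in PATTERNS.items():
        matches = [f for f in files if fnmatchcase(f, pattern)]
        if matches:
            assigned[split] = matches
    return assigned


# Hypothetical shard names, for illustration only.
example_files = [
    "data/train_1-00000-of-00002.parquet",
    "data/train_1-00001-of-00002.parquet",
    "data/train_10-00000-of-00002.parquet",
]
result = assign_to_splits(example_files)
for split, paths in result.items():
    print(split, paths)
```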



