ohsuz/fineweb_10000
Source: Hugging Face | Updated: 2024-06-09 | Indexed: 2024-06-29
Download link:
https://hf-mirror.com/datasets/ohsuz/fineweb_10000
Resource description:
---
dataset_info:
- config_name: range_1000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 7528809
    num_examples: 1000
  download_size: 3782220
  dataset_size: 7528809
- config_name: range_10000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 9656763
    num_examples: 1000
  download_size: 4951550
  dataset_size: 9656763
- config_name: range_2000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8528924
    num_examples: 1000
  download_size: 4283140
  dataset_size: 8528924
- config_name: range_3000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8390947
    num_examples: 1000
  download_size: 4225446
  dataset_size: 8390947
- config_name: range_4000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8621320
    num_examples: 1000
  download_size: 4342329
  dataset_size: 8621320
- config_name: range_5000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8783296
    num_examples: 1000
  download_size: 4481703
  dataset_size: 8783296
- config_name: range_6000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 9167386
    num_examples: 1000
  download_size: 4675599
  dataset_size: 9167386
- config_name: range_7000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8465596
    num_examples: 1000
  download_size: 4340244
  dataset_size: 8465596
- config_name: range_8000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8945078
    num_examples: 1000
  download_size: 4579263
  dataset_size: 8945078
- config_name: range_9000
  features:
  - name: eng
    dtype: string
  - name: ko
    dtype: string
  splits:
  - name: train
    num_bytes: 8757161
    num_examples: 1000
  download_size: 4486561
  dataset_size: 8757161
configs:
- config_name: range_1000
  data_files:
  - split: train
    path: range_1000/train-*
- config_name: range_10000
  data_files:
  - split: train
    path: range_10000/train-*
- config_name: range_2000
  data_files:
  - split: train
    path: range_2000/train-*
- config_name: range_3000
  data_files:
  - split: train
    path: range_3000/train-*
- config_name: range_4000
  data_files:
  - split: train
    path: range_4000/train-*
- config_name: range_5000
  data_files:
  - split: train
    path: range_5000/train-*
- config_name: range_6000
  data_files:
  - split: train
    path: range_6000/train-*
- config_name: range_7000
  data_files:
  - split: train
    path: range_7000/train-*
- config_name: range_8000
  data_files:
  - split: train
    path: range_8000/train-*
- config_name: range_9000
  data_files:
  - split: train
    path: range_9000/train-*
---
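The metadata above declares ten configs, each with a single train split and two string columns (eng and ko). A minimal sketch of loading one config follows, assuming the Hugging Face `datasets` library is installed and the repository is reachable. The helper names `config_names` and `load_range` are hypothetical and only illustrate the idea. Note that the card lists configs in lexicographic order, so range_10000 appears right after range_1000; the helper returns them in numeric order instead.

```python
from typing import List

# Repository id taken from the page title.
REPO_ID = "ohsuz/fineweb_10000"


def config_names() -> List[str]:
    """Return the ten config names in numeric order (range_1000 ... range_10000)."""
    return [f"range_{n}" for n in range(1000, 10001, 1000)]


def load_range(config: str):
    """Load the train split of one config (hypothetical helper).

    Requires the `datasets` package and network access.
    """
    if config not in config_names():
        raise ValueError(f"unknown config: {config!r}")
    # Imported lazily so config_names() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split="train")


# Example usage (needs network access):
#   ds = load_range("range_1000")
#   print(ds.column_names)
```

Passing the config name as the second argument to `load_dataset` selects which of the ten subsets is downloaded, so only that subset's files are fetched.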
Provider:
ohsuz
Summary of original information
Dataset overview
Dataset configurations
range_1000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 7528809
    - Examples: 1000
- Download size: 3782220 bytes
- Dataset size: 7528809 bytes
range_10000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 9656763
    - Examples: 1000
- Download size: 4951550 bytes
- Dataset size: 9656763 bytes
range_2000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8528924
    - Examples: 1000
- Download size: 4283140 bytes
- Dataset size: 8528924 bytes
range_3000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8390947
    - Examples: 1000
- Download size: 4225446 bytes
- Dataset size: 8390947 bytes
range_4000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8621320
    - Examples: 1000
- Download size: 4342329 bytes
- Dataset size: 8621320 bytes
range_5000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8783296
    - Examples: 1000
- Download size: 4481703 bytes
- Dataset size: 8783296 bytes
range_6000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 9167386
    - Examples: 1000
- Download size: 4675599 bytes
- Dataset size: 9167386 bytes
range_7000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8465596
    - Examples: 1000
- Download size: 4340244 bytes
- Dataset size: 8465596 bytes
range_8000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8945078
    - Examples: 1000
- Download size: 4579263 bytes
- Dataset size: 8945078 bytes
range_9000
- Features:
  - eng: string
  - ko: string
- Splits:
  - train:
    - Bytes: 8757161
    - Examples: 1000
- Download size: 4486561 bytes
- Dataset size: 8757161 bytes
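The per-config figures listed above can be aggregated to estimate the footprint of the whole repository. The sketch below copies the sizes from the listing and sums them. The names `SIZES`, `totals`, and `download_ratio` are hypothetical, and the ratio is simply download size divided by dataset size, not an official metric.

```python
from typing import Dict, Tuple

# (dataset_size, download_size) in bytes, copied from the listing above.
SIZES: Dict[str, Tuple[int, int]] = {
    "range_1000": (7528809, 3782220),
    "range_2000": (8528924, 4283140),
    "range_3000": (8390947, 4225446),
    "range_4000": (8621320, 4342329),
    "range_5000": (8783296, 4481703),
    "range_6000": (9167386, 4675599),
    "range_7000": (8465596, 4340244),
    "range_8000": (8945078, 4579263),
    "range_9000": (8757161, 4486561),
    "range_10000": (9656763, 4951550),
}

# Every config lists 1000 examples in its train split.
EXAMPLES_PER_CONFIG = 1000


def totals(sizes: Dict[str, Tuple[int, int]]) -> Tuple[int, int, int]:
    """Return (total examples, total dataset bytes, total download bytes)."""
    total_dataset = sum(dataset for dataset, _ in sizes.values())
    total_download = sum(download for _, download in sizes.values())
    return len(sizes) * EXAMPLES_PER_CONFIG, total_dataset, total_download


def download_ratio(config: str) -> float:
    """Download size as a fraction of dataset size for one config."""
    dataset, download = SIZES[config]
    return download / dataset


examples, dataset_bytes, download_bytes = totals(SIZES)
print(f"{examples} examples, {dataset_bytes / 1e6:.1f} MB in memory, "
      f"{download_bytes / 1e6:.1f} MB to download")
```

Each config's download is roughly half of its dataset size, which is consistent with the stored files being compressed relative to the in-memory representation.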



