g4me/corpus-carolina-b512-domain-train3epval-split
Source: Hugging Face. Updated: 2026-03-07. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/g4me/corpus-carolina-b512-domain-train3epval-split
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train_epoch1
    path: data/train_epoch1-*
  - split: train_epoch2
    path: data/train_epoch2-*
  - split: train_epoch3
    path: data/train_epoch3-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: meta
    dtype: string
  - name: text
    dtype: string
  - name: __target__
    dtype: string
  - name: __domain__
    dtype: string
  - name: __stratum__
    dtype: string
  splits:
  - name: train_epoch1
    num_bytes: 13699460745
    num_examples: 2003549
  - name: train_epoch2
    num_bytes: 7311627147
    num_examples: 1069327
  - name: train_epoch3
    num_bytes: 7045049751
    num_examples: 1030340
  - name: train
    num_bytes: 28056137643
    num_examples: 4103216
  - name: validation
    num_bytes: 714738917
    num_examples: 105450
  download_size: 21922903408
  dataset_size: 56827014203
---
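The split statistics above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the `dataset_info` block and tests whether the three `train_epoch*` splits add up to the `train` split, and whether all five splits add up to `dataset_size`. The names `SPLITS`, `EPOCHS`, and `epoch_total` are illustrative and not part of the card. Matching totals only suggest that `train` is the union of the three epoch splits; the card itself does not say so.

```python
# Per-split statistics copied verbatim from the dataset_info block.
SPLITS = {
    "train_epoch1": {"num_bytes": 13_699_460_745, "num_examples": 2_003_549},
    "train_epoch2": {"num_bytes": 7_311_627_147, "num_examples": 1_069_327},
    "train_epoch3": {"num_bytes": 7_045_049_751, "num_examples": 1_030_340},
    "train": {"num_bytes": 28_056_137_643, "num_examples": 4_103_216},
    "validation": {"num_bytes": 714_738_917, "num_examples": 105_450},
}
DATASET_SIZE = 56_827_014_203  # dataset_size from the card

# Illustrative grouping of the per-epoch splits (an assumption about their role).
EPOCHS = ("train_epoch1", "train_epoch2", "train_epoch3")


def epoch_total(field: str) -> int:
    """Sum one statistic over the three per-epoch splits."""
    return sum(SPLITS[name][field] for name in EPOCHS)


examples_match = epoch_total("num_examples") == SPLITS["train"]["num_examples"]
bytes_match = epoch_total("num_bytes") == SPLITS["train"]["num_bytes"]
size_match = sum(s["num_bytes"] for s in SPLITS.values()) == DATASET_SIZE

print(examples_match, bytes_match, size_match)
```

All three checks hold for the figures listed here. Because `dataset_size` counts `train` and the epoch splits separately, roughly half of the reported size would be duplicated content if `train` is indeed their union.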
Provider:
g4me
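
A minimal sketch of opening one of the splits with the Hugging Face `datasets` library, assuming the library is installed and the repository is publicly readable. The helper `open_split` and the constants `REPO_ID` and `SPLIT_NAMES` are hypothetical names introduced for illustration; the split names come from the `configs` block. Streaming is enabled by default in this sketch because the card reports a download size of about 21.9 GB.

```python
REPO_ID = "g4me/corpus-carolina-b512-domain-train3epval-split"

# Split names as declared under `configs` in the dataset card.
SPLIT_NAMES = ("train_epoch1", "train_epoch2", "train_epoch3", "train", "validation")


def open_split(split: str = "validation", streaming: bool = True):
    """Open one split of the dataset (hypothetical helper, not part of the card).

    With streaming=True, records are fetched lazily instead of
    downloading the whole split first.
    """
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLIT_NAMES}")
    # Imported lazily so the split-name check works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)
```

For example, `next(iter(open_split("validation")))` should return a single record as a dictionary; according to the card's feature list, its keys would be `meta`, `text`, `__target__`, `__domain__`, and `__stratum__`, all strings.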



