keylazy/ark
Source: Hugging Face. Updated 2023-11-16; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/keylazy/ark
Description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: evaluation
    path: data/evaluation-*
  - split: test
    path: data/test-*
  - split: train_full
    path: data/train_full-*
dataset_info:
  features:
  - name: text1
    dtype: string
  - name: text2
    dtype: string
  splits:
  - name: train
    num_bytes: 246977207
    num_examples: 900000
  - name: evaluation
    num_bytes: 27414347
    num_examples: 100000
  - name: test
    num_bytes: 27471369
    num_examples: 100000
  - name: train_full
    num_bytes: 274391554
    num_examples: 1000000
  download_size: 189206059
  dataset_size: 576254477
---
# Dataset Card for "ark"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
keylazy
Summary of original information
Dataset overview
Configuration
- Default configuration (default):
  - Data files:
    - Training set (train): data/train-*
    - Evaluation set (evaluation): data/evaluation-*
    - Test set (test): data/test-*
    - Full training set (train_full): data/train_full-*
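The split names in the configuration map directly onto the `split` argument of `load_dataset` from the Hugging Face `datasets` library. The following is a minimal sketch, assuming the `datasets` package is installed and the repository is reachable on the Hub; the helper name `load_ark_split` is hypothetical and not part of the dataset card.

```python
from typing import Any

# Split names as declared in the card's `configs` section.
SPLITS = ("train", "evaluation", "test", "train_full")


def load_ark_split(split: str = "train", streaming: bool = True) -> Any:
    """Load one split of keylazy/ark (hypothetical helper, not from the card)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the split check works even without the library installed.
    from datasets import load_dataset

    # streaming=True iterates over the remote files instead of downloading
    # the whole archive before the first record is available.
    return load_dataset("keylazy/ark", split=split, streaming=streaming)
```

Calling `load_ark_split("test")` and reading `next(iter(ds))` should yield a record with the two string fields `text1` and `text2`, assuming the hosted files still match the card.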
Dataset information
- Features:
  - text1: string
  - text2: string
- Splits:
  - Training set (train):
    - Bytes: 246977207
    - Examples: 900000
  - Evaluation set (evaluation):
    - Bytes: 27414347
    - Examples: 100000
  - Test set (test):
    - Bytes: 27471369
    - Examples: 100000
  - Full training set (train_full):
    - Bytes: 274391554
    - Examples: 1000000
- Data size:
  - Download size: 189206059 bytes
  - Dataset size: 576254477 bytes
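The figures above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the card and makes no network calls; the variable and function names are illustrative. It shows that train_full matches train plus evaluation in both bytes and examples, which is consistent with train_full being their union, although the card does not say so.

```python
# Split statistics copied from the card's dataset_info section.
SPLIT_STATS = {
    "train": {"num_bytes": 246_977_207, "num_examples": 900_000},
    "evaluation": {"num_bytes": 27_414_347, "num_examples": 100_000},
    "test": {"num_bytes": 27_471_369, "num_examples": 100_000},
    "train_full": {"num_bytes": 274_391_554, "num_examples": 1_000_000},
}
DATASET_SIZE = 576_254_477   # dataset_size from the card, in bytes
DOWNLOAD_SIZE = 189_206_059  # download_size from the card, in bytes


def total(field: str, names=None) -> int:
    """Sum one statistic over the given splits (all splits by default)."""
    names = SPLIT_STATS.keys() if names is None else names
    return sum(SPLIT_STATS[name][field] for name in names)


# dataset_size is the sum of num_bytes over all four splits.
sizes_agree = total("num_bytes") == DATASET_SIZE

# train_full matches train + evaluation in both bytes and examples.
full_matches_union = all(
    total(field, ["train", "evaluation"]) == SPLIT_STATS["train_full"][field]
    for field in ("num_bytes", "num_examples")
)

# Ratio of the compressed download to the in-memory dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(sizes_agree, full_matches_union, round(download_ratio, 2))
```

Because dataset_size counts train_full alongside train and evaluation, those 1,000,000 examples appear to be counted twice in the total; the download is roughly a third of that figure.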



