coastalcph/wiqueen_aug
Hugging Face, updated 2024-05-07, indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/coastalcph/wiqueen_aug
Resource summary:
---
dataset_info:
  features:
  - name: q_1_type
    dtype: string
  - name: q_1_id
    dtype: string
  - name: p_type
    dtype: string
  - name: p_id
    dtype: string
  - name: q_2_type
    dtype: string
  - name: q_2_id
    dtype: string
  - name: q_1_source
    dtype: string
  - name: q_1_source_id
    dtype: string
  - name: q_1_target
    dtype: string
  - name: q_1_target_id
    dtype: string
  - name: q_2_source
    dtype: string
  - name: q_2_source_id
    dtype: string
  - name: q_2_target
    dtype: string
  - name: q_2_target_id
    dtype: string
  - name: lang
    dtype: string
  - name: id
    dtype: string
  splits:
  - name: train
    num_bytes: 35750900
    num_examples: 155013
  download_size: 16806791
  dataset_size: 35750900
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset has 16 features, all stored as strings: type and ID fields for q_1, p, and q_2; source and target fields, each with an accompanying ID, for q_1 and q_2; a language field (lang); and a record ID (id). It provides a single train split of 155013 examples. The download size is 16806791 bytes, and the dataset size is 35750900 bytes.
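A minimal sketch of loading the train split and checking its columns against the dataset_info block above. It assumes the Hugging Face `datasets` package is installed and that the dataset is reachable under the ID `coastalcph/wiqueen_aug`. The helper names `missing_columns` and `load_train` are illustrative and not part of the dataset.

```python
# Column names copied from the dataset_info block; every feature is declared as a string.
EXPECTED_COLUMNS = [
    "q_1_type", "q_1_id", "p_type", "p_id", "q_2_type", "q_2_id",
    "q_1_source", "q_1_source_id", "q_1_target", "q_1_target_id",
    "q_2_source", "q_2_source_id", "q_2_target", "q_2_target_id",
    "lang", "id",
]


def missing_columns(column_names):
    """Return the expected columns that are absent from column_names (hypothetical helper)."""
    present = set(column_names)
    return [name for name in EXPECTED_COLUMNS if name not in present]


def load_train():
    """Load the train split; requires network access and the `datasets` package."""
    # Imported lazily so that missing_columns can be used without the dependency.
    from datasets import load_dataset

    return load_dataset("coastalcph/wiqueen_aug", split="train")


# Usage (not executed here because it downloads data):
#   train = load_train()
#   print(train.num_rows)                       # the card lists 155013 examples
#   print(missing_columns(train.column_names))  # an empty list means the schema matches
```

Keeping the column check separate from the download makes it possible to validate any list of column names, for example one read from a local copy of the data.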
Provider:
coastalcph
Original information summary
Dataset overview
Dataset features
- q_1_type: string
- q_1_id: string
- p_type: string
- p_id: string
- q_2_type: string
- q_2_id: string
- q_1_source: string
- q_1_source_id: string
- q_1_target: string
- q_1_target_id: string
- q_2_source: string
- q_2_source_id: string
- q_2_target: string
- q_2_target_id: string
- lang: string
- id: string
Dataset splits
- Training set (train):
  - Size: 35750900 bytes
  - Number of examples: 155013
Dataset size
- Download size: 16806791 bytes
- Total dataset size: 35750900 bytes
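
The size figures can be put in proportion with a little arithmetic. The sketch below uses only the three numbers listed on this page; the variable names are illustrative.

```python
# Figures copied from the dataset card.
NUM_EXAMPLES = 155013
DATASET_SIZE_BYTES = 35750900
DOWNLOAD_SIZE_BYTES = 16806791

# Average size of one example in the train split.
bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"average bytes per example: {bytes_per_example:.1f}")
print(f"download size / dataset size: {download_ratio:.2f}")
```

This works out to roughly 231 bytes per example, with the download being about 47% of the dataset size.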



