tyzhu/squad_title_v4_train_30_eval_10_deduped

Name: tyzhu/squad_title_v4_train_30_eval_10_deduped
Creator: tyzhu
Published: 2023-10-17 06:06:35
License: No description available

Source: Hugging Face. Updated 2023-10-17. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/tyzhu/squad_title_v4_train_30_eval_10_deduped

Resource overview:

---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: title
    dtype: string
  - name: context
    dtype: string
  - name: question
    dtype: string
  - name: answers
    sequence:
    - name: text
      dtype: string
    - name: answer_start
      dtype: int32
  - name: context_id
    dtype: string
  - name: inputs
    dtype: string
  - name: targets
    dtype: string
  splits:
  - name: train
    num_bytes: 300178.52173913043
    num_examples: 199
  - name: validation
    num_bytes: 50807
    num_examples: 50
  download_size: 98978
  dataset_size: 350985.52173913043
---
# Dataset Card for "squad_title_v4_train_30_eval_10_deduped"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

The dataset has eight features: id, title, context, question, answers, context_id, inputs, and targets. All are strings except answers, which is a sequence with two sub-features, text (string) and answer_start (int32). The dataset is split into a training set of 199 examples and a validation set of 50 examples. The download size is 98978 bytes, and the total dataset size is 350985.52173913043 bytes.

Provided by:

tyzhu

Original information summary

Dataset overview

Dataset name

Name: squad_title_v4_train_30_eval_10_deduped

Dataset features

Feature list:
- id: string
- title: string
- context: string
- question: string
- answers: sequence with the following sub-features:
  - text: string
  - answer_start: 32-bit integer
- context_id: string
- inputs: string
- targets: string
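
The feature list can be turned into a small record check. The following is a minimal sketch using only the Python standard library, under two assumptions that the listing does not confirm: that `answers` is materialized as a dict of parallel lists (the way the Hugging Face `datasets` library exposes a sequence of named sub-features), and that `answer_start` is a character offset into `context`, as in the original SQuAD. The function name `validate_record` and the sample record are hypothetical and made up for illustration.

```python
from typing import Any

# Top-level features that the feature list declares as strings.
STRING_FEATURES = (
    "id", "title", "context", "question", "context_id", "inputs", "targets",
)


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the record conforms."""
    problems = []
    for name in STRING_FEATURES:
        if not isinstance(record.get(name), str):
            problems.append(f"{name}: expected a string")

    answers = record.get("answers")
    if not isinstance(answers, dict):
        return problems + ["answers: expected a dict with 'text' and 'answer_start'"]

    texts = answers.get("text", [])
    starts = answers.get("answer_start", [])
    if len(texts) != len(starts):
        problems.append("answers: 'text' and 'answer_start' differ in length")

    context = record.get("context")
    for text, start in zip(texts, starts):
        if not isinstance(text, str) or not isinstance(start, int):
            problems.append("answers: wrong element type")
        elif not -2**31 <= start < 2**31:
            problems.append("answers: answer_start does not fit in int32")
        # Assumption: answer_start is a character offset into context.
        elif isinstance(context, str) and context[start:start + len(text)] != text:
            problems.append(f"answers: {text!r} not found at offset {start}")
    return problems


# Hypothetical record, invented for illustration (not taken from the dataset).
example = {
    "id": "example-0",
    "title": "Example_Title",
    "context": "The quick brown fox jumps over the lazy dog.",
    "question": "What does the fox jump over?",
    "answers": {"text": ["the lazy dog"], "answer_start": [31]},
    "context_id": "ctx-0",
    "inputs": "What does the fox jump over?",
    "targets": "the lazy dog",
}

print(validate_record(example))
```

If the offset assumption does not hold for this derived dataset, the last check can simply be dropped; the type checks follow directly from the feature list.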

Dataset splits

Training set:
- Name: train
- Size in bytes: 300178.52173913043
- Number of examples: 199
Validation set:
- Name: validation
- Size in bytes: 50807
- Number of examples: 50

Dataset size

Download size: 98978 bytes
Dataset size: 350985.52173913043 bytes
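
The reported sizes can be cross-checked with a little arithmetic. The sketch below, using only the values listed on this page, verifies that the two split sizes add up to the dataset size and derives the average bytes per example and the ratio of download size to dataset size. The helper name `splits_match_total` is hypothetical, and the comparison uses a floating-point tolerance because the train size is reported as a fractional number.

```python
import math

# Values copied from the listing above.
split_bytes = {"train": 300178.52173913043, "validation": 50807}
split_examples = {"train": 199, "validation": 50}
dataset_size = 350985.52173913043
download_size = 98978


def splits_match_total(sizes, total):
    """Check that the per-split byte counts add up to the reported total."""
    return math.isclose(sum(sizes.values()), total)


# Average size of one example in each split.
bytes_per_example = {
    name: split_bytes[name] / split_examples[name] for name in split_bytes
}

# Fraction of the dataset size that the download occupies.
download_ratio = download_size / dataset_size

print("splits sum to dataset size:", splits_match_total(split_bytes, dataset_size))
for name, value in bytes_per_example.items():
    print(f"{name}: {value:.2f} bytes per example")
print(f"download size / dataset size: {download_ratio:.3f}")
```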
