tyzhu/lmind_hotpot_train5000_eval5000_v1_doc
Source: Hugging Face | Updated: 2024-02-03 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/tyzhu/lmind_hotpot_train5000_eval5000_v1_doc
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train_qa
    path: data/train_qa-*
  - split: train_recite_qa
    path: data/train_recite_qa-*
  - split: eval_qa
    path: data/eval_qa-*
  - split: eval_recite_qa
    path: data/eval_recite_qa-*
  - split: all_docs
    path: data/all_docs-*
  - split: all_docs_eval
    path: data/all_docs_eval-*
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: inputs
    dtype: string
  - name: targets
    dtype: string
  - name: answers
    struct:
    - name: answer_start
      sequence: 'null'
    - name: text
      sequence: string
  splits:
  - name: train_qa
    num_bytes: 864508
    num_examples: 5000
  - name: train_recite_qa
    num_bytes: 5350190
    num_examples: 5000
  - name: eval_qa
    num_bytes: 813536
    num_examples: 5000
  - name: eval_recite_qa
    num_bytes: 5394796
    num_examples: 5000
  - name: all_docs
    num_bytes: 8524332
    num_examples: 18224
  - name: all_docs_eval
    num_bytes: 8523131
    num_examples: 18224
  - name: train
    num_bytes: 8524332
    num_examples: 18224
  - name: validation
    num_bytes: 8524332
    num_examples: 18224
  download_size: 28418740
  dataset_size: 46519157
---
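The `features` entry in the metadata above describes the shape of each example. As a rough illustration, the check below tests a plain Python dict against that shape. The function name `matches_schema` and the sample record are hypothetical, and it assumes that a `sequence: 'null'` field holds only `None` entries (or is empty).

```python
def matches_schema(example: dict) -> bool:
    """Return True if `example` has the shape listed under `features`."""
    # `inputs` and `targets` are plain strings.
    if not isinstance(example.get("inputs"), str):
        return False
    if not isinstance(example.get("targets"), str):
        return False
    # `answers` is a struct with two sequence fields.
    answers = example.get("answers")
    if not isinstance(answers, dict):
        return False
    starts = answers.get("answer_start")
    texts = answers.get("text")
    if not isinstance(starts, list) or not isinstance(texts, list):
        return False
    # Assumption: a null-typed sequence contains only None values.
    return all(s is None for s in starts) and all(isinstance(t, str) for t in texts)


# Hypothetical record, not taken from the dataset.
sample = {
    "inputs": "a hypothetical question",
    "targets": "a hypothetical answer",
    "answers": {"answer_start": [None], "text": ["a hypothetical answer"]},
}
print(matches_schema(sample))
```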
# Dataset Card for "lmind_hotpot_train5000_eval5000_v1_doc"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The dataset has a single `default` configuration with eight splits: `train_qa`, `train_recite_qa`, `eval_qa`, `eval_recite_qa`, `all_docs`, `all_docs_eval`, `train`, and `validation`, each mapped to its own file pattern under `data/`. Every example has three features: `inputs` (string), `targets` (string), and `answers`, a struct holding `answer_start` (a sequence typed as null) and `text` (a sequence of strings). The four QA splits contain 5,000 examples each, and the other four splits contain 18,224 examples each. The download size is 28,418,740 bytes and the dataset size is 46,519,157 bytes.
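A minimal sketch of loading one split by name, assuming the Hugging Face `datasets` library is installed and the repository is reachable from your network. The helper `load_split` is a hypothetical convenience wrapper, not part of the dataset or the library. The repository id comes from the page title and the split names from the metadata.

```python
REPO_ID = "tyzhu/lmind_hotpot_train5000_eval5000_v1_doc"

# Split names as declared in the dataset card metadata.
SPLITS = (
    "train_qa",
    "train_recite_qa",
    "eval_qa",
    "eval_recite_qa",
    "all_docs",
    "all_docs_eval",
    "train",
    "validation",
)


def load_split(split: str = "train_qa"):
    """Load one split from the Hub (requires network access)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the name check above works without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Example usage (downloads data, so it is left commented out):
# ds = load_split("eval_qa")
# print(ds[0]["inputs"], ds[0]["targets"])
```

Checking the split name before the download means a typo fails immediately, without a network round trip.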
Provider:
tyzhu
Summary of original information
Dataset overview
Dataset configuration
- Default configuration:
  - Data file paths:
    - train_qa: data/train_qa-*
    - train_recite_qa: data/train_recite_qa-*
    - eval_qa: data/eval_qa-*
    - eval_recite_qa: data/eval_recite_qa-*
    - all_docs: data/all_docs-*
    - all_docs_eval: data/all_docs_eval-*
    - train: data/train-*
    - validation: data/validation-*
Dataset information
- Features:
  - inputs: string
  - targets: string
  - answers: struct with the following fields:
    - answer_start: sequence of null
    - text: sequence of string
- Splits:
  - train_qa:
    - Bytes: 864508
    - Examples: 5000
  - train_recite_qa:
    - Bytes: 5350190
    - Examples: 5000
  - eval_qa:
    - Bytes: 813536
    - Examples: 5000
  - eval_recite_qa:
    - Bytes: 5394796
    - Examples: 5000
  - all_docs:
    - Bytes: 8524332
    - Examples: 18224
  - all_docs_eval:
    - Bytes: 8523131
    - Examples: 18224
  - train:
    - Bytes: 8524332
    - Examples: 18224
  - validation:
    - Bytes: 8524332
    - Examples: 18224
- Dataset size:
  - Download size: 28418740 bytes
  - Dataset size: 46519157 bytes
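As a quick consistency check on the figures listed above, the per-split byte counts can be summed and compared with the reported dataset size. The variable names are illustrative and the numbers are copied from the metadata.

```python
# Per-split byte counts as listed in the dataset metadata.
SPLIT_BYTES = {
    "train_qa": 864508,
    "train_recite_qa": 5350190,
    "eval_qa": 813536,
    "eval_recite_qa": 5394796,
    "all_docs": 8524332,
    "all_docs_eval": 8523131,
    "train": 8524332,
    "validation": 8524332,
}
DATASET_SIZE = 46519157   # reported dataset_size, in bytes
DOWNLOAD_SIZE = 28418740  # reported download_size, in bytes

total = sum(SPLIT_BYTES.values())
print(total == DATASET_SIZE)  # True: the split sizes add up to dataset_size

# Download size as a fraction of the dataset size.
ratio = DOWNLOAD_SIZE / DATASET_SIZE
print(f"{ratio:.2f}")
```

The split sizes add up exactly to the reported dataset size, and the download is roughly 61% of that figure.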



