xPXXX/hotpot_qa_sample100
Source: Hugging Face | Updated: 2024-04-18 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/xPXXX/hotpot_qa_sample100
Resource description:
---
dataset_info:
- config_name: default
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  splits:
  - name: train
    num_bytes: 611351.1725098677
    num_examples: 100
  download_size: 365378
  dataset_size: 611351.1725098677
- config_name: finetune_llama2_no_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: finetune_llama2_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 609709
    num_examples: 100
  download_size: 377161
  dataset_size: 609709
- config_name: finetune_llama2_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: finetune_llama2_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 803956
    num_examples: 100
  download_size: 487083
  dataset_size: 803956
- config_name: gpt3_no_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: gpt3_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 610246
    num_examples: 100
  download_size: 377921
  dataset_size: 610246
- config_name: gpt3_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: gpt3_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 810707
    num_examples: 100
  download_size: 492271
  dataset_size: 810707
- config_name: gpt4_no_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: gpt4_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 645859
    num_examples: 100
  download_size: 402060
  dataset_size: 645859
- config_name: gpt4_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: gpt4_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 818975
    num_examples: 100
  download_size: 497401
  dataset_size: 818975
- config_name: llama2_no_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: llama2_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 620418
    num_examples: 100
  download_size: 383496
  dataset_size: 620418
- config_name: llama2_rag
  features:
  - name: id
    dtype: string
  - name: question
    dtype: string
  - name: answer
    dtype: string
  - name: type
    dtype: string
  - name: level
    dtype: string
  - name: supporting_facts
    sequence:
    - name: title
      dtype: string
    - name: sent_id
      dtype: int32
  - name: context
    sequence:
    - name: title
      dtype: string
    - name: sentences
      sequence: string
  - name: llama2_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 816847
    num_examples: 100
  download_size: 494298
  dataset_size: 816847
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
- config_name: finetune_llama2_no_rag
  data_files:
  - split: train
    path: finetune_llama2_no_rag/train-*
- config_name: finetune_llama2_rag
  data_files:
  - split: train
    path: finetune_llama2_rag/train-*
- config_name: gpt3_no_rag
  data_files:
  - split: train
    path: gpt3_no_rag/train-*
- config_name: gpt3_rag
  data_files:
  - split: train
    path: gpt3_rag/train-*
- config_name: gpt4_no_rag
  data_files:
  - split: train
    path: gpt4_no_rag/train-*
- config_name: gpt4_rag
  data_files:
  - split: train
    path: gpt4_rag/train-*
- config_name: llama2_no_rag
  data_files:
  - split: train
    path: llama2_no_rag/train-*
- config_name: llama2_rag
  data_files:
  - split: train
    path: llama2_rag/train-*
---
Provider:
xPXXX
Summary of original information
Dataset overview
Dataset configurations
| Configuration | Description |
|---|---|
| default | Base features: id, question, answer, type, level, supporting_facts, context |
| finetune_llama2_no_rag | default plus finetune_llama2_no_rag_response |
| finetune_llama2_rag | default plus finetune_llama2_rag_response and retrieval |
| gpt3_no_rag | default plus gpt3_no_rag_response |
| gpt3_rag | default plus gpt3_rag_response and retrieval |
| gpt4_no_rag | default plus gpt4_no_rag_response |
| gpt4_rag | default plus gpt4_rag_response and retrieval |
| llama2_no_rag | default plus llama2_no_rag_response |
| llama2_rag | default plus llama2_rag_response and retrieval |
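The naming pattern in the configuration table can be written down as a small helper. This is a minimal sketch, not part of the dataset itself: `expected_columns` and `load_config` are hypothetical names introduced here, and `load_config` assumes the `datasets` package is installed and that the repository is reachable on the Hugging Face Hub under the ID shown at the top of this page.

```python
from typing import List

# Columns shared by every configuration, in the order the metadata lists them.
BASE_COLUMNS = ["id", "question", "answer", "type", "level",
                "supporting_facts", "context"]

CONFIG_NAMES = [
    "default",
    "finetune_llama2_no_rag", "finetune_llama2_rag",
    "gpt3_no_rag", "gpt3_rag",
    "gpt4_no_rag", "gpt4_rag",
    "llama2_no_rag", "llama2_rag",
]


def expected_columns(config_name: str) -> List[str]:
    """Return the columns the table lists for one configuration (hypothetical helper)."""
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown configuration: {config_name!r}")
    if config_name == "default":
        return list(BASE_COLUMNS)
    # Every non-default configuration adds a "<config>_response" column.
    columns = BASE_COLUMNS + [f"{config_name}_response"]
    # Configurations that use retrieval (no "_no_rag" suffix) also add "retrieval".
    if not config_name.endswith("_no_rag"):
        columns.append("retrieval")
    return columns


def load_config(config_name: str):
    """Load one configuration from the Hub (needs network access; hypothetical helper)."""
    from datasets import load_dataset  # lazy import so the helper above works offline
    return load_dataset("xPXXX/hotpot_qa_sample100", config_name, split="train")
```

For example, `expected_columns("gpt4_rag")` returns the seven base columns followed by `gpt4_rag_response` and `retrieval`; comparing that list with `load_config("gpt4_rag").column_names` is one way to check a download against the table.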
Dataset features
- id: string
- question: string
- answer: string
- type: string
- level: string
- supporting_facts: sequence
  - title: string
  - sent_id: int32
- context: sequence
  - title: string
  - sentences: sequence of string
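The two nested features relate to each other: each `supporting_facts` entry names a `context` title and a sentence index within it. The sketch below resolves those pointers to sentence text. It assumes the usual `datasets` behavior in which a sequence of named sub-features is returned as a dict of parallel lists; `supporting_sentences` is a hypothetical helper, and the record is made up for illustration, not taken from the dataset.

```python
from typing import Any, Dict, List


def supporting_sentences(example: Dict[str, Any]) -> List[str]:
    """Resolve (title, sent_id) pairs in supporting_facts to sentences from context."""
    context = example["context"]
    # Map each context title to its list of sentences.
    sentences_by_title = dict(zip(context["title"], context["sentences"]))

    facts = example["supporting_facts"]
    resolved = []
    for title, sent_id in zip(facts["title"], facts["sent_id"]):
        sentences = sentences_by_title.get(title, [])
        # Skip pointers that fall outside the available sentences.
        if 0 <= sent_id < len(sentences):
            resolved.append(sentences[sent_id])
    return resolved


# Made-up record in the assumed dict-of-lists layout (not real dataset content).
example = {
    "supporting_facts": {"title": ["Alpha", "Beta"], "sent_id": [1, 0]},
    "context": {
        "title": ["Alpha", "Beta"],
        "sentences": [
            ["Alpha is a town.", "It lies in Gamma."],
            ["Beta is a river."],
        ],
    },
}

print(supporting_sentences(example))
```

If the loaded records turn out to use a list-of-dicts layout instead, the two `zip` calls would need to iterate over the entries directly.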
Dataset size
| Configuration | Train split size (bytes) | Download size (bytes) |
|---|---|---|
| default | 611351.1725098677 | 365378 |
| finetune_llama2_no_rag | 609709 | 377161 |
| finetune_llama2_rag | 803956 | 487083 |
| gpt3_no_rag | 610246 | 377921 |
| gpt3_rag | 810707 | 492271 |
| gpt4_no_rag | 645859 | 402060 |
| gpt4_rag | 818975 | 497401 |
| llama2_no_rag | 620418 | 383496 |
| llama2_rag | 816847 | 494298 |
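The figures in the size table support a little arithmetic, such as the average record size and how much larger each retrieval configuration is than its counterpart without retrieval. The sketch below only reuses the numbers from the table; the helper names are hypothetical. The difference it computes covers the `retrieval` column plus any difference in response length, so it should not be read as the size of the retrieved passages alone.

```python
# (train split size in bytes, download size in bytes), copied from the table.
SIZES = {
    "default": (611351.1725098677, 365378),
    "finetune_llama2_no_rag": (609709, 377161),
    "finetune_llama2_rag": (803956, 487083),
    "gpt3_no_rag": (610246, 377921),
    "gpt3_rag": (810707, 492271),
    "gpt4_no_rag": (645859, 402060),
    "gpt4_rag": (818975, 497401),
    "llama2_no_rag": (620418, 383496),
    "llama2_rag": (816847, 494298),
}
NUM_EXAMPLES = 100  # every configuration has 100 training examples


def bytes_per_example(config_name: str) -> float:
    """Average in-memory size of one record in the given configuration."""
    return SIZES[config_name][0] / NUM_EXAMPLES


def rag_overhead(model: str) -> float:
    """Extra bytes in '<model>_rag' compared with '<model>_no_rag'."""
    return SIZES[f"{model}_rag"][0] - SIZES[f"{model}_no_rag"][0]


for model in ("finetune_llama2", "gpt3", "gpt4", "llama2"):
    print(model, rag_overhead(model))
```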
Dataset splits
- train: the train split of every configuration contains 100 examples.



