xPXXX/tevatron_wikipedia-nq_sample100
Source: Hugging Face. Updated: 2024-04-17. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/xPXXX/tevatron_wikipedia-nq_sample100
Description:
---
dataset_info:
- config_name: default
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  splits:
  - name: train
    num_bytes: 6516114.769199276
    num_examples: 100
  download_size: 3750358
  dataset_size: 6516114.769199276
- config_name: finetune_llama2_no_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: finetune_llama2_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 6539762
    num_examples: 100
  download_size: 3774951
  dataset_size: 6539762
- config_name: finetune_llama2_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: finetune_llama2_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 6781481
    num_examples: 100
  download_size: 3872487
  dataset_size: 6781481
- config_name: gpt3_no_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: gpt3_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 6530548
    num_examples: 100
  download_size: 3769177
  dataset_size: 6530548
- config_name: gpt3_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: gpt3_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 6794643
    num_examples: 100
  download_size: 3882537
  dataset_size: 6794643
- config_name: gpt4_no_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: gpt4_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 6565730
    num_examples: 100
  download_size: 3789108
  dataset_size: 6565730
- config_name: gpt4_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: gpt4_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 6801342
    num_examples: 100
  download_size: 3885780
  dataset_size: 6801342
- config_name: llama2_no_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: llama2_no_rag_response
    dtype: string
  splits:
  - name: train
    num_bytes: 6550530
    num_examples: 100
  download_size: 3781797
  dataset_size: 6550530
- config_name: llama2_rag
  features:
  - name: query_id
    dtype: string
  - name: query
    dtype: string
  - name: answers
    list: string
  - name: positive_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: negative_passages
    list:
    - name: docid
      dtype: string
    - name: text
      dtype: string
    - name: title
      dtype: string
  - name: llama2_rag_response
    dtype: string
  - name: retrieval
    sequence: string
  splits:
  - name: train
    num_bytes: 6789056
    num_examples: 100
  download_size: 3877897
  dataset_size: 6789056
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
- config_name: finetune_llama2_no_rag
  data_files:
  - split: train
    path: finetune_llama2_no_rag/train-*
- config_name: finetune_llama2_rag
  data_files:
  - split: train
    path: finetune_llama2_rag/train-*
- config_name: gpt3_no_rag
  data_files:
  - split: train
    path: gpt3_no_rag/train-*
- config_name: gpt3_rag
  data_files:
  - split: train
    path: gpt3_rag/train-*
- config_name: gpt4_no_rag
  data_files:
  - split: train
    path: gpt4_no_rag/train-*
- config_name: gpt4_rag
  data_files:
  - split: train
    path: gpt4_rag/train-*
- config_name: llama2_no_rag
  data_files:
  - split: train
    path: llama2_no_rag/train-*
- config_name: llama2_rag
  data_files:
  - split: train
    path: llama2_rag/train-*
---
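The configuration names in the metadata follow a `<model>_<rag|no_rag>` pattern, alongside a `default` configuration that carries no model responses. A minimal sketch of selecting one configuration with the Hugging Face `datasets` library might look like the following. The helper names `config_name`, `response_column`, and `load_config` are hypothetical, and the repository id is taken from the download link, assuming the dataset is still hosted under that name.

```python
# Model prefixes that appear in the config names of this dataset card.
MODELS = ("finetune_llama2", "gpt3", "gpt4", "llama2")

# Repository id as it appears in the download link (assumed still valid).
REPO_ID = "xPXXX/tevatron_wikipedia-nq_sample100"


def config_name(model: str, rag: bool) -> str:
    """Build a config name such as 'gpt4_rag' or 'llama2_no_rag'."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return f"{model}_{'rag' if rag else 'no_rag'}"


def response_column(model: str, rag: bool) -> str:
    """Name of the column holding the model output in that config."""
    return config_name(model, rag) + "_response"


def load_config(model: str, rag: bool):
    """Download the single 'train' split (100 examples) of one config."""
    # Imported lazily so the helpers above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name(model, rag), split="train")
```

Calling `load_config("gpt4", True)` would request the `gpt4_rag` configuration, whose rows should also carry the `retrieval` column listed in the metadata.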
Provider:
xPXXX
Original Information Summary
Dataset Overview
Dataset Configurations
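Default configuration (default)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
- Splits:
  - train:
    - Size in bytes: 6516114.769199276
    - Number of examples: 100
- Download size: 3750358 bytes
- Dataset size: 6516114.769199276 bytes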
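The feature list of the default configuration can be written down as Python types, which makes the nesting of the passage records explicit. This is only an illustrative sketch: the class names `Passage` and `Row` are hypothetical, and the example values are invented placeholders rather than rows taken from the dataset.

```python
from typing import List, TypedDict


class Passage(TypedDict):
    """One entry of positive_passages or negative_passages."""
    docid: str
    text: str
    title: str


class Row(TypedDict):
    """One example of the default configuration."""
    query_id: str
    query: str
    answers: List[str]
    positive_passages: List[Passage]
    negative_passages: List[Passage]


# Placeholder values only; they show the shape of a row, not real data.
example: Row = {
    "query_id": "q0",
    "query": "placeholder question",
    "answers": ["placeholder answer"],
    "positive_passages": [
        {"docid": "p1", "text": "passage containing the answer", "title": "Title A"}
    ],
    "negative_passages": [
        {"docid": "n1", "text": "unrelated passage", "title": "Title B"}
    ],
}
```

The model-specific configurations add one `<config>_response` string column to this shape, and the RAG variants also add a `retrieval` sequence of strings.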
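Fine-tuned Llama2, no RAG (finetune_llama2_no_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - finetune_llama2_no_rag_response: string
- Splits:
  - train:
    - Size in bytes: 6539762
    - Number of examples: 100
- Download size: 3774951 bytes
- Dataset size: 6539762 bytes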
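Fine-tuned Llama2, RAG (finetune_llama2_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - finetune_llama2_rag_response: string
  - retrieval: sequence of strings
- Splits:
  - train:
    - Size in bytes: 6781481
    - Number of examples: 100
- Download size: 3872487 bytes
- Dataset size: 6781481 bytes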
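GPT-3, no RAG (gpt3_no_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - gpt3_no_rag_response: string
- Splits:
  - train:
    - Size in bytes: 6530548
    - Number of examples: 100
- Download size: 3769177 bytes
- Dataset size: 6530548 bytes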
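GPT-3, RAG (gpt3_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - gpt3_rag_response: string
  - retrieval: sequence of strings
- Splits:
  - train:
    - Size in bytes: 6794643
    - Number of examples: 100
- Download size: 3882537 bytes
- Dataset size: 6794643 bytes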
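GPT-4, no RAG (gpt4_no_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - gpt4_no_rag_response: string
- Splits:
  - train:
    - Size in bytes: 6565730
    - Number of examples: 100
- Download size: 3789108 bytes
- Dataset size: 6565730 bytes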
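GPT-4, RAG (gpt4_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - gpt4_rag_response: string
  - retrieval: sequence of strings
- Splits:
  - train:
    - Size in bytes: 6801342
    - Number of examples: 100
- Download size: 3885780 bytes
- Dataset size: 6801342 bytes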
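Llama2, no RAG (llama2_no_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - llama2_no_rag_response: string
- Splits:
  - train:
    - Size in bytes: 6550530
    - Number of examples: 100
- Download size: 3781797 bytes
- Dataset size: 6550530 bytes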
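Llama2, RAG (llama2_rag)
- Features:
  - query_id: string
  - query: string
  - answers: list of strings
  - positive_passages: list of records (docid: string, text: string, title: string)
  - negative_passages: list of records (docid: string, text: string, title: string)
  - llama2_rag_response: string
  - retrieval: sequence of strings
- Splits:
  - train:
    - Size in bytes: 6789056
    - Number of examples: 100
- Download size: 3877897 bytes
- Dataset size: 6789056 bytes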
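
The card does not say how the `*_response` columns are meant to be compared with `answers`. One common choice for Natural Questions style data is a normalized substring match, sketched below on hand-written rows. The function names and the sample rows are illustrative assumptions, not part of the dataset and not the authors' evaluation method.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def answer_in_response(response: str, answers: list) -> bool:
    """True if any non-empty normalized gold answer occurs in the response."""
    norm_response = normalize(response)
    norm_answers = [normalize(a) for a in answers]
    return any(a and a in norm_response for a in norm_answers)


def accuracy(rows, response_column: str) -> float:
    """Fraction of rows whose response contains at least one gold answer."""
    if not rows:
        return 0.0
    hits = sum(answer_in_response(r[response_column], r["answers"]) for r in rows)
    return hits / len(rows)


# Hand-written rows that mimic the gpt4_rag schema; not real dataset content.
rows = [
    {"answers": ["Paris"], "gpt4_rag_response": "The capital of France is Paris."},
    {"answers": ["1969"], "gpt4_rag_response": "It happened in 1970."},
]
score = accuracy(rows, "gpt4_rag_response")
```

Running the same function over a `*_no_rag` and a `*_rag` configuration of the same model would give a rough side-by-side comparison, with the caveat that substring matching can over-credit long responses.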



