Hyukkyu/beir-fever
Source: Hugging Face. Updated 2025-11-25; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Hyukkyu/beir-fever
Resource description:
---
license: apache-2.0
task_categories:
- information-retrieval
- text-retrieval
tags:
- beir
- fever
- information-retrieval
- retrieval
- search
configs:
- config_name: corpus
  data_files:
  - split: train
    path: corpus/train-*
- config_name: queries
  data_files:
  - split: train
    path: queries/train-*
dataset_info:
- config_name: corpus
  features:
  - name: metadata
    dtype: 'null'
  - name: title
    dtype: string
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3095105800
    num_examples: 5416568
  download_size: 1953809592
  dataset_size: 3095105800
- config_name: queries
  features:
  - name: metadata
    dtype: string
  - name: text
    dtype: string
  - name: _id
    dtype: string
  splits:
  - name: train
    num_bytes: 28132663
    num_examples: 123142
  download_size: 11838211
  dataset_size: 28132663
---
# BEIR FEVER Dataset (Migrated)
This is a migrated version of BeIR/fever that is compatible with version 4.0.0 and later of the `datasets` library.
## Dataset Description
This dataset contains the FEVER dataset from the BEIR benchmark, converted from the old script-based loading format to Parquet.
## Dataset Structure
### Queries
- **Config `queries`** (single `train` split, per the card metadata): 123,142 examples
- Features: ['_id', 'text', 'metadata']
- **Total examples**: 123,142
### Corpus
- **Config `corpus`** (single `train` split, per the card metadata): 5,416,568 examples
- Features: ['_id', 'title', 'text', 'metadata']
- **Total examples**: 5,416,568
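To make the corpus record shape concrete, here is a minimal sketch that joins a row's `title` and `text` into one passage string, a common convention when encoding documents for retrieval. The helper name `format_passage` and the sample row are made up for illustration; neither comes from this dataset or from the `datasets` library.

```python
def format_passage(row: dict) -> str:
    """Join title and text into one passage string, skipping an empty title."""
    title = (row.get("title") or "").strip()
    text = (row.get("text") or "").strip()
    return f"{title}\n{text}" if title else text


# Hypothetical row that uses the corpus field names; the values are invented.
sample_row = {
    "_id": "doc-0",
    "title": "Example title",
    "text": "Example body.",
    "metadata": None,  # the corpus metadata column has dtype 'null'
}
print(format_passage(sample_row))
```

Rows read from the `corpus` config should carry the same four keys, so the helper could be applied to them unchanged.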
## Usage
```python
from datasets import load_dataset
# Load the queries config (the card metadata declares a single "train" split)
queries = load_dataset("Hyukkyu/beir-fever", "queries", split="train")
# Load the corpus config (also a single "train" split)
corpus = load_dataset("Hyukkyu/beir-fever", "corpus", split="train")
```
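With roughly 5.4 million corpus rows, it is often more practical to process the data in fixed-size chunks than to hold everything in one list. The sketch below shows one way to do that with the standard library only. The helper `batched` is a hypothetical name, and the in-memory sample rows stand in for a loaded dataset, which can be iterated row by row as dictionaries.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(rows: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    # islice pulls the next batch_size rows; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


# Invented rows that mimic the corpus fields; replace with a loaded dataset.
sample_rows = [{"_id": f"doc-{i}", "title": "", "text": f"text {i}"} for i in range(5)]
batch_sizes = [len(batch) for batch in batched(sample_rows, 2)]
print(batch_sizes)
```

The last batch may be shorter than `batch_size`, so downstream code should not assume a fixed length.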
## Available Splits
### Queries
- Config `queries`, split `train`: 123,142 examples
### Corpus
- Config `corpus`, split `train`: 5,416,568 examples
## Original Dataset
This dataset is migrated from: BeIR/fever
## Citation
If you use this dataset, please cite the original BEIR paper:
```bibtex
@article{thakur2021beir,
title={BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models},
author={Thakur, Nandan and Reimers, Nils and R{\"u}ckl{\'e}, Andreas and Srivastava, Abhishek and Gurevych, Iryna},
journal={arXiv preprint arXiv:2104.08663},
year={2021}
}
```
Provider:
Hyukkyu



