tomaarsen/NanoBEIR-sv
Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/tomaarsen/NanoBEIR-sv
Resource description:
---
dataset_info:
- config_name: corpus
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 5529258
    num_examples: 3408
  - name: NanoDBPedia
    num_bytes: 2277864
    num_examples: 6045
  - name: NanoFEVER
    num_bytes: 6146023
    num_examples: 4996
  - name: NanoFiQA2018
    num_bytes: 4511769
    num_examples: 4598
  - name: NanoHotpotQA
    num_bytes: 1914152
    num_examples: 5090
  - name: NanoMSMARCO
    num_bytes: 1760376
    num_examples: 5043
  - name: NanoNFCorpus
    num_bytes: 4607632
    num_examples: 2953
  - name: NanoNQ
    num_bytes: 2826385
    num_examples: 5035
  - name: NanoQuoraRetrieval
    num_bytes: 371488
    num_examples: 5046
  - name: NanoSCIDOCS
    num_bytes: 2263775
    num_examples: 2210
  - name: NanoArguAna
    num_bytes: 3973260
    num_examples: 3635
  - name: NanoSciFact
    num_bytes: 4343030
    num_examples: 2919
  - name: NanoTouche2020
    num_bytes: 13229575
    num_examples: 5745
  download_size: 31373197
  dataset_size: 53754587
- config_name: qrels
  features:
  - name: query-id
    dtype: string
  - name: corpus-id
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 4361
    num_examples: 148
  - name: NanoDBPedia
    num_bytes: 60640
    num_examples: 1158
  - name: NanoFEVER
    num_bytes: 1630
    num_examples: 57
  - name: NanoFiQA2018
    num_bytes: 2200
    num_examples: 123
  - name: NanoHotpotQA
    num_bytes: 3885
    num_examples: 100
  - name: NanoMSMARCO
    num_bytes: 1065
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 64851
    num_examples: 2518
  - name: NanoNQ
    num_bytes: 1340
    num_examples: 57
  - name: NanoQuoraRetrieval
    num_bytes: 1359
    num_examples: 70
  - name: NanoSCIDOCS
    num_bytes: 21472
    num_examples: 244
  - name: NanoArguAna
    num_bytes: 3496
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 1054
    num_examples: 56
  - name: NanoTouche2020
    num_bytes: 45452
    num_examples: 932
  download_size: 91853
  dataset_size: 212805
- config_name: queries
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 7492
    num_examples: 50
  - name: NanoDBPedia
    num_bytes: 2901
    num_examples: 50
  - name: NanoFEVER
    num_bytes: 2968
    num_examples: 50
  - name: NanoFiQA2018
    num_bytes: 3847
    num_examples: 50
  - name: NanoHotpotQA
    num_bytes: 6033
    num_examples: 50
  - name: NanoMSMARCO
    num_bytes: 2462
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 2083
    num_examples: 50
  - name: NanoNQ
    num_bytes: 3166
    num_examples: 50
  - name: NanoQuoraRetrieval
    num_bytes: 3226
    num_examples: 50
  - name: NanoSCIDOCS
    num_bytes: 6273
    num_examples: 50
  - name: NanoArguAna
    num_bytes: 59148
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 5480
    num_examples: 50
  - name: NanoTouche2020
    num_bytes: 2609
    num_examples: 49
  download_size: 95871
  dataset_size: 107688
configs:
- config_name: corpus
  data_files:
  - split: NanoClimateFEVER
    path: corpus/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: corpus/NanoDBPedia-*
  - split: NanoFEVER
    path: corpus/NanoFEVER-*
  - split: NanoFiQA2018
    path: corpus/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: corpus/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: corpus/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: corpus/NanoNFCorpus-*
  - split: NanoNQ
    path: corpus/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: corpus/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: corpus/NanoSCIDOCS-*
  - split: NanoArguAna
    path: corpus/NanoArguAna-*
  - split: NanoSciFact
    path: corpus/NanoSciFact-*
  - split: NanoTouche2020
    path: corpus/NanoTouche2020-*
- config_name: qrels
  data_files:
  - split: NanoClimateFEVER
    path: qrels/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: qrels/NanoDBPedia-*
  - split: NanoFEVER
    path: qrels/NanoFEVER-*
  - split: NanoFiQA2018
    path: qrels/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: qrels/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: qrels/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: qrels/NanoNFCorpus-*
  - split: NanoNQ
    path: qrels/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: qrels/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: qrels/NanoSCIDOCS-*
  - split: NanoArguAna
    path: qrels/NanoArguAna-*
  - split: NanoSciFact
    path: qrels/NanoSciFact-*
  - split: NanoTouche2020
    path: qrels/NanoTouche2020-*
- config_name: queries
  data_files:
  - split: NanoClimateFEVER
    path: queries/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: queries/NanoDBPedia-*
  - split: NanoFEVER
    path: queries/NanoFEVER-*
  - split: NanoFiQA2018
    path: queries/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: queries/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: queries/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: queries/NanoNFCorpus-*
  - split: NanoNQ
    path: queries/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: queries/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: queries/NanoSCIDOCS-*
  - split: NanoArguAna
    path: queries/NanoArguAna-*
  - split: NanoSciFact
    path: queries/NanoSciFact-*
  - split: NanoTouche2020
    path: queries/NanoTouche2020-*
  default: true
---
Dataset information:
1. Config name: corpus
Features:
- Field: _id, type: string
- Field: text, type: string
Splits:
- Split: NanoClimateFEVER, bytes: 5529258, examples: 3408
- Split: NanoDBPedia, bytes: 2277864, examples: 6045
- Split: NanoFEVER, bytes: 6146023, examples: 4996
- Split: NanoFiQA2018, bytes: 4511769, examples: 4598
- Split: NanoHotpotQA, bytes: 1914152, examples: 5090
- Split: NanoMSMARCO, bytes: 1760376, examples: 5043
- Split: NanoNFCorpus, bytes: 4607632, examples: 2953
- Split: NanoNQ, bytes: 2826385, examples: 5035
- Split: NanoQuoraRetrieval, bytes: 371488, examples: 5046
- Split: NanoSCIDOCS, bytes: 2263775, examples: 2210
- Split: NanoArguAna, bytes: 3973260, examples: 3635
- Split: NanoSciFact, bytes: 4343030, examples: 2919
- Split: NanoTouche2020, bytes: 13229575, examples: 5745
Download size: 31373197 bytes; dataset size: 53754587 bytes
2. Config name: qrels
Features:
- Field: query-id, type: string
- Field: corpus-id, type: string
Splits:
- Split: NanoClimateFEVER, bytes: 4361, examples: 148
- Split: NanoDBPedia, bytes: 60640, examples: 1158
- Split: NanoFEVER, bytes: 1630, examples: 57
- Split: NanoFiQA2018, bytes: 2200, examples: 123
- Split: NanoHotpotQA, bytes: 3885, examples: 100
- Split: NanoMSMARCO, bytes: 1065, examples: 50
- Split: NanoNFCorpus, bytes: 64851, examples: 2518
- Split: NanoNQ, bytes: 1340, examples: 57
- Split: NanoQuoraRetrieval, bytes: 1359, examples: 70
- Split: NanoSCIDOCS, bytes: 21472, examples: 244
- Split: NanoArguAna, bytes: 3496, examples: 50
- Split: NanoSciFact, bytes: 1054, examples: 56
- Split: NanoTouche2020, bytes: 45452, examples: 932
Download size: 91853 bytes; dataset size: 212805 bytes
3. Config name: queries
Features:
- Field: _id, type: string
- Field: text, type: string
Splits:
- Split: NanoClimateFEVER, bytes: 7492, examples: 50
- Split: NanoDBPedia, bytes: 2901, examples: 50
- Split: NanoFEVER, bytes: 2968, examples: 50
- Split: NanoFiQA2018, bytes: 3847, examples: 50
- Split: NanoHotpotQA, bytes: 6033, examples: 50
- Split: NanoMSMARCO, bytes: 2462, examples: 50
- Split: NanoNFCorpus, bytes: 2083, examples: 50
- Split: NanoNQ, bytes: 3166, examples: 50
- Split: NanoQuoraRetrieval, bytes: 3226, examples: 50
- Split: NanoSCIDOCS, bytes: 6273, examples: 50
- Split: NanoArguAna, bytes: 59148, examples: 50
- Split: NanoSciFact, bytes: 5480, examples: 50
- Split: NanoTouche2020, bytes: 2609, examples: 49
Download size: 95871 bytes; dataset size: 107688 bytes
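
As a quick consistency check on the figures listed for the three configs, the following minimal Python sketch sums the per-split byte counts and compares them with each reported dataset size. It also derives the average number of relevance judgments per query for each split. The dictionary names and the helpers `total_bytes` and `judgments_per_query` are illustrative only and are not part of the dataset; all numbers are copied from the listing.

```python
# Per-split (num_bytes, num_examples), copied from the dataset metadata.
CORPUS = {
    "NanoClimateFEVER": (5529258, 3408), "NanoDBPedia": (2277864, 6045),
    "NanoFEVER": (6146023, 4996), "NanoFiQA2018": (4511769, 4598),
    "NanoHotpotQA": (1914152, 5090), "NanoMSMARCO": (1760376, 5043),
    "NanoNFCorpus": (4607632, 2953), "NanoNQ": (2826385, 5035),
    "NanoQuoraRetrieval": (371488, 5046), "NanoSCIDOCS": (2263775, 2210),
    "NanoArguAna": (3973260, 3635), "NanoSciFact": (4343030, 2919),
    "NanoTouche2020": (13229575, 5745),
}
QRELS = {
    "NanoClimateFEVER": (4361, 148), "NanoDBPedia": (60640, 1158),
    "NanoFEVER": (1630, 57), "NanoFiQA2018": (2200, 123),
    "NanoHotpotQA": (3885, 100), "NanoMSMARCO": (1065, 50),
    "NanoNFCorpus": (64851, 2518), "NanoNQ": (1340, 57),
    "NanoQuoraRetrieval": (1359, 70), "NanoSCIDOCS": (21472, 244),
    "NanoArguAna": (3496, 50), "NanoSciFact": (1054, 56),
    "NanoTouche2020": (45452, 932),
}
QUERIES = {
    "NanoClimateFEVER": (7492, 50), "NanoDBPedia": (2901, 50),
    "NanoFEVER": (2968, 50), "NanoFiQA2018": (3847, 50),
    "NanoHotpotQA": (6033, 50), "NanoMSMARCO": (2462, 50),
    "NanoNFCorpus": (2083, 50), "NanoNQ": (3166, 50),
    "NanoQuoraRetrieval": (3226, 50), "NanoSCIDOCS": (6273, 50),
    "NanoArguAna": (59148, 50), "NanoSciFact": (5480, 50),
    "NanoTouche2020": (2609, 49),
}
# Reported dataset_size per config, also from the metadata.
REPORTED_SIZE = {"corpus": 53754587, "qrels": 212805, "queries": 107688}


def total_bytes(splits):
    """Sum num_bytes over all splits of one config."""
    return sum(num_bytes for num_bytes, _ in splits.values())


def judgments_per_query():
    """Average number of qrels rows per query, for each split."""
    return {name: QRELS[name][1] / QUERIES[name][1] for name in QUERIES}


if __name__ == "__main__":
    for config, splits in (("corpus", CORPUS), ("qrels", QRELS), ("queries", QUERIES)):
        matches = total_bytes(splits) == REPORTED_SIZE[config]
        print(f"{config}: split bytes sum to reported dataset size: {matches}")
    for name, ratio in judgments_per_query().items():
        print(f"{name}: {ratio:.2f} judgments per query")
```

The ratio varies widely between splits: NanoMSMARCO has one judgment per query, while NanoNFCorpus and NanoDBPedia have many, which matters when comparing recall-style metrics across splits.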
Dataset configuration:
1. Config name: corpus, data files:
- Split: NanoClimateFEVER, path: corpus/NanoClimateFEVER-*
- Split: NanoDBPedia, path: corpus/NanoDBPedia-*
- Split: NanoFEVER, path: corpus/NanoFEVER-*
- Split: NanoFiQA2018, path: corpus/NanoFiQA2018-*
- Split: NanoHotpotQA, path: corpus/NanoHotpotQA-*
- Split: NanoMSMARCO, path: corpus/NanoMSMARCO-*
- Split: NanoNFCorpus, path: corpus/NanoNFCorpus-*
- Split: NanoNQ, path: corpus/NanoNQ-*
- Split: NanoQuoraRetrieval, path: corpus/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: corpus/NanoSCIDOCS-*
- Split: NanoArguAna, path: corpus/NanoArguAna-*
- Split: NanoSciFact, path: corpus/NanoSciFact-*
- Split: NanoTouche2020, path: corpus/NanoTouche2020-*
2. Config name: qrels, data files:
- Split: NanoClimateFEVER, path: qrels/NanoClimateFEVER-*
- Split: NanoDBPedia, path: qrels/NanoDBPedia-*
- Split: NanoFEVER, path: qrels/NanoFEVER-*
- Split: NanoFiQA2018, path: qrels/NanoFiQA2018-*
- Split: NanoHotpotQA, path: qrels/NanoHotpotQA-*
- Split: NanoMSMARCO, path: qrels/NanoMSMARCO-*
- Split: NanoNFCorpus, path: qrels/NanoNFCorpus-*
- Split: NanoNQ, path: qrels/NanoNQ-*
- Split: NanoQuoraRetrieval, path: qrels/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: qrels/NanoSCIDOCS-*
- Split: NanoArguAna, path: qrels/NanoArguAna-*
- Split: NanoSciFact, path: qrels/NanoSciFact-*
- Split: NanoTouche2020, path: qrels/NanoTouche2020-*
3. Config name: queries, data files:
- Split: NanoClimateFEVER, path: queries/NanoClimateFEVER-*
- Split: NanoDBPedia, path: queries/NanoDBPedia-*
- Split: NanoFEVER, path: queries/NanoFEVER-*
- Split: NanoFiQA2018, path: queries/NanoFiQA2018-*
- Split: NanoHotpotQA, path: queries/NanoHotpotQA-*
- Split: NanoMSMARCO, path: queries/NanoMSMARCO-*
- Split: NanoNFCorpus, path: queries/NanoNFCorpus-*
- Split: NanoNQ, path: queries/NanoNQ-*
- Split: NanoQuoraRetrieval, path: queries/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: queries/NanoSCIDOCS-*
- Split: NanoArguAna, path: queries/NanoArguAna-*
- Split: NanoSciFact, path: queries/NanoSciFact-*
- Split: NanoTouche2020, path: queries/NanoTouche2020-*
Default config: queries
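
Given the three configs and their per-benchmark splits, one plausible way to read a single benchmark is with `load_dataset` from the Hugging Face `datasets` library, passing the config name and the split name. The sketch below assumes the repository is reachable under the ID in the page title and that `datasets` is installed. The helpers `build_qrels` and `load_benchmark` are hypothetical names introduced here for illustration. Nothing is downloaded until `load_benchmark` is called.

```python
from collections import defaultdict

# Repository ID taken from the page title.
REPO_ID = "tomaarsen/NanoBEIR-sv"


def build_qrels(rows):
    """Group qrels rows into {query-id: set of relevant corpus-ids}.

    `rows` is any iterable of mappings with "query-id" and "corpus-id"
    keys, matching the qrels features listed in the metadata.
    """
    relevant = defaultdict(set)
    for row in rows:
        relevant[row["query-id"]].add(row["corpus-id"])
    return dict(relevant)


def load_benchmark(split):
    """Load corpus, queries and qrels for one split, e.g. "NanoMSMARCO".

    Hypothetical helper: requires network access and the `datasets`
    package, which is imported lazily so this module imports without it.
    """
    from datasets import load_dataset

    corpus = load_dataset(REPO_ID, "corpus", split=split)
    queries = load_dataset(REPO_ID, "queries", split=split)
    qrels = load_dataset(REPO_ID, "qrels", split=split)

    # Both corpus and queries expose "_id" and "text" string fields.
    corpus_text = {row["_id"]: row["text"] for row in corpus}
    query_text = {row["_id"]: row["text"] for row in queries}
    return corpus_text, query_text, build_qrels(qrels)
```

A call such as `load_benchmark("NanoSciFact")` would then return two ID-to-text dictionaries and the relevance mapping, which is the shape most retrieval evaluators expect.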
Provider:
tomaarsen



