
tomaarsen/NanoBEIR-sv

Source: Hugging Face. Updated 2025-12-10; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/tomaarsen/NanoBEIR-sv
Resource description:
---
dataset_info:
- config_name: corpus
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 5529258
    num_examples: 3408
  - name: NanoDBPedia
    num_bytes: 2277864
    num_examples: 6045
  - name: NanoFEVER
    num_bytes: 6146023
    num_examples: 4996
  - name: NanoFiQA2018
    num_bytes: 4511769
    num_examples: 4598
  - name: NanoHotpotQA
    num_bytes: 1914152
    num_examples: 5090
  - name: NanoMSMARCO
    num_bytes: 1760376
    num_examples: 5043
  - name: NanoNFCorpus
    num_bytes: 4607632
    num_examples: 2953
  - name: NanoNQ
    num_bytes: 2826385
    num_examples: 5035
  - name: NanoQuoraRetrieval
    num_bytes: 371488
    num_examples: 5046
  - name: NanoSCIDOCS
    num_bytes: 2263775
    num_examples: 2210
  - name: NanoArguAna
    num_bytes: 3973260
    num_examples: 3635
  - name: NanoSciFact
    num_bytes: 4343030
    num_examples: 2919
  - name: NanoTouche2020
    num_bytes: 13229575
    num_examples: 5745
  download_size: 31373197
  dataset_size: 53754587
- config_name: qrels
  features:
  - name: query-id
    dtype: string
  - name: corpus-id
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 4361
    num_examples: 148
  - name: NanoDBPedia
    num_bytes: 60640
    num_examples: 1158
  - name: NanoFEVER
    num_bytes: 1630
    num_examples: 57
  - name: NanoFiQA2018
    num_bytes: 2200
    num_examples: 123
  - name: NanoHotpotQA
    num_bytes: 3885
    num_examples: 100
  - name: NanoMSMARCO
    num_bytes: 1065
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 64851
    num_examples: 2518
  - name: NanoNQ
    num_bytes: 1340
    num_examples: 57
  - name: NanoQuoraRetrieval
    num_bytes: 1359
    num_examples: 70
  - name: NanoSCIDOCS
    num_bytes: 21472
    num_examples: 244
  - name: NanoArguAna
    num_bytes: 3496
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 1054
    num_examples: 56
  - name: NanoTouche2020
    num_bytes: 45452
    num_examples: 932
  download_size: 91853
  dataset_size: 212805
- config_name: queries
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 7492
    num_examples: 50
  - name: NanoDBPedia
    num_bytes: 2901
    num_examples: 50
  - name: NanoFEVER
    num_bytes: 2968
    num_examples: 50
  - name: NanoFiQA2018
    num_bytes: 3847
    num_examples: 50
  - name: NanoHotpotQA
    num_bytes: 6033
    num_examples: 50
  - name: NanoMSMARCO
    num_bytes: 2462
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 2083
    num_examples: 50
  - name: NanoNQ
    num_bytes: 3166
    num_examples: 50
  - name: NanoQuoraRetrieval
    num_bytes: 3226
    num_examples: 50
  - name: NanoSCIDOCS
    num_bytes: 6273
    num_examples: 50
  - name: NanoArguAna
    num_bytes: 59148
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 5480
    num_examples: 50
  - name: NanoTouche2020
    num_bytes: 2609
    num_examples: 49
  download_size: 95871
  dataset_size: 107688
configs:
- config_name: corpus
  data_files:
  - split: NanoClimateFEVER
    path: corpus/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: corpus/NanoDBPedia-*
  - split: NanoFEVER
    path: corpus/NanoFEVER-*
  - split: NanoFiQA2018
    path: corpus/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: corpus/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: corpus/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: corpus/NanoNFCorpus-*
  - split: NanoNQ
    path: corpus/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: corpus/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: corpus/NanoSCIDOCS-*
  - split: NanoArguAna
    path: corpus/NanoArguAna-*
  - split: NanoSciFact
    path: corpus/NanoSciFact-*
  - split: NanoTouche2020
    path: corpus/NanoTouche2020-*
- config_name: qrels
  data_files:
  - split: NanoClimateFEVER
    path: qrels/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: qrels/NanoDBPedia-*
  - split: NanoFEVER
    path: qrels/NanoFEVER-*
  - split: NanoFiQA2018
    path: qrels/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: qrels/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: qrels/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: qrels/NanoNFCorpus-*
  - split: NanoNQ
    path: qrels/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: qrels/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: qrels/NanoSCIDOCS-*
  - split: NanoArguAna
    path: qrels/NanoArguAna-*
  - split: NanoSciFact
    path: qrels/NanoSciFact-*
  - split: NanoTouche2020
    path: qrels/NanoTouche2020-*
- config_name: queries
  data_files:
  - split: NanoClimateFEVER
    path: queries/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: queries/NanoDBPedia-*
  - split: NanoFEVER
    path: queries/NanoFEVER-*
  - split: NanoFiQA2018
    path: queries/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: queries/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: queries/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: queries/NanoNFCorpus-*
  - split: NanoNQ
    path: queries/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: queries/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: queries/NanoSCIDOCS-*
  - split: NanoArguAna
    path: queries/NanoArguAna-*
  - split: NanoSciFact
    path: queries/NanoSciFact-*
  - split: NanoTouche2020
    path: queries/NanoTouche2020-*
  default: true
---
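
The configuration metadata above describes three configs (corpus, qrels, queries) that share the same split names. The following is a minimal sketch of how one split might be loaded and joined into a retrieval task with the Hugging Face `datasets` library. The helper names `load_nano_split`, `build_qrels`, and `load_retrieval_task` are hypothetical and not part of the dataset; the sketch also assumes that every qrels row marks a relevant query/document pair, which the metadata does not state explicitly.

```python
from collections import defaultdict

DATASET_ID = "tomaarsen/NanoBEIR-sv"  # repository name as listed on this page


def load_nano_split(config: str, split: str):
    """Load one split of one config (requires network access and `datasets`)."""
    # Imported lazily so the pure-Python helpers below also work offline.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, config, split=split)


def build_qrels(rows):
    """Group qrels rows into {query-id: {corpus-id, ...}}.

    Assumption: each row is one relevant (query, document) pair.
    """
    relevant = defaultdict(set)
    for row in rows:
        relevant[row["query-id"]].add(row["corpus-id"])
    return dict(relevant)


def load_retrieval_task(split: str):
    """Return (queries, corpus, qrels) dictionaries for one split, e.g. "NanoNQ"."""
    queries = {r["_id"]: r["text"] for r in load_nano_split("queries", split)}
    corpus = {r["_id"]: r["text"] for r in load_nano_split("corpus", split)}
    qrels = build_qrels(load_nano_split("qrels", split))
    return queries, corpus, qrels
```

Calling `load_retrieval_task("NanoNQ")` would download the three matching splits. The config name is passed explicitly in every call, so the sketch does not depend on which config is marked as the default.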

Dataset information:

1. Configuration name: corpus
Features:
- Field: _id, data type: string
- Field: text, data type: string
Splits:
- Split: NanoClimateFEVER, bytes: 5529258, examples: 3408
- Split: NanoDBPedia, bytes: 2277864, examples: 6045
- Split: NanoFEVER, bytes: 6146023, examples: 4996
- Split: NanoFiQA2018, bytes: 4511769, examples: 4598
- Split: NanoHotpotQA, bytes: 1914152, examples: 5090
- Split: NanoMSMARCO, bytes: 1760376, examples: 5043
- Split: NanoNFCorpus, bytes: 4607632, examples: 2953
- Split: NanoNQ, bytes: 2826385, examples: 5035
- Split: NanoQuoraRetrieval, bytes: 371488, examples: 5046
- Split: NanoSCIDOCS, bytes: 2263775, examples: 2210
- Split: NanoArguAna, bytes: 3973260, examples: 3635
- Split: NanoSciFact, bytes: 4343030, examples: 2919
- Split: NanoTouche2020, bytes: 13229575, examples: 5745
Download size: 31373197 bytes; dataset size: 53754587 bytes

2. Configuration name: qrels
Features:
- Field: query-id, data type: string
- Field: corpus-id, data type: string
Splits:
- Split: NanoClimateFEVER, bytes: 4361, examples: 148
- Split: NanoDBPedia, bytes: 60640, examples: 1158
- Split: NanoFEVER, bytes: 1630, examples: 57
- Split: NanoFiQA2018, bytes: 2200, examples: 123
- Split: NanoHotpotQA, bytes: 3885, examples: 100
- Split: NanoMSMARCO, bytes: 1065, examples: 50
- Split: NanoNFCorpus, bytes: 64851, examples: 2518
- Split: NanoNQ, bytes: 1340, examples: 57
- Split: NanoQuoraRetrieval, bytes: 1359, examples: 70
- Split: NanoSCIDOCS, bytes: 21472, examples: 244
- Split: NanoArguAna, bytes: 3496, examples: 50
- Split: NanoSciFact, bytes: 1054, examples: 56
- Split: NanoTouche2020, bytes: 45452, examples: 932
Download size: 91853 bytes; dataset size: 212805 bytes

3. Configuration name: queries
Features:
- Field: _id, data type: string
- Field: text, data type: string
Splits:
- Split: NanoClimateFEVER, bytes: 7492, examples: 50
- Split: NanoDBPedia, bytes: 2901, examples: 50
- Split: NanoFEVER, bytes: 2968, examples: 50
- Split: NanoFiQA2018, bytes: 3847, examples: 50
- Split: NanoHotpotQA, bytes: 6033, examples: 50
- Split: NanoMSMARCO, bytes: 2462, examples: 50
- Split: NanoNFCorpus, bytes: 2083, examples: 50
- Split: NanoNQ, bytes: 3166, examples: 50
- Split: NanoQuoraRetrieval, bytes: 3226, examples: 50
- Split: NanoSCIDOCS, bytes: 6273, examples: 50
- Split: NanoArguAna, bytes: 59148, examples: 50
- Split: NanoSciFact, bytes: 5480, examples: 50
- Split: NanoTouche2020, bytes: 2609, examples: 49
Download size: 95871 bytes; dataset size: 107688 bytes

Dataset configurations:

1. Configuration name: corpus, data files:
- Split: NanoClimateFEVER, path: corpus/NanoClimateFEVER-*
- Split: NanoDBPedia, path: corpus/NanoDBPedia-*
- Split: NanoFEVER, path: corpus/NanoFEVER-*
- Split: NanoFiQA2018, path: corpus/NanoFiQA2018-*
- Split: NanoHotpotQA, path: corpus/NanoHotpotQA-*
- Split: NanoMSMARCO, path: corpus/NanoMSMARCO-*
- Split: NanoNFCorpus, path: corpus/NanoNFCorpus-*
- Split: NanoNQ, path: corpus/NanoNQ-*
- Split: NanoQuoraRetrieval, path: corpus/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: corpus/NanoSCIDOCS-*
- Split: NanoArguAna, path: corpus/NanoArguAna-*
- Split: NanoSciFact, path: corpus/NanoSciFact-*
- Split: NanoTouche2020, path: corpus/NanoTouche2020-*

2. Configuration name: qrels, data files:
- Split: NanoClimateFEVER, path: qrels/NanoClimateFEVER-*
- Split: NanoDBPedia, path: qrels/NanoDBPedia-*
- Split: NanoFEVER, path: qrels/NanoFEVER-*
- Split: NanoFiQA2018, path: qrels/NanoFiQA2018-*
- Split: NanoHotpotQA, path: qrels/NanoHotpotQA-*
- Split: NanoMSMARCO, path: qrels/NanoMSMARCO-*
- Split: NanoNFCorpus, path: qrels/NanoNFCorpus-*
- Split: NanoNQ, path: qrels/NanoNQ-*
- Split: NanoQuoraRetrieval, path: qrels/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: qrels/NanoSCIDOCS-*
- Split: NanoArguAna, path: qrels/NanoArguAna-*
- Split: NanoSciFact, path: qrels/NanoSciFact-*
- Split: NanoTouche2020, path: qrels/NanoTouche2020-*

3. Configuration name: queries, data files:
- Split: NanoClimateFEVER, path: queries/NanoClimateFEVER-*
- Split: NanoDBPedia, path: queries/NanoDBPedia-*
- Split: NanoFEVER, path: queries/NanoFEVER-*
- Split: NanoFiQA2018, path: queries/NanoFiQA2018-*
- Split: NanoHotpotQA, path: queries/NanoHotpotQA-*
- Split: NanoMSMARCO, path: queries/NanoMSMARCO-*
- Split: NanoNFCorpus, path: queries/NanoNFCorpus-*
- Split: NanoNQ, path: queries/NanoNQ-*
- Split: NanoQuoraRetrieval, path: queries/NanoQuoraRetrieval-*
- Split: NanoSCIDOCS, path: queries/NanoSCIDOCS-*
- Split: NanoArguAna, path: queries/NanoArguAna-*
- Split: NanoSciFact, path: queries/NanoSciFact-*
- Split: NanoTouche2020, path: queries/NanoTouche2020-*

Default configuration: queries
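
The split sizes listed above differ a lot in how many qrels rows exist per query. As a rough illustration, the sketch below divides the qrels example count by the queries example count for each split. The counts are copied from the metadata on this page; the name `judgments_per_query` is hypothetical, and reading the ratio as "relevant documents per query" assumes each qrels row is one relevant query/document pair.

```python
# (qrels examples, queries examples) per split, copied from the metadata on this page.
SPLIT_COUNTS = {
    "NanoClimateFEVER": (148, 50),
    "NanoDBPedia": (1158, 50),
    "NanoFEVER": (57, 50),
    "NanoFiQA2018": (123, 50),
    "NanoHotpotQA": (100, 50),
    "NanoMSMARCO": (50, 50),
    "NanoNFCorpus": (2518, 50),
    "NanoNQ": (57, 50),
    "NanoQuoraRetrieval": (70, 50),
    "NanoSCIDOCS": (244, 50),
    "NanoArguAna": (50, 50),
    "NanoSciFact": (56, 50),
    "NanoTouche2020": (932, 49),
}


def judgments_per_query(counts):
    """Return {split: qrels rows divided by number of queries}."""
    return {split: qrels / queries for split, (qrels, queries) in counts.items()}


ratios = judgments_per_query(SPLIT_COUNTS)

# Print the splits from densest to sparsest judgments.
for split, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{split:<20} {ratio:6.2f}")
```

Under that assumption, NanoMSMARCO and NanoArguAna average exactly one qrels row per query, while NanoNFCorpus averages about 50, so metrics that depend on the number of relevant documents are not directly comparable across splits.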
Provider:
tomaarsen