tomaarsen/NanoBEIR-ja

Source: Hugging Face. Updated 2025-12-10. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/tomaarsen/NanoBEIR-ja
Resource description:
---
dataset_info:
- config_name: corpus
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 6537653
    num_examples: 3408
  - name: NanoDBPedia
    num_bytes: 3125718
    num_examples: 6045
  - name: NanoFEVER
    num_bytes: 8198841
    num_examples: 4996
  - name: NanoFiQA2018
    num_bytes: 5677326
    num_examples: 4598
  - name: NanoHotpotQA
    num_bytes: 2652164
    num_examples: 5090
  - name: NanoMSMARCO
    num_bytes: 2203454
    num_examples: 5043
  - name: NanoNFCorpus
    num_bytes: 5296700
    num_examples: 2953
  - name: NanoNQ
    num_bytes: 3539146
    num_examples: 5035
  - name: NanoQuoraRetrieval
    num_bytes: 536226
    num_examples: 5046
  - name: NanoSCIDOCS
    num_bytes: 2599191
    num_examples: 2210
  - name: NanoArguAna
    num_bytes: 4725257
    num_examples: 3635
  - name: NanoSciFact
    num_bytes: 5036450
    num_examples: 2919
  - name: NanoTouche2020
    num_bytes: 15382584
    num_examples: 5745
  download_size: 35840420
  dataset_size: 65510710
- config_name: qrels
  features:
  - name: query-id
    dtype: string
  - name: corpus-id
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 4361
    num_examples: 148
  - name: NanoDBPedia
    num_bytes: 60640
    num_examples: 1158
  - name: NanoFEVER
    num_bytes: 1630
    num_examples: 57
  - name: NanoFiQA2018
    num_bytes: 2200
    num_examples: 123
  - name: NanoHotpotQA
    num_bytes: 3885
    num_examples: 100
  - name: NanoMSMARCO
    num_bytes: 1065
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 64851
    num_examples: 2518
  - name: NanoNQ
    num_bytes: 1340
    num_examples: 57
  - name: NanoQuoraRetrieval
    num_bytes: 1359
    num_examples: 70
  - name: NanoSCIDOCS
    num_bytes: 21472
    num_examples: 244
  - name: NanoArguAna
    num_bytes: 3496
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 1054
    num_examples: 56
  - name: NanoTouche2020
    num_bytes: 45452
    num_examples: 932
  download_size: 91853
  dataset_size: 212805
- config_name: queries
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: NanoClimateFEVER
    num_bytes: 8888
    num_examples: 50
  - name: NanoDBPedia
    num_bytes: 4411
    num_examples: 50
  - name: NanoFEVER
    num_bytes: 4533
    num_examples: 50
  - name: NanoFiQA2018
    num_bytes: 4775
    num_examples: 50
  - name: NanoHotpotQA
    num_bytes: 8060
    num_examples: 50
  - name: NanoMSMARCO
    num_bytes: 4492
    num_examples: 50
  - name: NanoNFCorpus
    num_bytes: 2447
    num_examples: 50
  - name: NanoNQ
    num_bytes: 6622
    num_examples: 50
  - name: NanoQuoraRetrieval
    num_bytes: 4630
    num_examples: 50
  - name: NanoSCIDOCS
    num_bytes: 6684
    num_examples: 50
  - name: NanoArguAna
    num_bytes: 77515
    num_examples: 50
  - name: NanoSciFact
    num_bytes: 6273
    num_examples: 50
  - name: NanoTouche2020
    num_bytes: 3644
    num_examples: 49
  download_size: 116820
  dataset_size: 142974
configs:
- config_name: corpus
  data_files:
  - split: NanoClimateFEVER
    path: corpus/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: corpus/NanoDBPedia-*
  - split: NanoFEVER
    path: corpus/NanoFEVER-*
  - split: NanoFiQA2018
    path: corpus/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: corpus/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: corpus/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: corpus/NanoNFCorpus-*
  - split: NanoNQ
    path: corpus/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: corpus/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: corpus/NanoSCIDOCS-*
  - split: NanoArguAna
    path: corpus/NanoArguAna-*
  - split: NanoSciFact
    path: corpus/NanoSciFact-*
  - split: NanoTouche2020
    path: corpus/NanoTouche2020-*
- config_name: qrels
  data_files:
  - split: NanoClimateFEVER
    path: qrels/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: qrels/NanoDBPedia-*
  - split: NanoFEVER
    path: qrels/NanoFEVER-*
  - split: NanoFiQA2018
    path: qrels/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: qrels/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: qrels/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: qrels/NanoNFCorpus-*
  - split: NanoNQ
    path: qrels/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: qrels/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: qrels/NanoSCIDOCS-*
  - split: NanoArguAna
    path: qrels/NanoArguAna-*
  - split: NanoSciFact
    path: qrels/NanoSciFact-*
  - split: NanoTouche2020
    path: qrels/NanoTouche2020-*
- config_name: queries
  data_files:
  - split: NanoClimateFEVER
    path: queries/NanoClimateFEVER-*
  - split: NanoDBPedia
    path: queries/NanoDBPedia-*
  - split: NanoFEVER
    path: queries/NanoFEVER-*
  - split: NanoFiQA2018
    path: queries/NanoFiQA2018-*
  - split: NanoHotpotQA
    path: queries/NanoHotpotQA-*
  - split: NanoMSMARCO
    path: queries/NanoMSMARCO-*
  - split: NanoNFCorpus
    path: queries/NanoNFCorpus-*
  - split: NanoNQ
    path: queries/NanoNQ-*
  - split: NanoQuoraRetrieval
    path: queries/NanoQuoraRetrieval-*
  - split: NanoSCIDOCS
    path: queries/NanoSCIDOCS-*
  - split: NanoArguAna
    path: queries/NanoArguAna-*
  - split: NanoSciFact
    path: queries/NanoSciFact-*
  - split: NanoTouche2020
    path: queries/NanoTouche2020-*
  default: true
---
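The per-split sizes in the metadata above can be cross-checked against the declared totals. The following is a minimal sketch that sums each config's `num_bytes` values and compares the result with its `dataset_size`; the numbers are copied from the metadata, while the names `SPLIT_BYTES`, `DATASET_SIZE`, and `check_sizes` are illustrative and not part of the dataset.

```python
# Per-split num_bytes for each config, copied from the dataset metadata
# (order: ClimateFEVER, DBPedia, FEVER, FiQA2018, HotpotQA, MSMARCO, NFCorpus,
#  NQ, QuoraRetrieval, SCIDOCS, ArguAna, SciFact, Touche2020).
SPLIT_BYTES = {
    "corpus": [6537653, 3125718, 8198841, 5677326, 2652164, 2203454, 5296700,
               3539146, 536226, 2599191, 4725257, 5036450, 15382584],
    "qrels": [4361, 60640, 1630, 2200, 3885, 1065, 64851,
              1340, 1359, 21472, 3496, 1054, 45452],
    "queries": [8888, 4411, 4533, 4775, 8060, 4492, 2447,
                6622, 4630, 6684, 77515, 6273, 3644],
}

# Declared dataset_size for each config, also copied from the metadata.
DATASET_SIZE = {"corpus": 65510710, "qrels": 212805, "queries": 142974}


def check_sizes(split_bytes, dataset_size):
    """Return {config: True/False} for whether split sizes sum to dataset_size."""
    return {
        name: sum(split_bytes[name]) == dataset_size[name]
        for name in dataset_size
    }


result = check_sizes(SPLIT_BYTES, DATASET_SIZE)
print(result)
```

Each config lists 13 splits, and `download_size` is smaller than `dataset_size` in every config, so the two totals should not be expected to match each other.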

Dataset information:

- Config: corpus
  Feature fields: _id (string), text (string)
  Splits:
  - NanoClimateFEVER: 6537653 bytes, 3408 examples
  - NanoDBPedia: 3125718 bytes, 6045 examples
  - NanoFEVER: 8198841 bytes, 4996 examples
  - NanoFiQA2018: 5677326 bytes, 4598 examples
  - NanoHotpotQA: 2652164 bytes, 5090 examples
  - NanoMSMARCO: 2203454 bytes, 5043 examples
  - NanoNFCorpus: 5296700 bytes, 2953 examples
  - NanoNQ: 3539146 bytes, 5035 examples
  - NanoQuoraRetrieval: 536226 bytes, 5046 examples
  - NanoSCIDOCS: 2599191 bytes, 2210 examples
  - NanoArguAna: 4725257 bytes, 3635 examples
  - NanoSciFact: 5036450 bytes, 2919 examples
  - NanoTouche2020: 15382584 bytes, 5745 examples
  Total download size: 35840420 bytes
  Total dataset size: 65510710 bytes

- Config: qrels (relevance judgments)
  Feature fields: query-id (string), corpus-id (string)
  Splits:
  - NanoClimateFEVER: 4361 bytes, 148 examples
  - NanoDBPedia: 60640 bytes, 1158 examples
  - NanoFEVER: 1630 bytes, 57 examples
  - NanoFiQA2018: 2200 bytes, 123 examples
  - NanoHotpotQA: 3885 bytes, 100 examples
  - NanoMSMARCO: 1065 bytes, 50 examples
  - NanoNFCorpus: 64851 bytes, 2518 examples
  - NanoNQ: 1340 bytes, 57 examples
  - NanoQuoraRetrieval: 1359 bytes, 70 examples
  - NanoSCIDOCS: 21472 bytes, 244 examples
  - NanoArguAna: 3496 bytes, 50 examples
  - NanoSciFact: 1054 bytes, 56 examples
  - NanoTouche2020: 45452 bytes, 932 examples
  Total download size: 91853 bytes
  Total dataset size: 212805 bytes

- Config: queries
  Feature fields: _id (string), text (string)
  Splits:
  - NanoClimateFEVER: 8888 bytes, 50 examples
  - NanoDBPedia: 4411 bytes, 50 examples
  - NanoFEVER: 4533 bytes, 50 examples
  - NanoFiQA2018: 4775 bytes, 50 examples
  - NanoHotpotQA: 8060 bytes, 50 examples
  - NanoMSMARCO: 4492 bytes, 50 examples
  - NanoNFCorpus: 2447 bytes, 50 examples
  - NanoNQ: 6622 bytes, 50 examples
  - NanoQuoraRetrieval: 4630 bytes, 50 examples
  - NanoSCIDOCS: 6684 bytes, 50 examples
  - NanoArguAna: 77515 bytes, 50 examples
  - NanoSciFact: 6273 bytes, 50 examples
  - NanoTouche2020: 3644 bytes, 49 examples
  Total download size: 116820 bytes
  Total dataset size: 142974 bytes

Config data files:

- corpus: one entry per split, with path corpus/<split name>-* (for example, corpus/NanoClimateFEVER-*)
- qrels: one entry per split, with path qrels/<split name>-* (for example, qrels/NanoClimateFEVER-*)
- queries: one entry per split, with path queries/<split name>-* (for example, queries/NanoClimateFEVER-*); this config is marked as the default (default: true)
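The three configs share identifiers: each qrels row pairs a `query-id` with a `corpus-id`. The sketch below shows one way such rows could be grouped per query and used to score a ranked list. It assumes, following the usual BEIR-style layout rather than anything stated in the metadata, that `query-id` and `corpus-id` refer to the `_id` fields of the queries and corpus configs. The sample rows, the helper names `build_relevance` and `recall_at_k`, and the choice of recall@k are all hypothetical illustrations.

```python
from collections import defaultdict


def build_relevance(qrels_rows):
    """Group qrels rows into {query-id: set of relevant corpus-ids}."""
    relevant = defaultdict(set)
    for row in qrels_rows:
        relevant[row["query-id"]].add(row["corpus-id"])
    return dict(relevant)


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k ranking."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


# Hypothetical rows shaped like the qrels config (two string fields).
qrels_rows = [
    {"query-id": "q1", "corpus-id": "d1"},
    {"query-id": "q1", "corpus-id": "d3"},
    {"query-id": "q2", "corpus-id": "d2"},
]

relevance = build_relevance(qrels_rows)

# A made-up ranking for query q1, as a retrieval model might return it.
ranking_q1 = ["d3", "d2", "d1"]
score = recall_at_k(ranking_q1, relevance["q1"], k=2)
print(relevance, score)
```

With the Hugging Face `datasets` library, rows of this shape could be obtained with a call such as `load_dataset("tomaarsen/NanoBEIR-ja", "qrels", split="NanoNQ")`; that call needs network access and is left out of the sketch.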
Provider: tomaarsen