
TeraflopAI/beir-minus-nanobeir-docs

Source: Hugging Face. Updated 2026-03-25. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/TeraflopAI/beir-minus-nanobeir-docs
Resource description:
---
dataset_info:
- config_name: NanoArguAna
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 5433474
    num_examples: 5039
  download_size: 2961131
  dataset_size: 5433474
- config_name: NanoClimateFEVER
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 3089467432
    num_examples: 5413185
  download_size: 1949972183
  dataset_size: 3089467432
- config_name: NanoDBPedia
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 1636799801
    num_examples: 4629877
  download_size: 986616486
  dataset_size: 1636799801
- config_name: NanoFEVER
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 3088701452
    num_examples: 5411572
  download_size: 1949398317
  dataset_size: 3088701452
- config_name: NanoFiQA2018
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 41083327
    num_examples: 53040
  download_size: 25208153
  dataset_size: 41083327
- config_name: NanoHotpotQA
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 1619295134
    num_examples: 5228239
  download_size: 979697846
  dataset_size: 1619295134
- config_name: NanoMSMARCO
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 3148196528
    num_examples: 8836780
  download_size: 1639745630
  dataset_size: 3148196528
- config_name: NanoNFCorpus
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 1047656
    num_examples: 680
  download_size: 581234
  dataset_size: 1047656
- config_name: NanoNQ
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 1378549857
    num_examples: 2676433
  download_size: 766204206
  dataset_size: 1378549857
- config_name: NanoQuoraRetrieval
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 41463258
    num_examples: 517885
  download_size: 23430956
  dataset_size: 41463258
- config_name: NanoSCIDOCS
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 29954935
    num_examples: 23447
  download_size: 17332895
  dataset_size: 29954935
- config_name: NanoSciFact
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 3350113
    num_examples: 2264
  download_size: 1932817
  dataset_size: 3350113
- config_name: NanoTouche2020
  features:
  - name: _id
    dtype: string
  - name: title
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: corpus
    num_bytes: 665203792
    num_examples: 376800
  download_size: 350046263
  dataset_size: 665203792
configs:
- config_name: NanoArguAna
  data_files:
  - split: corpus
    path: NanoArguAna/corpus-*
- config_name: NanoClimateFEVER
  data_files:
  - split: corpus
    path: NanoClimateFEVER/corpus-*
- config_name: NanoDBPedia
  data_files:
  - split: corpus
    path: NanoDBPedia/corpus-*
- config_name: NanoFEVER
  data_files:
  - split: corpus
    path: NanoFEVER/corpus-*
- config_name: NanoFiQA2018
  data_files:
  - split: corpus
    path: NanoFiQA2018/corpus-*
- config_name: NanoHotpotQA
  data_files:
  - split: corpus
    path: NanoHotpotQA/corpus-*
- config_name: NanoMSMARCO
  data_files:
  - split: corpus
    path: NanoMSMARCO/corpus-*
- config_name: NanoNFCorpus
  data_files:
  - split: corpus
    path: NanoNFCorpus/corpus-*
- config_name: NanoNQ
  data_files:
  - split: corpus
    path: NanoNQ/corpus-*
- config_name: NanoQuoraRetrieval
  data_files:
  - split: corpus
    path: NanoQuoraRetrieval/corpus-*
- config_name: NanoSCIDOCS
  data_files:
  - split: corpus
    path: NanoSCIDOCS/corpus-*
- config_name: NanoSciFact
  data_files:
  - split: corpus
    path: NanoSciFact/corpus-*
- config_name: NanoTouche2020
  data_files:
  - split: corpus
    path: NanoTouche2020/corpus-*
---
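The metadata above lists thirteen configs, each with a single `corpus` split whose size ranges from about 1 MB to about 3 GB. The following is a minimal sketch of how one might pick a config by size and load it with the Hugging Face `datasets` library. It assumes the repository ID matches the download link, that `datasets` is installed, and that the Hub is reachable. The helper names `CORPUS_STATS`, `configs_under`, and `load_corpus` are hypothetical and not part of the dataset.

```python
# Corpus statistics copied from the dataset card metadata:
# config name -> (num_examples, dataset_size in bytes).
CORPUS_STATS = {
    "NanoArguAna": (5039, 5433474),
    "NanoClimateFEVER": (5413185, 3089467432),
    "NanoDBPedia": (4629877, 1636799801),
    "NanoFEVER": (5411572, 3088701452),
    "NanoFiQA2018": (53040, 41083327),
    "NanoHotpotQA": (5228239, 1619295134),
    "NanoMSMARCO": (8836780, 3148196528),
    "NanoNFCorpus": (680, 1047656),
    "NanoNQ": (2676433, 1378549857),
    "NanoQuoraRetrieval": (517885, 41463258),
    "NanoSCIDOCS": (23447, 29954935),
    "NanoSciFact": (2264, 3350113),
    "NanoTouche2020": (376800, 665203792),
}

# Repository ID taken from the download link on this page.
REPO_ID = "TeraflopAI/beir-minus-nanobeir-docs"


def configs_under(max_bytes):
    """Return config names whose uncompressed corpus fits in max_bytes, smallest first."""
    fits = [(size, name) for name, (_, size) in CORPUS_STATS.items() if size <= max_bytes]
    return [name for _, name in sorted(fits)]


def load_corpus(config_name, streaming=True):
    """Load the `corpus` split of one config (hypothetical helper).

    Needs network access and the `datasets` package. Streaming is the default
    here because several corpora are multiple gigabytes.
    """
    if config_name not in CORPUS_STATS:
        raise ValueError(f"unknown config: {config_name}")
    # Imported lazily so the size helpers above also work without `datasets`.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config_name, split="corpus", streaming=streaming)


if __name__ == "__main__":
    # List the configs small enough to download quickly (under 10 MB).
    print(configs_under(10_000_000))
```

With network access, `next(iter(load_corpus("NanoNFCorpus")))` should yield one record with the `_id`, `title`, and `text` string fields described in the metadata.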
Provider:
TeraflopAI