raul3820/nomic-embed-supervised-data
Source: Hugging Face | Updated: 2026-04-07 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/raul3820/nomic-embed-supervised-data
Resource description:
---
dataset_info:
  features:
  - name: query
    dtype: string
  - name: document
    dtype: string
  - name: dataset
    dtype: string
  splits:
  - name: msmarco_distillation_simlm_rescored_reranked_min15
    num_bytes: 16562728203.994059
    num_examples: 485721
  - name: medi_sts_flickr_sampled
    num_bytes: 1745404884.3876216
    num_examples: 51186
  - name: nq_cocondensor_hn_mine_reranked_min15
    num_bytes: 2387016620.109059
    num_examples: 70002
  - name: nli_simcse_50negs_fixed
    num_bytes: 9397586432.087027
    num_examples: 275595
  - name: medi_sts_wiki_rephrasal
    num_bytes: 852481579.1366885
    num_examples: 25000
  - name: medi_supernli_sampled
    num_bytes: 6057359009.450488
    num_examples: 177639
  - name: fever_hn_mine
    num_bytes: 4776795280.53452
    num_examples: 140085
  - name: hotpotqa_hn_mine_shuffled
    num_bytes: 5796874738.129482
    num_examples: 170000
  - name: medi_sts_stackexchange_dupe
    num_bytes: 3430078981.0775456
    num_examples: 100591
  download_size: 553637224
  dataset_size: 1056232214.0
configs:
- config_name: default
  data_files:
  - split: msmarco_distillation_simlm_rescored_reranked_min15
    path: data/msmarco_distillation_simlm_rescored_reranked_min15-*
  - split: medi_sts_flickr_sampled
    path: data/medi_sts_flickr_sampled-*
  - split: nq_cocondensor_hn_mine_reranked_min15
    path: data/nq_cocondensor_hn_mine_reranked_min15-*
  - split: nli_simcse_50negs_fixed
    path: data/nli_simcse_50negs_fixed-*
  - split: medi_sts_wiki_rephrasal
    path: data/medi_sts_wiki_rephrasal-*
  - split: medi_supernli_sampled
    path: data/medi_supernli_sampled-*
  - split: fever_hn_mine
    path: data/fever_hn_mine-*
  - split: hotpotqa_hn_mine_shuffled
    path: data/hotpotqa_hn_mine_shuffled-*
  - split: medi_sts_stackexchange_dupe
    path: data/medi_sts_stackexchange_dupe-*
---
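The split metadata lists an example count for each of the nine splits. As a rough illustration of how those counts relate to each other, the sketch below copies the `num_examples` values from the card and computes each split's share of the total. The names `SPLIT_EXAMPLES` and `split_shares` are hypothetical helpers introduced here, not part of the dataset or of any library.

```python
# Example counts copied from the `num_examples` fields of the dataset card.
SPLIT_EXAMPLES = {
    "msmarco_distillation_simlm_rescored_reranked_min15": 485721,
    "medi_sts_flickr_sampled": 51186,
    "nq_cocondensor_hn_mine_reranked_min15": 70002,
    "nli_simcse_50negs_fixed": 275595,
    "medi_sts_wiki_rephrasal": 25000,
    "medi_supernli_sampled": 177639,
    "fever_hn_mine": 140085,
    "hotpotqa_hn_mine_shuffled": 170000,
    "medi_sts_stackexchange_dupe": 100591,
}


def split_shares(counts):
    """Return each split's fraction of the total example count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total_examples = sum(SPLIT_EXAMPLES.values())
shares = split_shares(SPLIT_EXAMPLES)

# Print splits from largest to smallest with their share of all examples.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<52} {SPLIT_EXAMPLES[name]:>8d} {share:6.1%}")
print(f"{'total':<52} {total_examples:>8d}")
```

The counts sum to 1,495,819 examples. Shares like these could serve as a starting point for per-split sampling weights, although the card itself does not prescribe any weighting.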
Provider:
raul3820
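A minimal sketch of reading one split, assuming the Hugging Face `datasets` library is installed and the repository is reachable under the id shown in the title. The helper names `load_split` and `preview` are hypothetical. Streaming is used as an assumption so that a split does not have to be downloaded in full before inspection.

```python
from itertools import islice

REPO_ID = "raul3820/nomic-embed-supervised-data"

# Split names as listed in the dataset card.
SPLITS = (
    "msmarco_distillation_simlm_rescored_reranked_min15",
    "medi_sts_flickr_sampled",
    "nq_cocondensor_hn_mine_reranked_min15",
    "nli_simcse_50negs_fixed",
    "medi_sts_wiki_rephrasal",
    "medi_supernli_sampled",
    "fever_hn_mine",
    "hotpotqa_hn_mine_shuffled",
    "medi_sts_stackexchange_dupe",
)


def load_split(split, streaming=True):
    """Load one named split; rejects names that are not in the card."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split, streaming=streaming)


def preview(split, n=2):
    """Print the three string columns for the first n rows of a split."""
    for row in islice(load_split(split), n):
        print(row["dataset"], "|", row["query"][:80], "|", row["document"][:80])
```

Calling `preview("fever_hn_mine")` would fetch and print the first two rows of that split. It needs network access, so it is left out of the block itself.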



