projetomemoreba/mteb_mmarco_retrieval
Source: Hugging Face. Updated 2024-04-23; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/projetomemoreba/mteb_mmarco_retrieval
Resource description:
---
dataset_info:
- config_name: corpus
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  - name: title
    dtype: string
  splits:
  - name: corpus
    num_bytes: 41083836
    num_examples: 106813
  download_size: 22665320
  dataset_size: 41083836
- config_name: default
  features:
  - name: query-id
    dtype: string
  - name: document-id
    dtype: string
  - name: score
    dtype: int64
  splits:
  - name: test
    num_bytes: 217670
    num_examples: 7437
  download_size: 114174
  dataset_size: 217670
- config_name: queries
  features:
  - name: _id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: queries
    num_bytes: 364422
    num_examples: 6980
  download_size: 250361
  dataset_size: 364422
configs:
- config_name: corpus
  data_files:
  - split: corpus
    path: corpus/corpus-*
- config_name: default
  data_files:
  - split: test
    path: data/test-*
- config_name: queries
  data_files:
  - split: queries
    path: queries/queries-*
---
Provider:
projetomemoreba
Summary of original information
Dataset overview
Config name: corpus
- Features:
  - _id: string
  - text: string
  - title: string
- Splits:
  - corpus split:
    - Bytes: 41,083,836
    - Examples: 106,813
- Download size: 22,665,320 bytes
- Dataset size: 41,083,836 bytes
Config name: default
- Features:
  - query-id: string
  - document-id: string
  - score: int64
- Splits:
  - test split:
    - Bytes: 217,670
    - Examples: 7,437
- Download size: 114,174 bytes
- Dataset size: 217,670 bytes
Config name: queries
- Features:
  - _id: string
  - text: string
- Splits:
  - queries split:
    - Bytes: 364,422
    - Examples: 6,980
- Download size: 250,361 bytes
- Dataset size: 364,422 bytes
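The three configs listed above can be loaded separately and combined for retrieval evaluation. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` package is installed and that the `default` config holds relevance judgments linking `query-id` to `document-id` (an interpretation based on the column names, not stated by the card). The helper names `load_configs` and `build_qrels` and the sample rows are hypothetical.

```python
from collections import defaultdict

REPO_ID = "projetomemoreba/mteb_mmarco_retrieval"


def load_configs(repo_id=REPO_ID):
    """Load the three configs; needs the `datasets` package and network access."""
    # Imported lazily so that build_qrels below has no third-party dependency.
    from datasets import load_dataset

    corpus = load_dataset(repo_id, "corpus", split="corpus")      # _id, text, title
    queries = load_dataset(repo_id, "queries", split="queries")   # _id, text
    judgments = load_dataset(repo_id, "default", split="test")    # query-id, document-id, score
    return corpus, queries, judgments


def build_qrels(rows):
    """Group judgment rows as {query-id: {document-id: score}}."""
    qrels = defaultdict(dict)
    for row in rows:
        qrels[row["query-id"]][row["document-id"]] = int(row["score"])
    return dict(qrels)


# Made-up rows that only mirror the schema of the `default` config.
sample_rows = [
    {"query-id": "q1", "document-id": "d1", "score": 1},
    {"query-id": "q1", "document-id": "d2", "score": 1},
    {"query-id": "q2", "document-id": "d3", "score": 1},
]
sample_qrels = build_qrels(sample_rows)
```

With real data, `build_qrels(judgments)` would take the third value returned by `load_configs()`, since iterating a loaded split yields one dictionary per row.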



