hgissbkh/T2-reranking
Source: Hugging Face. Updated 2024-05-23; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/hgissbkh/T2-reranking
Resource description:
---
dataset_info:
- config_name: acge_text_embedding
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 1730220911
    num_examples: 6123
  download_size: 1286604489
  dataset_size: 1730220911
- config_name: gte-large-zh
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 1078238063
    num_examples: 6123
  download_size: 789336708
  dataset_size: 1078238063
- config_name: multilingual-e5-base
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 860910447
    num_examples: 6123
  download_size: 621666566
  dataset_size: 860910447
- config_name: multilingual-e5-large
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 1078238063
    num_examples: 6123
  download_size: 788026921
  dataset_size: 1078238063
- config_name: multilingual-e5-small
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 534919023
    num_examples: 6123
  download_size: 370065934
  dataset_size: 534919023
- config_name: stella-mrl-large-zh-v3.5-1792d
  features:
  - name: query
    dtype: string
  - name: docs
    sequence: string
  - name: query_enc
    sequence: float64
  - name: docs_enc
    sequence:
      sequence: float64
  - name: cos_scores
    sequence: float64
  - name: target
    sequence: int64
  splits:
  - name: train
    num_bytes: 1730220911
    num_examples: 6123
  download_size: 1289439878
  dataset_size: 1730220911
configs:
- config_name: acge_text_embedding
  data_files:
  - split: train
    path: acge_text_embedding/train-*
- config_name: gte-large-zh
  data_files:
  - split: train
    path: gte-large-zh/train-*
- config_name: multilingual-e5-base
  data_files:
  - split: train
    path: multilingual-e5-base/train-*
- config_name: multilingual-e5-large
  data_files:
  - split: train
    path: multilingual-e5-large/train-*
- config_name: multilingual-e5-small
  data_files:
  - split: train
    path: multilingual-e5-small/train-*
- config_name: stella-mrl-large-zh-v3.5-1792d
  data_files:
  - split: train
    path: stella-mrl-large-zh-v3.5-1792d/train-*
---
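The config names in the metadata above each correspond to one loadable subset with a single `train` split. A minimal sketch of loading one of them, assuming the Hugging Face `datasets` library is installed and that the repository is reachable under the ID `hgissbkh/T2-reranking`; the helper name `load_config` is hypothetical and not part of the dataset:

```python
# Config names copied from the dataset card metadata.
CONFIGS = [
    "acge_text_embedding",
    "gte-large-zh",
    "multilingual-e5-base",
    "multilingual-e5-large",
    "multilingual-e5-small",
    "stella-mrl-large-zh-v3.5-1792d",
]


def load_config(name, streaming=True):
    """Load the `train` split of one config (hypothetical helper).

    streaming=True iterates over the remote files instead of
    downloading the whole config (370 MB to 1.3 GB) up front.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so that name validation works without `datasets`.
    from datasets import load_dataset

    return load_dataset(
        "hgissbkh/T2-reranking", name, split="train", streaming=streaming
    )


# Example usage (requires network access):
#   ds = load_config("multilingual-e5-small")
#   row = next(iter(ds))
#   print(row["query"], len(row["docs"]), len(row["query_enc"]))
```

Streaming is chosen here only because the configs are large; passing `streaming=False` would download and cache the full split instead.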
Provider:
hgissbkh
Summary of original information
Dataset overview
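Dataset config name: acge_text_embedding
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 1730220911 bytes, 6123 examples
- Download size: 1286604489 bytes
- Dataset size: 1730220911 bytes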
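The feature names suggest that `cos_scores` holds the cosine similarity between `query_enc` and each vector in `docs_enc`; the listing does not state this, so treat it as an assumption. Under that assumption, a minimal sketch of recomputing the scores for one row, using only the standard library and a synthetic two-dimensional row rather than real data:

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recompute_cos_scores(row):
    """Score every document embedding against the query embedding.

    ASSUMPTION: this mirrors how `cos_scores` was produced; the
    dataset listing does not confirm it.
    """
    return [cosine(row["query_enc"], doc) for doc in row["docs_enc"]]


# Synthetic row with 2-dimensional embeddings (not real data).
row = {
    "query_enc": [1.0, 0.0],
    "docs_enc": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
}
scores = recompute_cos_scores(row)
```

Comparing the recomputed values with the stored `cos_scores` of a real row would be one way to check the assumption.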
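Dataset config name: gte-large-zh
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 1078238063 bytes, 6123 examples
- Download size: 789336708 bytes
- Dataset size: 1078238063 bytes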
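Since every config pairs `cos_scores` with `target`, one natural use is scoring how well an embedding model ranks the documents of a query. The sketch below computes average precision for a single row. It assumes that `target` is a binary relevance label (1 = relevant) aligned index-by-index with `docs` and `cos_scores`; the listing does not document the label semantics, and the sample values are synthetic:

```python
def average_precision(scores, target):
    """Average precision of the ranking induced by `scores`.

    ASSUMPTION: target[i] == 1 marks document i as relevant.
    """
    # Document indices sorted by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if target[i] == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


# Synthetic example: relevant documents end up at ranks 1 and 3.
ap = average_precision([0.9, 0.2, 0.7, 0.4], [1, 0, 0, 1])
```

Averaging this value over all 6123 rows of a config would give a mean average precision for that embedding model, again under the stated assumption about `target`.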
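Dataset config name: multilingual-e5-base
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 860910447 bytes, 6123 examples
- Download size: 621666566 bytes
- Dataset size: 860910447 bytes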
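Dataset config name: multilingual-e5-large
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 1078238063 bytes, 6123 examples
- Download size: 788026921 bytes
- Dataset size: 1078238063 bytes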
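Dataset config name: multilingual-e5-small
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 534919023 bytes, 6123 examples
- Download size: 370065934 bytes
- Dataset size: 534919023 bytes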
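Dataset config name: stella-mrl-large-zh-v3.5-1792d
- Features:
  - query: string
  - docs: sequence of strings
  - query_enc: sequence of float64 values
  - docs_enc: sequence of float64 sequences
  - cos_scores: sequence of float64 values
  - target: sequence of int64 values
- Splits:
  - train: 1730220911 bytes, 6123 examples
- Download size: 1289439878 bytes
- Dataset size: 1730220911 bytes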
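The configs share the same queries and documents and differ only in the embedding columns, so the byte sizes listed above can be used to estimate how many vectors are stored. The sketch below does that arithmetic. The embedding widths in `DIMS` are assumptions not stated in the listing (384, 768, and 1024 for the multilingual-e5 models, and 1792 read from the config name `stella-mrl-large-zh-v3.5-1792d`), as is the premise that each float64 element costs 8 bytes and nothing else varies between configs:

```python
# Split sizes in bytes, copied from the listing above.
NUM_BYTES = {
    "multilingual-e5-small": 534919023,
    "multilingual-e5-base": 860910447,
    "multilingual-e5-large": 1078238063,
    "stella-mrl-large-zh-v3.5-1792d": 1730220911,
}

# ASSUMPTION: embedding widths per config; not given in the listing.
DIMS = {
    "multilingual-e5-small": 384,
    "multilingual-e5-base": 768,
    "multilingual-e5-large": 1024,
    "stella-mrl-large-zh-v3.5-1792d": 1792,
}

NUM_QUERIES = 6123   # num_examples, identical for every config
FLOAT64_BYTES = 8


def bytes_per_dimension(small, large):
    """Extra bytes stored per additional embedding dimension."""
    extra_bytes = NUM_BYTES[large] - NUM_BYTES[small]
    extra_dims = DIMS[large] - DIMS[small]
    quotient, remainder = divmod(extra_bytes, extra_dims)
    if remainder:
        raise ValueError("sizes are not linear in the assumed widths")
    return quotient


names = sorted(NUM_BYTES, key=DIMS.get)
slopes = [bytes_per_dimension(a, b) for a, b in zip(names, names[1:])]

# One float64 per vector per dimension, so slope / 8 = number of vectors.
total_vectors = slopes[0] // FLOAT64_BYTES
total_docs = total_vectors - NUM_QUERIES
```

With the assumed widths, all three consecutive pairs give the same slope of 848936 bytes per dimension, which corresponds to 106117 stored vectors, or roughly 99994 document embeddings besides the 6123 query embeddings. The identical sizes of `acge_text_embedding` and `stella-mrl-large-zh-v3.5-1792d`, and of `gte-large-zh` and `multilingual-e5-large`, are consistent with those pairs sharing an embedding width, though the listing does not confirm it.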



