cheafdevo56/InfluentialQueries
Source: Hugging Face. Updated: 2023-12-18. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/cheafdevo56/InfluentialQueries
Resource description:
---
dataset_info:
  features:
  - name: query
    struct:
    - name: abstract
      dtype: string
    - name: corpus_id
      dtype: int64
    - name: title
      dtype: string
  - name: pos
    struct:
    - name: abstract
      dtype: string
    - name: corpus_id
      dtype: int64
    - name: title
      dtype: string
  - name: neg
    struct:
    - name: abstract
      dtype: string
    - name: corpus_id
      dtype: int64
    - name: score
      dtype: int64
    - name: title
      dtype: string
  splits:
  - name: train
    num_bytes: 170148959.9875996
    num_examples: 41224
  - name: validation
    num_bytes: 18907733.012400392
    num_examples: 4581
  download_size: 112593393
  dataset_size: 189056693.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
---
Each example has three struct-valued features: query, pos, and neg. Every struct contains abstract (string), corpus_id (int64), and title (string) fields; neg additionally carries a score field (int64). The dataset is split into a training set of 41,224 examples and a validation set of 4,581 examples. The download size is 112,593,393 bytes, and the dataset size is 189,056,693 bytes. There is a single configuration, default, whose training and validation data files match the paths data/train-* and data/validation-* respectively.
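The schema above suggests a (query, pos, neg) triplet layout. The following is a minimal sketch of how one record might be flattened into text, assuming that title and abstract are simply concatenated. The helper names `paper_text`, `to_triplet`, and `load_splits`, as well as the sample record, are hypothetical and not part of the dataset; the source does not document what the neg score means, so the sketch only passes it through.

```python
from typing import Any, Dict, Tuple


def paper_text(paper: Dict[str, Any]) -> str:
    """Join a paper's title and abstract into one string (an assumed convention)."""
    return f"{paper['title']}. {paper['abstract']}".strip()


def to_triplet(record: Dict[str, Any]) -> Tuple[str, str, str, int]:
    """Flatten one record into (query text, pos text, neg text, neg score)."""
    return (
        paper_text(record["query"]),
        paper_text(record["pos"]),
        paper_text(record["neg"]),
        record["neg"]["score"],  # meaning of this field is not documented in the card
    )


# Hypothetical record that follows the documented schema; the values are made up.
example = {
    "query": {"abstract": "Query abstract", "corpus_id": 1, "title": "Query title"},
    "pos": {"abstract": "Pos abstract", "corpus_id": 2, "title": "Pos title"},
    "neg": {"abstract": "Neg abstract", "corpus_id": 3, "score": 0, "title": "Neg title"},
}

triplet = to_triplet(example)


def load_splits():
    """Load the real data; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the sketch runs offline

    return load_dataset("cheafdevo56/InfluentialQueries")
```

Calling `load_splits()` should return an object keyed by the split names `train` and `validation`, whose records can be passed to `to_triplet` unchanged.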
Provided by:
cheafdevo56
Summary of original information
Dataset overview
Data features
- query:
  - abstract: data type string
  - corpus_id: data type int64
  - title: data type string
- pos:
  - abstract: data type string
  - corpus_id: data type int64
  - title: data type string
- neg:
  - abstract: data type string
  - corpus_id: data type int64
  - score: data type int64
  - title: data type string
Data splits
- train:
  - Bytes: 170148959.9875996
  - Examples: 41224
- validation:
  - Bytes: 18907733.012400392
  - Examples: 4581
Dataset size
- Download size: 112593393
- Dataset size: 189056693.0
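The split and size figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed above; the variable names are illustrative. It shows that the split is roughly 90/10, that the two per-split byte counts add up to the dataset size, and that the fractional byte counts are consistent with the total having been divided in proportion to the example counts (an inference from the numbers, not something the card states).

```python
import math

# Figures copied from the dataset card.
TRAIN_EXAMPLES = 41224
VALIDATION_EXAMPLES = 4581
TRAIN_BYTES = 170148959.9875996
VALIDATION_BYTES = 18907733.012400392
DATASET_SIZE = 189056693.0
DOWNLOAD_SIZE = 112593393

total_examples = TRAIN_EXAMPLES + VALIDATION_EXAMPLES

# Share of examples in the training split (close to 0.9).
train_fraction = TRAIN_EXAMPLES / total_examples

# The per-split byte counts should sum to the reported dataset size.
bytes_sum = TRAIN_BYTES + VALIDATION_BYTES

# Byte count the validation split would get under a purely proportional allocation.
proportional_validation = DATASET_SIZE * VALIDATION_EXAMPLES / total_examples

# Ratio of download size to dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"total examples: {total_examples}")
print(f"train fraction: {train_fraction:.4f}")
print(f"bytes sum matches dataset size: {math.isclose(bytes_sum, DATASET_SIZE, abs_tol=1e-3)}")
print(f"download / dataset size: {download_ratio:.3f}")
```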
Configurations
- default:
  - train: file path data/train-*
  - validation: file path data/validation-*
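The default configuration assigns files to splits through glob patterns. As a rough illustration, the sketch below matches file paths against those two patterns with Python's `fnmatch` module. The shard file names are hypothetical (the card does not list the actual files), `split_for` is an illustrative helper, and the pattern resolution performed by Hugging Face tooling may differ in detail from `fnmatch`.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Patterns copied from the default configuration.
SPLIT_PATTERNS = {
    "train": "data/train-*",
    "validation": "data/validation-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in SPLIT_PATTERNS.items():
        if fnmatchcase(path, pattern):
            return split
    return None


# Hypothetical file names, for illustration only.
hypothetical_files = [
    "data/train-00000-of-00001.parquet",
    "data/validation-00000-of-00001.parquet",
    "README.md",
]

assignments = {path: split_for(path) for path in hypothetical_files}
for path, split in assignments.items():
    print(f"{path} -> {split}")
```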



