theojiang/hotpot-search-gte1.5-256
Source: Hugging Face. Updated: 2024-06-30. Indexed: 2024-07-06.
Download link:
https://hf-mirror.com/datasets/theojiang/hotpot-search-gte1.5-256
Summary:
This dataset contains query-document pairs. Each record stores the raw query and document strings, their token ID sequences, a target mask, and a flag indicating whether trimming was applied. The data is divided into a training set of 85,924 examples and a validation set of 4,523 examples. The download size is 352,784,736 bytes, and the total dataset size is 797,953,471 bytes.
Provider:
theojiang
Original information summary
Dataset overview
Dataset features
- query: string
- document: string
- query_tokens: sequence of integers
- document_tokens: sequence of integers
- target_mask: sequence of floats
- is_trimmed: boolean
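The feature list above can be mirrored as a typed record. The following is a minimal sketch, not the dataset author's code: the names `Record` and `matches_schema` and all values in `example` are made up for illustration. The listing does not say how `target_mask` aligns with the token sequences, so no length relationship is checked.

```python
from typing import List, TypedDict


# One row, mirroring the six listed features (field names come from the listing).
class Record(TypedDict):
    query: str
    document: str
    query_tokens: List[int]
    document_tokens: List[int]
    target_mask: List[float]
    is_trimmed: bool


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def matches_schema(row: dict) -> bool:
    """Return True if `row` has exactly the listed fields with the listed types."""
    if set(row) != set(Record.__annotations__):
        return False
    return (
        isinstance(row["query"], str)
        and isinstance(row["document"], str)
        and isinstance(row["query_tokens"], list)
        and all(_is_int(t) for t in row["query_tokens"])
        and isinstance(row["document_tokens"], list)
        and all(_is_int(t) for t in row["document_tokens"])
        and isinstance(row["target_mask"], list)
        and all(isinstance(m, float) for m in row["target_mask"])
        and isinstance(row["is_trimmed"], bool)
    )


# Hypothetical row: every value here is invented, not taken from the dataset.
example = {
    "query": "example question",
    "document": "example passage",
    "query_tokens": [101, 2742, 102],
    "document_tokens": [101, 2742, 6019, 102],
    "target_mask": [0.0, 1.0, 1.0, 0.0],
    "is_trimmed": False,
}
print(matches_schema(example))
```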
Dataset splits
- train:
  - Bytes: 757984589
  - Examples: 85924
- val:
  - Bytes: 39968882
  - Examples: 4523
Dataset size
- Download size: 352784736 bytes
- Total dataset size: 797953471 bytes
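The reported figures can be cross-checked with a little arithmetic. This sketch uses only the numbers in the listing; the variable and key names are chosen here for illustration. It confirms that the two split sizes add up to the stated total, and derives the download-to-total size ratio and the share of examples held out for validation.

```python
# Figures copied from the listing; the dictionary keys are this sketch's own naming.
SPLITS = {
    "train": {"num_bytes": 757_984_589, "num_examples": 85_924},
    "val": {"num_bytes": 39_968_882, "num_examples": 4_523},
}
DOWNLOAD_SIZE = 352_784_736  # bytes transferred when downloading
DATASET_SIZE = 797_953_471   # total bytes across both splits

total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# Ratio of download size to total dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE
# Fraction of all examples that belong to the validation split.
val_fraction = SPLITS["val"]["num_examples"] / total_examples

print("split bytes sum to total:", total_bytes == DATASET_SIZE)
print("total examples:", total_examples)
print(f"download / total size: {download_ratio:.3f}")
print(f"validation share: {val_fraction:.3f}")
```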
Configuration
- config_name: default
- data_files:
  - train: data/train-*
  - val: data/val-*
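The configuration maps each split to a glob pattern over the repository's data files. The sketch below illustrates that mapping with Python's `fnmatch`, which only approximates how the Hub resolves patterns. The shard file names are hypothetical, since the listing does not give the real ones, and `split_for` and `load_split` are helper names invented here. `load_split` assumes the `datasets` library is installed and the repository is reachable; it is defined but not called.

```python
from fnmatch import fnmatch
from typing import Optional

REPO_ID = "theojiang/hotpot-search-gte1.5-256"
# Split-to-pattern mapping as given in the configuration.
DATA_FILES = {"train": "data/train-*", "val": "data/val-*"}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches `path`, or None."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


def load_split(split: str):
    """Load one split from the Hub (needs the `datasets` package and network access)."""
    # Imported lazily so the pattern demo below runs without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


# Hypothetical shard names; the real file names are not stated in the listing.
for name in [
    "data/train-00000-of-00002.parquet",
    "data/val-00000-of-00001.parquet",
    "README.md",
]:
    print(name, "->", split_for(name))
```

With the library installed, `load_split("val")` would be the call to try for the smaller split first.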



